Azure Reference Architecture

Introduction

The goal of the Azure Reference Architecture is to help organizations quickly develop and implement Microsoft Azure-based solutions while reducing complexity and risk. The Azure Reference Architecture combines Microsoft software and recommended compute, network, and storage guidance to support the extension of their datacenter environment through the use of Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) constructs.

Scope

The scope of this document is to provide the necessary guidance to develop Microsoft Azure-based solutions by establishing an Azure subscription model that meets the business, identity, security, infrastructure, and development requirements held by most organizations adopting a public cloud services strategy.

The focus of this document is on the design and implementation guidelines for general Azure subscription planning. This document is not intended to replace existing documentation about Microsoft Azure features. It seeks to integrate and complement that information with associated design guidance. For most organizations that want to seamlessly integrate Azure services, a firm understanding of the features and capabilities of the Azure platform along with tested models and practices is key towards proper consumption and adoption of the services.

This document's primary scope is the generally available (GA) feature set of Azure. Azure features and capabilities are surfaced in one of three ways:

  • Private Preview
  • Public Preview
  • Generally Available

Preview features are included in this document where possible; however, the primary focus is on conveying the tested design practices and solutions based on GA features. Preview features may not have full capabilities, global scale, or repeatable design patterns that can be leveraged in your planning activities.

Cloud OS

The Cloud Platform is Microsoft's vision of a modern platform for the world's apps. It provides a platform that is unified across on-premises, service provider, and Microsoft Azure environments. The Cloud Platform delivers the hybrid cloud, which effectively provides one consistent platform that spans customer datacenters and multiple clouds.

The Infrastructure as a Service (IaaS) product line architecture (PLA) utilizes the core capabilities of Windows Server, Hyper-V, and System Center to deliver a private cloud IaaS offering.

The Azure Reference Architecture complements the IaaS PLA and completes the Cloud Platform vision by providing a reference architecture and design patterns for the public cloud (Microsoft Azure).

Microsoft Cloud Solution Provider Program

The Microsoft Cloud Solution Provider (CSP) program allows service providers to sell Microsoft cloud services along with their own offerings and services. Partners own the complete customer lifecycle through direct billing, provisioning, management, and support. The CSP program enables service providers to:

  • Create a customer offer, set the price, and own the billing terms
  • Integrate service offerings with Microsoft cloud services
  • Stay at the center of the Microsoft cloud customer lifecycle

Microsoft Azure is an open and flexible cloud platform that enables service providers to rapidly build, deploy, and manage secure applications to scale on premises, in the cloud, or both. Bringing Azure to Cloud Service Providers enables partners to capitalize on this Azure opportunity with the capabilities of a CSP, where partners own the end-to-end customer lifecycle with direct provisioning, billing, and support of Microsoft's cloud services.

Modern Datacenter and Cloud Offering Portfolio

The Datacenter and Cloud Infrastructure Services portfolio from Microsoft Enterprise Services is designed to help organizations implement technologies that introduce the efficiency and agility of cloud computing, along with the increased control and management of infrastructure resources.

The key attribute of the Cloud Platform vision is a hybrid infrastructure, in which customers have the option of utilizing an on-premises infrastructure or services provided by Azure. The IT organization is a consumer and a provider of services. This enables workload and application development teams to make sourcing selections for services from any of the provided infrastructures or to create solutions that span them.

The Datacenter and Cloud Infrastructure Services portfolio consists of Microsoft Services engagements and frameworks through which intellectual property (IP), such as the IaaS Product Line Architecture (PLA) and the Azure Reference Architecture, is delivered. The portfolio includes offerings for scenarios such as infrastructure deployment, consolidation and migration, modernization, automation, and operations. All of the offerings and scenarios leverage the best practices and design patterns found in the IaaS PLA and the Azure Reference Architecture.

Azure Reference Architecture Overview

The Azure Reference Architecture (AZRA) is an initiative to address the need for detailed, modular, and current architecture guidance for solutions being built on Microsoft Azure. AZRA is a collection of materials including design guidance and design patterns to support a structured approach to architecting services and applications hosted within Microsoft Azure.

Unlike the Microsoft PLAs, it is not the intention of the Azure Reference Architecture to result in a single design, nor will it encompass an exhaustive definition of Azure features. The primary reason for this approach is that customer solutions that use Azure services vary greatly in their implementation. Given the pace of changes and enhancements to Azure services, it is critical that organizations are provided with durable recommended practices related to subscription and architectural planning within Microsoft Azure.

The focus of the Azure Reference Architecture is to identify common services and reusable models that can be used broadly when designing cloud-based solutions. These models assist customers through a series of decision points that lead to reusable design patterns. They are based on successful customer implementations and recommended practices.

Azure Deployment Models and Audiences

Unlike many on-premises solutions, Azure deployment models vary in size, composition, and end-state design, which presents a clear challenge to organizations looking to build solutions based on established standards and best practices. Although there are significant variances between projects that use Azure services, many of these can be classified into a small number of key deployment models and audiences. Each audience or model falls into one of two broad focus areas: Development or Infrastructure. Additionally, deployment models differ by subscription ownership type: whether the customer organization or its Cloud Solution Provider (CSP) manages the Azure subscriptions.

Within these categories and corresponding constraints, the following deployment models and audiences can be defined:

  • Customer-Owned Models
  • Cloud Solution Provider-Managed Models

Customer-Owned Models

  • Application Owner hosting their application in Azure – A native public cloud or hybrid cloud scenario where an application development team within a customer environment wants to take advantage of Azure capabilities to host their application outside of services managed by Enterprise IT.
  • Application Division (Business Unit IT) hosting their services in Azure - A native public cloud or hybrid cloud scenario where an application development division within a customer environment wants to take advantage of Azure capabilities to host their suite of applications and development/test environment outside of Enterprise IT.
  • Enterprise IT extending their datacenter infrastructure to Azure - Typically a hybrid cloud scenario where a mature Enterprise IT organization wants to extend their existing physical or virtual environment to Azure to support the large number of growing IT requirements for their organization and its customers.
  • Organization without on-premises IT hosting all infrastructure in Azure - Typically a hybrid cloud scenario where a startup, divesting, or enterprise organization with a distributed workforce is looking to provide traditional IT services without an on-premises infrastructure.


Cloud Solution Provider-Managed Models

  • Cloud Solution Provider - Connect Through – A Cloud Solution Provider (CSP) public cloud scenario where a CSP provides cloud services and/or hosts, and directly manages, customer application workloads deployed on top of Azure services. In the "Connect-Through" model, the customer consumes the provider's cloud services delivered via the provider's network with end services hosted in the provider's provisioned Azure subscription. The Azure subscription is created, owned, and managed by the service provider. The following diagram illustrates the Connect Through model:

  • Cloud Solution Provider - Connect To – A Cloud Solution Provider (CSP) public cloud scenario where a CSP hosts and manages customer application workloads deployed on top of Azure services. In the "Connect-To" model, the provider makes cloud services accessible directly to the customer's network. The Azure subscription is created, owned, and managed by the service provider, but the customer consumes cloud services by interacting directly with the Azure cloud footprint. The following diagram illustrates the Connect To model:

When choosing a management approach for consuming Azure services, the decision is driven by how much management the customer wants to deliver versus how much the cloud service provider will deliver, as well as by the connectivity approach to Azure. The figure below provides a comparison view of management responsibility based on the CSP scenarios described above. It's important to note that CSP models provide both built-in and optional services that the customer can select from.

When it comes to planning, designing, and consuming Azure services, these categories are complementary in some respects. In other respects, they have the potential to create divergent paths.

A key consideration to remember is that within any organization or project, developers consume infrastructure. Similarly, infrastructure is deployed to support applications and services. Understanding the needs of both is important to developing an Azure subscription model that satisfies the needs of the project or organization.

Azure Reference Architecture Guide Use

As outlined previously, the Azure Reference Architecture guide provides the basis for the decisions that must be considered as part of any project that encompasses a solution design using Azure services. The design of the solution should leverage the architecture design patterns (infrastructure, foundation, and solution) described later in this document.

The Azure Reference Architecture guide does not outline a single Azure design for hybrid enterprise solutions. Rather, it provides a comprehensive framework for decisions based on the core Microsoft Azure services, features, and capabilities required by most solutions. The guide is structured to cover each of the broader topic areas outlined previously, and it uses the following framework for each component:

  1. Technology Definition – Each topic area has a brief section that outlines "what" the technology is and general information about its role within the Microsoft Azure service. Given the rate of changes to the Azure service, all applicable references to product documentation for the feature set or capability are provided.
  2. Design Principle – Some features and capabilities within Microsoft Azure are critical to any solution design, and they require that decisions be made by organizations as part of their project. When a specific topic area requires a decision or has a recommended practice, a clearly defined rule for that technology or design principle is identified. This is a useful component in recording key design decisions as part of the project.
  3. Design Guidance – For each technology topic area, this section provides key information, considerations, and a potential design model to help organizations understand design constraints and recommended practices towards implementing this feature or capability within Microsoft Azure.

A sample topic area is outlined here to illustrate this relationship:

Figure 2: Azure Reference Architecture Sample Topic Area

Rule Set Criteria

Rule set requirements are vendor-agnostic and are categorized as one of the following:

Mandatory: Mandatory recommended practice or area that is critical towards building solutions and services within Microsoft Azure. These requirements are necessary for alignment with the reference.

Recommended: A standard practice or area that is strongly advised when developing a solution or service within Microsoft Azure. However, implementation is at the discretion of each customer and is not required for alignment with the Azure Reference Architecture.

Optional: Optional recommended practice. These requirements are voluntary considerations that can be implemented in the solution or service being developed in Microsoft Azure and can be followed at the discretion of each customer.
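
As an illustration only, the rule set categories above could be recorded alongside a project's design decisions as sketched below. The names RuleLevel, DesignRule, is_aligned, and outstanding, as well as the sample rule identifiers, are hypothetical and do not come from this guide. The only behavior taken from the text is that Mandatory rules alone determine alignment with the reference.

```python
from dataclasses import dataclass
from enum import Enum


class RuleLevel(Enum):
    """The three rule set categories described in this section."""
    MANDATORY = "Mandatory"
    RECOMMENDED = "Recommended"
    OPTIONAL = "Optional"


@dataclass(frozen=True)
class DesignRule:
    rule_id: str        # hypothetical identifier, for illustration only
    level: RuleLevel
    implemented: bool


def is_aligned(rules):
    """Return True when every Mandatory rule has been implemented."""
    return all(r.implemented for r in rules if r.level is RuleLevel.MANDATORY)


def outstanding(rules, level):
    """List the identifiers of rules at the given level that are not implemented."""
    return [r.rule_id for r in rules if r.level is level and not r.implemented]


# Hypothetical sample decisions recorded for a project.
sample_rules = [
    DesignRule("SUB-01", RuleLevel.MANDATORY, True),
    DesignRule("NET-02", RuleLevel.RECOMMENDED, False),
    DesignRule("MON-03", RuleLevel.OPTIONAL, False),
]
```

In this sketch, unimplemented Recommended or Optional rules are reported but never block alignment, which mirrors the customer discretion described above.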

Azure Architecture Patterns

Both public and private cloud environments provide common elements to support running complex workloads. Although these architectures are relatively well understood in traditional on-premises physical and virtualized environments, the constructs found within Microsoft Azure require additional planning to rationalize the infrastructure and platform capabilities found within public cloud environments.

To support the development of a hosted application or service in Azure, a series of patterns are required to outline the various components and to compose a given workload solution. These architectural patterns fall within the following categories:

  • Infrastructure – Microsoft Azure is a platform that provides Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) services, and it is comprised of several underlying services and capabilities. These services largely can be decomposed into compute, storage, and network services. However, there are several capabilities that may fall outside of these definitions. Infrastructure patterns detail a functional area in Microsoft Azure that is required to provide a service to one or more solutions hosted within an Azure subscription.
  • Foundation – When composing a multi-tiered application or service within Microsoft Azure, several components must be used in combination to provide a suitable hosting environment. Foundation patterns compose one or more services from Microsoft Azure to support a layer of functionality within an application. This may require the use of one or more of the components described in the infrastructure patterns outlined previously. For example, the presentation layer of a multi-tier application requires compute, network, and storage capabilities within Azure to become functional. Foundation patterns are meant to be composed with other patterns as part of a solution.
  • Solution – Solution patterns are composed of infrastructure and/or foundation patterns to represent an end application or service that is being developed. It is assumed that complex solutions are not developed independently of other patterns. Rather, they should utilize the components and interfaces defined in each of the pattern categories outlined previously.

This spectrum of patterns is illustrated in the following model.

Figure 3: Azure Architecture Model

Architectural patterns for cloud-hosted workloads (applications and services) should generally adhere to this model, and complex scenarios can be implemented using one or more of the pattern types outlined previously. To learn more about the Azure architectural patterns, see Cloud Platform Integration Framework (Azure Architecture Patterns).

The following diagram illustrates how they can be composed to define a solution, application, or service in Microsoft Azure.

Microsoft Azure Overview

Microsoft Azure Services

What is Azure? In short, it's the Microsoft public cloud platform. Microsoft Azure includes a growing collection of integrated services (compute, storage, data, networking, and applications) that help customers move faster, do more, and save money. With Microsoft Azure, you can build an infrastructure, develop modern applications, gain insights from data, and manage identity and access.

Azure offers dozens of different services in the cloud. These services include all of the commonly referenced cloud computing models:

  • Software as a Service (SaaS)
  • Infrastructure as a Service (IaaS)
  • Platform as a Service (PaaS)

These models can be combined and integrated to build complex, robust solutions for any audience and use case.

Available Azure Services

The availability of Azure services varies by region and whether the service is currently in Preview or is Generally Available (GA). For up-to-date information about service availability in each datacenter, see the Services by region page. Determining which services are available is a key consideration when deploying applications or enabling services within Azure.

The concept of Azure regions will be covered later, but consider the sample webpage that follows. The area outlined in red serves as an example of a customer-selected region. Using this example, if the requirement was to deploy a solution within the South Central US Azure region, the solution would be constrained from using G-Series virtual machines (currently in Preview and covered in the Compute IaaS section of this document). Conversely, if the solution required G-Series virtual machines, the organization would need to select an Azure region that supports that feature or service.
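
The constraint described above can be expressed as a simple filter. The following is a minimal sketch, assuming the availability information has already been copied by hand from the Services by region page into a dictionary. The function name regions_supporting, the region names "Example Region X" and "Example Region Y", and the sample data are hypothetical and do not reflect actual Azure service availability.

```python
def regions_supporting(required_services, availability):
    """Return the regions (sorted by name) that offer every required service.

    availability maps a region name to the set of services offered there.
    """
    required = set(required_services)
    return sorted(
        region
        for region, services in availability.items()
        if required <= set(services)
    )


# Hypothetical snapshot for illustration; not real availability data.
sample_availability = {
    "South Central US": {"Virtual Machines", "Storage"},
    "Example Region X": {"Virtual Machines", "Storage", "G-Series Virtual Machines"},
    "Example Region Y": {"Virtual Machines", "Storage"},
}

# A solution that needs G-Series virtual machines is limited to the
# regions returned here.
candidates = regions_supporting(
    ["Virtual Machines", "G-Series Virtual Machines"], sample_availability
)
```

An empty result signals that no single region in the snapshot meets the requirement, in which case either the service selection or the region requirement has to change.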

When deploying solutions in Azure and planning Azure subscription models for your organization, consider the following questions:

  • What services are needed to support your solution?
  • From where are your customers accessing the solution?
  • Do internal or external users (or both) need access to your solution?
  • Do you require geographic redundancy, and do both of your selected regions support the service and feature sets used in your solution?

The answers to these questions will help govern your decisions about regions and service consumption in Microsoft Azure. For additional details about each Azure services offering, refer to Directory of Azure Cloud Services.

Cloud Computing Models

The United States National Institute of Standards and Technology (NIST) published Special Publication (SP) 800-145, "The NIST Definition of Cloud Computing," to provide a clear definition of cloud computing to United States government agencies. Since its release, it has become an unofficial standard in the computing industry when it comes to defining cloud models.

Using the definitions provided in NIST SP 800-145, Microsoft Azure (and other online properties, such as Office 365) is classified as a Public cloud offering because it is owned and managed by Microsoft and is open for use by the general public. Within Microsoft Azure and other cloud solutions, Microsoft provides Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS) capabilities.

Infrastructure as a Service (IaaS)

NIST SP 800-145 defines Infrastructure-as-a-Service (IaaS) as "The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications; and possibly limited control of select networking components."

Deploying an application in an IaaS environment provides the most flexibility that Azure has to offer. As with any deployment choice, there are pros and cons to consider. The greatest benefit of an IaaS implementation is the degree of control it offers, from the operating system up through how access to the application is managed.

IaaS is most like traditional IT delivery. Customers provision their own virtual machines, define their own networks, and allocate their own virtual hard disks. IaaS shifts the burden of operating datacenters, virtualization hosts, and hypervisors from the enterprise to the service provider, along with the business continuity and disaster recovery infrastructure.

Platform as a Service (PaaS)

NIST SP 800-145 defines Platform-as-a-Service (PaaS) as "The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment."

With PaaS applications, many of the layers of management are removed, and more flexibility is provided than with an application running on IaaS instances. Specifically, there is no need to manage the operating system, including patching, which reduces some of the complexity of designing the deployment.

A significant benefit of deploying an application in a PaaS environment is the ability to quickly and automatically scale up the application to meet demand when traffic is high and, conversely, to scale down when demand is lower. Deploying an application in the PaaS model is very cost effective from a scalability and manageability perspective.
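
To make the scaling behavior concrete, the following minimal sketch computes how many instances a demand level would call for. It is not how Azure autoscale is configured; the function name desired_instances, its parameters, and the default limits are assumptions chosen for illustration.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Return an instance count that covers the demand, within fixed limits.

    capacity_per_instance is the assumed load one instance can serve.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Clamp so the application never drops below the floor or exceeds the cap.
    return max(min_instances, min(max_instances, needed))
```

With an assumed capacity of 100 requests per second per instance, a load of 450 requests per second calls for 5 instances, and the count falls back to the minimum when traffic subsides.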

PaaS extends IaaS further by providing multitenant services that customers subscribe to. Platform services are a transformational computing model that can dramatically reduce the costs and increase the agility of delivering applications to end users internally and externally. PaaS users bring their own application code but leverage robust platforms, which they do not need to maintain.

Software as a Service (SaaS)

NIST SP 800-145 defines Software-as-a-Service (SaaS) as "The capability provided to the consumer is to use the provider's applications running on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser (e.g., web-based email), or a program interface. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings."

Choosing an Azure SaaS offering places the least responsibility on the customer. At the same time, it provides less flexibility than an IaaS or PaaS approach.

SaaS is the real promise of cloud computing. By integrating applications from one or multiple vendors, customers need to bring only their data and configurations. They can eliminate the costs of building and maintaining applications and platform services and still deliver the secure, robust solutions to the end users.

Hybrid

Many organizations need to implement a blend of Azure offerings to meet their business and application requirements. The following diagram highlights the main differences, from a manageability perspective, among public cloud SaaS, PaaS, and IaaS and on-premises implementations.

This is important to understand when making a decision about implementation, because each offering has a different impact on the cost, security, scalability, and staff needed to maintain the application or environment.
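
The manageability split can also be written down as data. The sketch below is a simplification derived only from the NIST SP 800-145 definitions quoted earlier in this section; the layer names, the CUSTOMER_MANAGED mapping, and the function provider_managed are assumptions for illustration and may not match the layers shown in the diagram.

```python
# Simplified layers, ordered from the application down to the network.
LAYERS = ("applications", "operating systems", "storage", "servers", "network")

# Layers the customer manages under each model, per the NIST quotes above.
# IaaS: control over operating systems, storage, and deployed applications.
# PaaS: control over the deployed applications only.
# SaaS: at most limited user-specific configuration, treated here as none.
CUSTOMER_MANAGED = {
    "On-Premises": set(LAYERS),
    "IaaS": {"applications", "operating systems", "storage"},
    "PaaS": {"applications"},
    "SaaS": set(),
}


def provider_managed(model):
    """Return the layers the provider manages for a model, in LAYERS order."""
    customer = CUSTOMER_MANAGED[model]
    return [layer for layer in LAYERS if layer not in customer]
```

Reading the mapping from On-Premises to SaaS shows the customer-managed set shrinking at each step, which is the trade-off between control and operational effort described above.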

Azure Datacenter Model

Like most cloud computing services, Microsoft Azure's cloud computing capacity and capabilities are delivered at hyper scale across a series of well-connected global datacenters. These datacenters are represented in constructs such as Azure regions, which are intended to be easily understood by customers and can be easily consumed based on customer needs. This section will review the Azure datacenter model and provide an overview of the constructs established for their use by customers.

Global Datacenter Presence

Microsoft Azure is deployed around the world in strategic areas that best meet the demand of customers. These areas are known as regions, and are placed at distances greater than 300 miles from each other to help avoid the possibility that a common natural disaster would affect more than one region at a time.

Azure Regions

When it comes to deploying an application or service in Microsoft Azure, there needs to be an understanding about the following:

  • What is a Region?
  • Where are the regions located?
  • What are the capabilities of regions with respect to each other?
  • Which features are in Preview and which are Generally Available within a region?
  • Are there any restrictions on deploying to a region? For example:
    • Legal compliance
    • Government regulations

Microsoft Azure is a worldwide network of distributed datacenters that are strategically located around the world to support Microsoft Azure customers. This global presence of datacenters provides Microsoft customers with the ability to deploy an application or service in any datacenter in the world or in multiple datacenters. Whether a customer is a small company or a major corporation, all the services Azure has to offer in that particular region can be consumed.

Locations

For a list of Azure datacenter locations, see Azure Regions.


Azure operates out of 17 regions around the world. Geographic expansion is a priority for Azure, because it enables our customers to achieve higher performance and it supports their requirements and preferences regarding data location. The following table is provided as reference:

Azure Region           Location
Central US             Iowa
East US                Virginia
East US 2              Virginia
US Gov Iowa            Iowa
US Gov Virginia        Virginia
North Central US       Illinois
South Central US       Texas
West US                California
North Europe           Ireland
West Europe            Netherlands
East Asia              Hong Kong
Southeast Asia         Singapore
Japan East             Saitama Prefecture
Japan West             Osaka Prefecture
Brazil South           Sao Paulo State
Australia East         New South Wales
Australia Southeast    Victoria

It is very important to correctly choose a region or regions that meet your organization's needs. There are a number of elements to consider when choosing a region to deploy your applications and services:

  • Data
  • Location of service consumers
  • Service capability and availability
  • Network performance
  • Pricing
  • Redundancy for high availability

Data

Where Azure data is physically stored is very important to most customers. If the organization is restricted by government regulations or internal company policies about data storage and location, those restrictions need to be identified up front. Often there are restrictions about data export and Government Regulatory Compliance (GRC) for some data sets. This information needs to be understood before deploying any applications or services.

When you create a storage account, you select the primary region for the account. When enabling geographic replication of a storage account, the secondary region is determined based on the primary region, and it cannot be changed. The following table shows the current primary and secondary region pairings when geographically replicated storage is used:

Primary Region         Secondary Region
North Central US       South Central US
South Central US       North Central US
East US                West US
West US                East US
East US 2              Central US
Central US             East US 2
North Europe           West Europe
West Europe            North Europe
Southeast Asia         East Asia
East Asia              Southeast Asia
East China             North China
North China            East China
Japan East             Japan West
Japan West             Japan East
Brazil South           South Central US
Australia East         Australia Southeast
Australia Southeast    Australia East
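
Because the secondary region is fixed by the primary region, the pairings listed above can be captured as a lookup table when planning storage accounts. The sketch below copies the pairs from this section, writing the region as "East US 2" to match the locations table; the names GEO_PAIRS, secondary_region, and one_way_pairs are hypothetical. The pairings may have changed since this table was compiled.

```python
# Primary region -> secondary region, as listed in this section.
GEO_PAIRS = {
    "North Central US": "South Central US",
    "South Central US": "North Central US",
    "East US": "West US",
    "West US": "East US",
    "East US 2": "Central US",
    "Central US": "East US 2",
    "North Europe": "West Europe",
    "West Europe": "North Europe",
    "Southeast Asia": "East Asia",
    "East Asia": "Southeast Asia",
    "East China": "North China",
    "North China": "East China",
    "Japan East": "Japan West",
    "Japan West": "Japan East",
    "Brazil South": "South Central US",
    "Australia East": "Australia Southeast",
    "Australia Southeast": "Australia East",
}


def secondary_region(primary):
    """Return the secondary region paired with the given primary region."""
    if primary not in GEO_PAIRS:
        raise ValueError(f"No pairing listed for region: {primary}")
    return GEO_PAIRS[primary]


def one_way_pairs():
    """Return primaries whose secondary does not replicate back to them."""
    return [p for p, s in GEO_PAIRS.items() if GEO_PAIRS.get(s) != p]
```

In this table, Brazil South is the only one-way pairing: it replicates to South Central US, whose own secondary is North Central US. This matters when data location requirements apply to the secondary copy as well as the primary.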

Service Capability and Availability

As described earlier, not all Azure regions are equal when it comes to available capabilities and services. Azure typically releases a new feature in Preview first, and the feature may be available only in certain regions before it becomes Generally Available (GA).

Before deploying an Azure service, verify which services are available in your selected region or regions: Services by region.

Network Performance

The network topology of the Internet is complex when looking at bandwidth and routing. Routes from one endpoint to another are not always predictable, and traffic passes through different ISPs en route. It is best to validate the latency between the customer location and the candidate Microsoft Azure regions, and then choose the region with the lowest latency, which will provide the best performance from a networking perspective.
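
Once latency has been measured from the customer location, the comparison itself is straightforward. The following is a minimal sketch, assuming the round-trip times were collected separately with a tool of your choice; the function name lowest_latency_region and the sample figures are hypothetical. The median is used so that a single slow sample does not distort the comparison.

```python
from statistics import median


def lowest_latency_region(samples_ms):
    """Return the region with the lowest median round-trip time.

    samples_ms maps a region name to a list of measurements in milliseconds.
    Regions with no measurements are ignored.
    """
    medians = {
        region: median(values)
        for region, values in samples_ms.items()
        if values
    }
    if not medians:
        raise ValueError("no latency samples provided")
    return min(medians, key=medians.get)


# Hypothetical measurements for illustration; not real latency data.
sample_measurements = {
    "West Europe": [24.0, 26.5, 25.1],
    "North Europe": [31.2, 30.8, 95.0],
    "East US": [88.4, 90.1, 87.9],
}
best = lowest_latency_region(sample_measurements)
```

Latency is only one of the selection criteria listed in this section, so the result should be weighed against data location, service availability, and pricing.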

Pricing

The costs associated with services within the different Azure regions are not necessarily the same. The cost for Azure services is controlled by many factors. If latency and GRC are not influencing the architectural design of the application or service, it may be best to deploy to the region with the lowest costs. Please refer to the following site for the pricing details of each service provided by Microsoft Azure: Azure Pricing.

Geographic Redundancy and High Availability Across Datacenters

One way to reduce the impact of a datacenter or regional service outage is to place these applications and services in multiple regions. Placing a web application or service in multiple Azure regions and tying those services together with Traffic Manager provides the required redundancy to keep the service running.

When an outage happens in one of the regions, the required high availability components will already be in place and the services will remain available to end users. Establishing virtual network-to-virtual network VPNs between datacenters to route data and infrastructure services is another way to support enterprises with high availability.
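
The multi-region approach can be pictured as a failover-style routing decision: send traffic to the first deployment, in priority order, that is currently healthy. The sketch below illustrates that idea only. It is not the Traffic Manager API or its configuration; the function pick_endpoint, the health check callback, and the sample host names are hypothetical.

```python
def pick_endpoint(endpoints, is_healthy):
    """Return the first healthy endpoint in priority order, or None.

    endpoints is a list ordered from most to least preferred.
    is_healthy is a callable that reports whether an endpoint is serving.
    """
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Hypothetical deployments of the same web application in two regions.
deployments = ["app-northeurope.example.com", "app-westeurope.example.com"]

# Simulate an outage in the preferred region.
currently_healthy = {"app-westeurope.example.com"}
active = pick_endpoint(deployments, lambda e: e in currently_healthy)
```

A result of None means that no region is serving traffic, which is the situation the second region is intended to prevent.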

Using Regions and Affinity Groups

Affinity Groups tell the Azure Fabric Controller that two or more Azure virtual machines should always be placed together or close to one another within a cluster of compute resources. In the past, it was a requirement to have an affinity group associated with a virtual network. Recent architectural improvements have removed this prior requirement, and it is no longer recommended to use affinity groups in general for virtual networks or virtual machines.

Virtual Machine Considerations

Although affinity groups are not generally recommended for virtual machines, one scenario may still call for them: when the absolute lowest network latency between the virtual machines is required. Associating a virtual machine with an affinity group ensures that all virtual machines in the affinity group are in the same compute cluster or scale unit.

Although it may be necessary to use an affinity group when configuring a virtual machine to ensure the least amount of latency, the following drawbacks can be difficult to change later should they occur:

  • Limitation of virtual machine sizes available on the compute scale unit associated with the affinity group
  • Higher probability of not being able to allocate a new virtual machine (caused by the scale unit of the affinity group being out of capacity)

The link between the virtual machine and the affinity group is the cloud service rather than the virtual machine alone. Should capacity issues or the inability to resize an existing virtual machine to a larger size occur, it is necessary to:

  • Remove the virtual machine and import to a new cloud service associated with the region.
  • Remove all virtual machines from the existing cloud service before deleting and re-creating the cloud service to reference the region rather than the affinity group.

Removing virtual machines from an affinity group is not straightforward, which is a further reason to associate deployments with a region rather than an affinity group unless the lowest possible latency is required.

For more information about the current guidance for affinity groups and specifics regarding virtual networks and virtual machines, see How to migrate from Affinity Groups to a Regional Virtual Network.

Virtual Network Considerations

In May of 2014, the ability to create a virtual network that can span the entire region (datacenter) was introduced. Now when creating a new virtual network in the Azure portal, the only option is to associate the virtual network to a location rather than to an affinity group.

A regional virtual network is required for many of the newer Azure features, including internal load balancers. Customers with an affinity group virtual network need to request support to migrate the virtual network to a regional type.

Alternatively, you can create a new virtual network that is associated with the region and then migrate the existing deployments from the affinity group virtual network.

For more information, see: Regional Virtual Networks.

Datacenter Architecture

As described earlier, Azure hosts its services in a series of globally distributed datacenters. These datacenters are grouped together in regions, and datacenters within a given region are divided into "clusters," which host services. This interaction is outlined in the following diagram:

Within each datacenter, the racks of equipment are built to be fault tolerant with respect to networking, physical host servers, storage, and power. The physical host servers are placed in high availability units called clusters. The cluster configurations are spread across multiple server racks.

A single rack is referred to as a Fault Domain (FD), and it can be viewed as a vertical partitioning of the hardware. The fault domain is considered the lowest common denominator within the datacenter for fault tolerance. Microsoft Azure can lose a complete rack, and the hosted services can continue unaffected.

A second partition within the datacenter is called the Upgrade Domain (UD) and it can be viewed as a set of horizontal stripes passing through the vertical racks of fault domains. Upgrade domains are used to deploy updates (security patches) within Azure without affecting the availability of the running services within the Azure fabric. The following diagram shows a high-level relationship between fault domains and update domains in the Azure datacenters.

Virtual machines are placed in specific fault domains and update domains based on the location of respective virtual machines in the same Availability Set. For more information about properly configuring availability sets, refer to the Compute (IaaS) section.
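The relationship between fault domains and upgrade domains can be illustrated with a small simulation. The following is a minimal sketch only: the round-robin assignment, the default counts of 2 fault domains and 5 upgrade domains, and all function and virtual machine names are assumptions made for illustration and do not describe the actual placement logic of the Azure Fabric Controller.

```python
def place_in_availability_set(vm_names, fault_domains=2, upgrade_domains=5):
    """Assign each VM a (fault_domain, upgrade_domain) pair in round-robin order.

    The domain counts and the round-robin strategy are illustrative assumptions.
    """
    return {
        name: (index % fault_domains, index % upgrade_domains)
        for index, name in enumerate(vm_names)
    }


def running_after_rack_failure(placement, failed_fault_domain):
    """Return the VMs that stay up when one rack (fault domain) is lost."""
    return sorted(
        name for name, (fd, _ud) in placement.items() if fd != failed_fault_domain
    )


def running_during_update(placement, updating_upgrade_domain):
    """Return the VMs that stay up while one upgrade domain is being patched."""
    return sorted(
        name for name, (_fd, ud) in placement.items() if ud != updating_upgrade_domain
    )


# Hypothetical availability set with four web servers.
placement = place_in_availability_set(["web0", "web1", "web2", "web3"])
```

With this placement, losing a single rack or patching a single upgrade domain never takes every virtual machine offline at once, which is the property an Availability Set is intended to provide.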

Server Optimization

Servers reside within each Azure datacenter. The servers are divided into clusters, which then are partitioned by the Azure Fabric Controller to deliver a given service. This relationship is outlined in the following diagram:

Additional details about virtual machine compute instances are provided later in this document; however, a brief overview is provided here to give a general understanding of Azure compute services.

Concerning compute sizes for Azure IaaS (virtual machines), Azure currently has three series: A, D, and G. Each series has different characteristics. For example, the D series offers up to 800 GB of temporary SSD storage, while the G series machines are the largest and offer the highest performance.

The following article has details about each series and examples where decisions have been made based on scenarios: Azure A-SERIES, D-SERIES and G-SERIES: Consistent Performances and Size Change Considerations.

With the newer D and G series virtual machines, the temporary drive (D:\ on Windows, /mnt or /mnt/resource on Linux) is a local SSD. This high-speed local disk is best used for workloads that replicate across multiple instances, such as MongoDB, or for workloads that can leverage this high I/O disk for local and temporary cache, such as the Buffer Pool Extensions in SQL Server 2014.

Note: These drives are not guaranteed to be persistent. Thus, although physical hardware failure is rare, when it occurs, the data on this disk may be lost, unlike your operating system drive and any attached durable disks that are persisted in Azure Storage.

Also available when using premium storage are DS series virtual machines, which offer high-performance, low-latency disk support for I/O-intensive workloads. The underlying disks for DS series virtual machines are SSDs rather than HDDs, and they can achieve up to 64,000 IOPS.

The following tables list the details of the D Series and G Series virtual machines.

D Series

General Purpose Sizes

Name          vCores  Memory (GB)  Local SSD (GB)  Max Persistent Data Disks
Standard_D1   1       3.5          50              2
Standard_D2   2       7            100             4
Standard_D3   4       14           200             8
Standard_D4   8       28           400             16

Memory Intensive Sizes

Name          vCores  Memory (GB)  Local SSD (GB)  Max Persistent Data Disks
Standard_D11  2       14           100             4
Standard_D12  4       28           200             8
Standard_D13  8       56           400             16
Standard_D14  16      112          800             32

G Series

Name          vCores  Memory (GB)  Local SSD (GB)  Max Persistent Data Disks
Standard_G1   2       28           412             4
Standard_G2   4       56           824             8
Standard_G3   8       112          1,649           16
Standard_G4   16      224          3,298           32
Standard_G5   32      448          6,596           64
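The size tables above can be used programmatically when matching a workload to a virtual machine size. The following is a minimal sketch that encodes the D series and G series figures listed above and returns the smallest size that satisfies a set of minimum requirements. The function name, the parameter names, and the ordering rule (fewest cores first, then least memory, as a rough proxy for cost) are assumptions for illustration; actual pricing and regional availability are not modeled.

```python
from typing import NamedTuple, Optional


class VmSize(NamedTuple):
    name: str
    cores: int
    memory_gb: float
    local_ssd_gb: int
    max_data_disks: int


# Figures copied from the D series and G series tables in this section.
VM_SIZES = [
    VmSize("Standard_D1", 1, 3.5, 50, 2),
    VmSize("Standard_D2", 2, 7, 100, 4),
    VmSize("Standard_D3", 4, 14, 200, 8),
    VmSize("Standard_D4", 8, 28, 400, 16),
    VmSize("Standard_D11", 2, 14, 100, 4),
    VmSize("Standard_D12", 4, 28, 200, 8),
    VmSize("Standard_D13", 8, 56, 400, 16),
    VmSize("Standard_D14", 16, 112, 800, 32),
    VmSize("Standard_G1", 2, 28, 412, 4),
    VmSize("Standard_G2", 4, 56, 824, 8),
    VmSize("Standard_G3", 8, 112, 1649, 16),
    VmSize("Standard_G4", 16, 224, 3298, 32),
    VmSize("Standard_G5", 32, 448, 6596, 64),
]


def smallest_fit(min_cores=1, min_memory_gb=0, min_local_ssd_gb=0,
                 min_data_disks=0) -> Optional[VmSize]:
    """Return the smallest listed size meeting all minimums, or None.

    "Smallest" is an assumed ordering: fewest cores, then least memory,
    then least local SSD. It is only a rough stand-in for cost.
    """
    candidates = [
        size for size in VM_SIZES
        if size.cores >= min_cores
        and size.memory_gb >= min_memory_gb
        and size.local_ssd_gb >= min_local_ssd_gb
        and size.max_data_disks >= min_data_disks
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.cores, s.memory_gb, s.local_ssd_gb))
```

For example, a workload that needs at least 4 cores and 20 GB of memory resolves to Standard_D12 under these assumptions, while a request for more than 32 cores returns None because no listed size qualifies.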

Microsoft Azure Enterprise Operations

To operate your application, service, or infrastructure within Microsoft Azure, it is important to understand the roles, access methods, and components that make up a given organization's Azure environment. This section covers each of these areas at a high level.

Enterprise Roles and Portals

Roles

Within a given enterprise enrollment, Microsoft Azure has several roles that individuals play. These roles range from creating subscriptions (covered later in this document) to provisioning resources. The following top level roles exist within Azure:

Role: Enterprise Administrator
Quantity/Description: There may be multiple Enterprise Administrators per Enterprise Enrollment
Functions/Permissions:
  • Manage accounts and Account Owners
  • Manage Enterprise Administrators
  • View usage across all accounts
  • View unbilled charges across all accounts

Role: Account Owner
Quantity/Description: Each account requires a unique Microsoft Account or Organizational Account
Functions/Permissions:
  • Create and manage subscriptions – only the account owner is able to perform these functions
  • Manage Service Administrators and Co-Administrators
  • View usage for subscriptions
  • View account charges – if the Enterprise Administrator has provided access

Role: Service Administrator
Quantity/Description: A single Microsoft Account or Organizational Account may be used across subscriptions and between hierarchal levels
Functions/Permissions:
  • Access and manage subscriptions and development projects on the developer portal

A detailed breakdown of each role, how it is created, and the primary tool it uses is provided in the following table:

Role: Enterprise Administrator
How Created: First account created at on-boarding. Full access and visibility into all activity and resources of a corporate enrollment.
Primary Tool: https://ea.azure.com

Role: Departmental Administrator
How Created: Delegated by the Enterprise Administrator, this role is typically cost focused at the business unit level. Approves rolled up IT budgetary requests for multiple organizations. Can create and have visibility into multiple account owners. Consumption information can be rolled up and isolated at this level.
Primary Tool: https://account.windowsazure.com

Role: Account Owner
How Created: Delegated by a Departmental Administrator, this role typically is cost focused at the departmental or project level. Role creates the subscriptions and Service Administrators, and approves hardware and resource requests by project. Can create and have visibility into multiple Service Administrators and subscriptions.
Primary Tool: https://account.windowsazure.com

Role: Service Administrator
How Created: Owns a subscription at the resource level. Manages who can create and use IT resources; is solution and project delivery focused. Sets roles and responsibilities at the project level. Has visibility into a single subscription's consumption.
Primary Tool: https://manage.windowsazure.com

Role: Co-administrator
How Created: A resource administrator within a subscription that can manage provisioning and delegation of additional co-administrators. Project and resource focused.
Primary Tool: https://manage.windowsazure.com

Role: Resource Group Administrator
How Created: Manages a group of resources within a subscription that collectively provide a service and share a lifecycle. Single project or service focused. (Currently in Preview)
Primary Tool: https://manage.windowsazure.com

Portals

Microsoft Azure has several portals to support holistic management of the accounts, subscriptions, and features outlined in this document. The following sections cover available portals depending on the account management model:

Customer-Owned Models

When the Azure subscription is provisioned using the customer-owned account models described previously in this document, the customer's organization deploys and manages Azure workloads on their own. The following portal offerings are available for resource management:

Portal: Enterprise Portal
Location: https://ea.azure.com/
Purpose:
  • Manage access
  • Manage accounts
  • Manage subscriptions
  • View price sheet
  • View usage summary
  • Manage usage and lifecycle email notifications
  • Manage Authentication Type
    • Microsoft Account Only – for organizations using only Microsoft Accounts
    • Organizational Account – for organizations that have set up Active Directory in Azure or synchronized from an on-premises Active Directory using ADFS or Directory Synchronization (DirSync), and chose to add users with cloud-based Active Directory authentication
    • Organizational Account Cross Tenant – for organizations that want to add an Enterprise Azure user from an Active Directory tenant outside of their own
    • Mixed Account – for organizations that want to add a combination of Microsoft Account users and cloud-based Active Directory users

Portal: Account Portal
Location: https://account.windowsazure.com
Purpose:
  • Edit subscription details
  • Enroll in or enable Preview features

Portal: Management Portal
Location: https://manage.windowsazure.com or https://portal.azure.com (Preview)
Purpose:
  • Provision/deprovision Azure services
  • Manage co-administrators on subscriptions
  • Open support tickets for issues within the subscription

Note - any support ticket under a Premier Azure Support agreement should be opened using the Premier portal

Cloud Solution Provider-Managed Models

When the Azure subscription is provisioned by a Cloud Solution Provider (CSP) that manages end customer Azure subscriptions, the following portal offerings are available for resource management:

Portal: Partner Portal
Location: https://partnercenter.microsoft.com
Purpose:
  • Manage CSP customers
  • Manage CSP customer accounts
  • Manage CSP customer subscriptions
  • Manage user accounts assigned to CSP customer subscription administration
  • Retrieve billing data as a CSP on behalf of CSP customers
  • Administer services provisioned within CSP customer subscriptions
  • Create and Manage Azure service requests

Portal: Management Portal
Location:
  If managed by CSP (exact URLs are provided in customer subscription "Service management" view on Partner portal):
    Azure Active Directory Management:
    https://account.windowsazure.com/PremiumOffer/Index?offer=MS-AZR-0110P&returnUrl=https://manage.windowsazure.com/<-csp->.onmicrosoft.com#Workspaces/ActiveDirectoryExtension/Directory/<-guid->/directoryQuickStart
    Customer Azure resource management:
    https://portal.azure.com/<tenant>.onmicrosoft.com (Preview)
  When accessed by the end customer:
    https://manage.windowsazure.com or https://portal.azure.com (Preview)
Purpose:
  • Provision/de-provision Azure services
  • Manage co-administrators on subscriptions
  • Open support tickets for issues within the subscription

The following table provides a summary of portal access by role:

Role: Enterprise Administrator
Enterprise Portal: Yes
Account Portal: Yes – if account is also Account Owner
Management Portal: Yes – if account is also the Service Administrator or Co-administrator

Role: Account Owner
Enterprise Portal: Yes – limited access if provided by Enterprise Administrator
Account Portal: Yes
Management Portal: Yes – if account is also the Service Administrator or Co-administrator

Role: Service Administrator
Enterprise Portal: No
Account Portal: No
Management Portal: Yes

Partner Center Portal

Partner Center Portal is the primary destination for Cloud Solution Providers (CSPs) to onboard customers, resell first-party and third-party services, and manage customer services. It also provides access to billing data, powerful analytics, and tools that enable upsell and cross-sell for Cloud Solution Provider partners. The following sequence of steps demonstrates the onboarding process of a new customer to the Azure platform as a CSP-managed entity:

Additional Role Considerations

The following considerations are provided for the operational roles identified within this section:

  • By default, the Account Owner will be the Service Administrator on any new subscriptions. The Service Administrator can be updated to any other eligible ID in the Account Portal.
  • For subscriptions created by a Microsoft Account Owner, the Service Administrators and Co-Administrators must also be Microsoft Accounts. For subscriptions created by an Organizational Account, either account type (Microsoft Account or Organizational Account) may be used for the Service Administrators and Co-Administrators.
  • Discounted Offer Subscribers (MSDN, BizSpark, Microsoft Action Pack): When associating an ID that is receiving one of these listed benefits as an Account Owner, the benefit will be lost, but it can be recovered. This is not the case when associating an ID as an Enterprise Administrator or Co-Administrator on a subscription.
  • By default, each new subscription is named Microsoft Azure Enterprise. It is best practice to rename it to something more descriptive so that each subscription can be identified by name when you are managing them. See the following section about Azure subscriptions for information about subscriptions and naming conventions.
  • When you first activate an Enterprise Azure enrollment, we recommend that the customer request a concierge onboarding meeting so staff can provide an overview of Enterprise Azure and answer any questions. To request this onboarding session, use the following URL: http://aka.ms/AzureEntSupport. Choose the problem type Onboarding, and for the category choose Scheduling a Customer Onboarding Call.

Azure Subscriptions

What is a Subscription?

Initially, a subscription was the administrative security boundary of Microsoft Azure. With the advent of the Azure Resource Manager (ARM) model, a subscription now has two administrative models: Azure Service Management and Azure Resource Manager. With ARM, the subscription is no longer needed as an administrative boundary.

ARM provides a more granular Role-Based Access Control (RBAC) model for assigning administrative privileges at the resource level. RBAC is currently being released in stages with 22 new roles available at this time.

A subscription additionally forms the billing unit. Service charges currently accrue to the subscription. As part of the new Azure Resource Manager model, it will be possible to roll up costs to a resource group. A standard naming convention for Azure resource object types can be used to manage billing across project teams, business units, or other desired views.

A subscription is also a logical limit of scale by which resources can be allocated. These limits include hard and soft caps of various resource types (for example, 10,000 compute cores per subscription). Scalability is a key element for understanding how the subscription strategy will account for growth as consumption increases.
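Because each limit applies per subscription, projected consumption translates directly into a minimum subscription count. The following is a minimal sketch of that arithmetic. The 10,000 compute core figure is the example quoted above; the storage account limit, the resource keys, and the function name are placeholder assumptions for illustration and should be replaced with the limits that currently apply to your subscriptions.

```python
import math

# Per-subscription limits. The core limit is the example quoted in the text;
# the storage account limit is a placeholder assumption, not a documented value.
EXAMPLE_LIMITS = {
    "compute_cores": 10_000,
    "storage_accounts": 100,
}


def subscriptions_needed(projected_usage, limits=EXAMPLE_LIMITS):
    """Return the minimum number of subscriptions required so that every
    projected resource quantity fits within its per-subscription limit."""
    count = 1
    for resource, amount in projected_usage.items():
        limit = limits[resource]  # raises KeyError for an unknown resource type
        count = max(count, math.ceil(amount / limit))
    return count
```

In this sketch the most constrained resource type determines the result, which reflects why growth plans for every limited resource should be reviewed together when the subscription model is designed.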

Design Considerations

Assessment

One of the most critical items in the process of designing a subscription is assessing your current environment and needs.

It is critical to develop the Subscription, Network, Storage, Availability, and Administrative models together to have a cohesive approach. Understanding how each component is limited and how each impacts the others is critical to a solution that can scale and be flexible enough to support the needs of the business.

Specifically, it is important to have a thorough understanding of the following aspects:

Identify business requirements

  • Availability
  • Recoverability
  • Performance

Identify technical requirements

  • Is network connectivity a shared resource or dedicated to single use or group?
  • Are there Active Directory requirements?
  • Do you need to consider clustering, identity, or management tools?

Security requirements

  • Who are the subscription administrators?
  • Are the appropriate network connectivity and identity requirements being deployed?
  • Have you implemented a least privilege administrative model?

Scalability requirements

  • What are the growth plans?
  • How will limited resources be allocated?
  • How will the model evolve over time considering additional users, shared access, and resource limits?

Additional considerations

  • Is the subscription owned by the customer or managed by a cloud service provider?
  • Is an Office 365 Azure Active Directory tenant set up?
  • Are there plans for Office 365 enrollment?
  • Are there other Azure subscriptions in use?
  • Have you deployed a trial Azure subscription?
  • Have you run a trial Power BI evaluation?
  • Have you run an RMS evaluation?
  • Can you use the desired OrgID *.onmicrosoft.com for company directory?

Many of the early decisions in architecting and planning an Azure environment and related subscriptions can have an impact on future decisions and designs as the cloud environment grows. As such, it is important to have participation and input from many groups within an organization including networking, security, identity, domain administrators, and IT leadership.

Pulling in specific teams early and having an open dialogue about different perspectives provides a better design and implementation. It is better to expose objections early, when they can be dealt with thoroughly, than to discover them in the middle of a project, when they can have a negative impact on the schedule.

Following is an example subscription design based on a subscription per Organizational Unit.

Here is another example subscription design that is based on one subscription per environment in the development process of an application.

For CSP-managed scenarios, here is an example subscription design that illustrates a model of one or more subscriptions per customer, where a separate service deployment for a given customer may be assigned a dedicated subscription.

Administration

At its core, a subscription is a logical grouping of services and administration. It is the base unit of administrative granularity and it is used to track and bill service consumption.

Subscription Administrators have the ability to read and download anything stored in an Azure Storage account, including operating system VHDs, SQL Server data disks, and blobs.

Subscription Administrators can stop, start, provision and delete existing and new services.

Subscription Administrators can grant co-administrative access to new users.

All of these capabilities require careful consideration of who is given these rights in the subscription. The situation is comparable to that of domain administrators: the level of rights is high, so the people who hold them must be chosen carefully.

In CSP-specific scenarios, customer subscriptions are often created, owned, and managed by the service provider, who then designates administrative agents to manage customer subscription resources. In this scenario, the subscriptions are ARM-based subscriptions and require an RBAC model to control the access and management of the subscription and resources.

Recommended: The minimum number of users should be assigned as Subscription Administrators and/or Co-administrators.

Recommended: Use Azure Resource Manager RBAC whenever possible to control the amount of access that administrators have, and log what changes are made to the environment.

Connectivity

Adding network connectivity (whether using a site-to-site VPN or a dedicated ExpressRoute connection) brings additional considerations to the subscription requirements discussion.

The subscription is a required container to hold a virtual network, and often networking is a shared resource within an enterprise.

Site-to-site VPNs and ExpressRoute circuits require defining IP address ranges that do not overlap with on-premises ranges.

Site-to-site VPN connectivity requires setting up and configuring a public-facing gateway and VPN services at the corporate edge.

ExpressRoute connectivity is through a private connection from an on-premises datacenter to Azure through a service provider's private network. For more information, see the Microsoft Azure Networking section later in this document.
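The requirement noted above that Azure address ranges must not overlap with on-premises ranges can be checked before any virtual network is created. The following is a minimal sketch that uses the ipaddress module from the Python standard library. The function name and the sample address ranges in the usage note are illustrative assumptions, not values taken from this document.

```python
import ipaddress


def find_overlaps(azure_ranges, on_premises_ranges):
    """Return (azure_range, on_premises_range) pairs whose address spaces overlap.

    Both arguments are iterables of CIDR strings such as "10.1.0.0/16".
    """
    overlaps = []
    for azure_range in azure_ranges:
        azure_network = ipaddress.ip_network(azure_range)
        for on_premises_range in on_premises_ranges:
            on_premises_network = ipaddress.ip_network(on_premises_range)
            if azure_network.overlaps(on_premises_network):
                overlaps.append((azure_range, on_premises_range))
    return overlaps
```

As a usage example, a planned virtual network of 10.1.0.0/16 would be reported as conflicting with an on-premises range of 10.0.0.0/8, whereas 172.16.0.0/16 would not. An empty result indicates that the planned ranges are safe to use for site-to-site VPN or ExpressRoute connectivity with respect to the ranges supplied.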

Routing and firewall configurations are typically necessary when enabling connectivity. Administration and connectivity are often at odds with respect to autonomy and sharing resources, but when designing the subscription architecture for the enterprise, both must be part of the solution. Business requirements including availability and reliability will impact the network architecture, and subsequently, the subscriptions necessary to support that architecture.

Because a virtual network must exist inside a subscription, some constraints of a subscription also impact decisions made for virtual networks. For example, only 20 virtual networks can be attached to a single ExpressRoute circuit. Therefore, at most 20 subscriptions (each with a single virtual network) could be attached to that circuit.

In another scenario, if a design used 20 virtual networks within a single subscription, and ExpressRoute was used for connectivity to corporate network resources, there would be no way to attach another subscription to the same ExpressRoute, regardless of the bandwidth utilization on the circuit.
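The two scenarios above come down to the same check: the virtual networks of all subscriptions that share a circuit count against one limit. The following is a minimal sketch of that check. It uses the limit of 20 quoted in the text, which should be verified against current Azure limits; the function name and the subscription names in the usage note are assumptions for illustration.

```python
# Limit quoted in the text; verify against the current Azure service limits.
EXPRESSROUTE_VNET_LIMIT = 20


def remaining_vnet_links(vnets_by_subscription, limit=EXPRESSROUTE_VNET_LIMIT):
    """Return how many more virtual networks can be attached to one circuit.

    vnets_by_subscription maps a subscription name to the number of its
    virtual networks attached to the shared ExpressRoute circuit.
    """
    used = sum(vnets_by_subscription.values())
    if used > limit:
        raise ValueError(
            f"{used} virtual networks exceed the circuit limit of {limit}"
        )
    return limit - used
```

For instance, a hypothetical design with 12 virtual networks in a production subscription and 5 in a QA subscription leaves room for 3 more, while a single subscription that already uses all 20 leaves none for any other subscription, regardless of the bandwidth available on the circuit.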

If multiple virtual networks are to share a single enterprise ExpressRoute connection, essentially there is no network isolation between those networks. In this case, any separation the subscription design may try to define is eliminated and must be achieved through subnet-level Network Security Groups (NSGs). When the virtual networks are attached to the same ExpressRoute circuit, they are essentially a single routing domain.

A subscription hosting only PaaS services could have no virtual network at all, and the design limitations discussed above would not apply.

If a subscription will host a virtual network for on-premises connectivity and will not be used to host IaaS or PaaS resources, it can be inferred that the cost of the subscription with a virtual network is about one-tenth the cost of the ExpressRoute circuit.

Security and Identity

Identity services provided by an IaaS Active Directory, an Azure Active Directory tenant, or a customer OrgID tenant will have an impact on how security is implemented and subsequently on how that security impacts the number and configuration of subscriptions necessary.

Subscription Administrators have broad authority, and as such they must be considered administrators over all the resources in the subscription. If the subscription includes Azure Active Directory or IaaS domain controllers, or if it connects to domain controllers from an on-premises Active Directory, the Subscription Administrators and Co-administrators are also domain owners. They must be trusted individuals and treated like any domain administrator appropriate for that directory.

Productivity goals, single sign-on, and federation requirements impact identity services decisions, and subsequently, the supporting subscriptions.

Scale

Subscriptions form the scale limit in Azure. Many resources—from computing cores and storage accounts to reserved IP addresses—have quantity and size limitations based on the subscription.

When thinking about the subscriptions for an environment, it is important to think about how the design will scale if and when limits are reached.

In subscription discussions, a number of considerations determine the decisions made about the design. The number of connections that can be shared by a tunnel or circuit, bandwidth requirements, the source of identity, and the number of groups, users, and applications associated with a subscription are all important topics when considering scale.

Multiple Subscriptions Introduce Complexities

The use of a subscription as a security boundary may be considered when designing an Azure subscription model. A project requiring isolation should consider subscription administration very carefully. Some considerations for multiple subscriptions include:

  • A subscription on its own doesn't cost anything.
  • A subscription has its own administrators.
  • A subscription is accountable for its own consumption.

Complexities are introduced when you consider that the on-premises networking and security infrastructures are typically shared resources.

Patching, monitoring, and auditing are frequently provided by dedicated organizations, and staff is trained in the related tools. Business continuity and disaster recovery are almost always dependent on enterprise solutions to mitigate the cost.

An enterprise that allows Azure subscriptions to be based on a project or team could find itself:

  • Purchasing dedicated network circuits arbitrarily rather than for bandwidth need.
  • Supporting multiple edge gateway devices.
  • Increasing management of IP address space allocation.
  • Increasing management of routing and firewall configurations.
  • Duplicating services required, including monitoring, patching, and anti-virus.

If a business unit manages its own networking, operations, business continuity, and disaster recovery, or the use case is such that a dedicated VPN connection to on-premises resources is sufficient, this type of subscription model could work very efficiently.

Enterprise Model

The following diagram shows a robust enterprise Azure enrollment. There are multiple subscriptions, one of which is a "Tier 0" subscription used to host domain controllers and other sensitive roles when extending an on-premises Active Directory forest to Azure.

This is configured as a separate subscription to ensure that only administrators with domain administrator level privileges are able to exert administrative control over these sensitive servers through Azure subscriptions, while still allowing server administrators to manage virtual machines in other subscriptions.

QA and production networks share the same dedicated ExpressRoute circuit to on-premises resources. They are separated into distinct subscriptions to allow separation of access and to allow the QA subscription to scale on its own without impacting production.

This model will scale based on need. Second, third, and subsequent QA and production subscriptions can be added to this design without significant impact on operations. The same applies to network bandwidth—the circuit can be used until its limits are reached without any artificial limitations forcing additional purchases.

Subscriptions are the foundational building block of an Azure enterprise enrollment. The requirements for administration, operations, accountability, connectivity, scalability, and security shape the subscription model.

Note that multiple existing resource forests are depicted here only to show that some forests can be extended to Azure while others don't have to be. Microsoft does not recommend creating a separate resource forest for Azure-hosted resources as a security separation method.

This approach typically requires two-way trust relationships that negate any potential security isolation benefits and the organization will be left with increased operational overhead for no benefit. The use of Read Only Domain Controllers (RODCs) for Azure-hosted resources also offers no meaningful security benefits, while adding increased operational overhead.

Subscription Naming Convention Considerations

When naming the Microsoft Azure subscription, it is a recommended practice to be verbose. Try using the following format or a format that has been agreed upon by the stakeholders for the company.

<Company> <Department (optional)> <Product Line (optional)> <Environment>

  • Company, in most cases, would be the same for each subscription. However, some companies may have child companies within the organizational structure. These companies may be managed by a central IT group, in which case, they could be differentiated by having both the parent company name (Contoso) and child company name (North Wind).
  • Department is a name within the organization where a group of individuals work. This item within the namespace is optional because some companies may not need to drill into such detail due to their size. The company may want to use a different identifier.
  • Product line is a specific name for a product or function that is performed from within the department. As with the department namespace, this area is optional and can be swapped out as needed.
  • Environment is the name that describes the deployment lifecycle of the applications or services, such as Dev, Lab, or Prod.

The goal of a naming convention is to produce a meaningful name that describes the particular subscription and how it is represented within the company. Many organizations will have more than one subscription, which is why it is important to have a naming convention and use it consistently when creating subscriptions.

NOTE: In CSP-specific scenarios the naming convention can also incorporate the CSP identifier to mark the subscription as managed by a service provider.

This is simply an example naming convention to use as a base. Many of the decisions about the naming convention will come from the subscription model that is chosen.

The following table shows how a company might use the naming convention outlined previously.

Company     Department (OU)  Product Line  Environment  Full Name
Contoso     Services         Business      Dev          Contoso Services Business Dev
Contoso     Services         Business      Lab          Contoso Services Business Lab
Contoso     Services         Business      Prod         Contoso Services Business Prod
Contoso     Services         Consumer      Dev          Contoso Services Consumer Dev
Contoso     Services         Consumer      Lab          Contoso Services Consumer Lab
Contoso     Services         Consumer      Prod         Contoso Services Consumer Prod
North Wind  Databases        Business      Dev          North Wind Databases Business Dev
North Wind  Databases        Business      Lab          North Wind Databases Business Lab
North Wind  Databases        Business      Prod         North Wind Databases Business Prod
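The naming format described above can be applied consistently by generating names rather than typing them by hand. The following is a minimal sketch that assembles a name from the required and optional components. The function name and its parameters are assumptions for illustration; the format itself follows the <Company> <Department (optional)> <Product Line (optional)> <Environment> pattern given earlier.

```python
def subscription_name(company, environment, department=None, product_line=None):
    """Build a subscription name in the form
    <Company> <Department (optional)> <Product Line (optional)> <Environment>.
    """
    if not company or not company.strip():
        raise ValueError("company is required")
    if not environment or not environment.strip():
        raise ValueError("environment is required")
    parts = [company, department, product_line, environment]
    # Optional components that are missing or blank are skipped.
    return " ".join(part.strip() for part in parts if part and part.strip())


# Example using the Contoso values from the table.
example = subscription_name("Contoso", "Dev", department="Services",
                            product_line="Business")
```

Here example evaluates to "Contoso Services Business Dev". Omitting the optional components, as in subscription_name("Contoso", "Prod"), yields "Contoso Prod", which suits smaller organizations that do not need department or product line detail.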

Subscription Management

Azure AD Authentication

The recommended way to access your subscription when using Azure PowerShell is to authenticate by using the Add-AzureAccount PowerShell cmdlet. This cmdlet prompts for authentication in a window where you input your credentials that are associated with Azure Active Directory. You input either your Microsoft Account credentials or Org ID credentials that are associated with the Azure subscription.

Using this method of authentication even once with your subscription takes precedence over any management certificates you may have for your profile (for example, certificates imported by running the Import-AzurePublishSettingsFile cmdlet). To remove the Azure AD token and restore the management certificate method, use the Remove-AzureAccount cmdlet.

When using Azure AD authentication, occasionally you may see an error message: "Your credentials have expired. Please use Add-AzureAccount to log on again." To restore access to your subscription by using Azure PowerShell, simply run Add-AzureAccount again and authenticate.

This method of authenticating to the subscription is most convenient when working with commands or scripts interactively. It is possible to use this method with automated processes and pass secured credentials by using the -Credential parameter. However, at this time this method only works when you are using Org ID credentials, not Microsoft Account credentials.

Certificate Authentication

Management certificates are used to allow client devices to access resources within the Microsoft Azure subscription. The management certificates are X.509 v3 certificates that contain only a public key and have the .cer file extension.

If a user requires the ability to deploy or change services running in Microsoft Azure, but does not require access to the Microsoft Azure portal, they'll need a certificate. It is very common for a developer to deploy to Azure services through Visual Studio and they will require a certificate to accomplish this task.

The X.509 v3 certificates are mapped to one or more Azure subscriptions. The private keys associated with these certificates should be protected with the same level of security as passwords. If a certificate private key becomes compromised, whoever holds this key can perform actions on the subscriptions for which the certificate is valid.

At this time, an Azure subscription can import 100 certificates. Certificates can be shared across multiple subscriptions. There is also a 100 certificate limit for all subscriptions for a specific Service Administrator's ID.

There are a few ways to generate a certificate. You can create a self-signed management certificate or you can download a certificate from the Microsoft Azure portal as part of what is known as a Publish Settings file.

To create your own self-signed certificate, use makecert.exe. This is a command-line tool that ships with Visual Studio. Alternatively, if you have access to a computer running Internet Information Services (IIS), you can generate one from there.

Using the Publish Settings File

The Publish Settings file is an XML file that contains information about the Microsoft Azure subscriptions. The file contains specific information about all subscriptions associated with the user's Microsoft ID. These are the subscriptions for which the particular Microsoft ID is the Administrator or a Co-administrator. The Publish Settings file exposes your Azure subscription to be used with Visual Studio and Azure PowerShell.

To use Azure PowerShell within your environment, open an elevated Windows PowerShell console and complete the following steps:

  • Download the Publish Settings file (for example, by running Get-AzurePublishSettingsFile) and record the path used to save it, for example: C:\Users\ProfileName\Documents\AzurePublishSettingFile\YourFileName.publishsettings
  • Run Import-AzurePublishSettingsFile C:\Users\ProfileName\Documents\AzurePublishSettingFile\YourFileName.publishsettings
  • Run Get-AzureSubscription (this will give you the list of your subscriptions).
  • Find the "SubscriptionName" value.
  • Run Select-AzureSubscription "Your Subscription Name"

Now Azure PowerShell has set up a management certificate to interface with your Microsoft Azure subscription. To validate the association between Azure PowerShell and your subscription, execute the following Azure PowerShell cmdlet: Get-AzureLocation.

Some drawbacks of using management certificates for interfacing with an Azure subscription include:

  • It is difficult to manage and keep track of certificates in the portal.
  • You cannot ensure that access to a subscription is revoked when you remove a Co-administrator unless all the certificates are removed.
  • Management certificates might be shared without your knowledge.

Additional Setup

When using either authentication method in Azure PowerShell, some scripts, such as provisioning new virtual machines, will not function properly until you have associated a storage account with your subscription. To add this association, run the following script:

Set-AzureSubscription -SubscriptionName 'My Subscription Name' -CurrentStorageAccountName 'storageacctname001'

After running this script, you can verify that your storage account is now associated with the subscription by running Get-AzureSubscription. There should now be a value under CurrentStorageAccountName. You should only need to set this value once for most Azure PowerShell operations, and the value can be changed at any time by running Set-AzureSubscription again.

If you have multiple subscriptions, you also have to ensure that you are targeting the correct subscription with Azure PowerShell operations. A default subscription setting and a current subscription setting control this. When you load the Publish Settings file or use Add-AzureAccount with access to multiple subscriptions, one subscription is marked as both default and current. Any operation targets this subscription unless you change the focus. To redirect Azure PowerShell operations to a different subscription, run the Select-AzureSubscription cmdlet with the name of the subscription you want to target and the -Current switch. To permanently change the default subscription, use the -Default switch instead.

Development and Management Tools

A wide variety of tools is available for the development and management of Azure resources, including the Azure Management portal, Azure PowerShell, SDKs, and cross-platform and third-party downloads.

Software Development Kits (SDKs)

Following are examples of some SDKs that are available for download and the respective platforms the SDKs can be used to develop on. To get the SDKs and command-line tools you need, see the Microsoft Azure Downloads site.

| SDK | Available installs |
| --- | --- |
| .NET | VS 2015 install, VS 2013 install, VS 2012 install, Client libraries |
| Java | Windows install, Mac install, Linux install |
| Node.js | Windows install, Mac install, Linux install |
| PHP | Windows install, Mac install, Linux install |
| Python | Windows install, Mac install, Linux install |
| Ruby | Windows install, Mac install, Linux install |
| Mobile | iOS install, Android install, Windows Store C# install, Windows Store JS install, Windows Phone 8 install |
| Media | iOS SDK install, Flash OSMF install, Windows 8 install, Silverlight install, .NET SDK install, Java SDK install |

Azure PowerShell

You can use Windows PowerShell to perform a variety of tasks in Azure, either interactively at a command prompt or automatically through scripts. Azure PowerShell is a module that provides cmdlets to manage Azure through Windows PowerShell.

You can use the cmdlets to create, test, deploy, and manage solutions and services delivered through the Azure platform. In most cases, you can use the cmdlets to perform the same tasks that you can perform through the Azure Management portal. For example, you can create and configure cloud services, virtual machines, virtual networks, and web applications.

The module is distributed as a downloadable file and the source code is managed through a publicly available repository. A link to the downloadable files is provided in the installation instructions later in this topic. For information about the source code, see the Azure PowerShell code repository.

XPlat-CLI

The Azure Cross-Platform Command-Line Interface (Azure CLI, sometimes referred to as xplat-cli) provides a set of open source, cross-platform commands for working with the Azure platform. The Azure CLI provides much of the same functionality found in the Azure Management portal, such as the ability to manage websites, virtual machines, mobile services, SQL Server databases, and other services provided by the Azure platform.

The Azure CLI is written in JavaScript, and requires Node.js. It is implemented by using the Azure SDK for Node.js, and it is released under an Apache 2.0 license. To access the project repository, see Microsoft Azure Cross Platform Command Line.

Azure Service Limits Considerations

Subscriptions now exist for both the Azure Resource Manager (ARM) and Azure Service Management (ASM) models. Subscriptions have associated "hard" (upper boundary) and "soft" (default) limits for many of the Azure services, features, and capabilities. Many of the soft limits can be increased greatly by creating a support request, but some of the hard limits have a significant impact on decisions regarding subscription design. Limits for ASM-based subscriptions are based purely on the subscription and are cumulative across all regions. Limits for ARM-based subscriptions are typically based on the region that is being targeted in the subscription. Following are some of the hard limits in a subscription that have the greatest impact on design decisions.

| Azure Object | Limit |
| --- | --- |
| Virtual networks | 100 per subscription |
| Virtual machines | 10,000 CPU cores per subscription |
| ExpressRoute | 1 circuit across 20 subscriptions; 10 dedicated circuits per subscription |
| Cloud services | 200 per subscription |
| Network security groups | 100 per subscription |
| Storage accounts | 100 per subscription |
| Management certificates | 100 per Service Administrator |
| Co-administrators | 200 per subscription |
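
During subscription design, it can help to compare a planned deployment against these limits before deciding whether a single subscription is sufficient. The following Python sketch shows one way to do that. The dictionary keys, the function name exceeded_limits, and the sample plan are hypothetical, and the limit values are copied from the list above and will change over time.

```python
# Per-subscription hard limits taken from the list above (assumed current).
SUBSCRIPTION_LIMITS = {
    "virtual_networks": 100,
    "cpu_cores": 10000,
    "cloud_services": 200,
    "network_security_groups": 100,
    "storage_accounts": 100,
    "co_administrators": 200,
}


def exceeded_limits(planned):
    """Return {resource: (planned, limit)} for each planned count above its limit."""
    return {
        name: (count, SUBSCRIPTION_LIMITS[name])
        for name, count in planned.items()
        if name in SUBSCRIPTION_LIMITS and count > SUBSCRIPTION_LIMITS[name]
    }


# Hypothetical plan: only the storage account count is over the limit.
plan = {"virtual_networks": 40, "storage_accounts": 150, "cloud_services": 200}
print(exceeded_limits(plan))
```

A non-empty result suggests either requesting a limit increase, where one is available, or splitting the workload across additional subscriptions.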

For a more detailed and up-to-date list of Azure limits, see Azure Subscription and Service Limits, Quotas, and Constraints.

Azure Billing

There are differences in how billing is viewed and the available data in the output based on type of subscription. Billing information and details for pay-as-you-go subscriptions are viewed in the Usage and Billing Portal, whereas billing information for enterprise subscriptions is viewed in the account portal.

For either subscription type, the billing details can make it difficult to discern charges for the items listed if a specific naming convention is not being used. For more information about naming conventions, see the respective sections in this document, such as Virtual Machines, Virtual Networks, and Storage Accounts.

Billing Unit Conversions

An important item to note regarding the detailed billing, regardless of the subscription type, is that all standard virtual machine instances are converted into Small instance hours on the bill. For example, a Windows Extra Small (A0) instance that runs for 1 clock hour shows as ¼ hour when converted to Small instance hours. Similarly, any A-Series Cloud Service instance is converted into Small (A1) instance hours on the billing detail.

For a complete list of Windows and non-Windows conversions, see the Azure FAQ page titled "How do various instance sizes get billed?"

The following chart shows the details for each A-Series cloud service instance conversion:

| Cloud Services Instance | Clock Hours | Small Instance Hours |
| --- | --- | --- |
| Extra Small (A0) | 1 | ¼ hour |
| Small (A1) | 1 | 1 hour |
| Medium (A2) | 1 | 2 hours |
| Large (A3) | 1 | 4 hours |
| Extra Large (A4) | 1 | 8 hours |
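
The conversion is a simple multiplication of clock hours by a per-size factor. The following Python sketch applies the A-Series factors listed above. The dictionary and function names are hypothetical, and the 720-hour month is an assumed example workload.

```python
from fractions import Fraction

# Small instance hours billed per clock hour, keyed by A-Series size.
SMALL_HOURS_PER_CLOCK_HOUR = {
    "A0": Fraction(1, 4),
    "A1": Fraction(1),
    "A2": Fraction(2),
    "A3": Fraction(4),
    "A4": Fraction(8),
}


def small_instance_hours(size, clock_hours):
    """Convert clock hours for one instance into Small (A1) instance hours."""
    return SMALL_HOURS_PER_CLOCK_HOUR[size] * clock_hours


# One instance running for a 30-day month (720 clock hours).
print(small_instance_hours("A2", 720))  # 1440
print(small_instance_hours("A0", 720))  # 180
```

Fractions are used instead of floating-point values so that the ¼-hour factor for Extra Small instances stays exact.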

Pay-As-You-Go Usage Details

Following is an example of usage details for a pay-as-you-go subscription. Notice the Component column has the name of the resources and further highlights why a good naming convention is key.

In this particular subscription, there are very few applications or databases, so it is not difficult to keep track of them. However, if there were many applications following the naming style of app1, app2, app3, along with randomly generated database names, it would quickly become very difficult to decipher costs per application.

For a full list of usage details and the definition of each, refer to Understand your bill for Microsoft Azure.

Enterprise Usage Detail

Enterprise subscription billing details are slightly different. In the following example, you can see that a Component column also exists in the billing details; however, not all fields are the same as those in the previous subscription details.

The following table lists the fields for the enterprise usage details:

| Detail Fields | | |
| --- | --- | --- |
| AccountOwnerId | Day | ResourceQtyConsumed |
| AccountName | Year | ResourceRate |
| ServiceAdministratorId | Product | ExtendedCost |
| SubscriptionId | ResourceGUID | ServiceSubRegion |
| SubscriptionGuid | Service | ServiceInfo |
| SubscriptionName | ServiceType | Component |
| Date | ServiceRegion | ServiceInfo1 |
| Month | ServiceResource | ServiceInfo2 |
| AdditionalInfo | Tags | Store Service Identifier |
| Department Name | Cost Center | |

Billing Visibility and Rollup

Under Manage Access, it's possible to enable Department Administrators to see the costs associated with all accounts and subscriptions in their departments. You can also enable Account owners to see their costs.

Billing and ARM Tags

ARM tags can be used to group billing data. ARM-compliant Azure services allow tags to be defined and applied to resources so that billing usage can be organized by tag. For example, if a customer organization runs multiple virtual machines for different organizations, tags can be used to group usage by cost center. Alternatively, tags can be used to categorize costs by runtime environment, for example, the billing usage for virtual machines running in the production environment. Tags appear in billing and usage artifacts, such as usage CSV data or billing statements. For more information about ARM tags, see the respective section in this document. The following example illustrates a sample scenario utilizing tags.
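
Once usage data carries tags, rolling costs up by tag is a simple aggregation. The following Python sketch is a minimal illustration that assumes the usage rows have already been parsed into dictionaries. The Component, ExtendedCost, and Tags keys mirror the enterprise usage detail fields, while the tag names costCenter and environment, the sample rows, and the function name cost_by_tag are hypothetical.

```python
from collections import defaultdict


def cost_by_tag(usage_rows, tag_name):
    """Sum ExtendedCost per value of one tag; rows without the tag go to 'untagged'."""
    totals = defaultdict(float)
    for row in usage_rows:
        key = row.get("Tags", {}).get(tag_name, "untagged")
        totals[key] += row["ExtendedCost"]
    return dict(totals)


# Hypothetical usage rows, already parsed from a usage export.
usage = [
    {"Component": "vm-web-01", "ExtendedCost": 120.0,
     "Tags": {"costCenter": "1001", "environment": "prod"}},
    {"Component": "vm-web-02", "ExtendedCost": 80.0,
     "Tags": {"costCenter": "1001", "environment": "dev"}},
    {"Component": "vm-db-01", "ExtendedCost": 50.0,
     "Tags": {"costCenter": "2002", "environment": "prod"}},
    {"Component": "storage-misc", "ExtendedCost": 10.0, "Tags": {}},
]

print(cost_by_tag(usage, "costCenter"))
print(cost_by_tag(usage, "environment"))
```

Reporting the untagged total separately makes it easy to see how much spend is not yet covered by the tagging scheme.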

Billing and CSP Model

The Microsoft Cloud Service Provider (CSP) program allows service providers to own the complete customer lifecycle, including direct billing. The service providers are able to implement their own pricing and billing policies to create customer offers, set the price, and own the billing terms. CSP partners can automatically receive monthly invoices and billing statements and incorporate incurred costs into their billing accounting for value-added services they provide to their customers.

Microsoft Azure Pricing Models

Pricing for pay-as-you-go subscriptions is based on the Pricing details page. Each service is listed on an individual page, so it may be best to use the Pricing Calculator for the majority of cost estimating.

For MSDN subscribers, Partner Network, and BizSpark accounts, the pricing may differ from the pay-as-you-go model. For more information about these types of accounts, use the Member Offers page as a resource.

Enterprise account pricing differs based on commitment and other variables. The Licensing Azure for the Enterprise page reviews some of the benefits of this type of agreement. For more details about the Enterprise account pricing model, the Pricing Overview for Microsoft Azure in Enterprise Programs document is a great resource.

Azure Accounts and Access Permissions

Account Types and Directory Source

Azure Active Directory Accounts

Azure Active Directory (Azure AD) is the standalone directory service within Azure. Customers can create their administrative structure within Azure AD by defining their users and groups. This service can work on its own, because Azure AD can perform authentication without integrating with an on-premises directory.

On the other hand, organizations can choose to synchronize their users and groups from an on-premises Active Directory to Azure AD. This synchronization quickly makes resources within Azure available to on-premises users and groups.

All users who access the organization's Azure subscriptions are now present in the Azure AD directory with which the subscription is associated. This enables the company to manage what the users can access, or to revoke access to Azure by disabling the account in the directory.

Microsoft Accounts

The creation of Microsoft accounts is typically controlled by the users and not by the organization. With an Azure subscription, we recommend using Organization Accounts where possible to provide access to resources.

When creating Microsoft Accounts, we recommend establishing guidelines that will be used within the organization.

Do not allow the use of existing personal Microsoft Accounts. Depending on the individual permissions, these accounts may be tied to the company Azure subscriptions, and have access to storage accounts and billing information.

A Microsoft account is mapped to a person and it should be formatted to identify the user, for example: FirstName.LastName.xyz@outlook.com and not alias2763@outlook.com.

The reason for using specific naming for the Microsoft Accounts is that at the time the account is created, the identity of the user may be known. However, as time goes on and roles change within the company, accounts may be difficult to identify.

In the previous example, the xyz after FirstName.LastName is optional, and it could be used for any number of things, such as environment name, development, lab, or organization name, if that is preferred.

Organizational Accounts

Using Organizational Accounts for managing an Azure subscription is recommended over Microsoft Accounts for various reasons.

The main reason is that the organization has more control over access for adding administrators and removing access when an employee is no longer with the company.

Additionally, many of the newer Azure service offerings rely heavily on Organizational Accounts. In some cases, having existing Microsoft Accounts tied to services prior to switching to Organizational Accounts can cause issues with the respective tenant IDs.

Access Permissions

Service Administrators and Co-Administrators

A Microsoft Azure subscription and the associated resources can be accessed via the Azure Management portal, Azure PowerShell, Visual Studio, or other SDKs and tools. When a subscription is created, a Service Administrator is assigned. The default Service Administrator is the same as the Account Administrator, who is also the contact person (via email) for the subscription.

The Account Administrator can assign a different Service Administrator by editing the subscription in the Microsoft Online Services Customer Portal.

To assist with the management of the Azure Services, the Service Administrator will add Co-administrators to the subscription. To be added as a Co-administrator, a user must have a valid Microsoft Account or Org ID, if this is the method of authentication used in the subscription. The first Co-administrator in the subscription must be added by the Service Administrator. After that, any Co-administrator can add or remove other Co-administrators in the subscription.

Removing or adding Co-administrators must be done in the Azure Management portal, and the option is located under Settings > Administrators.

Subscription Co-administrators share the same rights and permissions that the Service Administrator has, with the following exception: a Co-administrator cannot remove the Service Administrator from a subscription. Only the Microsoft Azure account owner (Account Administrator) can change the Service Administrator for a subscription, by editing the subscription in the Microsoft Online Services Customer Portal, as shown previously.

The Co-administrator account can sign in to the Microsoft Azure Management portal, and view all services. The Service Administrator and Co-administrator have the ability to add, modify, or delete Azure services such as websites, cloud services, and mobile services. A single subscription is limited to a maximum of 200 Co-administrators.

Role Based Access Control Models

With the introduction of Role Based Access Control (RBAC), Microsoft Azure now has a security model to perform access control of resources by users on a more granular level. Users specified in RBAC permissions can access and execute actions on the resources within their scope of work. Because there is a limit of 200 Co-administrators per subscription, RBAC allows more users to manage their Azure Services. At the same time, RBAC limits access to only the specific resources needed rather than the entire subscription.

RBAC is only available in the Azure Preview portal and when using the Azure Resource Manager APIs. The Service Administrator and Co-administrators continue to have access to all portals and APIs; however, any user added only via RBAC cannot access the current version of the Azure Management Portal or the Service Management APIs.

Mandatory:

  • Service Administrator and Co-Administrators see all resources in all portals and through APIs.
  • Users defined in RBAC models do not have access to the Service Management portal or APIs.
  • Users not assigned to either group see only the empty Azure Resource Management portal, and they cannot access the Service Management portal.

With RBAC, the subscription is no longer the management boundary for permissions in Azure. Resource Groups are new constructs to group resources that have a common application or service lifecycle. In addition to granting access at the Resource Group level, RBAC permissions can be applied to an individual resource such as SQL Database, websites, virtual machines, and storage accounts.
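
The effect of assigning a role at a given scope can be pictured as a prefix check: a role granted on a resource group applies to the resources inside it. The following Python sketch is a simplified model for illustration only and is not how Azure evaluates permissions. The action sets, the scope path format, the user names, and the function name is_allowed are all assumptions.

```python
# Simplified action sets for the three core roles (assumed for illustration).
ROLE_ACTIONS = {
    "Owner": {"read", "write", "delete", "grant"},
    "Contributor": {"read", "write", "delete"},
    "Reader": {"read"},
}


def is_allowed(assignments, user, action, resource_scope):
    """True if a role assigned at the resource's scope or a parent scope permits the action."""
    for assigned_user, role, scope in assignments:
        if assigned_user != user:
            continue
        # The assignment applies at its own scope and to everything beneath it.
        prefix = scope.rstrip("/") + "/"
        applies = resource_scope == scope or resource_scope.startswith(prefix)
        if applies and action in ROLE_ACTIONS[role]:
            return True
    return False


# Hypothetical assignments: (user, role, scope).
assignments = [
    ("alice", "Contributor", "/subscriptions/sub1/resourceGroups/hr-app"),
    ("bob", "Reader", "/subscriptions/sub1"),
]

website = "/subscriptions/sub1/resourceGroups/hr-app/sites/web1"
print(is_allowed(assignments, "alice", "write", website))
print(is_allowed(assignments, "alice", "grant", website))
print(is_allowed(assignments, "bob", "read", website))
```

In this model, the Contributor assigned on the resource group can change the website but cannot grant access to others, and the Reader assigned on the subscription can only read.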

RBAC administration is implemented by the subscription Service Administrator and Co-administrators. Customers can leverage their existing Azure AD users and groups, or use on-premises Active Directory accounts for access management.

Role Permissions

There are twenty-two built-in Azure RBAC roles for controlling access to Azure resources:

  1. The Owner can perform all management operations for a resource and its child resources, including access management or granting access to others.
  2. The Contributor can perform all management operations for a resource, including creating and deleting resources. A contributor cannot grant access to others.
  3. The Reader has Read-only access to a resource and its child resources. A Reader cannot read secrets.
  4. The API Management Service Contributor lets users manage API Management services, but not access them.
  5. The Application Insights Component Contributor lets users manage Application Insights components, but not access them.
  6. The BizTalk Contributor lets users manage BizTalk services, but not access them.
  7. The ClearDB MySQL DB Contributor lets users manage ClearDB MySQL Databases, but not access them.
  8. The Data Factory Contributor lets users manage data factories, but not access them.
  9. The DocumentDB Account Contributor lets users manage DocumentDB, but not access it.
  10. The Intelligent Systems Account Contributor lets users manage Intelligent Systems accounts, but not access them.
  11. The NewRelic APM Account Contributor lets users manage New Relic Application Performance Management accounts and applications, but not access them.
  12. The Redis Cache Contributor lets users manage Redis caches, but not access them.
  13. The Scheduler Job Collections Contributor lets users manage scheduled job collections, but not access them.
  14. The Search Service Contributor lets users manage Search Service, but not access it.
  15. The SQL DB Contributor lets users manage SQL Databases, but not access them. Users also cannot manage security-related policies or parent SQL servers.
  16. The SQL Security Manager lets users manage the security-related policies of SQL Server instances and databases, but not access them.
  17. The SQL Server Contributor lets users manage SQL Server instances and databases, but not access them or their security-related policies.
  18. The Storage Account Contributor lets users manage storage accounts, but not access them.
  19. The User Access Administrator lets users manage user access to Azure resources.
  20. The Virtual Network Contributor lets users manage virtual networks, but not access them.
  21. The Web Plan Contributor lets users manage the web plans for websites, but not access them.
  22. The Website Contributor lets users manage websites (not web plans), but not access them.

Command-Line and API Access for Azure Role Based Access Control

Access policies that you configure by using RBAC are enforced by the Azure Resource Manager APIs. The Azure Preview portal, command-line tools, and Azure PowerShell use the Resource Manager APIs to run management operations. This ensures that access is consistently enforced regardless of what tools are used to manage Azure resources.

The following article provides additional details: Role-based access control in the Microsoft Azure portal.

Extending the Datacenter Fabric to Microsoft Azure

Extending services from on-premises implementations to Azure resources is largely driven by operational requirements.

  • How will systems be patched and maintained?
  • How will systems be monitored?
  • What security and audit tools will be required?

If the answer to any of these or similar questions is to use an on-premises management tool, decisions need to be made about how that will be achieved.

  • Do on-premises systems need to be augmented with gateways or additional infrastructure that resides in Azure?
  • Are agents necessary, and can they be built into an image, or must they be deployed after provisioning?
  • What protocols and communications flows are needed?
  • What identities and permissions are required?
  • Is a common user interface required by operations?
  • Are the operators on-premises and in Azure the same or different?

The answers to these questions can drive decisions about identity, security, and network connectivity or be driven by them, depending on the organization's priorities.

Moving to Azure and the cloud provides opportunities to do things differently. It is important to think about processes and functions from a cloud perspective. Treat everything like a service. Can Azure Services meet the needs? Think about minimum viable solutions—the agility and cost benefits can be enormous.

Azure offers Operational Insights, Application Insights, log collection, and antivirus solutions from multiple vendors, and encryption and backup solutions from Microsoft and third parties. The more the platform can be leveraged and SaaS offerings utilized, the greater the benefits to the organization.

Start with a cloud-first mentality. That means using platform services to reduce infrastructure, management costs, and anchors to legacy solutions. Focus on being agile and scalable so the organization can capitalize on the elasticity and pay-for-use characteristics of Azure.

Microsoft Azure Management Models

Azure Service Management Overview

The Azure Service Management (ASM) Representational State Transfer (REST) API has historically been the primary model for managing Azure resources. The original and present iterations of the Azure Portal, the Azure PowerShell cmdlets, the cross-platform CLI, and the Azure Management Libraries for .NET are all built on top of the ASM API. The ASM API was initially developed several years ago and is missing many modern cloud management capabilities, such as desired state configuration, role based access control (RBAC), and a flexible extensibility model for future Azure first-party services. ASM supports authentication with either X.509 certificates or Azure Active Directory (AAD).

Note: Azure Service Management (ASM) is not supported in the CSP subscriptions as defined in the CSP model. Only customer-managed subscriptions can be managed using Azure Service Management (ASM). CSP subscriptions are compliant only with the Azure Resource Management model described in the following section.

Azure Resource Manager Overview

The Azure Resource Manager REST API (ARM) has been developed to replace ASM as the authoritative method to manage Azure resources. ARM supports both desired state configuration and RBAC, while providing a pluggable model allowing new Azure services to be cleanly integrated. The preview Azure Portal and the ARM mode of the Azure PowerShell cmdlets both use ARM. AAD is the only authentication method supported by ARM.

ARM introduces the concept of a resource group which is a collection of individual Azure resources. A resource group is associated with a specific Azure region but may contain resources from more than one region.

A resource group can be described in the following scenarios:

| Type | Description | Example |
| --- | --- | --- |
| Vertical | Contains all resources comprising the single application | Company HR Application Resource Group |
| Horizontal | Combines all resources that comprise the specific deployment topology layer such as shared services used by multiple applications or app-specific tier | Shared Management Services Resource Group |

ARM supports the use of a parameterized resource group template file that can be used to create one or more resource groups along with their individual resources. The deployment of a resource group uses desired state configuration. ARM ensures that the resources are deployed in accordance with the appropriately parameterized template file for the resource group. Resource providers exist for many types of Azure resources, and more Azure services are currently adding ARM support, gradually migrating from the legacy ASM model.

ARM supports role based access control (RBAC), and this support is expressed in the preview Azure Portal and the ARM mode of the Azure PowerShell cmdlets. ARM provides three core roles: Owner, Contributor, and Reader. Individual resource providers support additional resource-specific roles, such as Search Service Contributor and Virtual Machine Contributor.

Azure Resource Manager Templates

Azure Resource Manager (ARM) templates enable quick and easy provisioning of Azure applications via declarative JSON. A single JSON template can be constructed to deploy multiple services, such as virtual machines, virtual networks, storage, app services, and databases. The same template can be used to repeatedly and consistently deploy the application during every stage of the application lifecycle. Consequently, templates provide a reusable declarative model that complements imperative management patterns defined by PowerShell.

Azure Resource Manager (ARM) templates can be deployed from Azure PowerShell, Azure CLI, or the Azure preview portal. The following excerpt demonstrates how a known quickstart template defining a simple Azure VM can be deployed using Azure PowerShell.

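# Requires the ARM mode of the Azure PowerShell cmdlets.
# Placeholders: replace each value in angle brackets before running.
$deployName="<deployment name>"
$RGName="<resource group name>"
$locName="<Azure location, such as West US>"
# Quickstart template that defines a simple Windows VM.
$templateURI="https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/101-simple-windows-vm/azuredeploy.json"
# Create the resource group, and then deploy the template into it.
New-AzureResourceGroup -Name $RGName -Location $locName
New-AzureResourceGroupDeployment -Name $deployName -ResourceGroupName $RGName -TemplateUri $templateURI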

Template Constructs

The list below describes typical constructs that can be found in an ARM template:

The parameters section represents the collection of parameters used across all of the resources, including the property values provided when setting up a resource group.

"parameters": {
  "siteName": {
    "type": "string"
  },
  "hostingPlanName": {
    "type": "string"
  },
  "siteLocation": {
    "type": "string"
  }
}
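
Because a stray trailing comma or missing bracket makes a template invalid JSON, it can be useful to generate or check the template skeleton programmatically. The following Python sketch builds a minimal template with string parameters and confirms that it round-trips through a JSON parser. The function name build_template is hypothetical, the schema URL and content version are the values used elsewhere in this section, and the resources list is intentionally left empty.

```python
import json


def build_template(parameter_names):
    """Return a minimal template skeleton with one string parameter per name."""
    return {
        "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"type": "string"} for name in parameter_names},
        "resources": [],  # Resource definitions would be added here.
    }


template = build_template(["siteName", "hostingPlanName", "siteLocation"])
text = json.dumps(template, indent=2)

# Parsing the serialized text back confirms that it is well-formed JSON.
parsed = json.loads(text)
print(text)
```

The same json.loads call can be used to check a hand-edited template file before it is deployed.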

The resources section lists the resources that the template creates, with each resource described in detail, including its properties, and parameters for user-defined values.

{
    "name": "[parameters('databaseName')]",
    "type": "databases",
    "location": "[parameters('serverLocation')]",
    "apiVersion": "2.0",
    "dependsOn": [
      "[concat('Microsoft.Sql/servers/', parameters('serverName'))]"
    ],
    "properties": {
      "edition": "[parameters('edition')]",
      "collation": "[parameters('collation')]",
      "maxSizeBytes": "[parameters('maxSizeBytes')]",
      "requestedServiceObjectiveId": "[parameters('requestedServiceObjectiveId')]"
    }
},

The templateLink references another template from the current one. The following excerpt shows how the dependent JSON template file located in Azure storage can be linked from the primary template definition:

{
  "properties": {
    "template": {
      "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "childParameter": { "type": "string" }
      },
      "resources": [ {
        "name": "Sub-deployment",
        "type": "Microsoft.Resources/deployments",
        "apiVersion": "2015-01-01",
        "properties": {
          "mode": "Incremental",
          "templateLink": {
            "uri": "http://<stac>.blob.core.windows.net/templates/template.json",
            "contentVersion": "1.0.0.0"
          },
          "parameters": {
            "subParameterName": { "value": "[parameters('childParameter')]" }
          }
        }
      } ]
    }
  }
}

The CustomScriptExtension type references the Custom Script Extension for Windows, which executes PowerShell scripts on a remote virtual machine without requiring anyone to log on to it. The scripts can be executed after the VM is provisioned or at any time during the VM's lifecycle, without opening any additional ports on the VM. The most common use cases for the Custom Script Extension include running, installing, and configuring additional software on the VM after provisioning.

The following excerpt illustrates how to reference the Custom Script Extension from within the JSON template to run a custom Windows PowerShell script to apply post-provisioning configuration:

{
   "type": "Microsoft.Compute/virtualMachines/extensions",
   "name": "MyCustomScriptExtension",
   "apiVersion": "2015-05-01-preview",
   "location": "[parameters('location')]",
   "dependsOn": [
      "[concat('Microsoft.Compute/virtualMachines/',parameters('vmName'))]"
   ],
   "properties": {
      "publisher": "Microsoft.Compute",
      "type": "CustomScriptExtension",
      "typeHandlerVersion": "1.4",
      "settings": {
         "fileUris": [
            "http://<stacn>.blob.core.windows.net/customscriptfiles/start.ps1"
         ],
         "commandToExecute": "powershell.exe -ExecutionPolicy Unrestricted -File start.ps1"
      }
   }
}

Common Template Scopes

The following key solution template scopes have been identified through practical experience. These three scopes (capacity, capability, and end-to-end solution) are described in more detail below.

Type | Description | Example
---- | ----------- | -------
Capacity Scope | Delivers a set of resources in a standard topology that is pre-configured to be in compliance with regulations and policies | Deploying a standard development environment in an Enterprise IT or SI scenario
Capability Scope | Deploying and configuring a topology for a given technology | Common scenarios including technologies such as SQL Server, Cassandra, Hadoop, etc.
End to End Solution Scope | Targeted beyond a single capability, and instead focused on delivering an end-to-end solution composed of multiple capabilities. A solution-scoped template manifests itself as a set of one or more capability-scoped templates with solution-specific resources, logic, and desired state. | An end-to-end data pipeline solution template that might mix solution-specific topology and state with multiple capability-scoped solution templates such as Kafka, Storm, and Hadoop

Template Free-form vs. Known Configurations

While the template is generally perceived to give customers the utmost flexibility, many considerations affect the choice of whether to use free-form configurations vs. known configurations.

Free-form Configurations

Free-form configurations provide the most flexibility by allowing customization of the resource type and supplying values for all resource properties, such as selecting a VM type and providing an arbitrary number of nodes and attached disks for those nodes.

However, mature organizations are expected to use templates to deploy large Azure resource topologies. Building a free-form template for a sophisticated infrastructure deployment that may contain hundreds of varied resources adds substantial overhead to designing, maintaining, and deploying that template.

Known Configurations

Rather than offer a template that provides total flexibility and countless variations, a common pattern is to let users select from known configurations: in effect, standard sizes such as sandbox, small, medium, and large. Other examples are product offerings, such as community edition or enterprise edition. In other cases, the configurations may be workload-specific configurations of a technology, such as MapReduce or NoSQL.

Many enterprise IT organizations, OSS vendors, and SIs make their offerings available today in this way in on-premises, virtualized environments (enterprises) or as software-as-a-service (SaaS) offerings (CSVs and OSVs). This approach provides good, known configurations of varying sizes that are preconfigured for customers.

Without known configurations, end customers must determine cluster sizing on their own, factor in platform resource constraints, and do math to identify the resulting partitioning of storage accounts and other resources (due to cluster size and resource constraints). Known configurations enable customers to easily select the right standard size for a given deployment. In addition to making a better experience for the customer, a small number of known configurations is easier to support and can help deliver a higher level of density.
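
The known-configuration pattern can be sketched in Python as a lookup from a size name to a fixed set of values. This is a minimal illustration only: the size names come from the text above, while the VM sizes, node counts, disk counts, and function names are hypothetical assumptions rather than recommendations.

```python
# Hypothetical known configurations ("t-shirt sizes"). The VM sizes, node
# counts, and disk counts are illustrative assumptions, not guidance.
KNOWN_CONFIGURATIONS = {
    "sandbox": {"vm_size": "Standard_A1", "node_count": 1, "data_disks_per_node": 1},
    "small": {"vm_size": "Standard_D2", "node_count": 3, "data_disks_per_node": 2},
    "medium": {"vm_size": "Standard_D4", "node_count": 6, "data_disks_per_node": 4},
    "large": {"vm_size": "Standard_D14", "node_count": 12, "data_disks_per_node": 8},
}


def resolve_configuration(size):
    """Return the fixed configuration for a known size, or raise ValueError."""
    try:
        return KNOWN_CONFIGURATIONS[size]
    except KeyError:
        allowed = ", ".join(sorted(KNOWN_CONFIGURATIONS))
        raise ValueError(f"Unknown size '{size}'; choose one of: {allowed}") from None


def total_data_disks(size):
    """Total number of data disks that the chosen size would attach across all nodes."""
    config = resolve_configuration(size)
    return config["node_count"] * config["data_disks_per_node"]


print(resolve_configuration("small"))
print(total_data_disks("medium"))
```

Because the customer supplies only the size name, the sizing arithmetic and resource partitioning stay with the template author, who can test each configuration in advance.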

Template Dependencies

For a given resource, there can be multiple upstream and child dependencies that are critical to the success of the deployment topology. Such dependencies on other resources can be defined by using the dependsOn keyword and the resources property of a resource in the ARM template. As an example, a virtual machine may be dependent on having a database resource successfully provisioned. In another case, multiple cluster nodes must be installed before deploying a virtual machine with the cluster management tool.

While dependsOn is a useful tool for mapping dependencies between the resources in a deployment, it must be used judiciously because it can affect deployment performance: the dependsOn keyword can prevent the deployment engine from provisioning resources in parallel where it otherwise could. For this reason, dependsOn should not be used to document how resources are interconnected. Its lifecycle is limited to the deployment; once the resources are deployed, there is no way to query these dependencies. Use the mechanism called resource linking instead to document the relationships between resources and to make them queryable.
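
The effect of dependsOn on parallelism can be illustrated with a small Python model. This is a simplified sketch and not the actual scheduling logic of the deployment engine: it only groups resources into "waves" whose dependencies are already satisfied. The resource names and the deployment_waves function are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def deployment_waves(depends_on):
    """Group resources into waves; resources in the same wave have no unmet
    dependencies and could, in principle, be provisioned in parallel."""
    sorter = TopologicalSorter(depends_on)  # maps resource -> its prerequisites
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()
        waves.append(sorted(ready))
        sorter.done(*ready)
    return waves


# Only the dependencies that are genuinely required.
minimal = {
    "vnet": [],
    "storage": [],
    "nic": ["vnet"],
    "vm": ["nic", "storage"],
}

# Same resources, with an unnecessary "documentation only" dependency added.
over_specified = dict(minimal, storage=["nic"])

print(deployment_waves(minimal))
print(deployment_waves(over_specified))
```

In this model the minimal graph needs three waves, with the virtual network and storage account created side by side, while the over-specified graph forces four strictly sequential steps.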

Azure Resource Manager Resource Locks

There are numerous scenarios where an administrator needs to place a lock on a resource or resource group to prevent other users in the organization from committing write actions or accidentally deleting a critical resource. Azure Resource Manager provides the ability to restrict operations on resources through resource management locks. Resource locks are policies which enforce a lock level at a particular scope. The lock level identifies the type of enforcement for the policy, which presently has two values – CanNotDelete and ReadOnly. The scope is expressed as a URI and can be either a resource or a resource group.

For example, some resources are used in an off-and-on pattern, such as virtual machines that are turned on periodically to process data for a given interval of time and then turned off. In this scenario, shutting down the VM must remain possible, but it is imperative that the underlying storage account not be deleted. A resource lock with a lock level of CanNotDelete can be applied to the storage account.

In another scenario, a business organization may have periods during which updates must not go into production. In these cases, the ReadOnly lock level stops creation or updates. For example, a retail company may not want to allow updates during holiday shopping periods; a financial services company may have constraints related to deployments during certain market hours. A resource lock can provide a policy to lock the resources as appropriate. The lock can be applied to just certain resources or to the entire resource group. A resource lock can be applied either via Azure PowerShell or within an Azure Resource Manager template.
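
The two lock levels can be summarized with a short Python model. This is an illustrative sketch of the behavior described above, not an Azure API: the function name and the operation names ("read", "write", "delete") are hypothetical, and ReadOnly is modeled as blocking deletes as well as writes.

```python
# Lock levels named in the text above.
CAN_NOT_DELETE = "CanNotDelete"
READ_ONLY = "ReadOnly"


def is_operation_allowed(operation, lock_levels):
    """Decide whether an operation is permitted, given every lock level that
    applies to a resource (its own locks plus those of its resource group)."""
    levels = set(lock_levels)
    if operation == "read":
        return True
    if operation == "write":
        # Only ReadOnly blocks creation and updates.
        return READ_ONLY not in levels
    if operation == "delete":
        # Either lock level blocks deletion in this model.
        return not levels & {CAN_NOT_DELETE, READ_ONLY}
    raise ValueError(f"Unknown operation: {operation}")


# Storage account protected against deletion while its VMs are cycled on and off.
print(is_operation_allowed("write", [CAN_NOT_DELETE]))
print(is_operation_allowed("delete", [CAN_NOT_DELETE]))

# Resource group frozen during a change-freeze window.
print(is_operation_allowed("write", [READ_ONLY]))
```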

Azure Resource Manager QuickStart Templates

The Microsoft Azure Product Group created a community-maintained set of quickstart ARM templates that could be used as building blocks to author custom JSON templates for complex workloads to be deployed in Azure. For representative purposes a subset of provided templates is listed in the table below:

QuickStart ARM Template | Description
----------------------- | -----------
Application Gateway with Public IP | Create an Application Gateway with Public IP
Virtual Network with Subnets, a local network, and a VPN gateway | Create a Virtual Network with two Subnets, a local network, and a VPN gateway
VM with multiple NICs | Create a VM with multiple network interfaces and RDP accessible
Windows VM with tags | This template takes a minimum amount of parameters and deploys a Windows VM with tags, using the latest patched version.
2 VMs in a Load Balancer | 2 VMs in a Load Balancer and Load Balancer rules
Multi tier VNet with NSGs and DMZ | Install Virtual Network with DMZ Subnet
Network Interface in a Virtual Network with Public IP Address | Network Interface in a Virtual Network with Public IP Address

Azure Resource Manager Tags

Azure Resource Manager provides a tagging feature that facilitates resource categorization according to customer requirements for managing or billing. Tags are defined as Name-Value pairs assigned to resources or resource groups and can be used in scenarios where customer business processes and organizational hierarchy call for a complex collection of resource groups and resources and subscription assets need to be structured according to established policies. Each resource can have up to 15 tags. Users are able to sort and organize resources by tags. Tags may be placed on a resource at the time of creation or added to an existing resource. Once a tag is placed on a billable resource, created via the Azure Resource Manager, the tag will be included in the usage details found in the Usage and Billing portal.

Tags are persisted in the resource's properties in the order they are added. The following Azure PowerShell excerpt retrieves the tags associated with an existing virtual machine and shows them in the order in which they were added to the virtual machine resource.

PS C:\> Get-AzureVM -Name MyVM -ResourceGroupName Group-1

ResourceGroupName   : Group-1
Id                  : /subscriptions/<..>/resourceGroups/Group-1/providers/Microsoft.Compute/virtualMachines/MyVM
Name                : MyVM
Type                : Microsoft.Azure.Management.Compute.Models.VirtualMachineGetResponse
Location            : westus
Tags                : {
                        "Department": "MarketingDepartment",
                        "Application": "LOBApp",
                        "Created By": "CEO",
                        "AppPropOn1": "AppInsightsComponent",
                        "AppPropOne": "One"
                      }
...
NetworkInterfaceIDs : {...c}

Alternatively, tags can be defined in the Resource Manager template as demonstrated in the excerpt below:

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "newStorageAccountName": {
      "type": "string",
      "metadata": {
        "description": "Unique DNS Name for the Storage Account where the Virtual Machine's disks will be placed."
      }
    }
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "name": "[parameters('newStorageAccountName')]",
      "apiVersion": "2015-05-01-preview",
      "location": "[variables('location')]",
      "tags": {
        "Department": "[parameters('departmentName')]",
        "Application": "[parameters('applicationName')]",
        "Created By": "[parameters('createdBy')]"
      }
    },
    {
      "apiVersion": "2015-05-01-preview",
      "type": "Microsoft.Network/publicIPAddresses",
      "name": "[variables('publicIPAddressName')]",
      "location": "[variables('location')]",
      "tags": {
        "Department": "[parameters('departmentName')]",
        "Application": "[parameters('applicationName')]",
        "Created By": "[parameters('createdBy')]"
      }
    }
  ]
}

Generally, subscription resources can have tags defined to accommodate the following scenarios:

  • Resources that serve a similar role in customer organization
  • Resources that belong to the same department – i.e. finance, legal, retail, etc.
  • Resources comprising the same environment – i.e. dev, test, or prod.
  • Resources that are managed by the same responsible party - i.e. Alice, Bob, etc.

Mature organizations are encouraged to create a custom tag taxonomy that is applied to all of the organization's Azure assets, so that everyone consuming Azure resources complies with established policies. For example, users should apply organization-specific tags, such as "Contoso-DeptOne", instead of duplicate but slightly different tags (such as "dept" and "department").
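
A tag taxonomy of this kind can be checked mechanically. The following Python sketch is one possible approach under stated assumptions: the required tag names and the alias list are hypothetical examples of an organization's policy, and only the limit of 15 tags per resource comes from this document.

```python
MAX_TAGS_PER_RESOURCE = 15  # limit stated earlier in this section

# Hypothetical taxonomy: required tag names and discouraged near-duplicates.
REQUIRED_TAGS = {"Department", "Application", "EnvironmentType"}
ALIASES = {"dept": "Department", "department": "Department", "env": "EnvironmentType"}


def validate_tags(tags):
    """Return a list of human-readable problems found in a resource's tags."""
    problems = []
    if len(tags) > MAX_TAGS_PER_RESOURCE:
        problems.append(f"too many tags: {len(tags)} (maximum {MAX_TAGS_PER_RESOURCE})")
    for name in tags:
        canonical = ALIASES.get(name.lower())
        if canonical is not None and name != canonical:
            problems.append(f"use '{canonical}' instead of '{name}'")
    for name in sorted(REQUIRED_TAGS - set(tags)):
        problems.append(f"missing required tag '{name}'")
    return problems


print(validate_tags({"dept": "Finance", "Application": "LOBApp", "EnvironmentType": "prod"}))
print(validate_tags({"Department": "Finance", "Application": "LOBApp", "EnvironmentType": "prod"}))
```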

The following template excerpt contains JSON that describes tags for a resource that specify the environment type, project name, and internal billing chargeback ID. The values for these are passed in via parameters to make this template more reusable and of higher value for Systems Integrators, Corporate IT, and Cloud Service Vendors. This approach enables them to use the same template to deploy capacity or capabilities for a multitude of customers that each will have distinct values for these tags.

"tags": {
  "ChargebackID": "[parameters('chargebackID')]",
  "ProjectName": "[parameters('projectName')]",
  "EnvironmentType": "[parameters('environmentType')]"
},

Azure has many partners who have incorporated tags into their cost management solutions, including Apptio, Cloudability, Cloudyn, Cloud Cruiser, Hanu Insight, and RightScale.

Microsoft Azure Storage

Microsoft Azure Storage Services Overview

Every solution deployed in Microsoft Azure leverages an aspect of Azure Storage, making storage a common component and critical to planning any Azure-based solution design. The storage planning considerations covered in this section include:

  • Designing Azure Storage solutions that account for service and application use cases, which affect the number of storage accounts, the type of storage used, locations, and availability models.
  • Planning considerations including the types of workloads, the method of access (internal, external, or peered) and storage account security.
  • Considerations surrounding cloud-integrated storage solutions including Azure Backup, System Center Data Protection Manager, StorSimple, and third-party tools and solutions that leverage Azure Storage.
  • Managing, monitoring, and troubleshooting topics, including storage throttling behaviors, storage analytics, and IaaS virtual machine considerations, including I/O profiles and maintaining disk consistency within workloads.

Recommended: For organizations that are new to Azure Storage, it is often helpful to draw comparisons to private cloud storage or traditional SAN storage as a way to understand some of the basic concepts required in an Azure Storage design (for example, compare BLOB storage account to a LUN).

Storage Planning

Planning the storage account infrastructure is perhaps the most important step of any Microsoft Azure deployment because it sets the foundation for performance, scalability and functionality. It is first necessary to understand the two types of storage accounts (standard and premium) and the services available within each type of account. The following sections outline the differences at a high level.

Storage Account Types and Services

A standard storage account includes Blob, Table, Queue, and File storage services. These storage services are included in every storage account created. A storage account provides a unique namespace for working with the blobs, queues, and tables.

  • Blob storage stores file data. A blob can contain any type of text or binary data, such as a document, media file, or application installer. Every data blob is organized into an object called a container within each storage account. A storage account can contain any number of containers (but it must have at least one), and a container can contain any number of blobs. Blob storage offers two types of blobs: block blobs and page blobs.
    • Block blobs are optimized for streaming and storing cloud objects, and they are a good choice for storing documents, media files, and backups.
    • Page blobs are optimized for representing IaaS disks and supporting random writes. An Azure virtual machine IaaS disk is a virtual hard disk that is stored as a page blob.
  • Table storage stores structured datasets. Table storage is a NoSQL key-attribute data store, which allows for development and fast access to large quantities of data.
  • Queue storage provides messaging for workflow processing and for communication between components of cloud services.
  • File storage offers shared storage for legacy applications that leverage the SMB 2.1 protocol. Azure virtual machines and cloud services can share file data across application components by using mounted shares. On-premises applications can also access data in a share via the File service REST API.
    Note: Access to files is available to virtual machines residing in the same Azure region over standard universal naming convention (UNC) paths. Access across regions or from on-premises infrastructures is not supported. There are key differences between blob and file access in areas such as storage capacity and I/O, which should be accounted for in every design. These differences are outlined in the Storage Limits section of the reference article Azure Subscription and Service Limits, Quotas, and Constraints.

Standard storage accounts are available with four redundancy types:

  • Locally redundant (LRS)
  • Geo-redundant (GRS)
  • Zone redundant (ZRS)
  • Read access geo-redundant (RA-GRS)

These redundancy values and their potential use will be covered in later sections of this document.

A premium storage account currently supports only Azure virtual machine disks that are backed by page blobs. A premium storage account stores only page blobs, and only REST APIs for page blobs and their containers are supported. From an infrastructure perspective, premium storage stores data on solid-state drives (SSDs), whereas standard storage stores data on hard disk drives (HDDs). As a result, premium storage delivers high-performance, low-latency disk support for I/O intensive workloads running on Azure virtual machines. The following characteristics summarize the current capabilities of Azure premium storage. Premium storage offers:

  1. 4 – 5 IOPS per GB @ 256KB I/O size
  2. 5,000 IOPS with no-cache. Much higher with cache depending on virtual machine size
  3. 200 MB/sec bandwidth per disk.
  4. 1 TB maximum disk size.
  5. 50,000 IOPS @ 8KB; 512 MB/sec virtual machine limit

Premium storage is limited to local replication only. Premium Storage GRS is not currently available. However, you can optionally create snapshots of your disks and copy those snapshots to a standard GRS storage account if required. This enables the ability to maintain a geo-redundant snapshot of data for disaster recovery purposes.

For high-scale applications and services, you can attach several premium storage disks to a single virtual machine, and support up to 32 TB of disk storage per virtual machine and drive more than 64,000 IOPS per virtual machine at less than 1 millisecond latency for read operations. Like standard storage accounts, premium storage keeps three replicas of data within the same region, and ensures that a Write operation will not be confirmed until it is durably replicated.
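
As a back-of-the-envelope illustration of the figures above, the following Python sketch estimates how many premium disks a workload would need to satisfy both a capacity target and an IOPS target. It assumes the 1 TB maximum disk size and the 5,000 IOPS no-cache figure listed earlier, ignores caching, and does not model the per-VM limits on attached disks or throughput, so treat it as a rough planning aid only. The function name is hypothetical.

```python
import math

MAX_DISK_SIZE_TB = 1   # maximum premium disk size listed above
IOPS_PER_DISK = 5_000  # per-disk IOPS without cache, as listed above


def premium_disks_needed(capacity_tb, target_iops):
    """Smallest disk count that satisfies both the capacity and the IOPS target."""
    disks_for_capacity = math.ceil(capacity_tb / MAX_DISK_SIZE_TB)
    disks_for_iops = math.ceil(target_iops / IOPS_PER_DISK)
    return max(disks_for_capacity, disks_for_iops)


# 4 TB of data that must sustain 30,000 IOPS: IOPS is the limiting factor.
print(premium_disks_needed(4, 30_000))
```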

Every object that is stored in Azure Storage has a unique uniform resource identifier (URI) address; the storage account name forms the subdomain of that address. The subdomain together with the domain name, which is specific to each service, form an endpoint for your storage account.

For example, if your storage account is named azra1, the default endpoints for your storage account would be:

  • Blob service: http://azra1.blob.core.windows.net
  • Table service: http://azra1.table.core.windows.net
  • Queue service: http://azra1.queue.core.windows.net
  • File service: http://azra1.file.core.windows.net

The endpoints for each storage account are visible on the storage Dashboard in the Azure Management portal after the account has been created.

The URI for accessing an object in a storage account is built by appending the object's location in the storage account to the endpoint. For example, a blob address might have this format: http://azra1.blob.core.windows.net/mycontainer/myblob.
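
The endpoint and URI construction described above can be sketched in a few lines of Python. The function names are hypothetical; the host name pattern and the azra1 example are taken from the text.

```python
from urllib.parse import quote

SERVICES = ("blob", "table", "queue", "file")


def storage_endpoints(account_name, scheme="http"):
    """Default endpoint for each storage service of the given account."""
    return {
        service: f"{scheme}://{account_name}.{service}.core.windows.net"
        for service in SERVICES
    }


def blob_uri(account_name, container, blob_name, scheme="http"):
    """Append the container and blob name to the Blob service endpoint.
    Reserved URL characters in the blob name are percent-encoded; forward
    slashes are kept because they act as virtual directory separators."""
    endpoint = storage_endpoints(account_name, scheme)["blob"]
    return f"{endpoint}/{container}/{quote(blob_name)}"


print(storage_endpoints("azra1")["queue"])
print(blob_uri("azra1", "mycontainer", "myblob"))
```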

Storage Security

Access to storage accounts is possible through two means:

  • Azure authentication – These mechanisms include Azure Active Directory (Organization ID) and Microsoft Accounts (formerly Live ID)
  • Storage Account Key – This mechanism primarily is used for programmatic access by applications. This includes cloud and line-of-business applications and graphical user interface (GUI) tools to manage storage accounts such as CloudBerry or Storage Explorer.

Please refer to the Storage Security section later in this document to understand common practices and implications for using storage account keys.

Feature References

Introduction to Microsoft Azure Storage

http://azure.microsoft.com/en-us/documentation/articles/storage-introduction/

Azure Storage documentation and intro videos

http://azure.microsoft.com/en-us/documentation/services/storage/

Introduction to Premium Storage

http://azure.microsoft.com/en-us/documentation/articles/storage-premium-storage-preview-portal/

Technical Overview

http://azure.microsoft.com/en-us/documentation/articles/storage-create-storage-account/

Quick Start Guide

http://azure.microsoft.com/en-us/documentation/articles/storage-getting-started-guide/

Microsoft Azure Storage Team Blog

http://blogs.msdn.com/b/windowsazurestorage

Understanding Block Blobs and Page Blobs

https://msdn.microsoft.com/en-us/library/ee691964.aspx

Introducing Azure Storage Append Blob

http://blogs.msdn.com/b/windowsazurestorage/archive/2015/04/13/introducing-azure-storage-append-blob.aspx

Mandatory:

  • There are hard limits on the quantity, size, and expected performance of Azure Storage accounts. It is critical to review the Azure subscription and service limits, quotas, and constraints pertaining to storage accounts when planning Azure solutions. For more information, see Standard Storage Limits.
  • A single storage account is limited to a maximum of 500 TB. If this limit is exceeded, a new storage account must be created.
  • The maximum size of any Azure file share is 5 TB. If this limit is exceeded, a new file share must be created.

Recommended: Consider the use of premium storage when a higher level of disk performance is needed for a given workload or application. Premium storage is high performance SSD-based storage designed to support I/O intensive workloads with significantly high throughput and low latency. With premium storage, you can provision a persistent disk and configure its size and performance characteristics to meet your application requirements.

Design Guidance

When designing storage account types and services, consider the following:

Capability Considerations

Capability Decision Points

Capability Models in Use

Different storage account types serve different purposes.

Each storage account should be allocated for a specific purpose and not be a generic, all-purpose container.

You need to decide how to allocate storage accounts for different purposes within your project.

Within an IaaS deployment, consider a separate storage account for maintenance of master images that can be deployed to other storage accounts throughout the subscription.

Within an IaaS deployment, consider a separate storage account for any backup purposes, separate from any production data, such that it can be created in a different region than the primary data.

Different storage services provide unique capabilities.

Understand the type of data and data flow that the storage account will serve to determine the storage service that the account will provide.

For key lookups at scale for structured data, use Tables.

For scans or retrievals of large amount of raw data, such as analytics or metrics, use Blobs.

For streaming and storing documents, videos, pictures, backups, and other unstructured text or binary data, use Blobs.

For IaaS virtual machine VHDs, use Blobs.

For process workflows or decoupling applications, use Queues.

To share files between applications running in virtual machines that are using familiar Windows APIs or the File service REST API, use Files.

The storage service offers two types of blobs: block blobs and page blobs.

Understand and decide on the use of block blobs or page blobs when you create the blob.

In the majority of cases, page blobs will be utilized. Page blobs are optimized for random Read and Write operations (best for virtual machines and VHDs).

Page blobs have a maximum storage of 1 TB (compared to only 200 GB for a block blob), and they commit immediately (compared to a block blob, which remains uncommitted until a commit is issued).

Block blobs are for streaming and storing documents, videos, pictures, backups, and other unstructured text or binary data.

Additionally, there are cost differences associated with each type of storage.

Microsoft Azure provides several ways to store and randomly access data in the cloud.

Decide when to use Azure Blobs, Azure Files, or Azure data disks.

Azure Files is most often used when data stored in the cloud needs to be accessed by multiple IaaS or PaaS virtual machines with a standard SMB interface or UNC path.

Azure Blobs is most often used for larger capacity uses and where random access is needed, such as for multiple disks, and you want to be able to access data from anywhere.

Azure data disks are most often used when you want to store data that is not required to be accessed from outside the virtual machine to which the disk is attached. It is exclusive to a single virtual machine (only one at a time).

Depending on how data is replicated (LRS, GRS, ZRS, RA-GRS), the blob type, storage service, storage transactions, and the use of premium storage will affect the overall cost of the Azure Storage solution.

When making decisions about how your data is stored and accessed, you should also consider the costs involved.

Your total cost depends on how much you store, the volume of storage transactions and outbound data transfers, and which data redundancy option you choose.

The type of data will drive most of these decisions. For example, data that is critical to a business may drive the decision to have GRS, whereas data that is less critical may suffice with LRS.

Data that must be quickly accessed with the highest possible IOPS may drive the usage of premium storage, whereas data without that requirement may be adequately served by standard storage.

These requirements will necessitate and support the higher costs associated with the storage services.

Storage containers can be used to further organize data in storage accounts.

Decide how you want the data in Azure Storage to be organized.

Deciding how to design and build containers is similar to how you would design and build a folder structure on a file server. It is simply how you want to organize the data.

By default, all VHDs will be put into a "vhds" folder, but you can change or specify whatever container structure you want to use.

Concurrency settings can be modified for Azure Storage accounts.

Modern applications usually have multiple users viewing and updating data simultaneously. This requires developers to think carefully about how to provide a predictable experience to their end users, particularly for scenarios where multiple users can update the same data.

There are three main data concurrency strategies developers typically consider:

  • Optimistic concurrency
  • Pessimistic concurrency
  • Last writer wins

You can opt to use optimistic or pessimistic concurrency models to manage access to blobs and containers in the Blob service. If you do not explicitly specify a strategy, last writer wins is the default.

For IaaS, concurrency settings do not need to be modified.

For PaaS, the developers need to consider the type of application, the user base, and the data types to help determine the concurrency settings.

There are storage account limitations that must be understood and respected.

Aside from the size limits of Azure Storage accounts, you must consider the throughput limitations of each account and design your storage accounts with those in mind.

You are more likely to hit the throughput limitations before you hit the size limitations. You are also limited by the number of storage accounts per subscription.

The primary constraining factor is the number of VHD files that can be stored in each storage account. For virtual machines in the Basic tier, do not place more than 66 highly used VHDs in a storage account to avoid the 20,000 total request rate limit (20,000/300).

For virtual machines in the Standard tier, do not place more than 40 highly used VHDs in a storage account (20,000/500). The term highly used refers to VHDs that push the upper limits of the total request rates.

If you have VHDs that are not highly used and do not come close to the maximum request rates, you can put more VHDs in the storage account.

Note that this refers to virtual hard disks and not virtual machines. Virtual machines may indeed contain multiple virtual hard disks.

Single or multiple storage accounts can be used.

Additional storage accounts may be used to get more scale than a single storage account.

Consider how to design the IaaS or PaaS workloads to dynamically add accounts, in the event that more scale is needed for the solution in the future, beyond what a single storage account can provide.

Storage account throughput is the determining factor in using single or multiple storage accounts.

Consider the throughput limitations of each of the storage account types. Also, consider that throughput can be maximized by using:

  • More simultaneous outstanding I/O
  • Parallel page writes or block writes in a single blob
  • Parallel uploads for multiple blobs
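
The VHD placement guidance above (20,000/300 for the Basic tier and 20,000/500 for the Standard tier) can be reproduced with a short Python calculation. The constants come from the text; the function names and the example of 100 VHDs are hypothetical.

```python
import math

ACCOUNT_REQUEST_RATE_LIMIT = 20_000  # total request rate limit per storage account
VHD_REQUEST_RATE = {"Basic": 300, "Standard": 500}  # per-VHD divisors used in the text


def max_highly_used_vhds(tier):
    """Highly used VHDs that fit in one storage account before the limit is reached."""
    return ACCOUNT_REQUEST_RATE_LIMIT // VHD_REQUEST_RATE[tier]


def storage_accounts_needed(vhd_count, tier):
    """Storage accounts required to host the given number of highly used VHDs."""
    return math.ceil(vhd_count / max_highly_used_vhds(tier))


print(max_highly_used_vhds("Basic"))
print(max_highly_used_vhds("Standard"))
print(storage_accounts_needed(100, "Standard"))
```

Because the count applies to virtual hard disks rather than virtual machines, the input should be the total number of highly used VHDs across all virtual machines.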

Naming Conventions

Choosing a name for any asset in Microsoft Azure is important because:

  • It is difficult (though not impossible) to change that name at a later time.
  • There are certain constraints and requirements that must be met when choosing a name.

This table covers the naming requirements for each element of a storage account.

Item | Length | Casing | Valid characters
---- | ------ | ------ | ----------------
Storage account name | 3-24 | Lower case | Alphanumeric
Blob name | 1-1024 | Case sensitive | Any URL char
Container name | 3-63 | Lower case | Alphanumeric and dash
Queue name | 3-63 | Lower case | Alphanumeric and dash
Table name | 3-63 | Case insensitive | Alphanumeric
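
The length, casing, and character requirements in the table above can be expressed as simple patterns. The following Python sketch checks only those table rules; the function name is hypothetical, and the services may enforce additional rules (for example, on hyphen placement or leading characters) that are not covered here.

```python
import re

# One pattern per item, derived only from the table above.
NAME_PATTERNS = {
    "storage_account": re.compile(r"[a-z0-9]{3,24}"),
    "container": re.compile(r"[a-z0-9-]{3,63}"),
    "queue": re.compile(r"[a-z0-9-]{3,63}"),
    "table": re.compile(r"[A-Za-z0-9]{3,63}"),
}


def is_valid_name(item, name):
    """Check a name against the length, casing, and character rules for the item."""
    if item == "blob":
        # Blob names accept any URL character; only the length is checked.
        return 1 <= len(name) <= 1024
    return NAME_PATTERNS[item].fullmatch(name) is not None


print(is_valid_name("storage_account", "azra1"))
print(is_valid_name("storage_account", "AzRa1"))
print(is_valid_name("container", "my-container"))
```

Running such a check before creating resources is inexpensive, which matters because storage account and container names cannot be changed after creation.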

It is also possible to configure a custom domain name for accessing blob data in your Azure Storage account. The default endpoint for the Blob service is:

https://mystorage.blob.core.windows.net

But if you map a custom domain (such as www.contoso.com) to the blob endpoint for your storage account, you can also access blob data in your storage account by using that domain. For example, with a custom domain name, http://mystorage.blob.core.windows.net/mycontainer/myblob could be accessed as http://www.contoso.com/mycontainer/myblob.

Use the following reference when this capability is required.

Feature References

Naming and Referencing Containers, Blobs, and Metadata

https://msdn.microsoft.com/en-us/library/dd135715.aspx

Naming Queues and Metadata

https://msdn.microsoft.com/en-us/library/dd179349.aspx

Naming Tables

https://msdn.microsoft.com/en-us/library/azure/dd179338.aspx

Configure a custom domain name for blob data in an Azure Storage account

http://azure.microsoft.com/en-us/documentation/articles/storage-custom-domain-name

Mandatory:

  • A blob name can contain any combination of characters, but reserved URL characters must be properly escaped. Avoid blob names that end with a period (.), a forward slash (/), or a sequence or combination of the two. By convention, the forward slash is the virtual directory separator.
    Also don't use a backward slash (\) in a blob name. The client APIs may allow it, but then fail to hash properly, and the signatures will mismatch.
  • It is not possible to modify the name of a storage account or container after it has been created. You must delete it and create a new one if you want to use a new name.

Recommended: We recommend that you establish a naming convention for all storage accounts and types before you create any.

Design Guidance

When you choose naming conventions for storage objects, consider the following:

Resource

Restrictions

Recommendations

Storage account

Must be between 3 and 24 characters in length and use numbers and lower-case letters only. Not only must it be unique within the subscription, but it also must be unique across Azure.

Naming should be representative of its contents (for example, virtual machines, backup data, archive data, or images).

Storage Blob container

Container names must start with a letter or number, and they can contain only letters, numbers, and the hyphen (-) character.

Every hyphen must be immediately preceded and followed by a letter or number; consecutive hyphens are not permitted in container names.

All letters in a container name must be lower case.

Container names must be from 3 through 63 characters long.

Naming should be representative of its contents (for example, vhds, server-images, or backup-mar03-2015).

Storage Blob

A blob name can contain any combination of characters.

A blob name must be at least one character long and cannot be more than 1,024 characters long.

Blob names are case sensitive.

Reserved URL characters must be properly escaped.

The number of path segments comprising the blob name cannot exceed 254. A path segment is the string between consecutive delimiter characters (for example, the forward slash) that corresponds to the name of a virtual directory.

Naming should be representative of its contents.

Queues

Every queue within an account must have a unique name. The queue name must be a valid DNS name.

A queue name must start with a letter or number, and can only contain letters, numbers, and the hyphen (-) character.

The first and last characters in the queue name must be alphanumeric. The hyphen cannot be the first or last character. Consecutive hyphens are not permitted in the queue name.

All letters in a queue name must be lower case.

A queue name must be from 3 through 63 characters long.

Naming should be representative of its contents.

Tables

Table names must be unique within an account.

Table names can contain only alphanumeric characters.

Table names cannot begin with a numeric character.

Table names are case insensitive.

Table names must be from 3 to 63 characters long.

Some table names are reserved, including "tables". Attempting to create a table with a reserved table name returns error code 400 (Bad Request).

Table names preserve the case with which they were created, but are case insensitive when used.

Naming should be representative of its contents.
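The restrictions above lend themselves to a pre-deployment check. The following is a sketch under stated assumptions: the function names are illustrative, only the one reserved table name mentioned in the text is checked, and the blob check also rejects names the guidance merely advises against (a trailing period or slash, or a backslash).

```python
import re
from urllib.parse import quote

ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Lower-case letters and digits separated by single hyphens; length is checked separately.
DNS_LABEL_RE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
# Alphanumeric, 3 to 63 characters, not starting with a digit.
TABLE_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{2,62}")


def valid_account_name(name):
    return ACCOUNT_RE.fullmatch(name) is not None


def valid_container_or_queue_name(name):
    return 3 <= len(name) <= 63 and DNS_LABEL_RE.fullmatch(name) is not None


def valid_table_name(name):
    # "tables" is the only reserved name the text gives; others exist.
    return TABLE_RE.fullmatch(name) is not None and name.lower() != "tables"


def valid_blob_name(name):
    if not 1 <= len(name) <= 1024:
        return False
    # Advisory rules from the guidance: no backslash, no trailing "." or "/".
    if "\\" in name or name[-1] in "./":
        return False
    # No more than 254 path segments.
    return len(name.split("/")) <= 254


def escape_blob_name(name):
    # Percent-encode reserved URL characters, keeping "/" as the separator.
    return quote(name, safe="/")
```

A check like this can run in a deployment script before any resource is created, which matters because storage account and container names cannot be changed afterward.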

Location, Durability, and Redundancy

The location and durability of the storage accounts must also be considered. Durability and redundancy options also have an impact on the cost of the storage. When creating a storage account (through the portal, Azure PowerShell, or REST APIs), you are required to specify an affinity group or a location.

  • Affinity groups allow you to group your Azure services to optimize performance. All services, storage accounts, and virtual machines within an affinity group will be located in the same Azure datacenter or region. An affinity group can improve service performance by locating computer workloads in the same Azure datacenter or region or near the target user audience. Also, no billing charges are incurred for egress when data in a storage account is accessed from another service that is part of the same affinity group.
  • Location refers to the Azure region where the storage account will be deployed. Using a location instead of affinity groups will allow you to deploy different services in different locations independently, as needed.
  • Resource groups enable you to manage all your resources in an application or service deployed together in Microsoft Azure. Given the close relationship of Azure Storage to Azure Services, the alignment of Azure Storage accounts to associated resource groups is a key consideration when used.

To ensure that the Azure SLAs can be met, there are several levels of data replication available for the storage accounts:

  • Locally redundant storage (LRS) maintains three copies of your data. LRS is replicated three times within a single facility in a single region. LRS protects your data from normal hardware failures, but not from the failure of a single facility.
  • Zone-redundant storage (ZRS) maintains three copies of your data. ZRS is replicated three times across two to three facilities, either within a single region or across two regions, providing higher durability than LRS. ZRS ensures that your data is durable within a single region.
  • Geo-redundant storage (GRS) is enabled for your storage account by default when you create it. GRS maintains six copies of your data. With GRS, your data is replicated three times within the primary region. It is also replicated three times in a secondary region hundreds of miles away from the primary region. This provides the highest level of durability. In the event of a failure at the primary region, Azure Storage will fail over to the secondary region. GRS ensures that your data is durable in two separate regions.
  • Read-access geo-redundant storage (RA-GRS) allows you to have higher read availability for your storage account by providing Read-only access to the data replicated to the secondary location. When you enable this feature, the secondary location can be used to achieve higher availability in the event the data is not available in the primary region. Read-access geo-redundant storage is recommended for maximum availability and durability.
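The replication levels above can be summarized as data and filtered by requirement. This is a sketch that encodes one reading of the descriptions: the flags for surviving a facility or region loss are interpretations of the text, and the six-copy figure for RA-GRS is assumed from its definition as GRS with read access to the secondary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replication:
    name: str
    copies: int
    survives_facility_loss: bool
    survives_region_loss: bool
    readable_secondary: bool


OPTIONS = [
    Replication("LRS", 3, False, False, False),
    Replication("ZRS", 3, True, False, False),
    Replication("GRS", 6, True, True, False),
    # RA-GRS is GRS plus read access to the secondary, so six copies is assumed.
    Replication("RA-GRS", 6, True, True, True),
]


def candidates(facility_loss=False, region_loss=False, read_secondary=False):
    """Return option names that meet the requested protections, in list order."""
    return [o.name for o in OPTIONS
            if (o.survives_facility_loss or not facility_loss)
            and (o.survives_region_loss or not region_loss)
            and (o.readable_secondary or not read_secondary)]
```

For example, `candidates(region_loss=True)` narrows the choice to the geo-redundant options. Cost and service-type restrictions (such as ZRS being limited to block blobs) still need to be weighed separately.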

Feature References

Azure Storage Replication for Durability and High Availability

http://azure.microsoft.com/en-us/documentation/articles/storage-introduction/#replication-for-durability-and-high-availability

Azure Storage Redundancy Options

http://azure.microsoft.com/en-us/documentation/articles/storage-redundancy/

Azure SLAs (including Storage)

http://azure.microsoft.com/en-us/support/legal/sla/

Azure Storage Pricing Guide

http://azure.microsoft.com/en-us/pricing/details/storage/

Using resource groups to manage your Azure resources

http://azure.microsoft.com/en-us/documentation/articles/azure-preview-portal-using-resource-groups

Mandatory:

  • ZRS is currently available only for block blobs. Consider the storage type when selecting the redundancy type.
  • When you create a GRS storage account, you select the primary region for the account. The secondary region is determined based on the primary region, and it cannot be changed.

Recommended: Not all storage services are available in all regions. Be sure to check the availability of the service you desire, in the region you desire, during the planning phase. (For example, premium storage is limited to only a few regions.) For more information, see Services by regions.

Optional:

  • GRS is recommended over ZRS or LRS for maximum durability. However, please note that there is a price difference amongst the different redundancy types.
  • You can change how your data is replicated after your storage account has been created, but note that you may incur an additional one-time data transfer cost if you switch from LRS to GRS or RA-GRS.

Design Guidance

When you design storage durability and redundancy, consider the following:

Capability Considerations

Capability Decision Points

Capability Models in Use

Each availability option provides a different level of data redundancy.

Carefully consider the level of redundancy that you may need with your data. Not all data needs the same level of redundancy (and redundancy costs).

Locally redundant storage (LRS) is less expensive than geographically redundant storage (GRS), and it also offers higher throughput. If your application stores data that can be easily reconstructed, you may opt for LRS.

Some applications are restricted to replicating data only within a single region due to data governance or privacy requirements.

If your application has its own geo-replication strategy (for example, SQL AlwaysOn and Active Directory domain controllers), then it may not require GRS.

ZRS is currently available only for block blobs. Note that once you have created your storage account and selected zone-redundant replication, you cannot convert it to any other type of replication, or vice versa.

Locally redundant storage (LRS) provides economical local storage or data governance compliance.

Zone redundant storage (ZRS) provides an economical, yet higher durability option for block Blob storage.

Geographically redundant storage (GRS) provides protection against a major datacenter outage or disaster.

Read-access geographically redundant storage (RA-GRS) provides Read access to data during an outage, for maximum data availability and durability.

In general, plan to design to regional redundancy (GRS), unless the workload already accounts for it. In that case, there is no need to duplicate it. Also, see the previous section about premium storage for special considerations on premium storage redundancy options.

Plan with failure in mind.

Redundancy options are available not because failures may occur, but because they will occur. Accept that hardware failures are part of running hyper-scale datacenters, and plan that failures will occur through the use of available redundancy options.

Ensure applications and workloads have "retry options" for storage connection failures, in the event the storage becomes unavailable in the primary location. No code changes are required, but small latency may occur. Latency sensitive apps may also benefit from the use of cache.
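The "retry options" mentioned above can be sketched as exponential backoff against the primary location, followed by an optional read from the secondary. The two callables are hypothetical stand-ins for your storage client calls, and reading from the secondary applies only to RA-GRS accounts.

```python
import time


def read_with_retry(read_primary, read_secondary=None, attempts=3,
                    base_delay=0.5, sleep=time.sleep):
    """Call read_primary with exponential backoff, then try the secondary.

    read_primary / read_secondary are hypothetical zero-argument callables
    that return data or raise ConnectionError.
    """
    for attempt in range(attempts):
        try:
            return read_primary()
        except ConnectionError:
            # Wait longer after each failure, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    if read_secondary is not None:
        # Only meaningful for RA-GRS accounts, where the secondary is readable.
        return read_secondary()
    raise ConnectionError("primary unavailable and no secondary configured")
```

The attempt count and delay are assumptions. Latency-sensitive workloads may prefer fewer attempts combined with a cache, as the guidance suggests.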

Storage Security

When a storage account is created, only the owner of that account may access the blobs, tables, files, and queues within that account. There are several ways to grant and share access to storage accounts to other users. This section discusses some of the available options.

Access Control

When you create a storage account, Azure generates two storage access keys, which are used for authentication when the storage account is accessed. By providing two storage access keys, Azure enables you to regenerate the keys with no interruption to your storage service or access to that service.

One simple way to grant access to the storage account is to share that storage access key. However, if your service or application needs to make these resources available to other clients without sharing your access key, you have other options for permitting access:

  • You can set a container's permissions to permit anonymous Read access to the container and its blobs. This is for blobs only, not for tables or queues.
  • You can use a shared access signature, which enables you to delegate restricted access to a container, blob, table, or queue resource by specifying the interval for which the resources are available and the permissions that a client will have to it.
  • You can also use a stored access policy to manage shared access signatures for a container, blob, queue, or table. The stored access policy gives you an additional measure of control over your shared access signatures. You can use a stored access policy to change the start time, expiration time, or permissions for a signature, or to revoke it after it has been issued.

Authentication

Every request made to an Azure Storage account must be authenticated, unless it is an anonymous request against a public container or its blobs. There are two ways to authenticate a request against the storage accounts:

  • Use the shared key or shared key lite authentication schemes for the Blob, Queue, Table, and File services.
  • Create a shared access signature. A shared access signature includes the credentials required for authentication and the address of the resource being accessed. Because the shared access signature includes all data needed for authentication, it can be used to grant access to a Blob, Queue, or Table service, and it can be distributed separately from any code.
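To illustrate why a shared access signature can be distributed without exposing the account key, the sketch below signs a set of access parameters with HMAC-SHA256 and emits them as a query string. The string-to-sign layout here is simplified and hypothetical. It is not the format the storage service expects, which is defined in the "Constructing the Shared Access Signature URI" reference.

```python
import base64
import hashlib
import hmac
from urllib.parse import urlencode


def sign_sas_like(account_key_b64, resource, permissions, start, expiry):
    """Build a SAS-style query string (simplified; not the service's format)."""
    # Simplified, hypothetical string-to-sign: one field per line.
    string_to_sign = "\n".join([permissions, start, expiry, resource])
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    # The token carries the parameters and the signature, never the key.
    return urlencode({"sp": permissions, "st": start,
                      "se": expiry, "sig": signature})
```

Because the signature covers the permissions and the time window, a client that alters either one invalidates the token. In practice, use the storage client library to generate signatures rather than hand-rolled code.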

Feature References

Manage Access to Azure Storage Resources

http://azure.microsoft.com/en-us/documentation/articles/storage-manage-access-to-resources/

Authenticating Access to Your Azure Storage Account

https://msdn.microsoft.com/en-us/library/azure/hh225339.aspx

Authentication for the Azure Storage Services REST API reference

https://msdn.microsoft.com/library/azure/dd179428.aspx

Microsoft Azure Storage Explorers

http://blogs.msdn.com/b/windowsazurestorage/archive/2014/03/11/windows-azure-storage-explorers-2014.aspx

Constructing the Shared Access Signature URI

https://msdn.microsoft.com/en-us/library/azure/dn140255.aspx

Mandatory: You need the Azure Storage access key to access the storage account through any GUI tools, such as Azure Storage Explorer or any third-party tools.

Mandatory: The primary access key and secondary access key for storage accounts should be changed periodically to mitigate unauthorized access.

  • As you change each key, be sure to update the applications that use it with the new key.
  • Do not change both storage account access keys at the same time. This can result in loss of access by applications.

We suggest changing each key (primary or secondary) every 60 to 120 days. This allows for an ongoing monthly or quarterly key change cadence that affects only one key at a time.

Additional events that could cause you to regenerate keys include when a security incident occurs, if you fear compromise of storage account keys, or when key administrative personnel leave your organization.

Comparisons can be made within each organization regarding password changes for critical service accounts or credentials in Active Directory and other authentication systems.

As a general practice, Azure storage account and vault keys should follow similar practices and procedures currently established within the organization.
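The one-key-at-a-time cadence described above can be sketched as a small scheduling helper. The 90-day default is an assumption chosen from within the suggested 60 to 120 day range, and the function name is illustrative.

```python
from datetime import date, timedelta


def next_rotation(last_rotated, interval_days=90):
    """Pick the key to regenerate next and the date it falls due.

    last_rotated maps a key name ("primary" or "secondary") to the date
    that key was last regenerated. The older key is always rotated first,
    so both keys never change at the same time.
    """
    key = min(last_rotated, key=last_rotated.get)
    return key, last_rotated[key] + timedelta(days=interval_days)
```

For example, with the primary key last regenerated on 1 January and the secondary on 15 February, the helper selects the primary key. Applications can then move to the secondary key before the primary is regenerated.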

Recommended:

  • Only regenerate storage account keys if absolutely necessary. Regenerating your access keys affects virtual machines, media services, and any applications that are dependent on the storage account. All clients that use the access key to access the storage account must be updated to use the new key.
  • If you require more granular control over blob resources, or if you want to provide permissions for operations other than Read operations, you can use a shared access signature to make a resource accessible to users.

Optional: A container or blob can be made available for public access by setting a container's permissions. A container, blob, queue, or table may be available for signed access via a shared access signature (a shared access signature is authenticated through a different mechanism).

Design Guidance

When you design storage security, consider the following:

Capability Considerations

Capability Decision Points

Capability Models in Use

Storage accounts can be created with internal (private) or external (public) access.

The decision about what type of access control to apply to the storage accounts depends entirely on the type of data stored in those accounts and how that data needs to be accessed and protected.

This is something that is unique to every customer in Azure. In general, the guidance is to always start with internal (private) access only, then find reasons (exceptions) why the data may need external (public) access.

Most companies do not need to have external access directly to their data.

Storage keys can be used to protect storage accounts against unauthorized usage.

Storage keys should be treated like highly privileged credentials (such as Domain Admin credentials). They should be limited to a few selected, trusted resources within the organization.

If you need to grant access to storage accounts without sharing the storage keys, there are other methods to accomplish this.

To permit access to storage resources without giving out your access keys, you can use a shared access signature. A shared access signature provides access to a resource in your account for an interval that you define and with the permissions that you specify.

If your service requires that you exercise more granular control over blob resources, or if you want to provide permissions for operations other than Read operations, you can use a shared access signature to make a resource accessible to users.

You can specify that a container should be public, in which case all Read operations in the container and any blobs within it are available for anonymous access.

An anonymous request does not need to be authenticated, so a user can perform the operation without providing account credentials.

Encryption

Client-side encryption for Microsoft Azure Storage contains new functionality to help developers encrypt their data inside client applications before uploading it to Azure Storage. The data can also be decrypted when it is downloaded.

Client-side encryption also supports integration with Azure Key Vault to store and manage the encryption keys. The storage service never sees the keys and is incapable of decrypting the data. This gives you the most control you can have. It's also fully transparent so you can inspect exactly how the library is encrypting your data to ensure that it meets your standards.

Storage Protection

Service-Level Storage Protection

From a service-level perspective, Microsoft has a responsibility to protect stored data to mitigate threats related to physical drives within each Azure datacenter. Storage within Azure is exposed in one or more storage accounts within each Azure subscription.

Within a Microsoft Azure datacenter, storage accounts do not reside on a single disk. Rather, data is distributed across several disks in the form of extents within the Azure fabric. They are replicated within and across datacenters based on customer-selected preferences, such as locally redundant storage or geo-redundant storage.

Microsoft protects the data stored within each datacenter with a comprehensive set of controls in alignment with the security certifications outlined at the Azure Trust Center website.

Subscription-Level Storage Protection

From a subscription-level perspective, customers can additionally protect storage accounts within their Azure subscription to mitigate threats related to subscription administrators within their organization. Data within each storage account is accessible to workloads in multiple ways: Queue, Table, Blob, and Files (SMB).

Each storage account has several layers of protection, including those that are provided by Microsoft and those that are controlled by the customer, using both Microsoft and third-party mechanisms.

Depending on the workload type and how it is accessed, data can be protected in the following ways:

Subscription-Level Workload Types and Storage

IaaS

IaaS workloads (virtual machines) contain their storage inside virtual hard disks (VHDs), which are stored as page blobs in one or more storage accounts. Some lift-and-shift virtual machine workloads access data over Files (SMB).

PaaS

PaaS workloads access storage by using one or more of the accessible methods outlined previously (Queues, Tables, and Blobs).

StorSimple

StorSimple appliances access storage over the storage account's REST API URL and encrypt the data they store.

Subscription-Level Protection Types

 

Data-At-Rest

Data-In-Transit

Data Access

IaaS

Performed by the customer by encrypting the virtual hard disk (VHD) files. Microsoft and third-party mechanisms are used.

Workloads (such as SQL Server) also support Transparent Data Encryption (TDE).

Technologies that assist with this are:

  • Key Vault
  • SQL Server Transparent Data Encryption
  • Azure Disk Encryption
  • Third-party virtual machine volume encryption

Performed by the customer by using transport encryption of traffic traversing exposed virtual machine network endpoints. Microsoft and third-party mechanisms are used.

Actions performed by Microsoft include disk encryption using BitLocker Drive Encryption for bulk import/export operations and encrypting traffic between Azure datacenters.

Technologies that assist with this are:

  • HTTPS/REST API
  • Azure endpoints
  • Azure Import/Export service

Performed by the customer by using native protections within the installed operating system to authenticate and authorize access to the virtual hard disk (VHD) data that is exposed through the operating system and published endpoints (for example, operating system file shares).

PaaS

Performed by the customer by encrypting data located in Queue, Table, and Blob storage. Uses Microsoft encryption mechanisms.

Technologies that assist with this are:

  • Key Vault
  • Client-Side Encryption (Preview)
  • Azure SQL Database Transparent Data Encryption (Preview)
  • Azure Table Encryption

Performed by the customer by using transport encryption of traffic traversing storage account network endpoints. Microsoft and third-party mechanisms are used.

Actions performed by Azure include encryption of traffic between Azure datacenters.

Technologies that assist with this are:

  • HTTPS/REST API
  • Storage account endpoints


Performed by the customer by using shared access keys and shared access signatures to provide access to data stored in Queue, Table, and Blob storage.

Technologies that assist with this are:

  • Shared access signatures
  • Storage account access keys
  • Storage account endpoints

StorSimple

Performed by the appliance using AES-256 encryption with Cipher Block Chaining (CBC) prior to saving to the mapped Azure storage account.

Performed by the customer by using transport encryption of traffic traversing exposed physical or virtual machine network endpoints. Performed by the appliance using SSL encryption.

Performed by the customer by using native protections within the installed operating system to authenticate and authorize access to attached StorSimple volumes.

Performed by the appliance by using authentication protocols (such as CHAP), ACLs, network access control, and Role-Based Access Control (RBAC).

Feature References

BitLocker Drive Encryption

https://technet.microsoft.com/en-us/library/cc732774.aspx

Key Vault

http://blogs.technet.com/b/kv/archive/2015/01/08/azure-key-vault-making-the-cloud-safer.aspx

http://azure.microsoft.com/en-us/services/key-vault

Azure Disk Encryption

https://channel9.msdn.com/Events/Ignite/2015/BRK3490

http://blogs.msdn.com/b/azuresecurity/archive/2015/05/11/azure-disk-encryption-management-for-windows-and-linux-virtual-machines.aspx

Third-Party virtual machine Volume Encryption

http://azure.microsoft.com/blog/2014/08/19/azure-virtual-machine-disk-encryption-using-cloudlink/

http://azure.microsoft.com/blog/2014/11/13/encrypting-azure-virtual-machines-with-cloudlink-securevm/

http://www.cloudlinktech.com/choose-your-cloud/microsoft-azure/

https://channel9.msdn.com/Blogs/AzurePartner/Guest-Post-CloudLink-Secures-Azure-VMs-via-BitLocker-and-Native-Linux-Encryption

Client-Side Encryption (Preview)

http://blogs.msdn.com/b/windowsazurestorage/archive/2015/04/29/microsoft-azure-storage-client-library-for-c-v1-0-0-general-availability.aspx
http://blogs.msdn.com/b/windowsazurestorage/archive/2015/04/29/getting-started-with-client-side-encryption-for-microsoft-azure-storage.aspx

https://azure.microsoft.com/en-us/documentation/articles/storage-client-side-encryption

Azure Import/Export Service

https://azure.microsoft.com/en-us/documentation/articles/storage-import-export-service/

http://blogs.msdn.com/b/windowsazurestorage/archive/2014/05/13/announcing-microsoft-azure-import-export-service-ga.aspx

Storage Account Access Keys

https://azure.microsoft.com/en-us/documentation/articles/storage-create-storage-account
http://blogs.msdn.com/b/mast/archive/2013/11/07/why-does-a-storage-account-have-two-access-keys.aspx

Shared Access Signatures

https://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-shared-access-signature-part-1/ and https://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-shared-access-signature-part-2/

http://blogs.msdn.com/b/windowsazurestorage/archive/2012/06/12/introducing-table-sas-shared-access-signature-queue-sas-and-update-to-blob-sas.aspx

http://blogs.msdn.com/b/skaufman/archive/2012/10/15/blob-storage-and-shared-access-signatures.aspx

SQL Server Transparent Data Encryption

https://msdn.microsoft.com/en-us/library/bb934049.aspx

http://blogs.msdn.com/b/sqlsecurity/archive/2015/04/29/announcing-transparent-data-encryption-for-azure-sql-database.aspx

https://channel9.msdn.com/Shows/Data-Exposed/TDE-in-Azure-SQL-Database

http://azure.microsoft.com/blog/2013/08/01/using-microsoft-sql-server-security-features-in-windows-azure-virtual-machines/

StorSimple Security

http://www.storsimple.com/Portals/65157/docs/StorSimple-Solution-Overview-Security.pdf

http://download.microsoft.com/download/6/9/A/69A81B3A-E111-4797-AD31-02671D501D87/StorSimple_Security_Brief.pdf

Storage for IaaS

A Microsoft Azure virtual machine is created from an image or a disk. All virtual machines use one operating system disk, a temporary local disk, and they enable the use of multiple data disks depending on the selected size of the virtual machine. All images and disks, except the temporary local disk, are created from virtual hard disk (VHD) files that are stored as page blobs in a storage account in Microsoft Azure.

You can use platform images that are available in Microsoft Azure to create virtual machines, or you can upload your own images to create customized virtual machines. The disks that are created from images are also stored in Azure Storage.

Disks

Disks can be leveraged in different ways with a virtual machine in Microsoft Azure. An operating system disk is a VHD that you use to provide an operating system for a virtual machine. A data disk is a VHD that you attach to a virtual machine to store application data. You can create and delete data disks whenever you have to.

Temporary Disk

Each virtual machine that you create has a temporary local disk, which is labeled as drive D by default. This disk exists only on the physical host server on which the virtual machine is running. It is not stored in blobs in Azure Storage. This disk is used by applications and processes that are running in the virtual machine for transient and temporary storage of data. It is also used to store page files for the operating system.

Caching

The operating system disk and data disk each have a host caching configuration setting called Host cache preference, which can improve performance under some circumstances. By default, Read/Write caching is enabled for operating system disks and all caching is off for data disks. Note that some workloads have specific configuration requirements with this setting. Its use should be reviewed carefully with the vendor and against the workload's specific needs.

Images

An image is a VHD file (.vhd) that you can use as a template to create a new virtual machine. You can use images from the Azure Image Gallery, or you can create and upload your own custom images. To create a Windows Server image, you must run the Sysprep command on your server to generalize and shut it down before you can upload the .vhd file that contains the operating system.

VHD Files

A .vhd file is stored as a page blob in Microsoft Azure Storage, and it can be used to create images and operating system disks or data disks in Microsoft Azure. You can upload a .vhd file to Microsoft Azure and manage it as you would any other page blob. The .vhd files can be copied, moved, or deleted as long as no lease exists on the VHD. A lease exists, for example, when the VHD belongs to an existing virtual machine.

A VHD can be in a fixed format or a dynamic format. Currently, however, only the fixed format is supported in Microsoft Azure. Often, the fixed format wastes space because most disks contain large unused ranges. However, in Microsoft Azure, fixed VHD files are stored in a sparse format, so you receive the benefits of fixed and dynamic disks at the same time.

When you create a virtual machine from an image, a disk is created for the virtual machine, which is a copy of the original VHD file. To protect against accidental deletion, a lease is created if you create an image, an operating system disk, or a data disk from a VHD file.

Feature References

Disks and Images in Azure

https://msdn.microsoft.com/en-us/library/azure/jj672979.aspx

Virtual Machine Disks in Azure

https://msdn.microsoft.com/en-us/library/azure/dn790303.aspx

Virtual Machine Images in Azure

https://msdn.microsoft.com/en-us/library/azure/dn790290.aspx

VHDs in Azure

https://msdn.microsoft.com/en-us/library/azure/dn790344.aspx

Manage Images using Windows PowerShell

https://msdn.microsoft.com/en-us/library/azure/dn790330.aspx

How To Change the Drive Letter of the Windows Temporary Disk

http://azure.microsoft.com/en-us/documentation/articles/virtual-machines-windows-change-drive-letter

Mandatory:

  • Note that Microsoft Azure virtual machines do not currently support the VHDX format. Ensure that any virtual machines or images that are planned for use in or migration to Microsoft Azure use the VHD format.
  • Note that if a VHD file on-premises is a dynamic disk, it is converted to a fixed disk when it is uploaded to Microsoft Azure.
  • Do not store data on the temporary disk. This disk provides temporary storage for applications and processes, and it is intended to only store transient data such as page or swap files. No data on the temporary disk will survive a host-machine failure or any other operation that requires moving the virtual machine to another piece of hardware.

Recommended: You can read or write to a single blob at up to a maximum of 60 MB/second. This is approximately 480 Mbps, which exceeds the capabilities of many client-side networks (including the physical network adapter on the client device).

In addition, a single blob supports up to 500 requests per second. If you have multiple clients that need to read the same blob, and you might exceed these limits, you should consider using a content delivery network (CDN) for distributing the blob.
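The single-blob figures above reduce to simple arithmetic. A minimal sketch, assuming a uniform request rate per client (the function and constant names are illustrative):

```python
BLOB_MAX_MB_PER_SEC = 60         # single-blob throughput figure from the text
BLOB_MAX_REQUESTS_PER_SEC = 500  # single-blob request figure from the text


def mb_per_sec_to_mbps(mb_per_sec):
    # 1 byte = 8 bits, so megabytes/second times 8 gives megabits/second.
    return mb_per_sec * 8


def should_consider_cdn(clients, requests_per_client_per_sec):
    """True when expected load on one blob exceeds the per-blob request limit."""
    return clients * requests_per_client_per_sec > BLOB_MAX_REQUESTS_PER_SEC
```

Here `mb_per_sec_to_mbps(BLOB_MAX_MB_PER_SEC)` reproduces the 480 Mbps figure, and `should_consider_cdn` flags cases where a content delivery network is worth evaluating.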

Optional: Different virtual machine sizes allow for a different number of data disks to be attached. Be sure to choose the appropriate size virtual machine, based on the number of data disks that you may anticipate needing. For example, a size A1 virtual machine can have a maximum of two data disks. If you need more than two data disks, choose something bigger than an A1.

Design Guidance

When you design storage for IaaS, consider the following:

Capability Considerations

Capability Decision Points

Capability Models in Use

The IaaS design will depend heavily on the storage account design.

For IaaS workloads, it is important to first understand the I/O (or IOPS) and the profile of a workload to determine the stress and expectations it will put on the storage accounts. Based on this information, you can determine how VHDs should be stored in storage accounts and what kind of limitations you will be subject to.

The primary constraining factor is the number of VHD files that can be in each storage account.

For virtual machines in the Basic tier, do not place more than 66 highly used VHDs in a storage account to avoid the 20,000 total request rate limit (20,000/300).

For virtual machines in the Standard tier, do not place more than 40 highly used VHDs in a storage account (20,000/500). The term highly used refers to VHDs that push the upper limits of the total request rates. If you have VHDs that are not highly used and do not come close to the maximum request rates, you can put more VHDs in the storage account.

Note that this refers to virtual hard disks, not virtual machines. Virtual machines may indeed contain multiple virtual hard disks.

Deployable virtual machine images also reside in storage accounts.

When uploading and deploying images, a storage account must be used to house those images. The decision comes down to which storage account should be used for images versus live virtual machines.

We recommend that all custom virtual machine images be stored in a separate dedicated storage account, from which deployments can occur.

This keeps images separate from live virtual machines and prevents them from usurping any IOPS. Deployment can occur by copying an image from one storage account to another, thus keeping the images isolated and protected. This also allows you to give special permissions to the images storage account that you might not grant to the live virtual machines storage account (such as permissions for image deployment engineers).

Also, never deploy an image across a VPN connection. Always maintain the source image in an Azure Storage account. This will provide for a much faster deployment, instead of pushing it across the VPN each time.

Choosing which storage account to deploy a virtual machine into is not a permanent decision.

If necessary, it is possible to migrate a virtual machine from one storage account to another. For detailed procedures, see Migrate Azure Virtual Machines between Storage Accounts.

Migrating a virtual machine to another storage account should be done on an as-needed basis. If this is needed more frequently, we recommend that you re-examine the storage account architecture to ensure adequate coverage for all deployment points.

Consider what to do if multiple data disks are needed in a single (striped) volume.

If multiple data disks need to appear as a single volume within a virtual machine, you are limited to using LRS only (you cannot use GRS for those VHDs).

If multiple data disks need to appear as a single volume, it is not possible to enable this in a storage account that is configured with GRS.

Those VHDs must be stored in a storage account configured as LRS, or if GRS is still a requirement, each data disk must be kept as a separate volume.

Data loss may occur if you use striped volumes (Windows or Linux) in geo-replicated storage accounts due to loose consistency for VHDs distributed across storage accounts.

If a storage outage occurs and it requires restoring data from a replicated copy, there is no guarantee that the Write order of the striped disk set would be intact after it is restored.

Disk cache settings have an effect on the performance of the virtual machine disks.

The operating system disk and the data disk have a host caching setting that can improve performance under some circumstances. However, these settings can also negatively affect performance in other circumstances, depending on the application.

Host caching is off by default for Read and Write operations for data disks. Host caching is on by default for Read and Write operations for operating system disks.

Only change these settings if the workload would benefit from the change in cache to improve performance.

Cache setting changes for the operating system disk require a reboot. Cache setting changes for a data disk do not.
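As a rough illustration of the VHD placement limits described earlier in this guidance, the following Python sketch reproduces the 20,000/300 and 20,000/500 calculations. It assumes that 300 and 500 are the per-VHD request rates for the Basic and Standard tiers, as the divisors in the text imply. The function names are illustrative only.

```python
# Total request rate limit per storage account (from the text).
STORAGE_ACCOUNT_REQUEST_LIMIT = 20000

# Assumed per-VHD request rate for each virtual machine tier,
# taken from the divisors shown in the text (20,000/300 and 20,000/500).
PER_VHD_LIMIT = {"Basic": 300, "Standard": 500}


def max_highly_used_vhds(tier):
    """Whole number of fully loaded VHDs that fit under the account limit."""
    return STORAGE_ACCOUNT_REQUEST_LIMIT // PER_VHD_LIMIT[tier]


def storage_accounts_needed(vhd_count, tier):
    """Storage accounts required to host vhd_count highly used VHDs."""
    per_account = max_highly_used_vhds(tier)
    return -(-vhd_count // per_account)  # ceiling division
```

Because the count applies to virtual hard disks rather than virtual machines, pass the total number of highly used VHDs (operating system and data disks) when estimating how many storage accounts a deployment needs.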

Storage Management

Data in Azure Storage can be accessed and managed in a variety of ways, and through numerous tools and processes. This section covers the various mechanisms that support managing Microsoft Azure Storage.

Graphical User Interface (GUI) Tools

Graphical user interface (GUI)-based tools are those that access Azure Storage in an interface that mimics File Explorer. This includes functionality such as drag-and-drop-based tools that allow you to view and access data as you would on a local or network drive on a server. These tools are easy to use and understand, and they are the best option for those who are new to Azure Storage.

Command-Line Interface (CLI) Tools

Command-line interface (CLI)-based tools are those that access Azure Storage from a command line, such as Azure PowerShell. This allows you to include data operations (such as move, copy, and delete) within automation scripts. These interfaces provide many parameters and switches for working with the data. They are best suited to advanced users who are already familiar with Windows PowerShell and require automation as part of their Azure-based solution.

REST Interfaces

The REST APIs for the Azure Storage services offer programmatic access to the Blob, Queue, Table, and File services in Azure, or in the development environment, via the storage emulator.

All storage services are accessible via REST APIs. Storage services can be accessed from within a service running in Azure, or directly over the Internet from any application that can send an HTTP/HTTPS request and receive an HTTP/HTTPS response. These interfaces are best suited for developers or solutions that require detailed information or control over Azure Storage services.
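To make the REST access pattern concrete, the following Python sketch builds the HTTPS URL for a List Blobs request against the Blob service. The account and container names are supplied by the caller and are hypothetical, and the optional shared access signature parameter is an assumption about how the caller authorizes the request. The sketch only constructs the URL; sending the request and handling authorization are left out.

```python
from urllib.parse import urlencode


def list_blobs_url(account, container, sas_token=""):
    """Build the HTTPS URL for a List Blobs request on one container.

    account and container are caller-supplied (hypothetical) names.
    sas_token, if given, is an existing shared access signature query string.
    """
    query = urlencode({"restype": "container", "comp": "list"})
    url = "https://{0}.blob.core.windows.net/{1}?{2}".format(
        account, container, query)
    if sas_token:
        # Append the signature parameters after the operation parameters.
        url += "&" + sas_token.lstrip("?")
    return url
```

Any client that can issue an HTTPS GET to the resulting URL, with valid authorization, receives an XML listing of the blobs in the container.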

Client Libraries

The Azure Storage Client Library reference for .NET contains the current version of the Storage Client Library for .NET. You can install the Storage Client Library for .NET from NuGet or from the Azure SDK for .NET. The source code for the Storage Client Library for .NET is publicly available on GitHub. The Azure Storage Native Client Library is a C++ library for working with the Azure Storage services.

Cross-Platform Options

The Azure Cross-Platform Command-Line Interface (xplat-cli) provides a set of open source, cross-platform commands for working with the Azure platform. The xplat-cli provides much of the same functionality found in the Azure portal, such as the ability to manage websites, virtual machines, storage, and SQL Databases. The xplat-cli is written in JavaScript, and it requires Node.js.

Vendor Solutions

Other vendors are free to distribute their Azure Storage management tools, in addition to what Microsoft provides.

Storage Emulator

The Azure Storage Emulator provides a local environment that emulates the Azure Blob, Queue, and Table services for development purposes. By using the storage emulator, you can test your application against the storage services locally, without incurring any costs.
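One way to switch an application between the emulator and the live service is to select the service endpoint at run time. The Python sketch below assumes the emulator's default local endpoints and its fixed development account name; the function name and the use_emulator flag are illustrative and not part of any SDK.

```python
# Default local account and ports used by the storage emulator (assumed defaults).
EMULATOR_ACCOUNT = "devstoreaccount1"
EMULATOR_PORTS = {"blob": 10000, "queue": 10001, "table": 10002}


def service_endpoint(service, account, use_emulator=False):
    """Return the base endpoint for a storage service.

    service is "blob", "queue", or "table"; account is a caller-supplied
    (hypothetical) storage account name that is ignored for the emulator.
    """
    if use_emulator:
        return "http://127.0.0.1:{0}/{1}".format(
            EMULATOR_PORTS[service], EMULATOR_ACCOUNT)
    return "https://{0}.{1}.core.windows.net".format(account, service)
```

Keeping this choice in one place lets the same application code run locally against the emulator during development and against Azure Storage in production.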

Data Migration

After the suitability of the data has been determined, there are multiple methods to move data to Azure and manage its lifecycle:

  • StorSimple (outlined in the following section) can be used for unstructured data that is typically stored on traditional file servers.
  • The Azure Import/Export service is used to transfer large amounts of data to Azure Blob storage in situations where uploading data over the network is not feasible. You can also use this service to transfer a large amount of data from Blob storage to your on-premises storage without the network transfer time and network egress costs. Data is stored on physical hard drives, encrypted with BitLocker, and physically shipped to an Azure datacenter.

For smaller amounts of data, a manual copy using a GUI tool or a CLI program can accomplish the data move. AzCopy is perhaps the most popular choice to move small amounts of data to and from Azure Storage accounts. For more information, see Getting Started with the AzCopy Command-Line Utility.

Feature References

Storage Services REST API Reference

https://msdn.microsoft.com/en-us/library/azure/dd179355.aspx

Storage Client Library Reference

https://msdn.microsoft.com/en-us/library/azure/dn261237.aspx

Import/Export Service REST API Reference

https://msdn.microsoft.com/en-us/library/azure/dn529096.aspx

Azure GUI Storage Explorers

Link

Using the Azure Cross-Platform Command-Line Interface

http://azure.microsoft.com/en-us/documentation/articles/virtual-machines-command-line-tools/#Commands_to_manage_your_Storage_objects

Install and Configure the Azure Cross-Platform Command-Line Interface

http://azure.microsoft.com/en-us/documentation/articles/xplat-cli

Use the Microsoft Azure Import/Export Service to Transfer Data to Blob Storage

http://azure.microsoft.com/en-us/documentation/articles/storage-import-export-service

Using the Azure Storage Emulator

https://azure.microsoft.com/en-us/documentation/articles/storage-use-emulator/

Azure Throughput Analyzer

http://research.microsoft.com/en-us/downloads/5c8189b9-53aa-4d6a-a086-013d927e15a7/default.aspx

Mandatory:

  • When you import publish settings, the information for accessing your Azure subscription is stored in a ".azure" directory located in your user directory. Your user directory is protected by your operating system; however, it is recommended that you take additional steps to encrypt your user directory to protect that information. Products for encryption include BitLocker (Windows), FileVault (Mac), and Encrypted Home (Ubuntu or other equivalent systems).
  • When using the Import/Export service, the data on the drive must be encrypted by using BitLocker prior to creating the job and transferring data. This protects your data while it is in transit. For an export job, the Import/Export service encrypts your data before shipping the drive back to you. Additional details about the Import/Export service and its operation are provided in the previous Feature References table.
  • You must provide your shipping tracking number to the Azure Import/Export service; otherwise, your job cannot be processed. Additional details about the Import/Export service and its operation are provided in the previous Feature References table.

Recommended: Azure Storage supports HTTP and HTTPS; however, using HTTPS is highly recommended.

Optional: During shipping, physical media may need to cross international borders. You are responsible for ensuring that your physical media and data are imported and exported in accordance with the applicable laws. Before shipping the physical media, check with your advisors to verify that your media and data legally can be shipped to the identified datacenter. This helps ensure that it reaches Microsoft in a timely manner.

Design Guidance

When you design storage management, consider the following:

Capability Considerations

Capability Decision Points

Capability Models in Use

There is a need to determine whether it is appropriate to store the data in Azure Storage.

When moving data into Azure Storage (aside from IaaS VHD files), it is important to examine and determine the suitability of placing that data into Azure. There are three primary factors to examine to determine if data should move to Azure Storage:

  • Access frequency – How often is the data accessed? Data that is accessed on a frequent or regular basis may not be a good candidate to store in the cloud. Data that is accessed less frequently, such as archive data, is better suited for the cloud.
  • Active data vs. passive data – How much data latency is tolerable? When data that is stored in the cloud needs to be accessed, additional latency should be expected when reading or writing that data. Applications or services that cannot handle latency should not store their data in the cloud. For example, do not install a SQL Server on-premises and store its databases in cloud storage.
  • Privacy requirements – What kind of data is permitted in cloud services for your organization? Based on what country, industry, and customer you are working with, there may be unique data privacy rules that you must take into consideration. These data privacy rules may govern what type of data is allowed to be stored in the cloud. For a good reference, see International Privacy Laws.

Suitability is something that is unique to every customer. It is important to first understand the data privacy laws that may apply to the customer's country, industry, and any regulatory bodies that may govern it.

From there, begin to understand the sensitivity of the data that may be stored. Then look at the technical aspects of access frequency and active vs. passive data.

It doesn't make sense to look at the technical aspects of the data before looking at the legal aspects of storing the data in public cloud storage.

Storage Monitoring

Monitoring

Monitoring your Azure Storage environment is as important as monitoring your on-premises storage environment. Within your storage service, the following areas need to be monitored: service health, availability, performance, and capacity.

Monitoring Service Health

You can use the Azure portal to view the health of the storage service and other Azure services in all Azure regions. This is where you can see if there are any issues outside of your control that may be affecting your storage service.

Monitoring Availability

You should monitor the availability of the storage services in your storage account by monitoring the value in the Availability column in the hourly or minute metrics tables. The availability of your storage should be at 100%. If not, you need to identify what is causing degradation in your storage.

Monitoring Performance

There are multiple areas in your storage services that you should monitor for performance trends. Some of the key areas to monitor include AverageE2ELatency, TotalIngress, and TotalEgress.

Additionally, it is important to consider monitoring for storage account throttling. Throttling is the mechanism that Azure uses to limit the IOPS allocated to a given Azure storage account (currently 20,000 IOPS). When this amount is exceeded, Azure implements throttling to ensure that limitations in the service are preserved.

Although it is somewhat unlikely in well-planned Azure environments, monitoring of the Throttling Error and Throttling Error Percentage metrics is an effective mechanism to identify when throttling events occur in the service. This operation is outlined in the following article: How to Monitor for Storage Account Throttling.
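The availability, latency, and throttling checks described above can be combined into a simple review of collected metrics. The Python sketch below is a minimal illustration that assumes the metrics have already been retrieved into dictionaries. The Availability, AverageE2ELatency, and ThrottlingError keys follow the metric names used in the text, while the "hour" key, the function name, and the latency baseline are hypothetical.

```python
def flag_metric_rows(rows, latency_baseline_ms):
    """Return (hour, reasons) pairs for metric rows that need attention.

    rows is a list of dictionaries holding already-retrieved hourly metrics.
    latency_baseline_ms is a baseline chosen by the operator (hypothetical).
    """
    findings = []
    for row in rows:
        reasons = []
        if row["Availability"] < 100.0:
            reasons.append("availability below 100%")
        if row["ThrottlingError"] > 0:
            reasons.append("throttling errors recorded")
        if row["AverageE2ELatency"] > latency_baseline_ms:
            reasons.append("latency above baseline")
        if reasons:
            findings.append((row["hour"], reasons))
    return findings
```

A row that reports full availability, no throttling errors, and latency at or below the baseline produces no finding; any other row is returned with the reasons it was flagged.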

Optional: You should continuously monitor your Azure applications to ensure they are healthy and performing as expected.

Alerting

You can configure alerting for different services, including Azure Storage. When you configure alerting, you have the option of having those alerts emailed to the Co-administrators for the subscription.

An alert rule enables you to monitor an Azure service based on a metric value that is set by your organization. When a metric reaches the threshold value configured for a rule, the alert rule becomes active and registers an alert. This alert is then logged in the system.

Diagnosing Azure Storage Issues

When there are problems with your storage, there are a number of ways that you may become aware of these issues. These include:

  • A major failure that causes the application to crash or to stop working.
  • Significant changes from baseline values in the metrics you are monitoring as described in the previous section.
  • Reports from users of your application that some particular operation didn't complete as expected or that some feature is not working.
  • Errors generated within your application that appear in log files or through some other notification method.

Typically, issues related to Azure Storage services fall into one of the following four broad categories:

  • Your application has a performance issue – This problem is reported by your users or revealed by changes in the performance metrics.
  • There is a problem with the Azure Storage infrastructure in one or more regions.
  • Your application is encountering an error - This problem is reported by your users or revealed by an increase in one of the error count metrics you monitor.
  • Storage emulator-related issues - During development and test you may be using the local storage emulator, and you may encounter some issues that relate specifically to usage of the storage emulator.

Troubleshooting

There are some common issues that you may encounter from your Azure Storage services that you will need to troubleshoot. These issues include:

  • Performance of the storage services
  • Availability of the storage services

To troubleshoot applications that use Azure Storage, you can use a combination of tools to determine when an issue has occurred and what the cause of the problem may be. These tools include:

  • Azure Storage Analytics provides metrics and logging for Azure Storage.
  • Storage metrics tracks transaction metrics and capacity metrics for your storage account. By using metrics, you can determine how your application is performing according to a variety of different measures.
  • Storage logging logs each request to the Azure Storage services to a server-side log. The log tracks detailed data for each request, including the operation performed, the status of the operation, and latency information.
  • The Azure Management portal allows you to configure metrics and logging for your storage accounts. You can also view charts and graphs that show how your application is performing over time, and configure alerts in the portal to notify you if your application performs differently than expected for a specified metric.
  • Server logs for Azure Storage are stored as blobs, so you can use AzCopy to copy the log blobs to a local directory for analysis using Microsoft Message Analyzer.
  • Microsoft Message Analyzer is a tool that consumes log files and displays log data in a visual format that makes it easy to filter, search, and group log data into useful sets that you can use to analyze errors and performance issues.

Azure Portal Options

The Storage Metrics feature is available in the Azure portal to help you monitor your storage performance. Storage Metrics can be thought of as an equivalent to Windows Performance Monitor counters in the Microsoft Azure service.

A comprehensive set of metrics (counters) lets you see data from services, such as the percentage of successful or failed service requests and the service's availability. The following image shows the monitoring page in the Azure portal where you can view metrics such as total requests, success percentage (Blob), success percentage (Table), and availability.


System Center Operations Manager Management Packs

Microsoft System Center Operations Manager allows for monitoring Azure Storage by utilizing Management Packs. The management pack for Microsoft Azure enables monitoring the availability and performance of Azure Fabric resources that are running on Microsoft Azure.

The management pack runs on a specified server pool and then uses various Microsoft Azure APIs to remotely discover and collect instrumentation information about a specified Microsoft Azure resource, such as a cloud service, storage, or virtual machines.

Feature References

Monitor a Storage Account in the Azure Management Portal

http://azure.microsoft.com/en-us/documentation/articles/storage-monitor-storage-account/

Monitor, diagnose, and troubleshoot Microsoft Azure Storage

http://azure.microsoft.com/en-us/documentation/articles/storage-monitoring-diagnosing-troubleshooting/

End-to-End Troubleshooting using Azure Storage Metrics and Logging, AzCopy, and Message Analyzer

http://azure.microsoft.com/en-us/documentation/articles/storage-e2e-troubleshooting

Storage Analytics

http://azure.microsoft.com/en-us/documentation/articles/storage-analytics

How to: Receive Alert Notifications and Manage Alert Rules in Azure

https://msdn.microsoft.com/en-us/library/azure/dn306638.aspx

Understanding Monitoring Alerts and Notifications in Azure

https://msdn.microsoft.com/en-us/library/azure/dn306639.aspx

Azure Storage Analytics Metrics Management Pack

http://blogs.technet.com/b/omx/archive/2014/08/15/azure-storage-analytics-metrics-management-pack.aspx

System Center Management Pack for Microsoft Azure

http://www.microsoft.com/en-us/download/details.aspx?id=38414

How to Monitor for Storage Account Throttling

http://blogs.msdn.com/b/mast/archive/2014/08/02/how-to-monitor-for-storage-account-throttling.aspx

Mandatory:

  • Storage Analytics is not enabled by default, so you need to enable this feature for any services that you want to monitor.
  • You can create up to 10 alert rules for each Azure subscription. If you reach the maximum number of allowable rules, you must remove an existing rule before you can create another.
  • The Azure File service does not currently support Storage Analytics.
  • Currently, capacity metrics are only available for the Blob service.

Recommended:

  • Consider costs when you select the metrics. There are transaction and egress costs associated with refreshing monitoring displays. Additional costs are associated with examining monitoring data in the Management portal.
  • As a user of Azure Storage services, you should continuously monitor the services your application uses for any unexpected changes in behavior (such as slower than usual response times). Also use logging to collect more detailed data and to analyze a problem in depth. The diagnostics information you obtain from monitoring and logging will help you determine the root cause of the issue your application encountered.

Optional:

  • It is not possible to set minute metrics by using the Azure portal. However, you can set minute metrics programmatically by using Azure PowerShell, or via the Azure Preview portal. We recommend that you set minute metrics for the purposes of testing and for investigating performance issues with your application. Note that the Azure portal cannot display minute metrics, only hourly metrics.
  • The Management Pack for Microsoft Azure provides no functionality for importing. For each Microsoft Azure subscription that contains Azure resources you want to monitor, you must configure discovery and monitoring. First use the Microsoft Azure wizard in the administration section of the Operations console, and then use the Microsoft Azure Monitoring template in the authoring section of the Operations console.

Design Guidance

When you design storage monitoring, consider the following:

Capability Considerations

Capability Decision Points

Capability Models in Use

Storage monitoring and analytics are not enabled by default. Monitoring would grant access to:

  • Metrics - Service health, capacity, availability and performance
  • Logs – Detailed logs of every operation to support troubleshooting

Because storage monitoring and analytics are not enabled by default, the decision is whether to enable them, based on the purpose they will serve.

It is possible to log not only storage metrics and performance information, but also authentication requests, anonymous requests, transaction metrics, and capacity metrics.

In most models where organizations are starting their initial use of Azure, it is wise to enable storage monitoring and analytics to observe the data that is available during the collection process. In general, we recommend that you enable storage analytics at least:

  • During testing to establish baseline metrics.
  • In production to assist with troubleshooting.

Note that analytics and monitoring can be enabled or disabled at any time.

All metrics data is written by the services of a storage account. As a result, each Write operation performed by Storage Analytics is billable. The amount of storage used by metrics data is also billable.

Every request made to a storage account is either billable or non-billable. Storage Analytics logs each individual request made to a service, including a status message that indicates how the request was handled.

Similarly, Storage Analytics stores metrics for a service and the API operations of that service, including the percentages and count of certain status messages.

Together, these features can help you analyze your billable requests, make improvements on your application, and diagnose issues with requests to your services.

When looking at Storage Analytics data, you can use the tables in the Storage Analytics Logged Operations and Status Messages areas to determine what requests are billable.

Then you can compare your logs and metrics data with the status messages to see if you were charged for a particular request.

You can also use the tables in this area to investigate availability for a storage service or individual API operation.

Cloud Integrated Storage

A Cloud Integrated Storage solution is one that uses a combination of on-premises storage and cloud storage. The purpose of cloud integrated storage is to take advantage of the lower cost of cloud storage (as compared to traditional on-premises SAN storage), but still connect to and manage it similarly to how you would treat on-premises storage.

StorSimple

Microsoft Azure StorSimple is a cloud integrated storage solution that manages storage tasks between on-premises devices and Azure cloud storage. Azure StorSimple is designed to reduce storage costs, simplify storage management, improve disaster recovery capability and efficiency, and provide data mobility. StorSimple has several components:

  • The Azure StorSimple device is an on-premises hybrid storage appliance that provides primary storage and iSCSI access to data stored elsewhere. It manages communication with cloud storage, and helps ensure the security and confidentiality of all data that is stored in the Microsoft Azure StorSimple system.
  • You can use Azure StorSimple to create a virtual device that replicates the architecture and capabilities of the actual hybrid storage device. The StorSimple Virtual Appliance runs on a single node in an Azure virtual machine. (A virtual device can only be created on an Azure virtual machine. You cannot create one on a StorSimple device or an on-premises server.)
  • You can use the virtual device to back up and clone data from your hosts. Note that data stored in Azure Storage by StorSimple is not directly accessible with the standard storage tools outlined previously. This is because it is encrypted and protected by the StorSimple devices that use it. Only StorSimple 8000 series devices can take advantage of the Microsoft StorSimple Virtual Appliance to provide on-premises-like access to data stored in Azure.
  • Azure StorSimple provides a web-based user interface (the StorSimple Manager Service) that enables you to centrally manage datacenter and cloud storage. This is built into the Azure portal.
  • The Azure StorSimple Snapshot Manager is an optional console that you can use to create application-consistent, point-in-time backup copies of local and cloud data.
  • The Azure StorSimple Adapter for SharePoint is an optional component that transparently extends StorSimple storage and data protection features to SharePoint server farms. The adapter works with a Remote Blob Storage (RBS) provider and the SQL Server RBS feature, allowing you to move blobs to a server that is backed up by the Microsoft Azure StorSimple system. StorSimple then stores the blob data locally or in the cloud, based on usage.

Feature References

StorSimple MSDN Reference Site

https://msdn.microsoft.com/en-us/library/azure/dn772442.aspx

StorSimple 8000 Series Chalktalk

https://www.youtube.com/watch?v=4MhJT5xrvQw

StorSimple Hybrid Cloud Storage Features and Benefits

http://www.microsoft.com/en-us/server-cloud/products/storsimple/overview.aspx

Hybrid Cloud Storage

http://www.controlyourstorage.com/

Mandatory:

  • StorSimple Update 1 provides a migration toolkit to enable customers to upgrade from a StorSimple 5000/7000 Series device to an 8000 Series device to take advantage of its improved feature offerings while maintaining their data in-place.
  • There are user name and password length requirements for the Snapshot Manager user name and password. Be sure to reference the latest device series documentation before you select a user name or password.
  • There are user name and password length requirements for the CHAP user name and password. Be sure to reference the latest device series documentation before selecting a user name or password.
  • The service data encryption key is generated only on the first device registered with the service. All subsequent devices that are registered with the service must use the same service data encryption key. It is very important to save this in a secure location. A copy of the service data encryption key should be stored in such a way that it can be accessed by an authorized person and can be easily communicated to the device administrator.

Recommended:

  • When registering the StorSimple device with the StorSimple manager, you are provided with a registration key. The service registration key is a long key that contains 100+ characters. We recommend that you copy the key and save it in a text file in a secure location so that you can use it to authorize additional devices as necessary.
  • When using the StorSimple Virtual Appliance, make sure that the virtual network is in the same region as the cloud storage accounts that you are going to be using with the virtual device.

Optional: Remote management is turned off by default. You can use the StorSimple Manager service to enable it. As a security best practice, remote access should be enabled only during the time period that it is actually needed.

Design Guidance

Configuration best practices are implemented during deployment and are specific to the way StorSimple is deployed. The following configuration recommendations are provided.

Area

Recommendation

High availability

Connect the power supply cables from both the power control modules. Connect each power supply cable to different power circuits in the datacenter. This prevents loss of power in the event that a single power circuit fails.

Connect all four network ports from each of the controllers to the respective network subnets. During the initial configuration of StorSimple, designate at least one port for use with the management interface and at least two ports for use with the iSCSI data interface. If eight network ports are not available, fewer ports can be used, although with a lower level of redundancy.

When using multiple VLANs, disable iSCSI for the management (MGMT) interface.

When initially configuring the network interfaces for the appliance and completing the setup, you must configure a secondary DNS server. This provides high availability if the primary DNS server goes down.

Microsoft Multipath I/O (MPIO) for iSCSI should be installed and configured on the host to leverage the multiple data paths connected to StorSimple. Each iSCSI host should have more than one network adapter and be connected to more than one network switch for physical data path redundancy. The appropriate MPIO policies (for example, performance, round robin, or failover) should be selected based on the desired behavior across the multiple network paths.

Configure StorSimple with 2-node file server cluster configurations. By removing single points of failure and building redundancy on the host, the entire solution becomes highly available.

Use Continuously Available (CA) shares, which became available in Windows Server 2012 (SMB 3.0), for high availability during failover of the storage controllers.

Access control

Always associate at least one Access Control Record (ACR) with a volume.

When assigning more than one ACR to a volume, care should be taken to ensure the combinations of the ACRs do not expose the volume in a way where it can be concurrently accessed by more than one non-clustered host. StorSimple is designed to display a pop-up warning message if multiple ACRs, which together expose the volume to more than one host, are assigned to a volume.

Storage accounts

Keep primary volume data at 30-40% and backup data at 60-70% of the total capacity for the storage account. This works out to be about 60-70 TB of primary data and 130-140 TB of backup data per storage account.

When calculating the amount of primary data, it is important to consider the change rate or growth rate of primary data. Factor that into the 60-70 TB limit for primary data.

Customers should create storage accounts with Geo-Redundant Storage (GRS) enabled. This option is included within the pricing agreement.

Customers should also choose storage accounts within the same subscription as the StorSimple Manager resource they have created. This simplifies workflows behind the scenes and provides a more streamlined experience.
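As a rough illustration of factoring growth into the primary data limit described above, the following Python sketch projects primary data forward and compares it with the lower bound of the 60-70 TB guideline. The compound-growth model, the function names, and the sample figures (30 TB growing 25% per year) are assumptions made for illustration; they are not part of StorSimple or its tooling.

```python
def projected_primary_tb(current_tb, annual_growth_rate, years):
    """Project primary data size, assuming compound annual growth."""
    return current_tb * (1 + annual_growth_rate) ** years


def within_primary_limit(current_tb, annual_growth_rate, years, limit_tb=60.0):
    """Return True if the projected primary data stays within limit_tb."""
    return projected_primary_tb(current_tb, annual_growth_rate, years) <= limit_tb


if __name__ == "__main__":
    # Hypothetical deployment: 30 TB of primary data growing 25% per year.
    for year in range(5):
        size = projected_primary_tb(30, 0.25, year)
        status = "ok" if within_primary_limit(30, 0.25, year) else "exceeds limit"
        print(f"year {year}: {size:.1f} TB ({status})")
```

Under these assumed figures the projection stays below 60 TB through year 3 and exceeds it in year 4, which suggests planning an additional storage account before that point.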

Volume size

To minimize the processing overhead, create the volume size as close to the anticipated usage as possible. The volume size can be increased later to accommodate any increase in usage beyond what was originally anticipated if the host file system allows it.

If the dataset within a volume is no longer required, do not reformat the volume. Instead, delete the original volume and create a new volume. This prevents unnecessary overhead from the UNMAP feature on the existing volume.

Create a volume size such that 80% of the volume is used for currently expected data consumption and 20% is available for future growth. For example, if you have a data set of 800 GB, create a volume that is 1 TB in size. As the volume approaches fullness, the volume size can be increased. We recommend that you increase the volume capacity when it reaches 95% full.

When configuring volumes that will be used on the StorSimple Virtual Appliance (by failover or cloning), it is important to consider their size. The StorSimple Virtual Appliance has a maximum capacity of 30 TB. Therefore, the volumes that fail over to or are cloned on the StorSimple Virtual Appliance cannot together exceed 30 TB.

This capacity is the actual provisioned capacity. Therefore, a volume that is provisioned as 10 TB but holds only 2 TB of data still accounts for 10 TB of the 30 TB maximum capacity on the StorSimple Virtual Appliance.

We recommend that you size volumes so that a single volume does not exceed 30 TB or that all the volumes within the volume container do not exceed 30 TB. Volume containers are the unit of failover and all volumes within a volume container will fail over to a StorSimple Virtual Appliance during the failover process.

For volumes that will be used with the StorSimple Virtual Appliance, it is best to consider the use case. For disaster recovery use cases, smaller volumes are recommended with a size of 2-4 TB. This helps meet the expected recovery point objective (RPO) and recovery time objective (RTO).

For development and test use cases, larger volumes can be used. However, we recommend that you not exceed a maximum volume size of 16 TB for volumes that will be cloned on the StorSimple Virtual Appliance.
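The volume sizing rules above (80% expected data, expansion at 95% full, and the 30 TB provisioned-capacity limit of the StorSimple Virtual Appliance) can be sketched as simple checks. This is a minimal Python sketch; the function names are hypothetical, and it follows the document's own example in treating 1 TB as 1000 GB.

```python
def recommended_volume_gb(expected_data_gb, fill_percent=80):
    """Size a volume so that the expected data fills fill_percent of it."""
    return expected_data_gb * 100 / fill_percent


def should_expand(used_gb, volume_gb, threshold_percent=95):
    """Return True once usage reaches the point at which to grow the volume."""
    return used_gb * 100 >= threshold_percent * volume_gb


def fits_virtual_appliance(provisioned_tb, max_capacity_tb=30):
    """Check provisioned (not used) capacity against the virtual appliance limit."""
    return sum(provisioned_tb) <= max_capacity_tb


if __name__ == "__main__":
    print(recommended_volume_gb(800))            # 1000.0 (GB, that is 1 TB)
    print(should_expand(950, 1000))              # True
    print(fits_virtual_appliance([10, 10, 10]))  # True
    print(fits_virtual_appliance([10, 10, 16]))  # False
```

Note that the last check sums provisioned sizes, so a 10 TB volume counts as 10 TB even if it holds only 2 TB of data.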

Volume usage type

Select the correct volume usage type: primary or archive. Select primary volumes for data that is expected to be used, and select archive volumes for data that is not expected to be used or is already classified as archival data.

Although the tiering algorithm does not change for primary versus archive volumes, the block size is adjusted to expedite data movement to the cloud. The chunk size for primary volumes is 64 KB, and the chunk size for archive volumes is 512 KB. With larger chunk sizes for archive volumes, when a chunk tiers to the cloud, more data is transferred to the cloud as compared to primary volumes with smaller chunk sizes.

NTFS allocation unit size

For all StorSimple volumes, regardless of the volume type, format with a 64 KB allocation unit size.

Always select quick formatting when formatting the volume. Windows Server 2012 and later uses GPT by default.

Do not use spanned volumes. They are not supported by StorSimple.

Index search and virus scan

Configure the index search or virus scan application to always perform incremental operations. An incremental operation means that only new data, which is most likely still on the local tiers of StorSimple, is operated on by way of the index search or virus scan. The data that has been tiered to the cloud is not accessed during the incremental operation.

Disable any automatically configured full-scan operations.

Ensure the correct search filters and settings are configured so that only the intended types of files are scanned. For example, image files (JPEG, GIF, and TIFF) and engineering drawings should not be scanned during the incremental or full index rebuild.

Virtual devices

Create the StorSimple Virtual Appliance in the same geographical region as the physical appliance that hosts the primary volumes that will be ported to the StorSimple Virtual Appliance. Specifically, the backups of the volumes need to have a storage account that is in the same geographical region as the storage account for the StorSimple Virtual Appliance. If the StorSimple Virtual Appliance is deployed with a storage account in a different geographical region than the primary volumes, a customer may see a degradation in performance.

Provision the StorSimple Virtual Appliance before it is actually needed, especially for disaster recovery use cases. The StorSimple Virtual Appliance can be on standby without incurring additional costs. By having the StorSimple Virtual Appliance provisioned, the customer saves time when the actual workflow for porting data to the StorSimple Virtual Appliance needs to occur.

Design Guidance

Operational best practices are part of everyday (or ongoing) operations. The following operational guidance is provided:

Area

Recommendation

Network connectivity

The Internet bandwidth available to StorSimple should be at least 20 Mbps at all times. This 20 Mbps should be dedicated and not shared with any other applications. The optimal bandwidth for a customer deployment depends on the RTO.

The maximum Internet network latency between the on-premises appliance and Azure should not exceed a certain limit. The main variables that impact network latency are the distance from the datacenter and the internal network architecture, including the number of routers. Test the network latency with the solution and verify that it does not exceed any maximum latency thresholds.

Configure the Quality of Service (QoS) templates for StorSimple to throttle the network throughput used by the appliance at different times of the day. QoS templates can be used very effectively in conjunction with backup schedules to leverage additional network bandwidth for cloud operations during off-peak hours.

Ensure connectivity to the Internet is available at all times. This recommendation also applies to use cases where a very small amount of data exists on StorSimple and the data has not been tiered to the cloud storage.

Although StorSimple can easily buffer temporary glitches in network connectivity, prolonged outages result in the storage becoming unavailable, and iSCSI error messages are returned to the application. StorSimple has a built-in alert mechanism that can be configured. It sends an alert message and displays the alert on the Web UI.

If the network infrastructure supports jumbo frames, they can be configured on the data ports that transmit the iSCSI traffic between the host servers and StorSimple. Jumbo frames should be enabled end-to-end, on every network component in the path between StorSimple and the host.

The ports on StorSimple are configured to automatically accept jumbo frames if the network and the iSCSI initiators are set up to support them. Jumbo frames should not be configured on a network port that is connected to the Internet or a network adapter that is cloud-enabled on StorSimple.

StorSimple Virtual Appliance provisioning and sizing

Provisioning a Virtual Appliance may take several hours. Therefore, it is best to provision the Virtual Appliance and turn it off until it is needed.

Migrating data

Set up cloud snapshots that are taken frequently during data migration. When a cloud snapshot is taken, all data in the volume is copied to the cloud. This improves the overall performance of StorSimple when normal operations resume.

Before migrating the data, use a file classification tool like Microsoft File Server Resource Manager to gain insight on the amount, type, and location of the data on the original storage system—specifically the data that is not being accessed and should be tiered to the cloud (this data is generally referred to as cold data). Contact StorSimple Support for additional assistance with developing a data migration strategy.

Migrate the infrequently accessed (cold) data first, if possible, and then the frequently accessed (hot) data. Migration should be done during periods where there is minimal activity on the systems.

The time it takes to migrate data from the older storage system to StorSimple is dependent on the amount of data and any shared resources. In general terms, during a period of migration, a user can expect to see Write throughput between 20 and 100 MB/s depending on how full the solution is.

For a solution that has data on the solid-state drive (SSD) tier only, expected performance would be around 100 MB/s (up to 200 MB/s for deployments with MPIO).

For a solution that has SSD and hard disk drive (HDD) tiers that are relatively full and pushing data to the cloud, expected performance would be up to 20 MB/s, depending on Internet bandwidth available. Deployments configured with MPIO can expect to see higher Write throughput.
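To turn the throughput figures above into a planning estimate, the following Python sketch computes how long a migration might take at the quoted 20-100 MB/s Write throughput. It assumes sustained throughput for the whole migration and decimal units (1 TB = 1,000,000 MB); the function names and the 10 TB sample size are illustrative assumptions.

```python
def migration_hours(data_tb, throughput_mb_per_s):
    """Estimate hours needed to write data_tb at a sustained throughput."""
    data_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return data_mb / throughput_mb_per_s / 3600


def migration_window_hours(data_tb, slow_mb_per_s=20, fast_mb_per_s=100):
    """Return (best case, worst case) hours for the quoted throughput range."""
    return (migration_hours(data_tb, fast_mb_per_s),
            migration_hours(data_tb, slow_mb_per_s))


if __name__ == "__main__":
    best, worst = migration_window_hours(10)  # hypothetical 10 TB data set
    print(f"10 TB: between {best:.1f} and {worst:.1f} hours")
```

For the assumed 10 TB data set this gives roughly 28 hours in the best case and roughly 139 hours in the worst case, which is why migrating cold data first and scheduling migration during quiet periods matters.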

Data security and encryption

The backup XML configuration file should be stored outside of Azure. This can be on a passcode-protected server or USB drive.

Use different storage accounts for different departments, projects, roles, and so on. There can be 64 storage accounts per StorSimple solution.

Monitoring, reporting, and system health

For ease of debugging, have a serial console connection from both controllers to a server that can be reached remotely.

Regularly check the health of the system to ensure that all hardware is functioning as expected. The health of the system components can be viewed from the StorSimple Manager.

Enable email alerts to be sent when there is a change in the system. This allows prompt notification of any changes that may impact or degrade the system.

StorSimple is architected with high availability in mind and to withstand a failure of a single hardware component (or in some cases, more than one component). We recommend that you correct any failures as soon as possible. Email alerts will expedite this process.

Monitor the system performance and ensure that there is consistent Read performance. Tiering data to the cloud is best for data that is not frequently accessed. However, if the appliance gets to a point where the working set of data exceeds the available on-premises storage capacity, performance will be affected. Contact StorSimple technical support if Read performance becomes inconsistent or reaches an unacceptably low threshold.

Data protection

Implement a data protection plan with three tiers of backup in mind:

  • Short-term retention that provides high performance for backup and restore.
  • Medium-term retention that provides medium performance for backup and restore.
  • Long-term retention that provides slow performance for backup and restore.

Carefully consider the required RPO and RTO when selecting a combination of short-, medium-, and long-term schedule policies to back up the data.

The backup retention period should always be set to the desired value to ensure that old backup copies are automatically deleted and the maximum number of backups per volume (256) is not exhausted.

If VSS-based application-level consistency is required for the application backup, the backup should be scheduled by using the StorSimple Snapshot Manager. If only crash consistency is required for the backup, the backup can be scheduled by using the StorSimple Snapshot Manager or the StorSimple Manager.

When using the Azure StorSimple Snapshot Manager, make sure that the host server on which the StorSimple Snapshot Manager is running has sufficient processing power to initiate the backup and restore operation.
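When combining short-, medium-, and long-term schedules as described above, it can help to check that the retained copies stay under the limit of 256 backups per volume. The Python sketch below assumes that each schedule keeps a steady-state count of retention period divided by backup interval and that all schedules count toward the same per-volume limit; both are assumptions for illustration, as are the function names and the sample schedule.

```python
MAX_BACKUPS_PER_VOLUME = 256  # limit quoted in the guidance above


def retained_copies(interval_hours, retention_days):
    """Steady-state number of backups kept by one schedule."""
    return (retention_days * 24) // interval_hours


def total_retained(policies):
    """Sum retained copies over (interval_hours, retention_days) pairs."""
    return sum(retained_copies(i, d) for i, d in policies)


# Hypothetical three-tier plan for one volume.
policies = [
    (6, 7),      # short term: every 6 hours, kept 7 days
    (24, 30),    # medium term: daily, kept 30 days
    (168, 364),  # long term: weekly, kept 364 days
]
total = total_retained(policies)
print(total, total <= MAX_BACKUPS_PER_VOLUME)  # 110 True
```

In this assumed plan the three schedules retain 28, 30, and 52 copies, leaving headroom for manual backups.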

Microsoft Azure Compute: IaaS

Microsoft Azure provides a comprehensive platform of Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) capabilities and services that can support a wide range of customer applications and services. Cloud infrastructures can comprise on-premises customer-hosted, partner-hosted, or public-hosted cloud computing infrastructures that provide a range of capabilities for organizations to consume natively or across these models in a hybrid capacity.

Focusing on the public cloud, the primary difference between IaaS and PaaS constructs is the division of responsibility for common operational functions between the provider and the consumer. In Microsoft Azure, the Microsoft Corporation acts as the cloud provider, and the organization acts as one of many cloud consumers. This relationship is outlined at a high level in the following diagram.

The focus of this section is on the Microsoft IaaS capabilities, which in large part consist of storage, networking, backup and recovery, large scale computing, and traditional virtual machine deployments. The primary building block for IaaS solutions deployed on Microsoft Azure is Virtual Machines and this section explains how Virtual Machines can be used to build solutions for your environment.

Virtual Machines

At a high level, Azure Virtual Machines provides a traditional virtualized server infrastructure to deploy a given application or service. Typically, it includes a compute instance consisting of virtual CPUs (cores), virtual memory, a persistent operating system disk, optional persistent data disks, and internal/external networking to allow the system to interact with other aspects of a customer's environment, application, or solution.

Azure Virtual Machines has several considerations as part of its design, including size, storage, placement, source images, and additional functionality, which can be provided through Microsoft and third-party add-ins. This section explains each of these areas to provide an overview, guidance, and potential design decisions that are required when implementing Azure Virtual Machines.

Feature References

Azure Virtual Machines documentation

http://azure.microsoft.com/en-us/documentation/services/virtual-machines/

Virtual Machine Sizes and Tiers

When deploying applications and solutions using Microsoft Azure Virtual Machines, there are various sizing configurations that are available to organizations. Virtual Machines are available in different sizing series (A, D, DS, and G series as examples). Within each sizing series there are incremental sizes (A0, A1, and so on) and different tiers (Standard and Basic).

The sizing and tiering options provide customers with a consistent set of compute sizing options, which expand as time goes on. From a sizing perspective, each sizing series represents various properties, such as:

  • Number of CPUs
  • Memory allocated to each virtual machine
  • Temporary local storage
  • Allocated bandwidth for the virtual machines
  • Maximum data disks

As outlined earlier, some virtual machine series include the concept of Basic and Standard tiers. A Basic tier virtual machine is available only on A0-A4 instances, and a Standard tier virtual machine is available on all size instances. Virtual machines in the Basic tier are provided at a reduced cost and carry slightly less functionality than those offered at the Standard tier. The differences include the following areas:

Capability Consideration

Capability Decision Points

CPU

Standard tier virtual machines are expected to have slightly better CPU performance than Basic tier virtual machines.

Disk

Data disk IOPS for Basic tier virtual machines is 300 IOPS, which is slightly lower than Standard tier virtual machines (which have 500 IOPS data disks).

Features

Basic tier virtual machines do not support features such as load balancing or auto-scaling.

The following table summarizes the key decision points when using Basic tier virtual machines:

| Size | Available CPU Cores | Available Memory | Available Disk Sizes | Maximum Data Disks | Maximum IOPS |
|---|---|---|---|---|---|
| Basic_A0 - Basic_A4 | 1 - 8 | 768 MB - 14 GB | Operating system = 1023 GB; Temporary = 20 - 240 GB | 1 - 16 | 300 IOPS per disk |

In comparison, Standard tier virtual machines are available for all compute sizes.

Capability Consideration

Capability Decision Points

CPU

Standard tier virtual machines have better CPU performance than Basic tier virtual machines.

Disk

Data disk IOPS for Standard tier virtual machines is 500. (This is higher than Basic tier virtual machines, which have 300 IOPS data disks.) If the DS series is selected, IOPS start at 3200.

Availability

Standard tier virtual machines are available on all size instances.

A-Series features

  • Standard tier virtual machines include load balancing and auto-scaling.
  • For A8, A9, A10, and A11 instances, hardware is designed and optimized for compute and network intensive applications including high-performance computing (HPC) cluster applications, modeling, and simulations.
  • A8 and A9 instances have the ability to communicate over a low-latency, high-throughput network in Azure, which is based on remote direct memory access (RDMA) technology. This boosts performance for parallel Message Passing Interface (MPI) applications. (RDMA access is currently supported only for cloud services and Windows Server-based virtual machines.)
  • A10 and A11 instances are designed for HPC applications that do not require constant and low-latency communication between nodes (also known as parametric or embarrassingly parallel applications). The A10 and A11 instances have the same performance optimizations and specifications as the A8 and A9 instances. However, they do not include access to the RDMA network in Azure.

D-Series features

  • Standard tier virtual machines include load balancing and auto-scaling.
  • D-series virtual machines are designed to run applications that demand higher compute power and temporary disk performance. D-series virtual machines provide faster processors, a higher memory-to-core ratio, and a solid-state drive (SSD) for the temporary disk.

DS-Series features

  • Standard tier virtual machines include load balancing and auto-scaling.
  • DS-series virtual machines can use premium storage, which provides high-performance and low-latency storage for I/O intensive workloads. It uses solid-state drives (SSDs) to host a virtual machine's disks and offers a local SSD disk cache. Currently, premium storage is only available in certain regions.
  • The maximum input/output operations per second (IOPS) and throughput (bandwidth) possible with a DS series virtual machine is affected by the size of the disk.

G-Series features

  • Standard tier virtual machines include load balancing and auto-scaling.
  • G-series virtual machines leverage local SSD disks to provide the highest performance virtual machine series that is available in Azure.

The following table summarizes the capabilities of each virtual machine series:

| Size | Available CPU Cores | Available Memory | Available Disk Sizes | Maximum Data Disks | Maximum IOPS |
|---|---|---|---|---|---|
| Basic_A0 - Basic_A4 | 1 - 8 | 768 MB - 14 GB | Operating system = 1023 GB; Temporary = 20 - 240 GB | 1 - 16 | 300 IOPS per disk |
| Standard_A0 - Standard_A11 (Includes compute intensive A8-11) | 1 - 16 | 768 MB - 112 GB | Operating system = 1023 GB; Temporary = 20 - 382 GB | 1 - 16 | 500 IOPS per disk |
| Standard_D1-D4, Standard_D11-D14 (High memory) | 1 - 16 | 3.5 GB - 112 GB | Operating system = 1023 GB; Temporary (SSD) = 50 - 800 GB | 2 - 32 | 500 IOPS per disk |
| Standard_DS1-DS4, Standard_DS11-DS14 (Premium storage) | 1 - 16 | 3.5 GB - 112 GB | Operating system = 1023 GB; Local SSD disk = 7 GB - 112 GB | 2 - 32 | 43 - 576 GB cache size; 3200 - 50000 IOPS total |
| Standard_G1 - G5 (High performance) | 2 - 32 | 28 GB - 448 GB | Operating system = 1023 GB; Local SSD disk = 384 - 6,144 GB | 4 - 64 | 500 IOPS per disk |

These sizes and capabilities are for the current Preview of Azure Virtual Machines, and they might expand over time. For a complete list of size tables to help you configure your virtual machines, see Sizes for Virtual Machines.
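One way to use the per-disk IOPS and maximum data disk figures from the summary table is to estimate how many striped data disks a workload needs and what the theoretical ceiling is for each series. The Python sketch below is a rough upper bound only: it assumes IOPS scale linearly with the number of striped disks, it uses the values for the largest size in each series, and it omits the DS series because its IOPS depend on the premium storage disks used. The dictionary and function names are illustrative assumptions.

```python
# Per-disk IOPS and maximum data disks for the largest size in each series,
# taken from the summary table above.
SERIES = {
    "Basic_A": {"iops_per_disk": 300, "max_data_disks": 16},
    "Standard_A": {"iops_per_disk": 500, "max_data_disks": 16},
    "Standard_D": {"iops_per_disk": 500, "max_data_disks": 32},
    "Standard_G": {"iops_per_disk": 500, "max_data_disks": 64},
}


def max_striped_iops(series):
    """Theoretical IOPS ceiling when every data disk is striped together."""
    spec = SERIES[series]
    return spec["iops_per_disk"] * spec["max_data_disks"]


def disks_needed(series, target_iops):
    """Smallest number of striped data disks that reaches target_iops."""
    per_disk = SERIES[series]["iops_per_disk"]
    return -(-target_iops // per_disk)  # ceiling division on integers


if __name__ == "__main__":
    for name in SERIES:
        print(name, max_striped_iops(name))
    print(disks_needed("Standard_A", 2000))  # 4
    print(disks_needed("Basic_A", 2000))     # 7
```

Real throughput also depends on the storage account limits discussed in the Virtual Machine Storage section, so treat these numbers as ceilings rather than expectations.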

Design Guidance

When you design solutions for using virtual machines, consider the following:

Capability Considerations

Capability Decision Points

Deployment order

If you intend to deploy an application that may require compute-intensive resources, we recommend that you first provision a virtual machine of the largest size (such as Standard_G5) in the cloud service and then scale it down to a more appropriate size. The reason is that the virtual machines will then be placed on the clusters that have the faster processors. It also makes scaling easier, and it is more efficient to combine resources.

Supportability

The following are not supported in a virtual machine on Microsoft Azure:

  • Multiple IP addresses
  • 32-bit applications

Virtual Machine Storage

With respect to IaaS solutions, images and disks that are used by Azure virtual machines are stored within virtual hard disks (VHDs). Azure virtual machines are compute instances that have VHDs attached. The VHDs provide persistent and temporary storage to the underlying operating system within the virtual machine.

Like other components of Azure, virtual machines require a storage account to store virtual machine data, which is in the form of VHDs. The VHD specification has several formats, including fixed, dynamic, and differencing. However, Azure supports only the fixed VHD format. VHDs are stored as page blobs in the target storage account, and they can be accessed through automation, the Azure API, or by the virtual machines themselves.

Feature References

About VHDs in Azure

https://msdn.microsoft.com/en-us/library/azure/dn790344.aspx

Exploring Azure Drives, Disks, and Images

http://blogs.msdn.com/b/windowsazurestorage/archive/2012/06/28/exploring-windows-azure-drives-disks-and-images.aspx

How to Attach a Data Disk to a Windows Virtual Machine

http://azure.microsoft.com/en-us/documentation/articles/storage-windows-attach-disk/

Create and upload a Windows Server VHD to Azure

http://azure.microsoft.com/en-us/documentation/articles/virtual-machines-create-upload-vhd-windows-server/

Sizes for Cloud Services

https://azure.microsoft.com/documentation/articles/cloud-services-sizes-specs/

Sizes for Virtual Machines

https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/

Azure Subscription and Service Limits, Quotas, and Constraints

http://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/

Basic Tier Virtual Machines

http://azure.microsoft.com/blog/2014/04/28/basic-tier-virtual-machines-2/

Azure Storage Scalability and Performance Targets

http://azure.microsoft.com/en-us/documentation/articles/storage-scalability-targets/

Performance Best Practices for SQL Server in Azure Virtual Machines – I/O Performance Considerations

https://msdn.microsoft.com/en-us/library/azure/dn133149.aspx#io

Design Guidance

The following general considerations are provided when planning storage accounts for virtual machines:

Capability Considerations

Capability Decision Points

Storage considerations for virtual machines

  • Keep the storage account and virtual machines in the same region.
  • Disable Azure geo-replication on the storage account when the workload provides availability or if multiple disks are spanned together.
  • Avoid using the operating system or temporary disks for database storage or logging.
  • Set the Azure data disk caching policy to None when caching is not supported by the workload or solution running in the virtual machine.
  • Stripe multiple Azure data disks to get increased IO throughput.
  • Format with documented allocation sizes.
  • Separate data and log file I/O paths to obtain dedicated IOPS for data and logs.
  • Move all data-related files to data disks, including system databases.
  • Back up directly to Blob storage when possible.

Scalability and storage throttling

Microsoft provides general guidance about the number of virtual hard disks that should reside in a storage account. The following guidance is provided with the assumption that at any point all virtual machines could consume the available IOPS for all assigned disks, which could result in storage account throttling and have an adverse impact on the workload running within the virtual machine.

The following guidance is provided:

  • For virtual machines in the Basic tier, do not place more than 66 highly used VHDs in a storage account to avoid the 20,000 total request rate limit (20,000/300).
  • For virtual machines in the Standard tier, do not place more than 40 highly used VHDs in a storage account (20,000/500).

For more information, see Sizes for Cloud Services and Sizes for Virtual Machines.
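The arithmetic behind the 66 and 40 VHD guidance above can be sketched as follows. This minimal Python example divides the 20,000 request rate limit by the per-disk IOPS of each tier and then estimates how many storage accounts a given number of heavily used VHDs would need; the function names and the sample count of 100 VHDs are illustrative assumptions.

```python
STORAGE_ACCOUNT_IOPS_LIMIT = 20_000  # total request rate limit per storage account
DISK_IOPS = {"Basic": 300, "Standard": 500}  # per-disk IOPS by tier


def max_busy_vhds(tier):
    """Most highly used VHDs one storage account can hold without throttling."""
    return STORAGE_ACCOUNT_IOPS_LIMIT // DISK_IOPS[tier]


def storage_accounts_needed(tier, vhd_count):
    """Storage accounts required to host vhd_count highly used VHDs."""
    per_account = max_busy_vhds(tier)
    return -(-vhd_count // per_account)  # ceiling division on integers


if __name__ == "__main__":
    print(max_busy_vhds("Basic"))                    # 66
    print(max_busy_vhds("Standard"))                 # 40
    print(storage_accounts_needed("Standard", 100))  # 3
```

The calculation deliberately assumes the worst case in which every disk consumes its full IOPS at the same time, matching the assumption stated in the guidance.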

Virtual Machine Placement

The placement of virtual machines within a given Azure subscription is critical in multiple ways. It is important to consider the consumers of the services that are associated with each virtual machine. It is equally important to understand the relationships between each virtual machine and the Azure resources that it consumes. Azure provides several constructs that support effectively placing virtual machines and associated resources within the Azure infrastructure.

Affinity groups tell the Fabric Controller to logically group dependent items together, such as the compute and storage of a given virtual machine. When the Fabric Controller is searching for the best suited container, it chooses where it can deploy these two elements in the same cluster, thereby reducing latency and increasing performance.

Affinity groups provide the following:

  • Aggregation – Affinity groups aggregate items, such as compute and storage services, and provide the Azure Fabric Controller the information needed for them to be kept in the same Azure datacenter and cluster.
  • Reduced latency – Affinity groups provide information to the Fabric Controller about resources that should be kept together, and the result is reduced latency between components.
  • Lower costs – When servers are assigned to affinity groups, these services are placed in the same cluster, reducing intercommunication between resources and potentially reducing costs associated with interconnectivity.

A Resource Group is a unit of management for operations such as deployments, updates, and standard lifecycle operations across a number of different Azure services, including virtual machines. A resource group provides:

  • A single grouping of resources (for example, metering, billing, or quota management)
  • Lifecycle management (deployment, update, delete, status)
  • The ability to assign administrative control (RBAC permissions)

Resource Manager enables the creation of reusable deployment templates that declaratively describe the resources that make up your application (for example, a website and a SQL Database). In essence, it provides an environment to handle the infrastructure and configuration information as code.

Resource Manager provides the following:

  • An application lifecycle container - Deploy and manage your application as you see fit.
  • A declarative solution for deployment and configuration - Deploy multiple instantiations of your application with a single click.
  • A consistent management layer - Get the same experience of deployment and management whether you work from the portal, command line, or tools in Azure.

Feature References

Importance of Azure Affinity Groups

http://social.technet.microsoft.com/wiki/contents/articles/7916.importance-of-windows-azure-affinity-groups.aspx

About Regional Virtual network and Affinity Groups

https://azure.microsoft.com/documentation/articles/virtual-networks-migrate-to-regional-vnet/

Role-based access control in the Microsoft Azure portal

http://azure.microsoft.com/en-us/documentation/articles/role-based-access-control-configure/

Using Azure PowerShell with Azure Resource Manager

http://azure.microsoft.com/en-us/documentation/articles/powershell-azure-resource-manager/

Azure Resource Groups

http://azure.microsoft.com/en-us/documentation/articles/azure-preview-portal-using-resource-groups/

Virtual Machine Availability

It is important to tell the Azure Fabric Controller which resources must be aligned and placed near one another for performance, management, and so on. However, it is critical that Azure be informed of systems that must be placed across mutually exclusive boundaries to ensure that the availability of a given service that spans multiple virtual machines is maintained and not interrupted by planned maintenance activities within Azure.

Azure natively understands the tiers in a PaaS application, and thus it can properly distribute them across fault and update domains. In contrast, the tiers in an IaaS application must be manually defined using availability sets. To meet a given SLA, availability sets are required when building IaaS solutions using Azure virtual machines.

Placing virtual machines in an availability set tells the Fabric Controller in Azure to place the virtual machines in separate fault domains. This ultimately provides redundancy for the services provided by the virtual machines when both systems are responsible for the same tier of service in an application. This is illustrated in the following diagram:

As illustrated, availability sets ensure that all instances of each tier have hardware redundancy by distributing them across fault domains, and that the instances are not all taken down at the same time during an update.
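The effect of spreading a tier across fault and update domains can be illustrated with a small model. The Python sketch below is not how the Fabric Controller is implemented: the round-robin assignment, the counts of 2 fault domains and 5 update domains, and all function and virtual machine names are assumptions chosen only to show why at least two virtual machines per tier are needed in an availability set.

```python
def assign_domains(vm_names, fault_domains=2, update_domains=5):
    """Assign each VM a (fault domain, update domain) pair in round-robin order."""
    return {
        name: (index % fault_domains, index % update_domains)
        for index, name in enumerate(vm_names)
    }


def available_during_update(assignment, update_domain):
    """VMs that keep running while one update domain is being serviced."""
    return sorted(name for name, (_, ud) in assignment.items() if ud != update_domain)


def available_after_fault(assignment, fault_domain):
    """VMs that keep running when one fault domain fails."""
    return sorted(name for name, (fd, _) in assignment.items() if fd != fault_domain)


if __name__ == "__main__":
    web_tier = assign_domains(["web0", "web1", "web2"])  # hypothetical VM names
    print(web_tier)
    print(available_during_update(web_tier, 0))  # ['web1', 'web2']
    print(available_after_fault(web_tier, 0))    # ['web1']
```

In this model a single virtual machine would leave the tier with no running instance during either event, while two or more always leave at least one instance available.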

Virtual Machine Load Balancing

If the virtual machines should have traffic distributed across them, you must group the virtual machines in a cloud service, and load balance them across a specific TCP or UDP endpoint. For more information, see Load Balancing Virtual Machines.

If the virtual machines receive input from another source (such as a queuing mechanism), a load balancer is not required. The load balancer uses a basic health check to determine if traffic should be sent to the node. It is also possible to create your own probes to implement application-specific health metrics that determine if the virtual machine should receive traffic. Load balancing mechanisms and constructs within Microsoft Azure are covered in the Networking section of this document.

Virtual Machine Gallery Items and Images

The Azure Gallery contains a library of images that are provided by Microsoft and Microsoft partners, and they can be used to create IaaS virtual machines. Custom images that you upload to your Azure subscription are also available in the Azure Gallery. This section outlines each of the image types and how they can be utilized.

Image Families

Image families are virtual hard disks (VHDs) that are managed and supported by Azure. In some instances, these VHDs may include preinstalled software and configuration. Examples include SQL Server, SharePoint, and BizTalk Server.

The goal of the image families is to make it easier for you to deploy an application into the Azure environment. These image families are updated once a month, with versions kept available going back as far as two months.

Partner Images

Partner images are VHDs that were uploaded by partners so that their applications can be consumed by Azure customers. Virtual machines that are deployed by using partner images are not deployed on the same clusters as other virtual machine workloads. Examples of partners that provide images include Oracle and Puppet Labs.

Latest Images

Azure actively maintains the images that are part of the Azure Gallery. In some instances, you may want to get the latest image. By default, the latest image is chosen when you deploy a virtual machine by using the portal. However, if you would like to use Azure PowerShell, please use the following code snippet:

# $ImageFamily holds the name of the image family to match.
$image_name = (Get-AzureVMImage |
    Where-Object { $_.ImageFamily -eq $ImageFamily } |
    Sort-Object PublishedDate -Descending |
    Select-Object -First 1).ImageName

Customized Images

For various reasons, many customers would like to upload their own images as opposed to using the images provided by Azure. Reasons range from internal security standards to cost, especially in the scenario of licensing.

A good example is SQL Server. You may want to use an existing SQL Server license in the cloud as opposed to paying the additional cost for the SQL Server license that is included with the image provided by Azure. In this scenario, you may want to upload your image and have it available as a gallery item for tenants in your company to consume. To upload a customized image, see Create and upload a Windows Server VHD to Azure.

Feature References

Virtual Machine Images in Azure

https://msdn.microsoft.com/en-us/library/azure/dn790290.aspx

Preparing SQL Image

https://msdn.microsoft.com/en-us/library/ee210664.aspx

Virtual Machine Extensions

Azure Virtual Machines extensions are built by Microsoft and trusted third-party providers to enable security, runtime, debugging, management, and other features that you can take advantage of to increase your productivity. This section describes the various features that Azure Virtual Machines extensions provide for Windows and Linux virtual machines, and it points to documentation for each operating system.

Virtual Machines extensions implement most of the critical functionality that you want to use with your virtual machines, including basic functionality such as resetting passwords and configuring Remote Desktop Protocol (RDP). Because new extensions are added all the time, the number of possible features that your virtual machines support in Azure continues to increase.

By default, several basic Virtual Machines extensions are installed when you create your virtual machine from the image gallery, including IaaSDiagnostics and BGInfo (currently available for Windows virtual machines only), and VMAccess. However, not all extensions are implemented on both Windows and Linux operating systems at any specific time. This is due to the constant flow of feature updates and new extensions.

Virtual Machines extensions provide dynamic features from Microsoft and third parties. The agent and extensions are added primarily through the Management portal, but you can also use the following options to add and configure extensions when you create a virtual machine or for existing virtual machines:

Extensions include support for Remote Debugging in Visual Studio, System Center 2012, Microsoft Azure Diagnostics, and Docker (to name a few).

Recommended: Evaluate each Virtual Machines extension and define which of them will be used as a standard for all of your Azure Virtual Machines. For example, you may standardize one antivirus extension to use in addition to the Azure PowerShell DSC extension.

Connectivity and Basic Management

The Azure Virtual Machine Agent (VM Agent) is used to install, configure, manage, and run Azure Virtual Machines extensions. The following extensions are critical for enabling, re-enabling, or disabling basic connectivity with your virtual machines after they are created and running.

VM Extension Name | Feature Description | More Information
----------------- | ------------------- | ----------------
VMAccessAgent (Windows), VMAccessForLinux (Linux) | Create, update, and reset user information and RDP and SSH connection configurations. | Windows, Linux

Deployment and Configuration Management

The Azure CustomScript extension automatically runs a specified script, or a set of scripts, on a virtual machine after it is created and running.

Name

Custom Script Extension

Description

Execution of Windows PowerShell on the target resource

Applicability

Provisioning in Azure:

  • Post virtual machine installations
  • Multiple points of integration (for example, Azure, virtual machines, or Active Directory)

Pros

  • Developed in Windows PowerShell, code can be easily edited and ported into Azure Automation
  • Reproducible artifact that can be used multiple times
  • Can be executed remotely
  • Extensibility can be integrated with multiple systems (such as Azure, virtual machines, Active Directory, or System Center)
  • Can be integrated with the Azure virtual machine provisioning process

Cons

  • Involves a lot of Windows PowerShell code development
  • Though it is extensible, may not be the best tool for all scenarios
  • Only for IaaS-based and Web Roles scenarios

Supported operating systems

Windows, Linux

The extensions detailed in the following table support different kinds of deployment and configuration management scenarios and features.

Name

Chef

Description

With Chef, you can automate how you build, deploy, and manage your infrastructure. Your infrastructure becomes as versionable, testable, and repeatable as application code.

For more information, see Get Chef.

Applicability

Provisioning in Azure:

Infrastructure as a Service (IaaS)

Pros

  • Extensible and can be implemented many ways.
  • Leveraged mainly for post virtual machine provisioning tasks, updates, and software deployment.
  • Integrated Windows PowerShell DSC

Cons

  • Involves knowledge about the product
  • Though it is extensible, may not be the best tool for all scenarios
  • Recipes may pose potential limitations
  • Only for IaaS-based scenarios
  • Potential license cost

Supported operating systems

Windows, Linux

Name

Puppet Enterprise

Description

With Puppet Enterprise, you can easily configure and manage your Windows environments. Whether you are managing a large datacenter, are taking advantage of Microsoft Azure, or a combination of both, Puppet Enterprise lets you manage your Microsoft Windows machines faster than ever.

For more information, see Puppet Labs.

Applicability

Provisioning in Azure:

Infrastructure as a Service (IaaS)

Pros

  • Extensible and can be implemented many ways
  • Leveraged mainly for post virtual machine provisioning tasks, updates, and software deployment
  • .NET integration
  • Native integration with Open Source repositories

Cons

  • Involves knowledge of the product
  • Though it is extensible, may not be the best tool for all scenarios
  • Only for IaaS-based scenarios
  • Potential license cost

Supported operating systems

Windows, Linux

Name

Windows PowerShell DSC

Description

Desired State Configuration (DSC) is a management platform in Windows PowerShell that enables deploying and managing configuration data for software services and managing the environment in which these services run.

For more information, see Windows PowerShell Desired State Configuration Overview.

Applicability

Provisioning in Azure:

Infrastructure as a Service (IaaS)

Pros

  • Extensible and can be implemented many ways
  • Leveraged mainly for post virtual machine provisioning tasks and desired state configurations
  • Integrated with release management so it can perform application deployments

Cons

  • Involves knowledge of Windows PowerShell DSC
  • Though it is extensible, may not be the best tool for all scenarios
  • Only for IaaS-based scenarios
  • Can be challenging to implement based on requirements

Supported operating systems

Windows, Linux

For more information, see Installing and configuring DSC for Linux

Name

System Center 2012 R2 Virtual Machine Role Authoring Guide

Description

Implements features for support by System Center.

For more information, see System Center 2012 R2 Virtual Machine Role Authoring Guide - Resource Extension Package

Applicability

Provisioning in Azure:

Infrastructure as a Service (IaaS)

Pros

  • Extensible and can be implemented many ways
  • Leveraged mainly for post virtual machine provisioning tasks, updates, and software deployment

Cons

  • Involves knowledge of Virtual Machine Manager (VMM)
  • Though it is extensible, may not be the best tool for all scenarios
  • Only for IaaS-based scenarios

Supported operating systems

Windows

Security and Protection

The extensions in this section provide critical security features for your Azure Virtual Machines.

Virtual Machines Extension Name | Feature Description | More Information
------------------------------- | ------------------- | ----------------
CloudLinkSecureVMWindowsAgent | Provides Azure customers with the capability to encrypt their virtual machine data on a multitenant, shared infrastructure, and fully control the encryption keys for their encrypted data in Azure Storage | Securing Microsoft Azure Virtual Machines leveraging BitLocker and Native operating system encryption
McAfeeEndpointSecurity | Protects your virtual machine against malicious software | McAfee
TrendMicroDSA | Enables Trend Micro Deep Security platform support to provide intrusion detection and prevention, firewall, antimalware, web reputation, log inspection, and integrity monitoring | How to install and configure Trend Micro Deep Security as a Service on an Azure VM
PortalProtectExtension | Guards against threats to your Microsoft SharePoint environment | Securing Your SharePoint Deployment on Azure
IaaSAntimalware | Microsoft Antimalware for Azure Cloud Services and Virtual Machines is a real-time protection capability that helps identify and remove viruses, spyware, and other malicious software, with configurable alerts when known malicious or unwanted software attempts to install itself or run on your system. | Download antimalware documentation
SymantecEndpointProtection | Symantec Endpoint Protection 12.1.4 enables security and performance across physical and virtual systems | How to install and configure Symantec Endpoint Protection on an Azure VM

Virtual Machine Operations and Management

Virtual Machines Extension Name

Feature Description

More Information

IaaSDiagnostics

Enables, disables, and configures Azure Diagnostics, and is also used by the AzureCATExtensionHandler to support SAP monitoring

Microsoft Azure Virtual Machine Monitoring with Azure Diagnostics Extension

OSPatchingForLinux

  • Enables Azure Virtual Machines administrators to automate operating system updates with customized configurations.

You can use the OSPatching extension to configure operating system updates for your virtual machines, including:

  • Specify how often and when to install operating system patches
  • Specify what patches to install
  • Configure the reboot behavior after updates

Operating System Patching Extension Blog Post

See also the Readme and source on GitHub at Operating System Patching Extension.

Cloud Services

An Azure cloud service is a compute capability within Microsoft Azure that is available to IaaS and specific PaaS workloads. From an IaaS perspective, Azure cloud services leverage virtual machines to provide a unit of access through public endpoints, load balancing, and scalability through auto-scale capabilities. This relationship is illustrated in the following conceptual diagram:

The following diagram shows a visual comparison between leveraging virtual machines and native PaaS capabilities within Azure Cloud Services:

Cloud Services Load Balancing

Load balancing can be managed both between deployed cloud services and within each cloud service. To load balance network traffic between deployed cloud services, Azure Traffic Manager can provide redundant and performant paths to the publicly routable virtual IP that is used by the systems within the cloud service.

Azure Traffic Manager provides control over the distribution of network traffic to public Internet endpoints. Traffic Manager works by applying an intelligent policy engine to Domain Name System (DNS) queries for the domain names of your Internet resources. Azure Traffic Manager uses three load-balancing methods to distribute traffic:

  • Failover: Use this method when you want to use a primary endpoint for all traffic, but provide backups in case the primary becomes unavailable.
  • Performance: Use this method when you have endpoints in different geographic locations, and you want requesting clients to use the "closest" endpoint in terms of the lowest latency.
  • Round robin: Use this method when you want to distribute load across a set of cloud services in the same datacenter or across cloud services or websites in different datacenters.

For more information, see Traffic Manager routing methods.
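
The three methods can be compared with a minimal Python sketch. This is a conceptual model only: Traffic Manager applies its policy to DNS queries, whereas the functions below simply return the name of the endpoint that would be handed back. The endpoint names, health flags, and latency figures are hypothetical.

```python
import itertools


def failover(endpoints):
    """Return the first healthy endpoint in priority (list) order."""
    for endpoint in endpoints:
        if endpoint["healthy"]:
            return endpoint["name"]
    return None


def performance(endpoints, latency_ms):
    """Return the healthy endpoint with the lowest latency for this client."""
    healthy = [e for e in endpoints if e["healthy"]]
    if not healthy:
        return None
    return min(healthy, key=lambda e: latency_ms[e["name"]])["name"]


def round_robin(endpoints):
    """Return an iterator that rotates through the endpoints healthy at call time."""
    healthy = [e["name"] for e in endpoints if e["healthy"]]
    return itertools.cycle(healthy)


endpoints = [
    {"name": "west-us", "healthy": False},
    {"name": "east-us", "healthy": True},
    {"name": "north-europe", "healthy": True},
]
latency_ms = {"west-us": 20, "east-us": 70, "north-europe": 35}

print(failover(endpoints))
print(performance(endpoints, latency_ms))
rotation = round_robin(endpoints)
print([next(rotation) for _ in range(3)])
```

In this example the primary endpoint is unhealthy, so failover falls through to the next entry, and the performance method ignores the unhealthy endpoint even though it has the lowest latency.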

The following image shows an example of the round robin load-balancing method for distributing traffic between different cloud services.

To load balance network traffic across systems deployed within cloud services, the Azure Load Balancer can be used. Virtual machines in the same cloud service or virtual network can communicate with each other directly by using their private IP addresses. Computers and services outside the cloud service or virtual network can only communicate with virtual machines in a cloud service or virtual network with a configured endpoint.

An endpoint is a mapping of a public IP address and port to the private IP address and port of a virtual machine or web role within an Azure cloud service. The Azure Load Balancer randomly distributes a specific type of incoming traffic across multiple virtual machines or services in a configuration known as a load-balanced set.

The following image shows a load-balanced endpoint for standard (unencrypted) web traffic that is shared among three virtual machines for the public and private TCP port of 80. These three virtual machines are configured in a load-balanced set.

By default, a cloud service has a single public-facing virtual IP (VIP) address that is assigned from the Azure IPv4 public address space. Each endpoint uses the VIP for the address component and a unique port. You can add more public-facing VIPs to a cloud service load balancer so that endpoints can use different IP addresses with the same port.
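
The relationship between VIPs, ports, and load-balanced sets can be sketched as a lookup table. The following Python model is an illustration under stated assumptions: the class name, the example addresses (taken from documentation and private ranges), and the use of a random choice (which follows the description in this section) are not a statement of how the Azure Load Balancer is implemented.

```python
import random


class CloudServiceEndpoints:
    """Maps (VIP, public port) pairs to load-balanced sets of (private IP, port)."""

    def __init__(self):
        self._endpoints = {}

    def add_endpoint(self, vip, public_port, backends):
        """Register an endpoint; each VIP and port combination must be unique."""
        key = (vip, public_port)
        if key in self._endpoints:
            raise ValueError(f"{vip}:{public_port} is already in use")
        self._endpoints[key] = list(backends)

    def route(self, vip, public_port, rng=random):
        """Pick one member of the load-balanced set for an incoming connection."""
        return rng.choice(self._endpoints[(vip, public_port)])


service = CloudServiceEndpoints()
web_set = [("10.0.0.4", 80), ("10.0.0.5", 80), ("10.0.0.6", 80)]
service.add_endpoint("203.0.113.10", 80, web_set)

# A second VIP allows another endpoint on the same public port.
service.add_endpoint("203.0.113.11", 80, [("10.0.0.7", 80)])

print(service.route("203.0.113.10", 80))
```

With a single VIP, a second endpoint on port 80 would be rejected; adding another VIP is what makes the port reusable.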

Azure can also load balance within a cloud service or virtual network by using the internal load balancer. The internal load balancer can be used in the following ways:

  • To balance loads between servers in different tiers of a multitier application (for example, between web and database tiers).
  • To balance loads for line-of-business (LOB) applications hosted in Azure without requiring additional load-balancer hardware or software.
  • To include on-premises servers in the set of computers with traffic that is load balanced.

Internal load balancing is also facilitated by configuring an internal load-balanced set.

The following figure shows an example of an internal load-balanced endpoint for an LOB application that is shared among three virtual machines in a cross-premises virtual network.

Feature References

Cloud Services

https://azure.microsoft.com/en-us/documentation/services/cloud-services/

Multiple VIPs per Cloud Service

https://azure.microsoft.com/en-us/documentation/articles/load-balancer-multivip/

Azure Load Balancer

https://azure.microsoft.com/en-us/documentation/articles/load-balancer-internet-overview/

Internal Load Balancer

https://azure.microsoft.com/en-us/documentation/articles/load-balancer-internal-overview/

Azure RemoteApp

Azure RemoteApp is a service that runs on Microsoft's Azure fabric. It provides an environment for Windows applications to be remotely accessed over the Internet. This environment is scalable to accommodate the end-user demand.

Azure RemoteApp technology expands on the native Windows on-premises service to provide a secure remote connection to applications hosted in Azure. Azure RemoteApp enables remote LOB applications to appear like they are running on the end user's local computer.

RemoteApp uses Microsoft Remote Desktop Protocol (RDP) and RemoteFX. RDP is a WAN-optimized protocol that is designed to tolerate network latency and loss. RemoteFX provides a 3D virtual adapter for rendering images. Delivering applications over the Azure global network of datacenters provides a highly reliable, fast, and consistent user experience for content ranging from text to streaming multimedia.

Azure RemoteApp can be run from supported end-user devices, including:

  • Windows operating system
  • Windows RT
  • Mac operating system
  • iOS
  • Android operating system

End-users can use the client-side software from their preferred devices to access the Azure RemoteApp programs. Azure RemoteApp provides users with 50 GB of persistent storage. This storage is protected by the fault tolerant nature of Azure Storage accounts.

To test Azure RemoteApp, see: Azure RemoteApp

On the integrated Azure RemoteApp menu, select an application (for example, Excel). The Connecting to dialog will start and you may be prompted for credentials depending on the deployment type.

After the authentication process is complete, the RemoteApp will launch, and the user will have remote access to the application.

RemoteApp is available in two deployment types, which are referred to as collections.

  • A cloud collection is hosted in and stores all data for the programs within the Azure cloud. End-users access apps by signing in with their Microsoft account, synchronized corporate credentials, or credentials that are federated with Azure Active Directory.
    The RemoteApp cloud collection offers a standalone way to host applications within Azure. A cloud collection exists only in the Azure cloud and cannot access the local on-premises network. Cloud collections support creating and sharing custom applications through the use of a custom template image for the application that is being published.
  • A hybrid collection is hosted in and stores data in the Azure cloud, but it allows end-users to access resources that are stored on an on-premises network. Users can access apps by signing in with their synchronized corporate credentials or credentials that are federated with Azure Active Directory.
    The hybrid RemoteApp collection provides a custom set of applications to end-users and access to resources that are stored on an on-premises network. Unlike a custom image that is used with the cloud collection, the image you create for a hybrid collection runs apps in a domain-joined environment, granting full access to the local network and resources.
    When integrating Active Directory with Azure Active Directory by using DirSync or Azure AD Connect, corporate policies can be used within Azure to control the applications being offered, and end-users can use Active Directory credentials to access the RemoteApp applications and resources.

The key difference between the hybrid and cloud collections is how the installation of software updates (patching) is handled. Cloud collections use preinstalled images (from Office 365 or Office 2013), and Microsoft performs the patching.

For both types of collections created from a custom template image, the subscription owner is responsible for managing the image and the applications. Domain-joined images can be managed by Windows Update, Group Policy, Desired State Configuration, or System Center Configuration Manager. After the updates to the custom template image are applied, the image is uploaded to Azure and the collections (hybrid or cloud) are updated to consume the new image.

Feature References

Introducing Microsoft Azure RemoteApp

http://blogs.msdn.com/b/rds/archive/2014/05/12/windows-apps-in-the-cloud-introducing-microsoft-azure-remoteapp.aspx

How to create a custom template image for RemoteApp

http://azure.microsoft.com/en-us/documentation/articles/remoteapp-create-cloud-deployment/

How to create a hybrid collection of RemoteApp

http://azure.microsoft.com/en-us/documentation/articles/remoteapp-create-hybrid-deployment/

How does licensing work in RemoteApp?

http://azure.microsoft.com/en-us/documentation/articles/remoteapp-licensing/

Best practices for using Azure RemoteApp

http://azure.microsoft.com/en-us/documentation/articles/remoteapp-bestpractices/

Azure RemoteApp FAQ

http://azure.microsoft.com/en-us/documentation/articles/remoteapp-faq/

IaaS Considerations

There are several considerations when deploying IaaS solutions within Microsoft Azure. Deployment considerations include cost, load balancing, resiliency, security, networking, and disaster recovery. Although not exhaustive, this section explores many of these considerations at a high level.

Cost

Cost is one of the top considerations for most organizations consuming services from Microsoft Azure. Being able to develop a predictable consumption model is key for the success of any solution deployed in Azure. The following table itemizes cost factors that you should consider:

Considerations

Decision Points

The size and number of virtual machines

Windows Server licensing costs may be included. Compute hours don't include any Azure Storage costs that are associated with the Windows Server image running in virtual machines. These costs are billed separately.

Azure Storage requirements

Charges apply for Azure Storage costs that are required for virtual machines.

Azure Virtual Network

Charges apply for the creation of a virtual private network (VPN) connection between a virtual network and your VPN gateway. The charge is for each hour that the VPN connection is provisioned and available (referred to as a VPN connection hour). Plan for the connection to be provisioned 24 hours a day, seven days a week. All data transferred over the VPN connection is charged separately at the Azure standard data transfer rates.

Network traffic

Outbound data is charged based on the total amount of data moving out of the Azure datacenters through the Internet in a given billing cycle. This applies to any traffic, including traffic that traverses the VPN tunnel. In this document, outbound directory synchronization traffic is expected to represent the most significant portion of the network traffic, depending on the amount of directory changes.

Support

Azure offers flexible support options for organizations of all sizes. Enterprises that deploy business-critical applications in Azure should consider additional support options.
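
To make the network-related cost factors concrete, the following Python sketch adds up VPN connection hours and outbound data for one billing cycle. The rates used here are placeholders chosen only to make the arithmetic easy to follow; they are not Azure prices, and the function name is hypothetical. Compute, storage, and support charges are not modeled.

```python
HOURS_PER_DAY = 24


def estimate_monthly_network_cost(days, vpn_rate_per_hour, outbound_gb, rate_per_gb):
    """Estimate VPN connection-hour and outbound data charges for one billing cycle."""
    vpn_hours = days * HOURS_PER_DAY  # connection provisioned around the clock
    vpn_cost = round(vpn_hours * vpn_rate_per_hour, 2)
    data_cost = round(outbound_gb * rate_per_gb, 2)  # only outbound data is charged
    return {
        "vpn_hours": vpn_hours,
        "vpn_cost": vpn_cost,
        "data_cost": data_cost,
        "total": round(vpn_cost + data_cost, 2),
    }


# Placeholder rates: 0.05 per VPN connection hour and 0.10 per outbound GB.
estimate = estimate_monthly_network_cost(
    days=30, vpn_rate_per_hour=0.05, outbound_gb=200, rate_per_gb=0.10
)
print(estimate)
```

A 30-day cycle with an always-on connection amounts to 720 VPN connection hours, so the connection charge is fixed while the data charge varies with outbound traffic.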

Load Balancing

Customers deploying applications that require more than one server in Azure Virtual Machines must consider load balancing their virtual machines. On-premises load balancing is not currently supported with Azure Virtual Machines. Note also that Azure Virtual Machines currently supports only a round robin load-balancing configuration.

There are two levels of load balancing available for Azure infrastructure services:

  • DNS level: Load balancing for traffic to different cloud services located in different datacenters, to different Azure websites located in different datacenters, or to external endpoints. This is done with Traffic Manager and the round robin load balancing method.

  • Network level: Load balancing of incoming Internet traffic to different virtual machines of a cloud service, or load balancing of traffic between virtual machines in a cloud service or virtual network. This is done with the Azure Load Balancer.

Feature References

Load Balancing for Azure Infrastructure Services

http://www.windowsazure.com/en-us/manage/windows/common-tasks/how-to-load-balance-virtual-machines/

About Traffic Manager Load Balancing Methods

http://azure.microsoft.com/documentation/articles/traffic-manager-load-balancing-methods

Internal load balancing

http://azure.microsoft.com/documentation/articles/load-balancer-internal-overview

Encryption

A key consideration for workloads deployed in Azure virtual machines is encryption of data at rest. For virtual machines, most customers want platform encryption that they can control.

Currently, Microsoft BitLocker Drive Encryption is not supported because there is no way for Azure to handle the key management portion during virtual machine startup. Given that Azure consists of multiple physical servers, there is not a simple way to manage BitLocker encryption keys.

Third parties, such as CloudLink, have the capability to manage disk encryption keys on Windows and Linux platforms. You can use CloudLink to support encrypting virtual hard disks that are attached to virtual machines and that use published virtual machine extensions. Additional details about CloudLink are provided in the following references.

Feature References

Azure Virtual Machine Disk Encryption using CloudLink

http://azure.microsoft.com/blog/2014/08/19/azure-virtual-machine-disk-encryption-using-cloudlink/

Encrypting Azure Virtual Machines with CloudLink SecureVM

http://azure.microsoft.com/blog/2014/11/13/encrypting-azure-virtual-machines-with-cloudlink-securevm/

Networking

The following table itemizes what to consider when you are deciding how to provision virtual machines on a virtual network:

Considerations

Decision Points

Name resolution

When you deploy virtual machines and cloud services to a virtual network you can use Azure-provided name resolution or your own DNS solution, depending on your name resolution requirements.

Enhanced security and isolation

Because each virtual network is run as an overlay, only virtual machines and services that are part of the same network can access each other. Services outside the virtual network have no way to identify or connect to services hosted within virtual networks. This provides an added layer of isolation to your services.

Extended connectivity boundary

The virtual network extends the connectivity boundary from a single service to the virtual network boundary. You can create several cloud services and virtual machines within a single virtual network and have them communicate with each other without having to go through the Internet. You can also set up services that use a common back-end database tier or use a shared management service.

Extend your on-premises network to the cloud

You can join virtual machines in Azure to your domain running on-premises. You can access and leverage all on-premises investments for monitoring and identity for your services hosted in Azure.

Use persistent private IP addresses

Virtual machines within a virtual network have a stable private IP address. Azure assigns an IP address from the address range you specify and offers an infinite DHCP lease on it. You can also choose to configure your virtual machine with a specific private IP address from the address range when you create it. This ensures that your virtual machine retains its private IP address even when it is stopped or deallocated.

For more information, see Configure a static internal IP address for a VM.
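
Before requesting a specific private IP address, it can help to check that the address falls inside the subnet range and is not already taken. The following Python sketch uses the standard `ipaddress` module for that check. It is a planning aid only: the assumption that the first three host addresses of a subnet are held back by the platform should be verified against current Azure documentation, and the function names are hypothetical.

```python
import ipaddress

# Assumption for illustration: host addresses held back at the start of a subnet.
RESERVED_AT_START = 3


def usable_addresses(subnet_cidr):
    """Return the host addresses in the subnet that this model treats as assignable."""
    subnet = ipaddress.ip_network(subnet_cidr)
    hosts = list(subnet.hosts())  # excludes the network and broadcast addresses
    return hosts[RESERVED_AT_START:]


def can_assign(subnet_cidr, requested_ip, in_use=()):
    """Check that the requested address is in range and not already in use."""
    ip = ipaddress.ip_address(requested_ip)
    taken = {ipaddress.ip_address(address) for address in in_use}
    return ip in usable_addresses(subnet_cidr) and ip not in taken


print(can_assign("10.0.0.0/24", "10.0.0.4"))
print(can_assign("10.0.0.0/24", "10.0.0.1"))
print(can_assign("10.0.0.0/24", "10.0.0.5", in_use=["10.0.0.5"]))
```

The check is deliberately conservative: an address outside the subnet, inside the assumed reserved block, or already recorded as in use is rejected.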

There are two models for network configurations for Azure virtual machines: cloud-only and cross-premises:

  • Cloud-only virtual network configurations are virtual networks that don't use a virtual network gateway to connect back to your on-premises network or directly to another Azure virtual network. They aren't really a different type of virtual network, but rather, they are a way to configure a virtual network without configuring cross-premises connectivity. You connect to the virtual machines and cloud services from the endpoints, rather than through a VPN connection. For cloud-only configurations, see How to create a virtual network.
  • Cross-premises connections offer an enormous amount of flexibility. You can create multisite configurations, virtual network to virtual network configurations, ExpressRoute connections, and combinations of multiple configuration types. If you are extending your on-premises network to the cloud, this is the way to do it.
    Most cross-premises connections involve using a VPN device to create a secure connection to your Azure virtual network. If you prefer, you can create an ExpressRoute direct connection to Azure through your network service provider or exchange provider and bypass the public Internet altogether.

Feature References

About Virtual Network Secure Cross-Premises Connectivity

https://msdn.microsoft.com/en-us/library/azure/dn133798.aspx

Limitations

Although the capabilities of Azure Virtual Machines are quite comprehensive, some native limitations exist, and they should be understood by organizations prior to deploying solutions in Azure. The following table explores these limitations.

Limitation

Impact

Workaround

Auto-scaling

The application environment does not automatically increase or decrease role instances for increase or decrease in load.

  • Someone needs to manually monitor the load.
  • Sudden increase in load will impact the performance.

Utilize monitoring and automation capabilities such as the Azure Monitoring Agent and Azure Automation to dynamically scale and deploy application code to virtual machine instances in the environment.

Load balancing

Virtual machines are not load balanced by default

  • Azure Virtual Machines does not allow for elasticity of the application environment.
  • Sudden increase in load will impact the performance.

After the virtual machine is provisioned, create an Internal Load Balancer and associate it with the virtual machine.

Multiple network adapters

  • The current release does not support adding or removing network adapters after a virtual machine is created.
  • Network adapters in Azure Virtual Machines cannot forward traffic or act as Layer 3 (IP) gateways.
  • Internet-facing VIP is only supported on the "default" network adapter, and there is only one VIP mapped to the IP of the default network adapter. The additional network adapters cannot be used in a load-balance set.
  • The order of the network adapters inside the virtual machine will be random, but the IP addresses and the corresponding MAC addresses will remain the same.
  • You cannot apply Network Security Groups or Forced Tunneling to the non-default network adapters.

For more information, see:

Multiple virtual machine network adapters and network virtual appliances in Azure

 

Density

The maximum number of virtual machines per virtual network is currently 2,048.

Create a new virtual network and extend the network by connecting virtual networks together.

Concurrent TCP connections

A virtual machine or role instance supports a maximum of 500,000 concurrent TCP connections.

 

Static IP address or multiple IP address

  • Cannot assign static IP addresses on a virtual machine instance
  • Cannot assign multiple IP addresses on a virtual machine instance
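
Two of the workarounds in the preceding table lend themselves to a short sketch: a scaling decision that an automation job could apply to monitored load, and a count of the virtual networks needed for a given number of virtual machines. The Python below is a minimal model under stated assumptions; the CPU thresholds, instance bounds, and function names are hypothetical, and the per-network limit is the figure quoted in the table.

```python
import math

VMS_PER_VIRTUAL_NETWORK = 2048  # limit quoted in the table above


def desired_instance_count(current, cpu_percent, scale_out_at=75, scale_in_at=25,
                           minimum=2, maximum=10):
    """Return the instance count an automation job would target for this load."""
    if cpu_percent > scale_out_at:
        target = current + 1
    elif cpu_percent < scale_in_at:
        target = current - 1
    else:
        target = current
    # Keep the result inside the allowed range.
    return max(minimum, min(maximum, target))


def virtual_networks_needed(vm_count, limit=VMS_PER_VIRTUAL_NETWORK):
    """Return how many connected virtual networks are needed for vm_count VMs."""
    return math.ceil(vm_count / limit)


print(desired_instance_count(current=3, cpu_percent=90))
print(desired_instance_count(current=3, cpu_percent=10))
print(virtual_networks_needed(5000))
```

Changing one instance at a time and clamping to a minimum and maximum keeps the model from oscillating widely or scaling below the redundancy that an availability set needs.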
 

Management Considerations for Azure IaaS Virtual Machines

Azure Diagnostics provides Azure extensions that enable you to collect diagnostic telemetry data from a worker role, web role, or virtual machine running in Azure. The telemetry data is stored in an Azure Storage account. It can be used for debugging and troubleshooting, measuring performance, monitoring resource usage, traffic analysis and capacity planning, and auditing.

The following table explains the types of telemetry Azure Diagnostics can collect.

Data Source | Description
IIS logs | Information about IIS websites
Azure Diagnostic infrastructure logs | Information about diagnostics
IIS failed request logs | Information about failed requests to an IIS site or application
Windows Event logs | Information sent to the Windows event logging system
Performance counters | Operating system and custom performance counters
Crash dumps | Information about the state of the process in the event of an application crash
Custom error logs | Logs created by your application or service
.NET EventSource | Events generated by your code using the .NET EventSource class
Manifest-based ETW | Event Tracing for Windows (ETW) events generated by any process

Operational Insights is an analysis service that enables IT administrators to gain deep insight across on-premises and cloud environments. It enables you to interact with real-time and historical machine data to rapidly develop custom insights, and provides Microsoft and community-developed patterns for analyzing data.

For more information about these topics, please refer to the Cloud Platform Integration Framework section later in this document.

Microsoft Azure Compute: PaaS

Azure Platform-as-a-Service (PaaS) workloads share some common elements with IaaS, but they also have some key differences that should be considered when they are deployed. This service has been a part of the Azure offering since its inception, and in many ways is a desirable service to realize the true value of cloud computing.

A primary goal of PaaS is to remove the need to manage the underlying virtual machines. This allows customers to focus on the real value of the application, which is the functionality that it provides, not the underlying operating system or virtual machine.

PaaS provides great value in that management duties are significantly smaller for most organizations. The ability for Microsoft to maintain the operating system and virtual machines, and keep them patched with the latest security updates is a key differentiator to many cloud solutions in place today.

Another key benefit for targeting PaaS for applications and services is the dynamic scaling features that it affords. A side benefit of not managing the underlying virtual machines is the ability to scale the workloads to upper limits without any preplanning. New instances can be created and destroyed by the Azure platform and controlled by the customer. The real value of auto-scaling is in full effect with PaaS.

The integration of application deployment and release management into the service offering makes PaaS very desirable for customers looking to automate and orchestrate deployment of their application. Every application that gets deployed to Azure is a self-contained, packaged asset. This package is simply deployed to a virtual machine that is provisioned by the platform based on a configuration that the customer provides.

This makes automated and continuous integration of application code a real option. Pairing it with deployment slots, which allow VIP swapping, makes deployments to the cloud more predictable and safer. In addition, rolling back to a snapshot is possible with these options.

Feature References

Cloud Services Explained

http://azure.microsoft.com/en-us/documentation/articles/fundamentals-application-models/#cloud-services

Websites explained

http://azure.microsoft.com/en-us/documentation/articles/fundamentals-application-models/#websites

Cloud Service details / architecture

https://msdn.microsoft.com/en-us/library/azure/jj155995.aspx

Large Scale Services in Azure

https://msdn.microsoft.com/en-us/library/azure/jj717232.aspx

Development Considerations

https://msdn.microsoft.com/en-us/library/azure/jj156146.aspx

Platform updates in PaaS

https://msdn.microsoft.com/en-us/library/azure/hh472157.aspx

Deploying Azure Cloud Service with Release Management

http://blogs.msdn.com/b/visualstudioalm/archive/2015/02/09/deploying-azure-cloud-service-using-release-management.aspx

Mandatory: Azure solutions must contain at least two instances if running web or worker roles. For apps (such as Web Apps), this is not a requirement because the design has inherent fault tolerance built in.

Recommended: Azure solutions should contain multiple upgrade domains to avoid outages caused by updates to the guest and host by the platform. This is a unique item that exists for PaaS services.

Optional: Azure solutions can optionally contain auto-scaling configurations to increase and decrease instance counts for the service, based on a schedule or metric.
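
To illustrate the metric-based option, the following is a minimal sketch of a scaling rule, not the Azure auto-scale implementation. The CPU thresholds, the maximum of ten instances, and the function name are assumptions chosen for the example; the floor of two instances reflects the mandatory guidance above.

```python
def desired_instance_count(current: int, avg_cpu_percent: float,
                           minimum: int = 2, maximum: int = 10,
                           scale_out_above: float = 75.0,
                           scale_in_below: float = 25.0) -> int:
    """Return the instance count a simple metric-based rule would choose."""
    if avg_cpu_percent > scale_out_above:
        target = current + 1      # add one instance under heavy load
    elif avg_cpu_percent < scale_in_below:
        target = current - 1      # remove one instance when mostly idle
    else:
        target = current          # inside the comfortable band: no change
    # Clamp to the configured bounds (hypothetical defaults).
    return max(minimum, min(maximum, target))


print(desired_instance_count(current=2, avg_cpu_percent=90))  # 3
print(desired_instance_count(current=2, avg_cpu_percent=10))  # 2
```

Changing the count by one instance per evaluation keeps the rule from overreacting to short spikes; a schedule-based rule would replace the CPU input with the time of day.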

Design Guidance

For more information, see these applicable Azure design patterns:

The common design patterns for PaaS workloads can be split into two primary categories:

  • Web-based workloads
  • Back-end processing workloads

Typically, web-based workloads for modern frameworks work on PaaS with few changes.

Web applications that use a framework prior to .NET Framework 4.0 usually require some code changes to fit this cloud model. The key point to consider is whether extra components need to be installed as part of the application, for example, custom ISAPI filters, drivers, or security models that require full trust. These can be adapted to PaaS web applications, but they require varying levels of changes.

The other very important point to remember when using both web and back-end workloads is that the application needs to be stateless. Applications that require additional components for state management and tight coupling of the tiers of the applications tend to have problems when using modern cloud scale models.

At a minimum, it's important to understand all the components that make up the application and the architecture for the data, business, and front-end tiers. Additionally, it's key to understand if the deployment can be a file-based deployment and if the application is self-contained from a binary perspective.

Scenario | Model | Points to Consider
Web-based workload | Web-based applications | Software outside of web server needs to be installed; Startup tasks are needed
Back-end workload | Service-based applications | Avoid tight processing loops; Leverage competing consumer for multiple instances
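
The two back-end points above can be sketched together. This is a minimal, in-process illustration that assumes threads stand in for role instances and a local queue stands in for a durable queue service; all names are hypothetical.

```python
import queue
import threading


def run_competing_consumers(messages, worker_count=3):
    """Process messages with several workers that compete for items on one queue."""
    work = queue.Queue()
    for message in messages:
        work.put(message)

    results = []
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # Block briefly for work instead of polling in a tight loop.
                item = work.get(timeout=0.1)
            except queue.Empty:
                return  # queue drained; this worker stops
            with lock:
                results.append((name, item.upper()))
            work.task_done()

    threads = [threading.Thread(target=worker, args=(f"instance-{i}",))
               for i in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


processed = run_competing_consumers(["a", "b", "c", "d"])
print(sorted(item for _, item in processed))  # ['A', 'B', 'C', 'D']
```

Each message is handled by exactly one worker, so adding instances increases throughput without changing the producer.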

Cloud Services

Azure Cloud Services, in the context of PaaS, provides the units that contain the role instances that make up a given application. A cloud service binds to the virtual IP (VIP) that receives requests and load balances them across the underlying role instances. A cloud service can be considered a unit of deployment that can be versioned and stored. When you deploy a cloud service, it contains a package that defines the service (such as networking, load balancing, or role instance counts) in addition to the actual code for the application.

This model of deployment makes it very easy to control and deploy specific versions of an application. A cloud service can have multiple deployments running simultaneously. This is possible because of a concept of deployment slots that are implemented with cloud services. There are two deployment slots available for each cloud service.

The intention is for the staging slot to be used to stage new or updated versions of the cloud service, which are accessible to the deployment or DevOps teams for testing, and for the production slot to host the production deployment of the application.
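
The slot model can be illustrated with a toy sketch. It only models the idea of exchanging what each slot serves; it is not how the platform implements a VIP swap, and the class and method names are hypothetical.

```python
class CloudService:
    """Toy model of a cloud service with a production and a staging slot."""

    def __init__(self, production=None, staging=None):
        self.slots = {"production": production, "staging": staging}

    def deploy_to_staging(self, version):
        """Place a new version in staging without touching production."""
        self.slots["staging"] = version

    def swap(self):
        """Exchange the two slots; swapping a second time rolls back."""
        self.slots["production"], self.slots["staging"] = (
            self.slots["staging"], self.slots["production"])


service = CloudService(production="v1")
service.deploy_to_staging("v2")
service.swap()
print(service.slots)  # {'production': 'v2', 'staging': 'v1'}
service.swap()        # roll back to the previous version
print(service.slots)  # {'production': 'v1', 'staging': 'v2'}
```

Because the previous version remains in the staging slot after a swap, rolling back is the same operation as releasing.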

Cloud services also contain the binaries and scripts to install additional components on the PaaS instance at startup. These are necessary because a deployment that is running in Azure will move inside the Azure datacenter. As updates are deployed to the host and guest operating systems, the PaaS instances will be moved to other hosts. This means that everything required to make the PaaS instance and application run must be part of the cloud service package.

Feature References

Cloud services explained

https://msdn.microsoft.com/en-us/library/azure/jj155995.aspx

Startup tasks in cloud services

https://msdn.microsoft.com/en-us/library/azure/hh180155.aspx

Tools for packaging and deployment

https://msdn.microsoft.com/en-us/library/azure/gg433055.aspx

Manage guest operating system updates

https://msdn.microsoft.com/en-us/library/azure/ff729422.aspx

Recommended practices for large scale Web Apps

https://msdn.microsoft.com/en-us/library/azure/jj717232

Mandatory: Cloud services must contain all the assets, including code and other installations required to run the application. Everything must be included in the cloud service package, including scripts for installation.

Recommended: Give consideration to deployment models that will be used when updating the application. There are a few options to understand, and they each have pros and cons.

Optional: Cloud services can contain multiple running deployments in the form of production and testing or staging.

Design Guidance

It is best to understand that Azure Cloud Services, in the simplest form, provide a container or package wrapper for applications that are deployed to Azure. This type of application deployment model is not necessarily new. You will find similar models in client applications, such as the .appx format used by modern Windows applications.

The core idea is to build, version, and deploy the service package as a unit. This will make it easier for the DevOps or release management team to deploy updates to the application and to roll back if there are unforeseen side effects from an application update.

It is also important to realize that scaling Azure Cloud Services in the PaaS model is trivial. Because the application and service definition are wrapped in a package, deploying more instances of this is simply a matter of telling the Azure platform how many instances you want.

Web and Worker Roles

Application Type | Description
Web role | This role is used primarily to host and support applications that target IIS and ASP.NET. The role is provisioned with IIS installed, and it can be used to host front-end, web-based applications.
Worker role | This role is used primarily to host and support service applications. These applications target back-end processing workloads. They can be long running processes and can be thought of as providing services in the cloud.

It is important to remember that web and worker roles have dedicated underlying virtual machines per instance. Typically, this is transparent to the consumer, but it's particularly important from a diagnostics perspective. You can enable and log on to the underlying virtual machine if needed; however, this option is disabled by default.

Important things to keep in mind when deploying to the PaaS model are:

  • Additional components and software can be installed on the PaaS instances. This is accomplished via startup scripts.
  • There will most likely be some changes to the application when you migrate from an existing on-premises or private cloud deployment. At the very least, you will be repackaging the application in the cloud service.
  • It's important to consider session affinity and session state management when deploying applications to the cloud.
  • You should consider how upgrade domains are configured with the PaaS application. This affects how deployments are rolled out to PaaS workloads.
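
To see why upgrade domains affect rollouts, consider this minimal sketch. It assumes instances are spread round-robin across domains and that one domain is updated at a time; the default of five domains and the function names are assumptions for illustration.

```python
def assign_upgrade_domains(instance_count, domain_count=5):
    """Spread role instances round-robin across upgrade domains."""
    domains = {d: [] for d in range(domain_count)}
    for i in range(instance_count):
        domains[i % domain_count].append(f"instance-{i}")
    return domains


def min_available_during_update(instance_count, domain_count=5):
    """Instances still serving while the largest upgrade domain is offline."""
    domains = assign_upgrade_domains(instance_count, domain_count)
    largest = max(len(members) for members in domains.values())
    return instance_count - largest


print(min_available_during_update(1))   # 0
print(min_available_during_update(2))   # 1
print(min_available_during_update(10))  # 8
```

A single instance leaves nothing available while its domain is updated, which is the reasoning behind requiring at least two instances.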

Feature References

Web and worker roles

https://msdn.microsoft.com/en-us/library/azure/hh180152.aspx

IIS configuration in PaaS

https://msdn.microsoft.com/en-us/library/azure/gg433059.aspx

Configure web role with multiple sites

https://msdn.microsoft.com/en-us/library/azure/gg433110.aspx

Mandatory: Web and worker roles require at least two instances to provide fault tolerance for the automatic maintenance nature of PaaS.

Recommended: Web and worker roles should be considered if the application requires installing binaries on the web or application servers.

Optional: Virtual networking is common to allow the communication needed for databases, management services, and other services, but it is not a hard requirement for deploying an application via PaaS to a web or worker role.

Design Guidance

Web roles are specifically tailored for IIS-based applications. This limits their use to Windows applications that can target the Microsoft operating system and services. The common design pattern is to configure the scale unit for the instances and ensure that multiple instances (at least two) are used for production workloads. This is done by simply setting the configuration in the service definition.

Worker roles specifically target service applications (non-web based). As such, error handling that would be required for an out-of-band management service application should be employed. If exceptions are not handled in the service inner loop, the role instance will be restarted, which will result in downtime for processing.
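
The error-handling point can be sketched as follows. This is a generic illustration rather than worker role code: `get_next_item` and `process` are hypothetical callables supplied by the application, and the loop is bounded so that the example terminates.

```python
import logging


def run_worker(get_next_item, process, max_iterations):
    """Run a bounded worker loop that survives failures of individual items."""
    processed, failed = 0, 0
    for _ in range(max_iterations):
        item = get_next_item()
        if item is None:
            break  # nothing left to do
        try:
            process(item)
            processed += 1
        except Exception:
            # Log and continue; an unhandled exception here would end the loop
            # and, in a hosted role, lead to a restart of the instance.
            logging.exception("Failed to process %r", item)
            failed += 1
    return processed, failed


items = iter([1, 0, 2])
print(run_worker(lambda: next(items, None), lambda x: 10 / x, 10))  # (2, 1)
```

In a real service, failed items would typically be retried or moved to a separate queue for inspection rather than only counted.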

Azure App Service

Previous PaaS applications called Azure Websites have been integrated into a model that is called Azure App Service. App Service is comprised of the following subcomponents:

  • Web Apps
  • Mobile Apps
  • API Apps
  • Logic Apps

Web Apps is the new term used to describe what was previously named Azure Websites. The Web Apps feature in Azure App Service is a type of PaaS workload that differs slightly from the traditional web and worker role applications. The model decouples the application from the underlying infrastructure even more than traditional PaaS applications do. This greatly reduces the operational burden of maintaining applications because the customer no longer maintains the infrastructure and can focus on the application itself.

This model primarily removes the customer from any connection with the underlying virtual machines that are hosting the application. This means components such as Remote Desktop are not an option and that the installation of components and software is not something a customer can directly execute.

There are extensions available via the Azure portal (Azure Marketplace), which are essentially packages of software that have been tested and can be added to a website deployed via Web Apps.

Web Apps are primarily used to provide a platform to host various web applications and web services. Additionally, Web Apps can run back-end processes via a service offering in Azure WebJobs.

WebJobs encapsulate an existing executable or script that provides some processing output. WebJobs can also be scheduled or run on demand. For more information about WebJobs, see Azure WebJobs documentation resources.

Deployment of Web Apps is in some ways different from other PaaS and IaaS deployment models. Supported deployment models include:

  • Manual – Use a file copy, FTP, or WebMatrix
  • Local Git – Use the Kudu environment
  • Continuous integration – Use GitHub or Team Foundation Version Control (TFVC)

There are some fundamental differences in deployment slots in Web Apps as compared with the web and worker role deployments. Web Apps supports up to five deployment slots.

Web Apps is deployed in an App Service plan, previously called a Web Hosting plan. The service plan represents a set of features and capacity that can be contained and shared with multiple Web Apps in an Azure App Service. The following pricing tiers are provided:

  • Free
  • Shared
  • Basic
  • Standard
  • Premium

For apps to share a hosting plan, they need to be in the same subscription and geographical location. In an Azure App Service, an app can be associated with only a single app hosting plan at one time.

Feature References

App Services explained

https://msdn.microsoft.com/en-us/library/azure/dn948515.aspx

App Services deep dive

http://channel9.msdn.com/Series/Windows-Azure-Web-Sites-Tutorials

App Service migration tools

https://www.movemetothecloud.net/

Mandatory: These lighter weight PaaS services do not allow direct access to the underlying virtual machines. This means no installation of components on the underlying web server (outside of the application folder).

Recommended: Match the service offering with the type of workload. API apps differ from Web Apps because one needs more focus on the back end and the other needs more focus on the front end.

Optional: Plan for capacity needs. Although some thought should be given to how many instances or sizes should be used, these can easily be changed later. The focus here is on rapid deployment.

Design Guidance

Azure App Service is one of the latest models to be employed on Azure. The idea is to simplify the management and cost of running a variety of services in PaaS. This means a service performance level can be set at the service level and then the various services can be deployed inside this service.

For example, a web app could be deployed that is using an API app or a Logic App, and the cost and performance levels are set at the service level. This simplifies the deployments because each app doesn't need to be configured and billed separately.

The app model is growing very fast, and it makes integrating deployed services, APIs, and applications much simpler and faster than previous PaaS models, such as web roles.

Azure SQL Database

Azure SQL Database is the realization of one of the most popular relational databases in a managed, multitenant, PaaS model. When choosing a database deployment model, there are key factors to consider to ensure the end goal is met. Although this is attractive for many reasons, it is important to understand that there are key differences between running an Azure SQL Database and an on-premises or IaaS virtual machine with SQL Server installed.

A key point is that system-level functions cannot be performed from Azure SQL Database. This includes database backups, system level profiling, and extensions to SQL Database, including FILESTREAM and CLR extensions.

Operations such as backups have been accommodated by extensions to T-SQL, which allow for backups without the need to create a backup media object (which would tie to a file system object on the operating system running SQL Server). All other operations, such as management of the SQL Database instance, are accommodated by using dynamic management views (DMVs) instead of extended stored procedures.

The key benefit of leveraging Azure SQL Database over traditional deployments of SQL Server is that databases and servers can be created in seconds, not hours or days. An added benefit is that there is less need to focus on the infrastructure and processes to replicate and back up the data in the databases.

By default, Azure SQL Database commits the data to three separate instances in the same Azure region. This is similar to how Azure Storage commits all writes to three replicas. This provides high availability and protects against hardware failures inside Azure.

Feature References

Understanding Azure SQL

http://azure.microsoft.com/en-us/documentation/articles/data-management-azure-sql-database-and-sql-server-iaas/

Development considerations

https://msdn.microsoft.com/library/azure/ee730903.aspx

Performance and scaling

https://msdn.microsoft.com/library/azure/e6f95976-cc09-4b46-9d8c-4cf23119598d

Mandatory: Understand the performance and management differences between a traditional SQL Server database and an Azure SQL Database.

Recommended: Analyze databases to be migrated to Azure SQL Database for incompatibilities that might be present (for example, FILESTREAM).

Optional: Leverage built-in tools for BACPAC and DACPAC to move databases to Azure SQL Databases.

Design Guidance

Running relational databases in the cloud has deep implications to the application, performance, and resiliency. In many ways, the database is as important as the application in terms of how to run it effectively in a public cloud. There are pros and cons to running SQL Server in an IaaS environment as well as Azure SQL Database.

Some fundamental considerations are encryption requirements, performance requirements, and feature requirements. Although data can be encrypted by the application and stored in a SQL Database, Transparent Data Encryption (TDE) is not yet supported in SQL Database.

If TDE is required, SQL Server in your IaaS environment would be the preferred target. Performance of an Azure SQL Database can appear to be slower, but consider that each write is committed synchronously to three databases in the local datacenter, and asynchronously to another if geo-replication is enabled. This affords the benefit of not having to maintain as many local backups (backups are built in), with the downside of reduced write performance. For workloads with high transactions per second (TPS), consider adding a caching layer to insulate the application from the performance impact of multiple commits as much as possible.
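
One common way to add such a caching layer is the cache-aside pattern, sketched below under stated assumptions: a dictionary stands in for an external cache service, `load_from_db` is a hypothetical callable that queries the database, and the class name is illustrative. Note that this reduces read traffic to the database; it does not make the writes themselves faster.

```python
class CacheAside:
    """Check the cache first and fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}      # stand-in for an external cache service
        self.db_reads = 0     # counts how often the database was queried

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.db_reads += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Call after a write so the next read reloads from the database."""
        self._cache.pop(key, None)


store = CacheAside(load_from_db=lambda key: f"row-{key}")
store.get(42)
store.get(42)
print(store.db_reads)  # 1
```

Invalidating on write keeps cached values from going stale, at the cost of one extra database read after each update.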

Advanced features in SQL Server that require access at the disk level or operating system level obviously will not work the same with SQL Database. For example, CLR integration, backup sets, and FILESTREAM tables are not possible with Azure SQL Database.

Leveraging Azure SQL Database has some unique security considerations. Azure SQL Database has a public facing IP address that is accessible by anyone. Communications to Azure SQL Database can be secured by using the SQL Server firewall and the per SQL Database firewall.

This allows for specific IP addresses that can connect to the database. When using ExpressRoute and public peering to access Azure SQL Database, the access flows through a network address translation (NAT) interface. This means the NAT address has to be specified in the firewall rules for Azure SQL Database, and therefore, it does not allow the ability to specify end-to-end security.
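
The effect of range-based firewall rules can be sketched with the Python standard library. This is an illustration of the matching logic only, not the Azure SQL Database firewall API; the addresses come from ranges reserved for documentation, and the function name is hypothetical.

```python
import ipaddress


def is_allowed(client_ip, allowed_ranges):
    """Return True if client_ip falls inside any inclusive (start, end) range."""
    ip = ipaddress.ip_address(client_ip)
    return any(
        ipaddress.ip_address(start) <= ip <= ipaddress.ip_address(end)
        for start, end in allowed_ranges
    )


# Hypothetical rule: in the ExpressRoute case, this would be the NAT address range.
rules = [("203.0.113.10", "203.0.113.20")]
print(is_allowed("203.0.113.15", rules))  # True
print(is_allowed("198.51.100.7", rules))  # False
```

Because every client behind the NAT interface presents the same address, a rule of this kind cannot distinguish between individual on-premises hosts.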

Scenario | Model | Points to Consider
FILESTREAM needed | Shred the file objects to a blob in Azure Storage and store indexes in SQL Server | FILESTREAM is not available with Azure SQL Databases
SQL backups | Use geo-replication and point-in-time backups for Azure SQL Database workloads | Traditional backup sets are not supported in Azure SQL Databases
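
The FILESTREAM alternative in the table can be sketched as follows. This is a minimal illustration under loud assumptions: a dictionary stands in for an Azure Storage blob container, an in-memory SQLite database stands in for the relational database, and all table and function names are hypothetical.

```python
import sqlite3
import uuid

blob_container = {}  # stand-in for an Azure Storage blob container


def open_index():
    """Create the index table in a stand-in relational database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE documents (name TEXT PRIMARY KEY, blob_name TEXT NOT NULL)")
    return db


def save_document(db, name, content):
    """Store the bytes as a blob and record only its name in the database."""
    blob_name = uuid.uuid4().hex
    blob_container[blob_name] = content
    db.execute("INSERT INTO documents (name, blob_name) VALUES (?, ?)",
               (name, blob_name))
    return blob_name


def load_document(db, name):
    """Look up the blob name in the database, then fetch the bytes."""
    row = db.execute("SELECT blob_name FROM documents WHERE name = ?",
                     (name,)).fetchone()
    return blob_container[row[0]] if row else None


db = open_index()
save_document(db, "report.pdf", b"file contents")
print(load_document(db, "report.pdf"))  # b'file contents'
```

The database row stays small and queryable while the large object lives in storage that is designed for it.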

Azure Batch

As discussed earlier, there is often a need to run processing that has no immediate UI and runs as a service in the background. The traditional PaaS approach was to leverage a worker role and implement the business logic via custom code.

Azure Batch offers a similar type of service, but with a unique twist. It has been designed to run background processes, but it is centered on high-performance data. It can provide scheduling, auto-scaling of compute resources, and partitioning of jobs.

This type of service targets the following type of workloads:

  • Financial risk modeling
  • Image rendering
  • Image processing
  • Media encoding and transcoding
  • Genetic sequence analysis
  • Software testing

Azure Batch is comprised of two primary components: Azure Batch and Azure Batch Apps. Azure Batch APIs focus on the ability to create pools of virtual machines and define work items to be scheduled for processing.

Before the advent of Azure Batch, the typical model was to create an HPC cluster and devise a software model that would allow scheduling or queueing items that required processing.

Azure Batch includes an API to support the infrastructure. There is no need to manually build servers and software libraries to handle job scheduling. This allows consumers to focus purely on the business logic and use configuration to drive the infrastructure.

Azure Batch Apps take this a step further. Consumers can publish an "app," which is essentially a service that allows data to be fed to it, and it will run as needed. The pool for the compute workload is defined in the configuration. This adds the ability to monitor these jobs via the Azure portal and via the REST API for extensibility to build custom dashboards.

Feature References

Azure Batch Technical Overview

http://azure.microsoft.com/en-us/documentation/articles/batch-technical-overview/

Azure Batch APIs

https://msdn.microsoft.com/en-us/library/azure/dn820177.aspx

Mandatory: Define pools of virtual machines that will perform the underlying work for Azure Batch jobs.

Recommended: Analyze the workload to determine which model is the better fit: Azure Batch or Azure Batch Apps.

Optional: Leverage the REST API to output monitoring and telemetry to existing systems.

Design Guidance

Azure Batch is a good consideration for workloads in which an existing process or executable is used to process the data. For this to work effectively, the data should be in a format that can allow parallelization (which cuts the data into several chunks). This service can be highly effective for custom processing and it is easy to configure.
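
The chunk-and-process idea can be sketched locally. This example does not use the Azure Batch API; a thread pool stands in for a pool of compute nodes, and `process_chunk` is a hypothetical stand-in for the existing executable that handles one chunk.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split items into consecutive chunks of at most size elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(numbers):
    """Hypothetical task: stands in for the work done on one chunk."""
    return sum(n * n for n in numbers)


def run_parallel(items, chunk_size=4, workers=3):
    """Process each chunk independently, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunk(items, chunk_size)))
    return sum(partials)


print(chunk(list(range(10)), 4))      # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(run_parallel(list(range(10))))  # 285
```

The approach works only when chunks can be processed without depending on each other, which is the format requirement described above.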

Azure HDInsight (Hadoop)

Azure HDInsight is the implementation of Hadoop as a service in Azure. The goal of this service is to enable customers to create Hadoop clusters in seconds or minutes instead of hours or days. This significantly reduces the cost of this big data service. Additionally, the service provides storage in the form of the Hadoop Distributed File System (HDFS), which has become the standard for Hadoop clusters. Azure extends this concept to allow Azure Storage to be leveraged by Hadoop.

At its core, HDInsight can be run on Linux-based or Windows-based servers. This makes using Hadoop easy for those coming from a Linux background and approachable by those who use Windows.

The Hortonworks Data Platform (HDP) is the Hadoop distribution used by HDInsight. Additionally, there are several high-level configurations for running Hadoop, which can be used to optimize the cluster based on the operations and activities it will target. Along with this are other components that have been developed primarily by the open source community. These customize the system for specific types of workloads.

Feature References

Hadoop on Azure

http://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-introduction/

Components matrix

http://azure.microsoft.com/en-us/documentation/articles/hdinsight-component-versioning/

Apache Hadoop Core

http://hadoop.apache.org/

Using Pig with Hadoop

http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-pig/

Mandatory: Create storage accounts for data repositories for HDInsight. Also, be sure to deprovision your clusters when not in use because the cost for running thousands of cores can add up quickly.

Recommended: Know the type of data you have in your system, and think about what actions are most important for your workloads. This will help guide which components in HDInsight can be leveraged to take best advantage of the platform services.

Optional: Spend some time checking what is already built by the open-source communities that you can use with much less effort than writing from scratch.

Azure Machine Learning

Machine learning and data science are exciting technologies in today's market. Unlocking the insights that our data holds is key to getting the competitive advantage that most companies need for success.

Although the language around neural networks and machine learning has been in place for decades, only recently has computing power been able to provide the computational performance required to run these algorithms at a large scale. This service allows modeling data and algorithms. It provides a low bar for entry, and users can start simply with a web browser. Azure Machine Learning also provides a tool set called Azure Machine Learning Studio.

This service offering turns the algorithms and code from languages (for example, R and Python) into services that can target users' data. Users upload more data, and it can be compared and processed against models created by data scientists. Machine Learning also provides training with existing data to feed the models.

This offering lowers the bar for getting the most out of a predictive analytics system. Historically, it required very different skill sets to develop the models and algorithms as opposed to exposing the results via the web or services. Machine Learning marries these primary skills and focuses on the underlying business logic.

Feature References

Machine learning overview

https://azure.microsoft.com/en-us/documentation/articles/machine-learning-what-is-machine-learning/

Azure for Research

http://research.microsoft.com/en-us/projects/azure/default.aspx

Machine Learning Studio

http://azure.microsoft.com/en-us/documentation/articles/machine-learning-what-is-ml-studio/

Publishing the API

http://azure.microsoft.com/en-us/documentation/articles/machine-learning-publish-a-machine-learning-web-service/

Mandatory: Bring your R and Python code libraries, and understand how to leverage Machine Learning Studio to provide a streamlined development experience.

Recommended: Partition your logic to create consumable services by using the platform services of Machine Learning.

Optional: Explore what the data science community has already created and work to extend or enhance these to speed your development effort and time.

High-Performance Computing

Azure provides high-performance computing (HPC) in the form of high-performance virtual machines. These virtual machines are tailored to support this specific computing need via features such as a back-end network with MPI latency under three microseconds and up to 32 Gbps of throughput. The back-end network leverages RDMA to enable scaling workloads beyond the typical limits (even for cloud platforms) to thousands of cores.

Azure supports high-performance virtual machines with up to 500 GB of memory, 6.9 TB of SSD storage, and 32 CPUs per virtual machine. When combined with the Microsoft HPC Pack, a system can be architected to run on-premises in a Windows compute cluster, and then extend to Azure as capacity demands dictate. Organizations can also run HPC wholly in Azure if desired.

Combined with a rich ecosystem of applications, libraries, and tools, Azure provides a premier platform for high-performance computing.

Feature References

HPC overview

https://msdn.microsoft.com/en-us/library/azure/dn482130.aspx

Cluster pack for Azure IaaS

https://msdn.microsoft.com/en-us/library/azure/dn518135.aspx

Running MPI apps

https://msdn.microsoft.com/en-us/library/azure/dn592104.aspx

Hybrid cluster

http://azure.microsoft.com/en-us/documentation/articles/cloud-services-setup-hybrid-hpcpack-cluster/

Mandatory: Determine a strategy for on-premises, cloud, and hybrid clustering for HPC workloads.

Recommended: Scale the application resources dynamically, to take advantage of extreme size virtual machines only when it makes sense.

Optional: Make changes to the applications to allow disconnecting the tiers to take advantage of features, such as queuing to allow scaling of independent compute clusters.

Content Delivery Network

A key requirement for applications with global reach is to deliver content to customers as quickly as possible. An industry-standard way to distribute static content (typically images and video) is a content delivery network (CDN).

Although this type of service has been available for years, the Azure platform integrates the CDN with existing data in Azure Blob Storage. This integration enables hosting the applications for a global reach, and makes it easier and less expensive to implement.

Typically, CDNs are used for static content, such as media, images, or documents, which is read more than written to. With an end goal of pushing content globally, this integration with Azure results in a faster response time for clients, based on their location. It also facilitates automatic redundancy and replication of content.

Feature References

CDN overview

http://azure.microsoft.com/en-us/documentation/articles/cdn-overview/

POP locations for CDN in Azure

https://azure.microsoft.com/en-us/documentation/articles/cdn-pop-locations/

Integration of CDN

http://azure.microsoft.com/en-us/documentation/articles/cdn-serve-content-from-cdn-in-your-web-application/

Mandatory: Determine which regions to target for the CDN, and update the application to reference content through the root URI of the CDN rather than the local content path.

Recommended: Leverage parameters to vary the caching characteristics and lifetime of the content cache.

Optional: Map the CDN content to a custom domain name.
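As a minimal sketch of the mandatory step above, the following Python helper rewrites a local content URL so that the content is requested from a CDN endpoint instead. The host name cdn.example.net and the v query parameter are hypothetical placeholders and do not come from this document. Whether query strings influence caching depends on how the CDN endpoint is configured.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical CDN endpoint host; substitute the endpoint created for your content.
CDN_HOST = "cdn.example.net"


def to_cdn_url(local_url: str, version: Optional[str] = None) -> str:
    """Return local_url with its scheme and host replaced by the CDN endpoint.

    If version is given, it is appended as a 'v' query parameter so that a
    new version of the content is requested under a different URL.
    """
    parts = urlsplit(local_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if version is not None:
        query.append(("v", version))
    return urlunsplit(("https", CDN_HOST, parts.path, urlencode(query), parts.fragment))
```

For example, to_cdn_url("http://www.contoso.com/images/logo.png", version="2") keeps the path /images/logo.png but points it at the CDN host. Centralizing the rewrite in one helper keeps the CDN root URI in a single place in the application.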

Redis Cache

Application-level caching can be achieved with a variety of products, some from Microsoft and others from external vendors. There are multiple types of caching options in Azure, including Managed Cache Service, In-Role Cache, and Redis Cache.

Redis Cache is a cache-as-a-service offering for the Azure platform. This means that the Azure platform manages the underlying infrastructure to host the caching servers. From an application point of view, the service can be accessed via the same Redis clients that have been in use since Redis was created. These vary based on platforms, and they are available for most of the popular selections (for example, Java, Node, and .NET).

Redis goes beyond simple key/value pairs: it can cache entire data structures, such as collections and sets. Redis also supports replication with non-blocking first synchronization and automatic reconnection, which increases uptime.
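To illustrate caching a set rather than a single value, here is a minimal cache-aside sketch in Python. The InMemorySetCache class is a hypothetical stand-in that keeps data in local memory. It only mirrors the shape of the Redis SADD, SMEMBERS, and EXPIRE commands so that the example runs without a cache server. The key format and the default lifetime are also assumptions.

```python
import time


class InMemorySetCache:
    """Hypothetical stand-in for a Redis client, for illustration only."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._sets = {}      # key -> set of members
        self._expires = {}   # key -> absolute expiry time

    def _purge_if_expired(self, key):
        expiry = self._expires.get(key)
        if expiry is not None and self._clock() >= expiry:
            self._sets.pop(key, None)
            self._expires.pop(key, None)

    def sadd(self, key, *members):
        self._purge_if_expired(key)
        self._sets.setdefault(key, set()).update(members)

    def smembers(self, key):
        self._purge_if_expired(key)
        return set(self._sets.get(key, set()))

    def expire(self, key, seconds):
        if key in self._sets:
            self._expires[key] = self._clock() + seconds


def get_product_tags(cache, product_id, load_from_store, ttl_seconds=300):
    """Cache-aside read: return cached tags, loading and caching them on a miss."""
    key = f"product:{product_id}:tags"
    tags = cache.smembers(key)
    if tags:
        return tags
    tags = set(load_from_store(product_id))
    if tags:
        cache.sadd(key, *tags)
        cache.expire(key, ttl_seconds)
    return tags
```

The application code only sees a set going in and a set coming out, which is the simplification that structure-aware caching is meant to provide.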

Feature References

Redis overview

http://azure.microsoft.com/en-us/services/cache/

Caching data in Redis

https://msdn.microsoft.com/library/azure/dn690521.aspx

Management in Azure Portal

https://msdn.microsoft.com/library/azure/dn793612.aspx

Cache planning

https://msdn.microsoft.com/library/azure/dn762132.aspx

Mandatory: Understand which tier of service will be required and implement the Redis client in the application.

Recommended: Use the advanced structure caching options with Redis to simplify the application caching code.

Optional: Set up policies for cache rejection, lifetime, and so on.

Service Bus

Azure Service Bus is one of the core services. It provides a high-performance, durable messaging service, and it also offers queueing and relay services.

Service Bus queues provide an option to decouple any processing from the request pipelines. This type of architecture is very important, especially when migrating workloads to the cloud because loosely coupled applications can scale, and they are more fault resilient.

Service Bus can use a variety of models, from simple queue-based storage, to topics (which target and partition messages in a namespace). You can even use Event Hubs on top of Service Bus to service very large client bases, where input to Service Bus will include several thousands to millions of messages in rapid succession.
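The decoupling described above can be sketched locally with Python's standard queue module. This is not Service Bus code: the in-process queue is only a stand-in for a durable queue, and the function and parameter names are hypothetical. The sketch shows the shape of the pattern, in which the front end enqueues work and returns, while independent workers drain the queue. Error handling is omitted for brevity.

```python
import queue
import threading


def run_pipeline(requests, handler, worker_count=2):
    """Accept requests onto a queue and process them on separate worker threads.

    Returns the handler results in no particular order.
    """
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = work.get()
            try:
                if item is None:   # sentinel: no more work for this worker
                    return
                outcome = handler(item)
                with lock:
                    results.append(outcome)
            finally:
                work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()

    # The "front end" only enqueues; it never waits on the handler.
    for request in requests:
        work.put(request)
    for _ in threads:
        work.put(None)

    work.join()
    for thread in threads:
        thread.join()
    return results
```

Because the producer and the workers share only the queue, the number of workers can change without touching the code that accepts requests.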

Feature References

Service Bus overview

https://msdn.microsoft.com/en-us/library/ee732537.aspx

Sample scenarios

https://msdn.microsoft.com/en-us/library/dn194201.aspx

Event Hubs

https://msdn.microsoft.com/en-us/library/dn789973.aspx

Application architecture

http://azure.microsoft.com/en-us/documentation/articles/service-bus-build-reliable-and-elastic-cloud-apps/

Mandatory: Determine which model to use when storing messages in Service Bus, based on transaction, lifetimes, and message rates.

Recommended: Modify applications to provide transient fault handling to complement decoupling message posting from message processing.

Optional: Leverage Event Hubs to handle large scale intake of Service Bus messages.
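One common way to approach the transient fault handling recommended above is to retry with exponential backoff. The following Python sketch assumes a hypothetical TransientError exception that marks retryable failures such as timeouts or throttling. The attempt count and delays are illustrative values, not guidance from this document.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, throttling)."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Delay doubles each attempt: base, 2*base, 4*base, plus up to 10% jitter.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

Only errors that are expected to clear on their own should be retried. Other exceptions propagate immediately so that real defects are not hidden behind retries.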

API Management

In addition to providing business logic, a key consideration when implementing web-service workloads is to allow for features such as throttling, authentication, and partitioning services. Until this point, developers were tasked with building code for this infrastructure, which became the framework for the web services that were deployed.

API Management Service was designed to accommodate this need. It provides this infrastructure with very little effort. Developers can concentrate on the business logic of the web services instead of how they are deployed. This also allows deploying the underlying web services on different servers and different technologies, as needed.

API Management Service can leverage existing on-premises web services in addition to cloud-deployed services. Services such as throttling, rate limits, and service quotas can be applied in a central point, similar to load balancing. Services are based on rules that are established in the API Management service. This allows consolidating services from multiple back-ends to a single entry point for service consumers.
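To make the idea of a centrally applied rate limit concrete, here is a minimal fixed-window limiter in Python. It is a sketch of the general technique only. It does not represent how API Management implements its policies, and the class name, parameters, and caller keys are hypothetical.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `limit` calls per caller key in each window of `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        self._state = {}   # key -> (window index, calls seen in that window)

    def allow(self, key):
        window = int(self._clock() // self.window_seconds)
        seen_window, count = self._state.get(key, (window, 0))
        if seen_window != window:
            count = 0   # a new window has started for this caller
        if count >= self.limit:
            self._state[key] = (window, count)
            return False
        self._state[key] = (window, count + 1)
        return True
```

Applying a check like this at the single entry point protects every back-end service behind it, which is why the developers of those services no longer need to build it themselves.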

Feature References

API Management overview

http://azure.microsoft.com/en-us/documentation/articles/api-management-key-concepts/

Getting started

http://azure.microsoft.com/en-us/documentation/articles/api-management-get-started/

API management for developers

https://msdn.microsoft.com/en-us/library/azure/dn776327.aspx

Securing APIs

http://azure.microsoft.com/en-us/documentation/articles/api-management-howto-mutual-certificates/

Mandatory: Configure policies for services and profiles for existing web services to use API Management.

Recommended: Protect web services with API Management rate limits and quota policies.

Optional: Customize the developer portal to allow for developer registration and subscription models.

Azure Search

One of the most common components in applications is a platform or infrastructure that supports searching application data. Many such products are available from Microsoft and from third parties. Azure Search was created to provide a search-as-a-service offering in Azure.

While Azure Search does not offer a crawler to index the application data sources, it does provide the infrastructure to intake the index files and provides interfaces for the actual search functions. This service is targeted at developers, and it is not a service that is directly customer facing. At the core, it's a web service that follows the model of a REST-based interface for connected applications.

The index schemas are expressed in JavaScript Object Notation (JSON). Essentially the index contains a list of fields and associated attributes. These attributes are:

  • Name: Describes the data this field contains.
  • Type: Indicates the type of data in this field. Some of the options are String, Int32, Double, Boolean.
  • Searchable: Determines whether a user's search request can access this field.
  • Suggestions: Determines whether Azure Search can provide suggestions for this field. If this is set, an application can call Azure Search regularly while the user is typing in the Search box to get suggestions. These suggestions are added to the index by the people who own that index—they're not created automatically by the search service.
  • Sortable: Indicates that search results can be sorted by this field. Some fields, such as a string containing a paragraph of text, might not allow this because sorting on a paragraph probably wouldn't make much sense.
  • Retrievable: Indicates whether this field can be returned in the search results.
  • Filterable: Indicates that this field can be used as a filter. For example, if a user wants to search for "high heels," the field that contains these search terms must be marked as filterable. This lets Azure Search return only the rows in the index that contain "high heels" in that field.
  • Facetable: Indicates whether a search request can return the number of items in the index with a specific characteristic. An application can also request the number of items within a specific range.
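The attributes above can be sketched as a JSON index definition built in Python. The attribute names follow the list above, but the exact JSON property names and type identifiers that the service expects are assumptions here and should be checked against the Azure Search API reference. The index name and the field names are hypothetical.

```python
import json


def make_field(name, field_type, **attributes):
    """Build one field definition; unspecified attributes default to False."""
    flags = ("searchable", "suggestions", "sortable",
             "retrievable", "filterable", "facetable")
    unknown = set(attributes) - set(flags)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    field = {"name": name, "type": field_type}
    for flag in flags:
        field[flag] = bool(attributes.get(flag, False))
    return field


index_definition = {
    "name": "products",   # hypothetical index name
    "fields": [
        make_field("description", "String", searchable=True, retrievable=True),
        make_field("category", "String", filterable=True, facetable=True, retrievable=True),
        make_field("price", "Double", sortable=True, filterable=True, retrievable=True),
    ],
}

index_json = json.dumps(index_definition, indent=2)
```

Setting every attribute explicitly on each field makes it easy to review which fields can be searched, filtered, or returned before the index is created.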

Feature References

Azure Search Overview

https://msdn.microsoft.com/library/azure/dn798933.aspx

Getting started

http://azure.microsoft.com/en-us/documentation/articles/fundamentals-azure-search-chappell/

Azure Search API

https://msdn.microsoft.com/en-us/library/azure/dn798935.aspx

Creating indexes

https://msdn.microsoft.com/en-us/library/azure/dn798941.aspx

Mandatory: Construct and update indexes for Azure Search consumption via back-end services. PaaS-based worker roles work well for these types of jobs.

Recommended: Add additional attributes to the index to support advanced features, such as automatic suggestions.

Optional: Build monitoring data integration to existing monitors to ensure storage or indexes don't exceed the limits for the service.

Design Guidance

For more information about applicable Azure design patterns, see Azure Search Tier (Azure Architecture Patterns).

PaaS Considerations

As with IaaS, there are several considerations when deploying solutions within Microsoft Azure, specifically when targeting PaaS models. Deployment considerations include deployment methods, load balancing, resiliency, security, and networking. This section covers these areas at a high level.

Deployment Methods

When running services in PaaS, it's important to understand and configure the services such that upgrades to the application and upgrades as part of the Azure platform do not result in outages or downtime to the application.

Models for the application lifecycle vary from simple to complex. Some of the most important tradeoffs are detailed in the following table:

Considerations

Decision Points

Upgrade domains

When a service is deployed to Azure (specifically as PaaS web and worker roles), it includes the concept of upgrade domains. It's important to configure services to use multiple upgrade domains to avoid unnecessary outages when new deployments or upgrades to the application or service are initiated.

Deployment slots

Deployment slots can be used to test new versions or upgrades without affecting the production application. A model for using staging slots before releasing to production can enable better testing to avoid downtime.

Web Deploy

Web Deploy is a way to deploy services to cloud services in Azure. Although it is simple, this model typically requires user interaction. This makes it a good option for developers and smaller apps, but larger apps might require more governance and control over deployments.

Continuous integration

Continuous integration is a great option for larger applications and organizations that require automating deployments. This allows gated check-ins (approval) and continuous check-ins (triggered).

Load Balancing

Customers who deploy applications as PaaS in Azure must consider load balancing as a core part of the application. This is a must for applications hosted in PaaS because even if hardware failures never happen (which is unlikely), the servers need to be upgraded (guest and host upgrades). This means that these instances will be moved at some point.

The Azure fabric will ensure that it doesn't take all the server instances down at one time, but this requires the use of at least two instances (fault and upgrade domains allow the fabric to operate in this way).

Keep in mind that there are two levels of load balancing available for Azure PaaS services:

  • DNS level: Load balancing for traffic to different cloud services located in different datacenters, to different Azure websites located in different datacenters, or to external endpoints. This is done with Traffic Manager and the round robin load balancing method.
  • Network level: Load balancing of incoming Internet traffic to different virtual machines of a cloud service, or load balancing of traffic between virtual machines in a cloud service or virtual network. This is done with the Azure Load Balancer.
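The round robin method mentioned above can be sketched in a few lines of Python. This is a generic illustration of the technique, not a description of how Traffic Manager is implemented. The endpoint names and the health check callback are hypothetical.

```python
from itertools import cycle


def round_robin(endpoints, is_healthy=lambda endpoint: True):
    """Yield healthy endpoints in rotation, skipping any that fail the health check."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    rotation = cycle(endpoints)
    while True:
        # Try each endpoint at most once per pick before giving up.
        for _ in range(len(endpoints)):
            candidate = next(rotation)
            if is_healthy(candidate):
                yield candidate
                break
        else:
            raise RuntimeError("no healthy endpoints available")
```

For example, with the endpoints west, east, and north and a health check that reports east as unavailable, successive picks alternate between west and north.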

Feature References

Load Balancing for Azure Services

http://azure.microsoft.com/documentation/articles/load-balancer-overview

About Traffic Manager Load Balancing Methods

http://azure.microsoft.com/documentation/articles/traffic-manager-load-balancing-methods

Internal load balancing

http://azure.microsoft.com/documentation/articles/load-balancer-internal-overview

Networking

PaaS instances that need to communicate with other servers or services in Azure require a virtual network. Some areas of consideration include:

Considerations

Decision Points

Name resolution

When you deploy virtual machines and cloud services to a virtual network, you can use Azure-provided name resolution or your own DNS solution, depending on your name resolution requirements. For information about name resolution options, see Name Resolution (DNS).

Enhanced security and isolation

Because each virtual network is run as an overlay, only virtual machines and services that are part of the same network can access each other. Services outside the virtual network have no way to identify or connect to services hosted within virtual networks. This provides an added layer of isolation to your services.

Extended connectivity boundary

The virtual network extends the connectivity boundary from a single service to the virtual network boundary. You can create several cloud services and virtual machines within a single virtual network and have them communicate with each other without having to go through the Internet. You can also set up services that use a common backend database tier or use a shared management service.

Extend your on-premises network to the cloud

You can join virtual machines in Azure to your domain running on-premises. You can access and leverage all on-premises investments related to monitoring and identity for your services hosted in Azure.

Use persistent public IP addresses

Cloud services within a virtual network have a stable public VIP address. When you create a cloud service, you can also choose to configure it with a reserved public IP address from the address range. This ensures that your instances retain their public IP address even when moved or restarted. See Reserved IP Overview.

There are two models for network configurations for Azure cloud services: cloud-only and cross-premises virtual network configurations.

  • Cloud-only virtual network configurations are virtual networks that don't use a virtual network gateway to connect back to your on-premises network or directly to another virtual network in Azure. They aren't really a different type of virtual network, but rather, they are a way to configure a virtual network without configuring cross-premises connectivity.
    You connect to the virtual machines and cloud services from the endpoints, rather than through a VPN connection.
  • Cross-premises connections offer an enormous amount of flexibility. You can create multisite configurations, virtual network to virtual network configurations, ExpressRoute connections, and combinations of multiple configuration types. If you are extending your on-premises network to the cloud, this is the way to do it.
    Most cross-premises connections involve using a VPN device to create a secure connection to your Azure virtual network. Or if you prefer, you can create an ExpressRoute direct connection to Azure through your network service provider or exchange provider and bypass the public Internet altogether.

Feature References

How to create a virtual network

https://azure.microsoft.com/documentation/articles/virtual-networks-create-virtual network/

Limitations

Although the capabilities of Azure Virtual Machines are quite comprehensive, there are some native limitations that exist and should be understood by organizations prior to deploying solutions in Azure. These include:

Consideration | Impact | Workaround
Auto-scaling | The application environment does not automatically increase or decrease role instances for increased or decreased loads. Someone needs to manually monitor the environment. A sudden increase in load impacts performance. | Utilize monitoring and automation capabilities such as the Azure Monitoring Agent and Azure Automation to dynamically scale and deploy application code to cloud service instances in the environment.
Load balancing | Application instances are not load balanced by default. This does not allow for elasticity of the application environment. A sudden increase in load impacts performance. | After the cloud service is provisioned, create an Internal Load Balancer and associate it with the cloud service endpoint.
Density | The total cloud services per subscription is 20. | Leverage multiple subscriptions to provide the proper level of segmentation.
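As a rough sketch of the decision logic that an automation job for the auto-scaling workaround might apply, the following Python function computes a target instance count from an observed average CPU value. The thresholds, the maximum, and the function name are illustrative assumptions. The minimum of two instances reflects the earlier guidance to run at least two instances. Collecting the metric and applying the new count are outside the scope of the sketch.

```python
import math


def desired_instance_count(current, avg_cpu_percent, target_cpu_percent=60.0,
                           minimum=2, maximum=10):
    """Return the instance count that would bring average CPU near the target."""
    if current < 1:
        raise ValueError("current must be at least 1")
    # Total load is roughly current * avg_cpu; divide by the per-instance target.
    needed = math.ceil(current * avg_cpu_percent / target_cpu_percent)
    return max(minimum, min(maximum, needed))
```

With four instances averaging 90 percent CPU and a 60 percent target, the function suggests six instances. With the same four instances averaging 15 percent, it falls back to the minimum of two.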

Management Considerations for Azure PaaS Cloud Services

Azure Diagnostics are Azure extensions that enable you to collect diagnostic telemetry data from a worker role, web role, or virtual machine running in Azure. The telemetry data is stored in an Azure Storage account and can be used for debugging and troubleshooting, measuring performance, monitoring resource usage, traffic analysis, capacity planning, and auditing.

Azure Diagnostics can collect the following types of telemetry:

Data Source | Description
IIS logs | Information about IIS websites.
Azure Diagnostics infrastructure logs | Information about Azure Diagnostics.
IIS failed request logs | Information about failed requests to an IIS site or application.
Windows Event logs | Information sent to the Windows event logging system.
Performance counters | Operating system and custom performance counters.
Crash dumps | Information about the state of the process in the event of an application crash.
Custom error logs | Logs created by your application or service.
.NET EventSource | Events generated by your code using the .NET EventSource class.
Manifest-based ETW | ETW events generated by any process.

Microsoft Azure Networking

Microsoft Azure networking leverages a combination of software-defined networking within the Azure fabric infrastructure and physical networking at the edge where customers interface with Azure. Within the Azure fabric infrastructure, there is the concept of virtual networks, subnets within the virtual networks, and the network gateways that allow connectivity between virtual networks and customer networks.

At the edge of the Azure fabric infrastructure, enterprise customers typically use physical devices to provide communications from their on-premises datacenter environments, whereas small or medium businesses might use virtual devices or connect only from a client computer to the Azure environment.

This section covers the concepts and planning guidance required for networking infrastructures that interface with and exist within the Microsoft Azure platform.

Cloud Service Provider and Enterprise Customer Connectivity

Connecting to Azure can be accomplished directly by enterprise customers or by using a cloud service provider as the interface. When customers connect directly, they create subscriptions, establish connections, manage the private network interfaces, and manage all aspects of establishing services within Azure. When customers leverage a cloud service provider, they offload various aspects of the subscription, networking, identity, and management of the Azure environment to the cloud service provider.

From a networking perspective, cloud service providers offer two types of network connectivity to Azure for customers.

  • The customer can choose to "connect through" the CSP to connect to Azure. In this model, the CSP creates the customer subscription, connects their datacenter to the subscription, and then the customer connects to the CSP network to access Azure and the subscription's resources.

  • The customer can also choose a "connect to" model. The CSP creates the customer's subscription, but the customer is responsible for connecting their datacenter to Azure. The CSP connects to the customer network to obtain access to the Azure subscription and assists the customer in management aspects.

Azure Resource Management versus Service Management

When leveraging Azure networking technologies, a key consideration is which capabilities can be implemented, depending on the API and portal approach that you are targeting. Before a capability can be used, the implementation API (ARM or ASM) has to enable that capability. Most common Azure networking technologies are available using the ASM API; however, advanced capabilities such as logging or diagnostics data are implemented only in the ARM API. While the transition between the ASM and ARM APIs continues, it is important to verify that a specific area of functionality is available in the API you plan to use before committing to a path of implementation.

For customer-managed environments, the customer can choose to use the existing ASM API and migrate to the ARM API when the required networking capabilities become available in ARM. Alternatively, the customer can adopt the ARM API immediately, with the understanding that certain capabilities cannot be leveraged until they are made available.

For CSP-managed scenarios, the networking capabilities are limited to the ARM API due to the requirement for RBAC to separate management scope between the provider and the customer.

Virtual Networks

Azure Virtual Networks provide a key building block for establishing virtual private networks. Virtual networks can be used to allow isolated network communication within the Azure environment or establish cross-premises network communication between an organization's network infrastructure and Azure. By default, when virtual machines are created and connected to Azure Virtual Network, they are allowed to route to any subnet within the virtual network, and outbound access to the Internet is provided by Azure's Internet connection.

A fundamental first step in creating services within Microsoft Azure is establishing a Virtual Network. To establish a virtual private network within Azure, you must create a minimum of one virtual network. Each virtual network must contain an IP address space and a minimum of one subnet that leverages all or part of the virtual network address space. To establish remote network communications to on-premises or other virtual networks, a gateway subnet must be allocated for the virtual network and a virtual network gateway must be added to it.
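The address space rules above lend themselves to a quick planning check. The following Python sketch uses the standard ipaddress module to confirm that each planned subnet falls inside the virtual network address space and that no two subnets overlap. The subnet names and address ranges in the usage note are hypothetical examples.

```python
import ipaddress


def validate_virtual_network(address_space, subnets):
    """Check that every subnet lies inside the address space and that none overlap.

    Returns a list of problem descriptions; an empty list means the plan is consistent.
    """
    vnet = ipaddress.ip_network(address_space)
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = []
    if not parsed:
        problems.append("at least one subnet is required")
    for name, net in parsed.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside {vnet}")
    names = list(parsed)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if parsed[first].overlaps(parsed[second]):
                problems.append(f"{first} overlaps {second}")
    return problems
```

For example, validate_virtual_network("10.0.0.0/16", {"Frontend": "10.0.1.0/24", "Gateway": "10.0.255.0/29"}) returns an empty list, because both subnets sit inside the address space and do not overlap.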

To enable cross-premises connectivity, a virtual network gateway (often referred to as a gateway) must be attached to the virtual network. Currently, there are three types of gateways that can be deployed:

  • Static routing gateway (basic, standard, and high performance)
  • Dynamic routing gateway (basic, standard, and high performance)
  • ExpressRoute gateway

The type of gateway determines the cross-premises connectivity capabilities, the performance, and the features that are offered. Static and dynamic gateways are used when establishing Point-to-Site (P2S) and Site-to-Site (S2S) VPN connections where the cross-premises connectivity leverages the Internet for the transport path. ExpressRoute gateways are designed for high-speed, private, cross-premises connectivity where the traffic flows across dedicated circuits and not the Internet.

Static gateways are for establishing low-cost connections to a single virtual network in Azure. Dynamic gateways are used to establish low-cost connections to an on-premises environment or to connect multiple virtual networks for routing purposes in Azure. ExpressRoute gateways are used for connecting on-premises environments to Azure over high-speed private connections.

Feature References

Azure Virtual Network Overview

https://msdn.microsoft.com/library/azure/jj156007.aspx

Virtual Network FAQ

https://msdn.microsoft.com/en-us/library/azure/dn133803.aspx

Virtual Network Cross Premises Connectivity

https://msdn.microsoft.com/en-us/library/azure/dn133798.aspx

VPN Devices and Gateway Information

https://msdn.microsoft.com/en-us/library/azure/jj156075.aspx

ExpressRoute

http://azure.microsoft.com/en-us/documentation/services/expressroute/

Mandatory: Azure solutions must contain a minimum of one virtual network to establish network communications within Azure. A Virtual Network must contain a minimum of one subnet for virtual machine placement and one gateway subnet if cross premises network connectivity is required.

Proper network address space planning is required when implementing virtual networks and subnets.

Recommended: Azure solutions should use the dynamic routing or ExpressRoute gateway versus the static routing gateway.

Design Guidance

When you design Virtual Networks, consider the following:

Capability Considerations

Capability Decision Points

RBAC

The Virtual Network Contributor resource role allows the ability to manage the entire virtual network and subnets. The Virtual Machine Contributor role can be used to grant the ability to use a subnet but not manage it.

CSP management

CSP scenarios might drive additional virtual networks to give customers separate management capabilities.

Core limitations

Ensure that the virtual network design supports the number of virtual machines that are desired.

Virtual Network Gateways

Virtual network gateways provide connectivity from on-premises networks to Azure and between Azure virtual networks. Types of gateway connection technologies include Point-to-Site (P2S), Site-to-Site (S2S), and ExpressRoute. This section covers gateways in the context of Site-to-Site (S2S) and ExpressRoute connections.

For Site-to-Site gateways, an IPsec/IKE VPN tunnel is created between the virtual networks and the on-premises sites by using Internet Key Exchange (IKE) protocol handshakes. For ExpressRoute, the gateways advertise the prefixes by using the Border Gateway Protocol (BGP) in your virtual networks via the peering circuits. The gateways also forward packets from your ExpressRoute circuits to your virtual machines inside your virtual networks.

Currently there are two types of S2S virtual private network connections (VPNs) that require the use of two types of gateways: static routing and dynamic routing. A static routing gateway uses policy-based VPNs. Policy-based VPNs encrypt and route packets through an interface based on a customer-defined policy. The policy is usually defined as an access list. Static routing VPNs require a static routing VPN gateway. Although they are effective for single virtual network connections, static gateways are limited to a single virtual network per VPN connection.

In contrast, dynamic routing gateways use route-based VPNs. Route-based VPNs depend on a tunnel interface specifically created for forwarding packets. Any packet arriving at the tunnel interface is forwarded through the VPN connection. Dynamic routing VPNs require a dynamic routing VPN gateway.

From a performance perspective there are three types of dynamic routing gateways: Basic, Standard, and High-Performance. The differences between these gateway types are outlined in the following table.

Type | S2S connectivity | Authentication method | Maximum number of S2S vNet connections | Maximum number of P2S connections | S2S VPN throughput | ExpressRoute throughput
Basic Dynamic Routing Gateway | Route-based VPN configuration | Pre-shared key | 10 | 128 | ~100 Mbps | ~500 Mbps
Standard Dynamic Routing Gateway | Route-based VPN configuration | Pre-shared key | 10 | 128 | ~100 Mbps | ~1000 Mbps
High-Performance Dynamic Routing Gateway | Route-based VPN configuration | Pre-shared key | 30 | 128 | ~200 Mbps | ~2 Gbps

Creating and connecting a gateway for a Virtual Network is a multiple step process, and it requires certain configurations to be complete. A high-level set of steps is outlined here:

  1. Define a virtual network gateway subnet within the available IP address space within the virtual network.
  2. Create the virtual network gateway and allow it to provision.
  3. Establish the connection between locations.
    1. For S2S connections, you must download the supported VPN device configuration script (or instructions) and establish the connection. For more information, see About VPN Devices and Gateways for Virtual Network Connectivity.
    2. For virtual network-to-virtual network connections, you must define and register shared keys to establish the connection.
    3. For ExpressRoute connections, you must define a service key to establish connections.
  4. Initiate the connection handshake to connect the gateways.

After you have a successful gateway connection, the gateway status will show as active within the virtual network dashboard in the Azure portal. For ExpressRoute, S2S, and virtual network-to-virtual network connections, the portal also shows the gateway connection status, the Data-In and Data-Out traffic amounts, and the gateway address. Here is an example of this information shown in the Azure portal:

The configuration of an S2S or P2S virtual network gateway can be performed within the portal, but at this time, ExpressRoute gateways can only be provisioned and configured by using the ExpressRoute PowerShell module.

Note that a gateway can be resized up or down between Basic, Standard, and High-Performance with the Resize-AzureVNetGateway cmdlet. This allows organizations to start at one class of service and expand their capabilities as their requirements grow. Some downtime is required during the resizing process, but no other configuration changes are required.

Feature References

Configure a Cross-Premises Site-to-Site connection to an Azure Virtual Network

http://azure.microsoft.com/documentation/articles/vpn-gateway-site-to-site-create/

Configure a Virtual Network Gateway in the Management Portal

https://msdn.microsoft.com/en-us/library/azure/jj156210.aspx

Connect Multiple On-premises Sites to a Virtual Network

https://msdn.microsoft.com/en-us/library/azure/dn690124.aspx

Configure a Virtual Network and Gateway for ExpressRoute

http://azure.microsoft.com/documentation/articles/expressroute-configuring-vnet-gateway/

ExpressRoute Technical Overview

http://azure.microsoft.com/documentation/articles/expressroute-introduction/

ExpressRoute Prerequisites

http://azure.microsoft.com/documentation/articles/expressroute-prerequisites/

Coexistence Gateway

https://azure.microsoft.com/en-us/documentation/articles/expressroute-coexist/

Mandatory:

  • Connecting a virtual network to an on-premises environment or to another virtual network requires the creation of a virtual network gateway.
  • Only a single gateway can be attached to a virtual network.

Recommended:

  • Because gateways take time to be provisioned and they must be accessible to establish the handshake, create the gateway as soon as possible after the virtual network is created.
  • Use a high-performance gateway for ExpressRoute connections of 1 Gbps or higher.
  • Use a high-performance dynamic routing gateway in S2S scenarios if more than 10 virtual network connections are required per gateway or if more than 100 Mbps of throughput is required.
  • Consider using a coexistence gateway as the default gateway for all ExpressRoute circuits.
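The sizing recommendations above can be expressed as a simple lookup. The following Python sketch picks the smallest dynamic routing gateway type whose limits cover a set of requirements. The limits are transcribed from the gateway table earlier in this section, where the throughput figures are approximate. The function name and parameters are hypothetical.

```python
# Limits transcribed from the dynamic routing gateway table in this section.
GATEWAY_TIERS = [
    # (name, max S2S connections, S2S VPN Mbps, ExpressRoute Mbps)
    ("Basic", 10, 100, 500),
    ("Standard", 10, 100, 1000),
    ("High-Performance", 30, 200, 2000),
]


def smallest_gateway(s2s_connections=0, s2s_mbps=0, expressroute_mbps=0):
    """Return the first tier whose published limits cover the requirements, or None."""
    for name, max_connections, max_s2s, max_er in GATEWAY_TIERS:
        if (s2s_connections <= max_connections
                and s2s_mbps <= max_s2s
                and expressroute_mbps <= max_er):
            return name
    return None
```

Because the published throughput values are approximate, treat the result as a lower bound and leave headroom. For ExpressRoute circuits of 1 Gbps or higher, the recommendation above to use a high-performance gateway takes precedence over this lookup.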
   

Design Guidance

For more information about applicable Azure design patterns, see Hybrid Networking (Azure Architecture Patterns).

When you design virtual network gateways, consider the following:

Capability Considerations

Capability Decision Points

Gateway provisioning performance

Note that when you create a gateway, provisioning can take 15 to 30 minutes before the gateway is available.

Gateway limits

A virtual network can have a maximum of a single gateway attached to it.

Static routing gateway limits

Multi-site, virtual network to virtual network, and Point-to-Site gateway connection technologies are not supported with static routing VPN gateways.

Co-existence gateway

A new co-existence gateway exists that combines both the BGP and IKE protocols (ExpressRoute and S2S connections). With this gateway, it is possible to support ExpressRoute and S2S VPN connections to a single virtual network.

The coexistence gateway supports two modes of operation: failover and coexistence.

  • In failover mode, the ExpressRoute side handles all traffic until a failure occurs and then the S2S gateway takes over.
  • In coexistence mode, the gateway allows a customer to leverage a high-speed ExpressRoute connection to provide ingress and egress to Azure while also providing virtual network-to-virtual network connections across the Azure network fabric.

Gateway performance remains the same for the connection types.

Cisco ASA VPN Device

The most common VPN device that customers use is a Cisco ASA device. This device does not currently support dynamic routing and we do not support multiple policy configuration with a static routing gateway, so only a single virtual network can be connected to a Cisco ASA VPN device.

CSP Limits

CSP scenarios currently only support S2S VPN gateways due to limitations of the ARM API.

Virtual Network Gateway Address Requirements

In the previous section, we discussed that there are three types of gateways available today in Azure. Gateways are used to connect on-premises environments to Azure and to enable virtual network-to-virtual network connectivity. Gateways are created at the virtual network level, and each virtual network can have only a single gateway connected to it.

Regardless of which gateway type you create, you must have a gateway subnet defined for the virtual network. The gateway subnet has different address space requirements based on the type of gateway created.

A static routing gateway or a dynamic routing gateway must have a subnet with a /29 CIDR definition. When the gateway is connected, it actually takes the /29 segment and breaks it into two /30 segments to provide redundant connections as part of the Site-to-Site VPN. The address requirements are the same for a standard and a high-performance static or dynamic routing gateway.

An ExpressRoute gateway must have a subnet with a /28 CIDR definition. When the ExpressRoute gateway is established, it breaks the /28 into two /29 segments that are used to provide the redundant connections as part of the ExpressRoute circuit establishment.
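
The address arithmetic described above can be checked with a short sketch. The example uses Python's standard ipaddress module. The 10.0.255.x prefixes and the function name are hypothetical and chosen only for illustration, and the code does not call any Azure API. It simply shows how a /29 divides into two /30 segments and a /28 into two /29 segments.

```python
import ipaddress

def split_gateway_subnet(cidr):
    """Split a gateway subnet into two equal halves (one prefix bit longer)."""
    network = ipaddress.ip_network(cidr)
    return [str(half) for half in network.subnets(prefixlen_diff=1)]

# Hypothetical gateway subnets, for illustration only.
vpn_halves = split_gateway_subnet("10.0.255.0/29")   # static or dynamic routing gateway
er_halves = split_gateway_subnet("10.0.255.16/28")   # ExpressRoute gateway

print(vpn_halves)
print(er_halves)
```

A check like this can be run against an address plan before the gateway subnet is created, to confirm that the chosen prefix is the right size for the intended gateway type.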

Mandatory:

  • An Azure gateway requires the creation of a gateway subnet in a virtual network to create the gateway. The gateway subnet must meet the address requirements based on the type of connection the gateway will support (S2S or ExpressRoute).
  • Address prefixes used in Azure or for Azure connectivity must be non-overlapping.
  • CSP scenarios require address planning with the customer to ensure that non-overlapping address prefixes are used.

Recommended:

  • Develop an address prefix plan for Azure, for example a spreadsheet that pre-allocates address space for the virtual network architecture.
  • Plan for production, pre-production, and non-production environments.

Virtual Network to Virtual Network Routing

Virtual network-to-virtual network routing allows establishing routing paths across the Azure network fabric without having to send the traffic on-premises. Establishing virtual network-to-virtual network routing requires creating an IPsec tunnel and dynamic routing across that segment.

The static and dynamic routing gateways use IKE protocol to establish an IPsec tunnel and route traffic, but only the dynamic routing gateway supports dynamic routing. Based on those requirements, the static routing gateway does not support virtual network-to-virtual network routing. Every virtual network-to-virtual network segment requires a dynamic routing gateway on both ends.

The ExpressRoute gateway leverages the BGP routing protocol to establish the communication and route traffic, and therefore, it does not meet the requirements to establish a virtual network-to-virtual network connection. This requires that a separate connection to each virtual network must be established when using ExpressRoute. In addition, traffic between virtual machines on different virtual networks must route from Azure to the edge of the ExpressRoute circuit and back to Azure.

When you establish virtual network routing, by default a virtual network allows traffic to flow across only a single virtual network-to-virtual network gateway connection. This is an isolation feature: communication across more than one hop requires explicitly defining multiple hop routing.

The following section discusses the different options of connecting virtual networks together to support routing scenarios. Note that it is possible to provision ASM and ARM versions of virtual networks. The process for connecting both versions is slightly different.

Multiple Virtual Network Routing Configuration

Each gateway has a limited number of other gateway connections that it can establish. The connection model between gateways dictates how far you can route within Azure. There are three distinct models that you can leverage to connect multiple virtual networks to one another:

  • Mesh
  • Hub and Spoke
  • Daisy-Chain

By default, in the Mesh approach, every virtual network can talk to every other virtual network with a single hop. Therefore, this approach does not require you to define multiple hop routing. Challenges with this approach include the rapid consumption of gateway connections, which limit the size of the virtual network routing capability.

By default, in the Hub and Spoke approach (as illustrated in the previous example) a virtual machine on vNet1 will be able to communicate to a virtual machine on vNet2, vNet3, vNet4, or vNet5. A virtual machine on vNet2 could talk to virtual machines on vNet1, but not a virtual machine on vNet3, vNet4, or vNet5. This is due to the default single hop isolation of the virtual network in this configuration.

By default, in a Daisy-Chain approach, a virtual machine on vNet1 can communicate to a virtual machine on vNet2, but not vNet3, vNet4 or vNet5. A virtual machine on vNet2 could talk to virtual machines on vNet1 and vNet3. The same virtual network single hop isolation applies.
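
The default single hop behavior of the three models can be sketched as a small graph exercise. The following Python example is illustrative only. The vNet names match the text, but the way each model is wired for five virtual networks (for example, vNet1 acting as the hub) is an assumption, and nothing here calls Azure.

```python
from itertools import combinations

VNETS = ["vNet1", "vNet2", "vNet3", "vNet4", "vNet5"]

def mesh(vnets):
    """Every virtual network connects directly to every other one."""
    return set(frozenset(pair) for pair in combinations(vnets, 2))

def hub_and_spoke(vnets):
    """The first virtual network is the hub; all others connect only to it."""
    hub = vnets[0]
    return set(frozenset((hub, spoke)) for spoke in vnets[1:])

def daisy_chain(vnets):
    """Each virtual network connects only to the next one in the list."""
    return set(frozenset(pair) for pair in zip(vnets, vnets[1:]))

def single_hop_peers(vnet, connections):
    """Virtual networks reachable across exactly one gateway connection."""
    return sorted(other for link in connections if vnet in link
                  for other in link if other != vnet)

def connections_per_gateway(vnet, connections):
    """Number of gateway connections consumed on one virtual network's gateway."""
    return sum(1 for link in connections if vnet in link)

for name, build in [("Mesh", mesh), ("Hub and Spoke", hub_and_spoke), ("Daisy-Chain", daisy_chain)]:
    links = build(VNETS)
    print(name, "| vNet2 reaches:", single_hop_peers("vNet2", links),
          "| vNet1 gateway connections:", connections_per_gateway("vNet1", links))
```

The connection count per gateway makes the trade-off visible: in the mesh model every gateway holds a connection to every other virtual network, while the hub and spoke and daisy-chain models hold fewer connections but need multiple hop routing for full reachability.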

Feature References

Connecting ASM virtual networks to ARM virtual networks

https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-arm-asm-s2s/

Mandatory:

  • Multiple hop routing requires selecting a connection model and modifying the default single hop routing.
  • It is possible to establish virtual network to virtual network connections between ASM and ARM virtual networks.

Recommended:

  • Do not use the Mesh connection option because of the limitations on expandability due to gateway connection limits.
  • For higher virtual network gateway connection limits, deploy the high-performance gateway.

Design Guidance

For more information about applicable Azure design patterns, see Hybrid Networking (Azure Architecture Patterns).

Additional implementation guidance and examples are provided in the Appendix of this document.

Virtual Network IP Address Space Planning

A Virtual Network in Azure is an address space container that can have a gateway connected to it to allow communications. As part of the Virtual Network configuration, customers must configure non-overlapping IP address space for their Azure environment.

This IP address space can consist of private IPv4 address ranges (as described in RFC 1918) or public (non-RFC 1918) IPv4 address ranges owned by the organization. Exceptions to public address ranges include:

  • 224.0.0.0/4 (multicast)
  • 255.255.255.255/32 (broadcast)
  • 127.0.0.0/8 (loopback)
  • 169.254.0.0/16 (link-local)
  • 168.63.129.16/32 (internal DNS)

A virtual network address space can be subdivided into smaller groups of address spaces called subnets. Subnets are the connection points for virtual machines and specific PaaS roles, not the virtual network. The subnets are connected to the virtual network and part of a flat routed network where traffic that flows through the gateway will reach each subnet.

There are two types of subnets that can be created:

  • Virtual machine subnet
  • Gateway subnet

Virtual machine subnets can have virtual machines, PaaS roles (web and worker), and internal load balancers. Gateway subnets can have connections only with other gateways—provided they are using non-overlapping IP address spaces.

Feature References

Non-RFC 1918 space now allowed in a virtual network

http://azure.microsoft.com/en-us/updates/non-rfc-1918-space-is-now-allowed-in-a-virtual-network

About Public IP Address Space and Virtual Network

https://azure.microsoft.com/documentation/articles/virtual-networks-public-ip-within-vnet/

Microsoft Azure Datacenter IP Ranges

http://www.microsoft.com/en-us/download/details.aspx?id=41653

Mandatory:

  • When designing Azure Virtual Network infrastructures, planning for IP address spaces is a required initial step prior to deploying and configuring virtual networks.
  • For CSP "Connect Through" scenarios, the provider will control the IP address space, but must coordinate with the customer to achieve non-overlapping space.
  • For CSP "Connect To" scenarios the customer will control the IP address space.

Design Guidance

When you design virtual network IP address spaces, consider the following:

Capability Considerations

Capability Decision Points

Virtual network and subnet Configuration

Although there is a limit on the number of virtual networks that can be placed in a subscription, the number of subnets is limited only by how finely the address space of the virtual network can be subdivided.

Each subnet has the first three addresses reserved for Azure usage, so the first available address is the fourth address. This means that the smallest subnet can have a CIDR of /29: of its six host addresses, three are reserved, which leaves three assignable addresses.

It is possible to have multiple IP address space definitions in a virtual network definition.

Virtual machines and address space planning

Currently, a virtual network can have a total of 2048 virtual machines attached to subnets. By default, every virtual machine has a single network adapter, and therefore, the virtual network space needs a minimum of 2048 IP addresses (plus the three for Azure) if you are going to maximize the density of the virtual network.

Address space considerations

When designing an address space for a virtual network, consider the following:

  • Limits to the number of objects that can consume IP addresses in a virtual machine subnet
  • Requirements of the gateway subnet based on the type of gateway connection

Planning for internal load balancing

Each virtual machine can have multiple network adapters, and if internal load balancers are used, you also need a single IP address for every internal load balancer. So a formula would be:

Virtual network address space = # of Virtual machines + # of additional network adapters + # of internal load balancers + 3

Note that you need to round up this number to an IP address CIDR border. For example, if the formula results in a minimum requirement of 8003 addresses, you must round up to the next CIDR border of /19, which is 8190 addresses for the virtual network address space.
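
The formula and the rounding step can be expressed as a short calculation. The sketch below is a minimal Python illustration, not an Azure tool. The workload numbers are hypothetical and chosen so that the total matches the 8003 example above. The usable-address count is taken as 2^(32 - prefix) - 2, which is an assumption inferred from the 8190 figure quoted for a /19.

```python
def required_addresses(vms, extra_nics, internal_lbs):
    """Apply the formula from the text: VMs + additional adapters + ILBs + 3."""
    return vms + extra_nics + internal_lbs + 3

def smallest_prefix(required):
    """Return the longest IPv4 prefix whose usable address count covers `required`.

    Usable addresses are counted as 2**(32 - prefix) - 2, matching the
    8190 figure the text gives for a /19.
    """
    for prefix in range(30, -1, -1):
        if 2 ** (32 - prefix) - 2 >= required:
            return prefix
    raise ValueError("requirement exceeds the IPv4 address space")

# Hypothetical workload chosen so the total matches the 8003 example.
need = required_addresses(vms=7000, extra_nics=900, internal_lbs=100)
prefix = smallest_prefix(need)
print(need, "/" + str(prefix), 2 ** (32 - prefix) - 2)
```

Running the same calculation for each planned virtual network gives the minimum prefix length to record in the address plan. Allowing extra headroom for growth is a separate design choice that the formula does not cover.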

Duplicate or overlapping IP ranges

One limitation of address space design is that no duplicate IP address ranges can exist in any routed network. This means that you cannot use the same address space for an Azure virtual network or subnet that already exists somewhere else (such as on premises) where the Azure subnets need to route.

CSPs need to ensure that customers do not implement overlapping address spaces in Azure.
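
An address plan can be checked for overlaps before any virtual network is created. The following is a minimal sketch using Python's standard ipaddress module. The plan entries, names, and ranges are hypothetical, and one collision is included deliberately to show what the check reports.

```python
import ipaddress
from itertools import combinations

def find_overlaps(prefixes):
    """Return every pair of named prefixes whose address ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in prefixes.items()}
    return [(a, b) for a, b in combinations(sorted(networks), 2)
            if networks[a].overlaps(networks[b])]

# Hypothetical address plan; names and ranges are illustrative only.
plan = {
    "on-premises": "10.0.0.0/16",
    "vnet-prod": "10.1.0.0/16",
    "vnet-preprod": "10.2.0.0/16",
    "vnet-dev": "10.1.128.0/17",   # deliberately collides with vnet-prod
}

print(find_overlaps(plan))
```

In a CSP scenario, the same check could be run over the combined provider and customer ranges, because both sides of the routed network must remain non-overlapping.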

Virtual Network Logging

Currently, virtual network logging is limited to change management (create, modify, and delete) audit logging. In the Azure portal, it is available via Management Services > Operations Logs. You can also use the Azure PowerShell cmdlet Get-AzureSubscriptionIDLog for:

  • virtualNetworks (write, delete)
  • publicIPAddresses (write, delete)
  • networkInterfaces (write, delete)
  • loadBalancers (write, delete)
  • networkSecurityGroups (write, delete)

Data plane and control plane logging is not available at this time.

Design Guidance

Consider the impact of logging virtual network information for customers who require regulatory compliance (such as PCI) or other operational requirements.

Network Connectivity

Azure supports two types of connectivity options to connect customer networks to Azure virtual networks: Site-to-Site VPN and ExpressRoute. Although Point-to-Site is another viable connectivity option, it is client-focused and is not specific to this discussion.

Site-to-Site VPN connections use VPN devices over public Internet connections to create a path to route traffic to a virtual network in a customer subscription. Traffic to the virtual network flows across an encrypted VPN connection, while traffic to the Azure public services flows over the Internet.

It is not possible to create a Site-to-Site VPN connection that provides direct connectivity to the public Azure services via a public peering path. To provide multiple VPN connections to the virtual network, you must use multiple VPN devices connected to different sites. These relationships are depicted in the following diagram:

If a customer chooses to engage a cloud service provider in a "connect through" scenario, the customer connects to the CSP network over a S2S VPN and the CSP is connected to Azure over separate S2S VPN connections.

If a customer chooses to engage a cloud service provider in a "connect to" scenario, the customer connects to the Azure network over a S2S VPN and the CSP is connected to the customer's network over a separate S2S VPN connection that allows the CSP to manage the Azure subscription and resources on behalf of the customer.

ExpressRoute connections use routers and private network paths to route traffic to Azure Virtual Network, and optionally, to the Azure public services. Private connections are made through a network provider by establishing an ExpressRoute circuit with a selected provider. The customer's router is connected to the provider's router, and the provider creates the ExpressRoute circuit to connect to the Azure routers.

When the circuit is created, VLANs can be created that allow separate paths to the private peering network to link to virtual networks and to the public peering network to access Azure public services.

Design Guidance

Cloud Service Provider scenarios are currently limited to Site-to-Site connection options due to the current lack of support for ExpressRoute in ARM.

ExpressRoute Overview

ExpressRoute is a high-speed private routed network connection to Azure. The connections between the customer's network edge and the provider's network edge are redundant as are the connections from the provider's edge to the Azure edge.

From the provider to the Azure edge, you can have private peering connections to customer virtual networks and public peering connections to the Azure PaaS services, such as Azure SQL Database. There are two carrier models provided for ExpressRoute: Network Service Providers (NSPs) and Exchange Providers (IXPs). NSP and IXP connectivity models, speeds, costs and capacities vary. These differences are summarized in the following table:

 

| | Network Service Provider | Exchange Provider |
|---|---|---|
| Bandwidth | 10, 50, 100, 500, 1000 Mbps | 200, 500, 1000, 10000 Mbps |
| Route management | Provider manages | Customer manages |
| High availability | Provider manages | Customer manages |
| MPLS support | Yes | No |
| Azure circuit costs | Ingress and egress included in monthly fee | Ingress and egress allocation included in monthly fee and based on consumption |
| Provider circuit costs | Based on consumption; some provide all-inclusive plans | Based on consumption |

Feature References

Configure an ExpressRoute Connection through a Network Service Provider

http://azure.microsoft.com/documentation/articles/expressroute-configuring-nsps/

Configure an ExpressRoute Connection through an Exchange Provider

http://azure.microsoft.com/documentation/articles/expressroute-configuring-exps/

ExpressRoute Whitepaper with detailed steps for connecting via IXP model

http://download.microsoft.com/download/0/F/B/0FBFAA46-2BFD-478F-8E56-7BF3C672DF9D/Microsoft%20Azure%20ExpressRoute.pdf

ExpressRoute Public Peering Design Considerations

Establishing a connection to the public peering network allows virtual machines on Azure Virtual Networks and on-premises systems to leverage the ExpressRoute circuit to connect to Azure PaaS services on the public peering network without traversing the Internet.

Establishing a public peering connection is an optional configuration step for an ExpressRoute circuit. When the public peering connection is established, the routes for all the Azure datacenters worldwide are published to the edge router. This directs traffic to the Azure services instead of going out to the Internet.

The interface between the Azure public services and the customer's network is protected by redundant NAT firewalls. These NAT devices allow customers' systems to access the Azure public services, but they only allow stateful traffic back to the customer's networks.

This interaction is outlined in the following diagram:

Design Guidance

When you design ExpressRoute peering, consider the following:

Capability Consideration

Capability Decision Points

Azure Services in the datacenter

All Azure services reside within an Azure datacenter and are assigned routable IP addresses.

Public peering services

From a design perspective, any Azure public service only sees the NAT device address. If the Azure public service provides firewall protection, only the NAT addresses can be used in the firewall rules.

From a security perspective, specifying the NAT addresses will prevent connections from the Internet for the customer's instance of that public service. This also means that any system behind the NAT can access the public service, which may not be desired from a security perspective.

ExpressRoute Performance Design Considerations

ExpressRoute circuits provide a private path to route traffic to the Azure datacenter. When the traffic reaches the Azure edge device, it must leverage the software-defined routing within the Azure datacenter to isolate traffic. Currently, ExpressRoute connections from the customer's datacenter to the Azure edge can achieve up to 10 Gbps.

At that edge, virtual connections are established to the customer's private virtual network gateways to enable routing traffic. Today the maximum performance that a single virtual network gateway can provide is 2 Gbps. To optimize traffic through the ExpressRoute circuit, it may be required to leverage multiple virtual networks and gateways.
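
As a rough planning aid, the relationship between circuit bandwidth and the per-gateway ceiling can be worked out directly. The sketch below is a simple Python calculation under stated assumptions: it uses the 2 Gbps per-gateway figure quoted above, treats throughput as evenly divisible across gateways, and ignores gateway SKU differences and bursting. The function name is hypothetical.

```python
import math

GATEWAY_MAX_GBPS = 2.0  # per-gateway ceiling quoted in the text

def gateways_to_fill_circuit(circuit_gbps, gateway_gbps=GATEWAY_MAX_GBPS):
    """Minimum number of gateways whose combined ceiling reaches the circuit rate."""
    return math.ceil(circuit_gbps / gateway_gbps)

for circuit in (1, 2, 10):
    print(circuit, "Gbps circuit ->", gateways_to_fill_circuit(circuit), "gateway(s)")
```

Because each virtual network has a single gateway, the result is also the minimum number of virtual networks across which the workload would need to be spread to use the full circuit.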

Customers have the ability to purchase the optimal ExpressRoute circuit bandwidth to meet their throughput requirements. Circuits can be upgraded to provide additional performance with minimal impact. Circuits cannot be downgraded without impact.

Design Guidance

When you design for ExpressRoute performance, consider the following:

Capability Consideration

Capability Decision Points

Bursting traffic

ExpressRoute circuits allow traffic to burst up to two times the rated bandwidth of the circuit. Gateways also support this bursting capability. Gateways and circuits drop packets if the burst limit is exceeded.

Standard versus Premium ExpressRoute

ExpressRoute comes in two SKUs: Standard and Premium. Although the performance of the ExpressRoute circuit does not change, the number of routes, the number of virtual network connections per circuit, and the ability to route traffic across Azure regions are upgraded when using Premium.

Gateway performance

Gateways come in three SKUs: Basic, Standard, and High Performance. The maximum speed of the gateway is a function of the SKU and affects the performance that you can achieve over an ExpressRoute circuit to a single virtual network.

ExpressRoute Cost Design Considerations

ExpressRoute connectivity pricing is made up of two components: the service connection costs (Azure) and the authorized carrier costs (telco partner). Customers are charged by Azure for the ExpressRoute monthly access fee, and potentially an egress traffic fee based on the type and performance of the ExpressRoute connection. Customers also have costs associated with the selected provider, which typically comprise the circuit connection and monthly traffic fees.

From an Azure perspective, an NSP connection is an inclusive plan where customers are charged a monthly fee and get unlimited ingress and egress traffic. Fees associated with IXP connections include a monthly service charge and potential traffic egress charges when a high watermark of traffic is exceeded. In these cases, the customer is charged an additional fee based on the amount of egress traffic above the included amount.

Feature References

Azure ExpressRoute cost information

http://azure.microsoft.com/en-us/pricing/details/expressroute/

Recommended: When planning for network connectivity with ExpressRoute, ensure that the costs are well understood and that conversations with authorized carriers are addressed early in the planning process.

Design Guidance

When you design for ExpressRoute costs, consider the following:

Capability Consideration

Capability Decision Points

Provider costs

The provider costs are much harder to ascertain because egress traffic is inclusive and additional monthly circuit fees typically apply. NSP provider costs can range from a flat monthly fee per gigabyte to a premium service that is a large flat monthly fee. Potential additional costs may include the number of MPLS circuits that have been configured.

IXP connections

An IXP provider connection is a fiber connection, and it typically includes monthly and one-time fiber connection fees. The customer's connection typically includes the fiber from the datacenter to the provider's access point, the costs for transmitting the traffic to the consolidation point of presence, and a "last mile" fiber connection to get connected to the Azure datacenter.

NSP connections

For the NSP model, the provider typically provides and manages the provider edge routers and the configuration and management of the published routes. However, for an IXP model, the customer must provide the router that is placed at the provider access point and manage all the route publishing.

The advantage of the IXP model is that the customer is typically given a rack and is allowed to place hardware in addition to the router. This allows the customer to include security hardware and other appliances.

ExpressRoute Premium

ExpressRoute Premium is an add-on package that allows an increase in the number of BGP routes, allows for global connectivity, and increases the number of virtual networks per ExpressRoute circuit. This add-on can be applied to Network Service Provider or Exchange Provider circuits.

Summary of ExpressRoute Premium features:

  • Increased route limits for public and private peering (from 4,000 routes to 10,000 routes).
  • Global connectivity for services. An ExpressRoute circuit created in any region (excluding China and government cloud) will have access to resources across any other region in the world. For example, a virtual network created in West Europe can be accessed through an ExpressRoute circuit provisioned in the West US region.
  • Increased number of virtual network links per ExpressRoute circuit (from 10 to a larger limit, depending on the bandwidth of the circuit).

Feature References

Azure ExpressRoute Premium Circuit connection information

https://azure.microsoft.com/en-us/documentation/articles/expressroute-faqs/

Design Guidance

When you design for ExpressRoute Premium, consider the following:

Capability Consideration

Capability Decision Points

Service availability and access

Although ExpressRoute Premium is available in regions such as India and Australia, to leverage the cross virtual network connectivity, you must have a business presence within the country and a local Azure billing account to establish a cross-region virtual network connection.

Microsoft Azure Site-to-Site VPN

Microsoft Site-to-Site (S2S) connectivity allows low cost connections from customer locations to Azure private peering networks. S2S leverages the Internet for transport and IPsec encryption to protect the data flowing across the connection.

Requirements:

  • Public facing IPv4 address for the on-premises VPN device that is not behind a NAT
  • Compatible hardware VPN device or RRAS

Potential Use Cases:

  • CSP Connectivity
  • Branch office connectivity
  • Low cost primary datacenter connection where you do not want to require configuration at the client level
  • Virtual network-to-virtual network routing within an Azure datacenter

Feature References

VPN Device information

https://msdn.microsoft.com/en-us/library/azure/jj156075.aspx

Configuring Multiple Sites Connectivity

https://msdn.microsoft.com/en-us/library/azure/dn690124.aspx

Configure a Cross-Premises Site-to-Site connection to an Azure Virtual Network

http://azure.microsoft.com/documentation/articles/vpn-gateway-site-to-site-create/

Mandatory:

  • A dedicated IPv4 address is required for the on-premises VPN device to establish a S2S VPN connection
  • For CSP scenarios, a separate S2S VPN device is required for the CSP connection

Recommended:

  • If automating the creation of the gateway for the S2S VPN, specify your own shared key rather than retrieving a shared key from Azure.
  • Always use encrypted VPN connections for S2S VPNs.
  • Always use VPN devices that support dynamic routing.
  • Leverage multi-site S2S support to provide redundant paths to a virtual network.
  • Leverage the high-performance gateways to maximize virtual network-to-virtual network connections and to obtain the high-performance connection for S2S scenarios.

Optional: Leverage the New-GUID cmdlet to generate a complex shared key
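
For automation outside PowerShell, a comparable key can be produced with any cryptographically secure random source. The sketch below is an alternative to the New-GUID approach, not a reproduction of it. It uses Python's standard secrets module, and the 32-character length and alphanumeric alphabet are illustrative assumptions rather than documented Azure requirements. Confirm the key format that your VPN device and gateway accept before using it.

```python
import secrets
import string

# Assumed character set for a pre-shared key: letters and digits only.
ALPHABET = string.ascii_letters + string.digits

def generate_shared_key(length=32):
    """Return a random alphanumeric pre-shared key of the requested length."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

key = generate_shared_key()
print(len(key))
```

Generating the key locally means the same value can be supplied to both the gateway creation step and the on-premises VPN device configuration without retrieving it from Azure afterward.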

Design Guidance

When you design for Site-to-Site Connections, consider the following:

Capability Consideration

Capability Decision Points

S2S VPN performance

A maximum of 200 Mbps connection per VPN at the gateway interface to the virtual network regardless of the Internet connection speed

On-premises VPN device

The supported VPN device used determines virtual network routing capability (static or dynamic routing)

Shared keys

Shared keys are required for establishing site-to-site connectivity

Multi-site support

You must determine if multiple on-premises sites can access a single virtual network gateway.

Public peering

S2S VPNs do not have access to the public peering network to connect to Azure services.

Microsoft Azure Point-to-Site VPN

Microsoft Point-to-Site (P2S) connectivity allows low cost connections from customer workstations to Azure private peering networks. P2S leverages the Internet for transport and certificate-based encryption to protect the data flowing across the connection. A VPN device or a public facing IPv4 address is not required to establish a P2S VPN connection.

Requirements:

  • Microsoft VPN client installed on the workstation with a supported operating system
  • Outbound Internet access
  • Root certificate installed in Azure to support encryption
  • Client certificate installed on the workstation
  • Virtual network with a dynamic routing gateway

Potential use cases:

  • Developers accessing virtual networks without any other dedicated network connectivity
  • Companies that have no datacenter or branch offices where they can place a VPN device to establish a S2S connection
  • Temporary connection while you are away from your S2S network
  • No public IPv4 address is available to establish a S2S VPN

Feature References

Configure a Cross-Premises Point-to-Site connection to an Azure Virtual Network

http://azure.microsoft.com/documentation/articles/vpn-gateway-point-to-site-create/


    Mandatory: The following are required for implementation of Point-to-Site VPN connections:

    • A certificate to encrypt the connection
    • Microsoft VPN client package installed on the workstation
    • P2S is only supported with a dynamic routing gateway

    Design Guidance

    When you design for Point-to-Site connections, consider the following:

    Capability Consideration

    Capability Decision Points

    P2S VPN limitations

    There is a maximum of 128 P2S VPN connections per virtual network. At the time of writing, the client package is available for x86 and x64 Windows clients.

    Certificate requirements

    Self-signed or Enterprise Certification Authority (CA) certificates must be used

    Interoperability with ExpressRoute

    You cannot leverage P2S connections with a virtual network connected to an ExpressRoute circuit due to existing gateway limitations.

    Forced Tunneling

    Forced tunneling allows you to specify the default route for one or more virtual networks to be the on-premises VPN or ExpressRoute gateway. This is implemented by publishing a 0.0.0.0/0 route that points to that gateway. In effect, any packet transmitted from a virtual machine on the virtual network that is not destined for another IP address within the scope of the virtual network is sent to that default gateway.

    When using forced tunneling, any outbound packet that is attempting to go to an Internet address will be routed to the default gateway and not to the Azure Internet interface. For a virtual machine that has a public endpoint defined that allows inbound traffic, a packet from the Internet will be able to enter the virtual machine on the defined port. A response might be sent, but the reply will not go back out the public endpoint to the Internet. Rather, it will be routed to the default gateway. If the default gateway does not have a route path to the Internet, the packets will be dropped, effectively blocking any Internet access.
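
The effect of the published 0.0.0.0/0 route can be illustrated with a longest-prefix-match lookup, which is the general rule routers use to choose between overlapping routes. The following Python sketch is a simplified model, not Azure's routing implementation. The route table, next hop names, and addresses are hypothetical.

```python
import ipaddress

def next_hop(destination, routes):
    """Pick the route with the longest matching prefix for a destination address."""
    address = ipaddress.ip_address(destination)
    matches = [(ipaddress.ip_network(prefix), hop) for prefix, hop in routes
               if address in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]

# Hypothetical route table for one virtual network; names and ranges are illustrative.
routes = [
    ("10.1.0.0/16", "virtual-network"),      # address space of the virtual network itself
    ("0.0.0.0/0", "on-premises-gateway"),    # default route published by forced tunneling
]

print(next_hop("10.1.4.20", routes))       # destination inside the virtual network
print(next_hop("203.0.113.10", routes))    # Internet destination, matches only the default route
```

In this model, a reply to an Internet client follows the same lookup as any other outbound packet, which is why it is sent to the default gateway instead of leaving through the public endpoint.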

    Forced tunneling has different implementation requirements and scope depending on the type of Azure connectivity of the virtual network. A virtual network that is connected over a S2S VPN connection requires forced tunneling to be defined and configured on a per virtual network basis by using Azure PowerShell. A virtual network that is connected over an ExpressRoute connection requires forced tunneling to be defined at the ExpressRoute circuit, and this affects all virtual networks that are connected to that circuit.

    Determining how forced tunneling will be used in a design should involve the following design decisions:

    • Type of virtual network connectivity (S2S or ExpressRoute) (defines scope of impact)
    • Requirements for direct Internet egress via Azure's Internet connection (direct conflict with a business requirement)
    • Security requirements and flexibility (forced tunneling can provide isolation of network traffic while the connection is up)
    • Connectivity costs (forcing all traffic back over the S2S or ExpressRoute circuit)
    • CSP considerations on how the traffic will be routed to the customer. CSP scenarios will usually require user defined routing

    Defense-in-Depth Considerations for Forced Tunneling

    It is always a good security practice to have defense-in-depth, where additional layers of security remain in place if one layer is compromised or inadvertently removed. Forced tunneling forces all packets back to the default gateway. However, relying only on that approach is not a good defense-in-depth design.

    If you leverage forced tunneling, there is no reason to define any public endpoint for virtual machines when they are provisioned. Leaving the default public ports provides a security vector if forced tunneling is ever disabled.

    A good design practice is to implement Network Security Groups for the subnets of every virtual network configured for forced tunneling. This allows you to have an additional layer of network protection. Understand that although you can use Network Security Groups to create rules for a virtual machine or a subnet that restricts outbound access, any co-administrator can temporarily or permanently override those rules. (Note that Network Security Group rule changes are logged.)

    Forced tunneling provides the best defense-in-depth for a virtual network that is connected by ExpressRoute. A forced tunnel configuration with an ExpressRoute circuit requires a network engineer to be involved because it is implemented as a BGP routing configuration. This is not something an Azure co-administrator has the rights to configure. However, a forced tunneling configuration with a S2S VPN is something that can be performed by a co-administrator on each virtual network.

    Mandatory: A design that leverages forced tunneling (default route) typically must provide Internet access via a path other than Azure Internet access.

    Recommended: Combine forced tunneling with Network Security Groups to achieve defense-in-depth for traffic isolation.

    Optional: Investigate the use of a dual network adapter edge firewall appliance with an extranet subnet as one alternative to Network Security Groups.

    Network Security

    Network security in Azure can present many challenges, especially to organizations that rely exclusively on network security measures for isolation. Many on-premises and IaaS deployments leverage a point-to-point firewall, rule-based approach to secure access to resources. They combine this with platform-based authentication access.

    PaaS deployments present new challenges because they are designed to be driven by application and identity controls, but many organizations attempt these deployments with traditional network-based approaches.

    Hybrid PaaS and IaaS deployments (where there may be IaaS or PaaS roles combined with Azure public services such as Azure SQL Database, Redis Cache, and Service Bus) are the most challenging to plan. This is because the Azure public services are multitenant, and in some cases, they cannot be connected directly to the network infrastructure owned by the customer.

    Many of the public Azure services also do not have the construct of a service-level firewall that the customer can configure directly, so leveraging traditional approaches to secure those solutions with classic network approaches can be challenging.

    Regulatory requirements introduce potential complications because they sometimes explicitly require (or are interpreted to require) a traditional, on-premises, point-to-point network security approach to mitigations.

    Another complication with attempting to use traditional network-based security controls exclusively is that most of these controls assume the IP address is a good proxy for machine or service identity.

    IP addresses are a poor proxy for identity outside of a corporate LAN that is using static assignments, particularly in a globally scaled Internet service such as Azure where IP addresses change rapidly. This typically creates significant challenges for organizations that are overly reliant on network security measures and are using static IP addresses for server and service mapping.

    Review the guidance in the Microsoft Azure Security section (specifically the Containment and Segmentation Strategy) for how to design complete security containment strategies that overcome the limitations of networking controls alone.

    Traditional Security Approaches

    Applying traditional security approaches to Azure networking involves the following:

    Security Feature   

    Description

    When to Use

    Network security group

    Access control rules that can be applied to subnets or virtual machines

    • Control the ingress and egress traffic of a subnet
    • Control traffic between virtual machines in a subnet
    • Control ingress and egress traffic for a single virtual machine

    Forced tunneling

    Default route for a gateway that sends all non-local traffic to the customer's on-premises edge router for processing

    • Block outbound Internet traffic in a virtual network
    • Block inbound traffic in a virtual network
    • Need defense-in-depth through role separation (with ExpressRoute, forced tunneling is implemented at the BGP routing level)

    Firewall appliances – single network adapter

    Software-based firewall that can be placed between virtual machines and the Internet. Requires all traffic to be routed through the firewall, typically by using agents and IPsec.

    • Need additional firewall protection from the Internet
    • Want traffic flow control of all virtual machine traffic
    • Want to log and monitor traffic

    Firewall appliances – dual network adapters

    Software-based firewall that can be placed between subnets or between a subnet and the Internet

    • Need additional firewall protection from the Internet and also need high throughput.
    • Want packet capture and inspection
    • Want detailed logging

    IPsec

    Traffic authentication and encryption at the server level. Requires machines to be domain joined.

    • Want to use policy-based traffic encryption
    • Want to control which servers can communicate with each other

    Hardware firewall appliances at the network edge

    Placing a hardware firewall appliance at the customer's network edge

    • Want to control ingress and egress traffic between Azure and on-premises

    Web application firewalls

    Software-based firewall that is used to control ingress traffic from the Internet. Typically a layer 7 firewall.

    • Want SSL session termination
    • Want session affinity

    Network Access Approaches

    When customers extend their datacenters to Azure or deploy an application within the Azure infrastructure, they must select an approach for access control and security, based on an access scenario. Common access scenarios include:

    • Internal users
      • Accessing the solution from on premises
      • Accessing the solution from non-corporate locations
      • Accessing the solution via VPN
    • External users
      • Accessing the solution via the Internet
    • Dependent applications
      • Accessing the solution from on premises
      • Accessing the solution from Azure
      • Accessing the solution from Internet locations
    • Solution
      • Accessing on-premises resources
      • Accessing Azure public services
      • Accessing other Internet-facing services

    Note that a security-access approach might have multiple options to provide access. For example, accessing an application in Azure via the Internet can be accomplished with different security and traffic routing approaches.

    Application Access Approach   

    Description

    When to Use

    Direct to Azure

    Internet access is accomplished by exposing the UI tier directly on the Internet.

    • The application needs to be accessed from the Internet, and minimal security is required.
    • The application has no connection to corporate resources.

    Using the existing security solution

    Internet access could be blocked by using forced tunneling. All traffic must flow through the corporate Internet-facing security stack, be routed over the corporate backbone, and get to Azure using ExpressRoute or S2S connections.

    • The application needs to be accessed from the Internet, and high security is required.
    • A security stack that meets requirements cannot be created in Azure.

    Using a provider security solution

    Internet access could be blocked by using forced tunneling. All traffic must flow through a service provider's Internet-facing security stack, be routed over the service provider's backbone, and get to Azure by using S2S or ExpressRoute connections.

    • The application needs to be accessed from the Internet, and high security is required.
    • Internet access is being provided by a service provider that has a backbone connection to Azure.
    • Existing corporate Internet bandwidth cannot handle application load.

    Using an Azure-based security solution

    Internet access is accomplished by building a security stack in Azure by using network virtual appliances.

    • The application needs to be accessed from the Internet, and high security is required.
    • Existing corporate Internet bandwidth cannot handle the application load.
    • There is a desire to have no dependency on corporate resources.
    • High-speed Internet access is required.
    • The application requires global load balancing and the lowest latency connection.

    Virtual Appliances

    Virtual Appliances are third-party-based virtual machine solutions that can be selected from the Azure Gallery or Marketplace to provide services like network firewall, application firewall and proxy, load balancing, and logging. Appliances are licensed by:

    • Using a license key that you already own.
    • Including the licensing cost into the hourly cost of the appliance.

    Appliances are available in single network adapter or multiple network adapter configurations depending on the type of appliance and the required capabilities. For example, a logging appliance might only require a virtual machine with a single network adapter because all the traffic is written to the appliance. A network firewall typically requires a virtual machine with a multiple network adapter configuration that supports layer 3 routing so that the traffic has to flow through the appliance to reach its destination.

    To leverage an appliance that supports layer 3 routing, the network architecture must include user defined routing, which overrides the default implicit routes with explicit user-defined routes. This allows the specification of routing rules that can direct traffic to the appliance network adapter, to the local virtual network, or to the on-premises environment.

    The following table lists virtual appliance types and when to use them:

    Virtual Appliance Type   

    Description

    When to Use

    Network firewall

    Virtual appliance that leverages a virtual machine with a multiple network adapter configuration and layer 3 routing support to enable a network firewall between multiple subnets in Azure.

    • Control outbound traffic flow to the Internet from an application tier
    • Control inbound traffic flow from the Internet to a UI tier of an application
    • Control traffic flow between two subnets in Azure
    • Collect detailed packet captures or network logs of traffic flowing through the appliance

    Load balancer

    Provides layer 4 or layer 7 load balancing

    • A load balancer with more features than the Azure Load Balancer is required
    • Detailed logging is required
    • SSL termination is required

    Security appliance

    Intrusion detection appliance

    • Attempting to create a security stack to manage inbound Internet traffic
    • Advanced security monitoring and mitigation solution is needed

    User Defined Routing

    User defined routing allows you to configure and assign routes that override the default implicit system routes, ExpressRoute BGP advertised routes, or the local-site network-defined routes for S2S connections. Configuring a user defined route allows the specification of next-hop definition rules that control traffic flow within a subnet, between subnets, from a subnet through an appliance to another subnet, to the Internet, and to on-premises networks.

    Configuring user defined routes involves modifying the default routing table. Each entry in the routing table requires a set of information:

    • Destination address CIDR
    • NextHop type specification: Includes Local, VPN Gateway, Internet, Virtual Appliance, NULL
    • If the NextHop type is Virtual Appliance, you need the address of the appliance network adapter.

    User defined routing is only applied to virtual machines and cloud services in Azure. Placing a virtual appliance and defining user defined routes between on-premises networks and Azure allows you to control the traffic that leaves Azure. However, any traffic that flows from on-premises networks to Azure is not affected by the user defined routes; it leverages the system routes and bypasses the virtual appliance.

    Mandatory: Leveraging either user defined routing or a virtual appliance requires that both are implemented.

    Recommended:

    • Ensure that any user defined routes are more specific than ExpressRoute BGP routes or local site network routes; otherwise, they will not be used.
    • CSP scenarios should leverage user defined routing where control of customer traffic is required.

    Route Table Design Considerations

    While you can have multiple route tables defined, a subnet can only have a single route table associated with it. A single route table can be associated to multiple subnets. All virtual machines and cloud services connected to a subnet are affected by the route table decisions.
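
    The association rules above, together with the route entry fields listed earlier, can be sketched in a few lines of Python. This is a minimal illustration and not Azure's implementation: the table name, subnet names, and addresses are hypothetical, and the sketch assumes that the most specific matching prefix wins, in line with the recommendation that user defined routes be more specific than BGP or local site routes.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str                             # destination address CIDR
    next_hop_type: str                      # "Local", "VPN Gateway", "Internet", "Virtual Appliance", "NULL"
    next_hop_address: Optional[str] = None  # only needed for "Virtual Appliance"


def select_route(routes, destination):
    """Return the matching route with the longest (most specific) prefix, or None."""
    dest = ip_address(destination)
    matches = [r for r in routes if dest in ip_network(r.prefix)]
    if not matches:
        return None
    return max(matches, key=lambda r: ip_network(r.prefix).prefixlen)


# Hypothetical route table: send everything through an appliance adapter,
# except traffic that stays inside the virtual network.
route_tables = {
    "rt-frontend": [
        Route("0.0.0.0/0", "Virtual Appliance", "10.0.2.4"),
        Route("10.0.0.0/16", "Local"),
    ],
}

# A subnet has exactly one route table; one table may serve several subnets.
subnet_to_table = {
    "frontend-a": "rt-frontend",
    "frontend-b": "rt-frontend",
}


def next_hop_for(subnet, destination):
    """Look up the subnet's single route table and pick the route for destination."""
    return select_route(route_tables[subnet_to_table[subnet]], destination)
```

    With these hypothetical values, a destination inside 10.0.0.0/16 selects the Local route, while any other destination from either subnet is sent to the appliance adapter at 10.0.2.4.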

    Default Routing in a Subnet

    Routing of traffic from a virtual machine is accomplished by using implicit system routing via a distributed router that is implemented at the virtual network level. Every packet follows a set of implicit routes that are implemented at the host level. These routes control the flow of traffic within the virtual network, to on-premises networks (if enabled), and to the Internet. Traffic flow to the Internet is achieved through NAT by the host.

    The following diagram shows the implicit routing rules that a virtual machine follows by default without any user defined routing.

    The following rules are applied to the packet in this scenario:

    • If the address is within the virtual network address prefix, route to the local virtual network.
    • If the address is within the on-premises address prefixes (BGP published routes for ExpressRoute, or the local site network for S2S), route to the gateway.
    • If the address is not part of the virtual network, BGP, or local site network routes, route to Internet via NAT.
    • If the destination is an Azure datacenter address and ExpressRoute public peering is enabled, it is routed to the gateway because the gateway has the Azure datacenter address via BGP.
    • If the destination is an Azure datacenter with S2S or ExpressRoute without public peering enabled, it is routed to the host NAT for the Internet path, but it never leaves the datacenter.
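
    The rule order above can be expressed as a short Python sketch. It is illustrative only: the prefixes are hypothetical, and the function models the decision order described in the list rather than the distributed router itself.

```python
from ipaddress import ip_address, ip_network


def implicit_next_hop(destination, vnet_prefixes, gateway_prefixes):
    """Apply the default rules in order: local virtual network, gateway, Internet via host NAT."""
    dest = ip_address(destination)
    if any(dest in ip_network(p) for p in vnet_prefixes):
        return "local virtual network"
    # gateway_prefixes holds the on-premises prefixes learned over BGP
    # (ExpressRoute) or taken from the local site network definition (S2S).
    # With ExpressRoute public peering, Azure datacenter prefixes would
    # also be present in this list.
    if any(dest in ip_network(p) for p in gateway_prefixes):
        return "gateway"
    return "Internet via host NAT"


VNET = ["10.1.0.0/16"]        # hypothetical virtual network address space
ON_PREM = ["192.168.0.0/16"]  # hypothetical on-premises address space
```

    Adding a default route (0.0.0.0/0) to gateway_prefixes in this sketch sends every non-local destination to the gateway, which is the effect of forced tunneling.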

    Routing Changes Introduced by User Defined Routing and Virtual Appliances

    When a network firewall virtual appliance is introduced to the scenario, user defined routing must be configured to control the traffic routing through the appliance. Without user defined routing, no traffic will flow through the appliance.

    The following diagram shows a virtual appliance inserted into the scenario to control traffic routing to the Internet via front-end and back-end subnets in Azure:

    The following rules are applied to the packet in this scenario:

    • If the user-defined routing is defined with NextHop Local routing, route to a virtual machine in the virtual network, based on address.
    • If the user-defined routing is defined with NextHop VPN Gateway routing, route to a machine on-premises, based on address.
    • If the user-defined routing is defined with NextHop Appliance routing, route to the virtual appliance, based on address.
    • If the user-defined routing is defined with NextHop Internet routing, route to the Internet over the host NAT.

    Mandatory: For CSP scenarios where the provider attempts to leverage a single VPN device to connect multiple customers to Azure, user defined routing is required to maintain proper traffic separation and flow.

    Network Security Groups

    A Network Security Group is a top-level object that is associated with your subscription. It can be used to control traffic to one or more virtual machine instances in your virtual network. A Network Security Group contains access control rules that allow or deny traffic to virtual machine instances. The rules of a Network Security Group can be changed at any time, and changes are applied to all associated instances.

    A Network Security Group requires a regional virtual network. Network Security Groups are not compatible with virtual networks that are associated with an affinity group.

    Network Security Groups are similar to firewall rules in that they provide the ability to control the inbound and outbound traffic to a subnet, a virtual machine, or virtual network adapter.

    Network Security Groups allow you to define rules that specify the source IP address, source port, destination address, destination port, priority, and traffic action (Allow or Deny). The rules can be applied to inbound and outbound traffic independently.

    Traditionally, a firewall rule is applied to a port on a router that is connected to a switch. It affects all traffic flowing inbound and outbound to the switch, but it does not affect any traffic within the switch. A Network Security Group rule that is applied to a subnet is more like a firewall rule that is applied at the switch and affects inbound and outbound traffic on every port in the switch.

    Any virtual machine connected to the switch port would be affected by the Network Security Group rule applied to the subnet.

    For example, if a Network Security Group is created and a Network Security Group rule is defined that denies inbound Remote Desktop Protocol (RDP) traffic for all addresses over port 3389, no virtual machine outside the subnet can connect via RDP to a virtual machine that is connected to the subnet, and no virtual machine connected to the subnet can connect via RDP to any other connected virtual machine.

    Network Security Groups can also be applied to the virtual machine or to the network adapter of a virtual machine. This allows greater flexibility in how traffic is filtered.

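    Mandatory: For ingress traffic to the virtual machine, rules are applied at the subnet level, then the virtual machine level, and then the network adapter level. For egress traffic from the virtual machine, rules are applied at the network adapter level, then the virtual machine level, and then the subnet level. Within each Network Security Group, rules are applied in priority order.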

    To allow the virtual machines within the subnet to connect via RDP to each other, a new rule with higher priority has to be added that allows inbound traffic from the subnet CIDR on port 3389.

    Description               Priority  Source Address    Source Port  Destination Address  Destination Port  Protocol  Action
    Deny inbound RDP          1010      *                 *            *                    3389              TCP       Deny
    Allow inbound for subnet  1000      192.168.100.0/24  *            *                    3389              TCP       Allow

    Every Network Security Group that is created has a set of default inbound and outbound rules that cannot be deleted. However, the default rules can be overridden by rules with a higher priority. User defined rules can have a priority value from 100 to 4096, where 100 is the highest priority.
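
    The RDP example and the priority ordering described above can be sketched in Python as follows. This is a simplified model and not the platform's implementation: it matches only on source address, destination port, and protocol (the other columns are wildcards in the example), and the fallback when no rule matches is an assumption of the sketch.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    description: str
    priority: int   # lower number = higher priority
    source: str     # CIDR or "*"
    dest_port: str  # port number or "*"
    protocol: str   # "TCP", "UDP", or "*"
    action: str     # "Allow" or "Deny"


def evaluate_inbound(rules, source_ip, dest_port, protocol):
    """Return (action, description) of the first matching rule in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        source_ok = rule.source == "*" or ip_address(source_ip) in ip_network(rule.source)
        port_ok = rule.dest_port == "*" or int(rule.dest_port) == dest_port
        proto_ok = rule.protocol == "*" or rule.protocol == protocol
        if source_ok and port_ok and proto_ok:
            return rule.action, rule.description
    return "Deny", "no matching rule"  # assumption made for this sketch


# The two rules from the table above.
RULES = [
    Rule("Deny inbound RDP", 1010, "*", "3389", "TCP", "Deny"),
    Rule("Allow inbound for subnet", 1000, "192.168.100.0/24", "3389", "TCP", "Allow"),
]
```

    Because priority 1000 is evaluated before priority 1010, an RDP connection from 192.168.100.7 matches the Allow rule first, while an RDP connection from any address outside the subnet falls through to the Deny rule.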

    Default Inbound Network Security Group Rules

    Description                        Priority  Source Address      Source Port  Destination Address  Destination Port  Protocol  Action
    Allow virtual network inbound      65000     VIRTUAL_NETWORK     *            VIRTUAL_NETWORK      *                 *         Allow
    Allow Azure load balancer inbound  65001     AZURE_LOADBALANCER  *            *                    *                 *         Allow
    Deny all inbound                   65500     *                   *            *                    *                 *         Deny

    Default Outbound Network Security Group Rules

    Description                     Priority  Source Address   Source Port  Destination Address  Destination Port  Protocol  Action
    Allow virtual network outbound  65000     VIRTUAL_NETWORK  *            VIRTUAL_NETWORK      *                 *         Allow
    Allow Internet outbound         65001     *                *            INTERNET             *                 *         Allow
    Deny all outbound               65500     *                *            *                    *                 *         Deny

    Subscription Limits for Network Security Groups

    Object: Network Security Groups

    Service Management subscription limits:

    • 100 per subscription
    • 1 Network Security Group per subnet
    • 1 Network Security Group per virtual machine
    • 1 Network Security Group per network adapter
    • 1 Network Security Group can be linked to multiple subnets, virtual machines, or network adapters

    Resource Management subscription limits:

    • 100 per region per subscription
    • 1 Network Security Group per subnet
    • 1 Network Security Group per virtual machine
    • 1 Network Security Group per network adapter
    • 1 Network Security Group can be linked to multiple subnets, virtual machines, or network adapters

    Object: Network Security Group rules

    Service Management subscription limit: 100 rules per Network Security Group*

    Resource Management subscription limit: 100 rules per Network Security Group*

    *Can be increased by Microsoft support personnel to a maximum of 400 rules per Network Security Group.

    Default tags are system-provided identifiers to address a category of IP addresses. Default tags can be specified in customer-defined rules. The default tags are as follows:

    Tag

    Description

    VIRTUAL_NETWORK

    This default tag denotes all of your network address space. It includes the virtual network address space (IP CIDR in Azure) and all connected on-premises address spaces (local networks). It also includes virtual network-to-virtual network address spaces.

    AZURE_LOADBALANCER

    This default tag denotes the load balancer for the Azure infrastructure. This translates to an IP address for an Azure datacenter where the health probes originate. This is needed only if the virtual machine or set of virtual machines associated with the Network Security Group is participating in a load balanced set. Note this is not the actual load balancer IP address.

    INTERNET

    This default tag denotes the IP address space that is outside the virtual network and reachable by public Internet. This range includes the public IP space that is owned by Azure. If you use this tag for outbound restrictions, you potentially will not be able to access an Azure PaaS service unless you have a higher priority rule that grants access to that service.
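
    As a rough illustration of how the default tags partition the address space, the following Python sketch resolves a tag against an address. The address ranges and the health probe source address are hypothetical values supplied by the caller, and the use of Python's is_global property to approximate "reachable by public Internet" is an assumption of the sketch.

```python
from ipaddress import ip_address, ip_network


def in_any(ip, prefixes):
    """True if ip falls inside at least one of the CIDR prefixes."""
    return any(ip_address(ip) in ip_network(p) for p in prefixes)


def matches_tag(tag, ip, network_space, probe_source):
    """Return True if ip belongs to the category of addresses that the tag denotes."""
    if tag == "VIRTUAL_NETWORK":
        # Virtual network space plus connected on-premises and
        # virtual network-to-virtual network address spaces.
        return in_any(ip, network_space)
    if tag == "AZURE_LOADBALANCER":
        # The address where health probes originate, not the load balancer VIP.
        return ip == probe_source
    if tag == "INTERNET":
        # Public space outside the virtual network, Azure-owned space included.
        return ip_address(ip).is_global and not in_any(ip, network_space)
    raise ValueError(f"unknown tag: {tag}")


SPACE = ["10.1.0.0/16", "192.168.0.0/16"]  # hypothetical: virtual network + on-premises
PROBE = "198.51.100.1"                     # hypothetical probe source address
```

    Because INTERNET covers Azure-owned public addresses as well, an outbound Deny rule that targets this tag would also match the public address of an Azure PaaS service unless a higher priority rule allows it.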

    Mandatory: Network Security Groups must be assigned to a subnet, virtual machine, or network adapter for any of the rules to affect traffic.

    Recommended: For CSP scenarios, consider using network security groups to protect subnets from improperly configured CSP routing tables.

    Design Guidance

    When you are designing Network Security Groups, consider the following:

    Capability Consideration

    Capability Decision Points

    Compliance

    Network Security Groups can present design challenges from a PCI or other compliance perspective because of the logging capabilities that exist in the service.

    Priority numbering

    When designing Network Security Group rules for inbound or outbound scenarios, be sure to leave gaps in the priority numbering between rules so that additional rules can be inserted later. Note that the priority values are independent for inbound and outbound rules.

    Default rules

    To override the default rules, define a rule that has a priority number in the 4000 range to allow as many additional rules as possible.

    Port numbers

    Although you can specify a contiguous range of ports in the rule definition (1024-1048), you cannot specify a list of noncontiguous port numbers (1024, 1036, 30000).

    Targeting

    There are limitations that affect how many Network Security Groups, how many rules per Network Security Group, and how they can be applied to the object.

    Consider the following when determining Network Security Group targets:

    • Targeting by virtual machine is useful when the number of systems that require this rule set (immediately and in the future) is unknown. This requires per-virtual machine management, but it avoids issues such as IP space exhaustion through over-provisioning subnet address space to plan for future growth of a given role.
    • Targeting by subnet is useful when the rule sets are defined by role, the number of systems to which the Network Security Group is expected to apply is well known, and appropriate subnet sizes can be determined for these systems. This requires pre-planning IP address space, but it provides simple application and management of rule sets, similar to management models for on-premises VLAN ACLs.

    Precedence

    Consider where the Network Security Group is being deployed and whether there are other Network Security Groups in play at the virtual machine or subnet levels that may prevent the Network Security Group that is applied at the virtual machine or network adapter level from functioning.

    • Deny at the subnet level takes precedence over Allow at virtual machine or network adapter level.
    • Deny at the virtual machine level takes precedence over Allow at the network adapter level or the subnet level.
    • Deny at network adapter level takes precedence over Allow at virtual machine or subnet level.
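
    The precedence rules above, combined with the ingress and egress ordering stated earlier, amount to requiring an Allow at every level where a Network Security Group is attached. A minimal Python sketch of that logic follows; the level names and the shape of the input are choices made for illustration.

```python
INGRESS_ORDER = ("subnet", "vm", "nic")  # order in which levels see inbound traffic
EGRESS_ORDER = ("nic", "vm", "subnet")   # order in which levels see outbound traffic


def effective_action(direction, decisions):
    """Combine per-level results into one outcome.

    decisions maps a level name to "Allow" or "Deny"; levels that have no
    Network Security Group attached are omitted from the mapping.
    Returns (action, level_that_denied_or_None).
    """
    if direction == "ingress":
        order = INGRESS_ORDER
    elif direction == "egress":
        order = EGRESS_ORDER
    else:
        raise ValueError(f"unknown direction: {direction}")
    for level in order:
        if decisions.get(level) == "Deny":
            return "Deny", level  # a Deny at any level stops the packet
    return "Allow", None
```

    For example, an inbound packet that is denied at the subnet level never reaches the network adapter rules, so an Allow configured there has no effect.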

    Azure Endpoints and ACLs

    Endpoints allow for communication between Azure compute instances and the Internet. Endpoints can be defined in Azure so that they allow translation of a public port and IP address to a private port and private address. By default, when provisioning a virtual machine, two endpoints are automatically created:

    • Remote Desktop with a private port of 3389 and a random public port
    • PowerShell with a private port of 5986 and a public port of 5986

    In the portal, you can see the defined endpoints on the Endpoints tab of the virtual machine configuration.

    To provision a virtual machine with no public endpoints, you have two choices:

    • Use the portal and delete the default ports in the wizard
    • Use the parameter options in Azure PowerShell to disable the endpoints

    If a cloud service has more than one virtual machine, the endpoints have to share the single public facing VIP, but they require different public ports for the port translation to be redirected to the correct virtual machine.
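
    The port translation described above can be pictured as a lookup table keyed by public port. The following Python sketch is illustrative only: the VIP, virtual machine names, and port numbers are hypothetical, and the model ignores the protocol (TCP or UDP) for brevity.

```python
class CloudServiceEndpoints:
    """Maps public ports on one shared VIP to (virtual machine, private port)."""

    def __init__(self, vip):
        self.vip = vip
        self._by_public_port = {}

    def add(self, public_port, vm_name, private_port):
        # Each public port on the shared VIP can point to only one target.
        if public_port in self._by_public_port:
            raise ValueError(f"public port {public_port} is already in use on {self.vip}")
        self._by_public_port[public_port] = (vm_name, private_port)

    def translate(self, public_port):
        """Return (vm_name, private_port), or None if no endpoint is defined."""
        return self._by_public_port.get(public_port)


svc = CloudServiceEndpoints("203.0.113.10")  # hypothetical VIP
svc.add(50001, "vm1", 3389)                  # RDP to vm1
svc.add(50002, "vm2", 3389)                  # RDP to vm2 needs a different public port
```

    Both virtual machines listen on the same private port, but each is reached through its own public port on the single VIP.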

    Endpoint design requires consideration of the security threat that the endpoint presents. When you have a public facing endpoint, it can be used to access the provided service, but it can also be attacked by hackers.

    Azure provides denial-of-service features at the edge of the Azure firewall, but it does not prevent someone from attempting to hack a public facing port. Any public facing port for a provided service should leverage a strong authentication mechanism to help prevent a hacker from gaining access.

    Mandatory: Public endpoints are not required unless you need inbound access from the Internet.

    Recommended: Only enable public endpoints if inbound Internet access is the only way to achieve communication.

    Use P2S, S2S, or ExpressRoute for RDP or PowerShell access to a virtual machine instead of using a public endpoint.

    To further protect resources deployed within Azure, you can manage incoming traffic to the public port by configuring rules for the network access control list (ACL) of the endpoint. An ACL provides the ability to selectively permit or deny traffic for a virtual machine endpoint for an additional layer of security.

    By using network ACLs, you can do the following:

    • Selectively permit or deny incoming traffic based on the remote subnet IPv4 address range to a virtual machine input endpoint.
    • Block lists of IP addresses
    • Create multiple rules per virtual machine endpoint
    • Specify up to 50 ACL rules per virtual machine endpoint
    • Use rule ordering to ensure the correct set of rules are applied on a given virtual machine endpoint (lowest to highest)
    • Specify an ACL for a specific remote subnet IPv4 address.
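
    The ordering behavior in the list above can be sketched in Python as follows. The rule values are hypothetical, and the action taken when no rule matches is left to the caller because it is not specified here.

```python
from ipaddress import ip_address, ip_network

MAX_RULES = 50  # ACL rules per virtual machine endpoint, per the list above


def evaluate_acl(rules, remote_ip, default_action):
    """Evaluate endpoint ACL rules from the lowest order number to the highest.

    rules is a list of (order, action, remote_subnet) tuples.
    default_action is returned when no rule matches (an assumption of this sketch).
    """
    if len(rules) > MAX_RULES:
        raise ValueError("too many ACL rules for one endpoint")
    for _order, action, subnet in sorted(rules, key=lambda r: r[0]):
        if ip_address(remote_ip) in ip_network(subnet):
            return action
    return default_action


ACL = [
    (100, "Permit", "198.51.100.0/24"),  # hypothetical trusted remote subnet
    (200, "Deny", "0.0.0.0/0"),          # everything else
]
```

    Because the Permit rule has the lower order number, addresses in the trusted subnet are admitted before the catch-all Deny rule is considered.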

    For instructions about configuring ACLs for your Azure virtual machine endpoints, see:

    The following diagram outlines the UI for creating an ACL for a public endpoint.

    Feature References

    Microsoft Azure Network Security Whitepaper version 3

    http://download.microsoft.com/download/C/A/3/CA3FC5C0-ECE0-4F87-BF4B-D74064A00846/AzureNetworkSecurity_v3_Feb2015.pdf

    Security Considerations for SQL Server in Azure Virtual Machines

    https://msdn.microsoft.com/en-us/library/azure/dn133147.aspx

    Active Directory Considerations in Azure Virtual Machines and Virtual Networks Part 5 – Domains and GCs

    http://blogs.technet.com/b/privatecloud/archive/2013/04/09/active-directory-considerations-in-azure-virtual-machines-and-virtual-networks-part-5-domains-and-gcs.aspx

    Security Considerations for Infrastructure as a Service–IaaS-Private Cloud

    http://blogs.technet.com/b/privatecloud/archive/2011/10/12/security-considerations-for-infrastructure-as-a-service-iaas-private-cloud.aspx

    IP Addresses

    Azure allocates IP addresses based on the type of object being provisioned and the options selected. In some cases, having a reserved IP address is required. The following table outlines the options for reserved IP addresses and the use cases for each type:

    Type

    Description

    When to Use

    DIP

    Dynamic IP address. Internal IP address assigned by default and associated with a virtual machine.

    Always assigned

    VIP

    Virtual IP address. Assigned to a virtual machine, cloud service load balancer, or an internal load balancer.

    Address is private for an internal load balancer and is public for a cloud service load balancer or a virtual machine.

    Address is shared across all virtual machines within the same cloud service.

    Always assigned

    PIP

    Public IP address. Public instance-level IP address that can be assigned to a virtual machine. A PIP allows direct communication to a virtual machine without going through the cloud service load balancer.

    Use only when you need to directly communicate with an instance in cloud service

    Reserved

    This is a static public-facing VIP address for a cloud service that must be specially requested. There are a limited number of these addresses per subscription.

    Use only when you need a public facing static IP address

    Internal static

    A static address allocated from the subnet address pool. Internal facing only. The number is only limited by the number of addresses assigned to the subnet address pool. This is implemented as a DHCP reservation.

    Use only when you need an internal facing static IP address

    For more information, see VIPs, DIPs and PIPs in Microsoft Azure.

    This relationship is illustrated in the following diagram:

    Address Assignment

    When you create an object that connects to a subnet in Azure, two IP addresses are automatically allocated to that object:

    • VIP: Public facing IP address associated with the cloud service of which the object is a member
    • DIP: Internal facing private IP address

    Both addresses are assigned to the single network adapter and that adapter is connected to the subnet. The internal facing DIP address is allocated from the address space pool of the subnet to which the virtual machine is attached. The public facing VIP address is allocated from the pool of Azure datacenter addresses that are assigned to the datacenter where it resides.

    Azure provides dynamic allocation of IP addresses to compute resources within each subscription. Addresses are assigned starting from the first available address in the subnet pool. If a virtual machine is allocated an address and then it releases that address, the address is available for reassignment.

    An IP address assigned to a virtual machine is associated to the virtual machine until the machine is in a stopped (deallocated) state or it is destroyed completely. Using the Shutdown option in the Azure portal results in the virtual machine being placed in the stopped-deallocated state, and the DHCP reservation is released. When the virtual machine is restarted it will receive a new IP address.

    Restarting the virtual machine, shutting it down from within the operating system (for example, over RDP), or running the Stop-AzureVM PowerShell cmdlet with the StayProvisioned parameter does not deallocate the virtual machine, so its IP address is retained.

    Azure IP addresses that are released to the available address pool are immediately available for reassignment to a virtual machine. When Azure allocates an address, it searches sequentially from the beginning of the subnet address pool until it finds an available address, and then assigns it to the virtual machine. This assignment method is used for dynamic and static addresses from the subnet address pool.
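    The first-available assignment behavior described above can be illustrated with a small simulation. This is a minimal sketch, not Azure's implementation: the SubnetPool class, the 10.0.1.0/24 range, and the number of addresses held back at the start of the subnet are all assumptions made for illustration.

```python
import ipaddress


class SubnetPool:
    """Toy model of first-available address assignment in a subnet."""

    def __init__(self, cidr, reserved=3):
        # ASSUMPTION: the first `reserved` host addresses are held back by the
        # platform. The text above does not state a number; adjust as needed.
        hosts = list(ipaddress.ip_network(cidr).hosts())
        self._pool = hosts[reserved:]
        self._in_use = {}  # address -> owner name

    def allocate(self, owner):
        # Search sequentially from the start of the pool for a free address.
        for addr in self._pool:
            if addr not in self._in_use:
                self._in_use[addr] = owner
                return str(addr)
        raise RuntimeError("subnet address pool exhausted")

    def release(self, owner):
        # Deallocation returns every address held by the owner to the pool.
        for addr in [a for a, o in self._in_use.items() if o == owner]:
            del self._in_use[addr]


pool = SubnetPool("10.0.1.0/24")
vm1 = pool.allocate("vm1")
vm2 = pool.allocate("vm2")
pool.release("vm1")               # vm1 is stopped (deallocated)
vm3 = pool.allocate("vm3")        # vm3 takes the address vm1 just released
vm1_again = pool.allocate("vm1")  # vm1 comes back with a different address
print(vm1, vm2, vm3, vm1_again)
```

    In this model, the released address is immediately handed to the next requester, which is why a deallocated virtual machine cannot count on getting its previous address back.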

    Mandatory: Every object that connects to a subnet in Azure requires a DIP (including IaaS virtual machines, internal load balancers, and PaaS roles).

    A public-facing VIP is always assigned to a cloud service and shared by all virtual machines or PaaS roles within the cloud service.

    Recommended: Do not use the Azure portal to shut down a virtual machine unless you intend to change its IP address or delete the virtual machine. Otherwise, the assigned IP address is lost.

    Use static IP addresses only when a dynamic address will not meet requirements. Do not use them simply because static addressing is the current on-premises approach.

    Feature References

    Stop-AzureVM cmdlet command reference

    https://msdn.microsoft.com/en-us/library/azure/dn495269.aspx

    Static Addresses

    By default, all addresses in Azure are dynamic, regardless of whether they are provisioned through the Azure portal or through PowerShell. Statically assigned IP addresses can be requested or assigned only by using Azure PowerShell. During the object creation process, a command-line option allows a static address to be specified.

    There is no way within Azure to preallocate or reserve an address prior to assignment. All address assignments are made at the time of object provisioning. To determine whether an address is available to use as a static address, use the Azure PowerShell cmdlet Test-AzureStaticVNetIP, which tests whether an IP address has already been allocated from the subnet address pool.

    If the address is not available, the cmdlet returns a list of addresses that are available. A successful test is only a point-in-time check: it does not guarantee that the address will still be unallocated when you provision the object.

    Reserved IP addresses are static public-facing VIP addresses that are typically used to provide a static IP address for a public-facing application. A reserved IP address allows a DNS A record to be created with minimal management overhead. It also provides a consistent IP address that can be used for point-to-point security rules in firewalls. Reserved IP addresses must be requested by using the Azure PowerShell cmdlet New-AzureReservedIP and given a name. That name is then passed to the New-AzureVM cmdlet during provisioning.

    New-AzureVM -ServiceName "WebApp" -ReservedIPName "MyWebSiteIP" `
        -Location "West US"

    There are a limited number of reserved IP addresses in a given subscription. The default is five addresses, but through a limit increase request, it may be increased to a maximum of 100.

    Reserved IP addresses are scarce resources, and they should only be used when a static address is absolutely required.

    Mandatory: Carefully plan and track reserved IP address usage to prevent running out of the address quota.

    Recommended: If more than five reserved addresses are required, contact Microsoft support early to increase the reserved address quota so that deployments are not blocked when the quota is exhausted.

    Leverage reserved IP address names that can be easily associated with the service they are being used for.

    Name Resolution

    When IaaS- and PaaS-provisioned services need to resolve host names and FQDNs, they can use either Azure provided name resolution or their own DNS server, depending on the actual scenario.

    Azure automatically registers a new virtual machine or PaaS role in the Azure default *.cloudapp.net DNS suffix. Storage accounts are registered in *.blob.core.windows.net. Azure Web Apps (a feature in Azure App Service) are registered in *.azurewebsites.net. It may also be desirable to have those services resolvable under a custom domain name, for example *.contoso.com.

    The following table is provided to outline scenarios that are related to name resolution.

    | Scenario | Name resolution provided by |
    |---|---|
    | Name resolution between role instances or virtual machines located in the same cloud service | Azure-provided name resolution |
    | Name resolution between virtual machines and role instances located in the same virtual network | Azure-provided name resolution using FQDN, or name resolution using your DNS server |
    | Name resolution between virtual machines and role instances located in different virtual networks | Name resolution using your DNS server |
    | Cross-premises: Name resolution between role instances or virtual machines in Azure and on-premises computers | Name resolution using your DNS server |
    | Reverse lookup of internal IP addresses | Name resolution using your DNS server |
    | Name resolution for custom domains (such as Active Directory domains or domains that you register) | Name resolution using your DNS server |
    | Name resolution between role instances located in different cloud services, not in a virtual network | Not applicable. Connectivity between virtual machines and role instances in different cloud services is not supported outside a virtual network. |
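    The scenarios in the table above can be captured as a simple lookup that a planning script might use to decide whether a custom DNS server is needed. This is only an illustrative sketch. The Scenario names and the requires_own_dns helper are hypothetical and are not part of any Azure tooling.

```python
from enum import Enum

AZURE = "Azure-provided name resolution"
OWN_DNS = "Name resolution using your DNS server"


class Scenario(Enum):
    SAME_CLOUD_SERVICE = 1
    SAME_VNET = 2
    DIFFERENT_VNETS = 3
    CROSS_PREMISES = 4
    REVERSE_LOOKUP = 5
    CUSTOM_DOMAIN = 6
    DIFFERENT_CLOUD_SERVICES_NO_VNET = 7


# Options per scenario, transcribed from the table. An empty tuple means
# "not applicable" (no connectivity, so there is nothing to resolve).
OPTIONS = {
    Scenario.SAME_CLOUD_SERVICE: (AZURE,),
    Scenario.SAME_VNET: (AZURE + " using FQDN", OWN_DNS),
    Scenario.DIFFERENT_VNETS: (OWN_DNS,),
    Scenario.CROSS_PREMISES: (OWN_DNS,),
    Scenario.REVERSE_LOOKUP: (OWN_DNS,),
    Scenario.CUSTOM_DOMAIN: (OWN_DNS,),
    Scenario.DIFFERENT_CLOUD_SERVICES_NO_VNET: (),
}


def requires_own_dns(scenarios):
    """True when at least one scenario can only be served by your own DNS."""
    return any(OPTIONS[s] == (OWN_DNS,) for s in scenarios)


print(requires_own_dns([Scenario.SAME_CLOUD_SERVICE, Scenario.SAME_VNET]))
print(requires_own_dns([Scenario.SAME_VNET, Scenario.CROSS_PREMISES]))
```

    A design that stays within a single cloud service or virtual network can rely on Azure-provided resolution. As soon as a cross-premises, cross-network, reverse lookup, or custom domain requirement appears, a customer-managed DNS server becomes necessary.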

    Feature References

    Azure Name Resolution

    https://msdn.microsoft.com/en-us/library/azure/jj156088.aspx

    Configure a custom domain name for blob data in an Azure Storage account

    http://azure.microsoft.com/en-us/documentation/articles/storage-custom-domain-name/

    Configure a custom domain name in Azure App Service

    http://azure.microsoft.com/en-us/documentation/articles/web-sites-custom-domain-name/

    Design Guidance

    When you are planning name resolution, consider the following:

    | Capability Consideration | Capability Decision Points |
    |---|---|
    | Subscription creation | When preparing a new Azure subscription for provisioning or migrating resources, configure DNS servers at the subscription level, and then assign them at the virtual network level so that the Azure DHCP Server service hands out the DNS servers for resolution support. |
    | Service limits | A maximum of 10 custom DNS servers can be configured per subscription. |

    Azure DNS Service

    Azure DNS is a global scale DNS service for hosting tenant DNS domains and providing name resolution by using Microsoft Azure infrastructure. Azure DNS has been tuned to be a highly available DNS service with fast query response times. Azure DNS provides updates of DNS records and global distribution.

    By hosting domains in Azure DNS, tenant DNS records can be managed by using the same credentials, APIs, tools, and billing as other Azure services.

    Mandatory: Automation scripts must be created to automate the creation and update of Azure DNS domains and records.

    Azure DNS domains are hosted on the Azure global network of DNS name servers. Azure uses Anycast networking, so that each DNS query is answered by the closest available DNS Server. This provides fast performance and high availability for your domain.

    Mandatory: Azure DNS does not currently support purchasing domain names. Tenants purchase domains from a third-party domain name registrar, who typically charges an annual fee. These purchased domains can then be hosted in Azure DNS to manage DNS records. For more information, see Delegate a Domain to Azure DNS.

    Mandatory: Azure DNS does not currently support CNAME records at the root (apex) of the domain.

    To create the domains and domain records within Azure DNS, you can use Azure PowerShell, Azure CLI, REST APIs, or the SDK.

    ETags and Tags

    ETags are used to manage concurrency in a highly distributed DNS infrastructure where changes could be implemented at any location that has access to Azure. Azure DNS uses ETags to handle concurrent changes to the same resource safely.

    Each DNS resource (zone or record set) has an ETag associated with it. Whenever a resource is retrieved, its ETag is also retrieved. When updating a resource, you have the option to pass back the ETag so Azure DNS can verify that the ETag on the server matches.

    Because each update to a resource results in the ETag being regenerated, an ETag mismatch indicates that a concurrent change has occurred. ETags are also used when creating a new resource to ensure that the resource does not already exist.

    By default, Azure PowerShell uses ETags to block concurrent changes to DNS zones and record sets. The optional -Overwrite switch can be used to suppress ETag checks, in which case any concurrent changes that have occurred are overwritten.
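    The ETag behavior described above is a form of optimistic concurrency control. The following sketch models it with an in-memory store. It is not the Azure DNS API: the RecordStore class, its put and get methods, and the overwrite flag (which stands in for the Overwrite switch) are assumptions made for illustration.

```python
import uuid


class ETagMismatch(Exception):
    """Raised when the caller's ETag does not match the stored resource."""


class RecordStore:
    """In-memory stand-in for a service that guards updates with ETags."""

    def __init__(self):
        self._items = {}  # name -> (value, etag)

    def get(self, name):
        # Retrieving a resource also returns its current ETag.
        return self._items[name]

    def put(self, name, value, etag=None, overwrite=False):
        current = self._items.get(name)
        # For a new resource the expected ETag is None, which doubles as the
        # "resource must not already exist" check on creation.
        expected = current[1] if current else None
        if not overwrite and etag != expected:
            raise ETagMismatch(name)
        new_etag = uuid.uuid4().hex  # every successful write regenerates it
        self._items[name] = (value, new_etag)
        return new_etag


store = RecordStore()
store.put("www", "10.0.0.4")                  # create
_, etag = store.get("www")                    # two admins read the same ETag
store.put("www", "10.0.0.5", etag=etag)       # admin A updates first
conflict = False
try:
    store.put("www", "10.0.0.6", etag=etag)   # admin B's ETag is now stale
except ETagMismatch:
    conflict = True
store.put("www", "10.0.0.6", overwrite=True)  # checks suppressed on purpose
print(conflict, store.get("www")[0])
```

    The second writer is rejected rather than silently overwriting the first writer's change. Suppressing the check should be a deliberate choice, because it discards whatever concurrent change occurred.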

    Tags are different from ETags. Tags are name-value pairs used by Azure Resource Manager to label resources for billing or grouping purposes. For more information about Tags, see Using tags to organize your Azure resources.

    Azure PowerShell supports Tags for zones and record sets. Tags are specified by using the -Tag parameter.

    Load Balancing

    There are several mechanisms that provide load balancing capabilities within Azure. The following table outlines these features and their potential use in Azure designs:

    | Type | Description | When to Use |
    |---|---|---|
    | External load balancer | A software load balancer that is automatically created when a cloud service is created. It is Internet facing only. It has a single Internet-facing VIP by default, but additional Internet-facing VIPs can be added. VIP addresses are dynamically assigned from the Azure public datacenter address pool by default, but they can be assigned a reserved static address. | Use external load balancers to provide Internet-facing load balancing capabilities for the UI tier of an application. The remaining tiers of the application should use internal load balancers if required. |
    | Load balanced sets | A way to combine multiple virtual machines or PaaS roles from a single cloud service into a group that is associated with a port of the load balancer. | Use load balanced sets when you need to use a single VIP with multiple load balanced applications in a single cloud service. |
    | Internal load balancer | A software load balancer that is internal facing only. It has a single VIP that is allocated from the local subnet address pool. | Use an internal load balancer when you need load balancing capabilities for an application that should not be Internet-facing, for example, the second and third tiers of a three-tier application. |
    | Traffic manager | A public-facing load balancer that is designed to support cross datacenter balancing of loads and geolocation optimization so the user is sent to the closest datacenter. | Typically used to load balance two cloud services in separate datacenters to provide geolocation optimization. |

    Feature References

    Azure Load Balancer

    http://azure.microsoft.com/documentation/articles/load-balancer-overview

    Azure Traffic Manager Overview

    https://azure.microsoft.com/en-us/documentation/articles/traffic-manager-overview/

    Azure Traffic Manager Load Balancing Methods

    https://azure.microsoft.com/en-us/documentation/articles/traffic-manager-load-balancing-methods/

    About Traffic Manager Monitoring

    https://azure.microsoft.com/en-us/documentation/articles/traffic-manager-monitoring/

    Internal Load Balancer

    http://azure.microsoft.com/documentation/articles/load-balancer-internal-overview

    Configure Load Balanced Sets

    http://azure.microsoft.com/documentation/articles/load-balancer-internal-overview

    Configure an Internal Load Balanced Set

    http://azure.microsoft.com/documentation/articles/load-balancer-internal-getstarted

    Azure Internal Load Balancer SQL Always-On

    https://azure.microsoft.com/en-us/documentation/articles/load-balancer-configure-sqlao/

    Naming Conventions

    The choice of a name for any asset in Microsoft Azure is an important choice because:

    • It is difficult (though not impossible) to change that name at a later time.
    • There are certain constraints and requirements that must be met when choosing a name.

    This table covers the naming requirements for various elements of Azure networking.

    | Item | Length | Casing | Valid characters |
    |---|---|---|---|
    | Virtual network | | Case-insensitive | Alphanumeric and hyphen. Cannot start with a space or end with a hyphen. |
    | Subnet | | Case-insensitive | Alphanumeric, underscore, and hyphen. Must be unique within a virtual network. |
    | Network Security Group | | Case-insensitive | Alphanumeric and hyphen |
    | Network Security Group rule | | Case-insensitive | Alphanumeric and hyphen |
    | AT&T VLAN name | 15 | Case-insensitive | Alphanumeric and hyphen |
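    Because names are hard to change later, it can help to validate them before provisioning. The sketch below encodes the rules from the table above as regular expressions. The is_valid_name and subnet_names_unique helpers are hypothetical. Treating the length of 15 as a maximum is an assumption, and items whose length is blank in the table are left without a length check.

```python
import re

# item -> (pattern, maximum length or None where the table gives no length)
RULES = {
    # Trailing character class enforces "cannot end with a hyphen".
    "virtual network": (r"[A-Za-z0-9-]*[A-Za-z0-9]", None),
    "subnet": (r"[A-Za-z0-9_-]+", None),
    "network security group": (r"[A-Za-z0-9-]+", None),
    "network security group rule": (r"[A-Za-z0-9-]+", None),
    # ASSUMPTION: the "15" in the table is a maximum length.
    "at&t vlan name": (r"[A-Za-z0-9-]+", 15),
}


def is_valid_name(item, name):
    """Check a proposed name against the character and length rules."""
    pattern, max_len = RULES[item.lower()]
    if max_len is not None and len(name) > max_len:
        return False
    return re.fullmatch(pattern, name) is not None


def subnet_names_unique(names):
    """Names are case-insensitive, so compare them case-folded."""
    folded = [n.casefold() for n in names]
    return len(folded) == len(set(folded))


print(is_valid_name("Virtual network", "prod-vnet-01"))
print(is_valid_name("Virtual network", "prod-vnet-"))
print(subnet_names_unique(["Web", "web"]))
```

    Running a check like this in a deployment pipeline catches naming problems before a resource is created.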

    Microsoft Azure Identity

    Identity and Access Management is a daunting space in technology. As trends come and go and Internet threats mature, identity and access management solutions must constantly evolve.

    A few years ago, a framework for building identity solutions emerged. It's called "The Four Pillars of Identity." Many organizations have adopted this framework to forge their identity strategy at macro and micro levels.

    The Four Pillars of Identity are areas that identity solutions must address to be successful:

    • Administration
    • Authentication
    • Authorization
    • Auditing

    For more information about the Four Pillars of Identity, please read the whitepaper titled The Four Pillars of Identity – Identity Management in the Age of Hybrid IT.

    Several options for leveraging identity and access management solutions exist when working with Azure. Most often, it's helpful to distinguish between two audiences when determining a solution: the developer and the IT pro.

    The Developer Audience

    For developers, the most important thing with regard to identity is to integrate their applications with the organization's preferred identity and access management platform. In the past, many developers didn't have a good grasp on how to integrate applications with enterprise identity and access management platforms, so they often took on the task of managing the identities and access within the application itself.

    This places a lot of burden on the developers, because they have to take on all the work of each of the four identity pillars described previously. This means that they have to provide a place to store identities and provide a way for users to change identity data, manage credentials, deactivate their access, request new access, and so on.

    Developers would also have to securely authenticate users, manage entitlements that authorize users to various resources in the application, and maintain audit trails of authentication and access events.

    Even if a single development team can do this well for a given application, organizations typically have hundreds of applications in use. The result is that users have multiple identities sprawled throughout an organization, with each application operating independently with regard to identity and access management.

    The IT Professional Audience

    IT pros are under pressure from the organization to facilitate the adoption of cloud services by extending the traditional identity and access management enterprise into the cloud. Without this important integration, cloud services, such as Azure, become virtually unusable.

    When there is no integration between the cloud and an organization's on-premises identity and access management platform, users have multiple identities with different credentials and different access rights to an organization's data. Not only is this a bad experience for end users, but it makes it impossible to manage access to all of an organization's applications and resources. The issue gets worse when non-Microsoft clouds are introduced into the equation.

    Another important concept is that identity becomes the "control plane" for the cloud. In the past, an organization could keep sensitive information on-premises and put up firewalls and extranets to protect it and keep potential malicious users out.

    This becomes much more difficult in a cloud-connected world. The network edge is being pushed out and becoming vaster, while users on mobile devices are accessing on-premises applications inside the organization's network and cloud services provided by the organization.

    Organizations can no longer depend on firewalls to keep out potential attackers because those firewalls also keep out the people who require access to resources. Because of this, the identity and access management platform is the primary means of protecting an organization's applications and data in the cloud-connected world.

    Azure Active Directory

    Azure Active Directory (Azure AD) interacts with the cloud in two ways:

    • An enabler of the cloud
    • A consumer of the cloud

    IT professionals will mostly be concerned with Azure AD as an enabler of the cloud because they are often tasked with integrating the enterprise identity and access management platform into the cloud.

    On the other hand, developers will mostly be concerned with the identity services that Azure AD provides as a consumer of the cloud. Most often, they are looking to understand how their applications can leverage the cloud identity service.

    Enabler of the Cloud

    Azure AD plays a pivotal role in enabling the cloud. To use Microsoft cloud services, such as Office 365, the cloud services must:

    • Store identity data that is used to identify the user
    • Store profile data about the user
    • Entitle the user to specific applications and data in the cloud service

    Rather than having each cloud service keep its own identity repository, all Microsoft cloud services use Azure AD. The capabilities of the Microsoft cloud cannot be enabled without it.

    After identities are populated in the Microsoft cloud, Azure AD becomes an identity and access management hub that enables other clouds. Azure AD can facilitate access to an organization's custom applications regardless of whether they are on-premises or hosted in the cloud, in addition to other Software-as-a-Service applications that do not reside in the Microsoft cloud.

    Consumer of the Cloud

    Azure AD is a single, multitenant directory that contains over 200 million active identities and serves billions of authentication requests each day. A cloud-scale identity service like this can only be built by using the scale and breadth of the cloud. In addition, Azure AD has features that rely on cloud services, such as Azure Multi-Factor Authentication and machine learning technologies. In this way, Azure AD consumes the cloud to provide its services.

    Azure AD becomes a cloud service that can be consumed by other applications and services. Application programming interfaces (APIs) and endpoints are exposed so that developers can use Azure AD to store and retrieve their identity data, and they can depend on Azure AD to authenticate users to their applications.

    IT professionals can use the cloud Identity Management-as-a-Service features, such as Self-service password reset, to enable new identity management capabilities that traditionally took months to deploy on-premises.

    Tenant Directory Planning

    The existence of an Azure AD directory is a requirement for an Azure subscription. Therefore, each Azure tenant has at least one directory associated with it. This directory is used for signing in to and accessing the Azure portal, Office 365, and other Microsoft cloud services.

    Additionally, Azure tenants can have multiple directories. These directories are separate and unique. For example, if two Azure AD directories exist in the same tenant, they have their own set of administrators and there is no data shared between them. Administrators of one directory in the tenant do not have access to another directory in the same tenant, unless they are explicitly granted access to it.

    Mandatory: There must be at least one directory in the tenant. You do not have a choice. All tenants created within Azure are assigned a default directory if one doesn't exist.

    How Many Directories Should a Customer Have?

    Most tenants should have at least two directories—one for the production users using the cloud services that are integrated with Azure AD, and another directory for testing.

    If a customer has software development teams, it is possible that those teams might need Azure AD directories that they can use for developing applications. The following criteria should be used to determine if separate development directories are needed:

    • Is there any reason why the development team can't use the test directory? For example, developers might need a separate directory if they need to control user accounts and attributes in the directory.
    • Does the development team need to have the full log-in experience that an end-user will go through? If so, the development directory might require a deeper level of integration with the on-premises Active Directory if the production tenant is federated with Active Directory Federation Services (AD FS). Maintaining this integration for each developer directory would be extremely arduous because it would require multiple on-premises servers. Most organizations would develop applications against the test directory in this case, rather than maintaining multiple Azure AD Connect instances.
    • Are any Azure AD Premium features (such as Multi-Factor Authentication) needed by the development team? Azure AD Premium is licensed per-directory. Therefore, if Azure AD Premium features are needed in the development directory, the customer must purchase independent licenses for the development accounts.

    Optional: Software development teams might want their own Azure AD directories in the tenant.

    Cross-Organizational Directories: Complex Government Organizations

    Some complex government organizations look like a single entity on paper; but in reality, they are multiple, independently-run organizations. The question of whether to have a single tenant or multiple tenants is a very important discussion to have with these customers before they get locked in to a model that doesn't work for them.

    There is no definitive answer for every situation. Rather, this must be addressed on a case-by-case basis. The following criteria should help you understand how to guide customers.

    Considerations for a cross-organizational directory:

    • The customer has a long-term goal of operating as a single entity with a consolidated Active Directory environment.
    • Applications in one organization within the tenant should be readily accessible by users in other organizations.

    Considerations for unique organizational directories:

    • Each organization in the tenant has an Active Directory environment and unique IT staff.
    • There are security requirements that prevent the customer from having a single set of directory administrators over all organizations.
    • Applications within an organization are restricted only to users within that organization.

    Cross-Organizational Directories: Mergers and Acquisitions

    The topic of a cross-organizational directory is important to discuss with commercial customers who often buy and sell other companies. The following criteria can be used to help you determine if the customer should have a cross-organizational directory.

    Considerations for a cross-organizational directory:

    • The customer plans to permanently integrate the acquired company with no foreseeable plans to divest it.
    • Users in the acquired company should be able to access applications and data in the acquiring company.

    Considerations for unique organizational directories:

    • The customer plans to divest the acquired companies in the future.
    • The acquired company is already an Azure AD customer and the cost and disruption of migrating the users to the acquiring tenant is prohibitive.
    • Users in the acquired company cannot access applications or data in the acquiring company.

    Custom Domain Names

    When a directory is created, the default name of the directory is <something>.onmicrosoft.com. The <something> is chosen by the directory administrator during the creation of the directory. Usually, customers want to use their own domain name, such as contoso.com. This can be achieved by using a custom domain name.

    Recommended: Add a customer's public-facing DNS name as a custom domain name for the production Azure AD directory. Otherwise, users will sign in with accounts such as bob@contoso.onmicrosoft.com instead of bob@contoso.com.

    Multiple custom domain names can be added to each Azure AD directory, but a custom domain name can only be used in one Azure AD directory. For example, if there are two Azure AD directories in the tenant, and the first directory assigns the custom domain name of contoso.com, the second directory cannot use that name.

    Mandatory: Custom domain names must be publicly registered with an Internet domain name registrar, and the customer must be able to modify the public DNS records for the domain to prove ownership of it.

    Feature References

    Add your custom domain to the Azure AD tenant

    https://msdn.microsoft.com/en-us/library/azure/hh969247.aspx

    Integrating On-Premises Active Directory (Identity Bridge)

    When you create a new Azure AD tenant, the contents of the directory will be managed independently from the on-premises Active Directory forest. This means that when a new user comes in to the organization, an administrator must create an on-premises Active Directory account and an Azure Active Directory account for the employee. Because these two accounts are separate by default, they also may have different user names and passwords, and they need to be managed separately.

    However, an organization can use Azure AD Connect to connect the on-premises Active Directory to Azure AD. When this is in place, users who are added to or removed from the on-premises Active Directory are automatically added to or removed from Azure AD. The user names and passwords are also kept synchronized between the two directories, so end users do not have different credentials for cloud and on-premises systems.

    AD FS can be used to add an identity federation trust between on-premises Active Directory and Azure AD, which enables the users in the organization to have a single sign-on experience. We call this scenario the "identity bridge" because it bridges the on-premises identity systems with the cloud, thereby enabling a single identity service for the enterprise.

    Recommended: Unless you have a cloud-only company (with no on-premises systems), you should incorporate this integration. Even if you are not using Azure AD, you will have a better experience with Azure and the other Microsoft cloud services that you may subscribe to.

    Synchronizing Users to the Cloud

    The goal for synchronization of identities is to extend the on-premises Active Directory into Azure AD. After synchronization is in place, Active Directory and Azure AD should be viewed as a single identity service with on-premises and cloud components, instead of two separate identity services.

    In most cases, managing the identities (such as on-boarding, off-boarding, and entitlement changes) still occurs on-premises by using identity management solutions that were specifically created for these scenarios.

    This is depicted in the "On-Prem" box in the following diagram. These systems are usually going to be different than the identity bridge systems that connect the on-premises Active Directory to Azure AD.

    Historically, there have been four tools available to do the job of the identity bridge, which has caused a lot of confusion. Therefore, we released a single tool that can be used for everything except the most complex of scenarios.

    When deciding on which synchronization tool to use, the choice should be between using Azure AD Connect or the Microsoft Identity Manager Synchronization Services with the Azure AD Connector. This is summarized in the following diagram.

    In general, the default stance should be to use Azure AD Connect, unless the scenario is extremely complex, requiring a lot of customization. Some key features of Azure AD are lost (such as password synchronization and write-back) when using Identity Manager, so it should only be used as a fallback option if absolutely necessary.

    Multiple Active Directory Forests

    Many customers do not have simple single-forest Active Directory environments, and dealing with multiple forests can be a challenge when integrating with Azure AD. Typically, customers fall into one of two scenarios:

    • They have an account and resource forest model.
    • They have multiple forests with active users in many of them.

    Single Forest with Multiple Domains

    Some customers have a single forest environment with multiple domains. Azure AD Connect natively handles this scenario, provided that the following conditions are met:

    • Users need to exist uniquely across the forest. A user cannot have an active account in more than one domain, because both accounts will be synchronized as separate identities in Azure AD.
    • If the domains in the forest use different UPN suffixes, each UPN suffix needs to be added to the Azure AD tenant as a custom domain name.
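    The two conditions above can be checked against an export of the forest before synchronization is enabled. The following is a minimal sketch under stated assumptions: the account dictionaries and their keys (person, domain, upn, enabled) are a hypothetical export format, and check_forest is not part of Azure AD Connect.

```python
from collections import defaultdict


def check_forest(accounts, custom_domains):
    """Return (people_with_accounts_in_several_domains, missing_upn_suffixes).

    accounts: iterable of dicts with the hypothetical keys 'person',
    'domain', 'upn', and 'enabled'.
    custom_domains: names already added to the directory as custom domains.
    """
    domains_by_person = defaultdict(set)
    suffixes = set()
    for acct in accounts:
        if not acct["enabled"]:
            continue  # only active accounts are considered here
        domains_by_person[acct["person"]].add(acct["domain"])
        suffixes.add(acct["upn"].rsplit("@", 1)[1].lower())
    duplicates = sorted(p for p, d in domains_by_person.items() if len(d) > 1)
    registered = {d.lower() for d in custom_domains}
    return duplicates, sorted(suffixes - registered)


accounts = [
    {"person": "bob", "domain": "emea", "upn": "bob@contoso.com", "enabled": True},
    {"person": "bob", "domain": "apac", "upn": "bob@apac.contoso.com", "enabled": True},
    {"person": "alice", "domain": "emea", "upn": "alice@contoso.com", "enabled": True},
]
duplicates, missing = check_forest(accounts, ["contoso.com"])
print(duplicates, missing)
```

    Any person reported in the first list would be synchronized as more than one identity, and any suffix in the second list would need to be added as a custom domain name first.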

    Account and Resource Forest Model

    When a customer has an account and resource forest model, there is a dedicated forest where all of the user identities reside (the account forest) and a dedicated forest for some or all of the applications (the resource forest). A one-way trust (often a forest trust) is in place so that the resource forest trusts the account forest. This relationship is depicted in the following diagram.

    This is most commonly seen with complex Exchange Server deployments. Often, there needs to be a representation of the user in the resource forest's Active Directory for the application to use. This is sometimes referred to as a shadow account. In most cases, it is a duplicate of the user's account from the account forest, but it is placed in a disabled state so that users cannot sign in to it.

    Azure AD Connect natively handles this scenario. If the resource forest contains data that needs to be added to Azure AD (such as mailbox information for an Exchange user), the synchronization engine detects the presence of disabled accounts with linked mailboxes. The appropriate data is then contributed to the Azure AD user account.

    Multiple Forests with Unique Users

    In this scenario, there are multiple independent forests in the environment, which may or may not have Active Directory trust relationships between them. This situation will be encountered in highly segmented organizations or companies that acquire other companies via mergers and acquisitions. The following diagram depicts what this architecture might look like.

    Users in this scenario have only a single account in one of the forests (they do not have multiple user accounts across forests). Because of this, you do not need the synchronization tool to match a user to multiple accounts.

    However, one decision that needs to be made is whether the accounts will be migrated into a single forest at some point. This is an important thing to consider, because it will determine whether you can use the objectGUID of the user accounts as the source anchor (which is used to match the Active Directory accounts to the Azure AD accounts).

    If the users will be migrated to a single forest at some point, you'll need to use a different source anchor, such as the user's email address or UPN. The reason is that the objectGUID can't be migrated with the user. After migration, there would be multiple accounts in Azure AD for migrated users—one for the old forest and another for the new forest.

    Mandatory: If users from the additional forests will be migrated into a single forest in the future, you must choose something other than the objectGUID as the source anchor attribute (such as the mail attribute).
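
The effect of the source anchor choice can be illustrated with a minimal sketch. It assumes a simplified model in which a migrated account is a new directory object with a new objectGUID and an unchanged mail value. The `AdAccount`, `migrate`, and `sync` names are hypothetical and are not part of Azure AD Connect.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AdAccount:
    """Hypothetical, simplified view of an on-premises AD user."""
    forest: str
    mail: str
    # A new objectGUID is generated whenever a new object is created.
    object_guid: uuid.UUID = field(default_factory=uuid.uuid4)


def migrate(account: AdAccount, target_forest: str) -> AdAccount:
    # Assumption: migration creates a new object in the target forest,
    # so the objectGUID changes while the mail attribute carries over.
    return AdAccount(forest=target_forest, mail=account.mail)


def sync(cloud_directory: dict, account: AdAccount, anchor_attribute: str) -> None:
    # The cloud object is keyed by the value of the source anchor.
    anchor = str(getattr(account, anchor_attribute))
    cloud_directory[anchor] = account.mail


before = AdAccount(forest="fabrikam.local", mail="bob@contoso.com")
after = migrate(before, "contoso.local")

by_guid: dict = {}
by_mail: dict = {}
for acct in (before, after):
    sync(by_guid, acct, "object_guid")
    sync(by_mail, acct, "mail")

print(len(by_guid))  # the same person appears twice when anchored on objectGUID
print(len(by_mail))  # the same person appears once when anchored on mail
```

In this model, anchoring on an attribute that survives the migration keeps a single cloud object per person, which is the behavior the mandatory guidance above is intended to preserve.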

    Multiple Forests with Duplicate Users

    This scenario is the same as the previous scenario (multiple forests with unique users) with the exception that a single user has multiple user accounts in different forests in the environment. These accounts are either:

    • Enabled (users likely have a password and sign in to these accounts)
    • Disabled (a shadow account is used to store attributes for an application, such as Exchange)

    Even though there are multiple user accounts in the organization, there should be only a single account for the user in Azure AD. To enable this, the synchronization service needs to be able to match user accounts across the forests to a single person. For this to happen, the accounts in each forest need to have an attribute that contains the same, unique value for a user.

    Mandatory: If a single person has multiple user accounts in different forests, you must choose a common attribute to match the accounts together.
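
As a rough illustration of matching on a common attribute, the following sketch groups accounts from several forests by a shared value. The account records, the forest names, and the choice of mail as the matching attribute are assumptions made for the example. This is not how the synchronization engine is implemented.

```python
from collections import defaultdict


def join_accounts(accounts, match_attribute="mail"):
    """Group accounts from different forests that share the same attribute value."""
    people = defaultdict(list)
    for account in accounts:
        # Normalize so that case or stray whitespace does not split one person in two.
        key = account[match_attribute].strip().lower()
        people[key].append(account)
    return dict(people)


# Hypothetical sample data: one enabled account and one disabled shadow account
# for the same person, plus an unrelated user.
accounts = [
    {"forest": "accounts.contoso.com", "sam": "bob", "mail": "bob@contoso.com", "enabled": True},
    {"forest": "resources.contoso.com", "sam": "bob.shadow", "mail": "Bob@contoso.com", "enabled": False},
    {"forest": "accounts.contoso.com", "sam": "alice", "mail": "alice@contoso.com", "enabled": True},
]

joined = join_accounts(accounts)
for person, linked in joined.items():
    print(person, [a["forest"] for a in linked])
```

The matching only works if the chosen attribute holds the same unique value in every forest, which is why the attribute must be agreed on before synchronization is configured.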

    UPN Alignment

    The User Principal Name (UPN) is the attribute in Azure AD that is used for a user's sign-in name. By default, this is sourced from the on-premises Active Directory directory by using the userPrincipalName attribute for the user account. Because of legacy guidance, some customers' AD forests use non-routable UPN suffixes or UPN suffixes that are different from the public-facing DNS name of the organization.

    For example, the UPN suffix in Active Directory might be @contoso.local, while the public-facing DNS name is contoso.com. In this situation, the Active Directory users have a sign-in name similar to bob@contoso.local, rather than bob@contoso.com.

    Azure AD requires that the UPN suffix be a valid public domain name that is registered with an Internet name registrar. This is to ensure that it's unique across all Azure AD tenants and that only one organization owns the domain name. When the tenant is federated with an on-premises identity provider, the UPN suffix is used to determine where to redirect the user for authentication.
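
A simple pre-check can flag accounts whose UPN suffix would not be accepted. The sketch below assumes you already have an exported list of UPNs and the set of domain names verified in the tenant. The function names are hypothetical, and the comparison is an exact, case-insensitive match on the suffix only.

```python
def upn_suffix(upn: str) -> str:
    """Return the part of a UPN after the last '@', lowercased."""
    local, sep, suffix = upn.rpartition("@")
    if not sep or not local or not suffix:
        raise ValueError(f"not a UPN: {upn!r}")
    return suffix.lower()


def needs_alignment(upn: str, verified_domains) -> bool:
    """True if the UPN suffix is not one of the tenant's verified domain names."""
    return upn_suffix(upn) not in {d.lower() for d in verified_domains}


# Hypothetical inputs: verified domains in the tenant and a few on-premises UPNs.
verified = {"contoso.com"}
upns = ["bob@contoso.local", "alice@contoso.com"]

flagged = [u for u in upns if needs_alignment(u, verified)]
print(flagged)
```

Running a check like this against an export of user accounts gives an early estimate of how many users a UPN alignment effort would affect.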

    Customers that have a UPN suffix that is not routable or not desirable for the user logon name have two options:

    • Perform a UPN rationalization exercise
    • Use the Alternate Login ID

    UPN Rationalization

    UPN rationalization entails that the organization add a new UPN suffix to the Active Directory forest, and then change the UPN suffix of every account to match the new UPN suffix. This is the preferred approach for UPN alignment because it provides the best experience for users after the alignment is complete. There are challenges with UPN rationalization, however.

    Applications that are Dependent on the UPN Attribute

    It is possible that some of a customer's applications use the UPN to store data about users in the application. If this is the case, changing the UPN in Active Directory would break those applications. The risk associated with performing a UPN rationalization exercise increases with the size of the organization.

    For smaller customers with a well-defined set of applications, it's easier to determine if changing the UPN suffix will impact any of the applications in use. However, for larger organizations, it is nearly impossible to gauge the impact. In that situation, it is best to pick a sample of users that is representative of all of the business groups in the organization, and first test the change with their accounts.

    Mandatory: If user certificates use the UPN in the Subject Name field, the certificates need to be reissued during the UPN rationalization.

    Recommended: Before rationalizing UPNs, build a catalog of the applications that have a dependency on the UPN attribute of the users, and test the new UPN on users of those applications.

    User Certificates Issued with UPN as the Subject Name

    Another big challenge when changing UPNs is that some organizations issue X.509 certificates that have the UPN value in the Subject Name field of the certificate. The impact varies with each customer because the certificates could be used for authentication or for signing or encrypting data, such as when sending email messages.

    The data in a certificate cannot simply be changed because the certificates are digitally signed by the Certification Authority that issues them. If the data is changed, the signature is broken and the certificate is no longer valid. Therefore, the certificates must be reissued when the UPN is changed.
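
The reason a signed value cannot be edited in place can be shown with a small sketch. Real certificates are signed with the Certification Authority's private key by using an asymmetric algorithm. To stay within the Python standard library, this example substitutes a keyed hash (HMAC) as a stand-in for that signature, and the key and subject values are made up.

```python
import hashlib
import hmac

# Stand-in for the CA's private signing key (hypothetical value).
CA_KEY = b"hypothetical-ca-secret"


def sign(subject: str) -> bytes:
    """Produce a signature over the subject (HMAC used as a stand-in)."""
    return hmac.new(CA_KEY, subject.encode("utf-8"), hashlib.sha256).digest()


def is_valid(subject: str, signature: bytes) -> bool:
    """Check that the signature matches the subject it is presented with."""
    return hmac.compare_digest(sign(subject), signature)


issued_subject = "bob@contoso.local"
signature = sign(issued_subject)

print(is_valid(issued_subject, signature))     # original data still verifies
print(is_valid("bob@contoso.com", signature))  # edited data no longer verifies

# Reissuing means signing the new subject, which yields a different signature.
reissued_signature = sign("bob@contoso.com")
print(is_valid("bob@contoso.com", reissued_signature))
```

Only the holder of the signing key can produce a signature that matches the new UPN, which is why the certificates have to be reissued rather than edited.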

    The process of obtaining new certificates varies between customers, so if there are certificates that rely on the UPN attribute, it's important to understand the process that the customer uses for reissuing those certificates. In some cases, this may mean provisioning a new "soft" certificate (a certificate with a private key that resides on the computer, rather than a hardware device) to the user's machine. Or it may require that the user write the new certificate to their smartcard.

    Recommended: Understand the process used for reissuing user certificates, so that you can adequately communicate with users and prepare for a massive reissuance event, if needed.

    Identity Management Systems

    If an identity management solution is in place, such as Microsoft Identity Manager, it's likely that there's a dependency on the UPN attribute. In these cases, the identity management system is probably managing the value of the UPN attribute for users. So if the UPN is changed on the user account in Active Directory, the identity management system would set it back to the old value (which it deems authoritative).

    Depending on the configuration of the identity management system, it is also possible that the UPN attribute is being used as an anchor for joining identities in different systems to the identities in Active Directory. Therefore, changing the UPN without updating the identity management system could result in identities being disassociated in the connected systems. At best, this would cause the identities to stop synchronizing to those systems. At worst, the identities would be deleted from the target system.

    Recommended: Spend some time understanding the identity management systems that are used for managing Active Directory within the organization to ensure that there isn't a dependency on the UPN of user accounts.

    Alternate Login ID

    The Alternate Login ID is a way to achieve UPN alignment without having to modify the UPN attribute of user accounts in Active Directory. When using the Alternate Login ID, an Active Directory attribute other than userPrincipalName is selected to feed the UPN of Azure AD. This can be any unique, indexed attribute that uses the user@domain.com format. The impact to users is much less than changing their UPNs.
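
Conceptually, the Alternate Login ID changes which on-premises attribute supplies the cloud sign-in name. The sketch below models that choice with a hypothetical `cloud_sign_in_name` function and a made-up user record. It is not the Azure AD Connect configuration itself.

```python
def cloud_sign_in_name(user: dict, source_attribute: str = "userPrincipalName") -> str:
    """Return the value that would feed the Azure AD UPN for this user."""
    value = user.get(source_attribute)
    # The selected attribute must hold a value in the user@domain.com format.
    if not value or value.count("@") != 1 or value.startswith("@") or value.endswith("@"):
        raise ValueError(f"{source_attribute} is not in user@domain format: {value!r}")
    return value


# Hypothetical on-premises user whose UPN suffix is not routable.
user = {"userPrincipalName": "bob@contoso.local", "mail": "bob@contoso.com"}

default_name = cloud_sign_in_name(user)            # sourced from userPrincipalName
alternate_name = cloud_sign_in_name(user, "mail")  # sourced from the alternate attribute
print(default_name, alternate_name)
```

The on-premises userPrincipalName is left untouched in this model, which is why the two directories end up with different UPN values for the same user.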

    Although the Alternate Login ID can help in some situations, it should not be the default solution because it has some drawbacks, including:

    • Cannot be used with an Exchange hybrid online deployment.
    • Configuring it on an existing Azure AD Sync implementation that has already synchronized to Azure AD requires that you manually change the UPN on each Azure AD account.
    • Kerberos-based single sign-on no longer works for applications that rely on the Sign-in Assistant (such as Lync, OneDrive for Business, and Office Pro Plus). Users are prompted to enter credentials, which then can be cached by the Windows Credential Manager, but users will be prompted on a regular basis when their password changes.
    • Azure AD Application Proxy requires that the UPN in Azure AD is the same as the UPN in the on-premises Active Directory for Kerberos constrained delegation to work. Therefore, an Alternate Login ID will break Kerberos constrained delegation for Azure AD Application Proxy.

    Due to these issues, it is recommended that Alternate Login ID be used as a secondary option only when UPN rationalization is not possible for a customer.

    Synchronization Server Availability

    It is not possible to have a high availability design for the server hosting the Azure AD Connect service. By default, the synchronization server runs the synchronization job to Azure AD every three hours by using a scheduled task on the server. This interval can be decreased, if needed. High availability for the Azure AD Connect server should not be necessary in most situations because synchronization is not a continuous event.

    In the event of a catastrophic failure, a new Azure AD Connect server can be built and synchronized in a couple of hours for a medium-sized business. Larger businesses with more than 100,000 users will take more time to synchronize. If you need a faster recovery time, Azure AD Connect can be configured to use a dedicated SQL Server deployment with high availability.

    Consider a dedicated SQL Server environment in the following scenarios:

    • The organization has more than 100,000 users. The SQL Express LocalDB used by Azure AD Connect is limited to a 10 GB database. Therefore, if an organization has more users than SQL Express can hold, a full SQL Server implementation is required.
    • A large organization wants to have a low recovery time for the synchronization service.

    Optional: A dedicated SQL Server instance can be used to provide better performance and high availability options for the Azure AD Connect synchronization service.

    Password Hash Synchronization

    With password hash synchronization, the Azure AD Connect service will synchronize one-way SHA256 hashes of Active Directory password hashes into Azure AD. This allows a user that signs into Azure AD to use the same password that is used to sign in to the on-premises Active Directory.
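
The idea of synchronizing a hash of a hash can be sketched as follows. This is a deliberately simplified illustration that applies a single SHA-256 pass to a made-up 16-byte value standing in for the on-premises password hash. The production service is understood to apply additional processing (such as salting) that is not shown here, so do not treat this as the actual algorithm.

```python
import hashlib


def rehash_for_cloud(on_prem_hash: bytes) -> str:
    """Apply a one-way SHA-256 hash to an existing password hash (simplified)."""
    return hashlib.sha256(on_prem_hash).hexdigest()


# Made-up stand-in for a password hash stored in Active Directory.
on_prem_hash = bytes.fromhex("00112233445566778899aabbccddeeff")

cloud_hash = rehash_for_cloud(on_prem_hash)
print(len(cloud_hash))  # a SHA-256 digest is 64 hexadecimal characters
```

Because the function is one-way, the value held in the cloud cannot be reversed to obtain the on-premises hash, and the clear-text password is not part of the exchange at all.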

    Even though the default synchronization frequency for Azure AD Connect is every three hours, password hash synchronization occurs every two minutes, allowing users who change their passwords in on-premises Active Directory to begin using their new password in Azure AD almost immediately.

    When you enable password hash synchronization, it applies to all users that are being synchronized to Azure AD. This means that you cannot pick and choose which users' password hashes get synchronized. The only way to prevent a user's password hash from being synchronized to Azure AD is by filtering out the user in the synchronization policies, thereby removing their account from Azure AD.

    If you are using federated authentication for Azure AD, we still often recommend enabling password hash synchronization. This approach is recommended to allow password-based sign in to be used as a fallback if the customer's on-premises AD FS instance goes down.

    If a user's password is already synchronized to Azure AD, enabling password-based sign in is as simple as running a PowerShell script. Users can safely be switched back to federated authentication after the problem is resolved and AD FS is back online.

    Recommended: Even if all of a customer's users are signing in to Azure AD with AD FS, it is recommended to enable password synchronization. Doing so provides a good fall-back method for user authentication if AD FS goes offline.

    Signing In to Azure Active Directory

    After user accounts are synchronized from the on-premises Active Directory to the Azure AD tenant, users can sign in to the accounts and access applications that are integrated with Azure AD, such as Office 365. There are two options for signing in users to Azure AD:

  1. Provide the user name and password to Azure AD for verification
  2. Sign in to an on-premises identity provider that is trusted by Azure AD

Authenticating to Azure AD

The user object in Azure AD is separate from the object in the on-premises Active Directory. Because of this, the Azure AD object has its own user name and password. Unless password hash synchronization is enabled in Azure AD Connect, users will have different passwords for Active Directory and Azure AD.

This can confuse users and lead to a poor cloud experience. Therefore, it's recommended to enable password hash synchronization unless there is a specific reason that the customer doesn't want it enabled.

Recommended: Enable password hash synchronization so that the Azure AD password for users is the same as the on-premises Active Directory password.

Authenticating to an On-Premises Identity Provider

Azure AD supports the ability to establish an identity federation trust with an on-premises identity provider (IdP), such as Active Directory Federation Services (AD FS). This enables users to have a desktop single sign-on experience when accessing resources that are integrated with Azure AD.

With this experience, an end user would sign in to a domain-joined workstation and not be prompted again for a password throughout the entire session, regardless of which applications are used.

When a federation trust is in place, Azure AD defers to the on-premises identity provider to collect the user's credentials and perform the authentication. After authenticating the user, the on-premises identity provider creates a signed security token to serve as proof that the user was successfully authenticated.

This security token may also contain data about the user (called claims), which can then be provided to Azure AD for various purposes. The security token is given to Azure AD, which then verifies the signature on the token and uses it to provide access to the applications. The following diagram illustrates this behavior:

Domain Names

When enabling a federated identity relationship between Azure AD and an on-premises identity provider, an entire domain name in Azure AD is converted from a standard domain to a federated domain. This impacts all of the users that have UPNs under the domain name. You cannot have a mix of federated and non-federated users in a domain name.

Note: You cannot convert the default <tenant>.onmicrosoft.com domain name to a federated domain name. Only custom domain names added to Azure AD can be federated.

Any subdomains under a domain namespace will have the same configuration as the parent domain. For example, if the custom domain name contoso.com is configured as a federated domain, child.contoso.com will also be a federated domain. Azure AD applies this automatically, and it cannot be overridden.
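
The inheritance rule can be expressed as a small lookup. The sketch below assumes a hypothetical dictionary of configured domain names and their authentication type. It resolves a domain by finding its highest configured ancestor, so that a parent's setting always applies to its subdomains.

```python
from typing import Optional


def effective_auth_type(domain: str, configured: dict) -> Optional[str]:
    """Return the authentication type that applies to a domain or subdomain."""
    labels = domain.lower().split(".")
    # Check the shortest candidate first (for example contoso.com before
    # child.contoso.com) so that a parent's configuration cannot be overridden.
    for start in range(len(labels) - 2, -1, -1):
        candidate = ".".join(labels[start:])
        if candidate in configured:
            return configured[candidate]
    return None  # the domain is not configured in this tenant


# Hypothetical tenant configuration.
configured = {
    "contoso.com": "federated",
    "contoso.onmicrosoft.com": "standard",
}

print(effective_auth_type("contoso.com", configured))
print(effective_auth_type("child.contoso.com", configured))
print(effective_auth_type("contoso.onmicrosoft.com", configured))
print(effective_auth_type("fabrikam.com", configured))
```

A check like this can help identify which users will be affected before a parent domain is converted, because every UPN under the parent and its subdomains changes authentication type together.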

Recommended: Federated domains can be converted back to standard domains at any time. Using this option in conjunction with password synchronization can provide a great fall-back strategy if the customer's identity provider goes down.