Exchange Virtualization “To Be or Not to Be?”

Exchange Virtualization recommendations

Today everybody is asking the same question, from top management to IT professionals: should we virtualize Exchange?

The answer differs depending on whom you ask and their background. For top management the answer is, to an extent, "it depends"; for an IT professional it is usually "yes", followed by plenty of counter-questions and information gathering.

The following information will help the IT professionals to make their decision.

  • Determine the number of users you are sizing the environment for. I would not virtualize for more than 5,000 to 6,000 users (medium profile).
  • Before virtualizing your environment, see whether you can optimize your deployment cost by consolidating Exchange roles (Microsoft recommends multi-role Exchange deployments):

Simple unit of scale   Organizations that anticipate regular growth in the number of mailboxes should consider deploying multiple-role servers. Because each multiple-role server represents a building block, this model allows the easy addition of building blocks to support the need for increased capacity.

Large-scale deployments that want to leverage modern processors   Based on scalability testing performed prior to the release-to-manufacturing (RTM) version of Exchange 2010, multiple-role  servers can effectively utilize hex core (or more) processors in a single server. This capability allows large organizations to reduce the number of servers by combining the Mailbox, Hub Transport, and Client Access server roles instead of deploying these roles separately on servers with fewer processor cores. This approach leverages the building block model described earlier to provide a platform for large-scale deployments while reducing the overall number of servers required. Scalability of the multiple-role configuration on larger core count systems should be validated with lab testing prior to production deployment.

Server deployments with internal storage   Many servers available today have two physical multi-core processors and 10 to 16 internal disks. Several improvements in Exchange 2010 reduce I/O requirements, making these servers a cost-effective solution. Depending on user profile and disk type, these servers generally support up to 4,000 mailboxes. We recommend adding the Client Access and Hub Transport server roles to these servers to utilize the additional CPU and make these servers self-contained building blocks.

Risk mitigation scenarios where the number of mailboxes hosted on a Mailbox server is limited   Multiple-role servers are a solution for deployments where risk management policies limit the number of mailboxes that can be deployed on a Mailbox server. For example, say an organization with 10,000 mailboxes has a policy that a single server outage can’t affect more than 25 percent of the mailboxes in the environment. This requirement limits the number of mailboxes per Mailbox server to 2,500. The additional capacity on that server could be utilized by adding the Client Access and Hub Transport server roles to the server.
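The risk-management constraint above reduces to simple arithmetic. A minimal sketch, using the hypothetical 10,000-mailbox organization from the example:

```python
# Hypothetical policy: a single server outage may affect at most 25 percent
# of the organization's mailboxes (numbers from the example above).
total_mailboxes = 10_000
max_outage_fraction = 0.25

# Cap on mailboxes per Mailbox server implied by the policy.
max_mailboxes_per_server = int(total_mailboxes * max_outage_fraction)  # 2,500

# Minimum number of Mailbox servers needed (ceiling division).
min_mailbox_servers = -(-total_mailboxes // max_mailboxes_per_server)  # 4
```

Any CPU headroom left by the capped mailbox count is what makes adding the Client Access and Hub Transport roles to the same server attractive.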

Small organizations and branch office deployments   Except as noted when Windows Network Load Balancing is used, a multiple-role deployment is a recommended solution for deployments where the primary goals are to minimize the number of physical servers, operating system instances, and Exchange servers to manage. Running the Client Access, Hub Transport, and Mailbox server roles on the same physical server provides the necessary role redundancy with a minimum requirement of two or three physical servers.

  • Microsoft recommends that the virtual-to-physical CPU ratio not exceed 2:1 for Exchange hosts. I would personally start with the recommended sizing, then adjust CPU cycles later by fine-tuning the Hyper-V CPU parameters.
  • Do not configure dynamic memory / memory ballooning for Exchange virtual environments.
  • Use pass-through disks for the Mailbox and Hub Transport roles (databases).
  • In DAG environments, NICs need to be provisioned separately, with the separation done at the physical NIC level. If that isn't possible, watch the throughput on the NICs. (Keep in mind that a 1 Gbps network typically delivers 20 to 30 percent less than its rated throughput.)
  • I personally recommend keeping separate hosts for the messaging environment. Exchange administrators should understand the virtualization technology before virtualizing Exchange. Organizations with two different teams usually have two different opinions, resulting in huge support gaps.

Note: Organizations planning for a private cloud should have a separate virtualization strategy for the messaging environment, because the rules of thumb for a private cloud differ from those for an Exchange environment.


Example of Sizing for an Exchange 2010 Multiple-Role Scenario

The following example illustrates the server sizing process for multiple-role servers. The example has the following design assumptions:

  • Total mailbox count: 4,000
  • Mailbox profile: 100 messages per day (for example, 20 sent and 80 received)
  • Database cache per mailbox: 6 MB (based on a 100-messages-per-day profile)
  • Availability requirements: mailbox resiliency within a single site; protection against simultaneous failure of three database copies and two servers
  • Database requirements: 10 databases in the DAG, 400 mailboxes per database
  • Server platform: 2 x 4-core 2.26 gigahertz (GHz) processor-based server (8 cores)

The following process applies:

  1. Calculate server count: A four-node DAG is required to protect against the simultaneous failure of two servers.
  2. Calculate the maximum active mailboxes per server based on the activation model: Assuming the active databases are equally distributed across the nodes, each server ideally hosts 1,000 active mailboxes (4,000 ÷ 4). To calculate the active mailbox count after a double-node failure (based on this example), the mailbox count is divided by the two remaining nodes, which equals 2,000 active mailboxes per node (4,000 ÷ 2).
    In this example, the MaximumActiveDatabases parameter on the Set-MailboxServer cmdlet is configured for 5 so that no more than half of the 10 databases (2,000 mailboxes) become active on a single server.
  3. Calculate active mailbox CPU requirements: Multiply the maximum number of active mailboxes on a server by the megacycles per active mailbox (2,000 × 2 megacycles = 4,000 megacycles), based on the Estimated IOPS per mailbox based on message activity and mailbox database cache table in Understanding the Mailbox Database Cache. Increase this value by 10 percent for each additional database copy.
    In this example, there's one active copy and three passive copies for every database, so the 4,000 megacycles is increased by 30 percent (4,000 × 1.3 = 5,200 megacycles). For more information, see "Database Cache Metrics" in Understanding the Mailbox Database Cache.
  4. Calculate passive mailbox CPU requirements: Multiply the number of passive mailboxes hosted on the server when it is hosting the maximum number of active mailboxes by the megacycles per passive mailbox (2,000 × 0.3 megacycles = 600 megacycles). For more information, see "Database Cache Metrics" in Understanding the Mailbox Database Cache.
  5. Add the active and passive CPU requirements to get the total CPU requirement: In this example, 5,200 active mailbox megacycles + 600 passive mailbox megacycles = 5,800 megacycles total CPU requirement.
  6. Apply the Mailbox CPU requirement to the hardware platform: This example uses a 2 x 4-core 2.26-GHz processor-based server. Based on the guidance in Mailbox Server Processor Capacity Planning, this equates to 40,055 available megacycles. Divide the required megacycles by the available megacycles for the server platform to estimate the CPU utilization at the peak period after a double-node failure (5,800 ÷ 40,055 = 14 percent predicted CPU utilization).
    We recommend that the Mailbox server role portion of multiple-role configurations be designed to not exceed 40 percent utilization during peak periods (for example, simultaneous failure of two nodes). This design allows sufficient space to accommodate CPU utilization of Client Access and Hub Transport server roles while maintaining total server CPU utilization at less than 80 percent during peak periods (for example, simultaneous failure of two nodes).
  7. Calculate active mailbox memory requirements: Multiply the number of active mailboxes by the required database cache per mailbox. In this example, with a double server failure, each remaining server hosts 2,000 active mailboxes: (2,000 × 6 MB) ÷ 1,024 = 11.7 GB. The database cache requirements are based on the mailbox profile. For more information, see "Database Cache Metrics" in Understanding the Mailbox Database Cache.
  8. Apply the total memory requirements to the hardware platform: The total memory required is based on the database cache requirements and the server design (dedicated or multi-role). For more information, see the Default mailbox database cache sizes table in Understanding the Mailbox Database Cache. The total memory requirement for the multi-role server in this example is 20.9 GB ((4 GB + 11.7 GB) ÷ 0.75). Because 20.9 GB isn't a standard memory configuration, round up to 24 GB or the closest memory configuration that your server supports.
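The steps above can be sketched as a short calculation. This is a minimal sketch sizing for the worst case of a simultaneous double-node failure; the constant names are mine, and the per-mailbox values (2 megacycles active, 0.3 megacycles passive, 6 MB cache, 40,055 available megacycles for this platform) come from the example's assumptions:

```python
# Sizing sketch for the multiple-role example (assumed constants noted above).
TOTAL_MAILBOXES = 4000
DAG_NODES = 4
FAILED_NODES = 2                # worst case: simultaneous double-node failure
PASSIVE_COPIES = 3              # one active + three passive copies per database
AVAILABLE_MEGACYCLES = 40055    # 2 x 4-core 2.26 GHz server

# Active mailboxes per surviving node after the double-node failure.
active_per_node = TOTAL_MAILBOXES // (DAG_NODES - FAILED_NODES)        # 2,000

# Active CPU: 2 megacycles each, +10 percent per additional database copy.
active_megacycles = active_per_node * 2 * (1 + 0.10 * PASSIVE_COPIES)  # 5,200

# Passive CPU for the 2,000 passive mailboxes still hosted on the server.
passive_megacycles = 2000 * 0.3                                        # 600

# Total CPU and predicted peak utilization on this platform.
total_megacycles = active_megacycles + passive_megacycles              # 5,800
utilization = total_megacycles / AVAILABLE_MEGACYCLES                  # ~14%

# Database cache (6 MB per active mailbox) and total memory (4 GB base,
# cache held to 75 percent of physical memory on a multi-role server).
cache_gb = active_per_node * 6 / 1024        # ~11.7 GB
total_memory_gb = (4 + cache_gb) / 0.75      # ~20.9 GB -> round up to 24 GB
```

Rerunning the sketch with your own mailbox count, profile megacycles, and platform megacycles gives a first-cut answer before applying the 10-12 percent virtualization overhead discussed below.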

Best Practices for Virtualizing Microsoft Exchange 2010

  • Supported Exchange Virtualization Scenarios
    • Exchange 2010 SP1 or later
    • Hyper-V or any hypervisor in the Server Virtualization Validation Program (SVVP) – link provided below.
  • Items Not Supported when Virtualizing Exchange
    • Hypervisor snapshots
    • Differencing / Delta disks
    • CPU oversubscription in a ratio > 2:1
    • Applications running on the parent / root partition
    • VSS backups of VMs from root
    • NAS storage of virtual disk files
  • JetStress Support in Virtualized Environments
    • Supported in VMs on Microsoft Windows Server 2008 R2 or later
    • Supported in VMs on Microsoft Hyper-V Server 2008 R2 or later
    • Supported in VMs on VMware ESX 4.1 or later
    • More Info –
  • Big Problems to Avoid for Production Exchange VMs
    • Dynamic Memory / Memory Overcommit
    • VM Snapshots
    • CPU Oversubscription
  • Overview of Best Practices
    • Hypervisor adds CPU overhead – 10-12% in our Exchange 2010 tests
    • Size for physical and provide those resources to each VM
    • Exchange is architected for scale-out scenarios, avoid “all eggs in one basket”
  • Resource Sizing
    • Start with physical sizing process – use calculator (listed below)
    • Account for virtualization overhead (10-12%)
    • Determine VM placement to account for HA
    • Size root servers, storage and network infrastructure
  • Guest VM sizing
    • Size Mailbox role first – other role sizes factored from Mailbox server requirements
    • Considerations for use of Multi-role servers – Mailbox, Hub and CAS roles on single VM
  • Unified Messaging Sizing
    • Min 4 Virtual Processors (VP)
    • VM with 4VP & 16GB memory can handle 40 concurrent calls with Voice Mail Preview (65 calls without)
  • Storage Decisions
    • Exchange storage separate from Guest OS virtual disk physical storage
    • Must be fixed virtual disk, SCSI pass-through (RDM) or iSCSI (terminated at host or guest)
    • SCSI pass-through (RDM) recommended to host queues, DBs and logfile streams unless using Hyper-V Live Migration where CSV is recommended
    • Must be block-level storage – NAS volumes not supported
  • Virtual Processors
    • Prefer smaller number of multi-core VMs vs many single-core VMs
    • Don’t assume that a hyperthreaded (SMT) CPU is a full CPU core
  • Private Cloud
    • Good model for providing virtual infrastructure resources  to Exchange, but be careful with “dynamic” cloud capabilities
    • Be prepared to apply different resource management policies to Exchange VMs
  • Host-based Failover Clustering
    • Not an “Exchange Aware” HA Solution – Does not provide HA in the event of storage failure / data corruption
    • If using, combine with DAG when possible to provide maximum HA – Admin can re-balance DAG after failover to redistribute
  • VM Live Migration and Exchange
    • DAG does not need to be dynamically re-balanced
    • Use CSV rather than pass-through LUNs for all Mailbox VM storage
    • Consider relaxing cluster heartbeat timeouts (5 seconds = default, 30 seconds = max recommended)
    • Size network appropriately for Live Migration
  • VM Placement
    • Don’t co-locate DAG database copies on same physical hosts
    • Distribute VMs running same roles to different physical hosts
    • If not using multi-role VMs, consider isolating Mailbox and Hub role VMs on separate physical hosts if possible.