Load balancing
- 1 Overview
- 1.1 IIS load balancer
- 1.2 Azure options
- 1.2.1 Traffic manager
- 1.2.2 Load balancer
- 1.2.3 Application gateway
- 1.3 Uptime SLAs
- 1.4 Case study
- 1.4.1 Open question
Overview
The EGIS architecture calls for paired web servers to provide capacity and resiliency, and eventually high availability. To make use of these web servers, a load balancer must direct traffic to them.
Many options are available, though few are supported in any way by ESRI. The ESRI PowerShell DSC project, which we use for our ESRI software installations, can provision an IIS load balancer. ESRI's various installation documents are inconsistent, or intentionally vague, but do recognize that clients may use 3rd party load balancers.
IIS load balancer
We're using ESRI PowerShell DSC to install and configure ArcGIS Enterprise. Their templates include options to instantiate a virtual machine to act as a load balancer, using IIS rewrite rules and a server farm setup.
Azure options
Azure offers many load balancing solutions, and any of these products could serve as our load balancer. These options are not directly supported by ESRI, so the EGIS team would be responsible for supporting them (with the help of our Microsoft colleagues).
Traffic manager
Azure traffic manager is a simple DNS switch that can route to a pool of public addresses. The priority and distribution of network traffic can be defined in many ways. A traffic manager is crucial in the implementation of disaster recovery and will be in place in our production instance.
Load balancer
Azure load balancer distributes traffic to healthy VMs and is commonly used to balance traffic to VMs paired in availability sets. It is a simple load balancer, fairly similar to the VM-based IIS load balancing solution.
Application gateway
An application gateway is a more fully featured load balancing option that supports SSL offloading, multi-application routing, etc. This service would be similar to ESRI's web adaptor.
An application gateway will most likely be used by TC to manage traffic coming to all TC cloud apps and distribute it on to EGIS via a URL similar to tc.gc.ca/egis.
Uptime SLAs
According to Azure's current SLA for virtual machines, a single VM has guaranteed connectivity 99.9% of the time. SLAs for each of the Azure services discussed are also available.
Comparing uptimes of each approach
Service | % Uptime | Minutes down/month |
---|---|---|
VM | 99.9% | 43.83 |
App Gateway | 99.95% | 21.92 |
Load Balancer | 99.99% | 4.38 |
Traffic Manager | 99.99% | 4.38 |
-- | -- | -- |
Paired VMs [1] | 99.95% | 21.92 |
[1] Note: paired VMs spread across two Availability Zones have a 99.99% uptime SLA, but Availability Zones are not yet available in any Canadian region, although they are scheduled for the CanadaCentral region in the near future. The current EGIS deployment makes use of Availability Sets.
Uptime percentages were converted to downtimes using the table found on Wikipedia's page on high availability.
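The conversion from an uptime percentage to minutes of downtime can also be sketched in a few lines of Python. This is only an illustration, not the method used to build the table: the function name `downtime_minutes` is hypothetical, and the month length of 365.25 / 12 days is an assumption chosen because it matches the 43.83 minute figure for 99.9%.

```python
# Assumed month length: an average month of 365.25 / 12 days, in minutes.
MINUTES_PER_MONTH = 365.25 / 12 * 24 * 60  # 43830.0

def downtime_minutes(uptime_percent, period_minutes=MINUTES_PER_MONTH):
    """Return the minutes of allowed downtime in a period for a given uptime SLA."""
    return (1 - uptime_percent / 100) * period_minutes

# SLA percentages taken from the table above.
slas = {
    "VM": 99.9,
    "App Gateway": 99.95,
    "Load Balancer": 99.99,
    "Traffic Manager": 99.99,
    "Paired VMs": 99.95,
}

for service, pct in slas.items():
    print(f"{service}: {pct}% uptime, about {downtime_minutes(pct):.2f} minutes down per month")
```

Passing `period_minutes=30 * 24 * 60` instead gives the figures for a strict 30-day window, which are slightly lower (43.2 minutes for 99.9%).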
Calculating an overall SLA
If we treat each tier as an independent component that must be up for the system to be up, which is a worst-case assumption, we can calculate an overall system uptime by multiplying the component uptimes together (ref). Downtime per month is calculated using http://www.slatools.com/sla-uptime-calculator. This yields an estimated overall uptime as follows:
TM | IIS LB | WEB | POR | GIS | DS | Overall Uptime | Overall downtime/month |
---|---|---|---|---|---|---|---|
99.99% | 99.90% | 99.95% | 99.95% | 99.95% | *99.90% | 99.64% | 155.5 min |
TM | Azure LB | WEB | POR | GIS | DS | Overall Uptime | Overall downtime/month |
---|---|---|---|---|---|---|---|
99.99% | 99.99% | 99.95% | 99.95% | 99.95% | *99.90% | 99.73% | 117 min |
* Since our DS0 machine currently hosts the file share used by the paired portal and GIS machines, we can't consider it a load balanced resource, so it is assigned the single-VM SLA of 99.90%.
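The multiplication behind the two tables above can be sketched in Python as a cross-check. The function names are hypothetical. The 30-day month and the step of rounding the uptime to two decimals before converting to minutes are assumptions, chosen because they appear to reproduce the downtime figures in the tables; the slatools calculator may work differently.

```python
from math import prod

MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43200; assumed period for the downtime column

def overall_uptime(tier_uptimes_percent):
    """Multiply per-tier uptimes (in percent) to get the overall uptime in percent."""
    return prod(p / 100 for p in tier_uptimes_percent) * 100

def overall_downtime_minutes(tier_uptimes_percent, period_minutes=MINUTES_PER_30_DAYS):
    """Minutes of downtime per period, computed from the uptime rounded to 2 decimals."""
    uptime = round(overall_uptime(tier_uptimes_percent), 2)
    return (100 - uptime) / 100 * period_minutes

# Tier order: TM, load balancer, WEB, POR, GIS, DS (values from the tables above).
with_iis_lb = [99.99, 99.90, 99.95, 99.95, 99.95, 99.90]
with_azure_lb = [99.99, 99.99, 99.95, 99.95, 99.95, 99.90]

for label, tiers in [("IIS LB", with_iis_lb), ("Azure LB", with_azure_lb)]:
    print(f"{label}: {overall_uptime(tiers):.2f}% uptime, "
          f"about {overall_downtime_minutes(tiers):.1f} min down per 30 days")
```

Only the load balancer tier differs between the two lists, so the gap in overall uptime comes entirely from replacing the single-VM SLA (99.90%) with the Azure load balancer SLA (99.99%).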
Case study
We've recently deployed a production-style instance in Bill's sandbox, TC-Sandbox-BXUeGIS-RG. This instance has a virtual machine load balancer in front of the two web machines, and also has a traffic manager and an Azure load balancer configured.
VM load balancer: https://tstweb.tcgis.ca/portal/home/
Azure load balancer: https://tst-tm.tcgis.ca/portal/home/
Open question
Given the Treasury Board's cloud-first preference and ESRI's lack of specific support for 3rd party load balancers, which option do we choose? What do we need to do to feel comfortable with using an Azure load balancer?
Update: Starting with v2 of the ArcGIS PowerShell module, the DSC templates reference an approach for a 3rd party load balancer. See the module wiki for a description of the ExternalLoadBalancer variable.