Appendix E. Advanced Nodes configuration - Node Instances Synchronization
Sentinet supports load balancing of virtual services through multiple Node instances: instances deployed on different machines or web sites that share the same Node identifier (Node Key) and the same configuration. Each Node instance is physically isolated from the others: it communicates directly with the Repository Web Application's services and keeps its own copy of the configuration. When the configuration changes in the Sentinet Repository, each load-balanced Node instance downloads it, stores it in its local persistent cache, and starts or restarts the affected virtual services.
In most cases load-balanced Node instances do not have to be synchronized and can be updated in a live environment one at a time, as long as at least one Node instance is always available to the load balancer. This process requires manual reconfiguration of the load balancer (taking Node instances in and out of rotation) during a live update of the Sentinet Nodes. Sentinet offers an advanced Node configuration option, called Node Instances Synchronization, that lets an external load balancer exclude and include Sentinet Node instances automatically during live updates. The solution and its process are described below.
The solution assumes two or more Node instances deployed on the same or different servers. These instances are exposed through a software or hardware load balancer that distributes incoming requests among them. Each Node instance hosts an HTTP(S) GET health status endpoint at a predefined, well-known address described in this document. The load balancer can detect the health of an instance by periodically calling the health status endpoint on each Node instance and analyzing its response. If an instance is considered unhealthy, it is excluded from the load-balancer rotation.
A synchronized configuration update proceeds as follows:
Step 1. Once a Node instance detects a configuration change, it retrieves the new configuration and stores it in its local cache (in the server file system).
Step 2. The Node instance makes sure that no other instance is currently re-hosting (updating its runtime configuration and restarting its service hosts). It does so by waiting for, and eventually acquiring, a configuration update lock. The lock is implemented as a file in a shared folder.
Step 3. Having acquired the lock, the Node instance sets its health status to "unhealthy". From this point the health status endpoint returns an error code or stops responding completely.
Step 4. The Node instance waits for a time X that is longer than the load balancer's health check period. This ensures that the instance has been marked "unhealthy" and removed from the load-balancer rotation.
Step 5. The Node instance recycles or restarts the updated service hosts.
Step 6. Once the Node instance is back to normal, it sets its health status to "healthy", which is advertised by its health status endpoint.
Step 7. The Node instance waits for time X again to make sure that it has been marked "healthy" and added back to the load-balancer rotation.
Step 8. The Node instance releases the configuration update lock (deletes the lock file).
Step 9. The next load-balanced Node instance in line acquires the lock and performs the re-hosting procedure described above.
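Steps 2 through 8 can be sketched in a few lines of Python. This is an illustration only and not Sentinet's implementation: the function names, the callbacks set_healthy and rehost, and the polling interval are hypothetical, and the sketch assumes that exclusive file creation is atomic on the shared folder being used. The max_wait argument mirrors the forced lock takeover described for the "maxLockWaitTime" attribute later in this appendix.

```python
import os
import time


def acquire_lock(lock_path, max_wait, poll=1.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Step 2: wait for the lock file; returns True if it was taken forcibly.

    max_wait == 0 means wait indefinitely and never take the lock by force.
    """
    start = clock()
    while True:
        try:
            # O_CREAT | O_EXCL fails if another instance already holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return False
        except FileExistsError:
            if max_wait and clock() - start >= max_wait:
                # Assumption: the previous owner crashed and abandoned the lock,
                # so the existing file is treated as ours from now on.
                return True
            sleep(poll)


def rehost_synchronized(lock_path, set_healthy, rehost,
                        wait_before, wait_after,
                        max_lock_wait=300, sleep=time.sleep):
    """Steps 2-8 for one Node instance (hypothetical callbacks)."""
    forced = acquire_lock(lock_path, max_lock_wait, sleep=sleep)  # step 2
    try:
        set_healthy(False)   # step 3: endpoint starts reporting "unhealthy"
        sleep(wait_before)   # step 4: let the load balancer eject the instance
        rehost()             # step 5: recycle or restart the service hosts
        set_healthy(True)    # step 6: endpoint reports "healthy" again
        sleep(wait_after)    # step 7: let the load balancer re-admit the instance
    finally:
        try:
            os.remove(lock_path)  # step 8: release the lock
        except FileNotFoundError:
            pass
    return forced
```

Because every instance runs the same procedure against the same lock file, at most one instance is out of rotation at any time, which is what step 9 relies on.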
By default, configuration update synchronization is disabled in Sentinet. Because it adds a delay to the configuration update time, it should not be enabled for a single Node instance deployment, or when the load balancer does not support (or is not configured with) health status checks.
To enable the Node Instances Synchronization feature, the following configuration must be present in the Node's web.config file:
- Health status endpoint HTTP handler (enabled by default).
<configuration>
  …
  <system.webServer>
    <handlers>
      …
      <add name="HealthStatusHandler" path="health-status.axd" verb="GET"
           type="Nevatech.Vsb.Runtime.Hosting.HealthStatusHandler, Nevatech.Vsb.Runtime, Version=5.0.0.0, Culture=neutral, PublicKeyToken=f35d905149dcb6c0"
           resourceType="Unspecified" preCondition="integratedMode" />
    </handlers>
  </system.webServer>
  …
</configuration>
The Node instance health check endpoint can be reached at <node base address>/health-status.axd. The endpoint returns the HTTP 200 status code with the body content "OK" to indicate the healthy state, and the HTTP 500 status code with the body content "OUT OF ORDER" to indicate the unhealthy state.
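To see how a monitoring script or a custom load-balancer probe might interpret this endpoint, here is a minimal Python sketch. The function names and the timeout are hypothetical. Treating any response other than HTTP 200 with the body "OK" as unhealthy, and ignoring surrounding whitespace in the body, are assumptions of this sketch rather than documented behavior.

```python
import urllib.error
import urllib.request


def classify(status, body):
    """Healthy only for HTTP 200 with body "OK"; anything else is unhealthy."""
    return status == 200 and body.strip() == "OK"


def probe(base_address, timeout=5.0):
    """GET <base_address>/health-status.axd and return True if healthy."""
    url = base_address.rstrip("/") + "/health-status.axd"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return classify(resp.status, body)
    except urllib.error.HTTPError:
        # An HTTP 500 ("OUT OF ORDER") response is raised as HTTPError.
        return False
    except (urllib.error.URLError, OSError):
        # No response at all (connection refused, timeout) also counts
        # as unhealthy in this sketch.
        return False
```

A call such as probe("https://node1.example.com/node") uses a placeholder address; substitute the base address of an actual Node instance.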
- Node instance synchronization configuration (disabled by default).
<configuration>
  …
  <nevatech.vsb.runtime>
    …
    <instanceSynchronization enabled="true" maxLockWaitTime="300" waitTimeBeforeUpdate="60"
                             waitTimeAfterUpdate="60" syncFilePath="" />
    …
  </nevatech.vsb.runtime>
</configuration>
The "instanceSynchronization" element exposes the following attributes to fine-tune the synchronization behavior for a custom environment:
"enabled" (true/false) -- turns the feature on or off. If the feature is turned off, configuration updates are performed immediately without any synchronization (legacy behavior).
"maxLockWaitTime" (seconds) -- the maximum time to wait for the synchronization lock to be acquired (step 2). If the lock cannot be acquired gracefully within the specified time, it is taken forcibly. If the attribute is set to 0, the lock is never taken forcibly. This parameter prevents an instance from hanging when a configuration lock has been abandoned, for example because of a system crash. Default is 300 (5 minutes).
"waitTimeBeforeUpdate" (seconds) -- the time between setting the instance status to "unhealthy" and applying the configuration updates (step 4). This value must be equal to or greater than the load balancer's health check interval. Default is 60 (1 minute).
"waitTimeAfterUpdate" (seconds) -- the time between setting the instance status to "healthy" and releasing the synchronization lock (step 7). This value must be equal to or greater than the load balancer's health check interval. Default is 60 (1 minute).
"syncFilePath" -- the path to a folder or file share where the synchronization lock file is created. The Node's IIS Application Pool process must have read/write permissions to this folder or file share. If this attribute is empty or not specified (not recommended), the lock file is created in the Node's configuration cache folder (default).
"syncFileName" -- a custom name for the lock file. If this attribute is empty or not specified, a default name derived from the Node Key is used.
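The wait-time attributes interact with the load balancer's settings and with the number of instances, so a quick sanity check before enabling the feature can be useful. The Python sketch below is illustrative only. The function names are hypothetical, and the failures_to_eject and successes_to_restore thresholds are assumptions that model load balancers requiring several consecutive probe results before changing an instance's state; with both left at 1 the check reduces to the rule stated above. The rollout estimate ignores lock polling latency and assumes a fixed, hypothetical re-hosting time per instance.

```python
def check_wait_times(wait_before, wait_after, lb_interval,
                     failures_to_eject=1, successes_to_restore=1):
    """Return a list of problems with the configured wait times (seconds)."""
    problems = []
    if wait_before < lb_interval * failures_to_eject:
        problems.append("waitTimeBeforeUpdate is too short for the load "
                        "balancer to eject the instance")
    if wait_after < lb_interval * successes_to_restore:
        problems.append("waitTimeAfterUpdate is too short for the load "
                        "balancer to re-admit the instance")
    return problems


def worst_case_rollout(instances, wait_before, wait_after, rehost_time):
    """Seconds until the last instance finishes when instances update in turn."""
    return instances * (wait_before + rehost_time + wait_after)
```

For example, with the default wait times of 60 seconds, a hypothetical re-hosting time of 10 seconds, and three instances, worst_case_rollout(3, 60, 60, 10) gives 390 seconds before the last instance is back in rotation with the new configuration.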
Note that instance synchronization works with both Node configuration Recycling modes: Service Isolation and Node Recycling (see the Sentinet User Guide for more details on Recycling modes). For performance reasons, it is recommended to set the Node Recycling mode to Service Isolation when the Node Instances Synchronization feature is turned on, because there is no benefit in restarting the IIS Application Pool in this scenario.