High Availability (HA) Configuration

OpenSpecimen can be deployed in a clustered setup to provide high availability and high-throughput processing of requests. Typically, OpenSpecimen nodes are deployed behind a load balancer or proxy that delegates user requests to an appropriate node based on load, availability, etc.

Configuring an OpenSpecimen cluster involves letting each node know about the other nodes in the cluster. This is required so that knowledge local to one node, such as updates to cached metadata, is propagated to all the cluster nodes. As a result, all the nodes have the same view of the metadata and data, and behave identically and deterministically. Consequently, requests can be routed to any node using any strategy that fits the user traffic pattern, which helps in achieving scalability.

Naming Nodes

As a first step, give every node in the cluster a unique name, which identifies the node within the cluster. The name should not contain any whitespace characters. Preferably, assign the node name only once and do not change it thereafter.

Edit the openspecimen.properties file located in the $TOMCAT_HOME/conf directory.
Add a node.name property as below:
# VM1 - /usr/local/openspecimen/tomcat/conf/openspecimen.properties
node.name=lion

# VM2 - /usr/local/openspecimen/tomcat/conf/openspecimen.properties
node.name=panther
Restart OpenSpecimen.
Ensure the log files are created using the os.{node.name}.log pattern, e.g. os.lion.log, os.panther.log.
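
The naming rules above can be checked before restarting a node. The following is a minimal sketch, not part of OpenSpecimen: the function names read_node_name and expected_log_file are hypothetical, and the parser is a simplified reading of the properties format that handles only key=value lines and comments.

```python
import re


def read_node_name(properties_text):
    """Return the value of node.name from properties-style text, or None."""
    value = None
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, raw = line.partition("=")
        if sep and key.strip() == "node.name":
            value = raw.strip()  # if the key is repeated, the last value is kept
    return value


def expected_log_file(node_name):
    """Validate the node name and return the log file name it should produce."""
    if not node_name:
        raise ValueError("node.name is missing or empty")
    if re.search(r"\s", node_name):
        raise ValueError("node.name must not contain whitespace: %r" % node_name)
    return "os.%s.log" % node_name


sample = "# VM1\nnode.name=lion\n"
print(expected_log_file(read_node_name(sample)))  # os.lion.log
```

Running the same check against each node's properties file and comparing the results is one way to confirm that the names are unique across the cluster.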

Data Directory

All the cluster nodes should share a single data directory, made available to every node through a file sharing mechanism such as Samba or NFS. All the nodes should have read/write access to the data directory.
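
Read/write access to the shared directory can be verified from each node before OpenSpecimen is started. The sketch below is an illustration only: check_data_dir is a hypothetical helper, and the probe file prefix is an arbitrary choice.

```python
import os
import tempfile


def check_data_dir(path):
    """Return True if a file can be created, read back and deleted under path."""
    if not os.path.isdir(path):
        return False
    try:
        # A uniquely named probe file avoids collisions when several
        # nodes run the check against the same shared directory.
        fd, probe = tempfile.mkstemp(prefix=".os-probe-", dir=path)
        try:
            with os.fdopen(fd, "w") as f:
                f.write("probe")
            with open(probe) as f:
                ok = f.read() == "probe"
        finally:
            os.remove(probe)
        return ok
    except OSError:
        return False
```

The check should be run on every node with the path at which that node mounts the shared directory, since a mount can be writable on one node and read-only on another.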

Cluster Setup

Navigate to Home → Settings → Search for Cluster
Upload a JSON file like the one below:
{
  "notifTimeout": 60,
  "notifErrorRcpts": ["john.doe@krishagni.com", "jane.doe@krishagni.com"],
  "secret": "TopSecret!@3",
  "nodes": [
    {
      "name": "lion",
      "url": "http://10.0.1.1:8080/openspecimen/"
    },
    {
      "name": "panther",
      "url": "http://10.0.1.2:8080/openspecimen/"
    }
  ]
}

Attribute	Description
nodes	The list of OpenSpecimen nodes in the cluster. The name attribute identifies the node (the same name as specified in openspecimen.properties). The url attribute specifies the HTTP URL of the node's Tomcat, used for sending cluster-related events/notifications.
secret	A secret known only to the OpenSpecimen nodes. Used for trusted communication between the nodes in the cluster.
notifTimeout	Maximum time, in seconds, that a node waits to receive acknowledgements from the other nodes in the cluster for an event it has broadcast.
notifErrorRcpts	List of email IDs (a JSON array) to which the cluster error notifications are sent.
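
A cluster file can be sanity-checked before it is uploaded. The sketch below is not an OpenSpecimen tool: validate_cluster_config is a hypothetical helper, and treating all four attributes as required is an assumption drawn from the sample file rather than a documented rule.

```python
import json
from urllib.parse import urlparse

# Assumption: all four attributes from the sample file are treated as required.
REQUIRED_KEYS = ("notifTimeout", "notifErrorRcpts", "secret", "nodes")


def validate_cluster_config(text):
    """Return a list of problems found in the cluster JSON (empty if none)."""
    try:
        cfg = json.loads(text)
    except ValueError as exc:
        return ["not valid JSON: %s" % exc]
    if not isinstance(cfg, dict):
        return ["top-level value must be a JSON object"]

    problems = ["missing attribute: %s" % k for k in REQUIRED_KEYS if k not in cfg]

    timeout = cfg.get("notifTimeout", 1)
    if isinstance(timeout, bool) or not isinstance(timeout, (int, float)) or timeout <= 0:
        problems.append("notifTimeout must be a positive number of seconds")

    if not isinstance(cfg.get("notifErrorRcpts", []), list):
        problems.append("notifErrorRcpts must be a JSON array of email IDs")

    nodes = cfg.get("nodes", [])
    if not isinstance(nodes, list):
        problems.append("nodes must be a JSON array")
        nodes = []

    seen = set()
    for node in nodes:
        if not isinstance(node, dict):
            problems.append("each node must be a JSON object")
            continue
        name, url = node.get("name"), node.get("url")
        # Node names must be non-empty, whitespace-free and unique.
        if not isinstance(name, str) or not name or any(c.isspace() for c in name):
            problems.append("invalid node name: %r" % (name,))
        elif name in seen:
            problems.append("duplicate node name: %s" % name)
        else:
            seen.add(name)
        parsed = urlparse(url) if isinstance(url, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("invalid url for node %r" % (name,))
    return problems
```

An empty list means only that none of these checks failed. It does not confirm that the node names match the node.name values in openspecimen.properties or that the URLs are reachable from the other nodes.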

Startup Sequence

The preferred way to start up the nodes in a cluster is to start one node at a time: ensure the node is up and functional, and then move on to the next node in the cluster. Do not attempt to start all the nodes of the cluster at the same time.
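
The one-node-at-a-time sequence can be scripted. The following is a minimal sketch under stated assumptions: start_sequentially is a hypothetical helper, and the start and is_healthy callables are placeholders that the operator supplies (for example, a command that starts Tomcat and a check that the application responds). The default timeout and polling interval are arbitrary.

```python
import time


def start_sequentially(nodes, start, is_healthy, timeout=300.0, poll=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Start nodes one by one, waiting for each to be healthy before the next."""
    started = []
    for node in nodes:
        start(node)
        deadline = clock() + timeout
        # Poll until the node reports healthy; give up after the timeout
        # so that the next node is never started behind a broken one.
        while not is_healthy(node):
            if clock() >= deadline:
                raise TimeoutError(
                    "node %s not healthy after %s seconds; started so far: %s"
                    % (node, timeout, started))
            sleep(poll)
        started.append(node)
    return started
```

A call such as start_sequentially(["lion", "panther"], start, is_healthy) does not touch panther until lion is confirmed up, and stops with an error if lion never becomes healthy.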

Upgrading

A small amount of downtime is required for a smooth upgrade of OpenSpecimen. When upgrading, first bring down all the nodes. Then upgrade one node at a time: ensure the node is up and functional, and then move on to upgrade the next node in the cluster. Do not attempt to run multiple versions of OpenSpecimen against the same database schema; the behaviour in that case is unspecified.
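
The upgrade order described above can be sketched as a small driver. This is an illustration, not an OpenSpecimen utility: upgrade_cluster is a hypothetical helper, and stop, upgrade, start and wait_until_healthy are operator-supplied placeholders for the actual commands used on each node.

```python
def upgrade_cluster(nodes, stop, upgrade, start, wait_until_healthy):
    """Stop every node, then upgrade and restart the nodes one at a time."""
    # Phase 1: bring all nodes down first, so that no node of the old
    # version runs against the schema once the first upgrade begins.
    for node in nodes:
        stop(node)

    # Phase 2: upgrade, start and verify each node before the next one.
    upgraded = []
    for node in nodes:
        upgrade(node)
        start(node)
        if not wait_until_healthy(node):
            raise RuntimeError(
                "node %s did not come up after upgrade; upgraded so far: %s"
                % (node, upgraded))
        upgraded.append(node)
    return upgraded
```

If a node fails to come up, the driver stops, leaving the remaining nodes down and on the old version rather than starting them against the upgraded schema.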