Proxying and Load Balancing OpenSpecimen

This page explains how to set up a software load balancer and reverse proxy for 2 or more OpenSpecimen instances using the Apache web server. A load balancer helps achieve higher throughput, lower response times, and better utilisation of resources. A by-product of setting up a load balancer is increased reliability and availability of OpenSpecimen, because no individual OpenSpecimen instance is a single point of failure (SPOF).

OpenSpecimen is a stateless application, so there is no intrinsic limitation on achieving the desired levels of scalability and availability.

Pre-requisites

  1. Apache2 web server with the following modules installed and enabled:
    1. mod_proxy, for proxying/forwarding requests
    2. mod_proxy_balancer, for implementing load balancing
    3. mod_proxy_ajp, for reverse proxying to Tomcat/backend servers using AJP
    4. mod_status, for server activity monitoring
  2. A minimum of 3 machines: one with the Apache2 web server installed and two with the OpenSpecimen application deployed.
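
On installations where modules are loaded explicitly, the module prerequisites above might translate into LoadModule lines like the following. This is only a sketch: the modules/ paths assume a RHEL-style layout, and the last two modules are listed on the assumption that Apache 2.4 is in use, where the byrequests scheduler and the balancer's shared-memory provider are packaged as separate modules.

```apache
# Assumed module paths (RHEL-style layout); adjust for your distribution
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
LoadModule status_module modules/mod_status.so

# Assumption: Apache 2.4, where lbmethod=byrequests is provided by its own module
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so

# Assumption: Apache 2.4, where mod_proxy_balancer needs a shared-memory provider
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
```

Many distributions already ship these lines in their default configuration, in which case nothing needs to be added.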

Topology

The configuration explained in the next section helps system administrators set up the topology shown in the diagram below.

This setup has one instance of Apache2 load balancing requests between 2 instances of OpenSpecimen on a round-robin basis. Individual requests can be load balanced regardless of where the user's login session was initiated. This is possible because OpenSpecimen's stateless architecture does not store user state information in memory.

Steps

  1. Configure the virtual host. Place the virtual host configuration at the bottom of the apache2 web server's configuration file httpd.conf. This file is usually located in the directory /etc/httpd/conf/ on RHEL-based Linux systems.
  2. Alternatively, place the virtual host configuration in a separate file, e.g. openspecimen-vhost.conf, and include it from the httpd.conf file using the directive below
    1. Include <path-to-dir>/openspecimen-vhost.conf
  3. After completing the configuration, check its syntax and gracefully restart the apache2 web server
    1. sudo apachectl configtest
    2. sudo apachectl graceful

Configuration

#
# Receive requests from all IP addresses on port 80
#
<VirtualHost *:80>
  #
  # Hostname used to access OpenSpecimen services
  #
  ServerName test.openspecimen.org

  #
  # Do not use as forward proxy server
  #	
  ProxyRequests off

  #
  # List OpenSpecimen instances used for load balancing
  # 
  <Proxy balancer://os-cluster>
    #
    # OpenSpecimen instance 1
    # Use AJP for communication between Apache2 and OpenSpecimen app
    # (no trailing slash on the member URL)
    #
    BalancerMember ajp://os1.internal.openspecimen.org:8009/openspecimen

    #
    # OpenSpecimen instance 2
    # Use AJP for communication between Apache2 and OpenSpecimen app
    #
    BalancerMember ajp://os2.internal.openspecimen.org:8009/openspecimen

    #
    # Requests from everyone are accepted
    #
    Require all granted

    #
    # Load balance requests between instances of OpenSpecimen 
    # using round robin mechanism. This means each instance shares
    # load equally. In this example, each instance shares 50% of overall
    # load. 
    #
    ProxySet lbmethod=byrequests
  </Proxy>

  #
  # View load balancing statistics
  #
  <Location /balancer-manager>
    SetHandler balancer-manager

    # Allow access only from internal hosts
    Require host admin.internal.openspecimen.org
  </Location>
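
  #
  # Optional sketch, not part of the original setup: expose the
  # mod_status activity report listed in the pre-requisites.
  # Assumes the same internal admin host as the balancer-manager
  # location; adjust the host name to your environment.
  #
  <Location /server-status>
    SetHandler server-status
    Require host admin.internal.openspecimen.org
  </Location>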

  #
  # Do not proxy the balancer-manager tool as it should be used
  # only from within the internal network
  #
  ProxyPass /balancer-manager !

  #
  # Forward all requests received on context path /openspecimen to
  # appropriate OpenSpecimen instance using load balancer
  #
  ProxyPass /openspecimen balancer://os-cluster
  ProxyPassReverse /openspecimen balancer://os-cluster
</VirtualHost>
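
The configuration above splits requests evenly because both members use the default load factor. If the two machines differ in capacity, the split can be skewed with the loadfactor parameter of BalancerMember. The fragment below is a hedged sketch of that variation, not part of the original setup: the 3:1 weighting is a hypothetical value chosen for illustration, and the host names are simply reused from the example above.

```apache
<Proxy balancer://os-cluster>
  # Hypothetical weighting: os1 receives roughly three requests
  # for every one request sent to os2
  BalancerMember ajp://os1.internal.openspecimen.org:8009/openspecimen loadfactor=3
  BalancerMember ajp://os2.internal.openspecimen.org:8009/openspecimen loadfactor=1

  Require all granted

  # byrequests distributes by request count, weighted by loadfactor
  ProxySet lbmethod=byrequests
</Proxy>
```

With such a change in place, the balancer-manager page should list both members along with their configured load factors, which gives a quick way to confirm that the weights were picked up.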