Clustering FiFo

Application Types

FiFo is built to maintain the highest possible uptime. To that end, every component falls into one of two categories:


Stateless

Maintaining availability in stateless applications is fairly simple and follows 'the cloud way': you can start multiple instances, and since they don't need to share state, a failing instance can simply be replaced by a new one or discarded altogether.

The following components are stateless:

To maintain availability, you have to run at least two instances of each and put them behind a load balancer. In the current version the user has to perform this task.


Cluster/Distribution

Some components can't be stateless since they need to maintain some kind of data. To achieve a high level of availability, these components run in a clustered/distributed mode. FiFo uses riak core as a distribution layer. This means that it follows an eventually consistent model and utilises all nodes that are present in the cluster, which allows FiFo to survive the failure of individual nodes as long as a minimum number of nodes is present.

The following components are based on riak core:

Because these services are based on riak core, it is recommended to run at least five instances of each service on different hardware machines to guarantee continual uptime. In riak terms this cluster is called a ring.

This has to do with the Dynamo paper, for those interested.


Chunter is a special case: it is directly tied to a hardware node, which makes high availability irrelevant.


Sniffle / Snarl / Howl

Changing the cluster

If the cluster is set up before the system goes into production, there is not much to worry about. However, there are situations where the cluster has to be extended or shrunk during production. Generally nothing prevents this, but it requires the user to perform an additional step.

To prevent interruptions, it is necessary to disable mDNS on the nodes that you want to add or remove. Otherwise other services will think these nodes are valid parts of the cluster and will try to access them. To disable mDNS, add the configuration option mdns.server = disabled to the Sniffle Configuration File, Snarl Configuration File or Howl Configuration File, depending on the service being added or removed. This should be done before a new service is first started, or before an existing service leaves the cluster.
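
The configuration change above could be scripted roughly as follows. This is only a sketch: disable_mdns is a hypothetical helper, not part of FiFo, and the path of the configuration file is passed in as an argument because it depends on the installation. It assumes the option is written as a single mdns.server = disabled line, as described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of FiFo): make sure the given configuration
# file contains the line "mdns.server = disabled".
disable_mdns() {
    conf="$1"   # path to the Sniffle, Snarl or Howl configuration file

    # Already disabled: nothing to do.
    if grep -Eq '^[[:space:]]*mdns\.server[[:space:]]*=[[:space:]]*disabled[[:space:]]*$' "$conf"; then
        return 0
    fi

    # Set to some other value: leave the file untouched and ask for a manual edit.
    if grep -Eq '^[[:space:]]*mdns\.server[[:space:]]*=' "$conf"; then
        echo "mdns.server is already set in $conf; please edit it manually" >&2
        return 1
    fi

    # Not present yet: append the option.
    printf '%s\n' 'mdns.server = disabled' >> "$conf"
}
```

Calling the function twice on the same file leaves a single mdns.server line, so it is safe to run it again before a node is started or removed.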

In the following sections the existing server is named sniffle@10.0.0.1 and the new server is named sniffle@10.0.0.2. If multiple servers exist, it does not matter which one is picked as the existing server. Also keep in mind that when working with a Howl or Snarl cluster, the part before the @ needs to be replaced with howl or snarl respectively.


New Cluster Node Preparation

Prepare a new FiFo Node as described in the Installation section.

The first FiFo Node or existing FiFo Cluster will likely already contain an admin account, so the last step of the Installation section is not needed. Creating additional admin accounts will not cause the join operation to fail. Redundant accounts can be cleaned up within the UI or with the fifoadm command.


Adding a server

To add a server the following steps need to be taken (sniffle-admin is used as an example):

  • on the existing server check the server status with sniffle-admin ring-status and sniffle-admin member-status.
  • on all the new servers join the existing server with the command sniffle-admin cluster join sniffle@10.0.0.1.
  • on the existing server check the planned operations by executing sniffle-admin cluster plan.
  • when all new servers are in the joining stage execute sniffle-admin cluster commit on the existing server.
  • on the existing server recheck the server status with the sniffle-admin ring-status and sniffle-admin member-status commands.

Exact descriptions of all commands can be found in the Sniffle Administration Guide, Snarl Administration Guide and Howl Administration Guide.
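
As a quick reference, the join sequence can be printed for any of the three services with a small dry-run script. This is a sketch only: print_join_steps is a hypothetical helper, not part of FiFo, and it assumes the admin tools are named after the service (sniffle-admin, snarl-admin, howl-admin). It only prints the commands and where to run them; nothing is executed.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of FiFo): prints the commands for
# joining new nodes to a cluster, prefixed with the server they belong on.
print_join_steps() {
    service="$1"       # sniffle, snarl or howl
    existing_ip="$2"   # IP of any node that is already part of the cluster
    admin="${service}-admin"            # assumed naming: <service>-admin
    node="${service}@${existing_ip}"    # node name as used in this section

    echo "existing: ${admin} ring-status"
    echo "existing: ${admin} member-status"
    echo "new:      ${admin} cluster join ${node}"
    echo "existing: ${admin} cluster plan"
    echo "existing: ${admin} cluster commit"
    echo "existing: ${admin} ring-status"
    echo "existing: ${admin} member-status"
}

# Example: the Sniffle cluster used throughout this section.
print_join_steps sniffle 10.0.0.1
```

The commit line should only be run once the plan shows every new server in the joining stage.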


Removing a server

Removing a server is as simple as adding one. To ensure it is not picked up again after it is removed, add the line mdns.server = disabled to the configuration file of the server you want to remove.

  • on the existing server check the server status with sniffle-admin ring-status and sniffle-admin member-status.
  • on the leaving server run sniffle-admin cluster leave.
  • on the existing server recheck the server status with the sniffle-admin ring-status and sniffle-admin member-status commands.
  • once the server is removed, it is restarted and can be shut down with the svcadm disable sniffle command.
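
The removal sequence can be summarised the same way. Again this is only a sketch: print_leave_steps is a hypothetical helper, not part of FiFo, and it assumes that the admin tool and the service name passed to svcadm both follow the service name, as they do for Sniffle above. The script prints the commands without running them.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of FiFo): prints the commands for
# removing a node from a cluster, prefixed with the server they belong on.
print_leave_steps() {
    service="$1"               # sniffle, snarl or howl
    admin="${service}-admin"   # assumed naming: <service>-admin

    echo "existing: ${admin} ring-status"
    echo "existing: ${admin} member-status"
    echo "leaving:  ${admin} cluster leave"
    echo "existing: ${admin} ring-status"
    echo "existing: ${admin} member-status"
    # Only after the status output shows that the node has left the cluster.
    echo "leaving:  svcadm disable ${service}"
}

# Example: removing a Sniffle node.
print_leave_steps sniffle
```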