Clustering FiFo
Application Types
FiFo is built to maintain the highest possible uptime. To that end, every component falls into one of two categories:
Stateless
Maintaining availability in stateless applications is simple; it's 'the cloud way'. You can fire up multiple instances, and since they don't need to share state, a failing instance can simply be replaced by a new one or discarded altogether.
The following components are stateless:
To maintain availability, you will have to spawn at least two of each and put them behind a load balancer. In the current version this task has to be performed by the user.
Cluster/Distribution
Some components can't be stateless since they need to maintain some kind of data. To achieve a high level of availability these components run in a clustered/distributed mode. FiFo uses riak core as a distribution layer. This means that it follows an eventually consistent model and utilises all nodes that are present in the cluster, which allows FiFo to survive the failure of individual nodes as long as a minimum number of nodes is present.
The following components are based on riak core:
Because these services are based on riak core, it is recommended to run at least five instances of each service on different hardware machines to guarantee continual uptime. This cluster is called a ring in riak terms.
For those interested, this recommendation goes back to the Dynamo paper.
Chunter is a special case since it is directly tied to a hardware node which makes high availability irrelevant.
Sniffle / Snarl / Howl
Changing the cluster
If the cluster is set up before the system goes into production, there is not much to worry about. However, there are situations where it is necessary to extend or shrink the cluster during production. Nothing prevents this in general, but it requires the user to perform an additional step.
To prevent interruptions it is necessary to disable mDNS on the nodes that you want to add or remove. Otherwise other services will think these nodes are valid parts of the cluster and therefore try to access them. To disable mDNS, the configuration option mdns.server = disabled needs to be added to the Sniffle Configuration File, Snarl Configuration File or Howl Configuration File, depending on the service being added or removed. This should be done before the service is first started (when adding a node) or before it leaves the cluster (when removing one).
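As a minimal sketch of that configuration change, the helper below sets mdns.server = disabled in a configuration file without duplicating the line on repeated runs. The function name disable_mdns is hypothetical, and the path of the configuration file is deliberately left to the caller, since it depends on the service and the installation.

```shell
#!/bin/sh
# Sketch: ensure "mdns.server = disabled" is set in a service configuration file.
# disable_mdns is a hypothetical helper; the caller supplies the config file path.

disable_mdns() {
    conf="$1"
    if [ ! -f "$conf" ]; then
        echo "config file not found: $conf" >&2
        return 1
    fi
    tmp="${conf}.tmp.$$"
    # Keep every line except an existing, uncommented mdns.server setting.
    grep -v '^[[:space:]]*mdns\.server[[:space:]]*=' "$conf" > "$tmp" || true
    echo 'mdns.server = disabled' >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Calling disable_mdns with the path of the Sniffle, Snarl or Howl configuration file before the service is started leaves exactly one mdns.server line in that file.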
In the following section the existing server will be named sniffle@10.0.0.1 and the new server will be named sniffle@10.0.0.2. If multiple servers exist it does not matter which one is picked. Also keep in mind that for working with a Howl or Snarl cluster the section before the @ needs to be replaced with howl or snarl respectively.
New Cluster Node Preparation
Prepare a new FiFo Node as described in the Installation section.
The first FiFo Node or existing FiFo Cluster will likely already contain an admin account, so the last step of the Installation section is not needed. Creating additional admin accounts will not cause the join operation to fail. Redundant accounts can be cleaned up within the UI or using the fifoadm command.
Adding a server
To add a server the following steps need to be taken (sniffle-admin is used as an example):
- On the existing server, check the server status with sniffle-admin ring-status and sniffle-admin member-status.
- On each new server, join the existing server with the command sniffle-admin cluster join sniffle@10.0.0.1.
- On the existing server, check the planned operations by executing sniffle-admin cluster plan.
- When all new servers are in the joining stage, execute sniffle-admin cluster commit on the existing server.
- On the existing server, re-check the server status with the ring-status and member-status commands.
Exact descriptions of all commands can be found in the Sniffle Administration Guide, Snarl Administration Guide and Howl Administration Guide.
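The join steps above can be collected into a few small shell functions, shown here as a sketch rather than an official tool. The function names and the SERVICE and DRY_RUN variables are hypothetical. The sketch also assumes that the Snarl and Howl admin commands follow the same naming as sniffle-admin (snarl-admin, howl-admin). Because the steps run on different machines, each function notes where it is meant to be called.

```shell
#!/bin/sh
# Sketch of the join sequence. Set DRY_RUN=1 to print commands instead of running them.
# SERVICE selects sniffle (default), snarl or howl; the snarl-admin and howl-admin
# names are an assumption based on sniffle-admin.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

admin() {
    run "${SERVICE:-sniffle}-admin" "$@"
}

# Existing server: inspect the cluster before and after the change.
cluster_status() {
    admin ring-status
    admin member-status
}

# Each new server: $1 is the existing node, e.g. sniffle@10.0.0.1.
join_cluster() {
    admin cluster join "$1"
}

# Existing server: show the planned operations.
plan_change() {
    admin cluster plan
}

# Existing server: apply the plan once all new servers are in the joining stage.
commit_change() {
    admin cluster commit
}
```

With DRY_RUN=1, calling join_cluster sniffle@10.0.0.1 only prints the command it would run, which makes it easy to review the sequence before touching a production cluster. Keeping plan_change and commit_change separate leaves room to read the plan before committing it.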
Removing a server
Removing a server is as simple as adding one. To ensure it is not picked up again after it is removed, the line mdns.server = disabled should be added to the configuration file of the server you want to remove.
- On the existing server, check the server status with sniffle-admin ring-status and sniffle-admin member-status.
- On the leaving server, run sniffle-admin cluster leave.
- On the existing server, re-check the server status with the ring-status and member-status commands.
- Once the server is removed, it is restarted and can be shut down using the svcadm disable sniffle command.
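The removal steps can be sketched the same way. The function names and the SERVICE and DRY_RUN variables below are hypothetical, and using svcadm disable with snarl or howl as the service name is an assumption based on the sniffle example. The node_listed helper additionally assumes that member-status prints node names verbatim, so it can be used to check whether the leaving node is still part of the ring.

```shell
#!/bin/sh
# Sketch of the leave sequence. Set DRY_RUN=1 to print commands instead of running them.
# SERVICE selects sniffle (default), snarl or howl (the latter two are assumptions).

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Leaving server: ask the node to leave the cluster.
leave_cluster() {
    run "${SERVICE:-sniffle}-admin" cluster leave
}

# Leaving server: shut the service down once it has been removed.
stop_service() {
    run svcadm disable "${SERVICE:-sniffle}"
}

# Succeeds if node name $1 appears as a whole word in member-status output on stdin.
# Assumption: member-status lists node names such as sniffle@10.0.0.2 verbatim.
node_listed() {
    grep -F -w -q -- "$1"
}
```

On the existing server, a check such as sniffle-admin member-status | node_listed sniffle@10.0.0.2 can then be repeated until it fails, at which point the leaving server no longer appears in the cluster and stop_service can be run on it.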