EnterMedia Clustering Setup

For high availability clusters we recommend running at least three nodes to avoid a single point of failure or a split-brain situation. With three nodes, if one node goes down the cluster will still have at least one master and one slave node up and running and a correct percentage of shards allocated.

The recommended EnterMedia architecture includes two full EnterMedia nodes running on a single host, sharing a data volume or storage resource, plus one Elastic-only node on a remote host that also requires access to the storage volume. For high availability we also recommend replicating this architecture in a secondary data center, using EnterMedia’s Pulling Feature to keep both clusters in sync.

EnterMedia Nodes

Set up and mount a data storage resource on the host machine. We recommend linking the mounted resource into the EnterMedia folder structure: /media/emsites/entermedianode/data.
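As an example, a minimal sketch of mounting a storage device and linking it into the EnterMedia folder structure; the device /dev/sdb1 and mount point /mnt/emdata are placeholders, adjust them to your storage resource:

$ sudo mkdir -p /mnt/emdata
$ sudo mount /dev/sdb1 /mnt/emdata
$ sudo mkdir -p /mnt/emdata/data /media/emsites/entermedianode
$ sudo ln -s /mnt/emdata/data /media/emsites/entermedianode/data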

Get a copy of the EnterMedia Deploy script:

curl -o entermedia-docker.sh -jL docker.entermediadb.org
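Make the script executable so it can be run in the following steps:

$ chmod +x entermedia-docker.sh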

Uncomment the port configuration parameters in the Docker run command:

docker run -t -d \
...
-p 93$NODENUMBER:9300 \ 
-p 92$NODENUMBER:9200 \ 
...

This adds unique, custom transport and publish ports to the ElasticSearch configuration. The default ElasticSearch communication ports are 9200 and 9300; a node built with node id 10 through this script will be configured to publish and transport on ports 9210 and 9310.

*You need to keep your node numbers below 100 so the resulting custom ElasticSearch ports stay four digits long.

Install 2 EnterMedia nodes sharing the same instance name but with different node ids:

$ sudo ./entermedia-docker.sh entermedianode 10
$ sudo ./entermedia-docker.sh entermedianode 20

Configure node.xml

Stop both nodes (/media/emsites/entermedianode/XX/stop.sh) and edit the node.xml file (/media/emsites/entermedianode/XX/tomcat/config/node.xml). Here you need to specify the other node's address (host and transport port) and the local node's publish host and port. Repeat the changes in the node.xml file of both nodes; a sketch of the second node's configuration follows the property list below.

<node id="entermedianode10" name="Cluster Node 01">
 <property id="cluster.name">unique-cluster</property>
 <!--ELASTIC PATHS CONFIG--> 
 <property id="path.data">${webroot}/WEB-INF/elastic/${nodeid}/data</property>
 <property id="path.work">${webroot}/WEB-INF/elastic/${nodeid}/work</property>
 <!--CLUSTER CONFIG-->
 <property id="network.bind_host">0.0.0.0</property>
 <property id="discovery.zen.ping.unicast.hosts">192.168.100.100:9320</property>
 <property id="network.publish_host">192.168.100.100</property>
 <property id="transport.publish_port">9310</property>
</node>
  • network.bind_host – Network interface(s) the node should bind to
  • network.publish_host – Single interface that the node advertises to other nodes in the cluster
  • discovery.zen.ping.unicast.hosts – Comma-separated list of the other cluster nodes with their custom transport ports
  • transport.publish_port – Custom transport publish port
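
For reference, a sketch of the complementary configuration for the second node (entermedianode20), assuming both nodes run on the same host 192.168.100.100 as in the example above:

<node id="entermedianode20" name="Cluster Node 02">
 <property id="cluster.name">unique-cluster</property>
 <!--ELASTIC PATHS CONFIG-->
 <property id="path.data">${webroot}/WEB-INF/elastic/${nodeid}/data</property>
 <property id="path.work">${webroot}/WEB-INF/elastic/${nodeid}/work</property>
 <!--CLUSTER CONFIG-->
 <property id="network.bind_host">0.0.0.0</property>
 <property id="discovery.zen.ping.unicast.hosts">192.168.100.100:9310</property>
 <property id="network.publish_host">192.168.100.100</property>
 <property id="transport.publish_port">9320</property>
</node>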

Firewall

Download our custom iptables rules config file:

$ cd && curl -o firewall-iptables.conf https://raw.githubusercontent.com/entermedia-community/entermediadb-docker/master/scripts/firewall-iptables.conf

Customize the allowed external IP(s) or IP range of the other nodes to allow communication on the standard ElasticSearch ports (9200/9300):

-A FILTERS -s 192.168.100.0/24 -j ACCEPT

*Please verify that the firewall rules are compatible with any other services running on your host machine*

Once Docker starts you need to apply those iptables rules with the following command:

$ sudo iptables-restore -n firewall-iptables.conf
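
You can confirm the rules were loaded by listing the FILTERS chain defined in the downloaded rules file:

$ sudo iptables -L FILTERS -n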

Every time the Docker service starts it overwrites the iptables rules, so you might need to automate running the above command on each Docker service restart.
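One way to automate this on a systemd-based host is a small oneshot unit that re-applies the rules whenever Docker starts; the unit name and the rules file path are examples, adjust them to your setup:

# /etc/systemd/system/docker-iptables.service
[Unit]
Description=Re-apply custom iptables rules after Docker starts
After=docker.service
BindsTo=docker.service

[Service]
Type=oneshot
ExecStart=/usr/sbin/iptables-restore -n /root/firewall-iptables.conf

[Install]
WantedBy=docker.service

Enable it once with:

$ sudo systemctl daemon-reload && sudo systemctl enable docker-iptables.service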

Start the Cluster

Start each node individually:

$ /media/emsites/entermedianode/10/start.sh
$ /media/emsites/entermedianode/20/start.sh

You can verify the startup sequence in the logs (/media/emsites/entermedianode/10/logs.sh) or run the status tools in the next section. At this point you will also need to set up NGINX to make your instances publicly accessible in the browser.
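A minimal NGINX reverse proxy sketch for one node; the server name and the upstream port 8010 are assumptions, use the Tomcat port your deploy script actually exposed for node 10:

server {
    listen 80;
    server_name node10.example.com;

    location / {
        # Tomcat port exposed for node 10 (example value)
        proxy_pass http://127.0.0.1:8010;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}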

Cluster Status Tools

We provide a script with some basic ElasticSearch API endpoints. Run it to get cluster information. Find the health.sh script in your scripts area (/media/emsites/entermedianode/10/health.sh).

$ ./health.sh [nodes | allocation | shards]
  • nodes  – Provides full information about all the nodes in the cluster.
  • allocation – Provides a snapshot of how many shards are allocated to each data node and how much disk space they are using.
  • shards – Detailed view of what nodes contain which shards.

The critical values to monitor in the default health.sh output are status, which should always be green, and active_shards_percent_as_number, which should be 100.

{
  "cluster_name" : "unique-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 50,
  "active_shards" : 100,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
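
If you prefer to query ElasticSearch directly, the same information is available from its HTTP API on the node's custom publish port (9210 for node 10 in this example):

$ curl -s "http://localhost:9210/_cluster/health?pretty"
$ curl -s "http://localhost:9210/_cat/nodes?v"
$ curl -s "http://localhost:9210/_cat/allocation?v"
$ curl -s "http://localhost:9210/_cat/shards?v"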

Elastic-only Node (Data Node)

Now follow the instructions to set up an Elastic-only Node for your cluster.