How to set up a highly available HAProxy with Keepalived

Summary

This blog post explains how to set up HAProxy and Keepalived for use with, for example, the MariaDB Galera Cluster setup described here.

The combination of HAProxy and Keepalived will make the MariaDB Galera cluster more resilient and highly available by adding load balancing and transparent failover when a MariaDB cluster node goes down. HAProxy also allows you to separate read and write actions to the MariaDB Galera cluster, which is very useful if you want to load balance read actions across all MariaDB nodes but want to write to only one MariaDB node.

Since having only one of anything is a Single Point of Failure (SPoF), I'll be using two HAProxy nodes in a high-availability setup. Should the primary HAProxy node fail, Keepalived will initiate an automatic and transparent failover to the secondary HAProxy node.

Requirements

This tutorial requires the MariaDB Galera cluster setup mentioned above. You will also need 2 nodes (VMs or bare metal) called haproxy1 and haproxy2 each with one IP address. In addition you will also need a third IP address that will be used as the floating Virtual IP (VIP) address between the two HAProxy nodes. The HAProxy nodes don’t need a lot of power, memory or storage. One CPU with 512MB to 1GB memory and an 8GB disk are fine.

Let’s get started.

Install HAProxy and Keepalived on both HAProxy nodes

On haproxy1, install HAProxy and Keepalived:
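Both packages are available from the standard CentOS repositories, so this should do it:

    yum install -y haproxy keepalived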

Now make sure both services start at boot:
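On CentOS 6 that is done with chkconfig:

    chkconfig haproxy on
    chkconfig keepalived on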

Check if both services are properly activated:
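You can verify the runlevel settings with:

    chkconfig --list haproxy
    chkconfig --list keepalived

Both services should be listed as 'on' for runlevels 2 through 5.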

That looks good, so on to the next node.

On haproxy2, do the same: install HAProxy and Keepalived, make sure both services start at boot, and check that they are properly activated, exactly as you did on haproxy1.

That looks good, so on to the next step.

Allow non-local Virtual IPs on all HAProxy nodes

Enable binding to non-local Virtual IPs on all nodes:
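The usual way to do this is with the net.ipv4.ip_nonlocal_bind sysctl, so that HAProxy can bind to the VIP even when Keepalived has not (yet) assigned it to the node:

    echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf
    sysctl -p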

Add a HAProxy user to the MariaDB database

Make sure a HAProxy user exists on all MariaDB nodes so HAProxy can access all the MariaDB nodes to check if they are still up. I'm using the user 'haproxy' and the IP addresses of the Ethernet interfaces on the two HAProxy nodes (not the VIP address):
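The HAProxy node IP addresses are not listed in this post, so treat the addresses below as placeholders. The check user needs no password or privileges; it only has to be able to connect:

    mysql -u root -p
    CREATE USER 'haproxy'@'<haproxy1-ip-address>';
    CREATE USER 'haproxy'@'<haproxy2-ip-address>';
    quit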

Since this information is automatically replicated to db2 and db3 we only have to do this on db1. You can check if the information was replicated with:
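For example, on db2 or db3:

    mysql -u root -p -e "SELECT User, Host FROM mysql.user WHERE User='haproxy';"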

That looks good. On to the next step.

Set up rsyslog logging for HAProxy

By default HAProxy does not do any logging, which obviously needs fixing. To enable HAProxy to log to rsyslog we need to add an rsyslog config file on both HAProxy nodes.

On node haproxy1:
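A minimal snippet, assuming HAProxy logs to 127.0.0.1 on the local2 facility (which is what the HAProxy configuration later in this post uses). Create /etc/rsyslog.d/haproxy.conf with the following content:

    # Accept HAProxy log messages on 127.0.0.1:514/UDP and write them to a file
    $ModLoad imudp
    $UDPServerAddress 127.0.0.1
    $UDPServerRun 514
    local2.* /var/log/haproxy.log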

Make sure the SELinux label is ok:
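restorecon will do that:

    restorecon -v /etc/rsyslog.d/haproxy.conf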

And restart the rsyslog service:
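On CentOS 6:

    service rsyslog restart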

On node haproxy2, create the same rsyslog config file, fix the SELinux label and restart the rsyslog service, exactly as you did on haproxy1.

Add HAProxy config

HAProxy is extremely flexible and you can make it do a ton of cool things. Check out the HAProxy website for a taste of what it's capable of. The goal in this tutorial is to put the two HAProxy nodes in front of the MariaDB cluster nodes to load balance and, when a MariaDB node fails, to transparently redirect DB queries to the remaining operational MariaDB nodes.

On node haproxy1 first save the default configuration:
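For example (the .orig name is just a convention):

    cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig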

On node haproxy1 create the HAProxy configuration file:
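The original file is not reproduced here, so what follows is a sketch based on the description below. The bind ports (3306 for writes and 3307 for reads on the VIP 10.0.0.25), the timeouts and the global tuning values are my assumptions; the two listen sections, the mysql-check with the 'haproxy' user and the backup servers for writes follow the article. Create /etc/haproxy/haproxy.cfg along these lines:

    global
        log 127.0.0.1 local2
        chroot /var/lib/haproxy
        pidfile /var/run/haproxy.pid
        maxconn 1024
        user haproxy
        group haproxy
        daemon

    defaults
        log     global
        mode    tcp
        option  tcplog
        option  dontlognull
        retries 3
        timeout connect 5s
        timeout client  50s
        timeout server  50s

    # Writes: only db1 is active, db2 and db3 are backups
    listen mariadb-galera-writes
        bind 10.0.0.25:3306
        option mysql-check user haproxy
        server db1 10.0.0.9:3306  check
        server db2 10.0.0.11:3306 check backup
        server db3 10.0.0.13:3306 check backup

    # Reads: load balanced across all three nodes, least connections first
    listen mariadb-galera-reads
        bind 10.0.0.25:3307
        balance leastconn
        option mysql-check user haproxy
        server db1 10.0.0.9:3306  check
        server db2 10.0.0.11:3306 check
        server db3 10.0.0.13:3306 check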

Make sure the SELinux label is ok:
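Again with restorecon:

    restorecon -v /etc/haproxy/haproxy.cfg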

Now let’s do the same for the second haproxy node.

On node haproxy2, first save the default configuration, then create the HAProxy configuration file again or simply copy /etc/haproxy/haproxy.cfg over from node haproxy1, as the two files are identical. Afterwards make sure the SELinux label is ok, as before.

Understanding the configuration of a HAProxy node is best done with the HAProxy docs. Look up each setting to get an understanding of what it means.

The two sections called 'listen mariadb-galera-writes' and 'listen mariadb-galera-reads' are the important ones.

The 'listen mariadb-galera-writes' section has a check to see whether the MariaDB nodes are still available and defines three MariaDB nodes, with only one active and two as backup nodes. Should the active MariaDB node fail, HAProxy will automatically ignore it and send write queries to the first backup node (db2 in this case). So this section limits writing to any database on the cluster to a single MariaDB node.

The 'listen mariadb-galera-reads' section also has a check to see whether the MariaDB nodes are still available. Since all three MariaDB nodes are active, all three can receive read queries. Should one MariaDB node fail, HAProxy will automatically ignore it and send queries to the other two MariaDB nodes, where priority is given to the MariaDB node with the least connections. So this section allows reading from all three MariaDB nodes.

Make HAProxy highly available with Keepalived

What is Keepalived? Paraphrased from the Keepalived website: Keepalived provides simple and robust facilities for load balancing and high availability to Linux systems and Linux-based infrastructures. The load balancing framework relies on the well-known and widely used Linux Virtual Server (IPVS) kernel module, providing Layer 4 load balancing, while high availability is achieved by the VRRP protocol.

Add Keepalived config

On node haproxy1 first save the default Keepalived configuration:
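Same idea as with the HAProxy config:

    cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.orig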

On node haproxy1 create the Keepalived configuration file:
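Again, the original file is not reproduced here; the sketch below follows the parameters discussed next. The chk_haproxy script command, the MASTER/BACKUP states and the virtual_router_id value are my assumptions, while the priority of 101 and the weight of 2 match the failover explanation further down. Create /etc/keepalived/keepalived.conf along these lines:

    global_defs {
        notification_email {
            cluster-admin@example.org
        }
        notification_email_from haproxy-noreply@example.org
        smtp_server smtp.example.org
        smtp_connect_timeout 30
        router_id haproxy1
    }

    # Add 2 to the priority while HAProxy is running
    vrrp_script chk_haproxy {
        script "killall -0 haproxy"
        interval 2
        weight 2
    }

    vrrp_instance VI_1 {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 101
        smtp_alert
        authentication {
            auth_type PASS
            auth_pass Pass1234    # placeholder, use your own password
        }
        virtual_ipaddress {
            10.0.0.25
        }
        track_script {
            chk_haproxy
        }
    }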

Important: change the following parameters (if required) for your environment:

notification_email:
Change the ‘cluster-admin@example.org’ email address to the email address where you want the notifications to be sent.

notification_email_from:
Change the 'haproxy-noreply@example.org' email address to your preferred 'From:' value.

smtp_server:
Change smtp.example.org to the DNS name or IP address of the MTA that your haproxy nodes use.

interface:
If the Virtual IP address (VIP) is not attached to eth0 then change it to the interface where it is attached.

auth_pass:
Change the placeholder password to a proper password of 8 characters.

virtual_ipaddress:
Change the VIP from ‘10.0.0.25’ to the Virtual IP address reserved for HAProxy.

Make sure the SELinux label is ok:
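As before:

    restorecon -v /etc/keepalived/keepalived.conf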

Let's set up the Keepalived config on the second haproxy node (haproxy2):

On node haproxy2 first save the default Keepalived configuration, just as you did on haproxy1.

On node haproxy2 create the Keepalived configuration file:
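Assuming the sketch from haproxy1, the global_defs block (apart from router_id, which becomes haproxy2) and the chk_haproxy script stay the same; only the vrrp_instance block changes:

    vrrp_instance VI_1 {
        state BACKUP              # haproxy1 is the MASTER
        interface eth0
        virtual_router_id 51      # must match haproxy1
        priority 100              # one lower than haproxy1
        smtp_alert
        authentication {
            auth_type PASS
            auth_pass Pass1234    # must match haproxy1
        }
        virtual_ipaddress {
            10.0.0.25             # must match haproxy1
        }
        track_script {
            chk_haproxy
        }
    }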

Make sure the SELinux label is ok, again using restorecon.

Important – change the following parameters for your environment (if required) as you previously did for node haproxy1:

– notification_email
– notification_email_from
– smtp_server
– interface
– virtual_router_id (which should be the same on both nodes)
– auth_pass (which should be the same on both nodes)
– virtual_ipaddress (which should be the same on both nodes)

Important – note the differences and similarities between the Keepalived config files on haproxy1 and haproxy2: the priority differs (101 on haproxy1 versus 100 on haproxy2), while virtual_router_id, auth_pass and virtual_ipaddress must be identical on both nodes.

Keepalived Priorities and Weight or How Failover Works

haproxy1 has a base priority of 101 + a weight of 2 when HAProxy is running = 103
haproxy2 has a base priority of 100 + a weight of 2 when HAProxy is running = 102

Since haproxy1's 103 is higher than haproxy2's 102, haproxy2 is not allowed to initiate a failover and grab the VIP.

Here's what happens when HAProxy goes down on node haproxy1:

haproxy1 has a base priority of 101 + 0 because HAProxy is no longer running = 101
haproxy2 has a base priority of 100 + a weight of 2 because HAProxy is running = 102

Since haproxy1's 101 is now lower than haproxy2's 102, haproxy2 is allowed to initiate a failover and grab the VIP.

ProTip: check the priority that the ACTIVE node is broadcasting by sniffing the traffic with the following tcpdump command:
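Something like this should do; VRRP advertisements go to 224.0.0.18 and the verbose output includes the advertised priority:

    tcpdump -i eth0 -nnv host 224.0.0.18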

Firewall Setup: Enable Multicast in IPtables

By default Keepalived uses multicast to send out its VRRP advertisements. You can configure Keepalived to use Unicast too if you prefer that. Here is an example multicast rule to allow multicast packets between the two haproxy nodes:
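For example:

    iptables -I INPUT -i eth0 -d 224.0.0.0/8 -j ACCEPT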

Since Keepalived is using vrrp.mcast.net (224.0.0.18) you could further tighten the rule above with:
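Something like:

    iptables -I INPUT -i eth0 -d 224.0.0.18/32 -j ACCEPT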

As an example here is the first rule applied to the default CentOS 6 firewall:
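Roughly like this, with the multicast rule added to the stock CentOS 6 /etc/sysconfig/iptables:

    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    -A INPUT -p icmp -j ACCEPT
    -A INPUT -i lo -j ACCEPT
    -A INPUT -i eth0 -d 224.0.0.0/8 -j ACCEPT
    -A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
    -A INPUT -j REJECT --reject-with icmp-host-prohibited
    -A FORWARD -j REJECT --reject-with icmp-host-prohibited
    COMMIT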

And if you want to manually add those rules and add a rule for protocol 112 (VRRP), here are the commands that you could use on both nodes:
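For example (protocol 112 is VRRP):

    iptables -I INPUT -i eth0 -d 224.0.0.18/32 -j ACCEPT
    iptables -I INPUT -i eth0 -p 112 -j ACCEPT
    service iptables save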

Change your interface and tighten the source and/or destination networks to your requirements.

Troubleshooting Multicast

Here are some commands which show you the status of various interfaces and multicast groups those interfaces are member of:
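For example:

    ip link show eth0     # look for MULTICAST in the interface flags
    ip maddr show eth0    # multicast groups this interface has joined
    netstat -g            # multicast group memberships per interface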

ProTip: if multicast is disabled (but enabled in the kernel) you can manually enable it with the following command:
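On eth0, for example:

    ip link set dev eth0 multicast on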

How to set up MariaDB Galera Cluster 10.0 on CentOS

Introduction

MariaDB is a drop-in replacement for MySQL. MariaDB Galera Cluster is a synchronous multi-master cluster for MariaDB. It is available on Linux only, and only supports the XtraDB/InnoDB storage engines.

This blog post explains step by step how to setup MariaDB Galera Cluster 10.0 with 3 nodes running on CentOS 6.5 x86_64 resulting in a HA (high-availability) database cluster.

Cluster details

I’m using 3 freshly deployed VMs running a minimal install of CentOS 6.5 x86_64.

Cluster node 1 has hostname db1 and IP address 10.0.0.9
Cluster node 2 has hostname db2 and IP address 10.0.0.11
Cluster node 3 has hostname db3 and IP address 10.0.0.13

Step 1 – set SELinux in permissive mode

It’s currently not possible to have SELinux in enforcing mode because there are no policies (yet) for MariaDB Galera Cluster 10.0. I have filed BZ1117233. Hopefully the SELinux developers will soon add proper policies. For now put SELinux into permissive mode on all nodes:
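For example:

    setenforce 0

To keep SELinux permissive across reboots, also set SELINUX=permissive in /etc/selinux/config, for example with:

    sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config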

Update: SELinux policies for MariaDB Galera 10 will be in RHEL 6.6 and CentOS 6.6.

Step 2 – configure the firewall

MariaDB Galera Cluster on each cluster node must be able to talk to MariaDB Galera Cluster on the other nodes so we need to open some firewall ports. Execute the following commands on all nodes:
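Galera uses port 3306 (MySQL clients), 4567 (Galera replication), 4568 (IST) and 4444 (SST). A simple variant of the rules could look like this:

    iptables -I INPUT -p tcp --dport 3306 -j ACCEPT
    iptables -I INPUT -p tcp --dport 4567 -j ACCEPT
    iptables -I INPUT -p udp --dport 4567 -j ACCEPT
    iptables -I INPUT -p tcp --dport 4568 -j ACCEPT
    iptables -I INPUT -p tcp --dport 4444 -j ACCEPT
    service iptables save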

Please note that these rules allow any host on your local IP network to access the database nodes. For production, make it more secure by specifying the source IP range from which connections to the database nodes are accepted.

Step 3 – install the MariaDB repository

I’m using the MariaDB Galera Cluster RPM packages from mariadb.org. Add the MariaDB RPM repository by executing the following command on all nodes:
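The repository definition for 10.0 on CentOS 6 x86_64 looks roughly like this; create /etc/yum.repos.d/MariaDB.repo with:

    [mariadb]
    name = MariaDB
    baseurl = http://yum.mariadb.org/10.0/centos6-amd64
    gpgkey = https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
    gpgcheck = 1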

Step 4 – install MariaDB Galera Cluster 10.0 software

If you did a CentOS 6 minimal installation then make sure you install the socat package from the EPEL repository before proceeding with installing the MariaDB Galera Cluster 10.0 software. You can find the socat package at http://dl.fedoraproject.org/pub/epel/6/x86_64/repoview/socat.html.
Either install it directly from EPEL with the following command (for x86_64):
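The exact file name changes as the package is updated, so treat the version below as a placeholder:

    yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/socat-<version>.el6.x86_64.rpm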

Or install the EPEL repository and then install it the regular way:
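On CentOS 6 the epel-release package is available from the extras repository, so this should work:

    yum install epel-release
    yum install socat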

On CentOS 7 the socat package is part of the distro and is installed as part of the minimal install so should already be there. If it’s not then just install it with:
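That would simply be:

    yum install socat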

Install the MariaDB Galera Cluster 10.0 software by executing the following command on all nodes:
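With the MariaDB repository in place the packages come straight from yum:

    yum install MariaDB-Galera-server MariaDB-client galera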

Step 5 – setup MariaDB security

The default installation is not very secure so let’s improve that. Start the mysql service (yes the init script in MariaDB 10.0 is still called mysql) on all nodes:
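With the init script:

    service mysql start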

Next we need to run the mysql_secure_installation script so we can improve the security. Execute the following command on all nodes:

For this tutorial I changed the password to ‘dbpass’ and accepted all defaults (so answered yes to all questions).

Step 6 – setup MariaDB Galera Cluster users

Now we need to set up some users that must be able to access the database. The 'sst_user' is the user which a database node will use for authenticating to another database node in the State Snapshot Transfer (SST) phase. Execute the following command on all nodes:
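A sketch of the statements, using 'sstpass' as a placeholder password for sst_user (the same password goes into wsrep_sst_auth in step 7) and the 'dbpass' root password chosen in step 5:

    mysql -u root -p
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'dbpass' WITH GRANT OPTION;
    GRANT USAGE ON *.* TO 'sst_user'@'%' IDENTIFIED BY 'sstpass';
    GRANT ALL PRIVILEGES ON *.* TO 'sst_user'@'%';
    FLUSH PRIVILEGES;
    quit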

Please note that the '%' means that the root or sst_user is allowed to access the database from any host. For production, make it more secure by specifying the hostname(s) or IP addresses from which those users can access the database.

Step 7 – create the MariaDB Galera Cluster config

First stop the mysql services on all nodes:
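With the same init script:

    service mysql stop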

Now let’s create the MariaDB Galera Cluster configuration by executing the following command on all nodes (read the IMPORTANT NOTE after the config about required changes for db2, and db3):
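The original command is not included here, but the relevant settings under the [mariadb-10.0] heading of /etc/my.cnf.d/server.cnf should look roughly like this. The tuning values are my assumptions; the cluster address uses the node IPs listed earlier and the sst_user created in step 6:

    [mariadb-10.0]
    binlog_format=ROW
    default_storage_engine=innodb
    innodb_autoinc_lock_mode=2
    bind-address=0.0.0.0

    # Galera / wsrep settings
    wsrep_provider=/usr/lib64/galera/libgalera_smm.so
    wsrep_cluster_name='galera_cluster'
    wsrep_cluster_address='gcomm://10.0.0.9,10.0.0.11,10.0.0.13'
    wsrep_node_address='10.0.0.9'
    wsrep_node_name='db1'
    wsrep_sst_method=rsync
    wsrep_sst_auth=sst_user:sstpass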

IMPORTANT NOTE: when executing this command on db2 and db3 do not forget to adjust the wsrep_node_address and wsrep_node_name variables.

On db2 they should be: wsrep_node_address='10.0.0.11' and wsrep_node_name='db2'.

On db3 they should be: wsrep_node_address='10.0.0.13' and wsrep_node_name='db3'.

Step 8 – initialize the first cluster node

On node db1 only, start MariaDB with the special '--wsrep-new-cluster' option so the primary node of the cluster is initialized:
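The init script passes extra arguments along to the server, so this works:

    service mysql start --wsrep-new-cluster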

Check if everything went ok by executing the following command on db1 only:
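The wsrep status variables show the cluster state:

    mysql -u root -p -e "SHOW STATUS LIKE 'wsrep%';"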

And the output:

The important information is in the lines showing the cluster state and size, such as wsrep_local_state_comment, wsrep_cluster_size and wsrep_incoming_addresses.

There is also a lot of information in the logfile which is located in /var/lib/mysql and it’s called db1.<FQDN>.err. Here are the last couple of lines after initializing db1:

Step 9 – add the other cluster nodes

Make 100% sure that both nodes db2 and db3 have the correct configuration in /etc/my.cnf.d/server.cnf under the [mariadb-10.0] heading as described in step 7.

With the correct configuration in place, all that is required to make db2 and db3 a member of the cluster is to start them like you would start any regular service. On db2 issue the following command:
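Just a regular service start:

    service mysql start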

On node db2 check /var/lib/mysql/db2.<FQDN>.err and you should see:

And on node db1 check /var/lib/mysql/db1.<FQDN>.err and you should see:

Check what has changed in the cluster status by executing the same wsrep status query on db1 or db2.

You should see that node db2 is now known: the cluster size is '2' and the IP address of node db2 is listed.

Repeat the same step for node db3: on node db3 only, start the mysql service just as you did for db2.

On node db3 check /var/lib/mysql/db3.<FQDN>.err and you should see something like this:

Check what has changed in the cluster status by executing the same query on, for example, db1.

You should see that node db3 is now known: the cluster size is '3' and the IP address of node db3 is listed.

Step 10 – verify replication

The cluster is now up and running. Let’s check if it’s actually working. On db1 create a database ‘clustertest’ by issuing the following command:
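A small sketch; the table and column names here are arbitrary examples:

    mysql -u root -p
    CREATE DATABASE clustertest;
    CREATE TABLE clustertest.names (id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, name VARCHAR(30)) ENGINE=InnoDB;
    INSERT INTO clustertest.names (name) VALUES ('replicate me');
    quit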

Check if the database, table and data exist:
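Using the example names from above:

    mysql -u root -p -e "SHOW DATABASES; SELECT * FROM clustertest.names;"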

That looks good. Now do the same check on node db2 to see if it replicated correctly:

Everything was successfully replicated from db1 to db2. Now do the same check on node db3:

As you can see everything was successfully replicated by node db1 across all other nodes.

Step 11 – starting a cluster that is completely down

If the entire cluster (each node) is down then this is how you start the cluster.

First decide which node you consider the primary node. Normally that would be the node with the database which you consider (the most) correct. Then start that node with the following command. Note that the command will not touch or delete data. It will just start the first node as the primary cluster node.
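As in step 8:

    service mysql start --wsrep-new-cluster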

Once the primary node is successfully started you can start the other nodes in the cluster with the regular command:
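That is simply the normal start command:

    service mysql start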

And it’s probably a good idea to check if the cluster is fully operational by looking on all cluster nodes at the output of this command:
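For example:

    mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"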

Step 12 – MariaDB Galera limitations

MariaDB Galera Cluster software has some known limitations. Make sure to read them.

Severalnines has an excellent explanation of what those limitations mean in practical terms and how they can be dealt with in their “Avoiding Deadlocks in Galera…” article.

A workaround to prevent the rollbacks and deadlocks is to use a single node for high concurrency writes and send updates to only that node while using multiple nodes for reads. This can be realized by adding HAProxy & Keepalived to the setup described in this tutorial and is also available on this blog.