Load Balancing and failover for replicated MySQL

Richard Benson | 09 August 2010 | Web, IT Pros
NB: At present you cannot request or share additional internal IP addresses with Rackspace Cloud, so you will have to use the external addresses.  With large databases this will incur additional bandwidth charges, so be sure to weigh that cost against the benefits provided by high availability.  When Rackspace allows the allocation of additional internal IPs, this article will be updated to reflect that.

Using two related tools and some cheap Rackspace Cloud servers, you can provide a front end for your database that balances load across multiple replicated database servers and automatically fails over if one of your balancers develops a fault.

This article covers setting up heartbeat and pacemaker to move an IP address from one machine to the other on a failure, and then installing HAProxy on both boxes to balance the load between your database servers and cope with either of them failing.

As with all of these guides, the starting point is to create your Rackspace Cloud servers.  Spin up two Ubuntu 10.10 (maverick) instances at the 256MB size for now; you can increase them later if required.  Name your instances appropriately, and note the IP addresses assigned to them by Rackspace.  Once your instances are running, create a new support ticket requesting that the first server be assigned an additional IP address, and allow Rackspace to configure the server for you.  This guide uses the following conventions; replace them with your own data wherever they appear:

node-1 - External: 123.4.5.6 - Internal: 10.0.0.1
node-2 - External: 123.4.5.7 - Internal: 10.0.0.2
additional-ip - External: 123.4.5.8

We will also refer to your Master-Master MySQL nodes later on in the article:

db01 - 10.0.0.3
db02 - 10.0.0.4

Run the following commands on both servers to make sure they are up to date:

apt-get update
apt-get upgrade -y

Installing Heartbeat for High Availability

Due to the way that heartbeat works, you need to remove the new IP address that Rackspace configured for you from the server (we needed Rackspace to configure it so their systems were updated correctly).  Open the network configuration file:

nano /etc/network/interfaces
Find the lines similar to those below, and comment them out by inserting a # at the start of each line:

auto eth0:1
iface eth0:1 inet static
    address 123.4.5.8
    netmask 255.255.255.0
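If you would rather not edit the file by hand, the same change can be sketched as a small shell function. This is only an illustration: the function name comment_out_iface is hypothetical, it assumes the stanza looks like the one above (an "auto" line, an "iface" line and indented options, ended by a blank line or the next stanza), and it prints the result instead of modifying the file in place.

```shell
# Hypothetical helper: print an interfaces-style file with one stanza commented out.
# $1 = file to read, $2 = interface name, e.g. eth0:1
comment_out_iface() {
    awk -v ifc="$2" '
        # any new "auto"/"iface" line or a blank line ends the current stanza
        $1 == "auto" || $1 == "iface" || NF == 0 { in_stanza = 0 }
        # lines naming the requested interface start the stanza to disable
        ($1 == "auto" || $1 == "iface") && $2 == ifc { in_stanza = 1 }
        in_stanza { print "#" $0; next }
        { print }
    ' "$1"
}

# Example (writes to a new file so the original is left untouched):
# comment_out_iface /etc/network/interfaces eth0:1 > /tmp/interfaces.new
```

Review the generated file before copying it over /etc/network/interfaces, since a mistake here can leave the server unreachable.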
Now you can install all of the required packages on both servers:

apt-get install heartbeat pacemaker -y
The first part of the configuration is to set up your authkeys files, which will enable the two heartbeat installs to talk to each other and will make sure that no-one else can join your cluster.

nano /etc/heartbeat/authkeys
The file will be blank.  Add the following lines, replacing SomeSecurePassword with an appropriate password.

auth 1
1 sha1 SomeSecurePassword
Exit nano (CTRL+X) and confirm that you want to save the file.  Now change the permissions on the file, otherwise heartbeat won't trust it.

chmod 600 /etc/heartbeat/authkeys
Do the same on both servers, so that the authkeys files match.
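
If you prefer a generated secret to a hand-typed password, the steps above could be scripted roughly as follows. Treat it as a sketch: write_authkeys is a hypothetical name, and the use of /dev/urandom and od to produce the secret is an assumption of this example, not something heartbeat requires.

```shell
# Hypothetical helper: write a heartbeat authkeys file with a random secret.
# $1 = destination path (use /etc/heartbeat/authkeys on a real node)
write_authkeys() {
    # 20 random bytes rendered as 40 hex characters
    secret=$(od -An -tx1 -N20 /dev/urandom | tr -d ' \n')
    (
        umask 077   # a newly created file is readable by its owner only
        printf 'auth 1\n1 sha1 %s\n' "$secret" > "$1"
    )
    chmod 600 "$1"  # also tighten a file that already existed
}
```

Generate the file once on node-1 and copy that same file to node-2. Running the function separately on each server would produce two different secrets, and the nodes would refuse to talk to each other.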

The next file to configure is the actual configuration for heartbeat. This will contain very little information as we allow pacemaker to handle the real meat of the config.

On the first server, open the file in nano; it will be blank:

nano /etc/heartbeat/ha.cf
Insert the following.  An explanation of each line is available on the Linux HA website; it is worth reviewing these before making this solution live to confirm that they are appropriate for your situation:

autojoin none
logfacility daemon
keepalive 2
deadtime 15
warntime 5
initdead 120
ucast eth1 10.0.0.2
node node-1
node node-2
use_logd yes
crm respawn
It is important that the "node" lines match the hostnames of the servers, which you can check with "uname -n" on each one.  Only one line of this file needs to change for the second server.  Create the file exactly as above, then change the "ucast" line to point at the internal IP of node-1, e.g.:

ucast eth1 10.0.0.1
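
Because the two files differ only in the peer address, one way to avoid copy-and-paste mistakes is to generate them. The sketch below is illustrative only: render_ha_cf and node_listed are hypothetical names, and the timing values are simply the ones used in this article.

```shell
# Hypothetical helper: print ha.cf for one node.
# $1 = internal IP of the *other* node; remaining args = node names
render_ha_cf() {
    peer_ip=$1
    shift
    cat <<EOF
autojoin none
logfacility daemon
keepalive 2
deadtime 15
warntime 5
initdead 120
ucast eth1 $peer_ip
EOF
    for name in "$@"; do
        printf 'node %s\n' "$name"
    done
    printf 'use_logd yes\ncrm respawn\n'
}

# Hypothetical check: is this hostname one of the "node" lines?
# $1 = ha.cf path, $2 = hostname (defaults to the output of uname -n)
node_listed() {
    awk -v host="${2:-$(uname -n)}" \
        '$1 == "node" && $2 == host { found = 1 } END { exit !found }' "$1"
}

# Example for node-1:
# render_ha_cf 10.0.0.2 node-1 node-2 > /etc/heartbeat/ha.cf
# node_listed /etc/heartbeat/ha.cf || echo "hostname is not in ha.cf"
```

On node-2 the only change is the first argument, which becomes 10.0.0.1.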
Once you have created both files on both servers, you can start the heartbeat service with the following command:

/etc/init.d/heartbeat restart
You now have heartbeat running, but you haven't yet told it about the IP address you want to share between the two servers.  That is where pacemaker comes in.

Configure Pacemaker and Share the IP

Start by opening another ticket with Rackspace asking them to share the IP between both servers.  Again, have them configure it for you on node-2, and again take out the reference to it in /etc/network/interfaces.  You can continue your configuration whilst they do this; the IP will switch over, but it won't be accessible on node-2 until they have completed their network configuration.

We will be using pacemaker's command-line tool to do all the required configuration; it's still called "crm" for historic reasons.  Open up the pacemaker configuration tool:

crm configure
You will be setting up what pacemaker refers to as a "resource".  A resource can be almost any service or application as well as an individual IP.  You can even have heartbeat start services on the failover device when needed, but that is beyond the scope of this article; here you will just be configuring a single IP to be shared.

Now enter your configuration in the crm shell, remembering to replace the IPs where appropriate.  You can also change the name of the resource to anything you like.  You only need to do this on one of your servers; pacemaker will automatically copy the configuration to all servers in the cluster.

primitive shared_ip_one IPaddr params ip=123.4.5.8 cidr_netmask="255.255.255.0" nic="eth0"
property stonith-enabled="false"
location share_ip_one_master shared_ip_one 100: node-1
monitor shared_ip_one 20s:10s
commit
exit
This has configured the resource "shared_ip_one" with the allocated IP address, told pacemaker to prefer node-1 as the primary server, and told it to monitor the resource for availability.
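
If you need to repeat this for other addresses, the commands entered in the crm shell can be produced from a template. The following is a rough sketch under stated assumptions: render_crm_ip is a hypothetical name, and the netmask, interface, score and monitor timings are hard-coded to the values used in this article.

```shell
# Hypothetical helper: print the crm configure commands for one shared IP.
# $1 = shared IP, $2 = preferred node, $3 = resource name (optional)
render_crm_ip() {
    res=${3:-shared_ip_one}
    printf 'primitive %s IPaddr params ip=%s cidr_netmask="255.255.255.0" nic="eth0"\n' "$res" "$1"
    printf 'property stonith-enabled="false"\n'
    # a score of 100 makes the resource prefer the named node
    printf 'location %s_master %s 100: %s\n' "$res" "$res" "$2"
    # check the resource every 20s, allowing 10s for the check to finish
    printf 'monitor %s 20s:10s\n' "$res"
    printf 'commit\n'
}

# Example:
# render_crm_ip 123.4.5.8 node-1
```

Paste the printed lines into the crm configure shell, then run "crm configure show" afterwards to review what the cluster actually stored.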

You should now have a working heartbeat and pacemaker setup.  To test it, run the following on node-2:

crm_mon
This should give you an output similar to the following:

============
Last updated: <DATE>
Stack: Heartbeat
Current DC: node-1 (ff8a0b30-adeb-40f3-8eb5-dc07a76603d2) - partition with quorum
Version: 1.0.9-unknown
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ node-1 node-2 ]

shared_ip_one     (ocf::heartbeat:IPaddr):        Started node-1
Leave this screen open, start a ping running from your own machine to 123.4.5.8, and then issue the following command on node-1:

poweroff
This shuts down the server, so it can no longer respond to the pings.  However, you should see your ping still working, and the crm_mon screen on node-2 should now look similar to the following:

============
Last updated: <DATE>
Stack: Heartbeat
Current DC: node-2 (ff8a0b30-adeb-40f3-8eb5-dc07a76603d2) - partition with quorum
Version: 1.0.9-unknown
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ node-2 ]
OFFLINE: [ node-1 ]

shared_ip_one     (ocf::heartbeat:IPaddr):        Started node-2
This shows that node-2 has now taken over the IP address.  To get node-1 started again, go into your Rackspace control panel and issue a hard reboot command to the server. Shortly afterwards you should see your crm_mon return to showing both nodes online with node-1 back in control.
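
For scripted checks, the same information can be pulled out of the crm_mon output instead of reading the screen. This is a minimal sketch that assumes the output format shown above, where the resource line ends in "Started" followed by the node name; resource_owner is a hypothetical name.

```shell
# Hypothetical helper: read crm_mon output on stdin and print the node
# currently running the named resource (prints nothing if it is not started).
resource_owner() {
    awk -v res="$1" '$1 == res && $(NF - 1) == "Started" { print $NF }'
}

# Example on a live node (crm_mon -1 prints the status once and exits):
# crm_mon -1 | resource_owner shared_ip_one
```

Running it before and after the poweroff test should print node-1 and then node-2.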

Configure Load Balancing


If you've correctly configured your MySQL Master-Master replication, then you have two servers that you can connect to in order to run any database commands. Now you can also split the load between them and automatically cope with either of them going down.

This stage is very similar to setting up the load balanced web service, and in fact the high availability detailed in this article can be used on your web balancers as well.

On both node-1 and node-2 you need to install HAProxy and get it configured, so issue the following commands; for details refer to the web load balancing article.

apt-get install haproxy -y
cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg_orig
nano /etc/haproxy/haproxy.cfg
As before, leave the "global" and "defaults" sections as they are, and remove all the "listen" sections, then add the following:

listen mysql-cluster 123.4.5.8:3306
    mode tcp
    stats enable
    stats auth admin:<somepassword>
    balance roundrobin

    server db01 10.0.0.3:3306 check
    server db02 10.0.0.4:3306 check
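
Server names are meant to be distinct within a section, and a copied "server" line is an easy slip to miss. The check below is a small sketch for catching it before a restart; duplicate_servers is a hypothetical name, and it only looks at the first two words of each line.

```shell
# Hypothetical helper: print any server name used twice in one section.
# $1 = path to an haproxy.cfg-style file
duplicate_servers() {
    awk '
        # a new section resets the set of names seen so far
        $1 == "global" || $1 == "defaults" || $1 == "listen" ||
        $1 == "frontend" || $1 == "backend" { split("", seen) }
        # print the name the second (and any later) time it appears
        $1 == "server" && seen[$2]++ { print $2 }
    ' "$1"
}

# Example:
# duplicate_servers /etc/haproxy/haproxy.cfg
```

No output means no duplicates were found. HAProxy's own syntax check, "haproxy -c -f /etc/haproxy/haproxy.cfg", is also worth running before restarting the service.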
In addition, you need to set HAProxy to start when the system is booted:

nano /etc/default/haproxy
In this file, set "ENABLED" to "1" so that the line reads ENABLED=1.  Then restart HAProxy and check that there are no errors at startup.

/etc/init.d/haproxy restart
Remember to make this configuration identical on both servers.
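
One quick way to confirm that the two copies really are identical is to compare a checksum from each server. The sketch below assumes that blank lines, comment-only lines and indentation differences do not matter; config_digest is a hypothetical name.

```shell
# Hypothetical helper: checksum a config file, ignoring blank lines,
# comment-only lines and differences in whitespace.
# $1 = path to the file
config_digest() {
    # "$1 = $1" makes awk rebuild the line with single spaces between words
    awk 'NF && $1 !~ /^#/ { $1 = $1; print }' "$1" | cksum
}

# Example, run on each node and compare the two results:
# config_digest /etc/haproxy/haproxy.cfg
```

If the two outputs differ, diff the files directly to find the line that has drifted.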

In any of your web applications that access the database, you need to use 123.4.5.8 as the server.

Conclusion

If you have followed all the articles to this point, and possibly gone back and added heartbeat to your web load balancer, you should now have the resiliency to lose a server from any level of your infrastructure. Due to the relatively low cost of running cloud servers, having a number of servers running at one time should not cost you an arm and a leg.  Whilst you are now maintaining five or six servers, you are free to restart any of them for regular maintenance without taking your site down.

In the next article we will extend this infrastructure further by adding MySQL clustering, giving even more resiliency and speed.