At present you cannot request or share additional internal IP addresses with Rackspace Cloud, so you will have to use the external addresses. With large databases this will incur additional bandwidth charges, so be sure to evaluate this additional cost against the benefits provided by high availability. If this changes, this article will be updated to reflect that.
With two related tools and some cheap Rackspace Cloud servers, you can provide a front-end for your database that balances between multiple replicated database servers and automatically fails over if one of your balancers develops a fault.
This article covers setting up heartbeat and pacemaker to transfer an IP from one machine to another on a failure, and then installing HAProxy on both boxes to balance the load between your database servers and cope with either of them failing.
As with all of these guides, the starting point is to create your Rackspace Cloud servers. Spin up two Ubuntu 10.10 (Maverick) instances at the 256MB size for now; you can increase them later if required. Name your instances appropriately, and note the IP addresses assigned to them by Rackspace. Once your instances are running, create a new support ticket requesting that the first server be assigned an additional IP address, and allow them to configure the server for you. For the purposes of this guide, we will refer to these with the following conventions; replace them with your own data wherever they appear:
Run the following commands on both servers to make sure they are up to date:
apt-get update
apt-get upgrade -y
Installing Heartbeat for High Availability
Because of the way heartbeat works, you need to remove the new IP address that Rackspace configured for you from the server (Rackspace needed to configure it so that their systems were updated correctly). Open the network configuration file, /etc/network/interfaces:
Find the lines similar to the one below, and comment them out by inserting a # at the start of each line:
iface eth0:1 inet static
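The same edit can be scripted rather than done by hand. The following is a minimal sketch, assuming the alias stanza begins at an "auto eth0:1" or "iface eth0:1" line and ends at the next blank line. The function name comment_out_alias and the sample addresses (taken from the 203.0.113.0/24 documentation range) are illustrative only, and the demo works on a scratch copy rather than the live file.

```shell
#!/bin/sh
# Sketch: comment out an interface alias stanza (assumed to end at a blank line).
set -eu

# comment_out_alias ALIAS: read an interfaces file on stdin and write it to
# stdout with every line of the ALIAS stanza prefixed by '#'.
comment_out_alias() {
    sed -E "/^(auto|iface) $1( |\$)/,/^\$/ { /^\$/! s/^/#/; }"
}

# Demo on a scratch file with made-up addresses.
sample=$(mktemp)
cat > "$sample" <<'EOF'
auto eth0
iface eth0 inet static
    address 203.0.113.10
    netmask 255.255.255.0

auto eth0:1
iface eth0:1 inet static
    address 203.0.113.20
    netmask 255.255.255.0

auto eth1
iface eth1 inet static
    address 10.0.0.1
    netmask 255.255.224.0
EOF

comment_out_alias eth0:1 < "$sample"
rm -f "$sample"
```

Review the output before replacing the real file, and keep a backup of /etc/network/interfaces, since a mistake here can leave the server unreachable.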
Now you can install all of the required packages on both servers:
apt-get install heartbeat pacemaker -y
The first part of the configuration is to set up your authkeys files, which will enable the two heartbeat installs to talk to each other and will make sure that no-one else can join your cluster.
You will have a blank file. Add the following lines, replacing SomeSecurePassword with an appropriate password.
auth 1
1 sha1 SomeSecurePassword
Exit nano (CTRL+X) and confirm that you want to save the file. Now change the permissions on the file, otherwise heartbeat won't trust it.
chmod 600 /etc/heartbeat/authkeys
Do the same on both servers, so that the authkeys files match.
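If you would rather generate the key than invent a password, the steps above could be sketched as follows. This is an illustration under assumptions, not the article's original procedure: the function name write_authkeys is hypothetical, the key is simply a SHA-1 digest of random bytes, and the demo writes to a scratch directory instead of /etc/heartbeat.

```shell
#!/bin/sh
# Sketch: create a heartbeat authkeys file with owner-only permissions.
set -eu

# write_authkeys FILE KEY: write the two-line authkeys format and restrict
# the file to its owner, as heartbeat will not trust a more open file.
write_authkeys() {
    (
        umask 077
        printf 'auth 1\n1 sha1 %s\n' "$2" > "$1"
    )
    chmod 600 "$1"
}

# Demo in a scratch directory. On a real node the target would be
# /etc/heartbeat/authkeys, and the SAME key must be used on both servers.
demo_dir=$(mktemp -d)
key=$(head -c 32 /dev/urandom | sha1sum | cut -d' ' -f1)   # 40 hex characters
write_authkeys "$demo_dir/authkeys" "$key"
ls -l "$demo_dir/authkeys"
rm -rf "$demo_dir"
```

Generate the key once and copy the resulting file to the second server, rather than running the generator on each node, so that both files match.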
The next file to configure is the actual configuration for heartbeat. This will contain very little information as we allow pacemaker to handle the real meat of the config.
On the first server, open the file in nano; it will be blank:
ucast eth1 10.0.0.2
It is important that the "node" lines match the internal hostname of the servers; you can check this with "uname -n". Only one line needs to change for the second server: create the file exactly as above, then change the "ucast" line to use the internal IP of node-1, e.g.:
ucast eth1 10.0.0.1
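Because the two files differ only in the "ucast" line, they can be produced from one template. The sketch below is an assumption-heavy illustration: only the "ucast" line and the presence of "node" lines come from this article. The remaining directives (logfacility, keepalive, deadtime, warntime, initdead, udpport, crm) are standard heartbeat settings with commonly used values, chosen here as placeholders, and the function name render_ha_cf is hypothetical. Check them against your own requirements before use.

```shell
#!/bin/sh
# Sketch: render a heartbeat ha.cf that differs per node only in the ucast peer.
set -eu

# render_ha_cf PEER_IP: print an ha.cf whose ucast line points at the OTHER node.
# All timing values below are assumed defaults, not taken from the article.
render_ha_cf() {
    cat <<EOF
logfacility daemon
keepalive 2
deadtime 15
warntime 5
initdead 120
udpport 694
ucast eth1 $1
node node-1
node node-2
crm respawn
EOF
}

echo "# ha.cf for node-1 (peer is node-2)"
render_ha_cf 10.0.0.2
echo "# ha.cf for node-2 (peer is node-1)"
render_ha_cf 10.0.0.1
```

The "crm respawn" line is what hands resource management over to pacemaker, which is why the file itself stays so short.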
Once you have created both of these files on both servers, you can start the heartbeat service with the following command:
You now have heartbeat running, but you haven't yet told it about the IP address you want to share between the two servers. That is where pacemaker comes in.
Configure Pacemaker and Share the IP
Start by opening another ticket with Rackspace asking them to share the IP between both servers. Again, have them configure it for you on node-2, and again take out the reference to it in /etc/network/interfaces. You can continue your configuration whilst they do this; the IP will switch over, but it won't be accessible on node-2 until they have completed their network configuration.
We will use pacemaker to do all the required configuration; its command-line tool is still called "crm" for historical reasons. Open up the pacemaker configuration tool:
You will be setting up what pacemaker refers to as a "resource". A resource can be almost any service or application, as well as an individual IP. You can even have heartbeat start services on the failover device when needed, but that is beyond the scope of this article; here you will just configure a single IP to be shared.
Now enter your configuration in the crm shell, remembering to replace the IPs where appropriate. You can also change the name of the resource to anything you like. You only need to do this on one of your servers, as pacemaker will automatically copy the configuration to all servers in the cluster.
primitive shared_ip_one IPaddr params ip=220.127.116.11 cidr_netmask="255.255.255.0" nic="eth0"
location share_ip_one_master shared_ip_one 100: node-1
monitor shared_ip_one 20s:10s
This configures the resource "shared_ip_one" with the allocated IP address, tells pacemaker to prefer node-1 as the primary server, and tells it to monitor the resource for availability.
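To avoid retyping these lines when the IP or preferred node changes, they can be generated from a small template. This is a sketch only: the function name render_crm_config is hypothetical, the address 203.0.113.20 is a documentation placeholder, and the trailing "commit" assumes the lines are entered in the crm shell's configure mode, where changes take effect once committed.

```shell
#!/bin/sh
# Sketch: print the crm configuration lines for a shared IP resource.
set -eu

# render_crm_config SHARED_IP PRIMARY_NODE: print the three lines from the
# article with the IP and preferred node filled in, followed by "commit".
render_crm_config() {
    cat <<EOF
primitive shared_ip_one IPaddr params ip=$1 cidr_netmask="255.255.255.0" nic="eth0"
location share_ip_one_master shared_ip_one 100: $2
monitor shared_ip_one 20s:10s
commit
EOF
}

# Placeholder values; substitute your shared IP and the hostname from "uname -n".
render_crm_config 203.0.113.20 node-1
```

The score of 100 on the location line is a preference rather than a hard rule, which is what allows the IP to move to node-2 when node-1 is unavailable.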
You should now have a working heartbeat and pacemaker setup. To test it, run the following on node-2:
This should give you an output similar to the following:
Current DC: node-1 (ff8a0b30-adeb-40f3-8eb5-dc07a76603d2) - partition with quorum
2 Nodes configured, unknown expected votes
Online: [ node-1 node-2 ]
shared_ip_one (ocf::heartbeat:IPaddr): Started node-1
Leave this screen open, start a ping running from your machine to 18.104.22.168, and then issue the following command on node-1:
This shuts down the server, so it can no longer respond to pings. However, you should see your ping still working, and the crm_mon screen on node-2 should now look similar to the following:
Current DC: node-2 (ff8a0b30-adeb-40f3-8eb5-dc07a76603d2) - partition with quorum
2 Nodes configured, unknown expected votes
shared_ip_one (ocf::heartbeat:IPaddr): Started node-2
This shows that node-2 has now taken over the IP address. To get node-1 started again, go into your Rackspace control panel and issue a hard reboot command to the server. Shortly afterwards you should see your crm_mon return to showing both nodes online with node-1 back in control.
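If you want to check which node holds the IP from a script instead of watching the screen, the status text can be parsed. The following is a minimal sketch, assuming the resource line has the same shape as the output shown above (resource name first, then "Started" and the node name at the end of the line). The function name resource_owner is hypothetical, and the demo uses canned text rather than a live cluster.

```shell
#!/bin/sh
# Sketch: extract the node currently running a resource from crm_mon-style text.
set -eu

# resource_owner NAME: read status text on stdin and print the node on which
# resource NAME is reported as Started (prints nothing if it is not started).
resource_owner() {
    awk -v res="$1" '$1 == res && $(NF-1) == "Started" { print $NF }'
}

# Canned sample in the same shape as the article's output.
status='Online: [ node-1 node-2 ]
shared_ip_one (ocf::heartbeat:IPaddr): Started node-1'

printf '%s\n' "$status" | resource_owner shared_ip_one   # prints node-1
```

On a live node, the input could come from a one-shot status run such as "crm_mon -1" piped into the function, assuming your version formats the resource line the same way.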
If you've correctly configured your MySQL Master to Master Replication, then you have two servers that you can connect to in order to perform any database commands. Now you can also split the load between them and automatically cope with either of them going down.
This stage is very similar to setting up the load balanced web service, and in fact the high availability detailed in this article can be used on your web balancers as well.
On both of these servers you need to install HAProxy and configure it, so issue the following commands. For details, refer to the web load balancing article.
In this file, set ENABLED to 1 (ENABLED=1). Then restart HAProxy and check that there are no errors on startup.
Remember to make this configuration identical on both servers.
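One way to keep the two balancers identical is to generate the HAProxy configuration from a single template and copy the result to both. The sketch below is not the configuration from the original article; it is an assumed minimal TCP-mode setup that listens on the MySQL port and balances round-robin across two database servers. The function name render_haproxy_cfg, the backend names db1 and db2, the timeouts, and the 198.51.100.x addresses (a documentation range) are all placeholders.

```shell
#!/bin/sh
# Sketch: render a minimal TCP-mode HAProxy config for two MySQL servers.
set -eu

# render_haproxy_cfg DB1_IP DB2_IP: print an haproxy.cfg that balances
# connections on port 3306 between the two given database servers.
render_haproxy_cfg() {
    cat <<EOF
global
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    mode tcp
    timeout connect 5s
    timeout client 1m
    timeout server 1m

listen mysql
    bind 0.0.0.0:3306
    balance roundrobin
    server db1 $1:3306 check
    server db2 $2:3306 check
EOF
}

# Placeholder database addresses; substitute your own.
render_haproxy_cfg 198.51.100.11 198.51.100.12
```

The "check" keyword makes HAProxy test each database server with a TCP connection and stop sending traffic to one that fails, which is what lets the balancer cope with either database going down.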
In any of your web applications that access the database, you need to use 22.214.171.124 as the server.
If you have followed all the articles to this point, and possibly gone back and added heartbeat to your web load balancer, you should now have the resiliency to lose a server from any level of your infrastructure. Thanks to the relatively low cost of running cloud servers, having a number of servers running at one time should not cost you an arm and a leg. Whilst you are now maintaining five or six servers, you are free to restart any of them for regular maintenance without taking your site down.
In the next article we will extend this infrastructure further by adding MySQL clustering, giving even more resiliency and speed.