
21. Setting up Linux-HA for directors using rpms

This was posted to the mailing list by Peter Mueller (pmueller@sidestep.com) on 17 Sep 2001. (The original was in HTML with DOS carriage control. I've converted it by hand, so there may be some parsing errors. Joe)

The original mon files are from Juri. The rest comes from personal experience, the mailing lists (linux-ha or LVS) or the respective websites.

urls

Note: these scripts assume mon 0.99.2. To keep the install simple I downloaded the mon-0.38 rpm (I couldn't find a newer 0.99.2 rpm) and upgraded to 0.99.2 from source. I then changed the appropriate lines in /etc/rc.d/init.d/mon.

21.1 linux-ha howto

This document is a mini how-to for getting heartbeat working between two individually working LVS boxes. It is certainly not intended to be an all-encompassing document detailing everything imaginable. What it is intended to deliver is the 'essential steps' for getting LVS-HA functional. And you definitely should have two individually functioning boxes before even attempting this. (Yes, go back and test your setup with each box to ensure it works!)

Another important note to add is that I have only tested this setup with Ultramonkey RPMs. I don't know if your setup will work. I wouldn't trust this document unless you do the same. (I would be interested in knowing if the HA features are the same for all 'heartbeat' setups..)

PS. - apologies if this document is RedHat biased, I'm running from VALinux boxes that are RedHat configured.

Fix the (possible) ethernet alias issue.

By now you've set up a dummy alias device on each LVS box (most likely eth0:0). This alias device is unnecessary and potentially problematic in the HA setup, because the heartbeat software (the scripts in /etc/ha.d/resource.d/) creates a new eth0:0 device on the active box itself. If you also have an eth0:0 (or whatever) alias configured for your VIP on the standby director box, you might get a "VSbox2 kernel: Uh Oh, MAC address 00:02:B3:03:9A:13 claims to have our IP address (vip.ip.goes.here) (duplicate IP conflict likely)" error! Not good...

If I were you I'd move your alias script out of your /etc/sysconfig/network-scripts/ directory and restart networking to clear out that alias. Alternatively, if you are using shell scripts then you should modify those to not control alias ips.
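
One way to do this on a RedHat-style box is sketched below. The file name ifcfg-eth0:0 and the backup directory are assumptions on my part; substitute whatever your alias script is actually called. The function only moves the file; restarting networking is left as a separate step so you can check the result first.

```shell
#!/bin/sh
# Sketch: park the VIP alias config outside network-scripts.
# ALIAS_CFG and BACKUP_DIR are assumed names, adjust them for your box.
NETSCRIPTS="/etc/sysconfig/network-scripts"
ALIAS_CFG="ifcfg-eth0:0"
BACKUP_DIR="/root/network-scripts.disabled"

disable_alias() {
    src="$NETSCRIPTS/$ALIAS_CFG"
    if [ ! -f "$src" ]; then
        echo "no alias config at $src, nothing to do"
        return 0
    fi
    # Keep the file around so the change is easy to undo.
    mkdir -p "$BACKUP_DIR" || return 1
    mv "$src" "$BACKUP_DIR/" || return 1
    echo "moved $src to $BACKUP_DIR"
}
```

After calling disable_alias, run /etc/rc.d/init.d/network restart and confirm with ifconfig -a that eth0:0 is gone.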

Configure /etc/ha.d/. files.
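
The original posting does not include the files themselves. As a rough sketch only, the function below writes a minimal ha.cf, haresources and authkeys into a scratch directory so you can review them before copying them to /etc/ha.d/. The node names, interface, serial port, UDP port, VIP and netmask are taken from the log and ifconfig output later in this section. The timing values, baud rate, shared secret and the ldirectord.cf file name are placeholders I've assumed, not values from the original setup.

```shell
#!/bin/sh
# Sketch: generate draft heartbeat config files for review.
# Everything here is illustrative; edit the values for your site.
OUT_DIR="./ha.d.draft"
PRIMARY="vs1.internal.smartbasket.com"   # must match `uname -n` on the primary
STANDBY="stage-monitor"                  # must match `uname -n` on the standby
VIP="10.0.1.10"
LDIRECTORD_CF="ldirectord.cf"            # assumed config file name
AUTH_SECRET="change-me"                  # placeholder shared secret

write_ha_config() {
    mkdir -p "$OUT_DIR" || return 1

    # ha.cf: how the two directors talk to each other.
    # Older heartbeat releases spell the ethernet medium "udp";
    # later ones call it "bcast".
    cat > "$OUT_DIR/ha.cf" <<EOF
logfile /var/log/ha-log
keepalive 2
deadtime 10
serial /dev/ttyS0
baud 19200
udpport 694
udp eth0
node $PRIMARY
node $STANDBY
EOF

    # haresources: the preferred owner, the VIP and the services that
    # follow it. This file must be identical on both nodes.
    cat > "$OUT_DIR/haresources" <<EOF
$PRIMARY IPaddr::$VIP/24/eth0 ldirectord::$LDIRECTORD_CF
EOF

    # authkeys: shared secret; heartbeat refuses to start unless the
    # file is readable by root only.
    cat > "$OUT_DIR/authkeys" <<EOF
auth 1
1 sha1 $AUTH_SECRET
EOF
    chmod 600 "$OUT_DIR/authkeys"
}
```

Call write_ha_config, read through the three files, and copy them to /etc/ha.d/ on both directors once they match your site.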

21.2 Stop ldirectord from starting, ensure heartbeat starts on reboot

/etc/rc.d/init.d/ldirectord stop
/usr/sbin/chkconfig --level 2345 ldirectord off
/usr/sbin/chkconfig --level 345 heartbeat on # <-- run on whatever init levels you want
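
To double-check the result, chkconfig --list prints the runlevel settings for a service. The helper below (runlevels_on is a name I've made up) just pulls the runlevels marked "on" out of that output; it is a convenience sketch, not part of the original procedure.

```shell
#!/bin/sh
# Sketch: read `chkconfig --list <service>` output on stdin and print
# the runlevels in which the service is switched on.
runlevels_on() {
    awk '{
        out = ""
        # Field 1 is the service name; the rest look like "3:on".
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-6]:on$/) {
                out = out (out == "" ? "" : " ") substr($i, 1, 1)
            }
        }
        print out
    }'
}
```

With the commands above, /usr/sbin/chkconfig --list heartbeat | runlevels_on should list 3 4 5, and the same pipeline for ldirectord should print nothing for levels 2 to 5.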

21.3 starting heartbeat and verifying functionality

At this point you should have linux-director NOT running on both boxes. If you type ipvsadm -L on either box you should get:

[root@vs1 ha.d]# ipvsadm -L
IP Virtual Server version 0.9.11 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
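
If you'd rather script this check than eyeball it, something like the following works. It is a small sketch (count_virtual_services is my own name) that counts the virtual service lines in ipvsadm output, which begin with TCP, UDP or FWM. An empty table like the one above gives 0.

```shell
#!/bin/sh
# Sketch: count virtual services in `ipvsadm -L -n` output read from stdin.
# Header lines and real-server lines ("  -> ...") are not counted.
count_virtual_services() {
    awk '/^(TCP|UDP|FWM)[ \t]/ { n++ } END { print n + 0 }'
}
```

Run it as ipvsadm -L -n | count_virtual_services on each director before starting heartbeat.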

Now start up heartbeat, then tail /var/log/messages and /var/log/ha-log for important log information. My /var/log/messages looks like:

Apr 24 13:12:38 vs1 heartbeat[2070]: Configuration validated. Starting heartbeat.
Apr 24 13:12:39 vs1 heartbeat[2075]: Starting serial heartbeat on tty /dev/ttyS0
Apr 24 13:12:39 vs1 heartbeat[2075]: UDP heartbeat started on port 694 interface eth0
Apr 24 13:12:39 vs1 heartbeat[2077]: node vs1.internal.smartbasket.com -- link eth0: status up
Apr 24 13:12:39 vs1 heartbeat[2077]: node stage-monitor -- link /dev/ttyS0: status up
Apr 24 13:12:39 vs1 heartbeat[2077]: node stage-monitor -- link eth0: status up
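
The lines to look for are the "status up" ones: you want one per node per heartbeat medium. As a sketch, assuming your log lines have the same shape as mine, this helper (links_up is a made-up name) prints the node and link for each of them:

```shell
#!/bin/sh
# Sketch: list "node link" pairs that heartbeat has reported as up.
# Takes a log file as its argument, defaulting to /var/log/messages.
links_up() {
    sed -n 's/.*heartbeat\[[0-9]*\]: node \([^ ]*\) -- link \([^ ]*\): status up.*/\1 \2/p' \
        "${1:-/var/log/messages}"
}
```

If a link you configured in ha.cf never shows up here, check the cabling for that medium before going any further.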

And a quick check of ifconfig on the primary director shows the alias interface (eth0:0) appears. Note that eth0:0 is *NOT* present when heartbeat isn't running.

[root@vs1 ha.d]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:02:B3:06:B6:45
          inet addr:10.0.1.5  Bcast:10.0.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:106550 errors:0 dropped:0 overruns:0 frame:0
          TX packets:75338 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0xd000

eth0:0    Link encap:Ethernet  HWaddr 00:02:B3:06:B6:45
          inet addr:10.0.1.10  Bcast:10.0.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:10 Base address:0xd000

A ps aux on the active director shows :

root      1648  0.0  0.1  1444  868 ttyS0    SL   13:17   0:00 /usr/lib/heartbeat/heartbeat
root      1650  0.0  0.1  1332  748 ttyS0    SL   13:17   0:00 /usr/lib/heartbeat/heartbeat
root      1651  0.0  0.1  1332  736 ttyS0    SL   13:17   0:00 /usr/lib/heartbeat/heartbeat
root      1652  0.0  0.1  1328  736 ttyS0    S    13:17   0:00 /usr/lib/heartbeat/heartbeat
root      1653  0.0  0.1  1332  732 ttyS0    SL   13:17   0:00 /usr/lib/heartbeat/heartbeat
root      1654  0.0  0.1  1328  728 ttyS0    S    13:17   0:00 /usr/lib/heartbeat/heartbeat
root      1775  0.0  0.8  5352 4388 ttyS0    S    13:17   0:00 perl /etc/ha.d/resource.d/ldirectord ldir
root      1869  0.0  0.1  2344  724 pts/0    R    13:20   0:00 ps aux

21.4 Test your fail-over features, understand HA.

At this point you should test your failover functionality and learn how your setup works. You also need to customize your ha.cf file to the specifications for your site.

As noted in the 'getting started' document mentioned in the url section above, be certain to NOT yank all heartbeat medium cables at once! This will cause a 'split brain' scenario and you won't be happy! Test failover possibilities one at a time, or catastrophically!
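
A simple first test is to stop heartbeat cleanly on the active director (/etc/rc.d/init.d/heartbeat stop) and watch the standby take over the VIP. The sketch below is meant to be run on the standby; it polls ifconfig until the VIP shows up or it gives up. The function names and the 30 second default are my own choices, and the "inet addr:" match assumes ifconfig output in the format shown earlier.

```shell
#!/bin/sh
# Sketch: wait for the VIP to appear on this box after a failover.
list_addrs() {
    ifconfig -a
}

wait_for_vip() {
    vip=$1
    tries=${2:-30}      # one attempt per second
    while [ "$tries" -gt 0 ]; do
        # -F: treat the address as a fixed string, not a regex.
        # The trailing space stops 10.0.1.1 from matching 10.0.1.10.
        if list_addrs | grep -qF "inet addr:$vip "; then
            echo "VIP $vip is up on this box"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "VIP $vip did not appear" >&2
    return 1
}
```

For the setup in this section that would be wait_for_vip 10.0.1.10 on the standby. Start heartbeat again on the first box afterwards and repeat with one heartbeat medium unplugged at a time.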

21.5 Configuration of mon - recommended

