linux pci=nommconf nohpet
I set up the disks with the ServeRAID adapter software: /dev/sda is a RAID 1 array built from 2 disks and /dev/sdb is a RAID 5 array built from the remaining 4 disks. Linux will be installed on /dev/sda, and /dev/sdb will hold the data shared between the nodes.
Don't forget to connect the eth2 adapters with a crossover cable and the serial ports with a null modem cable. I then set the production Ethernet interface to 100 Mbit/s full duplex (100 Mbit/s was a requirement from the customer) and the crossover link to 1000 Mbit/s full duplex.
Example files from server1 are below.
/etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.1.1.10
NETMASK=255.255.255.0
GATEWAY=10.1.1.1
TYPE=Ethernet
ETHTOOL_OPTS="speed 100 duplex full autoneg off"
/etc/sysconfig/network-scripts/ifcfg-eth2:
DEVICE=eth2
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.1.2.10
NETMASK=255.255.255.0
TYPE=Ethernet
ETHTOOL_OPTS="speed 1000 duplex full autoneg off"
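After restarting the network it is worth checking that the forced settings were really applied. This is only a minimal sketch: the helper name link_settings is my own, and it assumes the usual "Speed:", "Duplex:" and "Auto-negotiation:" lines printed by ethtool.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool <iface>` output on stdin and
# print "speed duplex autoneg" on a single line.
link_settings() {
    awk -F': *' '
        /^[[:space:]]*Speed:/            { speed = $2 }
        /^[[:space:]]*Duplex:/           { duplex = $2 }
        /^[[:space:]]*Auto-negotiation:/ { autoneg = $2 }
        END { print speed, duplex, autoneg }
    '
}
```

On a live machine you would run it as: ethtool eth0 | link_settings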
/etc/hosts needs to be changed accordingly:
10.1.1.10 server1
10.1.1.11 server2
10.1.1.12 serviceIP
10.1.2.10 server1repl
10.1.2.11 server2repl
10.1.1.13 s1rsa
10.1.1.14 s2rsa
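Since the hosts file has to match on both nodes, a quick check for missing names can save some debugging later. The sketch below is an assumption of mine, not part of the original setup; missing_hosts is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every required name that has no entry
# in the given hosts file. Usage: missing_hosts FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }    # drop comments
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$name"
    done
}
```

For this cluster the call would be: missing_hosts /etc/hosts server1 server2 serviceIP server1repl server2repl s1rsa s2rsa. No output means every name is present.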
You should install additional RPMs:
libnet-1.1.2.1-2.1.x86_64.rpm
perl-Net-SSLeay-1.30-4.fc6.x86_64.rpm
perl-TimeDate-1.16-5.el5.noarch.rpm
heartbeat-2.1.4-2.1.x86_64.rpm
heartbeat-devel-2.1.4-2.1.x86_64.rpm
heartbeat-pils-2.1.4-2.1.x86_64.rpm
heartbeat-stonith-2.1.4-2.1.x86_64.rpm
drbd82-8.2.6-1.el5.centos.x86_64.rpm
kmod-drbd82-8.2.6-1.2.6.18_92.el5.x86_64.rpm
I manually created the haclient group and the hacluster user:
# groupadd -g 496 haclient
# useradd -M -g haclient -u 498 -d /var/lib/heartbeat/cores/hacluster hacluster
This installation should be done on both server1 and server2. After installing DRBD you have to start it, which makes the /proc/drbd file available. The DRBD config file is shown a little further down.
Create meta data on both machines:
[root@server1]# drbdadm create-md r0
[root@server2]# drbdadm create-md r0
Start DRBD on both machines:
[root@server1]# /etc/init.d/drbd start
[root@server2]# /etc/init.d/drbd start
Now we can check the status:
[root@server1]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-06-21 08:48:13
0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:425709436
Both nodes are Secondary and Inconsistent. Let's make server1 our primary:
[root@server1]# drbdsetup /dev/drbd0 primary -o
[root@server1]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-06-21 08:48:13
0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
ns:787924 nr:0 dw:0 dr:795968 al:0 bm:48 lo:2 pe:4 ua:253 ap:0 oos:424921628
[>....................] sync'ed: 0.2% (414962/415731)M
finish: 14:45:15 speed: 7,648 (7,800) K/sec
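If you want to use these status fields in a script, they are easy to pull out of /proc/drbd. This is a rough sketch that assumes the 8.2-style status line shown above and only looks at minor 0; drbd_field is a name I made up.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field (cs, st or ds)
# for minor 0 from /proc/drbd text read on stdin.
drbd_field() {
    field=$1
    awk -v f="$field" '
        $1 == "0:" {
            for (i = 2; i <= NF; i++)
                if (index($i, f ":") == 1) {
                    print substr($i, length(f) + 2)
                    exit
                }
        }
    '
}
```

For example, drbd_field ds < /proc/drbd prints the disk states of both nodes, such as UpToDate/Inconsistent during the initial sync.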
The initial synchronisation will take some time. If you reboot the machines after it has finished, both will come up as Secondary but UpToDate, so promote one of them to primary with:
[root@server1]# drbdadm primary r0
Now that one of them is primary, make a filesystem on it:
[root@server1]# mke2fs -j /dev/drbd0
and add it to /etc/fstab on both machines:
/dev/drbd0 /u ext3 defaults,noauto 0 0
Now you have a working DRBD configuration and you can mount the /u filesystem on the primary node. This is the DRBD config file, /etc/drbd.conf:
global {
usage-count no;
}
common {
syncer { rate 30M; }
}
resource r0 {
protocol C;
handlers {
pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater -t 5";
}
startup {
degr-wfc-timeout 360; # 6 minutes.
}
disk {
on-io-error detach;
}
net {
timeout 60; # 6 seconds (unit = 0.1 seconds)
connect-int 10; # 10 seconds (unit = 1 second)
ping-int 10; # 10 seconds (unit = 1 second)
ping-timeout 5; # 500 ms (unit = 0.1 seconds)
max-buffers 2048;
unplug-watermark 128;
max-epoch-size 2048;
ko-count 4;
cram-hmac-alg "sha1";
shared-secret "SomeSecret";
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
data-integrity-alg "md5";
}
syncer {
rate 30M;
al-extents 257;
}
on server1 {
device /dev/drbd0;
disk /dev/sdb1;
address 10.1.2.10:7788;
flexible-meta-disk internal;
}
on server2 {
device /dev/drbd0;
disk /dev/sdb1;
address 10.1.2.11:7788;
meta-disk internal;
}
}
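The net section mixes two units: timeout and ping-timeout are in tenths of a second, while connect-int and ping-int are in whole seconds. As far as I know from the drbd.conf man page, timeout must be lower than both connect-int and ping-int. The sketch below only does that arithmetic; the function name is my own and it does not parse drbd.conf.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the net timeout (tenths of a second)
# is shorter than both connect-int and ping-int (whole seconds).
# Usage: timeouts_consistent TIMEOUT CONNECT_INT PING_INT
timeouts_consistent() {
    timeout_tenths=$1
    connect_int=$2
    ping_int=$3
    [ "$timeout_tenths" -lt $((connect_int * 10)) ] &&
    [ "$timeout_tenths" -lt $((ping_int * 10)) ]
}
```

With the values from the config above, timeouts_consistent 60 10 10 succeeds because 6 seconds is less than 10 seconds.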
To configure heartbeat, first unmount the newly created filesystem and make /dev/drbd0 secondary on both nodes. Then go to the /etc/ha.d directory and create the ha.cf config file:
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
baud 19200
serial /dev/ttyS0 # Linux
bcast eth0 eth2 # Linux
auto_failback off
stonith external/ibmrsa-telnet /etc/ha.d/stonith.ibmrsa
node server1
node server2
ping 10.1.1.1
respawn hacluster /usr/lib64/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
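The four timing values depend on each other: keepalive should be smaller than warntime, warntime smaller than deadtime, and the heartbeat documentation suggests an initdead of at least twice the deadtime. Here is a small sanity check, written as a sketch. The helper names are mine, and it assumes the values are plain whole seconds without an "ms" suffix.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one directive from an ha.cf file.
# Usage: ha_value FILE DIRECTIVE
ha_value() {
    awk -v k="$2" '$1 == k { print $2; exit }' "$1"
}

# Hypothetical helper: succeed if the timings in the given ha.cf satisfy
# keepalive < warntime < deadtime and initdead >= 2 * deadtime.
check_ha_timings() {
    cf=$1
    keepalive=$(ha_value "$cf" keepalive)
    warntime=$(ha_value "$cf" warntime)
    deadtime=$(ha_value "$cf" deadtime)
    initdead=$(ha_value "$cf" initdead)
    [ "$keepalive" -lt "$warntime" ] &&
    [ "$warntime" -lt "$deadtime" ] &&
    [ "$initdead" -ge $((deadtime * 2)) ]
}
```

Run it as check_ha_timings /etc/ha.d/ha.cf and look at the exit status; the file above passes with 2 < 10 < 30 and 120 >= 60.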
All you need now is some more files...
/etc/ha.d/authkeys MUST be the same on both machines:
auth 1
1 sha1 letsmakeitsecret
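Heartbeat complains about the authkeys file and will not start unless the file is readable by its owner only (mode 600). A minimal sketch for creating the file with the right permissions follows; write_authkeys is a made-up helper, and the secret is passed in rather than generated.

```shell
#!/bin/sh
# Hypothetical helper: write a heartbeat authkeys file using sha1
# and restrict it to the owner. Usage: write_authkeys PATH SECRET
write_authkeys() {
    path=$1
    secret=$2
    # The restrictive umask keeps the file private from the moment it is created.
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$secret" > "$path" )
    chmod 600 "$path"
}
```

On the real nodes this would be write_authkeys /etc/ha.d/authkeys letsmakeitsecret, run as root, followed by copying the file to the other node.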
/etc/ha.d/haresources MUST be the same on both machines:
server1 10.1.1.12 drbddisk::r0 Filesystem::/dev/drbd0::/u::ext3
/etc/ha.d/stonith.ibmrsa on server1 (user and password are still default):
server1 10.1.1.14 USERID PASSW0RD
/etc/ha.d/stonith.ibmrsa on server2 (user and password are still default):
server2 10.1.1.13 USERID PASSW0RD
You can now start heartbeat on both nodes:
[root@server1]# /etc/init.d/heartbeat start
[root@server2]# /etc/init.d/heartbeat start
You can follow what's going on by tailing /var/log/messages. Soon the active node will have an eth0:0 alias with the 10.1.1.12 IP and the /u filesystem mounted.
You should now do some testing. Try at least these three tests:
1. Turn the primary node off by holding down the power switch. Takeover works, and STONITH turns the primary node back on.
2. Unplug the power cables from the primary node. No takeover happens in this environment, because the RSA adapter is also without power and heartbeat will not take over without assurance from STONITH that the node is down. Just make sure both nodes have redundant power supplies and this will not be a problem.
3. Unplug the production Ethernet cable from the eth0 adapter on the primary node. Takeover works, and STONITH does not reset the primary machine.
To return the resources to the original primary machine, turn it on (if it isn't already on) and wait for DRBD to synchronise the disks. DRBD should be set to start automatically at boot.
[root@server1]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-06-21 08:48:13
0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
ns:787924 nr:0 dw:0 dr:795968 al:0 bm:48 lo:2 pe:4 ua:253 ap:0 oos:424921628
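Waiting for the sync can be scripted instead of watched by hand. The sketch below assumes the 8.2-style status line for minor 0 shown above; the function names and the default 10 second polling interval are my own choices.

```shell
#!/bin/sh
# Hypothetical helper: succeed when /proc/drbd text on stdin shows
# minor 0 connected with both disks UpToDate.
drbd_in_sync() {
    grep -Eq '^ *0: cs:Connected .*ds:UpToDate/UpToDate'
}

# Hypothetical helper: poll a status file until drbd_in_sync succeeds.
# Usage: wait_for_sync [FILE] [INTERVAL_SECONDS]
wait_for_sync() {
    status_file=${1:-/proc/drbd}
    interval=${2:-10}
    until drbd_in_sync < "$status_file"; do
        sleep "$interval"
    done
}
```

Called without arguments, wait_for_sync reads /proc/drbd and returns once both sides are UpToDate, so it can sit in front of the heartbeat start command.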
When the disks are synchronised, start heartbeat on server1 and wait for the cluster to stabilise.
[root@server1]# /etc/init.d/heartbeat start
Now both machines are up and running, but the resources are still on server2. To move them back to server1, stop heartbeat on server2, wait for the resources to move, and then start it again.
[root@server2]# /etc/init.d/heartbeat stop
Wait for resources to move back...
[root@server2]# /etc/init.d/heartbeat start