DRBD8 + GFS2 on debian etch

This article should be seen as a continuation of the post entitled DRBD8 with two primaries on debian etch. There we noticed that DRBD8 with two primaries can neither ensure filesystem integrity nor work as expected under an Ext3 filesystem: DRBD focuses only on synchronization. As shown in the following figure, read accesses take place only locally, while write accesses are applied both locally and remotely on the other node, so that at any time the global filesystem is consistent on both nodes.

This post shows how to ensure data protection when two master nodes are synchronized with DRBD8. The key point is the filesystem: a lock mechanism must be established to ensure protection, which is why you have to deploy a filesystem that includes such a mechanism, like GFS2 or OCFS2. The latter revealed malfunctions in our tests, so we focus on setting up GFS2 on debian etch, the only candidate that met our requirements. The result is a shared-nothing architecture between two nodes using DRBD8 + GFS2. Such a system does not consume too many resources, since only write accesses travel over the network and each host processes a request only once. Keep in mind that a specific filesystem is mandatory in our case: a filesystem with a lock manager is the only way to prevent data corruption when modifications are made from the two nodes simultaneously.

Configuration files

System requirements

  • 2.6.24 kernel at least

[tux]# apt-get update
[tux]# apt-get install linux-image-2.6.24-etchnhalf.1-686
[tux]# apt-get install linux-headers-2.6.24-etchnhalf.1-686
[tux]# reboot

  • Install dpkg-dev and other dependencies to get and build deb packages from sources:

[tux]# apt-get install dpkg-dev debhelper dpatch fakeroot gcc libc6-dev bzip2

  • Update your source list to be able to get packages from debian lenny, which, contrary to etch, contains GFS2 packages. Add the following lines to your /etc/apt/sources.list and comment out the default ones:

deb http://ftp.de.debian.org/debian lenny main
deb-src http://ftp.de.debian.org/debian lenny main
#deb http://ftp.de.debian.org/debian etch main
#deb-src http://ftp.de.debian.org/debian etch main

  • Then run apt-get update

[tux]# apt-get update

Check out GFS2 sources from lenny

  • First, create a directory to store source packages, enter:

[tux]# mkdir build
[tux]# cd build/

  • Use apt-get command to get source code for a package:

[tux]# apt-get source gfs2-tools

  • At the same time, get the following source packages from lenny repository, you will need them later to be able to build gfs2-tools successfully:

[tux]# apt-get source findutils libopenais-dev libvolume-id-dev

  • Once this step is done, you don’t need the lenny repository any more. Update your source list to be able to get packages from debian backports (this avoids compiling every dependency package) and comment out the lenny repository. Afterwards you’ll get every package from the default etch repository or from backports. To install the remaining dependencies, your /etc/apt/sources.list should contain the following lines:

#deb http://ftp.de.debian.org/debian lenny main
#deb-src http://ftp.de.debian.org/debian lenny main
deb http://ftp.fr.debian.org/debian/ etch main
deb-src http://ftp.fr.debian.org/debian/ etch main
deb http://www.backports.org/debian etch-backports main contrib non-free

  • If you use the backports repository for the first time and want apt to verify the downloaded backports packages, import the backports.org archive’s key into apt, then run apt-get update to take the modification into account:

[tux]# apt-get update
[tux]# apt-get install debian-backports-keyring
[tux]# apt-get update

Build the packages fetched from lenny

  • To compile and build a debian package, enter its directory and issue the build command:

[tux]# cd redhat-cluster-2.[current_date]/
[tux]# dpkg-buildpackage -rfakeroot -b -uc
dpkg-checkbuilddeps: Unmet build dependencies: libxml2-dev libncurses5-dev libopenais-dev (>= 0.83) libvolume-id-dev (>= 0.105-4) linux-libc-dev (>= 2.6.26) libvirt-dev (>= 0.3.0) libnss3-dev libnspr4-dev libslang2-dev

  • There are lots of unmet dependencies for gfs2-tools. First build and install libopenais-dev:

[tux]# cd ..
[tux]# cd openais-0.83/
[tux]# dpkg-buildpackage -rfakeroot -b -uc
dpkg-deb: building package `openais' in `../openais_0.83-1_i386.deb'.
dpkg-deb: building package `libopenais2' in `../libopenais2_0.83-1_i386.deb'.
dpkg-deb: building package `libopenais-dev' in `../libopenais-dev_0.83-1_i386.deb'.

[tux]# cd ..
[tux]# dpkg -i libopenais-dev_0.83-1_i386.deb libopenais2_0.83-1_i386.deb

  • Then build and install libvolume-id-dev:

[tux]# cd udev-0.125/
[tux]# dpkg-buildpackage -rfakeroot -b -uc
dpkg-checkbuilddeps: Unmet build dependencies: quilt (>= 0.40) libselinux1-dev (>= 1.28)

[tux]# apt-get install quilt libselinux1-dev
[tux]# dpkg-buildpackage -rfakeroot -b -uc
touch .stamp-build
fakeroot debian/rules binary
find: invalid expression; you have used a binary operator with nothing before it.

  • libvolume-id-dev cannot be compiled successfully. The problem is related to the find version shipped with etch. That’s why we will compile and install a newer one from the sources we previously downloaded:

[tux]# find --version
GNU find version 4.2.28

[tux]# cd ..
[tux]# cd findutils-4.4.0/
[tux]# dpkg-buildpackage -rfakeroot -b -uc
dpkg-checkbuilddeps: Unmet build dependencies: autotools-dev dejagnu

[tux]# apt-get install autotools-dev dejagnu
[tux]# dpkg-buildpackage -rfakeroot -b -uc
dpkg-deb: building package `findutils' in `../findutils_4.4.0-2_i386.deb'.
dpkg-deb: building package `locate' in `../locate_4.4.0-2_i386.deb'.

[tux]# cd ..
[tux]# dpkg -i findutils_4.4.0-2_i386.deb locate_4.4.0-2_i386.deb
[tux]# find --version
find (GNU findutils) 4.4.0

  • You can now successfully compile and install libvolume-id-dev:

[tux]# cd udev-0.125/
[tux]# dpkg-buildpackage -rfakeroot -b -uc
dpkg-deb: building package `udev' in `../udev_0.125-7_i386.deb'.
dpkg-deb: building package `libvolume-id0' in `../libvolume-id0_0.125-7_i386.deb'.
dpkg-deb: building package `libvolume-id-dev' in `../libvolume-id-dev_0.125-7_i386.deb'.
dpkg-deb: building package `udev-udeb' in `../udev-udeb_0.125-7_i386.udeb'.

[tux]# cd ..
[tux]# dpkg -i libvolume-id-dev_0.125-7_i386.deb libvolume-id0_0.125-7_i386.deb

  • After these steps, you can install linux-libc-dev and libvirt-dev from the backports repository:

[tux]# apt-get install linux-libc-dev libvirt-dev

  • Then proceed with the installation of the standard etch dependencies:

[tux]# apt-get install libxml2-dev libncurses5-dev libnss3-dev libnspr4-dev libslang2-dev psmisc

  • You should now be able to compile gfs2-tools:

[tux]# cd redhat-cluster-2.20080801/
[tux]# dpkg-buildpackage -rfakeroot -b -uc
upgrade.o: In function `upgrade_device_archive': /root/build/redhat-cluster-2.20081102/ccs/ccs_tool/upgrade.c:226: undefined reference to `mkostemp'

  • Oops, there is still a problem to solve. The function mkostemp was only introduced in glibc 2.7. We have to switch to the standard mkstemp function, which is available under etch (glibc 2.6). Edit ccs/ccs_tool/upgrade.c and use the available function:
ccs/ccs_tool/upgrade.c
@@ -223,7 +223,7 @@ static int upgrade_device_archive(char *location){
memset(tmp_file, 0, 128);
sprintf(tmp_file, "/tmp/ccs_tool_tmp_XXXXXX");

-  tmp_fd = mkostemp(tmp_file, O_RDWR | O_CREAT |O_TRUNC);
+  tmp_fd = mkstemp(tmp_file);
if(tmp_fd < 0){
  fprintf(stderr, "Unable to create temporary archive: %s\n",
  strerror(errno));
  error = -errno;
--
  • Finally we can successfully compile gfs2-tools:

[tux]# dpkg-buildpackage -rfakeroot -b -uc
[tux]# cd ..

  • In order to be able to use a GFS2 filesystem, you need the gfs2-tools package as well as a cluster manager and a lock manager.

[tux]# dpkg -i gfs2-tools_2.20081102-1_i386.deb libcman2_2.20081102-1_i386.deb libdlm2_2.20081102-1_i386.deb cman_2.20081102-1_i386.deb openais_0.83-1_i386.deb

cman depends on libnet-snmp-perl;
however: Package libnet-snmp-perl is not installed.
cman depends on libnet-telnet-perl;
however: Package libnet-telnet-perl is not installed.
cman depends on python-pexpect;
however: Package python-pexpect is not installed.
cman depends on sg3-utils;
however: Package sg3-utils is not installed.

  • As you can see, cman still depends on several packages that are not yet installed. If problems occur, fix broken dependencies to make it work correctly:

[tux]# apt-get install libnet-snmp-perl libnet-telnet-perl python-pexpect sg3-utils
[tux]# apt-get -f install

  • After fixing these dependencies, you can complete the installation process:

[tux]# dpkg -i gfs2-tools_2.20081102-1_i386.deb libcman2_2.20081102-1_i386.deb libdlm2_2.20081102-1_i386.deb cman_2.20081102-1_i386.deb openais_0.83-1_i386.deb

Configuration requirements

  • Create a cluster directory and copy provided configuration file:

[tux]# mkdir /etc/cluster
[tux]# cp cluster.conf /etc/cluster/
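
  • The cluster.conf file itself ships with the post’s download and is not reproduced here. As a reference only, a minimal two-node file matching this setup could look like the following sketch (the cluster name matches the mkfs step below and the cman_tool output, the node names are the hostnames used throughout; treat it as a reconstruction, not the provided file):

```xml
<?xml version="1.0"?>
<!-- Reconstructed sketch, not the file provided with the post -->
<cluster name="cluster" config_version="1">
  <!-- two_node="1" lets a two-node cluster keep quorum with a single vote -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="hostname1" nodeid="1">
      <fence>
        <method name="single">
          <device name="human" nodename="hostname1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="hostname2" nodeid="2">
      <fence>
        <method name="single">
          <device name="human" nodename="hostname2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- manual fencing, as discussed in the failure handling section -->
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
</cluster>
```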

  • Adjust the hostname: /etc/hostname must contain the same name as the corresponding clusternode entry in /etc/cluster/cluster.conf.
  • Every node has to be reachable by its hostname, so define every hostname in the /etc/hosts file of both nodes:

[tux]# vi /etc/hosts
127.0.0.1 localhost
192.168.9.xx hostname2.domain.org hostname2
192.168.9.xx hostname1.domain.org hostname1

  • As a reminder, we deploy the GFS2 filesystem on an architecture composed of two nodes, so you have to perform each step on both nodes. When this is done, each node has to be reachable by its hostname from the other one. To verify that this works correctly, check that pings succeed:

[hostname1]# ping hostname2
[hostname2]# ping hostname1

  • If this is the case, you can go further and start the cluster manager at the same time on each node. Such a daemon is required to use a GFS2 filesystem. A node will not be allowed to mount a GFS2 filesystem unless the node is running fenced. Fencing happens in the context of a cman/openais cluster. A node must be a cluster member before it can run fenced. That’s why we have to start the cluster manager first:
[hostname1]# /etc/init.d/cman start
Starting cluster manager:
 Loading kernel modules: done
 Mounting config filesystem: done
 Starting cluster configuration system: done
 Joining cluster: done
 Starting daemons: groupd fenced dlm_controld gfs_controld
 Joining fence domain: done
 Starting Quorum Disk daemon: done
[hostname2]# /etc/init.d/cman start
  • Check if the cluster manager is correctly started:
[hostname1]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M     32   2009-01-13 17:30:42  hostname1
   2   M     28   2009-01-13 17:30:42  hostname2
[hostname1]# cman_tool status
Version: 6.1.0
Config Version: 1
Cluster Name: cluster
Cluster Id: 13364
Cluster Member: Yes
Cluster Generation: 32
Membership state: Cluster-Member
Nodes: 3
Expected votes: 1
Total votes: 2
Node votes: 1
Quorum: 1
Active subsystems: 7
Flags: 2node Dirty
Ports Bound: 0
Node name: hostname1
Node ID: 1
Multicast addresses: 239.192.52.104
Node addresses: 192.168.9.xx
  • If cman spends ages “Waiting for fenced to join the fence group.” then ensure that it is also running on the other node, and more importantly, be aware that the clusternode name in cluster.conf must be in DNS (or /etc/hosts) and resolve to an IP reachable from the other node.

The next step is to install and configure DRBD8

  • Please refer to the following post if you need more information. You need to perform following steps on both nodes.
  • Install DRBD on both machines:

[tux]# apt-get install drbd8-source drbd8-utils
[tux]# module-assistant auto-install drbd8
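
  • The drbd.conf referred to in the next steps comes from the previous DRBD8 post. For reference, a minimal dual-primary resource for this setup might look like the sketch below; the hostnames, the 192.168.9.xx placeholders and /dev/sdb5 follow this article, while the port number and split-brain policies are assumptions:

```conf
global { usage-count no; }

resource r0 {
  protocol C;
  startup {
    # both nodes come up as primary
    become-primary-on both;
  }
  net {
    # mandatory for a dual-primary setup
    allow-two-primaries;
    # automatic split-brain recovery policies (example choices)
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  on hostname1 {
    device    /dev/drbd0;
    disk      /dev/sdb5;
    address   192.168.9.xx:7788;
    meta-disk internal;
  }
  on hostname2 {
    device    /dev/drbd0;
    disk      /dev/sdb5;
    address   192.168.9.xx:7788;
    meta-disk internal;
  }
}
```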

  • Prepare the DRBD disk: create a new logical partition with fdisk, /dev/sdb5 in our case.
  • Adjust drbd.conf according to your system, that is to say specify your local partition as well as the correct hostnames and IP addresses, then copy it into the /etc directory.
  • Create DRBD resource. The DRBD partition must have the same size on both nodes to work correctly.

[tux]# drbdadm create-md r0

  • If an error occurs, either shrink the existing filesystem according to the “current configuration leaves usable …” size information returned by the previous command, or erase the current content of /dev/sdb5. To do so, perform one of the following commands on each node:

[tux]# e2fsck -f /dev/sdb5 && resize2fs /dev/sdb5 current_conf_sizeK

or

[tux]# dd if=/dev/zero of=/dev/sdb5

  • And then you can successfully create your resource on both machines:

[tux]# drbdadm create-md r0

  • When it is done on both nodes, you can start DRBD, then force one node into the primary state:

[hostname1]# /etc/init.d/drbd start
[hostname2]# /etc/init.d/drbd start
[hostname1]# drbdsetup /dev/drbd0 primary -o

  • Wait for the synchronization to finish. Then you can promote the secondary node to the primary state in order to get two primaries:

[hostname2]# drbdadm primary r0

  • At the end the two nodes should be in the following state:
[hostname1]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.0.14 (api:86/proto:86)
GIT-hash: bb447522fc9a87d0069b7e14f0234911ebdab0f7 build by phil@fat-tyre,
2008-11-12 16:40:33
m:res  cs         st               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C

Couple DRBD8 and GFS2

  • Create the GFS2 filesystem on the DRBD device using the dlm lock manager. Since the device is already clustered, you only have to make the filesystem on one node (thank you for your comment Peter ;-)):

[hostname1]# mkfs.gfs2 -t cluster:gfs1 -p lock_dlm -j 2 /dev/drbd0

  • Here is a brief description of the parameters used:
-t clustername:fsname
   Clustername must match that in cluster.conf; only members
   of this cluster are permitted  to use  this file system.
   Fsname is a unique file system name used to distinguish this
   GFS2 file system from others created (1 to 16 characters).
   Lock_nolock doesn't use this field.

-j Number
   The number of journals for gfs2_mkfs to create. You need at
   least one journal per machine that will mount the filesystem.
   If this option is not specified, one journal will be created.

-p LockProtoName
   LockProtoName is the name of the locking protocol  to  use.
   Acceptable locking  protocols are lock_dlm (for shared storage)
   or if you are using GFS2 as a local filesystem  (1  node  only),
   you can specify the lock_nolock protocol.  If this option is not
   specified, lock_dlm protocol will be assumed.

Test the shared nothing architecture

  • You can finally check if your system works correctly. First create a new directory:

[hostname1]# mkdir /synchronized
[hostname2]# mkdir /synchronized

  • Then mount the DRBD partition at this location. You should be able to access it read/write on both machines simultaneously.

[hostname1]# mount -t gfs2 /dev/drbd0 /synchronized
[hostname2]# mount -t gfs2 /dev/drbd0 /synchronized

  • If you followed every step carefully, your system should now be set up correctly and the DRBD status command should return the following state on both nodes. You can see that DRBD is mounted on /synchronized.
[hostname1]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.0.14 (api:86/proto:86)
GIT-hash: bb447522fc9a87d0069b7e14f0234911ebdab0f7 build by phil@fat-tyre,
2008-11-12 16:40:33
m:res  cs         st               ds                 p  mounted        fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C  /synchronized  gfs2
  • We can try to create files from both locations to see if everything works as it should.

[hostname1]# touch /synchronized/from_host1
[hostname2]# touch /synchronized/from_host2
[hostname1]# ls /synchronized
from_host1 from_host2
[hostname2]# ls /synchronized
from_host1 from_host2

  • That’s it, and… it works properly! I even created several thousand files in parallel on both nodes and it worked like a charm. No file disappeared or was altered.
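
The parallel test mentioned above can be reproduced with a small script. This is a hypothetical sketch, not the exact commands used for the post: SYNC_DIR, TAG and N are made-up knobs, and SYNC_DIR defaults to a local scratch directory so the script can be tried safely anywhere; on the cluster you would run it with SYNC_DIR=/synchronized on both nodes at once.

```shell
#!/bin/sh
# Hypothetical stress test: create N files carrying a per-node tag, then
# verify that every file is visible. Defaults are safe for a local dry run.
DIR="${SYNC_DIR:-/tmp/gfs2-stress}"   # set SYNC_DIR=/synchronized on the cluster
TAG="${TAG:-$(uname -n)}"             # distinguishes files created by each node
N="${N:-1000}"

mkdir -p "$DIR"
i=1
while [ "$i" -le "$N" ]; do
    touch "$DIR/${TAG}_$i"
    i=$((i + 1))
done

# count the files carrying our tag; on GFS2 the other node's files show up too
created=$(ls "$DIR" | grep -c "^${TAG}_")
echo "created $created files tagged ${TAG} in $DIR"
```

Run it simultaneously on both nodes; afterwards an `ls` of the shared directory on either node should list both tags, just like the touch test above.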

Automate GFS2 partition mounting at startup

  • Copy provided mount script in /etc/init.d directory:

[tux]# cp mountgfs2.sh /etc/init.d/

  • Then take care to call the script at the right time. The GFS2 partition needs to be mounted after DRBD starts (so that /dev/drbd0 exists) and unmounted before DRBD stops (to free the partition and let DRBD quit properly):
[tux]# update-rc.d mountgfs2.sh start 70 2 3 4 5 . stop 07 0 1 6 .
  Adding system startup for /etc/init.d/mountgfs2.sh ...
       /etc/rc0.d/K07mountgfs2.sh -> ../init.d/mountgfs2.sh
       /etc/rc1.d/K07mountgfs2.sh -> ../init.d/mountgfs2.sh
       /etc/rc6.d/K07mountgfs2.sh -> ../init.d/mountgfs2.sh
       /etc/rc2.d/S70mountgfs2.sh -> ../init.d/mountgfs2.sh
       /etc/rc3.d/S70mountgfs2.sh -> ../init.d/mountgfs2.sh
       /etc/rc4.d/S70mountgfs2.sh -> ../init.d/mountgfs2.sh
       /etc/rc5.d/S70mountgfs2.sh -> ../init.d/mountgfs2.sh
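
  • The mountgfs2.sh script itself is provided with the post. For readers without the download, a minimal equivalent init script could look like this sketch (a reconstruction under the assumptions of this setup, not the original file):

```shell
#!/bin/sh
# mountgfs2.sh -- mount/unmount the GFS2 filesystem living on the DRBD device.
# Reconstructed sketch; the script shipped with the post may differ.
DEVICE=/dev/drbd0
MOUNTPOINT=/synchronized

case "$1" in
  start)
    echo -n "Mounting GFS2 filesystem: "
    if mount -t gfs2 "$DEVICE" "$MOUNTPOINT"; then echo "done"; else echo "failed"; fi
    ;;
  stop)
    echo -n "Unmounting GFS2 filesystem: "
    if umount "$MOUNTPOINT"; then echo "done"; else echo "failed"; fi
    ;;
  restart)
    "$0" stop
    "$0" start
    ;;
  *)
    echo "Usage: $0 {start|stop|restart}"
    exit 1
    ;;
esac
```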

Failure handling

Let’s spend some time on the behavior in case of failure. In the current state, only manual fencing is configured in cluster.conf. Thus, if one node fails, access to the shared filesystem is frozen until we are sure the failed node is really dead.

  • If hostname2 fails for one reason or another, the following lines will appear in /var/log/syslog on hostname1:
Jan 13 22:04:10 hostname1 kernel: dlm: closing connection to node 2
Jan 13 22:04:10 hostname1 fenced[2543]: hostname2 not a cluster member after 0 sec post_fail_delay
Jan 13 22:04:10 hostname1 fenced[2543]: fencing node "hostname2"
Jan 13 22:04:10 hostname1 fenced[2543]: fence "hostname2" failed
  • During this time, access to the GFS2 filesystem is frozen. Manual fencing (executed on the functional machine) is needed to regain access to the shared partition:

[hostname1]# fence_ack_manual -n hostname2

  • Once this is done, repair the failed node, and reconnect it to the valid one only when you are sure it is healthy; this is necessary to avoid corruption!

Admittedly, you could try to automate this procedure by monitoring the logs and triggering the manual fencing only when necessary. It would allow you to keep accessing the GFS2 partition even if a node failed. In reality, this trick is strongly discouraged since there is no way to guarantee data integrity. The only way to deal with failures safely is to acquire a dedicated fence device that will STONITH the other one. I know, buying a new network device is constraining, but it is the only way to consider such a system for a production environment.
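
As an illustration of what a real fence device changes, only cluster.conf needs to be updated. The excerpt below assumes a hypothetical APC power switch driven by the fence_apc agent; the address, credentials and outlet numbers are placeholders, and the agent you actually use depends on the hardware you buy:

```xml
<!-- Hypothetical excerpt: replace fence_manual with a power switch -->
<clusternode name="hostname2" nodeid="2">
  <fence>
    <method name="single">
      <!-- power-cycle outlet 2 of the switch when hostname2 must be fenced -->
      <device name="apc1" port="2"/>
    </method>
  </fence>
</clusternode>
<fencedevices>
  <fencedevice name="apc1" agent="fence_apc"
               ipaddr="192.168.9.xx" login="apc" passwd="secret"/>
</fencedevices>
```

With such a device configured for both nodes, fenced powers the failed node off automatically and access to the GFS2 partition resumes without any fence_ack_manual intervention.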

Other Considerations

We set up a functional shared-nothing architecture on two nodes using GFS2 on top of DRBD8. If you plan to deploy such a system in a production environment, there are a few things to keep in mind. First of all, you should find a way to benchmark the system according to your needs. There was no loss or data corruption in my tests, but you could encounter performance issues: for example, if your synchronized partition is supposed to contain billions of files, access times could become too long due to the lock mechanism. Otherwise, the system has proved to be stable. I was even able to recover manually from failure states!

  1. Peter
    January 20th, 2009 at 18:53

    Excellent Howto! Just one small point – in the bit where you make the cluster file system:-

    Couple DRBD8 and GFS2

    * Create the GFS2 filesystem on the DRBD partition using dlm lock manager:

    [hostname1]# mkfs.gfs2 -t cluster:gfs1 -p lock_dlm -j 2 /dev/drbd0
    [hostname2]# mkfs.gfs2 -t cluster:gfs2 -p lock_dlm -j 2 /dev/drbd0

    It is not necessary to re-make the filesystem again on hostname2 since it is already clustered:)

  2. gcharriere
    January 20th, 2009 at 20:08

    @Peter
    My mistake…
    Thank you for your comment, you’re right.

  3. Hizar
    February 5th, 2009 at 01:23

    This is an excellently written howto; a lot of care has gone into this and it worked exactly as described.

    I hope you will publish something soon to demonstrate the best way to have failover between two nodes that users need to write to, such as a webserver hosting a CMS. I looked at your LVS howto (also very well written) and although this provides failover, I am not sure what would happen if primary node failed, services failed over to secondary node, user writes to secondary disk, and then primary node is available.. since there is no syncronisation (as far as i understand) then changes made would have been lost.. please correct me if I am mistaken.

    Many thanks for these great HowTo’s.

  4. February 5th, 2009 at 16:06

    @Hizar
    There is no failover at this level. You can access the filesystem equally from both nodes simultaneously, modifications from one or another are synchronized, thus the synchronized partition is in the same state regardless of the node to which you connect.

    Both nodes have the same role. Thus if a node fails, generally, you cannot access the clustered filesystem any more (even from the remaining valid node!). This mechanism avoids filesystem corruption. In this case, we have to manually kill the node that encountered the failure. When this is done, inform the valid node that the other one is really dead: enter the “fence_ack_manual” command from the remaining valid node as explained in the post for this purpose. After these steps you regain access to the filesystem (but only from the remaining valid node!). Here we play the role of a fence device! If you buy such a device, this is done automatically, and access to the synchronized partition is only frozen for a short amount of time, without manual intervention.

    Finally you want to know how to couple such a filesystem with your application to make it work in case of failure, right? There is no perfect solution, but here is what I suggest. As you read my post about keepalived and lvs you should know that if a service (webserver in your case) failed (or is stopped) on one node, it is removed from the ipvs table, thus every subsequent queries will only be forwarded to the valid webserver. In our case, to make it work properly, you should establish a mechanism that is able to detect that your filesystem failed. For example if following lines appeared in the syslog of failed node, then trigger a procedure to stop the webserver:

    Feb 5 15:41:50 hostname2 groupd[2550]: gfs daemon appears to be dead
    Feb 5 15:41:50 hostname2 groupd[2550]: fence daemon appears to be dead
    Feb 5 15:41:50 hostname2 dlm_controld[2556]: cluster is down, exiting
    Feb 5 15:41:50 hostname2 kernel: dlm: closing connection to node 2
    Feb 5 15:41:50 hostname2 kernel: dlm: closing connection to node 1
    Feb 5 15:41:50 hostname2 ccsd[2541]: Stopping ccsd, SIGTERM received.

    Thus every subsequent request will be handled by the valid node that can still access the filesystem (only if it received the fence ack!). After that you have to investigate to understand why a failure appeared, repair it and restart the node which will be synchronized with the valid one. No data loss!

    I hope it will help you. I know the answer is quite long, but I think your question deserved some clarifications.

  5. Hizar
    February 5th, 2009 at 22:47

    Thank you for the detailed answer Gaël. It is much appreciated. I will investigate your suggestion further.

  6. David
    February 14th, 2009 at 22:03

    Gaël,

    First of all thanks for your excellent howto!

    You write:

    “Admittedly, you could try to automate this procedure by monitoring the logs and triggering the manual fencing only when necessary. It would allow you to keep accessing the GFS2 partition even if a node failed. In reality, this trick is strongly discouraged since there is no way to guarantee data integrity. The only way to deal with failures safely is to acquire a dedicated fence device that will STONITH the other one.”

    My question:

    Let’s assume each of both cluster nodes has two ethernet interfaces. On every cluster node one such interface is reserved for communicating to the intranet while the other interface is directly connected to the other cluster node using an ethernet crossover cable. This link is fully dedicated to DRBD/GFS2-related traffic.

    If hostname2 failed (as you have described in your howto) hostname1’s syslog files report this error. Although you do not recommend this we could monitor this file/message on hostname1 and trigger a script which
    – shuts down the DRBD/GFS2-related ethernet interface to hostname2 (hostname1 and hostname2 are then totally split in terms of DRBD/GFS2)
    – executes “fence_ack_manual -n hostname2”

    Wouldn’t this be safe enough and allow for almost immediate access to the DRBD/GFS2 file system on hostname1?

  7. February 15th, 2009 at 18:27

    @David
    Theoretically you’re right. If a failure is detected by hostname1 on hostname2, simply shutdown corresponding dedicated interface and then execute the fence ack command. In my opinion it should be safe enough.

    However, if you refer to redhat cluster project FAQ (http://sources.redhat.com/cluster/faq.html#fence_manual), here is what they say about it:

    “Can’t I just use my own watchdog or manual fencing?
    No. Fencing is absolutely required in all production environments. That’s right. We do not support people using only watchdog timers anymore.
    Manual fencing is absolutely not supported in any production environment, ever, under any circumstances.”

    I also thought to use a second dedicated interface but, as it was not supported and my architecture was designed to go into production, I didn’t test this alternative in details.

    If your data is not extremely critical and if you take time simulating every type of failure without affecting data, I think this is the best and above all the cheapest alternative to a dedicated fencing device!

    Keep us posted with your results if you plan to go further in this direction, Thanks.

  8. David
    February 16th, 2009 at 14:02

    @gcharriere

    Gaël, thanks for your response! I’ll keep you posted once (and if!) I’ll implement something like this.

  9. re3e
    February 27th, 2009 at 20:20

    looks great , i’ve been spending quite some time at debugging my ocfs2/drbd8/hb/vmware cluster (personal project i want to demo at work) and was at a dead end since it freezes with nothing in syslog , hopefully i’ll be able to post back with a successfull working setup somewhere this week end

  10. Simon
    June 8th, 2009 at 11:28

    Hello everybody,

    First, thank you Gaël! That tutorial will solve perfectly my issues. Unfortunately, I got the “Waiting for fenced to join the fence group” message on both nodes and I cannot connect them to the cluster.
    The name resolution works well between them:
    – ping node1 on node2 ok
    – ping node2 on node1 ok

    Have you any clue? Thx a lot!

  11. Simon
    June 8th, 2009 at 12:52

    I forgot the cluster trace and status, it might be useful:

    node1:/etc/cluster# /etc/init.d/cman start
    Starting cluster manager:
    Loading kernel modules: done
    Mounting config filesystem: done
    Starting cluster configuration system: done
    Joining cluster: done
    Starting daemons: groupd fenced dlm_controld gfs_controld
    Joining fence domain:Waiting for fenced to join the fence group.
    Waiting for fenced to join the fence group.
    Waiting for fenced to join the fence group.
    [...]
    Waiting for fenced to join the fence group.
    Error joining the fence group.
    done
    Starting Quorum Disk daemon: done

    node1:/etc/cluster# cman_tool status
    Version: 6.1.0
    Config Version: 6
    Cluster Name: mycluster
    Cluster Id: 14341
    Cluster Member: Yes
    Cluster Generation: 4
    Membership state: Cluster-Member
    Nodes: 1
    Expected votes: 1
    Total votes: 1
    Node votes: 1
    Quorum: 1
    Active subsystems: 7
    Flags: 2node Dirty
    Ports Bound: 0
    Node name: node1
    Node ID: 1
    Multicast addresses: 239.192.56.61
    Node addresses: 127.0.1.1

    And I guess, the “Node addresses” is not right because node1 is set with 192.168.1.3 and node2 with 192.168.1.1

    Can you help me?

  12. Simon
    June 8th, 2009 at 14:26

    Shame on me, I had a double entry in my /etc/hosts… Sorry for that.
    Now, my Node Addresses are right but I still have those “Waiting for fenced…” messages O_o

  13. June 8th, 2009 at 17:01

    @Simon
    As you said, try to check the entries of both /etc/hosts file.
    Ensure they look like this:

    hostname1:~# cat /etc/hosts
    127.0.0.1 localhost
    192.168.9.xx hostname1
    192.168.9.xx hostname2

    This should be sufficient to make it work correctly. Then the “Waiting for fenced…” message should disappear as soon as the two machines see each other when they try to join the fence domain simultaneously.

    If this does not solve your problem, it would be better if you could attach your syslog messages to provide a more accurate diagnostic.

  14. Simon
    June 9th, 2009 at 08:02

    Ok, I am not sure why but it work this morning. I changed my hosts file yesterday and it still didn’t work. So, I looked around multicast settings: I added a rout for all the multicast addresses and I changed the default one () but it did not work after that.
    And this morning when I turned on my servers… it works! I can easily guess that I had to restart a daemon but which one? I will check it out.
    Anyway, thank you for your help and that great tutorial!
    Merci !

  15. David Calvache
    August 20th, 2009 at 01:12

    GlusterFS will meet your target, or either GFS2 with GNBD without DRBD

    Have you take a lok on these?

  16. shankar
    April 7th, 2010 at 09:32

    hi guys,

    I have a similar settings that done by gcharriere, but on my site im using cross cable between 2 nodes for my clustering n drbd purpose. my scenario as follow’s

    Scenario 1

    Node1 and Node2 been cluster n fence nicely.Let’s say by accidentally some one unplug the crosscable.What i have done is i run fence_ack_manual -n node2 in node1.
    my node1 back to normal and i can touch my files in mounted drive. Now how i can make node2 to be join back the cluster n fence it back n sync files from node1 after my cross cable been plug back in.

    For your info my mounted drive in node2 will be in freeze state.i try to execute reboot in my node2 but the system hang at unmounting gfs file system.and will hang forever till i have to do hard reboot to make everything back to normal. is it any better way of doing this. i have found this command to force reboot but im not sure is it safe or not.

    echo 1 > /proc/sys/kernel/sysrq
    echo b > /proc/sysrq-trigger

    (http://wiredgorilla.com.au/2009/04/linux-server-tip-force-rebootshutdown/)

    please advice

    thanks,
    shan

  17. December 7th, 2010 at 13:31

    @David Calvache
    Just took a look at Gluster website, looks promising. Do not hesitate to share your experience…

  18. December 7th, 2010 at 13:49

    @shankar
    I haven’t tried to force the reboot like this. Anyway, manual fencing is not safe at all. Thus I can only advise you to do as many tests as you can, keeping in mind that what you do will by definition never be safe.

  19. atxbibip4
    September 14th, 2011 at 17:00

    Good article. For people who want to know about GFS2 vs OCFS2 vs NFS. I tested this three solutions and in conclusion I can say that GFS2 is very slow (for my application), OCFS2 rocks BUT can froze on load! (don’t use in production if you have load), NFS same performance than OCFS2 but more secure, no froze, no lag. But to implement NFS with DRBD and heartbeat (not finished) ask some time to spend. Tested on Lenny on VM (each VM has its core, ram …).
    I tested with jmeter with multiple clients on the same scenario. So, go to the (old) NFS for me.

  20. HEWI
    October 4th, 2011 at 20:45

    Realy nice !!
    Got everything working, and did some tests…
    Pulled the harddrive in and out, and drbd syncroniced nicely.
    Colud make new files in /syncroniced all the time.
    But…
    When i shut down node 2, and rebooted node one, i came st a Full stop ;o(
    after running the “fence_ack_manual -n node2” i still cant access /synconiced.
    /etc/init.d/drbd status os in Uptodate/Missing status (witch figures ;o)
    If i try to mount locally doing “mount -t gfs2 /dev/drbd0 /synchronized”
    i get:
    mount -t gfs2 /dev/drbd0 /synchronized
    /sbin/mount.gfs2: node not a member of the default fence domain
    /sbin/mount.gfs2: error mounting lockproto lock_dlm

    Have i missed anything ???
    P.S: as soon as i power up node2 it works fine again, but thats not realy failover ;o)

  21. HEWI
    October 5th, 2011 at 11:52

    @HEWI
    Found the error ;o)
    fence_ack_manual -n missing-node-hostname

  22. Brent
    March 15th, 2012 at 15:29

    Hiya

    Any chance of updating this howto for Debian Squeeze?

    Thanks
    Brent

  23. March 16th, 2012 at 14:34

    @Brent
    Unfortunately I doubt, at least from my side…


2 trackbacks

  1. DRBD8 with two primaries on debian etch | HOWTO's and Tutorials Pingback | 2009/01/12
  2. Installation de GFS pour utilisation avec volume DRBD sous Centos 5.5 ou redhat Pingback | 2011/03/12