Failover and load balancing using Keepalived (LVS) on two machines

In this scenario, we have two machines and try to make the most of the available resources. Each node plays the role of a realserver: it provides a service such as a web or a mail server. At the same time, one of the machines loadbalances the requests to itself and to its neighbor. The node responsible for the loadbalancing owns the VIP, and every client connects to it transparently through the VIP. The other node is able to take over the VIP if it detects that the current master has failed, but in the nominal case it only processes requests forwarded by the loadbalancer.

Throughout this post the following IP addresses are used. Do not forget to modify them according to your network settings:

  • hostname1 ip address: 192.168.9.10
  • hostname2 ip address: 192.168.9.20
  • virtual ip address: 192.168.9.100

Configuration Files

Install Keepalived and set up configuration files

  • Install Keepalived on both machines:

[tux]# apt-get install keepalived
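
  • The ipvsadm tool, used further down to inspect the IPVS table, ships as a separate package on Debian-based systems; install it as well if it is missing:

[tux]# apt-get install ipvsadm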

  • Copy provided Keepalived configuration files (master and slave) into /etc/keepalived/ directory:

[hostname1]# cp keepalived_master.conf /etc/keepalived/keepalived.conf
[hostname2]# cp keepalived_slave.conf /etc/keepalived/keepalived.conf

  • Copy the provided bypass_ipvs.sh script, which will be called during master/slave transitions, to both machines:

[tux]# cp bypass_ipvs.sh /etc/keepalived/

Install and Configure services (mail and web server in our case) on both machines

  • For our test purposes, the realservers provide a mail and a web server each. First install them:

[tux]# apt-get install postfix apache2

  • Configure Postfix so that each node can connect to the mail server of its neighbor. During the installation phase, select local only, then comment out the following line in /etc/postfix/main.cf to make sure the mail server does not listen only on the local interface:

# inet_interfaces = loopback-only
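
  • After editing main.cf, reload Postfix and check the effective value of the parameter; with the line commented out, Postfix falls back to its default of listening on all interfaces, so postconf should report all:

[tux]# /etc/init.d/postfix reload
[tux]# postconf inet_interfaces
inet_interfaces = all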

  • Then try to connect to the mail server of your neighbor to be sure it works correctly:

[hostname1]# telnet hostname2 25
Connected to hostname2.
Escape character is '^]'.
220 hostname2 ESMTP Postfix (Debian/GNU)

[hostname2]# telnet hostname1 25
Connected to hostname1.
Escape character is '^]'.
220 hostname1 ESMTP Postfix (Debian/GNU)

  • Generate the digest string used to check the web server by running genhash on one accessible web page. In our case we compute the digest for /apache2-default/index.html, which is the default page for apache2:

[hostname1]# genhash -s hostname1 -p 80 -u /apache2-default/index.html
MD5SUM = c7b4690c8c46625ef0f328cd7a24a0a3

[hostname1]# genhash -s hostname2 -p 80 -u /apache2-default/index.html
MD5SUM = c7b4690c8c46625ef0f328cd7a24a0a3

  • Keepalived will check whether the server is up using this digest value. That is why you have to copy it into the Keepalived configuration, specifically into the realserver sections dedicated to the web server:
HTTP_GET {
  url {
    path /apache2-default/index.html
    digest c7b4690c8c46625ef0f328cd7a24a0a3
  }
  connect_timeout 3
  nb_get_retry 3
  delay_before_retry 2
}
  • At this point, we have a functional mail and web server on each node.

Configure the VIP (Virtual IP address)

  • This IP will provide access to the realservers. It is configured entirely from the Keepalived configuration and does not require any other modification. Only one of the two nodes owns the VIP at a given time, so the two nodes have slightly different configurations. In our case, hostname1 is set up as the master and hostname2 as the slave, and the VIP is 192.168.9.100:
  • On the master:
# describe virtual service ip
vrrp_instance VI_1 {
  # initial state
  state MASTER
  interface eth0
  # arbitrary unique number 0..255
  # used to differentiate multiple instances of vrrpd
  virtual_router_id 1
  # for electing MASTER, highest priority wins.
  # to be MASTER, make 50 more than other machines.
  priority 100
  authentication {
    auth_type PASS
    auth_pass xxx
  }
  virtual_ipaddress {
    192.168.9.100/24
  }
}
  • On the slave:
# describe virtual service ip
vrrp_instance VI_1 {
  # initial state
  state BACKUP
  interface eth0
  # arbitrary unique number 0..255
  # used to differentiate multiple instances of vrrpd
  virtual_router_id 1
  # for electing MASTER, highest priority wins.
  # to be MASTER, make 50 more than other machines.
  priority 50
  authentication {
    auth_type PASS
    auth_pass xxx
  }
  virtual_ipaddress {
    192.168.9.100/24
  }
}
  • Then we can start or reload Keepalived and check that the master really owns the VIP:
[hostname1]# /etc/init.d/keepalived start
[hostname2]# /etc/init.d/keepalived start
[hostname1]# ip addr list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:e3:e2:40 brd ff:ff:ff:ff:ff:ff
    inet 192.168.9.10/24 brd 192.168.9.255 scope global eth0
    inet 192.168.9.100/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fee3:e240/64 scope link
       valid_lft forever preferred_lft forever

Configure loadbalancing

  • The loadbalancing is also configured through Keepalived. At a given time, only one machine owns the VIP, and the requests are forwarded to the realservers according to the chosen rules. Services are accessed through the VIP and are processed indifferently by one machine or the other. In /etc/keepalived/keepalived.conf the realservers are defined like this:
# describe virtual mail server
virtual_server 192.168.9.100 25 {
  delay_loop 15
  lb_algo rr
  lb_kind DR
  persistence_timeout 50
  protocol TCP

  real_server 192.168.9.10 25 {
    TCP_CHECK {
      connect_timeout 3
    }
  }
  real_server 192.168.9.20 25 {
    TCP_CHECK {
      connect_timeout 3
    }
  }
}
  • This example shows how requests intended for the mail server are handled; the web virtual server is built the same way, with the HTTP_GET check from earlier (see the sketch below). The requests are loadbalanced between the realservers according to the round robin (rr) algorithm, and Direct Routing (DR) mode is preferred: as soon as a realserver is selected to process a request, it answers the client directly without going back through the loadbalancer. Thus a single loadbalancer can process a huge number of requests without becoming the bottleneck of the system, since forwarding a request only requires a small amount of resources.
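  • For reference, a sketch of the corresponding web virtual server, reusing the HTTP_GET check and the digest computed earlier (the provided keepalived_master.conf/keepalived_slave.conf may differ slightly in the exact values):

# describe virtual web server
virtual_server 192.168.9.100 80 {
  delay_loop 15
  lb_algo rr
  lb_kind DR
  persistence_timeout 50
  protocol TCP

  real_server 192.168.9.10 80 {
    HTTP_GET {
      url {
        path /apache2-default/index.html
        digest c7b4690c8c46625ef0f328cd7a24a0a3
      }
      connect_timeout 3
      nb_get_retry 3
      delay_before_retry 2
    }
  }
  real_server 192.168.9.20 80 {
    HTTP_GET {
      url {
        path /apache2-default/index.html
        digest c7b4690c8c46625ef0f328cd7a24a0a3
      }
      connect_timeout 3
      nb_get_retry 3
      delay_before_retry 2
    }
  }
}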
  • Then enable ip_forward permanently on both machines. In /etc/sysctl.conf:

net.ipv4.ip_forward = 1

  • You can load this option and check it is set up correctly with the following commands:

[tux]# sysctl -p
net.ipv4.ip_forward = 1

[tux]# sysctl -a | grep net.ipv4.ip_forward
net.ipv4.ip_forward = 1

  • We now have a mail and a web server at our disposal. Ensure that the loadbalancer is configured correctly:
[hostname1]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.9.100:25 rr persistent 50
  -> 192.168.9.10:25             Local   1      0          0
  -> 192.168.9.20:25             Route   1      0          0
TCP  192.168.9.100:80 rr persistent 50
  -> 192.168.9.10:80             Local   1      0          0
  -> 192.168.9.20:80             Route   1      0          0
  • Requests intended for the VIP on port 25 or 80 are distributed to 192.168.9.10 and 192.168.9.20. Then we try to connect to the mail server through the VIP from another machine, that is to say neither hostname1 nor hostname2:

[tux]# telnet 192.168.9.100 25
Trying 192.168.9.100...

  • Nothing happens... And that is completely normal: it happens every time the loadbalancer assigns the request to the node that does not currently own the VIP, since that node is not supposed to handle the request. The traditional way to sort out this issue is to configure the VIP on the other node, for example on the loopback interface, so that it accepts packets with the VIP as destination address, and then to configure the network interfaces to ignore some ARP requests by playing with the arp_ignore and arp_announce options (sketched below). This is sufficient in a classical scenario with dedicated machines for the load distribution, but not in our case!
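  • For reference only, on a dedicated realserver (one that does not itself run IPVS) that traditional setup would look roughly like this; it is a sketch of the classical approach, and it is precisely what we will not use here:

[realserver]# ip addr add 192.168.9.100/32 dev lo
[realserver]# sysctl -w net.ipv4.conf.all.arp_ignore=1
[realserver]# sysctl -w net.ipv4.conf.all.arp_announce=2
[realserver]# sysctl -w net.ipv4.conf.eth0.arp_ignore=1
[realserver]# sysctl -w net.ipv4.conf.eth0.arp_announce=2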
  • In our architecture, both the loadbalancer and the realserver are located on the same machine. If you simply add the VIP on the secondary machine, there will be cases where packets are processed by the loadbalancer of both machines, since IPVS is not deactivated on the slave. If each loadbalancer then selects its neighbor to process the request, we face a ping-pong effect: an infinite loop between the two nodes, and the request is never handled at all!
  • Fortunately, there is a trick to handle every request efficiently. We use the Keepalived mechanism that calls predefined scripts on master/slave transitions, configured in /etc/keepalived/keepalived.conf (these hooks sit inside the vrrp_instance block, as sketched after the snippet below):

# Invoked on transition to master
notify_master "/etc/keepalived/bypass_ipvs.sh del 192.168.9.100"
# Invoked on transition to backup
notify_backup "/etc/keepalived/bypass_ipvs.sh add 192.168.9.100"
# Invoked on transition to fault
notify_fault "/etc/keepalived/bypass_ipvs.sh add 192.168.9.100"
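
  • These notify hooks are vrrp_instance options, so they sit inside the instance block shown earlier; a minimal sketch for the slave:

vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 1
  priority 50
  authentication {
    auth_type PASS
    auth_pass xxx
  }
  virtual_ipaddress {
    192.168.9.100/24
  }
  notify_master "/etc/keepalived/bypass_ipvs.sh del 192.168.9.100"
  notify_backup "/etc/keepalived/bypass_ipvs.sh add 192.168.9.100"
  notify_fault  "/etc/keepalived/bypass_ipvs.sh add 192.168.9.100"
}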

  • The bypass_ipvs.sh script adds a NAT rule when the host becomes slave and removes it when it goes back to the master state, so that requests intended for the VIP are processed correctly even when they reach the slave (a sketch of its core logic follows the rule below). The PREROUTING rule is essential for the slave to redirect incoming service packets to itself; otherwise a loop can appear between master and slave. The routing table is consulted when a packet that creates a new connection is encountered, and a PREROUTING rule alters packets as soon as they come in. The REDIRECT target redirects the packet to the machine itself by changing the destination IP to the primary address of the incoming interface (locally generated packets are mapped to the 127.0.0.1 address). Thus packets forwarded by the active loadbalancer are not loadbalanced a second time.

iptables -A PREROUTING -t nat -d 192.168.9.100 -p tcp -j REDIRECT
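
  • The provided bypass_ipvs.sh does more than this (argument validation in particular, and an updated version is discussed in the comments below), but a minimal sketch of its core add/del logic, matching the rule above and the notify_* calls, could look like:

#!/bin/bash
# bypass_ipvs.sh {add|del} <VIP> (minimal sketch)
VIP=$2
case "$1" in
  add)
    # add the REDIRECT rule only if it is not already present
    if ! iptables -t nat -n -L PREROUTING | grep -q "$VIP"; then
      iptables -A PREROUTING -t nat -d "$VIP" -p tcp -j REDIRECT
    fi
    ;;
  del)
    # remove every occurrence of the rule
    while iptables -t nat -n -L PREROUTING | grep -q "$VIP"; do
      iptables -D PREROUTING -t nat -d "$VIP" -p tcp -j REDIRECT
    done
    ;;
  *)
    echo "Usage: $0 {add|del} ipaddress"
    exit 1
    ;;
esac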

  • Check rule on the slave:
[hostname2]# iptables -t nat --list
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
REDIRECT   tcp  --  anywhere             192.168.9.100       

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Failover

  • Stop Keepalived on the master:

[hostname1]# /etc/init.d/keepalived stop

  • Ensure that the new master owns the VIP:
[hostname2]# ip addr list dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:ab:e7:dd brd ff:ff:ff:ff:ff:ff
    inet 192.168.9.20/24 brd 192.168.9.255 scope global eth0
    inet 192.168.9.100/24 scope global secondary eth0
    inet6 fe80::20c:29ff:feab:e7dd/64 scope link
       valid_lft forever preferred_lft forever
  • Check that the NAT rule has disappeared:
[hostname2]# iptables -t nat --list
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination      

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
  • If the original master comes back up, the architecture adjusts automatically and keeps on processing incoming requests; the transitions can be followed in the system logs as shown below.
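
  • The master/backup transitions appear in the system log on both machines (the exact file depends on your syslog configuration, typically /var/log/syslog or /var/log/messages on Debian); you should see lines such as "VRRP_Instance(VI_1) Entering MASTER STATE":

[tux]# grep Keepalived /var/log/syslog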

Service Failure Handling

  • If a service fails, it no longer responds correctly to the basic Keepalived checks and is automatically removed from the IPVS table:
[hostname1]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.9.100:25 rr persistent 50
  -> 192.168.9.10:25             Local   1      0          0
  -> 192.168.9.20:25             Route   1      0          0
TCP  192.168.9.100:80 rr persistent 50
  -> 192.168.9.10:80             Local   1      0          0
  -> 192.168.9.20:80             Route   1      0          0

[hostname1]# /etc/init.d/postfix stop
Stopping Postfix Mail Transport Agent: postfix.

[hostname1]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.9.100:25 rr persistent 50
  -> 192.168.9.20:25             Route   1      0          0
TCP  192.168.9.100:80 rr persistent 50
  -> 192.168.9.10:80             Local   1      0          0
  -> 192.168.9.20:80             Route   1      0          0
  • New requests are no longer forwarded to the failed service.

Other Considerations

Thanks to the NAT rule, we successfully set up loadbalancing and automatic failover on a two-node configuration. With such an architecture we take full advantage of the available resources: one of the nodes plays the role of loadbalancer and realserver, while the other can take over the loadbalancer role as soon as it detects that its neighbor has failed. The slave does not only monitor the master; it also handles the requests it receives from the loadbalancer.

Keep in mind that the request distribution is not strictly made on a connection basis. A client connects to one of the realservers through the VIP; once this is done, and as long as that realserver stays available, further requests from the same client are forwarded to the same realserver. In a classical scenario many clients connect to the VIP, so the global amount of requests is still distributed evenly between the two nodes.
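
This stickiness comes from the persistence_timeout 50 line in the virtual_server blocks (a reader confirms this in the comments below). If you test from a single client and want to observe the round robin distribution at the connection level, comment that line out; the current client-to-realserver mappings can be inspected with the IPVS connection table, for example:

[hostname1]# ipvsadm -L -n -c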

To go further

This section shows you in detail the interactions between the different components of my system. It should be very useful to understand how things are supposed to work when everything goes well. The following figures were built from the traces I captured with Wireshark, so this is not how it is supposed to behave according to some documentation, but how it really works under my configuration. The configuration is the same as before, except that I added a client machine:

  • hostname1 ip address: 192.168.9.10
  • hostname2 ip address: 192.168.9.20
  • virtual ip address: 192.168.9.100
  • client ip address: 192.168.9.15

Here are the commands I entered:

client~$ telnet 192.168.9.100 25
Trying 192.168.9.100...
Connected to 192.168.9.100.
Escape character is '^]'.
220 hostname2 ESMTP Postfix (Debian/GNU)
HELO hostname2
250 hostname2
QUIT
221 2.0.0 Bye
Connection closed by foreign host.

As expected, it works like a charm. I access the hostname2 mail server through the VIP address, currently owned by hostname1. In this case, as you can see, the IPVS mechanism decided to forward the request to hostname2. It could also have decided to process it on its own, since the decision is taken on a round robin basis, but this is not what I wanted to show. These are the interactions I reconstructed from the traces I captured on the client, hostname1 and hostname2.

As you can see, the client always speaks to the VIP address and thus sends its requests to hostname1 in my case. Thanks to keepalived, hostname1 forwards the request to hostname2. The important point to notice is that hostname2 responds directly to the client without going back through the VIP machine. So the client does not know that the actual responder is hostname2, since the packet it receives has the VIP address as source address. The key point to make it work is to ensure that hostname2 can accept and process packets with the VIP address as destination address. By default, this is not the case. Here it works because of my PREROUTING rule. Another way would be to add the VIP address as a second address on hostname2. In my configuration only the first option can work, since the two machines run IPVS: if I set up the VIP address on each machine, infinite loops can appear between hostname1 and hostname2 if each one decides to forward the request to the other.

You have seen the traces when it works as expected. But what if, for example, hostname2 is not configured properly to accept requests with the VIP as destination address? For test purposes, I manually removed my PREROUTING rule on hostname2 to see what happens.

First of all, you notice that the client does not receive any response. As before, the client sends its first request to the VIP. The VIP owner does its job correctly: it forwards the request to hostname2. But here is the problem: hostname2 receives the SYN request with the VIP as destination address. There is no reason for it to process such a request, so it simply drops the packet. The same SYN request keeps on being sent to the VIP until the connection attempt times out. That is why you should take care to correctly configure every machine that is supposed to respond to requests with the VIP as destination address.
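
If you want to reproduce similar traces without Wireshark, tcpdump on each machine is enough; the -e option also prints the MAC addresses, which is what actually changes with Direct Routing:

[tux]# tcpdump -n -e -i eth0 tcp port 25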

  1. June 17th, 2009 at 13:37

    I just added the last section “To go further” so that you can clearly understand the interactions between the different components when it works and also when there is a problem if the server is not well configured to accept packets with VIP destination address.

  2. Kenn
    June 17th, 2009 at 13:58

    I’ve been closely following this blog although my setup is a little different.

    I happen to have the real servers as a windows machine. The flow process is nothing like what I see. This is regardless if adding a second IP (in this case I added a second IP of the VIP) and for the default IP of the machine I added the gateway to be the VIP. No such luck.

    The windows machine recognises the VIP so no problem there. Doing a ping on the VIP gives good echo responses back.

    Maybe the real question I should be asking is: is Keepalived purely Linux based (meaning all real servers have to be on Linux)? If this is the case then I find it a little strange, as I would have thought connecting to the real servers is just network protocols. In some ways it must be supported on Windows because the health checks work perfectly. Keepalived determines when there is something wrong with one or more windows servers.

    In fact, it should be possible to have for example: A windows mail server and a linux mail server. Thus load balancing should load balance between the two (even though they are different). Though this is odd – but I wanted to illustrate that I don’t believe such an architecture should be limited as the joining connections are just network protocols.

    I’ve asked a lot of questions here. But it would be great to have answers/solutions to this. I’ve tackled this issue for days and there is no information of this on google (or am I googling wrong)???

    Any ideas? Thanks,
    Kenn

  3. June 17th, 2009 at 14:45

    @Kenn
    There should be no difference whether your real server is linux or windows based. As I said, first ensure that your loadbalancer forwards the request to your real server and then check that the real server receives it.
    Then you should quickly notice if your real server process or drops the request. If it responds to the request using VIP source address to client destination address, that means your real server is correctly configured to process VIP destination address. If it does not react after receiving the client request, that means it is not correctly configured to accept packets with VIP destination address…

  4. Kenn
    June 18th, 2009 at 09:15

    Hi Gael,

    Thanks for all your help so far. I made two fundamentally fatal errors. I list them here so hopefully some else can learn from them.

    Firstly, I am only using one NIC. I have spare NICs around and am debating whether adding a second NIC is necessary (keepalived seems to keep referring to eth0 and eth1 devices, and I am not sure why this is important given that everything is on the same network).

    1) Do not make the mistake of forwarding from virtual server to real server on different ports. Maybe it should work but for me it does not. I had forwarded VIP:1234 to Real_Server_IP:8080 and this caused grief.

    2) Follow the instructions as exactly as you provided earlier. That is that on the real server, you need to set a second IP for the windows machine which is the same as the VIP. Confirm this by running ipconfig in the command shell prompt. You should see the usual static IP of the box and the VIP. I guess without the VIP on the box the Windows OS will not process the network packets. So I presume that keepalived forwarded the packet to the real server, but did not change the internal addressing inside which probably still had the VIP. As a result of this it can not process it unless the machine had a matching IP.

    3) Software firewalls do not seem to even detect anything amiss. I thought possibly the software firewall on windows machine might block the traffic. But nothing.

    4) You do not need to change the gateway on the windows server machines to point to anything else. Leave it at the master router for your network. I thought perhaps it was necessary to forward to the Linux box as isn’t the Linux box acting as a router? In some ways I prefer it for outgoing traffic to go straight to the master router and out to the internet rather than going through Linux keepalived and then exiting out via main router.

    I had only did a simple test and now will proceed to add the fail-over scheme as per your tutorial above. Since my real servers are not sharing the same server as the keepalived system I did not need any pre-routing rules added.

    Will let you know how things progress.

    Cheers,
    Kenn

  5. June 18th, 2009 at 10:38

    @Kenn
    Concerning the NIC, keepalived refers to the interface you provide in its configuration. If your inside network is on eth0 then specify eth0, that’s it. No need to add a second interface.
    1) I tried it too and indeed it does not work as expected. Although the port 1234 was supposed to be forwarded according to the result of ipvsadm command, only the real server port was forwarded. So keep on using the same ports.
    2) I took time to provide quite understandable figures and explanations. So this is not “I presume that keepalived forwarded the packet to the real server, but did not change the internal addressing inside which probably still had the VIP”, but for sure keepalived do not change the internal addressing! The packet forwarded to your real server has VIP destination address. Thus as I said many times, you have to configure your server to accept such packets, that is to say adding the VIP address or using my PREROUTING rule. The fact that your server is based on Windows does not change anything to this behaviour.
    4) Why would you change the gateway on your windows server machines? Packets are forwarded from your loadbalancer to your windows server and then your server deals directly with your client going through your router if your client is on the internet. No need to change the gateway.

  6. Kenn
    June 18th, 2009 at 14:10

    With regards to 2) I had to check the source of the server listening to accept connections on 8080 and bind. It is currently set to accept htonl(INADDR_ANY) and port number 8080. This means it will accept any destination IP as long as it is on port 8080 it will bind and accept connection.

    You’ve given a wealth of valuable information here (enough for me to go and study, learn and investigate). I’ll keep this posted when progress has been made.

  7. Kenn
    June 19th, 2009 at 21:32

    ALL WORKING!

    Firstly, I want to say: thanks Gael, without this blog, your tutorial this would have been an impossible task for me to get Linux and Windows working all together. Additionally, it is only going through all this I can understand the significance of your article especially about needing the special pre-routing rules for the case when the real server shares the master/backup boxes.

    I will post my findings below (Gael please feel free to correct anything):

    1) If you do not fully understand the article above, read it and study it well. Everything you need to know is here. Key points to understand are DR (we are NOT doing NAT – if unsure about the difference then google).

    2) The above (and my setup) is for when everything is on the same network (i.e do not worry about changing the gateway it is unnecessary). I fell into the trap as many configurations involve two NICs (i.e eth0 and eth1). Since only one is used this isn’t an issue. Gateway never will change.

    3) Always use the same port number (even for testing). I did something silly like having VIP and 1234 mapped to RIP and 8080. Which keepalived does not map it for some reason.

    4) In my setup the Real Servers were Windows server machines. So how to add the VIP? Whatever you do, don't add a second IP to the main NIC card. Yes, it will accept the VIP, but it then appears that there are two VIPs on the network – one on Linux and one on Windows. After some time Windows reports that there is a conflicting IP on the network (duh – there are two). But before that – if you're testing your client => Linux => Server it works (but not really – it is just a direct map between client => server bypassing all of the Linux part). You add the VIP onto windows machines as follows:
    a) Install a Loopback connection on Windows machine. Google it for steps
    b) On the loopback ensure you put the VIP and add mask 255.255.255.255 to stop responses to the ARP when broadcast is sent out asking who owns VIP and tell x.y.z.a (your server is supposed to process packets sent to the machine having VIP as destination but not actually own VIP). In windows you will need to set 255.255.255.0 first (it does not allow you to use 255.255.255.255 as mask, then open registry, on local machine, find the services, tcpip interfaces and find the correct entry mapping to the VIP and change the mask to 255.255.255.255 and reboot machine). ipconfig /all should show correct details from here.
    c) Run wireshark sniffer or something. You must reproduce Gael's interaction between client, Linux and server. If you do not see it then something is wrong. Important point – pay attention to the mac addresses NOT the IPs (the IPs should be pointing to the right server in the flow process; however, since you are using DR, you need to further verify with the mac addresses of the packets).

    5) That is it! Do some testing, shut down several servers and see packets being routed around to active servers.

  8. Hasse
    July 31st, 2009 at 10:19

    The problem with not using the same port for the VIP service vs. the RIP service is because you use "lb_kind DR", that is the LVS type of direct routing. It changes destination mac addresses and nothing else. It won't change the incoming port to the realserver's port, so they need to be the same.

  9. aaa support
    September 13th, 2009 at 08:39

    couple questions:

    question 1:

    when you say this:

    "In our architecture, both the loadbalancer and the realserver are located on the same machine. If you simply add the VIP on the secondary machine, there will be cases where packets will be processed by the loadbalancer of the two machines since it is not deactivated on the slave."

    Should the vip still be added as a loopback ip on both servers?

    question 2:

    you stated:

    “Furthermore if each loadbalancer selects its neighbor to process the request, we will face a ping pong effect. In such cases there will be an infinite loop between the two nodes. Thus the request is not handled at all!”

    can you explain this effect in more details?

    also, when you say “its neighbour to process the request”, do you mean primary loadbalancer will have slave load balancer handle request

    please explain this looping in more details,

    e.g.

    please explain more about what will happen if I dont add the prerouting rule on the slave (even with the vips added as loopback ips within both servers)

    thanks,

  10. gcharriere
    September 13th, 2009 at 17:08

    @aaa support
    question 1: in my architecture, do not add the VIP on both servers. The PREROUTING rule is sufficient to make it work.

    question 2: With VIP on both nodes, there can be cases where each node forwards the request to the other node without processing it. The PREROUTING rule prevents this behavior since in this case, the slave can not loadbalance the request, it can only process it.

    Both machines have a loadbalancer, but only the one with the VIP loadbalances requests at a time “t”.

    Without the PREROUTING rule, the slave will not be able to accept a request intended for the VIP.

  11. aaasupport
    September 26th, 2009 at 09:12

    if I have 2 loadbalancers , and both loadbalancers have the vip’s added as local loopbacks, do I need the prerouting rule? will the loadbalancing work without it?

    ===

    note: i am using

    net.ipv4.conf.eth0.arp_announce = 2

    ===

    please advise, as my architecture requires the use of local loopbacks,

    ===

  12. September 26th, 2009 at 09:56

    @aaasupport
    You should not need the prerouting rule.

    These rules should make it work: (to check)
    net.ipv4.conf.eth0.arp_ignore=1
    net.ipv4.conf.eth0.arp_announce=2
    net.ipv4.conf.lo.arp_ignore=1
    net.ipv4.conf.lo.arp_announce=2

    But as I said many times, you will encounter errors with this method if both loadbalancer and real server are on the same machine as in my configuration.

  13. October 11th, 2009 at 10:44

    Thanks to the advice of a guy named Philippe who wrote me a mail to share his experience with keepalived, I updated the three following scripts:
    bypass_ipvs.sh
    keepalived_master.conf
    keepalived_slave.conf

    Indeed, if you start and stop your hosts several times, the PREROUTING rule can appear several times, the new bypass_ipvs.sh deals with this issue and check if the rule already exists before adding it as you can see here:

    ****************************************************
    # Add or remove the prerouting rule
    case "$1" in
      add)
        # check if the rule was already specified
        n=$(iptables -t nat -L | grep $VIP | wc -l)
        if [[ $n == 0 ]]; then
          # the rule was not found, add it
          iptables -A PREROUTING -t nat -d $VIP -p tcp -j REDIRECT
        fi
        ;;
      del)
        # check if the rule was already specified
        n=$(iptables -t nat -L | grep $VIP | wc -l)
        while [[ $n > 0 ]]; do
          # remove the rule
          iptables -D PREROUTING -t nat -d $VIP -p tcp -j REDIRECT
          n=$(($n-1))
        done
        ;;
      *)
        echo "Usage: $0 {add|del} ipaddress"
        exit 1
    esac
    ****************************************************

    Thank you for your feedback Philippe.

  14. Thomas
    January 10th, 2010 at 14:43

    I think there’s a minor bug in the updated bypass_ipvs.sh script.

    The script is looking/searching for a numeric IP address when counting the number of rules defined, but the iptables command will (may?) resolve and display the text version of the requested IP.

    Thus, the logic in the “add” & “del” section should probably be:

    n=$(iptables -n -t nat -L | grep "${VIP}" | wc -l)

    and not:

    n=$(iptables -t nat -L | grep $VIP | wc -l)

    I made another (small) change in my version of the script as well so as to use bash friendly arithmetic:

    Under “del)”, in the while loop:
    let n=$n-1

    HTH

  15. Thomas
    January 10th, 2010 at 15:02

    My configuration:

    2 node cluster. host1 = keepalived master, host2 = keepalived slave. Both are also realservers (running one postfix instance on each cluster member, and I want keepalived to load balance between them).

    I’ve modeled my setup on your configuration files (keepalived.conf) with rr & DR for the VIP. I am using – as can be seen from my previous comment – your bypass_ipvs.sh script.

    My problem is that when telnetting to port 25 on the VIP, only the postfix instance running on the master keepalived host responds. When the VIP attempts to connect to the postfix instance running on the slave keepalived host, it hangs (seemingly in the ping-pong mode you've described in your post).

    iptables -t nat --list output from both master & slave:

    From the Master:

    Chain PREROUTING (policy ACCEPT)
    target prot opt source destination

    Chain POSTROUTING (policy ACCEPT)
    target prot opt source destination

    Chain OUTPUT (policy ACCEPT)
    target prot opt source destination

    From the Slave:
    iptables -t nat --list
    Chain PREROUTING (policy ACCEPT)
    target prot opt source destination
    REDIRECT tcp -- anywhere VIP.fqdn

    Chain POSTROUTING (policy ACCEPT)
    target prot opt source destination

    Chain OUTPUT (policy ACCEPT)
    target prot opt source destination

    Not sure how to identify what the problem is since if I shut down master keepalived process, the postfix instance on the slave now becomes the responding service. This is obviously a routing problem, but I can’t figure out what to do to address the problem and be able to connect to _both_ postfix services via the VIP.

  16. Thomas
    January 10th, 2010 at 22:35

    @Thomas
    Forgot an important detail: The problem I’m having isn’t about getting access to both realservers from a 3rd server (i.e. some other system than the ones hosting the master/slave), it’s being able to connect to both postfix services from the master or slave server.

  17. Harri
    January 11th, 2010 at 16:06

    I got this:

    /etc/keepalived/bypass_ipvs.sh: 33: Syntax error: “(” unexpected

  18. Harri
    January 11th, 2010 at 16:24

    PS: Another problem:

    --- /etc/keepalived/bypass_ipvs.sh.bak 2010-01-11 16:06:05.000000000 +0100
    +++ /etc/keepalived/bypass_ipvs.sh 2010-01-11 16:22:03.000000000 +0100
    @@ -63,7 +63,7 @@
     case "$1" in
     add)
     # check if the rule was already specified
    -n=$(iptables -t nat -L| grep $VIP | wc -l)
    +n=$(iptables -t nat -L -n| grep $VIP | wc -l)
     if [[ $n == 0 ]]; then
     # the rule was not found, add it
     iptables -A PREROUTING -t nat -d $VIP -p tcp -j REDIRECT
    @@ -71,7 +71,7 @@
     ;;
     del)
     # check if the rule was already specified
    -n=$(iptables -t nat -L| grep $VIP | wc -l)
    +n=$(iptables -t nat -L -n| grep $VIP | wc -l)
     while [[ $n > 0 ]]; do
     # remove the rule
     iptables -D PREROUTING -t nat -d $VIP -p tcp -j REDIRECT

    Hope this helps

  19. January 11th, 2010 at 21:34

    @Thomas
    The behavior you described for bypass_ipvs.sh is not a bug. As you said, the script is looking for an IP address; if you specify a hostname, it will not even pass the validation process. The script was built and supported only for an IP address. If you want to specify a hostname, I advise you to begin with an IP address, ensure everything works, and only then modify the whole script to make it work with a hostname. I didn't check the behavior of the PREROUTING rule when you specify a hostname instead of an IP address. If you are sure that the PREROUTING rule can work with a hostname, then you just have to modify my script so that it accepts and deals with hostnames.

    Your "friendly" arithmetic modification (n=$n-1) does not work in my sh environment, so keep: n=$(($n-1))

  20. January 11th, 2010 at 21:34

    @Thomas
    Again, it works with an IP address, thus your issue should come from your hostname specification

  21. January 11th, 2010 at 21:35

    @Harri
    Please check that you use the provided bypass_ipvs.sh script. Indeed, I don't remember having specified the "-n" option for the computation of the number of rules… Furthermore, I just checked the script and it works perfectly under my environment.

  22. Tyler
    January 12th, 2010 at 10:00

    First off, thank you for the great detailed guide. I’ve been following it attempting to replace a failing load balancing appliance with LVS.

    I am using Ubuntu with 2 nodes. Each node is a director and also a web server using a single NIC with everything on the same network (just like your setup).

    I’ve managed to get keepalived running and performing immediate failover, but am struggling to get the load balancing portion working. The request always gets handled by whichever director is currently master. ipvsadm shows no entries! I am assuming that the virtual_server section with the 2+ real_server entries define what keepalived will automatically add to the ipvsadm entries when started… Is my assumption wrong? Am I supposed to manually enter ipvsadm entries?

    Using Ubuntu 9.10 server edition and the latest version of keepalived available with apt-get.

    What am I doing wrong? Thank you for your help!

  23. Harri
    January 12th, 2010 at 10:53

    @gcharriere
    Please don’t confuse old and new version in the diff output. The -n is missing(!) in your script. “iptables -t nat -L -n” produces numeric output for the IP addresses. If you omit the “-n”, then you will get hostnames instead of IP addresses and the nat iptable is not cleaned up on failover.

    I would assume you did not specify a reverse lookup for $VIP on your DNServer?

  24. January 12th, 2010 at 11:44

    The ‘-n’ is not missing, you can add it if you want, but since the script is only supposed to work with numerical IP addresses, omitting this option will not impact the overall behavior. Again, it works with IP addresses. If you manage to make it work with hostnames, I’ll be more than happy to publish your results on this website.

  25. January 12th, 2010 at 13:45

    @Tyler
    Indeed, keepalived should automatically deal with ipvs table. Your problem is certainly related to your Ubuntu server that is unable to properly handle ipvs module.

  26. Tyler
    January 12th, 2010 at 21:12

    @gcharriere
    Any idea why that would be happening? I installed the barebones Ubuntu server with LVS support from the cd, performed apt-get upgrade and apt-get install keepalived, enabled all the ip_vs.. modules, and have tried out multiple configuration files, all with no luck.

    The failover portion (VRRP) is working great, the load balancing (LVS) is not working at all. If I manually create the entries using ipvsadm it will work, but I want to get keepalived managing that.

    The confusing part is that the /var/log/messages output from starting keepalived shows the virtual_server / healthchecks section is being ignored and no errors are being thrown.

    Any ideas?

  27. January 13th, 2010 at 11:51

    @Tyler
    After a quick search, the current version (1.1.17) of keepalived in Ubuntu 9.10 seems to be built without LVS support. You should investigate in this area.
    http://www.mail-archive.com/ubuntu-ha@lists.launchpad.net/msg00312.html

  28. Harri
    January 14th, 2010 at 12:12

    @gcharriere
    The "-n" option for iptables means "numeric output", i.e. use numeric IP addresses. If you omit the '-n' (and if you have reverse DNS for your $VIP properly set up), then "iptables -t nat -L" will give you fully qualified host names instead of numeric IP addresses.

  29. May 16th, 2010 at 10:00

    Thanks for your great tutorial !
    it is working perfectly on my servers but i have a small question:
    How do you do if you have 2 VIPs that need to load balance traffic between the 2 boxes ?
    Is there a way to do it ?

  30. May 17th, 2010 at 20:46

    @Jonathan
    Configure another vrrp_instance and another VIP.

  31. May 18th, 2010 at 12:27

    @gcharriere
    Ok but what do you do with the PREROUTING rule ?
    not needed for the second VIP ?

  32. May 18th, 2010 at 18:21

    @Jonathan
    Simply add another prerouting rule for your second VIP to make sure both hosts can process packets intended to this VIP.

  33. May 19th, 2010 at 09:19

    @gcharriere
    Yes thank you it is working ! :)

  34. Kenn
    November 3rd, 2010 at 22:53

    I had much success when following this tutorial last year, although my setup had slight differences in that my real servers were windows machines.

    What I like to do is combine the setup of what you have here such that:

    1/ VIP: Real Server 192.168.9.10 and 192.168.9.20 for proxy port on 7777
    2/ VIP: Real Server 192.168.9.30 and 192.168.9.40 for windows service on 6666

    Would the above setup work? I have 2/ working fine. But wondering, if using the above scripts I can get 1/ to work. I guess the question is: can you mix a group of real servers on port 6666 to be on windows only machines, and mix a group of real servers on port 7777 to be on the same servers as keepalived (master / slave as per your setup below)?

    It will be a dream if this is possible as it maximises hardware resources, and keeps the best of both worlds Windows and Linux.

    Thanks,
    Kenn

  35. Kenn
    November 4th, 2010 at 11:38

    @Kenn

    I can say that it does work. I had a lot of problems getting the above bypass script to work on this linux box. These were the things I had to do:
    1) Change the shell to bash
    2) In your test with iptables to see if the VIP has been registered I had to add "-n". This tells iptables not to do DNS lookups, and it is important: without it you will not see the VIP in the results (I see the hostname of whoever the VIP is assigned to instead of the VIP, and as a result the script will never remove the routing rule when toggling between master and backup states).

    One question I do have: I am using the 2 Linux machines for Keepalived (one master, one backup), and on each Linux machine I installed a proxy.

    The lb_algo I am using is just wrr and setting Linux1 real server to weight 100 and the Linux2 real server to weight 0. Is this correct? I want all traffic to go to proxy server on Linux1 since it is where everything is being cached. If this proxy server should die then go to the Linux2 one, or if Linux1 computer dies, go to Linux2. I can verify that these are working as I fail out the proxy server, or shut down the master keepalived server.

    What I am unsure about is if using lb_algo wrr and assigning Linux1 real server weight to 100, and Linux2 real server weight to 0, will ensure that all traffic goes to Linux1 box.

    Thanks,
    Kenn

  36. November 24th, 2010 at 13:24

    The setup you describe in this article is assuming you have one eth0 interface but no eth1 interface, correct?
    And since both your real IP and your virtual IP map to eth0, you need to take countermeasures against the ping pong effect, if I’m not mistaken.

    My setup will be almost identical to yours (2 machines acting both as loadbalancer and as realserver), but I have an eth1 interface which interconnects host1 to host2 directly. The connection to the internet is established via eth0 on both machines.
    If a packet arrives on the virtual IP, it will flow in via eth0. The loadbalancer will only listen on the virtual IP, and route the packet to either itself or to the other host – using the private IP address, going over eth1.
    Since neither of the hosts has a loadbalancer listening on eth1, we don’t have the ping pong effect, right?

    So my question actually is: Can I disregard the bypass_ipvs script altogether in my setup and go with the “traditional” approach?
    (“The traditional way to sort out this issue is to configure the VIP on the other node for example on the loopback interface so that it accepts packets with VIP as destination address. Then you should configure network interfaces such that they ignore some ARP requests playing with arp_ignore and arp_announce options.”)

    Could it cause troubles, if a request originates from -let’s say- host1 (e.g. a database lookup using the virtual IP)?
    Imagine the following scenario:
    A website’s domainname resolves to a VIP.
    The active loadbalancer (host1) routes the request to host2.
    Host2 tries to answer the HTTP request – a custom webapp initiates an HTTP GET on some other website which is also hosted in the same cluster on the same VIP. What will happen then?

    Thank you in advance,
    Can

  37. December 7th, 2010 at 14:47

    @Kenn
    If weight is equal to zero, no traffic should be redirected to this real server even if it is the only one up and running. You should rather configure a healthchecker on your Linux2 that checks Linux1. Configure it in a way that this healthcheck should return true only when the neighbor does not work any more. Thus, Linux2 realserver will only work when Linux1 is down.

  38. December 7th, 2010 at 15:43

    @Can
    – one eth0, correct.
    – Can you disregard the bypass_ipvs script altogether in your setup and go with the “traditional” approach? Yes, since the cost to a directly connected interface is 0. Thus there is no risk that your packet will be redirected to the other VIP (which could be a problem).
    If a realserver initiates a GET request on a VIP address it will try to handle itself since it already has a MAC address associated to the VIP.

  39. Jeff
    April 26th, 2011 at 02:42

    Hi, thanks for your instructions. I have a question about generating the digest. Here, when I try to run the 'genhash' command:

    [hostname1]# genhash -s hostname1 -p 80 -u /apache2-default/index.html
    MD5SUM = c7b4690c8c46625ef0f328cd7a24a0a3

    [hostname1]# genhash -s hostname2 -p 80 -u /apache2-default/index.html
    MD5SUM = c7b4690c8c46625ef0f328cd7a24a0a3

    I just get different MD5SUMs. But in your example we can see that genhash generates the same MD5SUM from both hosts. Does it matter?

  40. April 26th, 2011 at 09:08

    @Jeff
    It doesn't matter as long as the hash of each host remains the same when you run it several times. If that is not the case, it means that your server computes a different random value every time the page is executed. So please ensure it is not the case, otherwise keepalived will assume your realserver is dead and remove it from its table.

  41. Mien
    July 8th, 2011 at 06:13

    Hi, thanks a lot for the tutorial. It's really great :). I could run the same model with two Ubuntu servers. But I had a small issue: whenever the client connects to the web server it loads file.php but does not process it on the server side. Your advice will be highly appreciated.

  42. July 9th, 2011 at 08:54

    @Mien
    Could you please try to connect to the same page (file.php) through the VIP as well as through the real server IP address so that we can see the differences?

  43. Mien
    August 11th, 2011 at 09:45

    I changed the configuration and it worked somehow. I still face an issue with the loadbalancer.
    [hostname1]# ipvsadm -L -n
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
    -> RemoteAddress:Port Forward Weight ActiveConn InActConn
    TCP 192.168.9.100:80 rr persistent 50

    There was no Local or Route entry for the local host or the neighbour. Please advise.

  44. Mien
    August 12th, 2011 at 10:19

    Only failover worked perfectly; the load balancing function did not work.

  45. August 16th, 2011 at 10:22

    @Mien:
    I also had a load balancing problem with a setup where the same client was running requests against the cluster, which I set up according to this page.
    The solution I found is to remove the "persistence_timeout 50" line from the keepalived configuration file. This parameter specifies that a given client will always be forwarded to the same target (information provided to IPVS) within a specific timeframe.
    Then I could see my load balancing working correctly!

  46. Mien
    August 17th, 2011 at 07:44

    Thanks for your info but I still have an issue with it. I am using Ubuntu 10.04, could that be the cause?

  47. Nick
    September 2nd, 2011 at 00:38

    @Tyler
    You're not doing anything wrong; Ubuntu keepalived 1.1.17 has an awesome bug: it does not come with ipvs support for the kernel. It's in debian but not Ubuntu.

    https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/496932

    I ended up compiling keepalived-1.2.2 from source. You will find that IPVS support is not there by default, so you will need to compile IPVS support in from the kernel source headers when you compile keepalived from the source tar.

    The issue appears to be in Ubuntu 10.04 as well, it never got fixed Ubuntu failz.

  48. Vikas
    November 11th, 2011 at 13:16

    Hi gcharriere,
    Really good article on clustering. I was in search of exactly this kind of cluster setup. I followed your tutorial and managed to get load balancing and failover for Tomcat and ActiveMQ (listening on 8161). However, when I try the same thing with MySQL, I do not get failover when MySQL is not running on the master machine (hostname1). Basically, it is not able to connect using the VIP when MySQL is running only on the slave machine (hostname2). With tcpdump I can see the following traces, which hint that there is some kind of looping happening in this case.

    I am getting following tcpdump When I am trying to connect mysql from hostname1

    04:09:41.392661 IP $VIP.60681 > $VIP.mysql: …

    and following when connecting from hostname2

    04:09:41.392661 IP hostname2.60681 > $VIP.mysql: …

    I have verified that the NAT rule is present on the slave. Can you please help me?

  49. Awktane
    November 27th, 2011 at 12:11

    A note with Ubuntu 10.04 – modprobe IP_VS and then add IP_VS to /etc/modules. Works like a charm!

    I too had to change the shell on the .sh script to bash and because my servers are able to access reverse DNS entries I had to make it n=$(iptables -n -t nat -L … under both add and del. No matter the circumstance this should make it more efficient because then iptables won’t bother trying to look up names.

  50. Luis
    November 30th, 2011 at 05:14

    Hi.. I am a newbie. and trying to follow up this tutorial

    I am not getting the master to work properly, the VIP pings, but on the browser port 80 does not answer.

    1. Do I have to add the VIP in /etc/network/interfaces? If I do that it seems to work, but first the VIP gets to the BACKUP, I have to stop keepalived there, and then the MASTER takes over, then I can enable the BACKUP again.

    2. If I reboot both machines without the VIP added to /etc/network/interfaces and stop apache on the BACKUP server, the MASTER takes over. Somehow, the BACKUP is gaining control first, and I have to disable it to get the MASTER to take over, but the BACKUP is not answering on port 80 on the VIP

    3. I am getting some weird messages that might be related (MASTER /var/log/messages)

    Nov 30 09:22:54 AERLRS045 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
    Nov 30 09:22:54 AERLRS045 Keepalived_vrrp: Netlink: skipping nl_cmd msg…
    Nov 30 09:24:07 AERLRS045 kernel: [ 82.754640] process `sysctl’ is using deprecated sysctl (syscall) net.ipv6.neigh.default.retrans_time; Use net.ipv6.neigh.default.retrans_time_ms instead.
    Nov 30 09:25:26 AERLRS045 kernel: [ 161.739457] ip_tables: (C) 2000-2006 Netfilter Core Team
    Nov 30 09:25:26 AERLRS045 kernel: [ 161.760538] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
    Nov 30 09:25:26 AERLRS045 kernel: [ 161.760736] CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
    Nov 30 09:25:26 AERLRS045 kernel: [ 161.760739] nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
    Nov 30 09:25:26 AERLRS045 kernel: [ 161.760741] sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
    Nov 30 09:29:18 AERLRS045 Keepalived_healthcheckers: Error connecting server [192.168.172.224:80].
    Nov 30 09:29:18 AERLRS045 Keepalived_healthcheckers: Removing service [192.168.172.224:80] from VS [192.168.172.59:80]

    If I stop keepalived on the MASTER, the VIP continues to ping but does not answer on port 80.

    5. When I manually execute /etc/keepalived/bypass_ipvs.sh del 192.168.172.59, I get the following message:

    /etc/keepalived/bypass_ipvs.sh: 33: Syntax error: “(” unexpected

    Many Thanks, and sorry for the dumb questions, definitely I am doing something wrong.
