
8. Persistent connection

The term "persistence" has 2 meanings in setting up an LVS. There is "persistent connection" a term used for connecting to webservers and databases and "persistent connection" used in LVS. These are quite different.

8.1 netscape/database/tcpip persistence

Persistent connection outside of LVS is described in http persistent connection and is an application level protocol feature. It works this way:

In a normal http (or database) connection, after the server has sent its reply, it shuts down the tcpip connection. This makes your session with the server stateless - the server has no record of previous packets/data/state sent to it. If the payload is small (eg 1 packet), then you've gone through a lot of handshakes and packet exchanges to deliver one packet. To solve this, http persistent connection was invented. Both the client and server must be persistence-enabled for this to work. At connect time, the client and server notify each other that they support persistent connection. The server uses an algorithm to determine when to drop the connection (timeout, needs to recover file handles...). The client can drop the connection at any time without consulting the server. This requires more resources from the server, as file handles can be open for much longer than the time needed for a tcpip transfer.
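
For illustration, here's roughly what the negotiation looks like on the wire (a minimal sketch; the hostname is made up and header values vary by server). The client announces it can keep the connection open, the server agrees and states its timeout, and the next GET reuses the same tcpip connection:

$ telnet www.example.com 80
GET /index.html HTTP/1.0
Connection: Keep-Alive

HTTP/1.0 200 OK
Connection: Keep-Alive
Keep-Alive: timeout=15, max=100
Content-Length: 1024
...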

8.2 LVS persistence

LVS persistence makes a client connect to the same realserver for different tcpip connections. This is used when the realserver must hold client state across connections, e.g. for https/ssl, ftp, or http cookies (shopping carts).

The default timeout for LVS persistence is 360 secs (it used to be 600 secs). The default timeout for a regular LVS connection via LVS-DR is TIME_WAIT (about 1 minute), so persistent connections stay in the LVS connection table about 6 times longer than regular ones. As a consequence the hash table (and the memory requirement) will be about 6 times larger for the same number of connections/sec. Make sure you have enough memory to hold the increased table size if you're using persistent connections.

If the persistence is being used to hold state (e.g. a shopping cart), then you must allow a long enough timeout for the client to surf to another site for a better price, make a cup of coffee, think about it and then go find their credit card. This is going to be much longer than any reasonable timeout for LVS persistence, so the state information will have to be held on a disk somewhere on the realservers, and you'll have to allow for the client to reappear on a different realserver later with their credit card information.
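
The persistence timeout is the argument to ipvsadm's -p option when the virtual service is added (the VIP and timeout here are only examples; see the worked examples later in this section):

#persistent http with a 30 min timeout instead of the 360 sec default
/sbin/ipvsadm -A -t 192.168.1.110:80 -s wlc -p 1800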

The LVS persistent (or sticky) connection operates at the layer 4 (tcpip) level. This is not the same as the persistent connection described above for netscape or database persistence, which keeps a single tcpip connection open. Unfortunately, both features can reasonably claim the name "persistent", and this causes some confusion. LVS persistence could alternatively be described as connection affinity or port affinity.

Wensong Zhang wensong@gnuchina.org 11 Jan 2001

The working principle of persistence in LVS is as follows: when a client first connects to the service, the director selects a realserver by the scheduling method and creates a connection template between the client and the chosen realserver, then an entry for the connection itself. Subsequent connections from the same client are forwarded to the same realserver according to the template, until the template expires.

You can trace your system in the following way. For example:

[root@kangaroo /root]# ipvsadm -ln
IP Virtual Server version 1.0.3 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  172.26.20.118:80 wlc persistent 360
  -> 172.26.20.91:80             Route   1      0          0
  -> 172.26.20.90:80             Route   1      0          0
TCP  172.26.20.118:23 wlc persistent 360
  -> 172.26.20.90:23             Route   1      0          0
  -> 172.26.20.91:23             Route   1      0          0

[root@kangaroo /root]# ipchains -L -M -n
IP masquerading entries
prot expire   source               destination          ports
TCP  02:46.79 172.26.20.90         172.26.20.222        23 (23) -> 0

Although there are no connections, the template hasn't expired, so new connections from the client 172.26.20.222 will be forwarded to the server 172.26.20.90.

Bowie Bailey

If I start a service with:

        ipvsadm -A -f 1 -s wlc -p 180
and then change the persistence flag with:
        ipvsadm -E -f 1 -s wlc -p 180 -M 255.255.255.0
how does that affect the connections that have already been made?

Julian 30 Jul 2001

The connections already established stay as they are. But the persistence is broken, and after changing the netmask you can expect the next connections to be established to other realservers (not to the same ones as before the change).

If IP address 1.2.3.4 was connected to RIP1 before I changed the persistence and then 1.2.3.5 tries to connect afterwards, would he be sent to RIP1, or would it be considered a new connection and possibly be sent to either server since the mask was 255.255.255.255 when the first connection happened?

A new realserver will be selected.

8.3 persistent client connection, pcc (for kernel <= 2.2.10)

All connections from a client IP go to the same realserver. The timeout for inactive connections is 360 sec. pcc is designed for https and cookie serving. With pcc, after the first connection (say to port 80), any subsequent connection request from the same client but from another port (eg 443) will be sent to the same realserver. The problem with this is that many people on the internet (about 25% at the time) appear to come from the same few IPs (AOL customers are connected to the internet via proxy servers in Virginia, USA). If you have pcc set, then after the first client connects from AOL, all subsequent connections from AOL will go to the same realserver, until the last AOL client disconnects. This effect will override attempts to distribute the load between realservers.
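
On these old kernels pcc was itself a scheduler; a minimal sketch of the setup (see also the Sticky connections section below):

ipvsadm -A -t <VIP>:0 -s pcc
ipvsadm -a -t <VIP>:0 -R <realserver>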

8.4 persistent port connection (ppc) (for kernel >= 2.2.12)

With kernel 2.2.12, the persistent connection feature has been changed from a scheduling algorithm (you got rr|wrr|lc|wlc|pcc) to a switch (you can have persistent connection with rr|wrr|lc|wlc). If you do not select a scheduling algorithm when asking for a persistent connection, ipvsadm will default to wlc.
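
For example (the VIP is an example address), both of the following set up a persistent http service; the second relies on the wlc default:

ipvsadm -A -t 192.168.1.110:80 -s rr -p
ipvsadm -A -t 192.168.1.110:80 -p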

The difference between pcc and ppc is probably of minor consequence to the LVS admin (if you want persistent connection, you have to have it and you don't care how you got it). With ppc, connections are assigned on a port by port basis. Thus if both port 80 and 443 were persistent, connections from the same client to the two ports would not necessarily go to the same realserver. This solves the AOL problem.

If you are handing out cookies to a client on port 80 and they need to go to port 443 to give their credit card, you want them going to the same realserver. There is no way to make ports sticky by groups (or pairs) directly (though see the fwmark example below, which links ports 80 and 443), so for the moment you emulate the pcc connection by using port 0.
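
A sketch of the port 0 emulation (example addresses): all ports from a client are forwarded, persistently, to the same realserver:

ipvsadm -A -t 192.168.1.110:0 -s wlc -p
ipvsadm -a -t 192.168.1.110:0 -R 192.168.1.1 -g -w 1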

8.5 Problems: Removing persistent connections after a realserver crash

Patrick Kormann pkormann@datacomm.ch

I have the following problem: I have a direct routed 'cluster' of 4 proxies. Even if a proxy is taken out of the list of realservers, the persistent connection is still active; that means that proxy is still used.

Andres Reiner

I found some strange behaviour using 'mon' for high-availability. If a server goes down, it is correctly removed from the routing table. BUT if a client made a request prior to the server's failure, it will still be directed to the failed server afterwards. I guess this has something to do with the persistent connection setting (which is used for the cold fusion applications/session variables).

In my understanding the LVS should, if a routing entry is deleted, no longer direct clients to the failed server, even if the persistent connection setting is used.

Is there some option I missed or is it a bug?

Wensong Zhang wrote:

No, you didn't miss anything and it is not a bug either. :)

In the current design of LVS, a connection isn't forcibly removed when the destination of the connection is down; instead its packets are silently dropped. This is because the monitoring software may mark a server down temporarily when the server is just too busy, or the monitoring software may make an error. When the server comes back up, the connection continues. If the server is not up for a while, the client will time out. One thing is guaranteed: no new connections will be assigned to a server while it is down. When the client re-establishes the connection (e.g. presses reload/refresh in the browser), a new server will be assigned.

jacob.rief@tis.at wrote:

Unfortunately I have the same problem as Andres (see above). If I remove a realserver from a list of persistent virtual servers, the connection never times out, not even after the specified timeout has been reached.

Wensong

The persistent template won't timeout until all its connections timeout. After all the connections from the same client have expired, new connections can be assigned to one of the remaining servers. You can use "ipchains -M -L -n" (or netstat -M) to check the connection table (for 2.4.x use cat /proc/net/ip_conntrack).

Only if I unset persistency is the connection redirected onto the remaining realservers. Now if I turn on persistency again, a previously attached client does not reconnect anymore - it seems as if LVS remembers such clients. It does not even help if I delete the whole virtual service and restore it immediately, in the hope of clearing the persistency tables:
ipvsadm -D -t <VIP>; ipvsadm -A -t <VIP> -p; ipvsadm -a -t <VIP> -R <alive real server>
It also does not help to close the browser and restart it. I run LVS in masquerading mode on a 2.2.13 kernel patched with ipvs-0.9.5. Wouldn't it be a nice feature to be able to flush the persistent client connection table, and/or list all such connections?

Wensong

There are several reasons that I didn't do this in the current code. One is that it is time-consuming to search a big table (maybe one million entries) to flush the connections destined for the dead server; the other is that the template won't expire until its connections expire, so the client will be assigned to the same server as long as there is an unexpired connection. Anyway, I will think about a better way to solve this problem.

valery brasseur

I would like to do load balancing based on cookies and/or URLs.

Wensong

Have a look at http://www.LinuxVirtualServer.org/persistence.html :-)

Joe

also see the cookie section

matt matt@paycom.net

I have run into a problem with the persistent connection flag and I'm hoping that someone can help me. First off, I don't think there is anything like this out now, but is there any way to load-balance via URL? Such as http://www.matthew.com being balanced among 5 servers without persistent connections turned on, and http://www.matthew.com/dynamic.html being flagged with persistence? Second question is this: I don't exactly need a persistent connection, but I do need to make sure that requests from a particular person continue to go to the same server. Is there any way to do this?

James CE Johnson jcej@tragus.org Jul 2001

We ran into something similar a while back. Our solution was to create a simple Apache module that pushes a cookie to the browser when the "session" begins (ie when no cookie exists). The content of the cookie is some indicator of the realserver. On the second and subsequent requests the Apache module sees the cookie and uses the Apache proxy mechanism to forward the request to the appropriate realserver and return the results.

unknown

Let's say I have 1000 http requests (A) coming through a customer's firewall (so in fact all requests present the same source IP to the loadbalancer, because of NAT), then one request (B) from the intranet, and then again 1000 requests (C) from that firewall. What does the LB do? I have three realservers r1, r2, r3 (ppc with rr).

 
a) A to r1, B to r2, C to r1 (because of SourceIP) [Distribution:2000:1:0.0000001]
b) A to r1, B to r2, C to r3 (because r3 is free) [Distribution:1000:1:1000]
c) A to r1, B to r2, C to r2 (due to the low load of r2) [Distribution:1000:1000:0.000001]
d) A to r1 && r2 && r3 (depending on source port),
   B to r1 || r2 || r3,
   C to r1 && r2 && r3 [Distribution: 667:667:666]

Ratz ratz@tac.ch 12 Sep 1999

If C reaches the load balancer before all the 1000 requests of A expire, then the requests of C will be sent to r1, and the distribution is 2000:1:0.

If all the requests of A expire, the requests of C will be forwarded to a server selected by the scheduler.

BTW, persistent port is used to solve the connection affinity problem, but it may lead to dynamic load imbalance among servers.

Jean-Francois Nadeau

I will use LVS to load balance web servers (Direct Routing and the WRR algorithm). I use persistence with a big timeout (10 minutes). Many of our clients are behind big proxies and I fear this will unbalance our cluster because of the persistence timeout.

Wensong

Persistent virtual services may lead to load imbalance among servers. Using a weight adaptation approach may help avoid having some servers overloaded for a long time: when a server is overloaded, decrease its weight so that connections from new clients won't be sent to that server; when it is underloaded, increase its weight.
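
A sketch of such weight adaptation by hand (example addresses; -e edits an existing realserver entry, and weight 0 quiesces a server so that no new clients are sent to it):

#overloaded: stop sending new clients to this realserver
ipvsadm -e -t 192.168.1.110:80 -r 192.168.1.1 -g -w 0
#underloaded again: restore the weight
ipvsadm -e -t 192.168.1.110:80 -r 192.168.1.1 -g -w 1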

Can we directly alter /proc/net/ip_masquerade?

No, it is not feasible, because directly modifying masq entries will break the established connection.

8.6 Persistent and regular services are possible on the same realserver.

If you setup a 2 realserver LVS-DR LVS with persistence,

ipvsadm -A -t $VIP -p -s rr
ipvsadm -a -t $VIP -R $realserver1 $VS_DR -w 1
ipvsadm -a -t $VIP -R $realserver2 $VS_DR -w 1

giving the ipvsadm output

director:/etc/lvs# ipvsadm
IP Virtual Server version 0.2.5 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  lvs2.mack.net:0 rr persistent 360
  -> bashfull.mack.net:0         Route   1      0          0         
  -> sneezy.mack.net:0           Route   1      0          0     

then (as expected) a client can connect to any service on the realservers (always getting the same realserver).

If you now add an entry for telnet to both realservers, (you can run these next instructions before or after the 3 lines immediately above)

ipvsadm -A -t $VIP:telnet -s rr
ipvsadm -a -t $VIP:telnet -R $realserver1 $VS_DR -w 1
ipvsadm -a -t $VIP:telnet -R $realserver2 $VS_DR -w 1

giving the ipvsadm output

director:/etc/lvs# ipvsadm
IP Virtual Server version 0.2.5 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  lvs2.mack.net:0 rr persistent 360
  -> bashfull.mack.net:0         Route   1      0          0         
  -> sneezy.mack.net:0           Route   1      0          0         
TCP  lvs2.mack.net:telnet rr
  -> sneezy.mack.net:telnet      Route   1      0          0  

the client will telnet to both realservers in turn, as would be expected for an LVS serving only telnet, but all other services (ie !telnet) go to the same (first) realserver. All services but telnet are persistent.

The director will make persistent all ports except those that are explicitly set up as non-persistent. These two sets of ipvsadm commands do not overwrite each other: persistent and non-persistent connections can be made at the same time.

Julian

This is part of the LVS design. The templates used for persistence are not inspected when scheduling packets for non-persistent connections.

8.7 Examples of persistence

Note: making realserver connections persistent allows _all_ ports to be forwarded by the LVS to the realservers, whereas non-persistent LVS connections forward only the nominated service. An open, persistently connected realserver is therefore a security hazard. You should run ipchains commands on the director to block all services on the VIP except those you want forwarded to the realservers.
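
A sketch of such blocking with ipchains (example VIP; here only http and https are let through to the VIP and everything else is denied):

ipchains -A input -p tcp -d 192.168.1.110/32 80 -j ACCEPT
ipchains -A input -p tcp -d 192.168.1.110/32 443 -j ACCEPT
ipchains -A input -p tcp -d 192.168.1.110/32 -j DENY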

8.8 AOL and proxies

Because of the way proxies can work, a client can come from one IP for one connection (eg port 80) and from another IP for the next connection (eg port 443). To handle this, you can make a whole netmask of IPs sticky: if you set the persistence netmask to /24, all clients from the same class C network will be sent to the same realserver.
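
The persistence netmask is set with -M (example addresses; here all clients in the same /24 are treated as one client):

ipvsadm -A -t 192.168.1.110:80 -s wlc -p 360 -M 255.255.255.0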

valery brasseur

I have seen some discussion about "proxy farm" such as AOL or T-Online,

Wensong

If you want to build a persistent proxy cluster, you just need to set up an LVS box in front of all the proxy servers and use the persistent port option in the ipvsadm commands. BTW, you can have a look at how to build a big JANET cache cluster using LVS.

If you want to build a persistent web service but some proxy farms are non-persistent on the client side, then you can use persistence granularity so that clients are grouped; for example, with a 255.255.255.0 mask, clients from the same /24 network will go to the same server.

While persistence can be used for services that require multiple ports (eg ftp/ftp-data, http/https), it is also useful for ssl services.

Here's an example of using persistence granularity (from Ratz, 3 Jan 2001). The -M 255.255.255.255 sets up /32 granularity. Here port 80 and port 443 are linked by an fwmark.

ipchains -A input -j ACCEPT -p tcp -d 192.168.1.100/32 80 -m 1 -l
ipchains -A input -j ACCEPT -p tcp -d 192.168.1.100/32 443 -m 1 -l
ipvsadm -A -f 1 -s wlc -p 333 -M 255.255.255.255
ipvsadm -a -f 1 -r 192.168.1.1 -g -w 1
ipvsadm -a -f 1 -r 192.168.1.2 -g -w 1

For more information on persistence granularity see the section on persistence granularity with fwmark. Its use with an fwmark is the same as with a VIP.

Francis Corouge wrote:

I set up a LVS-DR LVS. All services work well, but with IE 4.1 on a secured connection, pages are received randomly: when you make several requests, sometimes the page is displayed, but sometimes a popup error message is displayed

Internet Explorer can't open your Internet Site <url>
An error occurred with the secured connection.

I did not test with other versions of IE, but netscape works fine. It works when I connect directly to the realserver (realserver disconnected from the LVS, and the VIP on the realserver allowed to arp).

Julian

Is the https service created persistent? i.e. using ipvsadm -p

Joe

Why does persistence fix this problem? (also see http://www.linuxvirtualserver.org/persistence.html)

Julian

I assume the problem is in the way SSL works: cached keys, etc. Without persistence configured, the SSL connections break when they hit another realserver.

what is (or might be) different about IE4 and Netscape?

Maybe in the way the bugs are encoded. But I'm not sure how the SSL requests are performed. It depends on that too.

Example 1. https only

This is done with persistent connection.

lvs_dr.conf config file excerpt (Oct 2001, this syntax doesn't work anymore, persistence is set by the services in @persistent_services).

SERVICE=t https ppc 192.168.1.1

output from ipvsadm


ipvsadm settings
IP Virtual Server version 0.9.4 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  ssl.mack.net:https wlc persistent 360
  -> di.mack.net:https           Route   1      0          0

Example 2. All ports sticky, timeout 30mins, wrr scheduling

lvs_dr.conf config file excerpt

SERVICE=t 0 wrr ppc -t 1800 192.168.1.1 (Oct 2001, this syntax doesn't work anymore, persistence is set by the services in @persistent_services).

which specifies tcp (t), the service (all ports = 0), weighted round robin scheduling (wrr), a timeout of 1800 secs (-t 1800), to realserver 192.168.1.1.

Here's the code generated by configure

#ppc persistent connection, timeout 1800 sec
/sbin/ipvsadm -A -t 192.168.1.110:0 -s wrr -p 1800
echo "adding service 0 to realserver 192.168.1.1 using connection type dr weight 1"
/sbin/ipvsadm -a -t 192.168.1.110:0 -R 192.168.1.1 -g -w 1

here's the output of ipvsadm

# ipvsadm
IP Virtual Server version 0.9.4 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  ssl.mack.net:https wrr persistent 1800
  -> di.mack.net:https           Route   1      0          0

Jeremy Johnson jjohnson@real.com

how does LVS handle a single client that uses multiple proxies? For instance aol: when an aol user attempts to connect to a website, each request can come from a different proxy. So how (if at all) does LVS know that the request is from the same client and bind it to the same server?

Joe

if this is what aol does, then each request will be independent and will not necessarily go to the same realserver. Previous discussions about aol have assumed that everyone from aol was coming out of the same IP (or the same class C network). Currently this is handled by making the connection persistent, so that all connections from aol go to one realserver.

Michael Sparks zathras@epsilon3.mcc.ac.uk

If an ISP's users (eg AOL) sit behind a proxy array/farm, then their requests are _likely_ to come in one of two forms: from several IPs within one subnet, or all from a single IP.

The former can be handled using a subnet mask in the persistence settings; the latter is handled by normal persistence.

*However*, in the case of our proxy farm, neither of these would work, since we have 2 subnet ranges for our systems - 194.83.240/24 & 194.82.103/24 - and an end user request may come out of either subnet, totally defeating the persistence idea... (in fact, depending on our clients' configuration of their caches, the request could appear to come from the above two subnets, or from the above 2 subnets and about 1000 other ones as well)

Unfortunately this problem is more common than might be obvious, due to the NLANR hierarchy; so whilst persistence on IP/subnet solves a large number of problems, it can't solve all of them.

Billy Quinn bquinn@ifleet.com 05 Jun 2001

I've come to the conclusion that I need an expensive (higher layer) load balancer node, which load balances port 80 (using persistence because of sessions) to 3 realservers which each run an apache web server and a tomcat servlet engine. Each of the 3 servers is independent and no tomcat load balancing occurs.

This has worked great for about a year, while we only had to support certain IP address ranges. Now, however, we have to support clients using AOL and their proxy servers, which completely messes up the session handling in tomcat. In other words, one client comes from multiple different IP addresses depending on which proxy server it comes through.

It seems the thing to do is to adjust the persistence granularity. However, if I adjust the netmask, all of our internal network traffic will go to one server, which rather defeats the purpose.

What I'm concluding is, that I'll need to change the network architecture (since we are all on one subnet), or buy a load balancer which will look at the actual data in the packets (layer 7?).

Joe

There have been comments from people dealing with this problem (not many), but they seem to still be able to use LVS. We don't hear of anyone having lots of trouble with this, but that could be because no-one on this list is dealing with AOL as a large slice of their work.

If 1/3 of your customers are from AOL you could sacrifice one server to them, but it's not ideal. If all your customers are from AOL, I'd say we can't help you at the moment.

My concern with that would be anyone else doing proxying, now or in the future. I would not be opposed to routing all of the AOL customers to one server for now though. I guess we'd have to deal with each case of proxying individually. I wonder how many other ISPs do proxying like that.

How many different proxy IPs do AOL customers arrive on the internet from? How many will appear from multiple IPs in the same session, and how big is the subnet they come from (/24?)?

Good question, I'm not sure about that one. The customer that reported the problem seemed to be coming from about 2-4 different IP addresses (for the same session).

If AOL customers come from at least 3 of these subnets and you have 3 servers, then you can use LVS as a balancer.

Peter Mueller pmueller@sidestep.com

Over here we also need layer-7 'intelligent' balancing with our apache/jakarta setup. We utilize two tiers of 'load-balancing'. The first is the initial LVS-DR round-robin type setup; the second layer is our own creation, at layer-7. Currently we round-robin the first connection to one server; that server then calls a routine that asks the second-tier layer-7 java monitor boxes which box to send the connection to. (If for some reason the second layer is down, standard round-robin occurs.)

We're about 50% done with the migration from cisco LD (yuck!) to LVS-DR. After the migration is fully complete, the goal is to have the two layers interacting more efficiently and hopefully eventually merged into one 'layer'. For example, if we tell our java-monitor second-tier controllers to shut down a server, the first tier will then mark the node out of service automatically.

PS - we found the added layer-7 intelligent balancing to be about 30-50% (?) more effective than cisco round robin LD. I think the analogy of a hub versus a switch works fairly well here.

Chris Egolf cegolf@refinedsolutions.net

We're having the exact same problem with WebSphere cookie-based sessions. I was testing this earlier today and I think I've solved this particular problem by using firewall marks.

Basically, I'm setting everything from our internal network with one fwmark and everything else with another. Then I set up the ipvsadm rules with the default client persistence (/32) for our internal network and a class C netmask granularity (/24) for everything from the outside, to deal with the AOL proxy farms.

Here's the iptables script I'm using to set the marks:

iptables -F -t mangle
iptables -t mangle -A PREROUTING  -p tcp -s 10.3.4.0/24 -d $VIP/32 \
             --dport 80 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING  -p tcp -s ! 10.3.4.0/24 -d $VIP/32 \
             --dport 80 -j MARK --set-mark 2

Then, I have the following rules setup for ipvsadm:

ipvsadm -C
ipvsadm -A -f 1 -s wlc -p 2000
ipvsadm -a -f 1 -r $RIP1:0 -g -w 1
ipvsadm -a -f 1 -r $RIP2:0 -g -w 1

ipvsadm -A -f 2 -s wlc -p 2000 -M 255.255.255.0
ipvsadm -a -f 2 -r $RIP1:0 -g -w 1
ipvsadm -a -f 2 -r $RIP2:0 -g -w 1

FWMARK #1 doesn't have a persistent mask specified, so each client on the 10.3.4.0/24 network is seen as an individual client. FWMARK #2 packets are grouped by class C client network to deal with the AOL proxy farm problem. (For more on the persistent netmask, see the section in fwmark on fwmark persistence granularity.)

Like I said, I just did this today, and based on my limited testing, I think it works. I'm thinking about setting up a whole bunch of rules to deal with each of the published AOL cache-proxy server networks (http://webmaster.info.aol.com/index.cfm?article=15&sitenum=2), but I think that would be too much of an administrative nightmare if they change them.

The ktcpvs project implements some level of layer-7 switching by matching URL patterns, but we need the same type of cookie-based persistence for our WebSphere realservers. Hopefully it won't be too long before that gets added.

8.9 PPC (persistent port connection) (kernels >= 2.2.12)

(Note: Jul 2001, this is quite old now, from at least 2yrs ago. If you've got this far, you probably don't need to go further.)

In earlier kernels, persistence was implemented with PCC (persistent client connection).

PPC is used for clients which must maintain connections to the same realserver throughout a session (eg for various SSL protocols, or an http server sending cookies...). The default session timeout is 5 mins (ip_vs_pcc.h).

PCC (kernel <2.2.12) was removed in 2.2.12 and has resurfaced as a more general persistence feature called persistent port. PCC connects (some or all) ports from a client IP through to the same ports on a single realserver (the realserver is selected on the first connection; after that all subsequent port requests from the same IP go to the same realserver). With persistent port, the persistent connection is on a port by port basis and not by IP. If persistent port is called with a port of "0" then the behaviour is the same as PCC.

here's the syntax for 2.2.12 kernels

Wensong

To use persistent port, the commands are as follows:

ipvsadm -A -t <VIP>:<port> [-s <scheduler>] -p
ipvsadm -a -t <VIP>:<port> -R <real server> ...
        ...

If port=0 then all ports from the CIP will be mapped through to the realserver (the old PCC behaviour). If port=443, then only port 443 from the CIP will be mapped through to the realserver as a persistent connection.

If the virtual service port is set persistent, connections from the same client are guaranteed to be directed to the same server. When a client sends a request for the service for the first time, the load balancer (director) selects a server by the scheduling method and creates a connection and the template. The following connections from the same client are then forwarded to the same server according to the template, within the specified time.

The source address of an incoming packet is used to look up the connection template.

from Peter Kese (who implemented pcc)

The PCC (persistent client connection) scheduling algorithm needs some more explanation. When PCC scheduling is used, connections are scheduled on a per-client basis instead of per-connection. That means the scheduling is performed only the first time a certain client connects to the virtual IP. Once the realserver is chosen, all further connections from the same client will be forwarded to the same realserver.

The PCC scheduling algorithm can either be attached to a certain port or to the server as a whole. By setting the service port to 0 (example: ipvscfg -A -t 192.168.1.10:0 -s pcc) the scheduler will accept all incoming connections and schedule those from the same client to the same realserver, no matter what the port number is.

As Wensong noted before, the PCC scheduling algorithm might produce some load imbalance on the realservers. This happens because the number of connections established by clients might vary a lot. (There are some large companies, for example, that use only one IP address for accessing the internet. Or think about what happens when a search engine comes to scan the web site in order to index the pages.) On the other hand, the PCC scheduler resolves some problems with certain protocols (e.g. FTP), so I think it is good to have it.

and a comment about load balancing using pcc/ssl (the problem: once someone comes in from aol.com to one of the realservers, all subsequent connections from aol.com will also go to the same server) -

Lars

Let's examine what happens with SSL sessions coming in from a big proxy, like AOL. Since they are all from the same host, they get forwarded to the same server - *thud*.

Now, SSL carries a "session id" which identifies all requests from a browser. This can be used to separate the multiple SSL sessions, even if comeing in from one big proxy and load balance them.

(from unknown)

SSL connections will not come from the same port, since the clients open many of them at once, just like with normal http. So would we be able to differentiate all the people coming from aol by the port number?

No. A client may open multiple SSL connections at once, which obviously will not come from the same port - but I think they will come in with the same SSL id.

(unknown again)

But like I said: really hard to get working, and even harder to get right ;-)

Wensong

No, not really! As far as I know, the PCC (Persistent Client Connection) scheduling in the VS patch for kernel 2.2 can solve the connection affinity problem in SSL.

When an SSL connection is made (port 443 for secure web servers, port 465 for secure mail servers), a key (session id) must be generated and exchanged between the server and the client. Later connections from the same client are granted by the server within the life span of the SSL key.

So PCC scheduling can make sure that once the SSL "session id" has been exchanged between the server and the client, later connections from the same client will be directed to the same server within the life span of the SSL key.

However, I haven't tested it myself. I will download Apache-SSL and test it sometime. Anyone who has tested it or is going to test it, please let me know the result, no matter whether it is good or bad. :-)

(a bit later)

(unknown)

I tested LVS with servers running Apache-SSL. The LVS used the VS patch for kernel 2.2.9 and the PCC scheduling. It worked without any problem.

SSL is a little bit different.

In use, the client sends a connection request to the server. The server returns a signed digital certificate. The client then authenticates the certificate using the digital signature and the public key of the CA.

If the certificate is not authentic, the connection is dropped. If it is authentic, the client generates a session key (say a) and sends it encrypted with the server's public key. This ensures only the server can read it, since decrypting it requires the server's private key. The server sends its session key (say b) encrypted with its private key; the client decrypts it with the server's public key and gets b.

Since both the client and the server now have a and b, they can generate the same session key based on a and b. Once they have the session key, they can use it to encrypt and decrypt the data in the communication. Since the data sent between the client and server is encrypted, it can't be read by anyone else.

Since the key exchange and generation is very time-consuming, for performance reasons, once the SSL session key has been exchanged and generated on one TCP connection, other TCP connections between the client and the server can reuse this session key within the life-span of the key.

So we have to make sure that connections from the same client are sent to the same server within the life-span of the key. That's why the PCC scheduling is used here.
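
A sketch of the corresponding pre-2.2.12 setup (example addresses); as noted above, pcc can be attached to just the https port:

ipvsadm -A -t 192.168.1.110:443 -s pcc
ipvsadm -a -t 192.168.1.110:443 -R 192.168.1.1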

About longer timeouts

felix k sheng felix@deasil.com and Ted Pavlic

2. The PCC feature... can I set the persistent connection timeout to something other than the default value? (I need to keep the client on the same server for 30 minutes at maximum.)

If people connecting to your application will contact your web server at least once every five minutes, setting that value to five minutes is fine. If you expect people to be idle for up to thirty minutes before contacting the server again, then feel free to change it to thirty minutes. Basically, remember that the clock is reset every time they contact the server again. Persistence lasts for as long as it's needed; it only dies after that many seconds pass without a connection from that address.

So if you really want to change it to thirty minutes, check out ip_vs_pcc.h - there should be a constant that defines how many seconds to keep the entry in the table. (I don't have access to a machine with IPVS on it at this location to give you anything more precise.)

I think this 30 minute idea is a web-specific timeout period. That is, the default timeout for cookies is 30 minutes, so many web sites use that value as the length of a given web "session". So if a user hits your site, stops and does nothing for 29 minutes, and then hits your site again, most places will consider that the same session - the same session cookies will still be in place. So it would probably be nice to have them going to the same server.
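
On kernels >= 2.2.12, where persistence is a switch rather than the pcc scheduler, there is nothing to recompile; the timeout (in seconds) is just the argument to -p (example VIP):

#30 minute persistence timeout
ipvsadm -A -t 192.168.1.110:80 -s wlc -p 1800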

8.10 Related to PPC - Sticky connections

(Joe, Jul 2001 - This is also quite old now too)

Wensong

Since there are many messages about passive ftp problem and sticky connection problem, I'd better send a separate message to make it clear.

In LinuxDirector (by default), we have assumed that each network connection is independent of every other connection, so that each connection can be assigned to a server independently of any past, present or future assignments. However, there are times when two connections from the same client must be assigned to the same server, either for functional or for performance reasons.

FTP is an example of a functional requirement for connection affinity. The client establishes two connections to the server: one is a control connection (port 21) to exchange command information, the other is a data connection (usually port 20) that transfers bulk data. For active FTP, the client informs the server of the port that it listens to, and the data connection is initiated by the server, from the server's port 20 to the client's port. LinuxDirector could examine the packets coming from the client for the port that the client listens to, and create an entry in the hash table for the coming data connection. But for passive FTP, the server tells the client the port that it listens to, and the client initiates the data connection by connecting to that port. For LVS-Tun and LVS-DR, LinuxDirector only sees the client-to-server half of the connection, so it is impossible for LinuxDirector to get the port from the packet that goes directly to the client.

SSL (Secure Socket Layer) is an example of a protocol with connection affinity between a given client and a particular server. When an SSL connection is made (port 443 for secure web servers, port 465 for secure mail servers), a key for the connection must be chosen and exchanged. Later connections from the same client are granted by the server within the life span of the SSL key.

Our current solution to client affinity is the persistent client connection scheduling in LinuxDirector. In PCC scheduling, when a client first accesses the service, LinuxDirector creates a connection template between the given client and the selected server, then creates an entry for the connection in the hash table. The template expires after a configurable time, but won't expire while it still has connections. Connections for any port from that client will be sent to the same server until the template expires. Although PCC scheduling may cause slight load imbalance among servers, it is a good solution to connection affinity.

The configuration example of PCC scheduling is as follows:

ipvsadm -A -t <VIP>:0 -s pcc
ipvsadm -a -t <VIP>:0 -R <your server>

BTW, PCC should not really be considered a scheduling algorithm. Conceptually, it is a feature of the virtual service port: the port is either persistent or not. I will write some code later to let the user specify whether a port is persistent or not.

(and what if a realserver holding a sticky connection crashes?)

Ted Pavlic tpavlic_list@netwalk.com

Is this a bug or a feature of the PCC scheduling...

A person connects to the virtual server and gets direct routed to a machine. Before the time set to expire persistent connections, that real machine dies. mon sees that the machine died, and deletes the realserver entries until it comes back up.

But now that same person tries to connect to the virtual server again, and PCC *STILL* schedules them to the non-existent realserver that is currently down. Is that a feature? I mean - I can see how it would be good for small outages, so that a machine could come back up really quickly and keep serving its old requests... YET for long outages those particular people will have no luck.

Wensong

You can set the timeout of the template masq entry to a small number now. It will then expire soon.

Or I will add some code to let each realserver entry keep a list of its template masq entries, and remove those template masq entries when the realserver entry is deleted.

To me, this seems most sensible. Lowering the timeouts has other effects, affecting general session persistence...

I agree with this. This was what I was hoping for when I sent the original message. I figure, if the server the person was connecting to went down, any persistence wouldn't be that useful when the server came back up. There might be temporary files in existence on that server that don't exist on another server, but otherwise - FTP or SSL or anything like that - the session might as well be brought up anew on another server.

Plus, any protocol that requires a persistent connection is probably one that the user will access frequently during one session. It makes more sense to bring that protocol up on another server than to wait for the old server to come back up - that will be more transparent to the user (even though they may have to completely re-connect once).

So, yes, deleting the entry when a real server goes down sounds like the best choice. I think you'll find most other load balancers do something similar to this.
