These topics were too short or not central enough to LVS operation to have their own section.
Multiple VIPs (and their associated services) can co-exist independently on an LVS. On the director, add the extra VIPs to a device facing the internet. On the realservers, for LVS-DR/LVS-Tun, add the VIPs to a device and set up services listening on the ports. On the realservers, for LVS-NAT, add the extra services to the RIP.
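As an illustration (a hedged sketch only - the addresses, the device names and the choice of an http service are hypothetical), a second VIP might be added to an LVS-DR director like this:

director:# ip addr add 192.168.1.111/24 dev eth0                    # extra VIP on the internet-facing device
director:# ipvsadm -A -t 192.168.1.111:80 -s rr                     # new virtual service on the new VIP
director:# ipvsadm -a -t 192.168.1.111:80 -r 192.168.1.11 -g -w 1   # forward to an existing realserver

On each LVS-DR realserver you would also add 192.168.1.111 to the usual non-arping device and have something listening on port 80.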
Tao Zhao, 6 Nov 2001
What if I need multiple VIPs on the realserver?
Julian Anastasov ja@ssi.bg
06 Nov 2001
for i in 180 182 182; do
    ip addr add X.Y.Z.$i dev dummy0
done
There is also an example for setting up multiple VIPs on HA.
On the realservers you can look with `netstat -an`. With LVS, the director also has information.
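For example (hedged - the exact commands depend on your kernel and ipvsadm version):

realserver:# netstat -an                  # connections as seen by a realserver
director:# ipvsadm -L -c -n               # the LVS connection table (newer ipvsadm)
director:# cat /proc/net/ip_vs_conn       # or read the table directly (2.4 kernels)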
malalon@poczta.onet.pl
18 Oct 2001
How do I know who is connecting to my LVS?
Julian
Milind Patil mpatil@iqs.co.in
24 Sep 2001
I want to limit the number of users accessing the LVS services at any given time. How can I do it?
Julian
- for non-NAT cluster (maybe stupid but interesting)
Maybe an array of policers, for example 1024 policers or a user-defined value (a power of 2). Each client hits one of the policers based on their IP/port. This is mostly a job for QoS ingress (even for a distributed attack), but maybe something can be done in LVS. Maybe we should develop a QoS ingress module? The key could be derived from CIP and CPORT, something similar to SFQ but without queueing. It could be implemented as a patch to the normal policer but with one extra argument: the real number of policers. This extended policer could then look into the TCP/UDP packets to redirect each packet to one of the real policers.
- for NAT only
Run an SFQ qdisc on your external interface(s); see the sketch after this list. It seems this is not a solution for the DR method. Of course, one can run SFQ on the uplink router.
- Linux 2.4 only
iptables has support to limit traffic, but I'm not sure whether it is useful for your requirements; see the sketch after this list. I assume you want to set a limit on each of these 1024 aggregated flows.
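Two sketches of the above; the commands are illustrative rather than tested LVS configurations. First, SFQ on the director's external interface (assumed here to be eth0) for the NAT case:

director:# tc qdisc add dev eth0 root sfq perturb 10    # fair queueing on the outbound device

Second, a 2.4 iptables rate limit on new connections to a virtual service. Note that this limits the aggregate SYN rate to the VIP, not the per-client flows Julian describes (LVS on 2.4 accepts packets for the VIP locally, so the rules go in the INPUT chain):

director:# iptables -A INPUT -p tcp --syn -d $VIP -m limit --limit 100/second --limit-burst 200 -j ACCEPT
director:# iptables -A INPUT -p tcp --syn -d $VIP -j DROP    # excess connection attempts are dropped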
Wenzhuo Zhang
Is anybody actually using the ingress policer for anti-DoS? I tried it several days ago using the script in the iproute2 package: iproute2/examples/SYN-DoS.rate.limit. I've tested it against different 2.2 kernels (2.2.19-7.0.8(redhat kernel), 2.2.19, 2.2.20preX, with all QoS related functions either compiled into the kernel or as modules) and different versions of iproute2. In all cases, tc fails to install the ingress qdisc policer:
root@panda:~# tc qdisc add dev eth0 handle ffff: ingress
RTNETLINK answers: No such file or directory
root@panda:~# /tmp/tc qdisc add dev eth0 handle ffff: ingress
RTNETLINK answers: No such file or directory
Julian
For 2.2, you need the ds-8 package, at Package for Differentiated Services on Linux. Compile tc by setting TC_CONFIG_DIFFSERV=y in Config. The right command is:
tc qdisc add dev eth0 ingress

Ratz
For 2.4, ingress is in the kernel, but it is still unusable for more than one device (look in linux-netdev for reference).
The 2.2.x version is not supported anymore. The advanced routing documentation says to use only 2.4.
from Ratz ratz@tac.ch
We're going to set up an LVS cluster from scratch. Here's what you need to do.
The goal is to set up our own load-balanced TCP application. The application will consist of a shell script of our own, invoked by inetd. As you might have guessed, security is a very low priority here; you should just get the idea behind this. Of course I should use xinetd, and of course I should use a tcpwrapper and maybe even SecurID authentication, but here the goal is to understand the fundamental design principles of an LVS cluster and its deployment. All instructions will be done as root.
Setting up the realserver
Edit /etc/inetd.conf and add the following line:

lvs-test stream tcp nowait root /usr/bin/lvs-info lvs-info

Edit /etc/services and add the following line:

lvs-test 31337/tcp # supersecure lvs-test port
Now you need to get inetd running. This is different for every Unix, so please have a look at it yourself. You can verify that it's running with 'ps ax | grep [i]netd'. To verify that it really listens on this port, do a 'netstat -an | grep LISTEN', and if there is a line:
tcp 0 0 0.0.0.0:31337 0.0.0.0:* LISTEN
you're one step closer to the truth. Now we have to supply the script that will be called when you connect to the realserver's port 31337. So simply do this on your command line (copy 'n' paste):
cat > /usr/bin/lvs-info << 'EOF' && chmod 755 /usr/bin/lvs-info
#!/bin/sh
echo "This is a test of machine `ifconfig eth0 | grep 'inet addr' | awk -F'[: ]+' '{print $4}'`"
echo
EOF

(The heredoc delimiter is quoted so that the backticks end up in the script rather than being expanded when the file is written, and the pipeline extracts the machine's IP, matching the sample output below.)
Now you can test if it really works with telnet or netcat:
telnet localhost 31337
netcat localhost 31337
This should spill out something like:
hog:/ # netcat localhost 31337
This is a test of machine 192.168.1.11
hog:/ #
If it worked, do the same procedure to set up the second realserver. Now we're ready to set up the load balancer. These are the required commands to set it up for our example:
ipvsadm -A -t 192.168.1.100:31337 -s wrr
ipvsadm -a -t 192.168.1.100:31337 -r 192.168.1.11 -g -w 1
ipvsadm -a -t 192.168.1.100:31337 -r 192.168.1.12 -g -w 1
Check it with ipvsadm -L -n:
hog:~ # ipvsadm -L -n
IP Virtual Server version 0.9.14 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  192.168.1.100:31337 wrr
  -> 192.168.1.12:31337          Route   1      0          0
  -> 192.168.1.11:31337          Route   1      0          0
hog:~ #
Now if you connect from outside with the client node to the VIP=192.168.1.100 you should get to one of the two realservers (presumably to .12). Reconnect to the VIP again and you should get to the other realserver. If so, be happy; if not, go back and check netstat -an, ifconfig -a, the arp problem, routing tables and so on ...
I want to use virtual server functionality to allow switching over from one pool of server processes to another without an interruption in service to clients.
Michael Sparks sparks@mcc.ac.uk
current realservers: A, B, C
servers to swap into the system instead: D, E, F
- Add servers D,E,F into the system all with fairly high weights (perhaps ramping the weights up slowly so as not to hit them too hard:-)
- Change the weights of servers A,B,C to 0.
- All new traffic should now go to D,E,F
- When the number of connections through A,B,C reaches 0, remove them from the service. This can take time I know but...
from Joe
ipvsadm lets you give a realserver a weight of 0 (this was a planned feature, now implemented). Such a realserver will not be sent any new connections and will continue serving its current connections till they close. You may have to wait a while if a user is downloading a 40M file from the realserver.
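A sketch of quiescing a realserver, reusing the example service from the walkthrough above:

director:# ipvsadm -e -t 192.168.1.100:31337 -r 192.168.1.11 -g -w 0   # weight 0: no new connections
director:# ipvsadm -d -t 192.168.1.100:31337 -r 192.168.1.11           # later, once ActiveConn reaches 0, remove it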
e.g. if you want to test LVS on your BIG Sun server; this also shows how to restore an LVS to a single node server again.
current ftp server: standalone A
planned LVS (using LVS-DR): realserver A, director Z
Setup the LVS in the normal way with the director's VIP being a new IP for the network. The IP of the standalone server will now also be the IP for the realserver. You can access the realserver via the VIP while the outside users continue to connect to the original IP of A. When you are happy that the VIP gives the right service, change the DNS IP of your ftp site to the VIP. Over the next 24hrs as the new DNS information is propagated to the outside world, users will change over to the VIP to access the server.
To expand the number of servers (to A, B, ...), add another server with duplicated files and add an extra entry into the director's tables with ipvsadm, as shown below.
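A sketch of that extra entry, assuming the LVS'ed service is ftp and <RIP_B> is the (hypothetical) address of the new server B:

director:# ipvsadm -a -t <VIP>:21 -r <RIP_B> -g -w 1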
To restore - in your DNS, change the IP for the service to the realserver IP. When no-one is accessing the VIP anymore, unplug the director.
You can't shut down an LVS as such. However you can stop it forwarding by clearing the ipvsadm table (ipvsadm -C), then allow all connections to expire (check the active connections with ipvsadm) and then remove the ipvs modules (rmmod). Since ip_vs_rr.o etc depend on ip_vs.o, you'll have to remove ip_vs_rr.o first.
Do you know how to shutdown LVS? I tried rmmod but it keeps saying that the device is busy.
Kjetil Torgrim Homme kjetilho@linpro.no
18 Aug 2001
Run ipvsadm -C. You also need to remove the module(s) for the balancing algorithm(s) before rmmod ip_vs. Run lsmod to see which modules these are.
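Putting this together, the whole shutdown sequence looks like this (module names as for a stock 2.4 LVS; check lsmod for what you actually have loaded):

director:# ipvsadm -C          # clear the virtual server table
director:# ipvsadm -L -n       # repeat until ActiveConn/InActConn have drained
director:# rmmod ip_vs_rr      # scheduler module(s) first (ip_vs_wrr, ip_vs_lc, ... as loaded)
director:# rmmod ip_vs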
The difference between a beowulf and an LVS:
The Beowulf project has to do with processor clustering over a network -- parallel computing... Basically putting up 64 nodes that are all part of a collective of resources. Like SMP -- but between a whole bunch of machines with fast ethernet as a backplane.
LVS, however, is about load-balancing on a network. Someone puts up a load balancer in front of a cluster of servers. Each one of those servers is independent and knows nothing about the rest of the servers in the farm. All requests for services go to the load balancer first. That load balancer then distributes requests to each server. Those servers respond as if the request came straight to them in the first place. So -- with the more servers one adds -- the less load goes to each server.
A person might go to a web site that is load balanced, and their requests would be balanced between four different machines. (Or perhaps all of their requests would go to one machine, and the next person's request would go to another machine)
However, a person who used a Beowulf system would actually be using one processing collaborative that was made up of multiple computers...
I know that's not the best explanation of each, and I apologize for that, but I hope it at least starts to make things a little clearer. Both projects could be expanded on to a great extent, but that might just confuse things farther.
(Joe) -
both use several (or a lot of) nodes.
A beowulf is a collection of nodes working on a single computation. The computation is broken into small pieces and passed to the nodes, each of which replies with its result. Eventually the whole computation is done. The beowulf usually has a single user and the computations can run for weeks.
An LVS is a group of machines offering a service to a client. A dispatcher connects the client to a particular server for the request. When the request is completed, the dispatcher removes the connection between the client and server. The next request from the same client may go to a different server, but the client cannot tell which server it has connected to. The connection between client and server may only be seconds long.
from a posting to the beowulf mailing list by Alan Heirich -
Thomas Sterling and Donald Becker made "Beowulf" a registered service mark with specific requirements for use:
-- Beowulf is a cluster
-- the cluster runs Linux
-- the O/S and driver software are open source
-- the CPU is multiple sourced (currently, Intel and Alpha)
I assume they did this to prevent profit-hungry vendors from abusing this term; can't you just imagine Micro$oft pushing a "Beowulf" NT-cluster?
(Joe - I looked up the Registered Service Marks on the internet and Beowulf is not one of them.)
(Wensong) Beowulf is for parallel computing, Linux Virtual Server is for scalable network services.
They are quite different now. However, I think they may be unified under "single system image" some day. In a "single system image", every node sees a single system image (the same memory space, the same process space, the same external storage), and processes/threads can be transparently migrated to other nodes in order to achieve load balance in the cluster. All the processes are checkpointed, so they can be restarted on the same node or on other nodes if they fail; full fault tolerance can be achieved here. It will be easy for programmers to code because of the single space; they don't need to statically partition jobs to different sites and let them communicate through PVM or MPI. They just need to identify the parallelism of their scientific application and fork processes or generate threads, because processes/threads will be automatically load balanced over different nodes. For network services, the service daemons just need to fork processes or generate threads; it is quite simple. I think it needs lots of investigation into how to implement these mechanisms and make the overhead as low as possible.
What Linux Virtual Server has done is very simple: Single IP Address, in which parallel services on different nodes appear as a virtual service on a single IP address. The different nodes have their own space; it is far from "single system image". It means that we have a long way to run. :)
Eddie http://www.eddieware.org
(Jacek Kujawa blady@cnt.pl)
Eddie is load balancing software for webservers, using NAT (only NAT), written in the language Erlang. Eddie includes an intelligent HTTP gateway and Enhanced DNS.
(Joe) Erlang is a language for writing distributed applications.
Martin Seigert at Simon Fraser U posted benchmarks for various NICs to the beowulf mailing list. The conclusion was that for fast CPUs (i.e. 600MHz, which can saturate 100Mbps ethernet) the 3c95x and tulip cards were equivalent. For slower CPUs (166MHz), which cannot saturate 100Mbps ethernet, the on-board processing on the 3Com cards allowed marginally better throughput.
If you are going into production, you should test that your NIC works well with your hardware. Give it a good exercising with a netpipe test (see the performance page).
I use Netgear FA310TX (tulip), and eepro100 for single port NICs. The related FA311 card seems to be Linux incompatible (postings to the beowulf mailing list), currently (Jul 2001) requiring a driver from Netgear (this was the original situation with the FA310 too). I also use a quad DLink DFE-570TX (tulip) on the director. I'm happy with all of them.
The eepro100 has problems, as Intel seems to change the hardware without notice and the linux driver writers have trouble handling all the versions of hardware. One kernel (2.2.18?) didn't work with the eepro100. I bought all of my eepro100's at once and presumably they are identical. There have been a relatively large number of postings from people with eepro100 problems on the LVS mailing list.
Linux with an eepro100 can't pass more than 2^31-1 packets. This may no longer be a problem, but the eepro100 driver has had problems in a few kernels.
(This is from 1999 I think)
Jerry Glomph Black black@real.com
Subject: 2-billion-packet bug?

I've seen several 2.2.12/2.2.13 machines lose their network connections after a long period of fine operation. Tonight our main LVS box fell off the net. I visited the box; it had not crashed at all. However, it was not communicating via its (Intel eepro100) ethernet port.
The evil evidence:
eth0   Link encap:Ethernet  HWaddr 00:90:27:50:A8:DE
       inet addr:172.16.0.20  Bcast:172.16.255.255  Mask:255.255.0.0
       UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
       RX packets:15 errors:288850 dropped:0 overruns:0 frame:0
       TX packets:2147483647 errors:1 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:100
       Interrupt:10 Base address:0xd000

Check out the TX packets number! That's 2^31-1. Prior to the rollover, in-and-out packets were roughly equal. I think this has happened to non-LVS systems as well, on 2.2 kernels. ifconfigging eth0 down-and-up did nothing. A reboot (ugh) was necessary.
It's still happening 2yrs later. This time the counter stops, but the network is still functional.
Hendrik Thiel thiel@falkag.de
20 Nov 2001
Using lvs with eepro100 cards (kernel 2.2.17), we encountered a TX packets value stopping at 2147483647 (2^31-1); that's what ifconfig tells... the system still runs fine ...
It seems to be an ifconfig bug. Check out the TX packets number! That's 2^31-1.
eth0   Link encap:Ethernet  HWaddr 00:90:27:50:A8:DE
       inet addr:172.16.0.20  Bcast:172.16.255.255  Mask:255.255.0.0
       UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
       RX packets:15 errors:288850 dropped:0 overruns:0 frame:0
       TX packets:2147483647 errors:1 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:100
       Interrupt:10 Base address:0xd000
Simon A. Boggis
Hmmm, I have a couple of eepro100-based linux routers - the one that's been up the longest is working fine (167 days, kernel 2.2.9) but the counters are jammed - for example, `ifconfig eth0' gives:
eth0   Link encap:Ethernet  HWaddr 00:90:27:2A:55:48
       inet addr:138.37.88.251  Bcast:138.37.88.255  Mask:255.255.255.0
       IPX/Ethernet 802.2 addr:8A255800:0090272A5548
       UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
       RX packets:2147483647 errors:41 dropped:0 overruns:1 frame:0
       TX packets:2147483647 errors:13 dropped:0 overruns:715 carrier:0
       Collisions:0
       Interrupt:15 Base address:0xa000

BUT /proc/net/dev reports something more believable:
hammer:/# cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0:2754574912 2177325200 41 0 1 0 0 0 2384782514 3474415357 13 0 715 0 0 0
That's RX packets: 2177325200 and TX packets: 3474415357, compared to 2147483647 from ifconfig eth0.
(Joe, Nov 2001: I don't know if this is still a problem; we haven't heard any more about it and haven't had any other tulip problems, unlike the eepro100.)
John Connett jrc@art-render.com
05 May 1999
Any suggestions as to how to narrow it down? I have an Intel EtherExpress PRO 100+ and a 3COM 3c905B which I could try instead of the KNE 100TX to see if that makes a difference.
A tiny light at the end of the tunnel! Just tried an Intel EtherExpress PRO 100+ and it works! Unfortunately, the hardware is fixed for the application I am working on and has to use a single Kingston KNE 100TX NIC ...
Some more information. The LocalNode problem has been observed with both the old style (21140-AF) and the new style (21143-PD) of Kingston KNE 100TX NIC. This suggests that there is a good chance that it will be seen with other "tulip" based NICs. It has been observed with both the "v0.90 10/20/98" and the "v0.91 4/14/99" versions of tulip.c.
I have upgraded to vs-0.9 and the behaviour remains the same: the EtherExpress PRO 100+ works; the Kingston KNE 100TX doesn't work.
It is somewhat surprising that the choice of NIC should have this impact on the LocalNode behaviour but work successfully on connections to slave servers.
Any suggestions as to how I can identify the feature (or bug) in the tulip driver would be gratefully received. If it is a bug I will raise it on the tulip mailing list.
(now handled by code added to the scheduler)
From: Christopher Seawood cls@aureate.com
LVS seems to work great until a server goes down (this is where mon comes in). Here's a couple of things to keep in mind. If you're using the Weighted Round-Robin scheduler, then LVS will still attempt to hit the server once it goes down. If you're using the Least Connections scheduler, then all new connections will be directed to the down server because it has 0 connections. You'd think using mon would fix these problems, but not in all cases.
Adding mon to the LC setup didn't help matters much. I took one of three servers out of the loop and waited for mon to drop the entry. That worked great. When I started the server back up, mon added the entry. During that time, the 2 running servers had gathered about 1000 connections apiece. When the third server came back up, it immediately received all of the new connections. It kept receiving all of the connections until it had an equal number of connections to the other servers (which by this time... a minute or so later... had fallen to 700). By this time, the 3rd server had been restarted due to triggering a high load sensor also monitoring the machine (a necessary evil, or so I'm told). At this point, I dropped back to using WRR, as I could envision the cycle repeating itself indefinitely.
(this must have been solved, no-one is complaining about memory leaks now :-)
Jerry Glomph Black black@real.com
We have successfully used 2.0.36-vs (direct routing method), but it does fail at extremely high loads. Seems like a cumulative effect, after about a billion or so packets forwarded. Some kind of kernel memory leak, I'd guess.
The thing I have found out is that on Solaris 2.6, and probably other versions of Solaris, you have to do some magic to get the loopback alias set up. You must run the following commands one at a time:
ifconfig lo0:1 <VIP>
ifconfig lo0:1 <VIP> <VIP>
ifconfig lo0:1 netmask 255.255.255.255
ifconfig lo0:1 up
This works well and is actually a pointopoint link like ppp, which must be the way Solaris defines aliases to the lo interface. It will not let you do this all at once, just each step at a time, or you have to start over from scratch on the interface.
Chris Kennedy, I-Land Internet Services ckennedy@iland.net
Keith Rowland wrote:
Can I use Virtual Server to host multiple domains on the cluster? Can VS be set up to respond to 10-20 different IP addresses and use the clusters to respond to any one of them with the proper web directory?
James CE Johnson jjohnson@mobsec.com
If I understand the question correctly, then the answer is yes :-) I have one system that has two IP addresses and responds to two names:
foo.mydomain.com   A.B.C.foo   eth1
bar.mydomain.com   A.B.C.bar   eth1:0
On that system (kernel 2.0.36 BTW) I have LVS setup as:
ippfvsadm -A -t A.B.C.foo:80 -R 192.168.42.50:80
ippfvsadm -A -t A.B.C.bar:80 -R 192.168.42.100:80
To make matters even more confusing, 192.168.42.(50|100) are actually one system where eth0 is 192.168.42.100 and eth0:0 is 192.168.42.50. We'll call that 'node'.
Apache on 'node' is setup to serve foo.mydomain.com on ...100 and bar.mydomain.com on ...50.
It took me a while to sort it out but it all works quite nicely. I can easily move bar.mydomain.com to another node within the cluster by simply changing the ippfvsadm setup on the externally addressable node.
On a normal LVS (one director, multiple realservers being failed-over with mon), the single director is a SPOF (single point of failure). Director failure can be handled (in principle) with heartbeat, but no-one is doing this yet. In the meantime, you can have two directors each with their own VIP known to the users and set them up to talk to the same set of realservers. (You can have two VIP's on one director box too). (The configure.pl script doesn't handle this yet.)
Michael Sparks michael.sparks@mcc.ac.uk
Also, has anyone tried this using 2 or more masters - each master with its own IP? (*) From what I can see, theoretically all you should have to do is have one master on IP X, tunneled to clients who receive stuff via tunl0, and another master on IP Y, tunneled to clients on tunl1 - except when I just tried doing that I couldn't get the kernel to accept the concept of a tunl1... Is this a limitation of the IPIP module?
Stephen D. Williams sdw@lig.net
Do aliasing. I don't see a need for tunl1. In fact, I just throw a dummy address on tunl0 and do everything with tunl0:0, etc.
We plan to run at least two LinuxDirector/DR systems with failover for moving the two (or more) public IP's between the systems. We also use aliased, movable IP's for the real server addresses so that they can failover also.
There are two types of clients on realservers from the point of view of LVS.
Both types of clients require the same understanding of LVS, but because the first case is simple, it is discussed here. The second case has all sorts of ramifications for LVS and for that reason is discussed in the section on authd.
You might have valid reasons for running clients on realservers, e.g. so that the sysadmin can telnet to a remote site. The way to allow clients on the realservers to connect to outside servers is to configure these requests so that they are independent of the LVS setup (you do have to use the network and default gw set by the LVS). The solution is to NAT the client requests.
This is simple
Here's the command to run on a 2.2.x director to allow realserver1 to telnet to the outside world.
director:# ipchains -A forward -p tcp -j MASQ -s realserver1 -d 0.0.0.0/0 telnet
You may have to turn off icmp redirects, if you have a one network LVS-NAT.
director:# echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects
director:# echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects
After running this command you can telnet from the realservers. You can do this even if telnet is an LVS'ed service, since the telnet client and demon operate independently of each other. You can NAT the rsh and identd clients in the same way (replace telnet with rsh/identd and clients on the realserver can connect to their demons on outside machines).
In general this has not been solved. Calls initiated by the identd client on a realserver will come from the VIP, not the RIP. Some hare-brained schemes have been tried but did not work (NAT'ing out the request from the VIP, so that it emerges from the realserver with src_addr=RIP and then NAT'ing the packet again on the director, so it emerges with src_addr=VIP).
There are specific solutions
In LVS-DR/LVS-Tun, this works if the client and RIP are on the same network. Usually the RIPs on LVS-DR realservers are private addresses; however if the LVS clients and the LVS are all local and on the same network, this will work.
Clients not associated with the LVS'ed services (i.e. telnet, even if telnetd is LVS'ed, but not authd or rshd) can still be NAT'ed out, since the connect request will come from the RIP and not the VIP. Since the default gw for the realserver in LVS-DR is not the director, you can handle this 2 ways:
Laurent Lefoll Laurent.Lefoll@mobileway.com
14 Feb 2001
What is the usefulness of the ICMP packets that are sent when new packets arrive for a TCP connection that has timed out in the LVS box? I understand it obviously for UDP, but I don't see their role for a TCP connection...
Julian
I assume your question is about the reply after ip_vs_lookup_real_service.
It is used to remove the open request in SYN_RECV state in the realserver. LVS replies for more states, and maybe some OSes report them as soft errors (Linux), while others may report them as hard errors, who knows.
It's about ICMP packets from an LVS-NAT director to the client. For example, a client accesses a TCP virtual service and then stops sending data for a long time, long enough for the LVS entry to expire. When the client tries to send new data over this same TCP connection, the LVS box sends ICMP (port unreachable) packets to the client. For a TCP connection, how do these ICMP packets "influence" the client? It will stop sending packets to this expired (for the LVS box...) TCP connection only after its own timeouts, won't it?
By default TCP replies with RST to the client when there is no existing socket. LVS does not keep info for already expired connections, so we can only reply with an ICMP rather than sending a TCP RST. (If we implemented TCP RST replies, we could reply with TCP RST instead of ICMP.)
What does the client do with this ICMP packet? By default, the application does not listen for ICMP errors and they are reported as soft errors after a TCP timeout and according to the TCP state. Linux at least allows the application to listen for such ICMP replies. The application can register for these ICMP errors and detect them immediately as they are received by the socket. It is not clear whether it is a good idea to accept such information from untrusted sources. ICMP errors are reported immediately for some TCP (SYN) states.
Joseph Mack, 16 Mar 2001
I'm looking at packets after they've been accepted by TP (transparent proxy) and I'm using (among other things) tcpdump.
Where in the netfilter chain does tcpdump look at incoming and outgoing packets? When they are put on/received from the wire? After the INPUT, before the OUTPUT chain...?
Julian
Before/after any netfilter chains. Such programs hook at packet level before/after the IP stack just before/after the packet is received/must be sent from/by the device. They work for other protocols. tcpdump is a packet receiver just like the IP stack is in the network stack.
(without bringing them all down)
Problem: if you down/delete an aliased device (e.g. eth0:1) you also bring down the other eth0 devices. This means that you can't bring down an alias remotely, as you lose your connection (eth0) to that machine. You then have to go to the console of the remote machine to fix it by rmmod'ing the device driver for the device and bringing it up again.
The configure script handles this for you and will exit (with instructions on what to do next) if it finds that an aliased device needs to be removed by rmmod'ing the module for the NIC.
(I'm not sure that all of the following is accurate, please test yourself first).
(Stephen D. Williams sdw@lig.net)
Whenever you want to down/delete an alias, first set its netmask to 255.255.255.255. This avoids also automatically downing aliases that are on the same netmask and are considered 'secondaries' by the kernel.
(Joe) To bring up an aliased device
$ifconfig eth0:1 192.168.1.10 netmask 255.255.255.0
to bring eth0:1 down without taking out eth0, you do it in 2 steps, first change the netmask
$ifconfig eth0:1 192.168.1.10 netmask 255.255.255.255
then down it
$ifconfig eth0:1 192.168.1.10 netmask 255.255.255.255 down
then eth0 device should be unaffected, but the eth0:1 device will be gone.
This works on one of my machines but not on another (both with 2.2.13 kernels). I will have to look into this. Here's the output from the machine for which this procedure doesn't work.
Examples: starting setup. The realserver's regular IP/24 is on eth0, the VIP/32 on eth0:1, and another IP/24 (for illustration) on eth0:2. The machine is SMP, 2.2.13, net-tools 1.49.
chuck:~# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:90:27:71:46:B1
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:6071219 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6317319 errors:0 dropped:0 overruns:4 carrier:0
          collisions:757453 txqueuelen:100
          Interrupt:18 Base address:0x6000
eth0:1    Link encap:Ethernet  HWaddr 00:90:27:71:46:B1
          inet addr:192.168.1.110  Bcast:192.168.1.110  Mask:255.255.255.255
          UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
          Interrupt:18 Base address:0x6000
eth0:2    Link encap:Ethernet  HWaddr 00:90:27:71:46:B1
          inet addr:192.168.1.240  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
          Interrupt:18 Base address:0x6000
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:299 errors:0 dropped:0 overruns:0 frame:0
          TX packets:299 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

chuck:~# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.1.110   0.0.0.0         255.255.255.255 UH        0 0          0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         192.168.1.1     0.0.0.0         UG        0 0          0 eth0

Deleting eth0:1 with netmask /32

chuck:~# ifconfig eth0:1 192.168.1.110 netmask 255.255.255.255 down
chuck:~# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:90:27:71:46:B1
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:6071230 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6317335 errors:0 dropped:0 overruns:4 carrier:0
          collisions:757453 txqueuelen:100
          Interrupt:18 Base address:0x6000
eth0:2    Link encap:Ethernet  HWaddr 00:90:27:71:46:B1
          inet addr:192.168.1.240  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
          Interrupt:18 Base address:0x6000
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:299 errors:0 dropped:0 overruns:0 frame:0
          TX packets:299 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

If you do the same thing with eth0:2 with the /24 netmask

chuck:~# ifconfig eth0:2 192.168.1.240 netmask 255.255.255.0 down
chuck:~# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:90:27:71:46:B1
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING ALLMULTI MULTICAST  MTU:1500  Metric:1
          RX packets:6071237 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6317343 errors:0 dropped:0 overruns:4 carrier:0
          collisions:757453 txqueuelen:100
          Interrupt:18 Base address:0x6000
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:299 errors:0 dropped:0 overruns:0 frame:0
          TX packets:299 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
tunl0     Link encap:IPIP Tunnel  HWaddr
          unspec addr:[NONE SET]  Mask:[NONE SET]
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
LVS has been tested with a 100Mbit/sec syn-flooding attack by Alan Cox and Wensong.
Each connection requires 128 bytes, so a machine with 128M of free memory can hold 1M concurrent connections. An average connection lasts 300 secs. Connections which have received just the syn packet are expired in 30 secs (starting with ipvs 0.8). An attacker would have to initiate 3k connections/sec (600Mbps) to maintain the memory at the 128M mark, and would require several T3 lines to keep up the attack.
Only one CPU can be in the kernel with 2.2. Since LVS is all kernel code, there is no benefit to LVS from using SMP with 2.2.x. Kernels 2.[3-4] can use multiple CPUs in the kernel. While standard (300MHz pentium) directors can easily handle 100Mbps networks, they cannot handle an LVS at Gbps speeds. Either SMP directors with 2.4.x kernels, or multiple directors (each with a separate VIP, all pointing to the same realservers), are needed.
Since LVS-NAT requires computation on the director (to rewrite the packets) not needed for LVS-DR and LVS-Tun, SMP would help throughput.
Joe
If you're using LVS-NAT then you'll need a machine that can handle the full bandwidth of the expected connections. If this is T1, you won't need much of a machine. If it's 100Mbps you'll need more (I can saturate 100Mbps with a 75MHz machine). If you're running LVS-DR or LVS-Tun you'll need less horse power. Since most LVS work is I/O, I suspect that SMP won't get you much. However if the director is doing other things too, then SMP might be useful.
Julian Anastasov uli@linux.tu-varna.acad.bg
Yep, LVS in 2.2 can't use both CPUs. This is not an LVS limitation. It is already solved in the latest 2.3 kernels: softnet. If you are using the director as a realserver too, SMP is recommended.
Pat O'Rourke orourke@mclinux.com
03 Jan 2000
In our experiments we've been seeing an SMP director perform significantly worse than a uni-processor one (using the same hardware - the only difference was booting an SMP kernel or a uni-processor one).
We've been using a 2.2.17 kernel with the 1.0.2 LVS patch and bumped the send/recv socket buffer memory to 1MB for both the uni-processor and SMP scenarios. The director is an Intel based system with 550 MHz Pentium IIIs.
In some tests I've done with FTP, I have seen *significant* improvements using dual and quad processors using 2.4. Under 2.2, there are improvements, but not astonishing ones.
Things like 90% saturation of a Gig link using quad processors, 70% using dual processors and 55% using a single processor under 2.4.0test. Really amazing improvements.
Michael E Brown michael_e_brown@dell.com
26 Dec 2000
What are the percentage differences on each processor configuration between 2.2 and 2.4? How does a 2.2 system compare to a 2.4 system on the same hardware?
I haven't had much of a chance to do a full comparison of 2.2 vs 2.4, but most of the evidence on tests that I have run points to a > 100% improvement for *network intensive* tasks.
Michael Sparks
It's useful for the director to have 3 IP addresses: one which is the real machine's base IP address, one which is the virtual service IP address, and then another virtual IP address for servicing the director. The reason for this is associated with director failover.
Suppose:
There is information on the Squid site about tuning a squid box for performance. I've lost the original URL, but here's one about file descriptors and another by Joe Cooper (occasional contributor to the LVS mailing list) that also addresses the FD_SETSIZE problem (i.e. not enough filedescriptors). The squid performance information should apply to an LVS director. For a 100Mbps network, current PC hardware on a director can saturate the network without these optimizations. However current single processor hardware cannot saturate a 1Gbps network, and optimizations are helpful. The squid information is as good a place to start as any.
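As a starting point, the usual squid-style file descriptor fixes look something like this (the values are illustrative, not recommendations):

director:# echo 65536 > /proc/sys/fs/file-max    # system-wide file descriptor limit
director:# ulimit -n 8192                        # per-process limit, in the shell that starts the demon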
Here's some more info
Michael E Brown michael_e_brown@dell.com
29 Dec 2000
How much memory do you have? How fast are your network links? There are some kernel parameters you can tune in 2.2 that help out, and there are even more in 2.4. Off the top of my head:
1) /proc/sys/net/core/*mem* <-- tune to your memory spec. The defaults are not optimized for network throughput on large memory machines.
2) 2.4 only: /proc/sys/net/ipv4/*mem* (see the sketch after this list)
3) For fast links with multiple adapters (two gig links, dual CPU), 2.4 has NIC-->CPU IRQ binding. That can also really help on heavily loaded links.
4) For 2.2, I think I would go into your BIOS or RCU (if you have one) and hardcode all NIC adapters (assuming identical/multiple NICs) to the same IRQ. You get some gain due to cache affinity, and one interrupt may service IRQs from multiple adapters in one go on heavily loaded links.
5) Think "interrupt coalescing". Figure out how your adapter driver turns this on and do it. If you are using Intel gig links, I can send you some info on how to tune it. Acenic gig adapters are pretty well documented.
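The sketch referred to in items 1 and 2 (values are illustrative, for a large-memory machine):

director:# echo 4194304 > /proc/sys/net/core/rmem_max               # max socket receive buffer
director:# echo 4194304 > /proc/sys/net/core/wmem_max               # max socket send buffer
director:# echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem  # 2.4 only: min/default/max tcp receive memory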
For a really good tuning guide, go to spec.org, and look up the latest TUX benchmark results posted by Dell. Each benchmark posting has a full list of kernel parameters that were tuned. This will give you a good starting point from which to examine your configuration.
The other obvious tuning recommendation: Pick a stable 2.4 kernel and use that. Any (untuned) 2.4 kernel will blow away 2.2 in a multiprocessor configuration. If I remember correctly 2.4.0test 10-11 are pretty stable.
Some information is on
http://www.LinuxVirtualServer.org/lmb/LVS-Announce.html
This isn't particularly comprehensive. We don't pester people for testimonials, as we don't want to scare people away from posting to the mailing list and we don't want inflated praise. People seem to understand this and don't pester us with their performance data either. The quotes below aren't scientific data, but it is nice to hear. The people who don't like LVS presumably go somewhere else, and we don't hear any complaints from them.
"Daniel Erdös" 2 Feb 2000How many connections did you really handled? What are your impressions and experiences in "real life"? What are the problems?
Michael Sparks zathras@epsilon3.mcc.ac.uk
Problems - LVS provides a load balancing mechanism, nothing more, nothing less, and does it *extremely* well. If your back end realservers are flaky in any way, then unless you have monitoring systems in place to take those machines out of service as soon as there are problems with those servers, users will experience glitches in service.
NB, this is essentially a real server stability issue, not an LVS issue - you'd need good monitoring in place anyway if you weren't using LVS!
Another plus in LVS's favour over the commercial boxes in something like this is the fact that the load balancer is a Unix type box - meaning your monitoring can be as complex or simple as you like. For example, load balancing based on wlc could be supplemented by server info sent to the director.
Drew Streib ds@varesearch.com
23 Mar 2000
I can vouch for all sorts of good performance from lvs. I've had single processor boxes handle thousands of simultaneous connections without problems, and yes, the 50,000 connections per second number from the VA cluster is true.
lvs powers SourceForge.net, Linux.com, Themes.org, and VALinux.com. SourceForge uses a single lvs server to support 22 machines, multiple types of load balancing, and an average 25Mbit/sec traffic. With 60Mbit/sec of traffic flowing through the director (and more than 1000 concurrent connections), the box was having no problems whatsoever, and in fact was using very little cpu.
Using DR mode, I've sent request traffic to a director box resulting in near gigabit traffic from the realservers. (Request traffic was on the order of 40Mbit.)
I can say without a doubt that lvs toasts F5/BigIP solutions, at least in our real world implementations. I wouldn't trade a good lvs box for a Cisco Local Director either.
> The 50,000 figure is unsubstantiated and was _not_ claimed by anyone at VA
> Linux Systems. A cluster with 16 apache servers and 2 LVS servers was
> configured for Linux World New York but due to interconnect problems the
> performance was never measured - we weren't happy with the throughput of the
> NICs so there didn't seem to be a lot of point. This problem has been
> resolved and there should be an opportunity to test this again soon.

In recent tests, I've taken multinode clusters to tens of thousands of connections per second. Sorry for any confusion here. The exact 50,000 number from LWCE NY is unsubstantiated.
Jerry Glomph Black black@real.com
23 Mar 2000
We ran a very simple LVS-DR arrangement with one PII-400 (2.2.14 kernel) directing about 20,000 HTTP requests/second to a bank of about 20 web servers answering with tiny identical dummy responses for a few minutes. Worked just fine.
Now, at more terrestrial, but quite high real-world loads, the systems run just fine, for months on end. (using the weighted-least-connection algorithm, usually).
We tried virtually all of the commercial load balancers, LVS beats them all for reliability, cost, manageability, you-name-it.
Noma wrote Nov 2000
Are you going to implement TLS(Transport Layer Security) Ver1.0 on LVS?
Wensong
I haven't read the TLS protocol, so I don't know if TLS transmits the IP address and/or port number in the payload. In most cases it should not, because SSL doesn't.
If it doesn't, you can use any of the three VS/NAT, VS/TUN and VS/DR methods. If it does, VS/TUN and VS/DR can still work.
Ted Pavlic tpavlic@netwalk.com
, Nov 2000
I don't see any reason why LVS would have any bearing on TLS. As far as LVS was concerned, TLS connections would just be like any other connections.
Perhaps you are referring to HTTPS over TLS? Such a protocol has not been completed yet in general, and when it does it still will not need any extra work to be done in the LVS code.
The whole point of TLS is that one connects to the same port as usual and then "upgrades" to a higher level of security on that port. All the secure logic happens at a level so high that LVS wouldn't even notice a change. Things would still work as usual.
Julian Anastasov ja@ssi.bg
This is an end-to-end protocol layered on another transport protocol. I'm not a TLS expert but as I understand TLS 1.0 is handled just like the SSL 3.0 and 2.0 are handled, i.e. they require only a support for persistent connections.
David Lambe david.lambe@netunlimited.com
Mon, 13 Nov 2000
I've recently completed "construction" of an LVS cluster consisting of 1 LVS director and 3 realservers. Everything seems to work OK with the setup except for rcp. All it ever gives is "Permission Denied" when running rcp blahfile node2:/tmp/blahfile from a console on node1. Both rsh and rlogin function, BUT require the password to be entered twice.
Joe
Sounds like you are running RedHat. You have to fix the pam files. The beowulf people have been through all of this. You can either recompile the r* executables without pam (my solution), or you can fiddle with the pam files. For suggestions, go to the beowulf mailing list search engine at scyld beowulf and look for "rsh", "root", "rlogin". (Hmm, it seems to have gone. Looks like you have to download the whole archive and grep through it.)
If you go to the beowulf site, you'll find people are moving to replace rsh etc with ssh etc on sites which could be attacked from outside (and turning off telnet, r* etc)
My machines aren't connected to the outside world so I have root with no passwd. To compile ssh do
./configure --with-none
and use the config file I've attached (the docs on passwordless root logins were not helpful).
# This is ssh server systemwide configuration file.
Port 22
#Protocol 2,1
ListenAddress 0.0.0.0
#ListenAddress ::
HostKey /usr/local/etc/ssh_host_key
ServerKeyBits 768
LoginGraceTime 600
KeyRegenerationInterval 3600
PermitRootLogin yes
#PermitRootLogin without-password
#
# Don't read ~/.rhosts and ~/.shosts files
IgnoreRhosts yes
# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
#IgnoreUserKnownHosts yes
StrictModes yes
X11Forwarding no
X11DisplayOffset 10
PrintMotd yes
KeepAlive yes
# Logging
SyslogFacility AUTH
LogLevel INFO
#obsoletes QuietMode and FascistLogging
RhostsAuthentication no
#
# For this to work you will also need host keys in /usr/local/etc/ssh_known_hosts
#RhostsRSAAuthentication no
RhostsRSAAuthentication yes
#
RSAAuthentication yes
# To disable tunneled clear text passwords, change to no here!
PasswordAuthentication yes
#PermitEmptyPasswords no
PermitEmptyPasswords yes
# Uncomment to disable S/key passwords
#SkeyAuthentication no
#KbdInteractiveAuthentication yes
# To change Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#AFSTokenPassing no
#KerberosTicketCleanup no
# Kerberos TGT Passing does only work with the AFS kaserver
#KerberosTgtPassing yes
CheckMail no
#UseLogin no
# Uncomment if you want to enable sftp
#Subsystem sftp /usr/local/libexec/sftp-server
#MaxStartups 10:30:60
On Mon, 25 Dec 2000, Sean wrote:
I need to forward requests using the Direct Routing method to a server. However, I determine which server to send the request to depending on the file it has requested in the HTTP GET, not based on its load. For this I am
Michael E Brown michael_e_brown@dell.com
Mon, 25 Dec 2000
Use LVS to balance the load among several servers set up to reverse-proxy your realservers; set up the proxy servers to load-balance to realservers based upon content.
atif.ghaffar@4unet.net
On the LVS servers you can run apache with mod_proxy compiled in, then redirect traffic with it.
Example
ProxyPass /files/downloads/ http://internaldownloadserver/ftp/
ProxyPass /images/ http://internalimagesserver/images/

See more on ProxyPass and the transparent proxy module for apache. You can use mod_rewrite if your realservers are reachable from the net.
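A hedged mod_rewrite equivalent (the [P] flag proxies the request and needs mod_proxy; internalcgiserver is a made-up name):

RewriteEngine on
RewriteRule ^/cgi-bin/(.*)$ http://internalcgiserver/cgi-bin/$1 [P]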
Is there any way to do URL parsing for http requests (ie send cgi-bin requests to one server group, static to another group?)
John Cronin jsc3@havoc.gtf.org
13 Dec 2000
Probably the best way to do this is to do it in the html code itself; make all the cgi hrefs point to cgi.your-domain.com. Similarly, you can make image hrefs point to image.your-domain.com. You then set these up as additional virtual servers, in addition to your www virtual server. That is going to be a lot easier than parsing URLs; this is how they have done it at some of the places I have consulted for; some of those places were using Extreme Networks load balancers, or Resonate, or something like that, using dozens of Sun and Linux servers, in multiple hosting facilities.
"K.W." kathiw@erols.com
Can I run my ipchains firewall and LVS (piranha in this case) on the same box? It would seem that I cannot, since ipchains can't understand virtual interfaces such as eth0:1, etc.
Brian Edmonds bedmonds@antarcti.ca
21 Feb 2001
I've not tried to use ipchains with alias interfaces, but I do use aliased IP addresses in my incoming rulesets, and it works exactly as I would expect it to.
Julian
I'm not sure whether piranha already supports kernel 2.4; I have to check it. ipchains does not understand interface aliases even in Linux 2.2. Any setup that uses such aliases can be implemented without using them. I don't know of routing restrictions that require using aliases.
I have a full ipchains firewall script, which works (includes port forwarding), and a stripped-down ipchains script just for LVS, and they each work fine separately. When I merge them, I can't even reach the firewall box itself. As I mentioned, I suspect this is because of the virtual interfaces required by LVS.
LVS does not require any (virtual) interfaces. LVS never checks the devices nor any aliases. I'm not sure what the port forwarding support in ipchains is, either. Is that the support provided by ipmasqadm: the portfw and mfw modules? If yes, they are not implemented (yet). And this support is not related to ipchains at all. Some good features are still not ported from Linux 2.2 to 2.4, including all these autofw useful things. But you can use LVS in the places where you would use ipmasqadm portfw/mfw, though not for the autofw tricks. LVS can perfectly do the portfw job and even extend it beyond the NAT support: there are DR and TUN methods too.
Lorn Kay lorn_kay@hotmail.com
I ran into a problem like this when adding firewall rules to my LVS ipchains script. The problem I had was due to the order of the rules.
Remember that once a packet matches a rule in a chain it is kicked out of the chain - it doesn't matter if it is an ACCEPT or REJECT rule (packets may never get to your FWMARK rules, for example, if they do not come before your ACCEPT and REJECT tests).
I am using virtual interfaces as well (e.g. eth1:1) but, as Julian points out, I had no reason to apply ipchains rules to a specific virtual interface (even with an ipchains script that is several hundred lines long!)
unknown
FWMARKing does not have to be a part of an ACCEPT rule.
If you have a default DENY policy and then say:
/sbin/ipchains -A input -d $VIP -j ACCEPT
/sbin/ipchains -A input -d $VIP 80 -p tcp -m 3
/sbin/ipchains -A input -d $VIP 443 -p tcp -m 3

To maintain persistence between port 80 and 443 for https, for example, the packets will match on the ACCEPT rule, get kicked out of the input chain tests, and never get marked.
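So the fix is simply to put the marking rules before the ACCEPT rule; the same rules as above, reordered:

/sbin/ipchains -A input -d $VIP 80 -p tcp -m 3    # mark first - a rule with no -j target lets the packet continue down the chain
/sbin/ipchains -A input -d $VIP 443 -p tcp -m 3
/sbin/ipchains -A input -d $VIP -j ACCEPT         # then accept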
Mark Miller markm@cravetechnology.com
09 May 2001
We want a configuration where two Solaris based web servers will be set up in a primary and secondary configuration. Rather than load balancing between the two, we really want the secondary to act as a hot spare for the primary.
Here is a quick diagram to help illustrate this question:
     Internet              LD1,LD2 - Linux 2.4 kernel
        |                  RS1,RS2 - Solaris
      Router
        |
 -------+-------
 |             |
-----         -----
|LD1|         |LD2|
-----         -----
 |             |
 -------+-------
        |
     Switch
 ---------------
 |             |
-----         -----
|RS1|         |RS2|
-----         -----
Paul Baker pbaker@where2getit.com
09 May 2001
Just use heartbeat on the two firewall machines and heartbeat on the two solaris machines.
Horms horms@vergenet.net
09 May 2001
You can either add and remove servers from the virtual service (using ipvsadm) or toggle the weights of the servers between zero and non-zero values.
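A sketch of the weight-toggling approach (the service and realserver addresses are placeholders):

director:# ipvsadm -e -t <VIP>:80 -r <RS1> -g -w 1   # primary: takes all new connections
director:# ipvsadm -e -t <VIP>:80 -r <RS2> -g -w 0   # hot spare: quiesced

On failure of RS1 you swap the weights; this is the sort of thing a monitoring tool automates.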
Alexandre Cassen alexandre.cassen@canal-plus.com
10 May 2001
For your 2 LDs you need to run a hot standby protocol. Heartbeat can be used; you can also use vrrp or hsrp. I am currently working on the IPSEC AH implementation for vrrp. That kind of protocol can be useful because your backup LD can be used even while it is in the backup state (you simply create 2 LD VIPs and set the default gateway of half your server pool to LD1 and half to LD2).
For your webserver hot-spare needs, you can use the next keepalived (http://keepalived.sourceforge.net), in which there will be a "sorry server" facility. This means exactly what you need => you have an RS server pool; if all the servers in this RS server pool are down, then the sorry server is placed into the ipvsadm table automatically. If you use keepalived, keep in mind that you will use the NAT topology.
Joe, 11 May 2001
Unless there's something else going on that I don't know about, I expect this isn't a great idea. The hot spare is going to degrade (depreciate, disks wear out - although not quite as fast - software needs upgrading) just as fast idle as doing work.
You may as well have both working all the time and for the few hours of down time a year that you'll need for planned maintenance, you can make do with one machine. If you only need the capacity of 1 machine, then you can use two smaller machines instead.
Since an LVS obeys unix client/server semantics, an LVS can replace a realserver (at least in principle; no-one has done this yet). Each LVS layer could have its own forwarding method, independently of the other LVSs. The LVS of LVSs would look like this, with realserver3 being in fact the director of another LVS and having no services running on it.
                        ________
                       |        |
                       | client |
                       |________|
                           |
                        (router)
                           |
                           |
                       ____________
                  DIP |            |
                ------| director_1 |
                  VIP |____________|
                           |
                           |
        ------------------------------------
        |                 |                |
        |                 |                |
    RIP1, VIP         RIP2, VIP        RIP3, VIP
  ______________    ______________    _____________
 |              |  |              |  |             |
 | realserver1  |  | realserver2  |  | realserver3 |
 |______________|  |______________|  | =director_2 |
                                     |_____________|
                                            |
                         ------------------------------------
                         |                 |                |
                         |                 |                |
                     RIP4, VIP         RIP5, VIP        RIP6, VIP
                   ______________    ______________    ______________
                  |              |  |              |  |              |
                  | realserver4  |  | realserver5  |  | realserver6  |
                  |______________|  |______________|  |______________|
If all realservers were offering http and only realservers 1..4 were offering ftp, then you would (presumably) set up the directors with the following weights for each service:
You might want to do this if realservers4..6 were on a different network (i.e. geographically remote). In this case director_1 would be forwarding by LVS-Tun, while director_2 could use any forwarding method.
This is the sort of ideas we were having in the early days. It turns out that not many people are using LVS-Tun, most people are using Linux realservers, and not many people are using geographically distributed LVSs.
Joe, Jun 99
For the foreseeable future, many of the servers which could benefit from LVS will be Microsoft or Solaris. The problem is that they don't have tunneling. A solution would be to have a linux box in front of each realserver on the link from the director to the realserver. The linux box appears to be the server to the director (it has the real IP, e.g. 192.168.1.2) but does not have the VIP (e.g. 192.168.1.110). The linux box decapsulates the packet from the director and now has a packet from the client to the VIP. Can the linux box route this packet to the realserver (presumably to an lo device on the realserver)?
The linux box could be a diskless 486 machine booting off a floppy with a patched kernel, like the machines in the Linux router project.
Wensong, 29 Jun 1999
We can use a nested (hybrid) LinuxDirector approach. For example,
VS-TUN ----> VS-NAT ----> RealServer1
   |            |
   |            -----> RealServer2
   |            ....
   |
   --------> VS-NAT ....

Real Servers can run any OS. A VS-NAT load balancer usually can schedule over 10 general servers. And these VS-NATs can be geographically distributed. By the way, LinuxDirector in kernel 2.2 can use VS-NAT, VS-TUN and VS-DR together for servers in a single configuration.
(This is not an LVS problem, just a normal routing problem.)
Logu lvslog@yahoo.com
5 Oct
I have two ISDN internet connections from two different ISPs. I am going to put an lvs_nat box between the users and these two links so as to loadbalance the bandwidth.
Julian
You can use the Linux's multipath feature:
# ip ru
0:      from all lookup local
50:     from all lookup main
...
100:    from 192.168.0.0/24 lookup 100
200:    from all lookup 200
32766:  from all lookup main
32767:  from all lookup 253

# ip r l t 100
default src DUMMY_IP
        nexthop via ISP1 dev DEV1 weight 1
        nexthop via ISP2 dev DEV2 weight 1

# ip r l t 200
default via ISP1 dev DEV1 src MY_IP1
default via ISP2 dev DEV2 src MY_IP2
You can add my dead gateway detection extension (for now only against 2.2).
This way you will be able to fully utilize both lines for masquerading. Without this patch you will not be able to select different public IPs for each ISP. These are named "alternative routes". Of course, in any case the management is not an easy task. It needs understanding.