
...making Linux just a little more fun!

Sendmail and capacity

Dennis Veatch [dennisveatch at bellsouth.net]


Fri, 25 Apr 2008 10:34:49 -0400

Hi guys and gals.

I have what I thought would be a simple question. How do you figure out how many emails sendmail can process without driving the load average over, say, 2 or 3? After much googling around and trying to glean information from the sendmail FAQs, etc., I am still stumped. I know it depends on hardware configuration, the number of mailboxes, how many emails are sent and received in a given time frame, etc. But I can't find even a general rule of thumb to get a ballpark idea. Can ya help me out?

Perhaps I am approaching this from the wrong perspective, as I realize the above statements are most likely way too general to give even a ballpark answer, though if you could, that would be great.

-- 
You can tuna piano but you can't tune a fish.




Rick Moen [rick at linuxmafia.com]


Fri, 25 Apr 2008 11:37:29 -0700

Quoting Dennis Veatch (dennisveatch@bellsouth.net):

> Hi guys and gals.
> 
> I have what I thought would be a simple question. How do you figure
> out how many emails sendmail can process and not drive the load
> average over say 2 or 3? After much googling around and trying to
> glean information from the sendmail FAQs, etc I am still stumped. I
> know it depends on hardware configuration, the number of mailboxes,
> how many emails are sent and received for a given time frame, etc. But
> I can't even find a general rule of thumb to even get a ball park
> idea. Can ya help me out?

Possibly your problem is constraining your query by system loading. A mail admin's attitude towards his MTA box tends to be that, if it's loaded, it's loaded: as long as the machine keeps up and doesn't fall over, it really doesn't matter if load hits 50 for long stretches. That having been said:

It's quite routine for commodity (but modern) white boxes running sendmail to be able to send out a couple of _hundred thousand_ SMTP messages per day before hitting I/O bottlenecks. Those bottlenecks then tend to be a result of filling the outbound bandwidth "pipe" to capacity, not of overtaxing the machine.[1] However, if you really want to be certain of the machine's ability to handle that level of traffic, you should expect to spend US $10k per MTA host, and have maxed-out system RAM and a very beefy SCSI (SAS)-based disk subsystem, preferably with a sizeable amount of local disk cache, to keep as much as possible of the outbound queue in RAM. You would also need to consider splitting the load among multiple hosts at some point.

Last, you'd really be best advised, with that sort of deployment, to hire one of the houses that do this sort of system design and deployment for a living.

I hope that helps.

[1] That external constraint would be removed if, say, the machine were used to handle mail only within a high-speed LAN, and not to/from public networks. But more-typical sendmail deployments deal with the Internet, of course.
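
To put "a couple of hundred thousand messages per day" in per-second terms, here is a rough back-of-envelope sketch. At 200,000 messages per day, the average works out to about 2.3 messages per second. The average message size (50 kB) and the peak factor (5x) are assumptions of this example, not figures from any measurement; substitute numbers from your own mail logs.

```python
# Back-of-envelope sizing for a daily message volume.
# Assumptions (NOT from the thread): average message size and peak factor.

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(messages_per_day: int) -> float:
    """Mean messages per second if traffic were spread evenly over a day."""
    return messages_per_day / SECONDS_PER_DAY


def required_bandwidth_mbit(messages_per_day: int,
                            avg_message_kb: float = 50.0,
                            peak_factor: float = 5.0) -> float:
    """Rough peak outbound bandwidth in Mbit/s.

    avg_message_kb and peak_factor are illustrative guesses; measure your own.
    """
    bytes_per_sec = average_rate(messages_per_day) * avg_message_kb * 1000
    return bytes_per_sec * peak_factor * 8 / 1_000_000


if __name__ == "__main__":
    daily = 200_000  # "a couple of hundred thousand" messages per day
    print(f"{average_rate(daily):.2f} msg/s on average")
    print(f"{required_bandwidth_mbit(daily):.1f} Mbit/s at the assumed peak")
```

Under these assumed numbers the outbound pipe fills long before a modern CPU would notice the work, which is the point of the footnote above.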

-- 
Cheers,           "I don't like country music, but I don't mean to denigrate
Rick Moen         those who do.  And, for the people who like country music,
rick@linuxmafia.com         denigrate means 'put down'."      -- Bob Newhart




Rick Moen [rick at linuxmafia.com]


Fri, 25 Apr 2008 13:03:59 -0700

Forwarding Dennis's response back on-list.

----- Forwarded message from Dennis Veatch <dennisveatch@bellsouth.net> -----

From: Dennis Veatch <dennisveatch@bellsouth.net>
To: Rick Moen <rick@linuxmafia.com>
Date: Fri, 25 Apr 2008 15:01:01 -0400
Subject: Re: [TAG] Sendmail and capacity
On Friday 25 April 2008 14:37:29 you wrote:

> Quoting Dennis Veatch (dennisveatch@bellsouth.net):
> > Hi guys and gals.
> >
> > I have what I thought would be a simple question. How do you figure
> > out how many emails sendmail can process and not drive the load
> > average over say 2 or 3? After much googling around and trying to
> > glean information from the sendmail FAQs, etc I am still stumped. I
> > know it depends on hardware configuration, the number of mailboxes,
> > how many emails are sent and received for a given time frame, etc. But
> > I can't even find a general rule of thumb to even get a ball park
> > idea. Can ya help me out?
>
> Possibly your problem is constraining your query by system loading:
> A mail admin's attitude towards his MTA box tends to be that, if it's
> loaded, it's loaded:  As long as the machine keeps up and doesn't fall
> over, it really doesn't matter if load hits 50 for long stretches.  That
> having been said:
>
> It's quite routine for commodity (but modern) white boxes running
> sendmail to be able to send out a couple of _hundred thousand_ SMTP
> messages per day before hitting I/O bottlenecks.  Those bottlenecks then
> tend to be a result of filling the outbound bandwidth "pipe" to
> capacity, not overtaxing the machine.[1]  However, if you really want
> to be certain of the machine's ability to handle that level of traffic,
> you should expect to spend US $10k per MTA host, and have a very beefy
> SCSI (SAS)-based disk subsystem, preferably with a sizeable amount of
> local disk cache to keep as much as possible of the outbound queue in
> RAM and maxed-out system RAM.  You would also need to consider splitting
> the load among multiple hosts, at some point.
>
> Last, you'd really be best advised, with that sort of deployment, to
> hire one of the houses that do this sort of system design and deployment
> for a living.
>
> I hope that helps.
>
> [1] That external constraint would be removed if, say, the machine were
> used to handle mail only within a high-speed LAN, and not to/from
> public networks.  But more-typical sendmail deployments deal with the
> Internet, of course.

Thanks, and it does help some. At least I now know it is a hard thing to determine unless a person is well versed in (in this case) sendmail and has a specific set of hardware in mind.

So, ignoring the size of the pipes, considering mail exchange both ways, and running sendmail on an AMD X2 4200+ with 4 GB RAM: any ballpark idea how many emails it could process? I of course mean that messages arrive and are sent at a rate reasonable enough not to overload the box. Or is this type of box now considered a commodity white box? :)

-- 
You can tuna piano but you can't tune a fish.
----- End forwarded message -----




Rick Moen [rick at linuxmafia.com]


Fri, 25 Apr 2008 13:14:35 -0700

Quoting Dennis Veatch (dennisveatch@bellsouth.net):

> So ignoring the size of the pipes and considering mail exchange both
> ways and running sendmail on an AMD X2 4200+ with 4 GB RAM. Any ball
> park idea how many emails it could process? I of course mean they
> arrive and ones are sent at a reasonable rate to not overload the box.
> Or is this type box now considered a commodity white box :)

In the context of someone using eyebrow-raising phrases like "ignore the size of the pipes", I'd call that a commodity white box -- as opposed to, say, a Sun Fire V890. Also, if you're talking about that large an installation, and aren't satisfied with my shirtsleeve figure of a few hundred thousand messages per day, then the sponsoring organisation can afford to do a couple of pilot runs for scaling and testing purposes.
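
A pilot run of that kind can be as simple as pushing a known number of test messages through the box and timing it. The following is only a minimal sketch under stated assumptions: `pilot_run` and its `send` parameter are hypothetical names invented for this example, and `send` stands in for whatever actually injects one message into the MTA under test (for instance, a wrapper around Python's `smtplib`). The load average is read with `os.getloadavg()`, which is available on Unix-like systems only.

```python
import os
import time
from typing import Callable, Iterable


def pilot_run(send: Callable[[str], None], messages: Iterable[str]) -> dict:
    """Push messages through `send` and report throughput and load average.

    `send` is a hypothetical stand-in for the code that hands one message
    to the MTA under test; it is not part of sendmail or of any library.
    """
    count = 0
    start = time.perf_counter()
    for msg in messages:
        send(msg)
        count += 1
    elapsed = time.perf_counter() - start

    try:
        load_1min = os.getloadavg()[0]  # 1-minute load average (Unix only)
    except (AttributeError, OSError):
        load_1min = None  # not obtainable on this platform

    return {
        "sent": count,
        "seconds": elapsed,
        # Extrapolate the measured rate to a full day.
        "per_day": count / elapsed * 86_400 if elapsed > 0 else float("inf"),
        "load_1min": load_1min,
    }


if __name__ == "__main__":
    # Dummy sender so the sketch runs without a mail server.
    result = pilot_run(lambda m: time.sleep(0.001),
                       (f"test message {i}" for i in range(100)))
    print(result)
```

A short burst like this only extrapolates; a real pilot would run long enough for the queue and the disks to reach a steady state.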

(By the way, please remember to include the tag@lists.linuxgazette.net mailing list on your replies. It really does no good at all to send individual subscribers like myself private mail.)

-- 
Cheers,                                      "Reality is not optional."
Rick Moen                                             -- Thomas Sowell
rick@linuxmafia.com




Ben Okopnik [ben at linuxmafia.com]


Sat, 26 Apr 2008 11:24:29 -0400

On Fri, Apr 25, 2008 at 11:37:29AM -0700, Rick Moen wrote:

> Quoting Dennis Veatch (dennisveatch@bellsouth.net):
> 
> > Hi guys and gals.
> > 
> > I have what I thought would be a simple question. How do you figure
> > out how many emails sendmail can process and not drive the load
> > average over say 2 or 3? After much googling around and trying to
> > glean information from the sendmail FAQs, etc I am still stumped. I
> > know it depends on hardware configuration, the number of mailboxes,
> > how many emails are sent and received for a given time frame, etc. But
> > I can't even find a general rule of thumb to even get a ball park
> > idea. Can ya help me out?
> 
> Possibly your problem is constraining your query by system loading:
> A mail admin's attitude towards his MTA box tends to be that, if it's
> loaded, it's loaded:  As long as the machine keeps up and doesn't fall
> over, it really doesn't matter if load hits 50 for long stretches.  That
> having been said: 
> 
> It's quite routine for commodity (but modern) white boxes running
> sendmail to be able to send out a couple of _hundred thousand_ SMTP
> messages per day before hitting I/O bottlenecks.  Those bottlenecks then
> tend to be a result of filling the outbound bandwidth "pipe" to
> capacity, not overtaxing the machine.[1]

Which neatly leads us to Ben's Rule of Bottlenecks: "It's always I/O". Not 100% accurate, but it makes for a good reminder: given how cheap memory, disk space, or even a new computer is, I/O is, in the great majority of cases, the most "immovable" limiting factor. It's also a good thing to keep track of in programming: loading up the CPU or using more memory is preferable to creating extra network traffic.

I recently read a story where a small dental partnership (three offices) had decided to combine their practices and use a certain software package, with all the records kept on a single database server. It had worked fine at one office, but completely froze, everywhere, after the transition. It seems that the people who wrote the package made the tiny mistake of doing a 'SELECT * FROM table' (i.e., pulling down the entire patient database) for every *single* query and then filtering it locally... worse yet, since the individual offices now had to integrate their records into the database, it was doing it for every single record, times 3.

The upshot of the story was that the partnership broke up, with great acrimony.
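
The difference between the two retrieval styles in that story can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the `patients` table, its columns, and the function names are made up for the example, and nothing here is taken from the actual dental package. An in-memory database has no network hop, so the sketch counts rows handed back to the application as a stand-in for traffic across the wire.

```python
import sqlite3


def make_db(n_patients: int = 1000) -> sqlite3.Connection:
    """Build a throwaway in-memory table; schema and names are made up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO patients (id, name) VALUES (?, ?)",
        [(i, f"patient-{i}") for i in range(n_patients)],
    )
    return conn


def lookup_client_side(conn: sqlite3.Connection, patient_id: int):
    """Anti-pattern: fetch every row, then filter in the application."""
    rows = conn.execute("SELECT * FROM patients").fetchall()
    match = [r for r in rows if r[0] == patient_id]
    return match, len(rows)  # second value: rows pulled from the database


def lookup_server_side(conn: sqlite3.Connection, patient_id: int):
    """Let the database do the filtering; only matching rows come back."""
    rows = conn.execute(
        "SELECT * FROM patients WHERE id = ?", (patient_id,)
    ).fetchall()
    return rows, len(rows)


if __name__ == "__main__":
    conn = make_db()
    print(lookup_client_side(conn, 42))  # same record, whole table fetched
    print(lookup_server_side(conn, 42))  # same record, one row fetched
```

Both functions return the same record, but the first one moves the whole table for every lookup. On a local bus that cost is nearly invisible; multiplied by every query from three offices over a network link, it is the I/O bottleneck.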

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




John Karns [johnkarns at gmail.com]


Mon, 28 Apr 2008 17:11:44 -0700

On Sat, Apr 26, 2008 at 8:24 AM, Ben Okopnik <ben@linuxmafia.com> wrote:

>  I recently read a story where a small dental partnership (three offices)
>  had decided to combine their practices and use a certain software
>  package, with all the records kept on a single database server. It had
>  worked fine at one office, but completely froze, everywhere, after the
>  transition. It seems that the people who wrote the package made the tiny
>  mistake of doing a 'SELECT * FROM table' (i.e., pulling down the entire
>  patient database) for every *single* query and then filtering it
>  locally... worse yet, since the individual offices now had to integrate
>  their records into the database, it was doing it for every single
>  record, times 3.

Ouch! That kind of implementation sets PC-based RDBMS back about 15 years, to the days of dBase technology, before client-server technology was adapted to PC server platforms. I find it incredible that there are people in the IT field who could be so totally ignorant and short-sighted. Then again, it stirs memories of working in environments where the end always justifies the means. In addition, it bears a certain resemblance to methods so often used when the application is aimed at a mainframe.

-- 
John




Ben Okopnik [ben at linuxgazette.net]


Tue, 29 Apr 2008 00:42:26 -0400

On Mon, Apr 28, 2008 at 05:11:44PM -0700, John Karns wrote:

> On Sat, Apr 26, 2008 at 8:24 AM, Ben Okopnik <ben@linuxmafia.com> wrote:
> 
> >  I recently read a story where a small dental partnership (three offices)
> >  had decided to combine their practices and use a certain software
> >  package, with all the records kept on a single database server. It had
> >  worked fine at one office, but completely froze, everywhere, after the
> >  transition. It seems that the people who wrote the package made the tiny
> >  mistake of doing a 'SELECT * FROM table' (i.e., pulling down the entire
> >  patient database) for every *single* query and then filtering it
> >  locally... worse yet, since the individual offices now had to integrate
> >  their records into the database, it was doing it for every single
> >  record, times 3.
> 
> Ouch!  That kind of implementation sets PC-based RDBMS back about 15
> years, to the days of dBase technology, before client-server
> technology was adapted to PC server platforms.  I find it incredible
> that there are people in the IT field who could be so totally ignorant
> and short-sighted.  Then again, it stirs memories of working in
> environments where the end always justifies the means.  In addition,
> it has a certain familiarity to methods so often used when the
> application is aimed at a mainframe.

I strongly suspect that it was a case of "Works Fine For Me" on a single system - no matter how stupid the retrieval method, it wouldn't show up (much) when it was all just going across the local bus. When that was redirected across a network, however... well, as John Walker of Autodesk once wrote (I'm misquoting from memory), "Make every part of your software robust. People will use it for things you never expected and in ways you never expected."

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

