View Issue Details

ID: 0002077
Project: phplist application
Category: Batch Processing
View Status: public
Last Update: 23-02-05 01:38
Reporter: edgreenberg
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Product Version: 2.8.11
Target Version:
Fixed in Version: 2.9.4
Summary: 0002077: Outbound Emails are tarpitted due to multiple sessions
Description: I have phpList installed on a hosting account that uses qmail to send mail. When I send to the 13,000 members, each outbound message starts a qmail-remote session with the remote MTA (mail transfer agent, i.e. the mail server). When there are multiple messages for the same MTA, there can be multiple sessions open to that MTA at once.

Many MTAs will not talk to you more than once at a time, as a spam deterrent. In particular, Yahoo accepts your connection but does not respond until the first one is done. (This is called tarpitting.) The result is a bunch of hung outgoing qmail processes. Eventually, the hosting account hits a process limit and phpList starts firing off processes that fail, marking the mail as sent even though it is not. CPU utilization skyrockets and the hosting company (rightfully) kills all the processes in the account :(

What seems to be needed is a throttle mechanism, not on the number of messages per time period, but a delay between outbound messages.

Note, I am using 2.8.9, not 2.8.11.
Tags: No tags attached.

Relationships

child of 0002456 (resolved, michiel): PHPList v2.9.4 release

Activities

stephenrs

28-10-04 04:42

reporter   ~0002408

Yes, a throttling mechanism would be extremely useful. i'm hosted on a virtual private server system, and i did some experiments to see what my system would do under heavy phplist load:

- i sent a message to 6000 addresses on the same domain
- the domain is hosted on the same server as phplist
- the addresses were numbered like 1@domain-name.com, 2@domain-name.com, 3@domain-name.com...6000@domain-name.com
- all 6000 addresses are virtmapped to the same user on the server, so they go to the same mail file.

when the process was done (1 hour and 10 minutes) i performed some analysis on the single mail file to see if any of the messages were lost...many of them were. only about 4000 of the 6000 actually got sent.

i looked at the numbering in the messages in the mail file and found that there was a pattern. about every 15 minutes, or every 1500 messages, messages were lost before the process resumed normally.

i'm guessing that this is because either some unknown process was firing up every 15 minutes and hogging cpu cycles, or the mail queue or something was getting jammed up because of phplist. in either case, being able to set a delay between message sends would probably fix the problem. i solved it by using the batch send feature and using small batches - this is not ideal...

i've attached a graph that shows the results of this analysis. the high bars in the graph show where and when messages were getting lost. if you have any questions about the details of how i set this experiment up let me know - but basically it's because the email addresses were numbered - it made it easy to see what was happening.
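The numbered-address analysis described above can be sketched roughly as follows. This is only an illustration, assuming the local part of each delivered address has already been parsed into an integer; the function name `missing_ranges` is hypothetical and not part of phplist.

```python
def missing_ranges(received, total):
    """Return (start, end) ranges of message numbers in 1..total that never arrived.

    `received` is any iterable of the numbers found in the mail file
    (for example 17 for 17@domain-name.com).
    """
    seen = set(received)
    ranges = []
    start = None
    for n in range(1, total + 1):
        if n not in seen:
            # Open a new gap if we are not already inside one.
            if start is None:
                start = n
        elif start is not None:
            # A delivered message closes the current gap.
            ranges.append((start, n - 1))
            start = None
    if start is not None:
        # The gap runs to the end of the numbering.
        ranges.append((start, total))
    return ranges
```

Plotting the gap sizes against message number would give a picture similar to the attached graph: clusters of losses show up as long ranges.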

adding a delay should be a very easy thing to code. no?

28-10-04 04:42

 

the6000.gif (14,318 bytes)

michiel

03-12-04 18:54

manager   ~0002863

You are raising two issues:

1. mails go out too quickly, local SMTP server can't handle it
2. too many mails go to one domain, particularly yahoo and the like, in too short a time frame

1 should be possible to resolve with batch processing. Have you tried that?
2 is an issue that would be nice to add. I'd like some more data first though, so any information you have about this would be great.

stephenrs

03-12-04 19:42

reporter   ~0002865

Actually, i think it's really the same issue - the problem seems to be that when phplist sends messages it sends them sequentially as fast as the host system can fire them off. this has the potential effect of overflooding the SMTP server (which could be on a separate box), or overflooding the receiving domain. because i did the experiment all on one machine, and phplist didn't report any errors (and i didn't do any traces or anything), i can't tell exactly where the breakdown occurred.

i tried batch processing, but this is only a partial solution. it appears that batch processing works by sending the batch number (1000 for example) as fast as possible, and then waiting for the batch period (1 hour for example) to elapse before sending the next batch. if this is correct, then if it takes 15 minutes to send the batch, phplist will wait 45 minutes before sending the next batch if the batch period is 1 hour. this is perhaps a polite way to use shared server resources, but does not address the problem of flooding.
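To put numbers on the batch behaviour as described in this comment (send the batch as fast as possible, then idle until the period ends), here is a small sketch. It models the behaviour as the reporter understands it, not phplist's actual code; `batch_schedule` and its parameters are hypothetical names.

```python
def batch_schedule(batch_size, send_seconds, period_seconds):
    """Return (burst_rate, idle_seconds, average_rate) for one batch period.

    burst_rate   : messages per second while the batch is actually going out
    idle_seconds : time spent waiting for the batch period to elapse
    average_rate : messages per second averaged over the whole period
    """
    burst_rate = batch_size / send_seconds
    idle_seconds = max(0.0, period_seconds - send_seconds)
    average_rate = batch_size / max(period_seconds, send_seconds)
    return burst_rate, idle_seconds, average_rate


# 1000 messages sent in 15 minutes, with a 1 hour batch period.
burst, idle, average = batch_schedule(1000, 15 * 60, 60 * 60)
```

With these inputs the idle time is 45 minutes, and the burst rate is four times the average rate. Batching lowers the average, but the receiving side still sees the full burst, which is the flooding concern raised here.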

also, batch processing does not appear to work well with automated cron job processing - after the first batch is sent, phplist never sends another batch when run from the command line. i have worked around this by setting up multiple cron jobs to run in succession. each cron job picks up where the last one left off.

i think it would be best to give finer control over how messages get sent - so a combination of throttling and batch processing could be used to customize queue processing for different environments, to help make sure all messages get sent cleanly. error checking on sent messages could also be improved...and then down the road (and ideally), ensuring that all messages get sent could be part of the job of the application, rather than the administrator - so the throttling or resending of failed messages could maybe be automated...

i hope this helps.

michiel

03-12-04 22:38

manager   ~0002872

yes, that's very useful. You're right, batch processing would still try to send as quickly as it can. The current cvs version is now more capable of identifying whether a mail has failed, although that still depends on the MTA reporting it back to PHP. If the MTA (and I use qmail myself, which is designed to handle a load like this and can be throttled as well) reports success but the message is still lost, catching that would require more tweaking, and some throttle could help there.

As for batch processing from commandline, it is supposed to do it that way. I have all my queues running every 15 minutes. If a process is still running the next one fired off will simply quit and if not, the next one will continue where the last one left off. Effectively between all these processes, it is quite possible to come up with the best throttle. Process queue once every five minutes and only allow sending 50 messages in 5 minutes. Or something like that.
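The "next one fired off will simply quit" behaviour can be approximated with an exclusive lock file, sketched below. This is an assumption about one way to get that effect, not a description of how phplist does it; `try_acquire`, `release`, `run_queue` and the lock path are all hypothetical.

```python
import os


def try_acquire(lock_path):
    """Create the lock file atomically; return False if another run holds it."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release(lock_path):
    """Remove the lock file so the next cron run can proceed."""
    os.remove(lock_path)


def run_queue(lock_path, process):
    """Run `process()` unless a previous run is still active."""
    if not try_acquire(lock_path):
        return "skipped"
    try:
        process()
        return "processed"
    finally:
        release(lock_path)
```

A cron entry every five minutes calling something like `run_queue` with a 50-message batch would give the "50 messages in 5 minutes" effect mentioned above. A stale lock left behind by a crashed run is not handled in this sketch.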

If I were to add throttling, what would you think would be a good default value? Don't send more than 10 messages a minute?

stephenrs

03-12-04 23:56

reporter   ~0002878

yes, i guess the weak link in the chain will always be the MTA's ability to properly report status back to php, so you can just come close to perfection, and add some insurance with throttling.

i didn't realize overlapping crons would resolve themselves. this is good to know, but using the batch feature with cron timing could get a little confusing and error prone as a substitute for an explicit throttle.

so as far as default throttling goes - i think it should be expressed as a simple delay between messages, rather than a rate. "10 messages per minute" is really just a small "batch" - so i would guess that something like that can already be done with the current release.

so i think a better way would be to say something like "wait x number of milliseconds after sending each message, before sending the next message". given that my host could initiate sending about 6000 messages per hour (or 1 message about every .6 seconds), a good default for me might be a delay of about 300ms. so in my case, instead of firing off 100 messages per minute, phplist would only try to send about 60 (i think this math is right) - this would give all the links in the chain a little more breathing room...a higher "wait value" might be more safe for the average system however.
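The arithmetic and the proposed per-message delay can be sketched as below. Python's `time.sleep` stands in for the PHP `usleep` call mentioned in this thread; `messages_per_minute`, `send_with_delay` and the `send` callback are hypothetical names, not phplist functions.

```python
import time


def messages_per_minute(base_seconds_per_message, delay_ms):
    """Estimated send rate once a fixed delay is added after every message."""
    return 60.0 / (base_seconds_per_message + delay_ms / 1000.0)


def send_with_delay(messages, send, delay_ms, sleep=time.sleep):
    """Send each message, then wait `delay_ms` milliseconds before the next.

    A delay of 0 disables throttling entirely.
    """
    for message in messages:
        send(message)
        if delay_ms > 0:
            sleep(delay_ms / 1000.0)
```

With 0.6 seconds per message, `messages_per_minute(0.6, 0)` gives 100 and `messages_per_minute(0.6, 300)` gives roughly 67, a little above the "about 60" estimate. A delay of 400ms would bring the rate down to 60 per minute.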

this bit seems pretty easy to code (possibly using the usleep function), but i guess you would also have to code some intelligence to deal with any conflicts that could arise with batch settings and actual execution time...make sense?

...maybe you could also include a little utility that allows admins to test their systems to come up with optimal throttle and / or batch values...the tool would do something like fire a few thousand unique messages at a preconfigured test address and report back statistics on lost messages for given throttle / batch settings...maybe i'm just rambling, but i bet people are losing messages and they don't even know it...

michiel

04-12-04 00:02

manager   ~0002880

Well, I wasn't necessarily thinking that the x number per minute would be a little batch thing, more that the throttle would be calculated on the fly. Some messages can take a long time to calculate, and therefore the delay can be less. For example in the current CVS you can send a URL, so the first message will take a while to send, because it first has to pull the page off (and store it in cache). So if we use a "x per minute" value, the delay can be constantly changed.
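One way to read the "calculated on the fly" idea: give each message a time slot derived from the "x per minute" target, and sleep only for whatever part of the slot the message did not already use. This is a sketch of that reading under stated assumptions, not the autothrottle that was eventually added; `auto_delay` is a hypothetical name.

```python
def auto_delay(target_per_minute, elapsed_seconds):
    """Seconds to wait after a message that took `elapsed_seconds` to prepare and send.

    Each message gets a slot of 60 / target_per_minute seconds. A slow
    message (for example one that first has to fetch a URL) uses up most
    or all of its slot, so the extra wait shrinks toward zero.
    """
    slot_seconds = 60.0 / target_per_minute
    return max(0.0, slot_seconds - elapsed_seconds)
```

At 10 messages per minute the slot is 6 seconds: a message that took 1.5 seconds is followed by a 4.5 second wait, while one that took 8 seconds is followed by none.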

stephenrs

04-12-04 00:29

reporter   ~0002881

hmmm, calculating on the fly sounds kinda cool, but it also sounds a bit complex to implement reliably, and i think you still run the risk of over flooding SMTP servers or MTAs (unless you combined it with some sort of domain-specific logic that waited for MTA responses). because you never know how long the next message will take to send, it seems to me that any algorithm you use to calculate the throttle will always tend toward the batch methodology of sending as fast as possible, then waiting for the period to elapse. no?

and if the delay can be constantly changed, you run the risk of it getting set too low for some part of the system to handle the flood of messages.

i still think a delay would accomplish the objective more simply, efficiently, and reliably, and give more control to admins.

why don't you like the delay concept?

04-12-04 00:40

 

queue.png (3,450 bytes)

michiel

04-12-04 00:43

manager   ~0002882

Oh, I have nothing against a delay, but I have systems that run loads of messages and it seems to work ok. For example attached is a graph of the queue. Fair enough this was only 5k messages, but phplist simply passes it on to qmail and qmail does the rest. Seems to work fine. The other graph shows a higher peak when 22k messages went out.

04-12-04 00:44

 

queue2.png (3,311 bytes)

stephenrs

04-12-04 00:57

reporter   ~0002883

i'm using sendmail on my system, and maybe that's the problem (i know qmail is better for several reasons, i'm just too lazy...), but i guess the point is that it would be great if there was a way to spoon feed outbound messages, rather than rapid firing them, because you can't really predict what the condition of a particular host system will be.

michiel

04-12-04 01:08

manager   ~0002884

yes, but the problem is that you would want the ISP to control this and not the phplist user. They wouldn't have a clue. And I have to set some sensible default which will be used by most inexperienced users. And then I get flooded with requests that sending is so slow, and what to do about it. I guess I need to get more of an ISP community going on the side, to get them involved. I can set it up as a value in the ISP wide config file that is used (/etc/phplist.conf). That would be the best way I think.

stephenrs

04-12-04 02:20

reporter   ~0002885

well, you're kinda right. i didn't mean to suggest that this should be an ISP thing, but I guess i'm a bit of an advanced user - i'm administering a server and a phplist installation for a client who deals with the actual list/user/message administration as a sub-admin. it's my job to make sure the system works seamlessly for them. so i guess i agree with you - i've been envisioning the delay as a param in the local config file (.../config/config.php), maybe in the "advanced features" section - or perhaps accessible from the super-admin interface.

having it in a global /etc/phplist.conf would also be useful if i ended up setting up a separate phplist install for another client.

i actually prefer that ISPs place as few restrictions as possible on a system. i'm on a virtual private server with full root access, so it feels and behaves very much like a dedicated box.

the default delay value could be '0' to avoid slow-down problems on systems that don't need a delay.

michiel

06-12-04 10:04

manager   ~0002897

yes, the way you propose should be quite straightforward and I will mark this as todo. Thanks for discussing the issue so clearly.

michiel

23-02-05 01:38

manager   ~0003608

Throttling has now been added in two ways: a simple delay, and an "autothrottle". I'm not sure the latter works ok, so it's experimental.