qmail-smtpd exiting after 1 second

BlackMagic

New Email
I've used qmail-1.03 for 10 years without problems. Recently I installed the latest patched version, netqmail-1.06, on a CentOS 5.4 box and followed the LWQ instructions to the letter. qmail appears to be working, but when I check the status of the various components with qmailctl, I see that qmail-smtpd exits after approximately 1 second and is continuously restarted by svscan, rapidly churning through pids along the way.

Here's the proof:
qmailctl status
/service/qmail-send: up (pid 32545) 1211 seconds
/service/qmail-send/log: up (pid 32547) 1211 seconds
/service/qmail-smtpd: up (pid 6275) 0 seconds
/service/qmail-smtpd/log: up (pid 32550) 1211 seconds
.....short delay.....
qmailctl status
/service/qmail-send: up (pid 32545) 1264 seconds
/service/qmail-send/log: up (pid 32547) 1264 seconds
/service/qmail-smtpd: up (pid 6555) 1 seconds
/service/qmail-smtpd/log: up (pid 32550) 1264 seconds
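
Editor's note: the pattern in the two samples above (a new pid and an uptime near zero each time) can be checked mechanically. The following is a minimal sketch, assuming status lines in exactly the format shown above; the helper names status_uptime, status_pid and flapping are made up for illustration and are not part of qmail or daemontools.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not qmail/daemontools tools.

# Print the uptime in seconds from one status line such as
#   /service/qmail-smtpd: up (pid 6275) 0 seconds
status_uptime() {
    printf '%s\n' "$1" |
        sed -n 's/^.*: up (pid [0-9][0-9]*) \([0-9][0-9]*\) seconds.*$/\1/p'
}

# Print the pid from one status line.
status_pid() {
    printf '%s\n' "$1" |
        sed -n 's/^.*: up (pid \([0-9][0-9]*\)) [0-9][0-9]* seconds.*$/\1/p'
}

# Succeed when two samples of the same service report different pids,
# which means supervise restarted the service between the samples.
flapping() {
    first_pid=$(status_pid "$1")
    second_pid=$(status_pid "$2")
    [ -n "$first_pid" ] && [ -n "$second_pid" ] &&
        [ "$first_pid" != "$second_pid" ]
}
```

Feeding it the two qmail-smtpd lines above reports a restart, while the two qmail-send lines (same pid, growing uptime) do not.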

I've seen a couple of queries about this on Google but I've never seen an answer.

I've checked all the supervise/run commands and they appear to be OK. The log for qmail-send (/var/log/qmail/current) is OK, but the log for qmail-smtpd (/var/log/qmail/smtpd/current) is empty.

The log for svcscan doesn't show any problems. Here is additional configuration information:
cat /var/qmail/supervise/qmail-send/log/run
#!/bin/sh
exec /usr/local/bin/setuidgid qmaill /usr/local/bin/multilog t /var/log/qmail
cat /var/qmail/supervise/qmail-smtpd/log/run
#!/bin/sh
exec /usr/local/bin/setuidgid qmaill /usr/local/bin/multilog t /var/log/qmail/smtpd

ps xf shows (in part):
2228 ? S 0:00 \_ supervise rsync
2262 ? S 0:00 | \_ tcpserver -vHRU -l 0 192.168.100.13 873 nice -5 rsync --daemon --no-detach --config /etc/rsyncd.conf
2230 ? S 0:00 \_ supervise qmail-send
2231 ? S 0:00 \_ supervise log
2234 ? S 0:00 \_ supervise qmail-smtpd
15118 ? Z 0:00 | \_ [tcpserver] <defunct>
2235 ? S 0:00 \_ supervise log

From this I suspect there's something wrong with the tcpserver, but it's working OK with other modules.
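
Editor's note: the defunct tcpserver in the listing above is the telling line. As a rough sketch, zombie entries can be pulled out of ps output with awk, assuming the column layout shown above (pid, tty, state, time, command); zombie_pids is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: zombie_pids is a made-up name for illustration.
# Reads ps output on stdin (columns: PID TTY STAT TIME COMMAND, as in the
# listing above) and prints the pid of every process in state Z.
zombie_pids() {
    awk '$3 ~ /^Z/ { print $1 }'
}
```

It could be used as `ps xf | zombie_pids`; the header line is skipped naturally because its third field is STAT.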

I would appreciate it if someone could point me in the right direction.
 

EQ Admin

EQ Forum Admin
Staff member
Can you post your full qmail-smtpd run script too?

> cat /var/qmail/supervise/qmail-smtpd/run

It should look something like :

#!/bin/sh

QMAILDUID=`id -u qmaild`
NOFILESGID=`id -g qmaild`
MAXSMTPD=`cat /var/qmail/control/concurrencyincoming`
LOCAL=`head -1 /var/qmail/control/me`

#QMAILQUEUE="/var/qmail/bin/simscan"
#export QMAILQUEUE

if [ -z "$QMAILDUID" -o -z "$NOFILESGID" -o -z "$MAXSMTPD" -o -z "$LOCAL" ]; then
    echo QMAILDUID, NOFILESGID, MAXSMTPD, or LOCAL is unset in
    echo /var/qmail/supervise/qmail-smtpd/run
    exit 1
fi

if [ ! -f /var/qmail/control/rcpthosts ]; then
    echo "No /var/qmail/control/rcpthosts!"
    echo "Refusing to start SMTP listener because it'll create an open relay"
    exit 1
fi

exec /usr/local/bin/softlimit -m 30000000 \
/usr/local/bin/tcpserver -v -H -R -l "$LOCAL" -x /etc/tcp.smtp.cdb -c "$MAXSMTPD" \
-u "$QMAILDUID" -g "$NOFILESGID" 0 smtp /usr/local/bin/rblsmtpd -t 5 \
-b -r zen.spamhaus.org \
/var/qmail/bin/qmail-smtpd 2>&1



Use vi to open your run script. Are all of the line breaks OK in your run scripts? Make sure each long command is really a single line (or is continued with a trailing backslash). Check for typos.

Are all of the permissions OK on all of your supervise scripts and directories leading to them? The run files should be chmod 755.
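
Editor's note: the line-break and permission checks above can be partly automated. This is a minimal sketch under stated assumptions, not a complete validator: check_run_script is a hypothetical helper, `sh -n` only catches outright syntax errors (a line split in the wrong place can still parse), and the grep only looks for one common paste problem, a continuation backslash followed by trailing whitespace.

```shell
#!/bin/sh
# Sketch only: check_run_script is a made-up helper name.
# Reports the first problem found in a supervise run script.
check_run_script() {
    script=$1
    [ -f "$script" ] || { echo "missing: $script"; return 1; }
    [ -x "$script" ] || { echo "not executable: $script"; return 1; }
    # A backslash followed by spaces or tabs at end of line no longer
    # continues the line, so the command is silently cut short.
    if grep -n '\\[[:space:]][[:space:]]*$' "$script" >/dev/null; then
        echo "whitespace after continuation backslash: $script"
        return 1
    fi
    # Parse the script without executing it.
    sh -n "$script" 2>/dev/null || { echo "syntax error: $script"; return 1; }
    echo "ok: $script"
}
```

For example, `check_run_script /var/qmail/supervise/qmail-smtpd/run` would be run against each run file in turn.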

Are you running any extra programs in your qmail-smtpd run file? If so, you might need to increase the memory softlimit from 2 MB to 30-40 MB, depending on the extra programs you're calling.

What happens when you telnet to port 25 on your server? Do you get any error messages?

If all else fails, redo section 2.8.2.2, "The supervise scripts", in Life with qmail, and make sure you follow it exactly, step by step.

My script above has a little extra compared with a stock LWQ install, since I used to use simscan to run virus and spamassassin checks, and I also have a spamhaus RBL check in there as part of the qmail-smtpd line.

Are you using tcpserver for other programs too such as rsync?

From checking all the above you should find the problem. Please let us know which piece it was so others can benefit from the answer too. :thanks:
 

BlackMagic

New Email
I solved the problem after a little more research. The smtpd run script from LWQ has a softlimit of 2000000. I raised this to 4000000, restarted qmail and now it's running perfectly.
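
Editor's note: the change amounts to editing the number after `softlimit -m` in the run script, which is a byte count. A minimal sketch of doing that without touching the original file follows; with_softlimit is a hypothetical helper, and it assumes the script contains a `softlimit -m <number>` argument as in the scripts quoted in this thread.

```shell
#!/bin/sh
# Sketch only: with_softlimit is a made-up helper name.
# Prints the run script to stdout with the softlimit -m value replaced;
# the original file is left unchanged so the result can be reviewed first.
with_softlimit() {
    script=$1
    bytes=$2
    case $bytes in
        ''|*[!0-9]*)
            echo "limit must be a whole number of bytes" >&2
            return 1
            ;;
    esac
    sed "s/\(softlimit -m \)[0-9][0-9]*/\1$bytes/" "$script"
}
```

One possible workflow: `with_softlimit /var/qmail/supervise/qmail-smtpd/run 4000000 > run.new`, compare run.new with the original, install it with mode 755, then run `qmailctl restart`.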

LWQ mentions that softlimits can cause problems with tcpserver, but suggests that error messages will be emitted if this is the case. I didn't find any error messages.

I've lived with this bug for almost 6 months and it's sure good to be rid of it.
 

BlackMagic

New Email
Thanks Popowich. It turned out to be a softlimit problem in the qmail-smtpd run script.

I use tcpserver with anything I can get running under it, including rsyncd, samba, PostgreSQL, clockspeed, dnscache and sshd.
 

EQ Admin

EQ Forum Admin
Staff member
Yes, needing to increase your softlimit is a common problem. If you start adding more programs to the run script (for example I used to use simscan to run clamav and spamassassin) you'll need to increase the limit so it's big enough to run the extra programs. Do you have any monitoring on your smtp service? I'd think that you would be getting at least intermittent smtp timeout warnings if this had been an ongoing issue.
 

yukon

Valued Member
Ahh yes, the old softlimit problem ... I needed Popowich's expertise a few years ago with the same problem ... I vaguely recall it was more of an issue on Red Hat derivatives for some reason or another.
 

BlackMagic

New Email
:thanks: For the record, here's the smtpd script:

#!/bin/sh
QMAILDUID=`id -u qmaild`
NOFILESGID=`id -g qmaild`
MAXSMTPD=`cat /var/qmail/control/concurrencyincoming`
LOCAL=`head -1 /var/qmail/control/me`
if [ -z "$QMAILDUID" -o -z "$NOFILESGID" -o -z "$MAXSMTPD" -o -z "$LOCAL" ]; then
    echo QMAILDUID, NOFILESGID, MAXSMTPD, or LOCAL is unset in
    echo /var/qmail/supervise/qmail-smtpd/run
    exit 1
fi
if [ ! -f /var/qmail/control/rcpthosts ]; then
    echo "No /var/qmail/control/rcpthosts!"
    echo "Refusing to start SMTP listener because it'll create an open relay"
    exit 1
fi
exec /usr/local/bin/softlimit -m 4000000 \
/usr/local/bin/tcpserver -v -R -l "$LOCAL" -x /etc/tcp.smtp.cdb -c "$MAXSMTPD" \
-u "$QMAILDUID" -g "$NOFILESGID" 0 25 /var/qmail/bin/qmail-smtpd 2>&1

I don't have any way of monitoring smtpd, and the smtpd log was always empty.

I dropped the softlimit back to 3000000 as an experiment. smtpd wouldn't start, but it placed an error message in the smtpd log: error while loading shared libraries: libc.so.6. It didn't do that when the softlimit was set at 2000000.
 

EQ Admin

EQ Forum Admin
Staff member
Thanks for taking a few minutes to experiment and see what happens at the different softlimit values. A troubleshooting step I missed above was checking the command "id -u qmaild" on the server. That works fine on Linux. Some of my SMTP relays are qmail running on Solaris; on those servers I need to use "/usr/xpg4/bin/id -u qmaild". It is also perfectly valid for the admin to put the actual ids in the run script as static numbers instead of using the command to look them up dynamically.
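
Editor's note: a quick way to run that check is to confirm the id command prints a plain number for the account. This is a minimal sketch; is_numeric_id and lookup_uid are hypothetical helper names, and the ID_CMD variable is an assumption introduced here so the Solaris path mentioned above can be swapped in.

```shell
#!/bin/sh
# Sketch only: is_numeric_id, lookup_uid and ID_CMD are made up for
# illustration and are not part of qmail.

# Succeed when the argument is a non-empty string of digits.
is_numeric_id() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Print the numeric uid for an account, or fail if the lookup does not
# produce a number. ID_CMD defaults to "id"; on Solaris it could be set
# to /usr/xpg4/bin/id.
lookup_uid() {
    uid=$("${ID_CMD:-id}" -u "$1" 2>/dev/null)
    is_numeric_id "$uid" && printf '%s\n' "$uid"
}
```

Running `lookup_uid qmaild` should print a number; if it prints nothing and fails, the run script's QMAILDUID would come out empty as well.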
 

BlackMagic

New Email
Popowich, the really good part is a Google search on 'qmail smtpd exit 1 second' now brings this article up as the first link. In the past I've searched on the same string and got nothing that solved the problem; mostly qmail-send issues being discussed.
 

Olecoot

New Email
Just had this issue. Had to bump the softlimit up to 8000000.

/service/qmail-send: up (pid 21861) 151 seconds
/service/qmail-send/log: up (pid 21862) 151 seconds
/service/qmail-smtpd: up (pid 21863) 151 seconds
/service/qmail-smtpd/log: up (pid 21864) 151 seconds
 