Sending emails through Postfix timing out when using saslauthd

I was recently debugging an issue where Postfix users could receive email, but messages in their outboxes timed out and failed to send. Restarting Postfix or saslauthd relieved the problem only briefly.

The mail log was full of PAM authentication errors, and even a test authentication with testsaslauthd took tens of seconds to return. The format of the command is:

testsaslauthd -u username -p password -s smtp -r domain.com

Running strace against the above command showed it blocking on the connect() call to saslauthd's socket, suggesting that saslauthd was overloaded and unable to serve connections in a timely manner:

socket(PF_LOCAL, SOCK_STREAM, 0)        = 3
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/saslauthd/mux"}, 110) = 0

I checked several of the authentication failures in /var/log/maillog and found that most of the usernames didn't exist on the system, and that a handful of IP addresses were each trying multiple usernames. It looked as though a brute-force attack was exhausting saslauthd's resources.

The mail log was showing around 150 failed login attempts per minute:

# grep "authentication failure" /var/log/maillog | cut -f1-2 -d":" | uniq -c
    143 Jun 22 12:00
    152 Jun 22 12:01
    158 Jun 22 12:02
    158 Jun 22 12:03
    160 Jun 22 12:04
    162 Jun 22 12:05
    149 Jun 22 12:06
    151 Jun 22 12:07
    157 Jun 22 12:08
    131 Jun 22 12:09
    156 Jun 22 12:10
    156 Jun 22 12:11
    155 Jun 22 12:12
    143 Jun 22 12:13

Breaking the failures down by IP address showed tens of thousands of failed attempts from some addresses:

# grep "authentication failure" /var/log/maillog | cut -f3 -d"[" | cut -f1 -d"]" | sort | uniq -c | sort -n
<OUTPUT REDACTED>
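
To narrow that list down to the worst offenders, the same pipeline can be wrapped in a small function that only prints addresses at or above a chosen number of failures. This is a rough sketch rather than part of the original debugging session: the function name offenders and its threshold argument are made up for illustration, and it assumes the log lines have the same layout the cut commands above rely on.

```shell
#!/bin/sh
# offenders THRESHOLD: read mail log lines on stdin and print "count ip"
# for every source IP with at least THRESHOLD failures, busiest first.
# Assumes the IP sits inside the second pair of square brackets on the
# line, which is what the cut pipeline above also relies on.
offenders() {
    threshold="$1"
    grep "authentication failure" \
        | cut -f3 -d"[" | cut -f1 -d"]" \
        | sort | uniq -c | sort -rn \
        | awk -v min="$threshold" '$1 >= min { print $1, $2 }'
}
```

For example, offenders 1000 < /var/log/maillog would list only the addresses with a thousand or more failures.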

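I decided to configure Fail2Ban to ban any IP address after 3 failed logins (maxretry = 3). Fail2Ban counts these within its findtime window rather than strictly in a row. Note that this may block legitimate users who mistype a password, so tweak it for your own needs: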

Add this to /etc/fail2ban/jail.local:

[sasl]
enabled  = true
port     = 25,587
filter   = postfix-sasl
logpath  = /var/log/maillog
maxretry = 3

Create this in /etc/fail2ban/filter.d/postfix-sasl.conf:

[INCLUDES]
before = common.conf

[Definition]
_daemon = postfix/smtpd
failregex = ^%(__prefix_line)swarning: [-._\w]+\[<HOST>\]: SASL (?:LOGIN|PLAIN|(?:CRAM|DIGEST)-MD5) authentication failed(: [ A-Za-z0-9+/]*={0,2})?\s*$
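
Before restarting anything, it may be worth dry-running the filter against the existing log with fail2ban-regex, which ships with Fail2Ban and only reports how many lines the failregex matches; it does not ban anyone. A minimal sketch, assuming the log and filter paths used in this post (the LOG and FILTER variable names are just for illustration):

```shell
#!/bin/sh
# Dry-run the postfix-sasl filter against the mail log.
# Paths are the ones used in this post; adjust them if yours differ.
LOG=/var/log/maillog
FILTER=/etc/fail2ban/filter.d/postfix-sasl.conf

if command -v fail2ban-regex >/dev/null 2>&1 && [ -r "$LOG" ] && [ -r "$FILTER" ]; then
    # Prints a summary of matched and missed lines; no bans are applied.
    fail2ban-regex "$LOG" "$FILTER"
else
    echo "fail2ban-regex, $LOG or $FILTER not available; skipping dry run" >&2
fi
```

If the summary reports zero matches while the log is full of SASL failures, the regex probably does not fit your log format and the jail would never trigger.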

Restart Fail2Ban and watch its log: you should start to see it banning IPs, and saslauthd should start responding normally again.
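
One way to confirm the jail is doing its job is to ask fail2ban-client for the jail's status and look at the most recent ban entries in the log. This is a sketch under a couple of assumptions: the helper name check_jail is made up, /var/log/fail2ban.log is only the usual default log location, and the commands generally need to run as root.

```shell
#!/bin/sh
# check_jail JAIL: print the jail's status and the latest ban entries.
# The log path is Fail2Ban's usual default and is an assumption here.
F2B_LOG=/var/log/fail2ban.log

check_jail() {
    jail="$1"
    if ! command -v fail2ban-client >/dev/null 2>&1; then
        echo "fail2ban-client not found" >&2
        return 127
    fi
    # Shows failure and ban counts plus the banned IP list for the jail.
    fail2ban-client status "$jail" || return
    # Show the 20 most recent ban entries, if the log is readable.
    if [ -r "$F2B_LOG" ]; then
        grep "Ban" "$F2B_LOG" | tail -n 20
    fi
    return 0
}
```

With the jail defined above, that would be check_jail sasl. Re-running the earlier testsaslauthd command afterwards is a quick way to see whether authentication latency has recovered.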