The root of the problem is the version of Berkeley DB used to store the bayes files. On Sarge it’s 4.3, but the DreamHost mail servers haven’t been upgraded yet, so they’re still using Perl 5.6.1 and DB 3.x.
This would explain why it was previously able to filter my mail but failed when training: the training was taking place on a Sarge machine. Once I upgraded the files to the 4.3 format (db4.3_upgrade), the filtering failed too. All my newly created bayes files are in 4.3 format, so they won’t work for filtering on the mail servers. I had a quick stab at downgrading them, without any success. Since DH is supposed to upgrade the mail servers any time now, it’s hardly worth trying to get it to work in the meantime.
So this message:
warn: bayes: cannot open bayes databases /home/.../.spamassassin/bayes_* R/O: tie failed:
means your bayes files have the wrong DB version. Check this page for more details:
http://wiki.apache.org/spamassassin/DbDumpAndLoad
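If you want to confirm that a format mismatch is what you are looking at, one rough way is to read the header of each bayes file directly. The sketch below is my own illustration, not something from the SpamAssassin wiki: it assumes the usual Berkeley DB metadata layout (a 4-byte magic number at byte offset 12, followed by a 4-byte format version), and the function names are made up for this example. The number it reports is the on-disk format version of the access method, not the library release, so it is mainly useful for comparing a file that works against one that does not.

```python
import struct
import sys

# Magic numbers found at byte offset 12 of a Berkeley DB metadata page.
MAGICS = {0x053162: "btree", 0x061561: "hash", 0x042253: "queue"}


def bdb_format(header):
    """Return (access_method, format_version), or None if not recognised."""
    if len(header) < 20:
        return None
    # Files are written in the byte order of the host that created them,
    # so try little-endian first and then big-endian.
    for order in ("<", ">"):
        magic, version = struct.unpack_from(order + "II", header, 12)
        if magic in MAGICS:
            return MAGICS[magic], version
    return None


def bdb_format_of_file(path):
    """Read the first 20 bytes of a file and inspect them."""
    with open(path, "rb") as fh:
        return bdb_format(fh.read(20))


if __name__ == "__main__":
    # Pass the bayes files to inspect as command-line arguments.
    for path in sys.argv[1:]:
        print(path, bdb_format_of_file(path))
```

Run it against the bayes files on both machines; if the version numbers differ between a copy that filters correctly and one that gives the tie failure, the DB format is the likely culprit.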
Another annoying thing that stumped me for ages is that procmailrc doesn’t appear to let you set the value of PERL5LIB. I thought setting it would fix not only the DB issue but also a Net::DNS version problem, which shows up in my logs like this:
warn: dns: Net::DNS version is 0.19, but need 0.34 at /home/.../usr/share/perl/5.8.4/Mail/SpamAssassin/Dns.pm line 589.
This is because DH is using some ancient version of Net::DNS. It’s not a fatal error, but a lot of useful tests need it. So without it and without Bayesian testing, I guess I’ll be seeing a lot more spam over the couple of weeks until it all gets resolved.