Roaring Penguin Upgrades CanIt Spam Filter for Linux, Unix
Published: November 6, 2007
by Timothy Prickett Morgan
Spam may not be making as many headlines as it used to, but the spam problem certainly has not gone away. Spammers and hackers are unrelenting in their attacks on our desktops and servers, and the open source providers of antispam software have to keep cranking up their defenses in response. That's why Roaring Penguin Software has released a new line of antispam products based on its CanIt 4.0 release.
The Ottawa, Ontario, provider of antispam software for Linux and Unix is the commercial entity behind the MIMEDefang framework for email filtering. The framework combines the Milter API in Sendmail, some C code to talk to Linux and Unix, and Perl scripts in which the mail filters are written, yielding an open source email filter. To create CanIt, Roaring Penguin takes the open source SpamAssassin filter and weaves a bunch of features around it to make its filters work better. (SpamAssassin has notoriously bad heuristic filters.) The MIMEDefang project has been around since 2000, and there are probably tens of thousands of people using it, according to David Skoll, president of Roaring Penguin. The company has more than 800 customers using its commercial CanIt implementation of the code, many of which are large service providers that have immense spam filtering problems.
With CanIt 4.0, Roaring Penguin is adding performance improvements for the PostgreSQL database that is behind the spam filter. Or, more precisely, rather than try to store so much unstructured data inside PostgreSQL, the software includes a flat-file, Unix-style storage management daemon (a hierarchical message tree) that stores the vast amounts of unstructured data that have to be processed as emails are filtered. This significantly speeds up the performance of the filter.
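To make the idea of a hierarchical message tree concrete, here is a minimal Python sketch of one common way to keep bulky message bodies in flat files rather than in database rows. It is not Roaring Penguin's implementation, which the article does not describe in detail; the function names, the SHA-256 hashing, and the two-level directory layout are all assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path


def message_path(root, message_id, depth=2):
    """Map a message ID to a file path inside a nested directory tree.

    Hypothetical layout: the first `depth` byte pairs of the SHA-256 hex
    digest become directory names, so no single directory grows too large.
    """
    digest = hashlib.sha256(message_id.encode("utf-8")).hexdigest()
    parts = [digest[i * 2:(i + 1) * 2] for i in range(depth)]
    return Path(root).joinpath(*parts, digest)


def store_message(root, message_id, raw):
    """Write the raw message bytes to the tree and return the file path."""
    path = message_path(root, message_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(raw)
    return path


def load_message(root, message_id):
    """Read the raw message bytes back from the tree."""
    return message_path(root, message_id).read_bytes()
```

In a design like this, the database would only need to hold small structured records (sender, score, status), while the large unstructured message bodies live on the file system.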
The Bayesian filters inside the spam filter have also been tweaked to learn not just the words that appear in spam but also word pairs, which are often more indicative of what is spam and what is not. As part of the Roaring Penguin product, customers who turn on Bayesian filtering can contribute their own email streams and the training they do on their machines to help their peers who also use CanIt to filter email. Roaring Penguin maintains an aggregated database of 600,000 spams and 300,000 good emails on its support systems in Ottawa, which is updated continuously, to improve filtering for everyone.
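The word-pair idea can be illustrated with a small naive Bayes sketch in Python. This is a generic textbook-style classifier, not CanIt's actual filter; the class name, the tokenizer, and the add-one smoothing are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Return single words followed by adjacent word pairs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


class PairBayes:
    """Hypothetical naive Bayes filter trained on words and word pairs."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Count each token once per message (presence, not frequency).
        self.docs[label] += 1
        self.counts[label].update(set(tokens(text)))

    def spam_probability(self, text):
        # Sum log-likelihood ratios with add-one smoothing.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in set(tokens(text)):
            p_spam = (self.counts["spam"][tok] + 1) / (self.docs["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.docs["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Convert log-odds to a probability without overflowing exp().
        if log_odds >= 0:
            return 1 / (1 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1 + e)
```

A pair such as "cheap pills" gets its own count alongside "cheap" and "pills", so a phrase that is common in spam can weigh more heavily than either word would alone.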
The CanIT 4.0 update also has built-in graphical reports, so email administrators who are tracking spam statistics do not have to export data out of the software and import it into spreadsheets to do analysis.
Skoll says that for a server with 50 users, a license to CanIt 4.0 costs $500 for the first year, with support costing $100 per year after that. Site licenses for large server farms can run into the $100,000 range. He also says with a laugh that Roaring Penguin has no desire to support its products on Windows.