  1. #1
    Administrator Aristotle
    Join Date
    March 25th, 2001
    Location
    Washington, DC, USA
    Posts
    12,284

    Amazingly Good (client-side) Spam Filter Program

    I wanted to test this program out before I said anything about it, and now that I have, I can heartily recommend it. The name of the program is K9 and it can be found here:

    http://www.keir.net/k9.html

    K9 is a client-side application that only works in Windows and only for POP email accounts (which is more than likely the type of email account you are using). Client-side means anyone can use it: you do not need access to the mail server to run this program.

    It sits as a proxy between your POP email client (Outlook, Eudora, Pegasus, whatever) and your mail server. All you have to do is change the mail server and username settings in your client and you are good to go.

    K9 uses BAYESIAN FILTERING (to read about Bayesian filtering and all sorts of other interesting stuff about spam, go here: http://www.paulgraham.com/antispam.html). This means it LEARNS over time what types of email are spam and which types are good.

    It learns not simply by filtering out emails that use certain keywords, but by scoring emails based on how often certain words appear in spam emails, how often other GOOD words appear, etc. You can read a very detailed explanation of how Bayesian filtering works here:

    http://www.paulgraham.com/spam.html
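
To make the scoring idea concrete, here is a rough sketch in Python of the word-scoring approach described in that essay. This is not K9's actual code (I have not seen its internals). The function names, the tokenizer, and the `min_count` parameter are my own assumptions for illustration; the doubling of good counts, the 0.4 default for unknown words, the 0.01/0.99 clamps, and the "15 most interesting words" rule follow the essay.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


def count_tokens(messages):
    """Count how often each token occurs across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


def word_probability(word, good, bad, n_good, n_bad, min_count=5):
    """Estimate the probability that a mail containing `word` is spam."""
    g = 2 * good[word]  # good counts are doubled to bias against false positives
    b = bad[word]
    if g + b < max(min_count, 1):
        return 0.4  # rarely seen or unknown word: treated as mildly innocent
    good_ratio = min(1.0, g / n_good)
    bad_ratio = min(1.0, b / n_bad)
    p = bad_ratio / (good_ratio + bad_ratio)
    return min(0.99, max(0.01, p))  # never allow full certainty from one word


def spam_score(text, good, bad, n_good, n_bad, min_count=5, top_n=15):
    """Combine the most 'interesting' word probabilities into one score."""
    probs = [
        word_probability(w, good, bad, n_good, n_bad, min_count)
        for w in sorted(set(tokenize(text)))
    ]
    # Interesting words are those whose probability is farthest from neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    spam_product = 1.0
    good_product = 1.0
    for p in probs[:top_n]:
        spam_product *= p
        good_product *= 1.0 - p
    return spam_product / (spam_product + good_product)


# Tiny made-up corpora, only to show the calls; real training needs far more mail.
good_mail = ["meeting agenda for tomorrow", "lunch tomorrow with the team",
             "project agenda and notes"]
spam_mail = ["free money now", "win free prizes now", "free offer click now"]
good_counts = count_tokens(good_mail)
bad_counts = count_tokens(spam_mail)

score = spam_score("claim your free prizes now", good_counts, bad_counts,
                   len(good_mail), len(spam_mail), min_count=1)
print(f"spam score: {score:.4f}")
```

A score near 1 means "almost certainly spam" and a score near 0 means "almost certainly good". The essay treats anything above 0.9 as spam. Re-labeling a mail simply moves its words from one count table to the other, which is why the filter gets better as you correct it.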


    Whenever you download email, K9 scans the email (headers, subject, from, body, etc.) and analyzes ALL the content to decide if it is spam or not. You can choose what it then does with the SPAM mails. I have it add "[SPAM]" to the Subject line of spam, and then I tell my email program to put all mails with [SPAM] in the subject in a spam folder. The website explains how to do this if you find it confusing.
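
For anyone curious what the "[SPAM]" tagging step amounts to, here is a small Python sketch using the standard `email` module. It is only an illustration of the idea, not how K9 does it internally; the function name `tag_subject` and the sample message are made up.

```python
from email import message_from_string


def tag_subject(raw_message, is_spam, tag="[SPAM]"):
    """Return the raw message with `tag` prepended to the Subject if it is spam."""
    msg = message_from_string(raw_message)
    if is_spam:
        subject = msg.get("Subject", "")
        if not subject.startswith(tag):
            del msg["Subject"]  # no error if the header is missing
            msg["Subject"] = f"{tag} {subject}".strip()
    return msg.as_string()


raw = "From: someone@example.com\nSubject: Cheap pills\n\nBuy now!\n"
print(tag_subject(raw, is_spam=True))
```

Once the subject carries the tag, an ordinary mail client rule ("subject contains [SPAM], move to spam folder") does the rest.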

    It will probably take a week or so for you to train K9. I noticed that after about 100-200 mails it really started to get smart. While you are training it, you go through your mail (K9 keeps copies of all your mail in its own storage) and "re-label" any misclassified mails as GOOD or SPAM. This is how you train it.

    After it gets a few hundred emails under its belt, the accuracy rate is staggering. I reset my stats 2 days ago, after only 5 days of training, and here are my current stats:

    Code:
    Total number of emails processed.......................... 319
    Number of Good emails processed...........................  44
    Number of Spam emails processed........................... 275
    Percentage of emails that matched whitelist rules.........   3.8%
    Percentage of emails that matched blacklist rules.........   0.0%
    Number of emails re-classified to Good....................   1
    Number of emails re-classified to Spam....................   0
    Percentage emails misidentified as Spam (false positives).   0.3%
    Percentage emails misidentified as Good (false negatives).   0.0%
    Overall accuracy..........................................  99.7%
    If that did not format well, you can look at the stats here:

    http://www.thresholdrpg.com/misc/K9-results.gif
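
As a sanity check, the percentages above can be reproduced from the raw counts. The sketch below assumes K9 divides the re-classified counts by the total number of emails processed; that is my guess from the numbers, not something taken from K9's documentation.

```python
total = 319
good = 44
spam = 275
reclassified_to_good = 1  # mail K9 called spam that I re-labeled as good
reclassified_to_spam = 0  # mail K9 called good that I re-labeled as spam


def pct(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


false_positive_pct = pct(reclassified_to_good, total)
false_negative_pct = pct(reclassified_to_spam, total)
accuracy_pct = 100.0 - false_positive_pct - false_negative_pct

print(f"false positives: {false_positive_pct:.1f}%")
print(f"false negatives: {false_negative_pct:.1f}%")
print(f"accuracy:        {accuracy_pct:.1f}%")
```

Note that the false positive figure is relative to all mail, not just good mail. Measured against the 44 good emails alone, one mistake would be a noticeably larger percentage.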

    At this point, I cannot recommend K9 highly enough. The program is FREEWARE, but the author accepts donations.

    If the program continues to perform at this level for the next 2-3 weeks, I am definitely going to kick in a donation.

    If you use this program, please share your experiences as well!
    Capitalization is the difference between "I had to help my Uncle Jack off a horse." and "I had to help my uncle jack off a horse."

    There is never a good time for lazy writing!

  2. #2
    Administrator Aristotle
    I am bumping this because folks in another thread wanted a good example of a Bayesian filter program.

    Incidentally, I am still using this program and it rocks. I donated to the creator months ago.

    It currently has a 99.7% success rate, with 0.1% false negatives (spam mails it thought were good) and 0.2% false positives (good emails it thought were spam).

  3. #3
    Bullfrog
    Join Date
    May 23rd, 2003
    Location
    Nashua, NH
    Posts
    716
    I used this back when I actually had a POP mail account. I found it to work very well and highly recommend it. As far as mail goes, I've basically decided that I only need email at work, so I haven't set up a POP account with my ISP. I do have a Hotmail account and a shell account, but mail is pretty low on my radar screen these days.
    Don't get too perky!

  4. #4
    Bullfrog
    The spammers finally got their hands on my work email address. I was disappointed to see that this is only for POP3 and not Exchange, but I know there is only so much one man can do!

    What I did notice was that he seems to have a pretty good list of downloads available.

    http://www.keir.net/software.html
