Understanding the Roles of the HAM and SPAM Folders in Bayesian Filtering

If you navigate to the Service subdirectory of the SmarterMail installation directory, you'll notice several folders that contain email files. This topic explains the role of each of these directories and what data it stores.

Messages are stored in the directories shown below to collect statistics for Bayesian filtering, a spam detection method. Only copies of the messages are stored; the original emails continue to be delivered as normal. When the Bayesian filters are re-analyzed, the emails in these folders are cleaned out.

Folder Descriptions
  • Ham/Type1 - Collection of messages that have been explicitly marked as not spam through the "Unmark as spam" action
  • Ham/Type2 - Random collection of outgoing messages from the web interface
  • Ham/Type3 - Random collection of outgoing messages sent through SMTP
  • Spam/Type1 - Messages marked as spam from the interface
  • Spam/Type2 - Random collection of messages caught by the existing Bayesian Filter
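
To see how many messages each folder currently holds, the files can be counted with a short script. This is a minimal sketch rather than anything SmarterMail ships: the function name count_messages is hypothetical, the caller supplies the path to the Service directory, and it assumes each stored message is a single file directly inside its folder.

```python
from pathlib import Path

# Folder names taken from the list above, relative to the Service directory.
FOLDERS = ["Ham/Type1", "Ham/Type2", "Ham/Type3", "Spam/Type1", "Spam/Type2"]


def count_messages(service_dir):
    """Return {folder name: number of files} for each training folder."""
    root = Path(service_dir)
    counts = {}
    for name in FOLDERS:
        folder = root / name
        if folder.is_dir():
            # Assumption: one file per stored message, no nested subfolders.
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            # A folder that does not exist yet is treated as empty.
            counts[name] = 0
    return counts
```

Calling count_messages with the path to your own Service directory returns a dictionary keyed by the folder names above.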
 
Bayesian Filter Updating
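The filter update occurs when all three of the following conditions are met, where each term is the number of messages in the corresponding folder (for example, Type1Ham is the count in Ham/Type1):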
  • Type1Ham + Type2Ham + Type3Ham > Threshold
  • Type1Spam + Type2Spam > Threshold
  • Type1Spam > Threshold/2
This will happen no more than once every six hours.
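
The update rule above can be expressed as a small check. This is a sketch under stated assumptions, not SmarterMail's actual code: the function name should_update is hypothetical, the Threshold value is not given here and must be supplied by the caller, Threshold/2 is treated as ordinary (non-integer) division, and the six-hour limit is assumed to be measured from the time of the previous update.

```python
from datetime import datetime, timedelta

# The article states updates happen no more than once every six hours.
MIN_INTERVAL = timedelta(hours=6)


def should_update(counts, threshold, last_update, now):
    """Return True when all three count conditions and the six-hour limit hold.

    counts maps folder names ("Ham/Type1", ...) to message counts.
    last_update is a datetime, or None if no update has run yet.
    """
    ham = counts["Ham/Type1"] + counts["Ham/Type2"] + counts["Ham/Type3"]
    spam = counts["Spam/Type1"] + counts["Spam/Type2"]

    enough_ham = ham > threshold
    enough_spam = spam > threshold
    # Assumption: Threshold/2 is not rounded before comparing.
    enough_marked_spam = counts["Spam/Type1"] > threshold / 2

    # Assumption: the interval is measured from the previous update.
    interval_ok = last_update is None or now - last_update >= MIN_INTERVAL

    return enough_ham and enough_spam and enough_marked_spam and interval_ok
```

With a hypothetical Threshold of 100, for example, the check fails if Spam/Type1 holds 50 or fewer messages, no matter how many messages the other folders contain.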