The new MailWasher Pro Technical Information

Forum for MailWasher Pro 7 and/or older 2011/2012 versions.
rusticdog
Firetrust Monkey
Posts: 14018
Joined: Mon Jun 13, 2005 6:27 pm

The new MailWasher Pro Technical Information

Tue Jul 06, 2010 12:47 am

System Requirements
Compatible with 32-bit and 64-bit Windows
Windows XP, Vista and 7
Needs .NET Framework 3.5 SP1 (if you have all your Windows updates, you'll be fine)
http://www.microsoft.com/downloads/deta ... laylang=en


Process Details
The MailWasher User Interface runs as MailWasherPro.exe

Ports
MailWasher requires outbound communication on the following ports:
4051 for FirstAlert
80 for Registration
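
If a firewall is in the way, a quick connectivity check can confirm whether these ports are reachable. This is only a rough sketch: it assumes both services use TCP (the post does not say), and the host name you pass in is up to you, since the server names are not given here. `can_connect` and `check_ports` are made-up helper names.

```python
import socket

# Ports named in the post. TCP is an assumption; the post does not state the protocol.
REQUIRED_PORTS = {4051: "FirstAlert", 80: "Registration"}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host: str) -> dict:
    """Map each service name to whether its port is reachable on the given host."""
    return {name: can_connect(host, port) for port, name in REQUIRED_PORTS.items()}
```

Call `check_ports()` with the server you want to test; a `False` entry suggests the port is blocked somewhere between you and that server.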


What about a 64-bit version?
MailWasher does not run natively as a 64-bit application, because some of the third-party libraries we use are 32-bit only. There is also no inherent benefit in MailWasher running natively as a 64-bit application, so it's not a priority for us.


Regular Expression Engine
For the Custom Filters, MailWasher uses a Perl-compatible regular expression engine called DEELX. If you need technical assistance with regular expressions, please feel free to ask in the forum. You can also view the DEELX syntax here: http://www.regexlab.com/en/deelx/syntax.htm
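
To get a feel for the kind of pattern a custom filter might use, here is a small sketch. It uses Python's `re` module as a stand-in, not DEELX itself, and sticks to basic constructs (character classes and word boundaries) that Perl-compatible engines generally share. The pattern is a made-up example, not one that ships with MailWasher, so check it against the DEELX syntax page before relying on it.

```python
import re

# Hypothetical filter pattern: "viagra" with a few common character swaps.
# Python's re is used here only as a stand-in for a Perl-compatible engine.
PATTERN = re.compile(r"\bv[i1!][a@]gra\b", re.IGNORECASE)


def looks_like_spam_word(subject: str) -> bool:
    """Return True if the subject contains one of the disguised spellings."""
    return PATTERN.search(subject) is not None
```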


DataGrid and Other
The DataGrid is Xceed Datagrid for WPF
There is also use of some Chilkat and EmailArchitects components


User Data Files
Under Help >> User Files >> you will find all the user files, settings and log files.
Under Help >> Common Files >> you will find all the language files, some FavIcons and the default user files required if any go missing from the directory above.


User Data Files and their purpose
Files are typically self-explanatory; for example, RecycleBinSettings.xml is where MailWasher stores the settings for the Recycle Bin.
BayesianKnobs.xml, FirstAlertKnobs.xml and RBLKnobs.xml are generated by the engine currently for debugging purposes.
Regex.txt is generated by the engine and is a base64 encoded file based on any custom filters.
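
If you are curious what a base64 encoded file holds, decoding it is straightforward. The sketch below only shows the decoding step; what the decoded bytes of Regex.txt actually look like is not documented here, so treat the result as opaque data. `decode_base64_text` is a made-up helper name.

```python
import base64


def decode_base64_text(encoded: str) -> bytes:
    """Decode base64 text into raw bytes (line breaks in the input are ignored)."""
    return base64.b64decode(encoded)
```

Work on a copy of the file rather than the original, since the engine regenerates it.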

The cache subfolder
MWP.db3 - This is an SQLite database which contains your Friends List, Blacklist, cached results from FirstAlert and DNSBLs, hash tables for deleted and cached emails and other miscellaneous settings.
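
Because MWP.db3 is a plain SQLite file, standard tools can inspect it. The sketch below lists the table names of an SQLite file opened read-only. The table names inside MWP.db3 are not documented here, so the code discovers them instead of assuming any. It is safest to run this against a copy of the file while MailWasher is closed. `list_tables` is a made-up helper name.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Return the table names in an SQLite file, opened read-only."""
    # mode=ro makes SQLite refuse any write to the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```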

All of the .dat files are related to the Bayesian data as explained here viewtopic.php?f=48&t=5401

The crypto subfolder
This stores encrypted email data. It's kept outside the database for faster access. Yes, it gets very large, but it cleans itself up, so don't go deleting files manually.
0001 File - A Recycled or Cached Message
0002 File - Extra message information for quick access
0004 File - Extra information stored on a message when it is deleted
0010 File - A message trained as Spam
0011 File - A message trained as Good


Common Files and their purpose
The .dat files are backups which MailWasher will copy if the user files of the same name go missing.
DNSBLServerList.xml is a list of DNSBLs available in the drop down menu for quickly adding a preconfigured server.
ServerList.xml is a list of various preconfigured settings for different email providers. MailWasher will use these settings by default when manually adding an account.
FavIcons are displayed in the UI beside the account name. This folder contains the icons for the most common email provider accounts.


The Languages subfolder
Contains all the language files. If anyone wishes to translate MailWasher, you must contact us first.
By default MailWasher loads the standard English language file first. If another language is selected under Settings >> General >> Application, MailWasher then loads that language file over the top of the English text.

This ensures that all items/fields in MailWasher have some text, in case the selected Language file has some missing translations.
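
The fallback behaviour described above amounts to overlaying one set of strings on another. A minimal sketch, assuming the strings have already been read into dictionaries keyed by some string identifier (the real file layout is not shown here, and the keys below are made up). Treating an empty translation as missing is also an assumption.

```python
def merge_language(english: dict[str, str], selected: dict[str, str]) -> dict[str, str]:
    """Start from the English text and overlay any non-empty translated strings."""
    merged = dict(english)
    merged.update({key: text for key, text in selected.items() if text})
    return merged


# Hypothetical string keys, for illustration only.
english = {"check_mail": "Check Mail", "wash_mail": "Wash Mail"}
german = {"check_mail": "Mail abrufen"}
ui_text = merge_language(english, german)
```

Here `ui_text` still has an entry for `wash_mail`, taken from the English text, because the selected language did not supply one.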

The list of languages available is determined by the Alias.xml file in the Languages sub-folder.

Code:

<?xml version="1.0" encoding="utf-8" ?>
<LanguageAliases>
  <Alias Filename="Language.xml" Display="English" />
  <Alias Filename="LanguageUK.xml" Display="Proper English" />
  <Alias Filename="LanguageGE.xml" Display="Deutsch" />
  <Alias Filename="LanguageFR.xml" Display="Français" />
</LanguageAliases>

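
The Alias.xml layout shown above is simple enough to read with any XML parser. A minimal sketch using Python's standard library, with the sample content copied from this post; `read_aliases` is a made-up helper name, and the UTF-8 encoding is taken from the XML declaration.

```python
import xml.etree.ElementTree as ET

ALIAS_XML = """<?xml version="1.0" encoding="utf-8" ?>
<LanguageAliases>
  <Alias Filename="Language.xml" Display="English" />
  <Alias Filename="LanguageUK.xml" Display="Proper English" />
  <Alias Filename="LanguageGE.xml" Display="Deutsch" />
  <Alias Filename="LanguageFR.xml" Display="Français" />
</LanguageAliases>"""


def read_aliases(xml_text: str) -> dict[str, str]:
    """Map each display name to its language file name."""
    # Parse as UTF-8 bytes so the encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {alias.get("Display"): alias.get("Filename") for alias in root.iter("Alias")}


aliases = read_aliases(ALIAS_XML)
```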
When you switch languages, MailWasher reads in the text of the new language and outputs any missing strings to the <language_name>.txt file. This allows anyone translating the software to easily identify the parts of the language file that still need to be translated. MailWasher also writes any missing English strings to the selected XML language file, adding the attribute Translated="False". You can search the XML file for Translated="False" to find the missing strings. Once a string has been translated, you can either change Translated="False" to Translated="True" or simply remove the attribute entirely.
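
Searching for Translated="False" can also be done with a few lines of script. In this sketch the element and attribute names other than `Translated` are made up, since the layout of the language files is not shown here; the code only relies on the `Translated="False"` marker described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical language file content; only the Translated attribute comes from the post.
SAMPLE_XML = """<Language>
  <String Id="check_mail" Text="Mail abrufen" />
  <String Id="wash_mail" Text="Wash Mail" Translated="False" />
  <String Id="settings" Text="Settings" Translated="False" />
</Language>"""


def untranslated(xml_text: str) -> list[dict[str, str]]:
    """Return the attributes of every element still marked Translated="False"."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [dict(el.attrib) for el in root.iter() if el.get("Translated") == "False"]


todo = untranslated(SAMPLE_XML)
```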



Adding New Languages :
To add a new language to MailWasher Pro, edit the Alias.xml file in the \Languages\ subfolder of the data directory to specify the new language, then restart MailWasher and select the new language. MailWasher will then create the new language file, read in its text and output any missing strings to the <language_name>.txt file, so anyone translating the software can easily identify the parts that still need to be translated. MailWasher also writes any missing English strings to the selected XML language file, adding the attribute Translated="False". You can search the XML file for Translated="False" to find the missing strings. Once a string has been translated, remove the Translated="False" text entirely.

For example, adding this line to the Alias file
<Alias Filename="LanguageEV.xml" Display="Elvish" />

Then following the steps above (restart MailWasher and select the new language) will create a new LanguageEV.xml file. It will be a full copy of the default English file, with every string that requires translation carrying the Translated="False" attribute. LanguageEV.txt will likewise list the strings that still need translating.



Command Line Switches
-nosplash : Stops the Splash Screen from appearing on startup of MailWasher
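
If you start MailWasher from a script, the switch is simply passed as an argument. A small sketch that builds the command line; the install location is not given in this post, so the caller supplies the full path to MailWasherPro.exe, and `build_command` is a made-up helper name.

```python
import subprocess


def build_command(exe_path: str, splash: bool = False) -> list[str]:
    """Build the argument list, adding -nosplash unless the splash screen is wanted."""
    command = [exe_path]
    if not splash:
        command.append("-nosplash")
    return command


def launch(exe_path: str) -> subprocess.Popen:
    """Start MailWasher without the splash screen and return the process handle."""
    return subprocess.Popen(build_command(exe_path))
```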
rusticdog

How the MailWasher Bayesian works. (Geek Alert)

Mon Jul 19, 2010 6:50 am

The difficulty with any Bayesian filter is that it needs a healthy volume of email to be classified as good or bad before it becomes effective. To help with this, we've added some tweaks to the current UI under Settings >> Spam Tools >> Learning.


On a low Bayesian sensitivity setting :
MailWasher will limit the authority the Bayesian can have, so the highest 'weighting' it will apply is ±99, or 'Infinite' for a message that has been trained.
If the Bayesian evaluates the message within the range -50 to +50, and the other Spam Tools bring the Total Rating outside that range, then MailWasher will automatically train that email as either good or bad.


On a high Bayesian sensitivity setting :
MailWasher will give the Bayesian more authority, so the highest 'weighting' it will apply is ±149, or 'Infinite' for a message that has been trained.
If the Bayesian evaluates the message within the range -75 to +75, and the other Spam Tools bring the Total Rating outside that range, then MailWasher will automatically train that email as either good or bad.
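
The two sensitivity settings can be summarised as a cap on the weighting plus an auto-train band. A minimal sketch of those rules as described above; the function names are made up, and treating the band edges (-50/+50, -75/+75) as inside the range is an assumption. Which sign means spam is not stated here, so the sketch only says whether auto-training would trigger, not in which direction.

```python
# Values from the post: maximum weighting and auto-train band per sensitivity.
LIMITS = {
    "low": {"max_weight": 99, "auto_train_band": 50},
    "high": {"max_weight": 149, "auto_train_band": 75},
}


def clamp_weight(raw: float, sensitivity: str) -> float:
    """Limit the Bayesian weighting to the cap for the chosen sensitivity."""
    cap = LIMITS[sensitivity]["max_weight"]
    return max(-cap, min(cap, raw))


def should_auto_train(bayes_score: float, total_rating: float, sensitivity: str) -> bool:
    """True when the Bayesian is undecided but the Total Rating is outside the band."""
    band = LIMITS[sensitivity]["auto_train_band"]
    return abs(bayes_score) <= band and abs(total_rating) > band
```

For example, a Bayesian score of 60 with a Total Rating of 120 does not trigger auto-training on the low setting, but does on the high setting, because 60 falls inside the wider ±75 band.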


Minimum word length - Sets the minimum number of characters a word (token) must have to be evaluated by the Bayesian. So words like 'of' are completely ignored.
Default is 4

Maximum word length - Sets the maximum number of characters a word can have and still be evaluated by the Bayesian.
Default is 30

Use lower case - By default MailWasher ignores capitalisation when evaluating words, so for example 'monkey' is considered the same as 'Monkey'. Unchecking this option will treat words with different capitalisation separately.
Default is checked.

Good token weight - This setting gives more authority to words considered to be good, so an email containing both good and spam words will generally come out as good. The current setting of 2.0 doubles their weight.
Default is 2.0

Minimum count for inclusion - A word must occur this number of times before it will have its probability mapped.
Default is 5

Certain spam count - If an email has 0 good tokens and more than the specified number of bad tokens, MailWasher will class it as definitely spam and return the Certain Spam Score regardless of the Bayesian evaluation. If this is a negative number, the feature is disabled. A typical number to set would be around 10, though if very few emails have been trained as good it may cause false positives and mark good email as spam.
Default is -1

Interesting word count - The number of tokens, both good and bad, that will be considered when performing the final evaluation of the email. A large number early on, before much email has been trained, can make the Bayesian evaluation too authoritative, so it is currently set to 10. 15-25 is a better choice once more emails have been trained.
Default is 10
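
To show how settings like these can fit together, here is a rough sketch. The `certain_spam` rule follows the description above. The probability formulas are an assumption: they are borrowed from the scheme in Paul Graham's essay "A Plan for Spam", which uses very similar knobs, and this post does not say that MailWasher computes its scores this way. All function names, the 0.01/0.99 clamps, and counting raw occurrences against the minimum count are choices made for the sketch.

```python
from math import prod

GOOD_TOKEN_WEIGHT = 2.0   # "Good token weight" default
MIN_COUNT = 5             # "Minimum count for inclusion" default
INTERESTING_WORDS = 10    # "Interesting word count" default


def certain_spam(good_tokens: int, bad_tokens: int, threshold: int = -1) -> bool:
    """Rule from the post: no good tokens and more than `threshold` bad ones."""
    if threshold < 0:
        return False  # a negative threshold disables the feature
    return good_tokens == 0 and bad_tokens > threshold


def token_probability(good: int, bad: int, n_good: int, n_bad: int):
    """Assumed Graham-style spam probability for one token, or None if too rare.

    good/bad are the token's counts in each corpus; n_good/n_bad are the numbers
    of trained good and spam emails (both must be greater than zero).
    """
    if good + bad < MIN_COUNT:
        return None
    good_ratio = min(1.0, GOOD_TOKEN_WEIGHT * good / n_good)
    bad_ratio = min(1.0, bad / n_bad)
    p = bad_ratio / (good_ratio + bad_ratio)
    return max(0.01, min(0.99, p))


def spam_probability(token_probs: list) -> float:
    """Combine the most 'interesting' probabilities (those furthest from 0.5)."""
    ranked = sorted(token_probs, key=lambda p: abs(p - 0.5), reverse=True)
    top = ranked[:INTERESTING_WORDS]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)
```

The sketch also shows why a high interesting word count can be too authoritative early on: with little training data, many tokens sit at the extreme values, and multiplying more of them pushes the result hard towards 0 or 1.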

Whole words can also be excluded currently by manually editing the 'mwp_exw.dat' file in the Application Data\Firetrust\MailWasher\Cache\ directory. You can open this file in a standard text editor, one word per line.

Whole words can also be included currently by manually editing the 'mwp_inw.dat' file in the same directory. You can open this file in a standard text editor, one word per line. For example, a word must normally have at least 4 characters to be considered, but adding the word 'sex' to this file makes MailWasher consider it anyway.
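
The word length, lower case, exclude and include rules can be pictured as a single filter over the words of an email. This is a minimal sketch of that idea: the order in which the rules are applied is an assumption, `select_tokens` is a made-up name, and how MailWasher splits a message into words in the first place is not covered.

```python
def select_tokens(words, min_len=4, max_len=30, lower=True,
                  include=frozenset(), exclude=frozenset()):
    """Keep the words the Bayesian would evaluate under the settings above."""
    tokens = []
    for word in words:
        token = word.lower() if lower else word
        if token in exclude:
            continue  # listed in the exclude file: always ignored
        if token in include or min_len <= len(token) <= max_len:
            tokens.append(token)  # listed in the include file, or a valid length
    return tokens
```

With the defaults, `select_tokens("Cheap sex of Monkey".split())` keeps only 'cheap' and 'monkey'; passing `include={"sex"}` keeps 'sex' as well, even though it has fewer than 4 characters.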

Words can also be converted in the 'mwp_conv.dat' file in the same directory. For example, the following would convert all the different variations to a plain 'viagra':
v1agra viagra
\/iagra viagra
/iagra viagra
vi@gra viagra
/i@gra viagra
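
The conversion list above is effectively a lookup table from a variant spelling to the plain word. A minimal sketch, assuming each line holds exactly two whitespace-separated entries as in the example; `parse_conversions` and `convert` are made-up names.

```python
# Sample lines copied from the example above (raw string keeps the backslash).
SAMPLE = r"""v1agra viagra
\/iagra viagra
/iagra viagra
vi@gra viagra
/i@gra viagra"""


def parse_conversions(text: str) -> dict[str, str]:
    """Build a variant -> plain word table from 'variant plain' lines."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            variant, plain = parts
            table[variant] = plain
    return table


def convert(token: str, table: dict[str, str]) -> str:
    """Return the plain word for a known variant, otherwise the token unchanged."""
    return table.get(token, token)


conversions = parse_conversions(SAMPLE)
```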


Other files are
mwp_nswl.dat - This is the non-spam corpus, it is rebuilt every time mail is washed.

mwp_swl.dat - This is the spam corpus, it is rebuilt every time mail is washed.

mwp_pmap.dat - This file stores the probability mappings for tokens.

MWP.db3 - This is the main MailWasher database file, it stores friends list, blacklist, email deleted, the corpus files etc...
rusticdog

Re: The new MailWasher Pro Technical Information

Wed Aug 25, 2010 3:25 am

Hot Keys :


[ARROW DOWN] Move down one message

[PAGE UP] Move up one page

[PAGE DOWN] Move down one page

[ESC] Closes Preview Window

[DELETE] Marks for Delete

[D] Marks for Delete

Marks for Bounce

[+] Adds to Friends List

[-] Adds to Blacklist

[ENTER] Preview

[SPACE] Preview

[F5] Check Mail

[F6] Process mail

[C] Clear message list

[F7] Launch email application

[F8] Display Accounts window

[CTRL] + [F6] View Preview pane

[CTRL] + [F7] View Filter sidebar

[CTRL] + [F8] Show hidden emails

[CTRL] + [A] Select all

[CTRL] + Show Settings

[CTRL] + [R] Show Recycle Bin

[CTRL] + [T] Display Evaluation Sidebar

[CTRL] + [Shift] + [M] Maximise/Minimise

[<] Mark as Spam
[,] Mark as Spam

[>] Mark as Good
[.] Mark as Good
