Introduction to regex filters

Forum for MailWasher Pro 7 and/or older 2011/2012 versions.
stan_qaz
Omniscient Kiwi
Location: Gilbert, Arizona
Posts: 8671
Joined: Fri Jul 25, 2008 5:13 am

Introduction to regex filters

Sat Apr 10, 2010 4:55 am

I'm far from an expert on regular expressions or using them in MW filters but this should be enough to get you started.

The regex engine that MW is using can be found here along with the documentation and a testing/learning utility:

http://www.regexlab.com/en/deelx/


This is the main Syntax page that links to the details of each expression type:

http://www.regexlab.com/en/deelx/syntax.htm


There is an introductory text here:

http://www.regexlab.com/en/regref.htm


Here's a link to the older copy of MTracer, a regular expression testing tool. This is the version that is not getting picked up by AVG:

http://s3.amazonaws.com/Firetrust/MTracer.zip


The main thing to keep in mind is that in regex mode some characters act as commands rather than as text to be matched. For example, "|" is treated as the "OR" command; to match a literal "|" in your text you must escape it with a "\", so it becomes "\|" in your filter rule.
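
As a rough illustration of the escaping rule, here is a small sketch in Python. Python's `re` module is not the DEELX engine that MW uses, and the sample subject line and the `matches` helper are made up for the example, but "|" as OR and "\|" as a literal pipe work the same way in both.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the regex pattern is found anywhere in text."""
    return re.search(pattern, text) is not None

subject = "Sale | 50% off (today only)"  # hypothetical subject line

# Unescaped, "|" means OR: the pattern is "Sale " OR " 50%", so it also
# matches text that contains no pipe character at all.
print(matches(r"Sale | 50%", "Sale ends soon"))   # True

# Escaped, "\|" matches only a literal pipe.
print(matches(r"Sale \| 50%", "Sale ends soon"))  # False
print(matches(r"Sale \| 50%", subject))           # True

# In Python, re.escape() escapes every special character in a literal string.
print(matches(re.escape("(today only)"), subject))  # True
```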

You may notice that regex code is not very easy to read when you come back to it after a few days; it is much easier to write than it is to read and understand. I work around this by keeping a text file with my regex filter rules, along with enough comments on what I was thinking when I wrote each one that I don't have to puzzle out what the regex code is actually doing.
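
The same idea of keeping the notes next to the rule can be sketched in Python, whose `re.VERBOSE` flag allows comments inside the pattern itself. This is only an illustration: the rule and the name `PRIZE_SUBJECT` are invented, and nothing here assumes that MW filters accept inline comments, so the separate text file remains the safe approach for MW.

```python
import re

# Hypothetical rule: subject shouts about winning something.
# Written with "You have WON!!!" style subjects in mind.
PRIZE_SUBJECT = re.compile(
    r"""
    \b(?:won|winner|prize)\b   # one of the trigger words, as a whole word
    .*                         # anything in between
    !{2,}                      # two or more exclamation marks
    """,
    re.VERBOSE | re.IGNORECASE,
)

print(bool(PRIZE_SUBJECT.search("You have WON a prize!!!")))  # True
print(bool(PRIZE_SUBJECT.search("Meeting notes for Monday")))  # False
```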

There are a few examples of working regex filters in these posts if you want to see what they look like before you dive in:

http://forum.firetrust.com/viewtopic.php?p=24888#p24888

http://forum.firetrust.com/viewtopic.php?p=24989#p24989

http://forum.firetrust.com/viewtopic.php?p=24991#p24991 (not perfect but a good complex example)

http://forum.firetrust.com/viewtopic.php?p=25105#p25105

Reading the whole topic for regex related posts will also give you some additional hints and tips:

http://forum.firetrust.com/viewtopic.php?f=48&t=5575

The most important tip may be that the MW engine changes the default case sensitivity of the regex engine:

http://forum.firetrust.com/viewtopic.php?p=25117#p25117
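
To see why the case-sensitivity default matters, here is a short Python sketch. It says nothing about which way MW's default goes (see the linked post for that); it only shows how the same pattern behaves under each setting in Python's `re` module. The subject text and the `found` helper are made up for the example.

```python
import re

def found(pattern: str, text: str, flags: int = 0) -> bool:
    """Return True if pattern is found in text under the given flags."""
    return re.search(pattern, text, flags) is not None

subject = "Special OFFER inside"  # hypothetical subject line

print(found("offer", subject))                 # False: case sensitive
print(found("offer", subject, re.IGNORECASE))  # True: case insensitive
print(found("(?i)offer", subject))             # True: inline flag in the pattern
```

A rule tested in one tool and run in another can therefore match more or less than expected if the two tools disagree on the default.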

Edit: Added link to introduction.
I am not a Firetrust employee just a MW user.
--
First rule of computer consulting: Sell a customer a Linux computer and you'll eat for a day,
sell a customer a Windows computer and you'll eat for a lifetime.
Wizcrafts
Guardian Gecko
Contact:
Location: Flint, Michigan, USA
Posts: 388
Joined: Wed Sep 17, 2008 5:37 am

Re: Introduction to regex filters

Wed Aug 18, 2010 8:50 am

Stan;
I am looking in vain for a place within the MWP filters section to test RegExpr filter rules, like the one that exists in v6.5.4. Do you know if this built-in test input field exists at this time in v2010.1.0.10?

If there is no test box in this version, where does one go to test RegExpr as they pertain to this incarnation of the program?

Thanks
Submitted respectfully by Wiz.
Member of the MailWasher Beta Tester Team
Fighting spam by writing, updating and publishing MailWasher Pro custom filters.
See www.wizcrafts.net/mwp-filters.html
stan_qaz
Omniscient Kiwi
Location: Gilbert, Arizona
Posts: 8671
Joined: Fri Jul 25, 2008 5:13 am

Re: Introduction to regex filters

Wed Aug 18, 2010 9:04 am

You can go here: http://www.regexlab.com/en/mtracer/ but the latest version is kicking off an AV alert and getting sent to the bin on my system. Edit: Sent them a note about the AV issue.

If you have issues with it, e-mail me (stan at stanmiller dot info) and I'll mail you the older 2.1 version from 06-2009 that is working for me.
Nik
Rattled Rabbit
Posts: 3
Joined: Sat Aug 21, 2010 3:31 am

Re: Introduction to regex filters

Sat Aug 21, 2010 3:58 am

Hi stan,

as you say, MTracer has a virus... Is it possible to get the old version?

Anyway, I have one question about filters: it seems that filters don't support the Unicode character set. Specifically, I have made a filter:

'Subject' Contains Plain Text
[Samskrta]

Filter finds following subjects:

Re: [Samskrita] Kalpa Shastra
RE: [Samskrita]
...

but when there is a devanagari text in the subject, the message isn't filtered, for instance:

RE: [Samskrita] आन्तरराष्ट्रीय रामायण परिषत्
[Samskrita] Learning Saanskrit by fresh approach - Lesson 42 संस्कृतभाषायाः नूतनाध्ययनस्य द्विचत्वारिंशः (४२) पाठः ।
...


Today the same problem happened with a similar filter that searches for "[Indo-Eurasia]". It didn't recognise the subject:

Re: [Indo-Eurasia] FW: Enc: Quadro chinês - ano 1085 !!!

because of the "ê" (e with circumflex) character.

Is there any solution to this problem?

Regards
rusticdog
Firetrust Monkey
Posts: 15864
Joined: Mon Jun 13, 2005 6:27 pm

Re: Introduction to regex filters

Sat Aug 21, 2010 4:17 am

stan_qaz wrote:You can go here: http://www.regexlab.com/en/mtracer/ but the latest version is kicking off an AV alert and getting sent to the bin on my system. Edit: Sent them a note about the AV issue.

If you have issues with it e-mail me ( stan at stanmiller dot info) and I'll mail you the older 2.1 version from 06-2009 that is working for me.
Email it to me Stan and I'll upload it to the site.


Nik, could you view one of these [Samskrta] emails in the Source tab, copy/paste the full text and email it to me at forum@firetrust.com?
I'm fairly sure that the Beta Testers have already mentioned this isn't right, but the extra reminder won't hurt and a working example we can tool around with will help too.
Nik
Rattled Rabbit
Posts: 3
Joined: Sat Aug 21, 2010 3:31 am

Re: Introduction to regex filters

Sat Aug 21, 2010 4:50 am

rusticdog wrote: Nik, could you view one of these [Samskrta] emails in the Source tab, copy/paste the full text and email it to me at forum@firetrust.com
I'm fairly sure that the Beta Testers have already mentioned this isn't right, but the extra reminder won't hurt and a working example we can tool around with will help too.
I've sent you one message.

Messages are from open Google group:
http://groups.google.com/group/samskrita
stan_qaz
Omniscient Kiwi
Location: Gilbert, Arizona
Posts: 8671
Joined: Fri Jul 25, 2008 5:13 am

Re: Introduction to regex filters

Sat Aug 21, 2010 7:10 am

Tracker e-mailed via Gmail.
Wizcrafts
Guardian Gecko
Contact:
Location: Flint, Michigan, USA
Posts: 388
Joined: Wed Sep 17, 2008 5:37 am

Re: Introduction to regex filters

Sat Aug 21, 2010 7:20 am

I have now successfully converted all of my published MailWasher Pro 6 filters to the new XML format. Even with Ira's conversion tool it took me 8 hours to get everything corrected so that the filters no longer crash the application. Apparently, the new version crashes if there are any serious code or syntax errors in a regular expression. The previous version just removed that particular filter from the set. Somebody needs to look into this problem.
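
One way to catch gross syntax errors before loading a filter set is to try compiling each pattern first. The sketch below does this in Python; `check_patterns` and the sample patterns are hypothetical. Python's regex syntax is not identical to the DEELX syntax MW uses, so a pattern passing here is no guarantee, but unbalanced brackets and parentheses are flagged.

```python
import re

def check_patterns(patterns):
    """Return (pattern, error message) pairs for patterns that fail to compile."""
    problems = []
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append((pattern, str(exc)))
    return problems

# Hypothetical filter patterns: the last two are deliberately broken.
candidates = [r"free\s+money", r"(unclosed", r"[a-z"]

for pattern, message in check_patterns(candidates):
    print(f"{pattern!r}: {message}")
```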

I found, among other things, that the new filter format does not allow you to use actual angle brackets or ampersands in the filter name or description. Ira's converter missed this, and every filter that had either an angle bracket or an & either crashed the program or caused the custom filter set to be instantly replaced entirely with the default set of three filters.
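
Angle brackets and ampersands are reserved characters in XML, which is the likely reason they cause trouble in the new format. Standard XML escaping replaces them with entities, as in the Python sketch below. The helper name `safe_filter_name` and the sample name are made up, and whether MW reads escaped entities back correctly in names and descriptions is an assumption that this thread does not confirm.

```python
from xml.sax.saxutils import escape

def safe_filter_name(name: str) -> str:
    """Escape &, < and > so the name can sit inside XML element text."""
    return escape(name)

print(safe_filter_name("Pills & <Meds> spam"))
# Pills &amp; &lt;Meds&gt; spam
```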

I'm still deciding what to do to recoup my investment of time before I post the new format filters to my website.
rusticdog
Firetrust Monkey
Posts: 15864
Joined: Mon Jun 13, 2005 6:27 pm

Re: Introduction to regex filters

Sat Aug 21, 2010 7:22 am

Is it the latest MTracer on the site that is picked up by AVG?

I have an account with AVG that lets me upload files to be whitelisted, though I don't think they'd appreciate me uploading files from another company :ninja
But maybe I can just email the guy who set me up with the whitelist service.
rusticdog
Firetrust Monkey
Posts: 15864
Joined: Mon Jun 13, 2005 6:27 pm

Re: Introduction to regex filters

Sat Aug 21, 2010 7:24 am

Wizcrafts wrote:I'm still deciding what to do to recoup my investment of time before I post the new format filters to my website.
I asked The Big Cheese to assist with this, I'll remind him
stan_qaz
Omniscient Kiwi
Location: Gilbert, Arizona
Posts: 8671
Joined: Fri Jul 25, 2008 5:13 am

Re: Introduction to regex filters

Sat Aug 21, 2010 8:21 am

The version shown here as an upgrade:
[Attachment: regex-tracker-1.png]
http://www.regexlab.com/en/mtracer/download.htm
rusticdog
Firetrust Monkey
Posts: 15864
Joined: Mon Jun 13, 2005 6:27 pm

Re: Introduction to regex filters

Sat Aug 21, 2010 5:17 pm

Original post edited to include download link for older version MTracer that isn't picked up by AVG
http://s3.amazonaws.com/Firetrust/MTracer.zip
rusticdog
Firetrust Monkey
Posts: 15864
Joined: Mon Jun 13, 2005 6:27 pm

Re: Introduction to regex filters

Sat Aug 21, 2010 5:56 pm

AVG replied and have removed the MTracer detection; the fix should be in their next update in a few hours.
Nik
Rattled Rabbit
Posts: 3
Joined: Sat Aug 21, 2010 3:31 am

Re: Introduction to regex filters

Sat Aug 21, 2010 6:03 pm

rusticdog wrote:Original post edited to include download link for older version MTracer that isn't picked up by AVG
http://s3.amazonaws.com/Firetrust/MTracer.zip
AV doesn't complain about this file. Thanks a lot. :thumbsup
rusticdog
Firetrust Monkey
Posts: 15864
Joined: Mon Jun 13, 2005 6:27 pm

Re: Introduction to regex filters

Wed Aug 25, 2010 8:32 pm

Nik wrote:Anyway, I have one question about filters - it seems that filters don't support Unicode set. Specifically, I have made a filter:

'Subject' Contains Plain Text
[Samskrta]

Filter finds following subjects:

Re: [Samskrita] Kalpa Shastra
RE: [Samskrita]
...

but when there is a devanagari text in the subject, the message isn't filtered, for instance:

RE: [Samskrita] आन्तरराष्ट्रीय रामायण परिषत्
[Samskrita] Learning Saanskrit by fresh approach - Lesson 42 संस्कृतभाषायाः नूतनाध्ययनस्य द्विचत्वारिंशः (४२) पाठः ।
...

I've got 4 emails currently in my Inbox that show this, but it seems to be working so far. That said, I'm on a slightly different version than you.

There is a possible issue if the UTF-8 encoded text runs across multiple lines and breaks oddly, so I'll keep watching as they arrive.
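
For background on why encoding can matter here: non-ASCII subjects travel in the raw message as RFC 2047 "encoded-words", so the raw header may not contain the literal text a plain-text filter is looking for until it is decoded. The Python sketch below illustrates that general mechanism only. It is a guess at one possible cause, not a diagnosis of MailWasher's behaviour, and the helper `decode_subject` and the way the raw header is built are assumptions made for the example.

```python
from email.header import Header, decode_header, make_header

def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded-words in a raw Subject header value."""
    return str(make_header(decode_header(raw)))

# Build a raw header roughly the way a sending mail client might
# (illustrative only; real clients differ in how much they encode).
raw = Header("Re: [Indo-Eurasia] Quadro chinês", "utf-8").encode()

print(raw)                                      # ASCII-only encoded form
print("[Indo-Eurasia]" in raw)                  # False: tag is hidden by the encoding
print("[Indo-Eurasia]" in decode_subject(raw))  # True: tag is visible once decoded
```

A filter that matched against the undecoded header, or against a header whose encoded-words were split across folded lines and reassembled incorrectly, would miss the tag in the same way.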
