PROCMAIL QUICK START
Copyright © Nancy McGough & Infinite Ink
Originally published in 1994 as part of the Filtering Mail FAQ
Last modified 27-Nov-2007
| “... Quick Start at Infinite Ink, one of the best primers to a complex program I've ever seen.” |
| -- Kai Schätzl, 2004 May 16, Procmail list |
Procmail is free/libre open-source software that is both a mail processor and a mail delivery agent (MDA). It can be used by either a system administrator or a user to automatically process and deliver incoming mail messages. It can also be used to re-process and re-deliver messages that are already in a mailbox.
This Procmail tutorial is aimed at regular users, not system administrators. In order to use these instructions, you need:
In addition to understanding Internet mail flow, your mail messages must be delivered to a system that:
.forward file to set procmail as your LDA,
or
.maildelivery; with getmail,
use .getmailrc; and with fetchmail,
use .fetchmailrc.
If your system satisfies 4a or 4c, make sure that you skip Steps 8 & 9 below, that is, do not set up a .forward file.
| If you are looking for a provider that has procmail installed, see my list of Free or Reasonably-Priced IMAP Service Providers — most of the providers who give full Unix shell access also let their users write and manage their own procmail recipes. |
Note about the ordering of these terms: The terms listed earlier are used in the definitions of later terms.
pmlog for the Procmail log file, PMRE for Procmail
regular expression (defined below), and PMDIR for the variable
that points to the directory that holds your Procmail-related files.
Note that there are others who use this abbreviation, for example, Paul
Chvostek's Procmail Log Watch awk script is called pmlw
and Jari Aalto's Procmail Documentation and Procmail Library are located
at pm-doc.sf.net and pm-lib.sf.net.

| Metacharacter or Metaphrase | Meaning in a Procmail Regular Expression |
| . | any character other than newline (i.e., other than Line Feed or LF or ASCII character 10) |
| (string) | treat string as a single item |
| * | zero or more of preceding item |
| ? | zero or one of preceding item |
| + | one or more of preceding item |
| ^ | newline; usually used to match beginning of line; Note: As discussed in Re: procmail help! $MATCH grabs newline too, the meaning of the ^ metacharacter is different in Procmail than in most other RE tools. |
| $ | newline; usually used to match end of line; Note: As discussed in Re: procmail help! $MATCH grabs newline too, the meaning of the $ metacharacter is different in Procmail than in most other RE tools. |
| [characterList] | any single character in characterList |
| [^characterList] | any single character NOT in characterList |
| \< \> | each of \< and \> is a metaphrase for a "non-word character"; details are in the definition of word & Matching a Word below, and in Jari Aalto's Procmail and egrep differences |
| | (vertical bar) | OR; ORing in Procmail is discussed below |
| () | () is the null item and is always a match; it is used to make a recipe condition more readable, to escape a leading backslash (\), or to escape the special meaning of other metacharacters or metaphrases. For examples, see this discussion of ()< and this discussion of ()\/. Note that it is more common to escape a metacharacter with a backslash, which is discussed in the next item. |
| \metacharacter | Escape the special meaning of metacharacter and treat it as the literal character that it is; for example \. means the dot character rather than backslash followed by any character and \* means the asterisk character rather than zero or more backslashes. Note: The less-than character (<) needs to be escaped differently. Also see David W. Tamkin's 2004-April-24 message about leading backslash problem (was persistent lock files). Tip: Another way to escape a metacharacter is to put it in a [characterList], e.g., you can use either \. or [.] to mean a literal dot (.) in a PMRE. Example: In Setting Keywords or Labels below, there is a recipe condition that contains a literal $ character. |
| \/ | extraction operator (discussed below) |
| \newline | If the last two characters on a line are a backslash (\) followed by newline, this is a “continuation backslash” and it tells Procmail to continue the current line with the text on the next line and ignore these characters: backslash (\), newline, white space (spaces, tabs, and/or newlines). |
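To give a feel for how a few of these metacharacters look inside recipe conditions, here is a small shell sketch that writes two example recipes to a scratch directory, so it cannot disturb a live setup. The file name rc.pmre-demo, the mailbox names, and the addresses at lists.example.org are all made up for illustration, and the recipes have not been run against real mail.

```shell
#!/bin/sh
# Sketch only: write example recipes to a scratch directory, not to $HOME.
demo_dir=$(mktemp -d)

# The quoted 'EOF' keeps every backslash in the recipes literal.
cat > "$demo_dir/rc.pmre-demo" <<'EOF'
# ^ anchors the match at the start of a header line, .* skips any text,
# and \. is a literal dot (an unescaped dot would match any character).
:0:
* ^List-Id:.*procmail\.lists\.example\.org
IN-procmail

# (To|Cc) groups two alternatives with the OR metacharacter, and
# [.] is another way to write a literal dot.
:0:
* ^(To|Cc):.*announce@lists[.]example[.]org
IN-announce
EOF

cat "$demo_dir/rc.pmre-demo"
```

Both recipes end their first line with a second colon because they deliver to mbox-format mailboxes.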
word: a substring that is made up only of word characters -
letters, digits, and underscores - and is bounded on the
left by the beginning of a line or non-word character and on the right
by the end of a line or non-word character. For example, in the line
Subject: testing procmail
the substrings Subject, testing, and procmail are
each words and the substrings Subject:, mail, and test
are not words. The diagram below shows the relationship between words
and substrings, namely: Every word is a substring but not every substring
is a word.
+----------------+
| +---------+ |
| | words | |
| +---------+ |
| substrings |
+----------------+
Since you need to look at the boundaries of a string to determine whether
it is a word, the term word only makes sense when you are talking
about a substring in the context of a larger string. For more information,
see the discussion about matching the word “test”
in Understanding and Refining Your Procmail Recipes below.
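The word definition above can be checked mechanically. The sketch below is an egrep emulation of that definition, not a Procmail regular expression (Procmail's \< and \> behave differently from egrep's), and the function name has_word is made up for this example.

```shell
#!/bin/sh
# has_word LINE WORD: succeed if WORD occurs in LINE as a word, i.e. WORD is
# made up only of letters, digits, and underscores and is bounded on each
# side by a line edge or a non-word character.
has_word() {
    case $2 in
        ''|*[!A-Za-z0-9_]*) return 1 ;;   # WORD contains a non-word character
    esac
    printf '%s\n' "$1" |
        grep -Eq "(^|[^A-Za-z0-9_])$2(\$|[^A-Za-z0-9_])"
}

line='Subject: testing procmail'
for candidate in Subject testing procmail Subject: mail test; do
    if has_word "$line" "$candidate"; then
        echo "$candidate: word"
    else
        echo "$candidate: not a word"
    fi
done
```

For the sample line this reports Subject, testing, and procmail as words and Subject:, mail, and test as not words, which agrees with the example in the definition.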
cd
cd ~
cd $HOME
cd ${HOME}
Note: Do not use ~ in a procmailrc file; use $HOME
or ${HOME} instead.
With procmail you can organize your environment variables, recipes,
mailboxes, and directories however you like. Only the .procmailrc
file, which I discuss below in Step by Step Through Setting
Up and Testing Procmail, is required to have this name and to reside
in your home directory. This freedom is part of what gives procmail its
power but it can lead to chaos if you do not have a strategy for organizing
and naming your procmail-related environment variables, files, and directories.
| Everything should be as simple as possible but no simpler. |
-- Albert Einstein as quoted and discussed in
the |
The first strategy that I recommend and try to follow in this article
is simplicity. For example, I recommend that you use the Procmail defaults
whenever possible and do not set unnecessary environment variables in
your procmailrc file(s). For more about this, including problems caused
by not using the defaults, see the Warnings notes in Step 4 below.
A key to good programming is modularization. This, along with simplicity, will help to make your Procmail configuration portable and will make it easy to plug in, unplug, or reorder recipes. Here are examples of recipe modules you might want to use:
rc.testing
rc.subscriptions (aka rc.sbe or rc.bulk or rc.blue)
rc.log
rc.forward
rc.rmdupes
rc.munge-windows-1252
rc.munge.newsgroups, which I discuss on my main Pine page in this section
rc.autoreply
rc.vacation
rc.green used, along with a greenlist (AKA whitelist or accept list or allow list or goodguy list or trusted senders), to do Reverse Spam Filtering
rc.greenlister uses an incoming message to trigger an update to the greenlist; for details see Reverse Spam Filtering: Winning Without Fighting
rc.quarantine (aka rc.virus or rc.violet) see Snagging Viruses below
rc.spamassassin see Using SpamAssassin below
listname_id.rc, which I discuss in Generic SBE Sorting below
.inc (include) files, which are here & here, and the .rc and other files in his procmail directory
vsnag.genvars.rc, which is part of Dallman's Vsnag package and about which Dallman said “the genvars contents are meant for, and I'm fine with their use for, satisfying your general private procmail needs. Ditto public, if that means a server running procmail for the good of its users, including commercially, so long as I don't lose credit for what I've offered.”

In the Step-By-Step section below, I walk you through
creating the first two recipe modules and then, after testing, I show
you how to unplug rc.testing. I discuss rc.quarantine and
rc.spamassassin in the Advanced Recipes
section.
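As a rough sketch of what plugging and unplugging modules can look like, the shell snippet below writes a tiny procmailrc into a scratch directory. The directory name Procmail and the file name procmailrc-demo are assumptions made for this illustration; PMDIR, INCLUDERC, and the ## prefix for an unplugged command follow the conventions described on this page.

```shell
#!/bin/sh
# Sketch only: build a throwaway file in a scratch directory rather than
# touching a real ~/.procmailrc. All names below are illustrative.
scratch=$(mktemp -d)

# The quoted 'EOF' keeps $HOME and $PMDIR literal, so Procmail (not this
# shell) would expand them when it reads the file.
cat > "$scratch/procmailrc-demo" <<'EOF'
PMDIR=$HOME/Procmail

# Each INCLUDERC line plugs in one recipe module, in order.
# A leading ## marks a module that is currently unplugged.
## INCLUDERC=$PMDIR/rc.testing
INCLUDERC=$PMDIR/rc.subscriptions
EOF

# List the modules that are currently plugged in.
grep '^INCLUDERC=' "$scratch/procmailrc-demo"
```

Reordering modules is then a matter of moving INCLUDERC lines, and plugging rc.testing back in only requires removing its ## prefix.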
| The rc in procmailrc and in my recipe module names is a Unix naming convention that stands for either runtime configuration, runlevel change, or run commands (take your pick!). Thanks to Richard Smith, the maintainer of geekrave.org, for telling me he thinks of rc as runtime configuration; Elmar Hinz for telling me he thinks of it as runlevel change; and Imperial College for FOLDOC, the Free On-Line Dictionary Of Computing, for its definition and history of rc. |
Using consistent naming and formatting styles will help you keep your directories, files, and mailboxes organized. Here are the styles that I use:
| What | My Naming & Formatting Style |
| directory | initial upper-case letter |
| file | all lower case |
| procmail recipe module | begins with rc. |
| unplugged command in an rc file | begins with ## |
| comment (that is not an unplugged command) in an rc file | begins with # |
| environment variable | all upper case; also see the 2005-Jan-20 message Re: Variable names by Ruud H.G. van Tol |
| symbolic link to a file | begins with L. (for an example, see Tips for Managing Your Procmail Files below) |
| symbolic link to a directory | begins with LD. |
| mailbox (*) | does not contain forward slash (/), dot (.), backslash (\), hash (#), colon (:), asterisk (*), percent (%), quotes, apostrophes, parentheses, brackets, or braces; is not the string inbox or mbox or mail.txt or inbox.mtx or default or Mailbox or any combination of upper and lower case letters of these strings. Also, because of this story, you may want to avoid using core as a mailbox name. (* Also see the Notes About Mailbox Names below.) |
| incoming mailbox | begins with -- or IN- (I use one of these prefixes so that all my incoming mailboxes are grouped together when they are sorted by name.) |
| incoming “Solicited Bulk Email” (SBE) mailbox | begins with --s- or IN-S- (you could also think of the S as standing for subscription or shared) |
| archived SBE mailbox | begins with == or zz- or S- and ends with -YYYY-MM-DD (the date of the latest message in the box) |
| special mailbox such as SENT, DRAFTS, and TRASH | Some email clients and IMAP servers do not let users change the names of special mailboxes but if it's possible, I like to use all upper case for these so they are more noticeable. I also prefer to use the name SENT-FCC (FCC = “Folder Carbon Copy”) for my sent mailbox to distinguish it from SENT-BCC, which is where I deliver (via procmail) messages that I Bcc to myself. |
| personal special mailbox | begins with plus (+), dash (-), equals (=), or a number so that it is listed near the top when all mailboxes are sorted by name using an ASCII sort. For example, I use 2Reply for messages that I want to reply to and 2Web for messages that contain information that I'd like to incorporate into my web site. |
| “catch-all” mailbox | contains either the string catchall or dregs or fallthrough or dropthrough or wildcard or yellow or red (this is where messages that are not delivered to one of my virus, magenta, blue, green, or lime mailboxes are delivered; for more about catch-all mailboxes and my general email strategy, see Reverse Spam Filtering: Winning Without Fighting) |
| mailbox when I care about the mailbox format | ends with a meaningful extension, for example .unix for Unix mbox format or .cmbx for c-client MBX format. (Tip: Do not use the extension .mbx because different mail clients use it to mean different mailbox formats and thus it is ambiguous. Also some viruses go after files with the .mbx extension.) |
(*) Notes About Mailbox Names

Here are more details about my strategy for choosing mailbox names.
These notes are in addition to the comments in the starred
item (*) above.
Request: I'm revising my mailbox
naming style and I'd like to learn about names that other people use,
especially any mailbox-naming scheme that is in common use. I'm thinking
about changing the prefix of my incoming mailbox names from IN-
to something that will sort above all alphabetic characters in an ASCII
sort. I'm leaning towards prefixing these box names with a dash (-),
so, for example, incoming procmail mailing-list messages would be put
in --procmail. And then I'd put the archive of that mailbox
in zz-procmail-YYYY-MM-DD or ==procmail-YYYY-MM-DD
because the leading zz- or == will make it sort
below my incoming mailboxes. I realize that these are kind of odd-looking
names so I'm interested to hear what ideas other people have — please
let me know what you think and
if you have suggestions.
Note 1: Someone emailed me and mentioned that if
a mailbox name starts with a dash, it is a pain to manipulate from a Unix
shell. That is true, but it provides a bit of security if a lamer
breaks into your account and tries to read your mailboxes using shell
commands such as more or less or pine
-f mailboxName.
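For readers who do want to handle dash-prefixed mailboxes from a shell, here is a minimal sketch of the two usual workarounds. It runs in a scratch directory, and the mailbox name and message text are made up for the example.

```shell
#!/bin/sh
# Sketch only: work in a scratch directory with a made-up mailbox name.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create a mailbox file whose name begins with dashes.
printf 'From demo\n\nhello\n' > ./--procmail

# "cat --procmail" would be parsed as an option. Two common fixes:
first=$(cat ./--procmail)      # prefix the name with ./
second=$(cat -- --procmail)    # or end option parsing with --

[ "$first" = "$second" ] && echo "both forms read the mailbox"
```

The ./ prefix works with any command that takes a file name, while -- relies on the command following the usual option-parsing convention.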
Note 2: I'm now using this naming scheme and so far I haven't had problems with Procmail or with any mail client or IMAP server that I've used. For example, here are mailbox names that could be used with my reverse-spam-filtering system:
--magenta
--violet
--blue
--green
--limegreen
--yellow
--red
==archive1
==archive2
To force my most important incoming mailboxes to the top of an ASCII sort, I prefix each mailbox name with an appropriate number of dashes. For example:
-------green
------limegreen
-----yellow
----blue
---red
--violet
-magenta
==archive1
==archive2
I actually use a modified version of this and don't really have a mailbox name that begins with seven dashes! If you are interested in why I name my incoming mailboxes after the colors of the rainbow, see Reverse Spam Filtering: Winning Without Fighting.
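The effect of the dash prefixes on an ASCII sort can be checked with a short shell sketch. It assumes a sort command that honors LC_ALL=C for bytewise ordering; in other locales punctuation may be collated differently, so the result can vary.

```shell
#!/bin/sh
# Sort the example mailbox names bytewise (ASCII order). The names are
# deliberately given out of order here.
sorted=$(printf '%s\n' \
    ==archive2 -magenta ---red -----yellow -------green \
    ==archive1 --violet ----blue ------limegreen | LC_ALL=C sort)

printf '%s\n' "$sorted"
```

Because a dash (ASCII 45) sorts before both letters and the equals sign (ASCII 61), names with more leading dashes come first and the == archives land at the bottom, giving the same order as the list above.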
I recommend that you order Procmail flags using the style that Jari Aalto suggests on his Procmail Tips page in the section called The order of the flags. I use this style throughout this page, for example in the SpamAssassin recipes below.
A lot of people are learning Procmail because they want to automatically separate non-spam messages from spam messages. My strategy is to do the following, in this order:
To implement step number . . .
For more about my spam-deflexion strategy and about Unsolicited Bulk Email (UBE or spam), see my Reverse Spam Filtering: Winning Without Fighting page.
For other people's spam-deflexion strategies, see the December-2005 gmane.mail.procmail
discussion about Spam and Procmail, which takes place in these
two threads.
Especially useful is this
message from G.W.
Haywood, in which he describes the system that he uses on his mail
servers and says:
| “Most people haven't the faintest idea how much work is
involved in keeping the bulk of spam and other |
I am collecting Spam-related links at del.icio.us/Deflexion.com/Messaging/Spam.
The step-by-step instructions below guide you through setting up Procmail
to deliver to mailboxes that are in traditional
Unix mail spool format (mbox). To deliver to a mailbox that is in
maildir format, you need to append a forward
slash (/) to the mailbox name. The table below gives examples
of the various types of mailbox specifications that can be used in the
action line of a Procmail recipe.
| Traditional Unix mbox format | maildir format | maildir format on a Courier IMAP server | MH format (*) |
|---|---|---|---|
| mailboxName | mailboxName/ | .mailboxName/ | mailboxName/. |
An explanation of how Procmail delivers to various folder formats is in the latest (3.22) procmailrc man page. Here is the relevant section.
[ . . .] Anything else will be taken as a mailbox name (either a filename or a directory, absolute or relative to the current directory (see MAILDIR)). If it is a (possibly yet nonexistent) filename, the mail will be appended to it. If it is a directory, the mail will be delivered to a newly created, guaranteed to be unique file named $MSGPREFIX* in the specified directory. If the mailbox name ends in "/.", then this directory is presumed to be an MH folder; i.e., procmail will use the next number it finds available. If the mailbox name ends in "/", then this directory is presumed to be a maildir folder; i.e., procmail will deliver the message to a file in a subdirectory named "tmp" and rename it to be inside a subdirectory named "new". If the mailbox is specified to be an MH folder or maildir folder, procmail will create the necessary directories if they don't exist, rather than treat the mailbox as a non-existent filename. When procmail is delivering to directories, you can specify multiple directories to deliver to (procmail will do so utilising hardlinks).
When delivering to a maildir or MH format mailbox, you do not need to
use a lock file so the first line of a recipe does not need the second
colon. For example, instead of using this as the first line:
:0:
you can use this:
:0
There are examples of recipes that deliver to maildir-formatted mailboxes in Step 7D and in Using SpamAssassin below.
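To see why no lock file is needed, it can help to imitate the maildir delivery that the man page excerpt describes: each message is written to its own file under tmp and then renamed into new, so two deliveries never write to the same file. The sketch below is only an imitation in shell, not procmail's code; the function name deliver_maildir, the mailbox name IN-demo, and the time.pid.host file-name scheme are assumptions made for illustration.

```shell
#!/bin/sh
# deliver_maildir BOX: read one message on stdin and store it in BOX the
# way the man page excerpt describes (write to tmp, then rename into new).
deliver_maildir() {
    box=$1
    mkdir -p "$box/tmp" "$box/new" "$box/cur" || return 1
    # Assumed naming scheme; a real delivery agent does more to guarantee
    # uniqueness (two calls in the same second would collide here).
    name="$(date +%s).$$.$(uname -n)"
    cat > "$box/tmp/$name" || return 1       # write the whole message first
    mv "$box/tmp/$name" "$box/new/$name"     # then move it into place
}

scratch=$(mktemp -d)
printf 'Subject: demo\n\nhello\n' | deliver_maildir "$scratch/IN-demo"
ls "$scratch/IN-demo/new"
```

An mbox mailbox, by contrast, is a single file that every delivery appends to, which is why recipes that deliver to mbox use the second colon to request a lock file.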
If you want your system mailboxes $ORGMAIL and $DEFAULT
to be in maildir format, it is best to specify this in the procmail
source code and recompile procmail. This is discussed in the following
messages, which were posted to the Procmail mailing
list.
If you are delivering messages to maildir-formatted mailboxes that will
be served to mail clients by Courier IMAP (which supports only maildir
format mailboxes), put a dot (.) in front of the mailbox
name in your Procmail recipe. This leading dot is a hierarchy separator
and will ensure that IMAP clients can see these mailboxes. IMAP clients
will display the mailbox name without the leading dot (or the trailing
slash, which is just used to tell procmail the format of the mailbox and
is not part of the name that's used on the Unix system).
Technical Note: On Courier IMAP, private mailboxes
reside under the INBOX. hierarchy. On the server,
this hierarchy is usually in the directory $HOME/Maildir but users do
not need to tell their IMAP clients the actual name of this directory
because the IMAP protocol will pass that information to the IMAP client.
The Qmail MTA

Systems that use the qmail MTA usually use maildir-formatted mailboxes. To use Procmail with qmail, put a line like one of the following in the relevant dot-qmail file:

|preline /usr/local/bin/procmail ~/.procmailrc || exit 111
|preline procmail -m -p myprocmail.rc
|preline procmail

(It is not clear to me which of the above is best and it would be great if someone could fill me in so I can include more information about this here.)

To learn more about using Procmail with qmail, see:

Qmail Tip: Depending on the
“Firstly, the word MAILDIR as a variable name in procmail predates, and has absolutely nothing to do with, the mail storage format called "Maildir" which was invented quite some years later.
Secondly, . . .”
These 18 steps walk you through ensuring that procmail is installed on your system, setting it up to sort your incoming mailing-list messages, and testing your setup.
| Important: For the following steps to work,
you must be a regular user, not root.
If you are a system administrator who is learning Procmail, I highly recommend that you step through this procedure before you try to set up a global procmail rc file. You are much more likely to understand the basics of Procmail if you learn by doing (rather than by reading only). And your system & users will be much less likely to experience an email nightmare if you do your Procmail learning — and making mistakes — as a regular user! |
| Shell | Command |
| csh, tcsh | which procmail |
| sh, bash, ksh | type procmail |
| various | whereis procmail |
| various | where procmail |
| various | locate procmail |
Make a note of the path to procmail because this is needed in Step
9 below.
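If none of the commands in the table is available, POSIX shells also provide command -v, which can be wrapped in a small helper. This is a sketch; the function name find_program is made up for this example.

```shell
#!/bin/sh
# find_program NAME: print the path the shell would run for NAME, or
# return a non-zero status if NAME is not found on $PATH.
find_program() {
    command -v "$1"
}

if procmail_path=$(find_program procmail); then
    echo "procmail is at: $procmail_path"
else
    echo "procmail was not found on PATH; it may be installed elsewhere"
fi
```

Note that a shell only searches the directories listed in $PATH, so a negative result does not prove that procmail is missing from the system.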
procmail -v
Rename ~/.forward, ~/.procmailrc,
and the global procmailrc, if they exist. This will ensure that you
do a clean Procmail setup.

To see whether ~/.forward exists and what its contents
are if it does, type:
cat ~/.forward
If ~/.forward exists, rename it using a command similar
to this:
mv -i ~/.forward ~/.forward-2007-11-27
This will ensure that you use the optimal ~/.forward
file for your system, which is discussed in Steps
8 and 9 below.

To see whether ~/.procmailrc exists and what its
contents are if it does, type:
cat ~/.procmailrc
If ~/.procmailrc exists, rename it using a command
similar to this:
mv -i ~/.procmailrc ~/.procmailrc-2007-11-27
Steps 3 and 4 (in the next section) walk you through setting up a minimal
~/.procmailrc file. If you would like to
use some of the recipes that are in your old procmailrc file, I
recommend that you call them by using an INCLUDERC
command in your new minimal ~/.procmailrc. For an example
of an INCLUDERC command, see the end of Step 4 below
(in the #### Processing Section ####).

To see whether a global procmailrc exists and what its contents are if it does, type:
cat /etc/procmailrc
cat /usr/local/etc/procmailrc
If a global procmailrc exists and if you are the administrator of your system, rename it using a command similar to this:
sudo mv -i /etc/procmailrc /etc/procmailrc-2007-11-27
           ^^^^^^^^^^^^^^^ change to the path on your system
After you have experience setting up and using a personal procmailrc, you can, if you want, re-install this renamed global procmailrc. Details about using a global procmailrc are in See if There is a Global procmailrc below.

/usr/local/etc/procmailrc for
the global procmailrc file. Also note that you can view the VPS
default files and directory structure in the /skel
directory. (Verio is where I host the Infinite Ink web pages, including
this page.)
/etc/procmailrc.
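The rename commands above hard-code a date. A small helper can stamp today's date instead. The sketch below is a dry run in a scratch directory that stands in for your home directory; the function name backup_with_date and the placeholder file contents are made up for this example.

```shell
#!/bin/sh
# backup_with_date FILE: if FILE exists, rename it to FILE-YYYY-MM-DD using
# today's date. mv -i asks before overwriting an existing backup.
backup_with_date() {
    [ -e "$1" ] || return 0              # nothing to rename
    mv -i "$1" "$1-$(date +%Y-%m-%d)"
}

# Dry run in a scratch directory that stands in for $HOME.
fake_home=$(mktemp -d)
echo 'placeholder contents' > "$fake_home/.forward"
backup_with_date "$fake_home/.forward"
backup_with_date "$fake_home/.procmailrc"    # absent, so nothing happens
ls -A "$fake_home"
```

To use it for real, call the function on ~/.forward and ~/.procmailrc instead of the scratch files.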
~/.procmailrc by typing:
cd
pico -w .procmailrc

~/.procmailrc either by typing
it or by using copy &
paste*.
But first note these warnings.
# Note: Anything after a hash character (#) is a comment & is ignored by procmail.
#### Begin Variables Section ####
# It is essential that you set SHELL to a Bourne-type shell if
# external commands are run from your procmailrc, for example if
# you use rc.spamassassin, rc.quarantine, or other advanced recipes.
# Setting SHELL should not be needed for the simple sorting recipes in
# this step-by-step section, but to be safe and to future proof your
# procmailrc, set it anyway! Details are in Check Your $SHELL and $PATH.