mailbox – Access and manipulate email archives¶
Purpose: | Work with email messages in various local file formats. |
---|---|
Available In: | 1.4 and later |
The mailbox module defines a common API for accessing email messages stored in local disk formats, including:
- Maildir
- mbox
- MH
- Babyl
- MMDF
There are base classes for Mailbox and Message, and each mailbox format includes a corresponding pair of subclasses to implement the details for that format.
mbox¶
The mbox format is the simplest to illustrate in documentation, since it is entirely plain text. Each mailbox is stored as a single file, with all of the messages concatenated together. Each time a line starting with “From ” (From followed by a single space) is encountered it is treated as the beginning of a new message. Any time those characters appear at the beginning of a line in the message body, they are escaped by prefixing the line with “>”.
Creating an mbox mailbox¶
Instantiate the email.mbox class by passing the filename to the constructor. If the file does not exist, it is created when you add messages to it using add().
import mailbox
import email.utils
from_addr = email.utils.formataddr(('Author', 'author@example.com'))
to_addr = email.utils.formataddr(('Recipient', 'recipient@example.com'))
mbox = mailbox.mbox('example.mbox')
mbox.lock()
try:
msg = mailbox.mboxMessage()
msg.set_unixfrom('author Sat Feb 7 01:05:34 2009')
msg['From'] = from_addr
msg['To'] = to_addr
msg['Subject'] = 'Sample message 1'
msg.set_payload('This is the body.\nFrom (should be escaped).\nThere are 3 lines.\n')
mbox.add(msg)
mbox.flush()
msg = mailbox.mboxMessage()
msg.set_unixfrom('author')
msg['From'] = from_addr
msg['To'] = to_addr
msg['Subject'] = 'Sample message 2'
msg.set_payload('This is the second body.\n')
mbox.add(msg)
mbox.flush()
finally:
mbox.unlock()
print open('example.mbox', 'r').read()
The result of this script is a new mailbox file with 2 email messages.
$ python mailbox_mbox_create.py
From MAILER-DAEMON Thu Feb 21 11:35:54 2013
From: Author <author@example.com>
To: Recipient <recipient@example.com>
Subject: Sample message 1
This is the body.
>From (should be escaped).
There are 3 lines.
From MAILER-DAEMON Thu Feb 21 11:35:54 2013
From: Author <author@example.com>
To: Recipient <recipient@example.com>
Subject: Sample message 2
This is the second body.
Reading an mbox Mailbox¶
To read an existing mailbox, open it and treat the mbox object like a dictionary. They keys are arbitrary values defined by the mailbox instance and are not necessary meaningful other than as internal identifiers for message objects.
import mailbox
mbox = mailbox.mbox('example.mbox')
for message in mbox:
print message['subject']
You can iterate over the open mailbox but notice that, unlike with dictionaries, the default iterator for a mailbox works on the values instead of the keys.
$ python mailbox_mbox_read.py
Sample message 1
Sample message 2
Removing Messages from an mbox Mailbox¶
To remove an existing message from an mbox file, use its key with remove() or use del.
import mailbox
mbox = mailbox.mbox('example.mbox')
to_remove = []
for key, msg in mbox.iteritems():
if '2' in msg['subject']:
print 'Removing:', key
to_remove.append(key)
mbox.lock()
try:
for key in to_remove:
mbox.remove(key)
finally:
mbox.flush()
mbox.close()
print open('example.mbox', 'r').read()
Notice the use of lock() and unlock() to prevent issues from simultaneous access to the file, and flush() to force the changes to be written to disk.
$ python mailbox_mbox_remove.py
Removing: 1
From MAILER-DAEMON Thu Feb 21 11:35:54 2013
From: Author <author@example.com>
To: Recipient <recipient@example.com>
Subject: Sample message 1
This is the body.
>From (should be escaped).
There are 3 lines.
Maildir¶
The Maildir format was created to eliminate the problem of concurrent modification to an mbox file. Instead of using a single file, the mailbox is organized as directory where each message is contained in its own file. This also allows mailboxes to be nested, and so the API for a Maildir mailbox is extended with methods to work with sub-folders.
Creating a Maildir Mailbox¶
The only real difference between using a Maildir and mbox is that to instantiate the email.Maildir object we need to pass the directory containing the mailbox to the constructor. As before, if it does not exist, the mailbox is created when you add messages to it using add().
import mailbox
import email.utils
import os
from_addr = email.utils.formataddr(('Author', 'author@example.com'))
to_addr = email.utils.formataddr(('Recipient', 'recipient@example.com'))
mbox = mailbox.Maildir('Example')
mbox.lock()
try:
msg = mailbox.mboxMessage()
msg.set_unixfrom('author Sat Feb 7 01:05:34 2009')
msg['From'] = from_addr
msg['To'] = to_addr
msg['Subject'] = 'Sample message 1'
msg.set_payload('This is the body.\nFrom (will not be escaped).\nThere are 3 lines.\n')
mbox.add(msg)
mbox.flush()
msg = mailbox.mboxMessage()
msg.set_unixfrom('author Sat Feb 7 01:05:34 2009')
msg['From'] = from_addr
msg['To'] = to_addr
msg['Subject'] = 'Sample message 2'
msg.set_payload('This is the second body.\n')
mbox.add(msg)
mbox.flush()
finally:
mbox.unlock()
for dirname, subdirs, files in os.walk('Example'):
print dirname
print '\tDirectories:', subdirs
for name in files:
fullname = os.path.join(dirname, name)
print
print '***', fullname
print open(fullname).read()
print '*' * 20
Since we have added messages to the mailbox, they go to the “new” subdirectory. Once they are “read” a client could move them to the “cur” subdirectory.
Warning
Although it is safe to write to the same maildir from multiple processes, add() is not thread-safe, so make sure you use a semaphore or other locking device to prevent simultaneous modifications to the mailbox from multiple threads of the same process.
$ python mailbox_maildir_create.py
Example
Directories: ['cur', 'new', 'tmp']
Example/cur
Directories: []
Example/new
Directories: []
*** Example/new/1361446554.M933748P13757Q1.hubert.local
From: Author <author@example.com>
To: Recipient <recipient@example.com>
Subject: Sample message 1
This is the body.
From (will not be escaped).
There are 3 lines.
********************
*** Example/new/1361446554.M963206P13757Q2.hubert.local
From: Author <author@example.com>
To: Recipient <recipient@example.com>
Subject: Sample message 2
This is the second body.
********************
Example/tmp
Directories: []
Reading a Maildir Mailbox¶
Reading from an existing Maildir mailbox works just like with mbox.
import mailbox
mbox = mailbox.Maildir('Example')
for message in mbox:
print message['subject']
Notice that the messages are not guaranteed to be read in any particular order.
$ python mailbox_maildir_read.py
Sample message 2
Sample message 1
Removing Messages from a Maildir Mailbox¶
To remove an existing message from a Maildir mailbox, use its key with remove() or use del.
import mailbox
import os
mbox = mailbox.Maildir('Example')
to_remove = []
for key, msg in mbox.iteritems():
if '2' in msg['subject']:
print 'Removing:', key
to_remove.append(key)
mbox.lock()
try:
for key in to_remove:
mbox.remove(key)
finally:
mbox.flush()
mbox.close()
for dirname, subdirs, files in os.walk('Example'):
print dirname
print '\tDirectories:', subdirs
for name in files:
fullname = os.path.join(dirname, name)
print
print '***', fullname
print open(fullname).read()
print '*' * 20
$ python mailbox_maildir_remove.py
Removing: 1361446554.M963206P13757Q2.hubert.local
Example
Directories: ['cur', 'new', 'tmp']
Example/cur
Directories: []
Example/new
Directories: []
*** Example/new/1361446554.M933748P13757Q1.hubert.local
From: Author <author@example.com>
To: Recipient <recipient@example.com>
Subject: Sample message 1
This is the body.
From (will not be escaped).
There are 3 lines.
********************
Example/tmp
Directories: []
Maildir folders¶
Subdirectories or folders of a Maildir mailbox can be managed directly through the methods of the Maildir class. Callers can list, retrieve, create, and remove sub-folders for a given mailbox.
import mailbox
import os
def show_maildir(name):
os.system('find %s -print' % name)
mbox = mailbox.Maildir('Example')
print 'Before:', mbox.list_folders()
show_maildir('Example')
print
print '#' * 30
print
mbox.add_folder('subfolder')
print 'subfolder created:', mbox.list_folders()
show_maildir('Example')
subfolder = mbox.get_folder('subfolder')
print 'subfolder contents:', subfolder.list_folders()
print
print '#' * 30
print
subfolder.add_folder('second_level')
print 'second_level created:', subfolder.list_folders()
show_maildir('Example')
print
print '#' * 30
print
subfolder.remove_folder('second_level')
print 'second_level removed:', subfolder.list_folders()
show_maildir('Example')
The directory name for the folder is constructed by prefixing the folder name with ..
$ python mailbox_maildir_folders.py
Example
Example/cur
Example/new
Example/new/1361446554.M933748P13757Q1.hubert.local
Example/tmp
Example
Example/.subfolder
Example/.subfolder/cur
Example/.subfolder/maildirfolder
Example/.subfolder/new
Example/.subfolder/tmp
Example/cur
Example/new
Example/new/1361446554.M933748P13757Q1.hubert.local
Example/tmp
Example
Example/.subfolder
Example/.subfolder/.second_level
Example/.subfolder/.second_level/cur
Example/.subfolder/.second_level/maildirfolder
Example/.subfolder/.second_level/new
Example/.subfolder/.second_level/tmp
Example/.subfolder/cur
Example/.subfolder/maildirfolder
Example/.subfolder/new
Example/.subfolder/tmp
Example/cur
Example/new
Example/new/1361446554.M933748P13757Q1.hubert.local
Example/tmp
Example
Example/.subfolder
Example/.subfolder/cur
Example/.subfolder/maildirfolder
Example/.subfolder/new
Example/.subfolder/tmp
Example/cur
Example/new
Example/new/1361446554.M933748P13757Q1.hubert.local
Example/tmp
Before: []
##############################
subfolder created: ['subfolder']
subfolder contents: []
##############################
second_level created: ['second_level']
##############################
second_level removed: []
Other Formats¶
MH is another multi-file mailbox format used by some mail handlers. Babyl and MMDF are single-file formats with different message separators than mbox. None seem to be as popular as mbox or Maildir. The single-file formats support the same API as mbox, and MH includes the folder-related methods found in the Maildir class.
See also
- mailbox
- The standard library documentation for this module.
- mbox manpage from qmail
- http://www.qmail.org/man/man5/mbox.html
- maildir manpage from qmail
- http://www.qmail.org/man/man5/maildir.html
- The email module.
- mhlib
- The mhlib module.