Configuring a server to parse email via a PHP script
In this tutorial I’ll show how you can set up a server to parse email with a PHP script. This tutorial assumes that your server is already configured to receive email (I wrote this using a virtual machine running Postfix).
The first thing you’ll need to do is configure an alias that directs email to a PHP script (instead of a mailbox). I added the following entry to the bottom of my /etc/aliases file and then ran the “newaliases” command to rebuild the aliases database:
phpscript: "|php -q /usr/local/bin/email.php"
The above entry will pipe email sent to phpscript@MYDOMAIN to the designated PHP script.
And here’s the script:
#!/usr/bin/php
<?php
// fetch data from stdin
$data = file_get_contents("php://stdin");
// extract the body
// NOTE: a properly formatted email's first empty line defines the separation between the headers and the message body
// array_pad() keeps $body defined (as an empty string) when the message has no body
list($data, $body) = array_pad(explode("\n\n", $data, 2), 2, '');
// explode on new line
$data = explode("\n", $data);
// define a variable map of known headers
$patterns = array(
    'Return-Path',
    'X-Original-To',
    'Delivered-To',
    'Received',
    'To',
    'Message-Id',
    'Date',
    'From',
    'Subject',
);
// define a variable to hold parsed headers
$headers = array();
// track the last matched header so continuation lines can be appended to it
$last_match = null;
// loop through data
foreach ($data as $data_line) {
    // for each line, assume a match does not exist yet
    $pattern_match_exists = false;
    // check for lines that start with white space
    // NOTE: if a line starts with a white space, it signifies a continuation of the previous header
    if ((substr($data_line, 0, 1) == ' ' || substr($data_line, 0, 1) == "\t") && $last_match) {
        // append to last header
        $headers[$last_match][] = $data_line;
        continue;
    }
    // loop through patterns
    foreach ($patterns as $pattern) {
        // create preg regex (header names are case-insensitive, hence the i modifier)
        $preg_pattern = '/^' . preg_quote($pattern, '/') . ': (.*)$/i';
        // check if the line matches this header
        if (preg_match($preg_pattern, $data_line, $matches)) {
            $headers[$pattern][] = $matches[1];
            $pattern_match_exists = true;
            $last_match = $pattern;
            // a line can only be one header, so stop checking patterns
            break;
        }
    }
    // check if a pattern did not match for this line
    if (!$pattern_match_exists) {
        $headers['UNMATCHED'][] = $data_line;
        // continuation lines of an unknown header stay with it, not with the previous known header
        $last_match = 'UNMATCHED';
    }
}
?>
At this point in the code, the body of the message will be contained in the $body variable and the headers will be in $headers.
Here is an example of the parsed headers (using print_r()):
Array
(
    [UNMATCHED] => Array
        (
            [0] => From root@Eric-Centos.localdomain Sun Jan 10 21:49:50 2010
        )
    [Return-Path] => Array
        (
            [0] => <root@Eric-Centos.localdomain>
        )
    [X-Original-To] => Array
        (
            [0] => phpscript
        )
    [Delivered-To] => Array
        (
            [0] => phpscript@Eric-Centos.localdomain
        )
    [Received] => Array
        (
            [0] => by Eric-Centos.localdomain (Postfix, from userid 0)
            [1] => id 4D03F30131; Sun, 10 Jan 2010 21:49:50 -0500 (EST)
        )
    [To] => Array
        (
            [0] => phpscript@Eric-Centos.localdomain
        )
    [Subject] => Array
        (
            [0] => This is the subject
        )
    [Message-Id] => Array
        (
            [0] => <20100111024950.4D03F30131@Eric-Centos.localdomain>
        )
    [Date] => Array
        (
            [0] => Sun, 10 Jan 2010 21:49:50 -0500 (EST)
        )
    [From] => Array
        (
            [0] => root@Eric-Centos.localdomain (root)
        )
)
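As the Received entry in the output above shows, a header that wraps onto a continuation line ends up as several array elements. If you would rather work with one string per header, something like the following could collapse them. This is only a sketch that assumes the $headers layout produced by the script; the function name flatten_headers() is hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper (not part of the original script): collapse each
// header's list of lines into a single string.
function flatten_headers(array $headers)
{
    $flat = array();
    foreach ($headers as $name => $lines) {
        // trim the leading whitespace that marks a continuation line,
        // then join the pieces with a single space
        $flat[$name] = implode(' ', array_map('trim', $lines));
    }
    return $flat;
}

// Sample input shaped like the script's $headers array
$headers = array(
    'Received' => array(
        'by Eric-Centos.localdomain (Postfix, from userid 0)',
        "\tid 4D03F30131; Sun, 10 Jan 2010 21:49:50 -0500 (EST)",
    ),
    'Subject' => array('This is the subject'),
);

$flat = flatten_headers($headers);
echo $flat['Subject'], "\n";
```

Keep in mind that this also merges headers that legitimately appear more than once (a message that passed through several servers has several Received headers), so it is best suited to single-valued headers such as Subject or From.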
Now you have all the email headers and message body parsed. You can do whatever your heart desires with the data, like insert it into a database or even create nodes!
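For the database case, here is a rough sketch of what storing a message might look like with PDO. Everything specific in it is an assumption made for illustration: the save_email() function name, the "emails" table and its columns, and the use of an in-memory SQLite database (which requires the pdo_sqlite extension). Adapt the connection string and schema to whatever database you actually use.

```php
<?php
// Hypothetical helper: store one parsed message using PDO.
// The table name "emails" and its columns are assumptions for illustration.
function save_email(PDO $db, array $headers, $body)
{
    $db->exec(
        'CREATE TABLE IF NOT EXISTS emails (
            id INTEGER PRIMARY KEY,
            sender TEXT,
            subject TEXT,
            body TEXT
        )'
    );
    // a header may be missing, so fall back to an empty string
    $from = isset($headers['From'][0]) ? $headers['From'][0] : '';
    $subject = isset($headers['Subject'][0]) ? $headers['Subject'][0] : '';
    // a prepared statement keeps untrusted email content out of the SQL text
    $stmt = $db->prepare('INSERT INTO emails (sender, subject, body) VALUES (?, ?, ?)');
    $stmt->execute(array($from, $subject, $body));
    return (int) $db->lastInsertId();
}

// Usage with an in-memory SQLite database and sample data
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sample_headers = array(
    'From' => array('root@Eric-Centos.localdomain (root)'),
    'Subject' => array('This is the subject'),
);
$id = save_email($db, $sample_headers, "Hello from the test message\n");
```

Inside the email script itself you would pass the real $headers and $body variables instead of the sample data. The prepared statement matters here because every value comes straight from an incoming email and should be treated as untrusted input.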