PMMail File Parsing
We started using PMMail on OS/2 in the early to mid 1990s and then, after a couple of years with Pegasus Mail, went back to PMMail when it was released for Windows. Unfortunately, the authors chose to write the whole thing in an unsupportable way and then bailed on it. It went from the most configurable, advanced email client, one that just needed some tightening and slow evolution in 1998, to a buggy behemoth that fell behind and got lost in the ever-changing flow of tech.
The trouble is that Data is Life and I have over 100K messages sorted carefully into a complex hierarchy of folders that represent my email life from 1994 to early 2003, when I stopped using it (minus the nearly-suicidal virus disaster of 2000 and the annoying gap from 1997 when I lost a zip file of much of my back email). I switched to OS X as my primary client OS and I have also decided I want to keep my important email data in a platform-independent format.
I tried to find some conversion tools to get out of the PMMail data format and found nothing. The closest thing was using The Bat (another email client for Windows) to import from PMMail and then export into another format.
However, what I really want is an SQL database with my mail in it so I can do advanced queries and have them be instantaneous. Without an indexing scheme of some kind, searching 100K files takes a long time and PMMail is particularly slow and buggy in its search :(
So, a few months ago I spent some time figuring out how to convert the PMMail format and suck it into an SQL database with a simple web GUI. Unfortunately for other users, the simple web GUI is based on erowid3.0 pre-release code and I won't be making that available until after we publicly release the erowid3 code. But I will include the code for pulling it in.
I am doing most of my development in PHP these days, so don't bother poking fun. I will post the code for this in a couple of days. The main bit of New Info that I had to discover for myself was how to pull the folder name out of the FOLDER.INI file: the name is everything up to the first byte with value 222. The code for this is:
function GetNiceDirName($FolderFile)
{
    $Rval = "";
    // fopen() returns false (not NULL) on failure
    $fp = fopen($FolderFile, "r");
    if ($fp === false) PrintError("Unable to read $FolderFile");
    else
    {
        // The folder name is everything before the first byte with value 222
        while (false !== ($char = fgetc($fp)) && ord($char) != 222)
        { $Rval .= $char; }
        fclose($fp);
    }
    return $Rval;
}
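As a rough sketch of how that lookup might be used across a whole mail tree, the following walks a directory, skips anything matching an exclusion regexp, and builds a map from on-disk directory to display name. The names IsExcluded, ReadFolderName and MapFolderNames are made up for this illustration, and it assumes every folder directory holds a file named exactly FOLDER.INI (on a case-sensitive filesystem the name may need adjusting). ReadFolderName is a compact stand-in for the function above, not a replacement for it.

```php
<?php
// Hypothetical helper: true if $path matches any of the exclusion regexps.
function IsExcluded(string $path, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $path) === 1) {
            return true;
        }
    }
    return false;
}

// Return everything before the first byte 222 in the file, or null if unreadable.
function ReadFolderName(string $folderFile): ?string
{
    $data = @file_get_contents($folderFile);
    if ($data === false) {
        return null;
    }
    $end = strpos($data, chr(222));
    return $end === false ? $data : substr($data, 0, $end);
}

// Walk $root recursively and return [directory path => display name].
function MapFolderNames(string $root, array $excludePatterns = []): array
{
    $map = [];
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );
    foreach ($it as $entry) {
        if (!$entry->isDir()) {
            continue;
        }
        $dir = $entry->getPathname();
        if (IsExcluded($dir, $excludePatterns)) {
            continue;
        }
        $ini = $dir . DIRECTORY_SEPARATOR . 'FOLDER.INI';
        if (is_file($ini)) {
            $name = ReadFolderName($ini);
            if ($name !== null) {
                $map[$dir] = $name;
            }
        }
    }
    return $map;
}
```

Calling MapFolderNames('/path/to/mail', ['/TRASH/i']) would then give one entry per folder directory that is not excluded; both the path and the pattern here are placeholders.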
The rest is just a bunch of helper functions for excluding directories based on regexps, pulling out the header info, and sticking all of it in an SQL table.
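To give an idea of what the header and SQL helpers could look like, here is a minimal sketch. It is not the code from this project: ParseHeaders and StoreMessage are hypothetical names, the messages table and its columns are invented for the example, and it assumes each message file is plain RFC 822 style text with the headers ending at the first blank line.

```php
<?php
// Parse the header block of a raw message into [lowercased name => value].
function ParseHeaders(string $raw): array
{
    // Headers end at the first blank line; everything after is the body.
    $parts = preg_split('/\r?\n\r?\n/', $raw, 2);
    // Unfold continuation lines (a line break followed by a space or tab).
    $block = preg_replace('/\r?\n[ \t]+/', ' ', $parts[0]);
    $headers = [];
    foreach (preg_split('/\r?\n/', $block) as $line) {
        $pos = strpos($line, ':');
        if ($pos === false) {
            continue; // not a "Name: value" line
        }
        $name = strtolower(trim(substr($line, 0, $pos)));
        if (!isset($headers[$name])) {
            // Keep only the first occurrence of a repeated header.
            $headers[$name] = trim(substr($line, $pos + 1));
        }
    }
    return $headers;
}

// Insert one message row; assumes a "messages" table with these columns exists.
function StoreMessage(PDO $db, string $folder, string $file, array $h): void
{
    $stmt = $db->prepare(
        'INSERT INTO messages (folder, file, sender, recipient, subject, sent)
         VALUES (?, ?, ?, ?, ?, ?)'
    );
    $stmt->execute([
        $folder,
        $file,
        $h['from'] ?? null,
        $h['to'] ?? null,
        $h['subject'] ?? null,
        $h['date'] ?? null, // stored as the raw Date header string
    ]);
}
```

With a PDO connection to whatever database holds the table, each message file would be read with file_get_contents(), passed through ParseHeaders(), and handed to StoreMessage() together with the folder name pulled from FOLDER.INI. Using a prepared statement keeps odd characters in subjects and addresses from breaking the query.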
Posted by Earth at December 15, 2003 03:03 AM