How important is MSG metadata?

Information that puts data into context (metadata) can be essential, especially when the amount of data involved is huge. As I have written elsewhere, gathering it is often the first step in investigating a huge data set.

According to the judge in LLC v Kaplan, the “Plaintiff produced a hard drive containing 822,493 pages of email communications”. Reading between the lines, these seem to have been in PDF format. They didn’t have any of the important metadata and were not text searchable, and as such constituted an impermissible “document dump”.

It often seems that, in litigation or other such proceedings, parties go out of their way to make life more difficult for the other side. In one adjudication, we asked the other party to give us electronic copies of some 40 programmes which they had printed out from SureTrak, a planning package. Sure enough, they did: in PDF form. Not helpful.

However, what they didn’t know was that, when they hid a column in SureTrak, it was hidden from the visible PDF printout but was still included in the PDF’s background text. Having isolated that, I was able to extract the text from all 40 programmes and carry out a detailed analysis across them in less than 3 hours.

My question – would that hidden text be called “metadata”? In reality, it is actual data, but it was not included in the printed data. Anyway, back to the current court case.

Federal Rule of Civil Procedure (“FRCP”) 34 includes:

“(ii) If a request does not specify a form for producing electronically stored information [ESI], a party must produce it in a form or forms in which it is ordinarily maintained…”


The court ruled that the following metadata should be produced for these 822,493 pages:

  1. Custodian (i.e. who holds the data),
  2. Beginning and ending control number (i.e. page numbers),
  3. Sender,
  4. Recipients,
  5. Carbon Copy Recipients (i.e. ‘cc’s),
  6. Subject/Re line,
  7. Date and time sent/received,
  8. Control numbers for any “parent” message or message attachments, and
  9. Searchable, extracted message text.

It is interesting to see how closely the above list matches the English rules.

It would be interesting to see how the Plaintiff was able to do this in the 14 days allowed.

If the information was held in MSG files, then Filecats Professional could help. It would be able to extract items 3-7 and 9 above into a spreadsheet. It would also include details of the attachments in each email (item 8) and of the MSG files themselves (akin to item 2 above).
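To give a feel for what pulling items 3-7 and 9 into a spreadsheet row involves, here is a minimal sketch in Python. It is not how Filecats works internally. It also reads a plain RFC 822 (.eml style) message rather than an MSG file, because MSG is a binary Outlook format that Python's standard library does not read. The sample message, the field names and the `extract_fields` helper are all invented for illustration.

```python
from email import message_from_string, policy

# Hypothetical sample message, invented purely for illustration.
SAMPLE_EML = """\
From: alice@example.com
To: bob@example.com, carol@example.com
Cc: dave@example.com
Subject: Programme update
Date: Mon, 02 Mar 2015 09:30:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Revised programme to follow.
"""


def addresses(header):
    """Join the address of each mailbox in an address header, or '' if absent."""
    if header is None:
        return ""
    return "; ".join(a.addr_spec for a in header.addresses)


def extract_fields(raw, custodian):
    """Return one spreadsheet row (as a dict) for a single RFC 822 message."""
    msg = message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    sent = msg["Date"]
    return {
        "custodian": custodian,                    # item 1, supplied by the caller
        "sender": addresses(msg["From"]),          # item 3
        "recipients": addresses(msg["To"]),        # item 4
        "cc": addresses(msg["Cc"]),                # item 5
        "subject": str(msg["Subject"] or ""),      # item 6
        "sent": sent.datetime.isoformat() if sent is not None else "",  # item 7
        "attachments": "; ".join(                  # names only, towards item 8
            part.get_filename() or "" for part in msg.iter_attachments()
        ),
        "text": body.get_content().strip() if body is not None else "",  # item 9
    }


row = extract_fields(SAMPLE_EML, custodian="A. Example")
for field, value in row.items():
    print(f"{field}: {value}")
```

Control numbers (items 2 and 8) are assigned during production rather than read from the message, so they are left out of this sketch.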

I have previously said how I was able to create such a spreadsheet about emails in a court case when our client received 161,000 emails in MSG format. The advantage of the MSG format is that a hyperlink can be created which allows you to open the email from within Excel.
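The hyperlink idea can be sketched with Excel's `HYPERLINK(link_location, friendly_name)` worksheet function. The Python below writes a small CSV in which one column holds such a formula pointing at each MSG file. The file path, the column headings and the `hyperlink_formula` helper are hypothetical, and whether Excel evaluates a formula read from a CSV can depend on how the file is opened, so treat this as an illustration rather than a recipe.

```python
import csv
import io


def hyperlink_formula(path, label="Open"):
    """Build an Excel HYPERLINK formula; embedded double quotes are doubled."""
    def esc(text):
        return text.replace('"', '""')
    return '=HYPERLINK("{}","{}")'.format(esc(path), esc(label))


# Hypothetical catalogue rows; the path is invented for illustration.
rows = [
    {"control_no": "0001", "subject": "Programme update", "path": r"C:\mail\0001.msg"},
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Control no.", "Subject", "Open"])
for r in rows:
    writer.writerow([r["control_no"], r["subject"], hyperlink_formula(r["path"])])

csv_text = buffer.getvalue()
print(csv_text)
```

Clicking the cell in the last column would then open the email in whatever application is registered for MSG files.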

What about you? How do you handle the requirements of cataloging your ESI? Also, how do you catalog any data that you receive? I’d love to hear from you.

In the meantime, why not download a copy of Filecats Professional and catalog MSG files for yourself? Don’t have Excel? Then download Filecats Metadata instead, which doesn’t need it. There’s a free 7-day trial, so what have you got to lose?
