Monday, July 15, 2013

MS Excel and OLE Metadata: Last Opened Time

It is well known that Microsoft Office files store internal metadata that can be very revealing during forensic examinations (Author, Last Saved By, Creation Time, Last Saved time, etc.).  What may not be as well known are the timestamps maintained within the OLE data structures of the Office 97-2003 files and how these timestamps may be used in a forensic examination.  If a file was opened and closed without saving (thus not updating the internal Last Saved time or potentially any file system timestamp) and you do not have access to operating system artifacts to demonstrate file access, examination options are limited.  However, I've found that in some cases - specifically with Microsoft Excel - OLE timestamps may be used to determine the last time a file was opened, even if the file was closed before saving.

The Details

The OLE compound file format is often referred to as a "file system within a file".  As such, the file maintains and uses multiple data structures, one of which is the directory entry.  There are multiple directory entries within each Microsoft Office OLE file, but the one I'm going to discuss here is the root directory entry (labeled "Root Entry" within the file).  The root directory entry within an OLE file functions similarly to the root directory in a FAT file system.  Among other things, it contains creation and last modified timestamps stored in FILETIME format.  So far, I have not found the creation timestamp to reveal much about Microsoft Excel files, as it is typically zeroed out (although this is not always the case).  The last modified timestamp, on the other hand, can be particularly interesting.
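To make the directory entry layout concrete, here is a minimal Python sketch that decodes a few fields from a single 128-byte directory entry.  The field offsets follow the [MS-CFB] specification listed under Resources; the function names (filetime_to_datetime, parse_directory_entry) are my own illustrative choices and not part of any tool mentioned in this post.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME value to a UTC datetime (None if zeroed out)."""
    if filetime == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_directory_entry(entry):
    """Decode selected fields of one 128-byte compound file directory entry.

    Offsets per [MS-CFB]: name at 0 (64 bytes, UTF-16LE), name length at 64,
    object type at 66, creation time at 100, modified time at 108.
    """
    if len(entry) != 128:
        raise ValueError("a directory entry is exactly 128 bytes")
    name_len, object_type = struct.unpack_from("<HB", entry, 64)
    created, modified = struct.unpack_from("<QQ", entry, 100)
    # The stored name length includes the 2-byte terminating null.
    name_bytes = entry[:max(min(name_len, 64) - 2, 0)]
    return {
        "name": name_bytes.decode("utf-16-le"),
        "object_type": object_type,  # 5 denotes the root storage object
        "created": filetime_to_datetime(created),
        "modified": filetime_to_datetime(modified),
    }
```

A zeroed timestamp is returned as None rather than as 1601-01-01, so an unset creation time is easy to tell apart from a real value.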

When a spreadsheet is saved in the Microsoft Excel 97-2003 format, the last modified time of the Root Entry within the OLE file should either be zeroed out or updated to reflect the time that the file was saved (depending on the version of Excel used to save the file).  This may not be very helpful as the last save information is already available through other known metadata (e.g. the Summary Information stream).  However, the last modified timestamp of the Root Entry also appears to be updated when the Excel file is opened.  If the file is then closed without being saved, this modification time remains and reflects the last time the file was opened.  This means that, given nothing more than the file itself, an examiner may be able to determine the last time a Microsoft Excel file in 97-2003 format was opened without being saved.

Updates to the last modified time of the Root Entry directory entry remained consistent in my testing of Excel 2000, Excel 2007, and Excel 2010 (I did not have Excel 2003 or 2013 available to me at the time of testing).  Further, the timestamp was updated regardless of the version of Excel that created or opened the file.  When the "Protected View" warning bar appears (requiring the user to click "Enable Editing" to edit the spreadsheet), it appears that the update to the OLE Root Entry modification timestamp will depend on the volume from which the file was opened.  Opening a file that was downloaded from the Internet but stored on the local hard disk results in an update to the modification time (regardless of whether the "Enable Editing" button is clicked by the user).  Opening a file from a network resource will not update the modification timestamp unless the user clicks the "Enable Editing" button.  It should be noted though that my testing has been limited with regard to the Protected View functionality.

Finding the Timestamp

X-Ways Forensics is currently the only tool I've tested that parses the last modified timestamp from the OLE Root Entry of Microsoft Office documents (I'd be interested in hearing about others though).  For the sake of demonstration, manually finding this timestamp is straightforward.  The easiest way is to open the spreadsheet in a hex editor and search for the Unicode string "Root Entry".  The 64-bit FILETIME modification timestamp is located at offset 108 from the first byte of the Root Entry (Unicode "R"), occupying bytes 108 through 115 of the directory entry in little-endian order.  Although this method should work for finding this timestamp, I would encourage you to follow along with the binary specification of the OLE compound file format (see Resources below) so that you have an idea of what fields are present and how the overall OLE file format is structured.
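The manual search can also be scripted.  The following is a rough Python sketch of the hex editor approach: it looks for the first occurrence of the UTF-16LE string "Root Entry" and decodes the FILETIME at offset 108 from that position.  The function name find_root_entry_mtime is hypothetical, and taking the first hit is a simplifying assumption; a file containing embedded OLE objects may hold the string more than once, so verify the result against the file's structure.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
ROOT_ENTRY_NAME = "Root Entry".encode("utf-16-le")
MTIME_OFFSET = 108  # modified time offset within the 128-byte directory entry


def find_root_entry_mtime(data):
    """Return the Root Entry modified time as a UTC datetime.

    Returns None if the timestamp is zeroed out.  Raises ValueError if the
    "Root Entry" string is not found or the data is truncated.
    """
    pos = data.find(ROOT_ENTRY_NAME)  # assumption: the first hit is the root entry
    if pos < 0:
        raise ValueError('"Root Entry" not found; not an OLE compound file?')
    if pos + MTIME_OFFSET + 8 > len(data):
        raise ValueError("data ends before the modified timestamp")
    (filetime,) = struct.unpack_from("<Q", data, pos + MTIME_OFFSET)
    if filetime == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)
```

To try it on a file, read the file in binary mode and pass the bytes in, for example find_root_entry_mtime(open(path, "rb").read()), where path is whichever spreadsheet you are examining.  The returned time is in UTC.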

Root Entry with last modified time decoded

Forensic Implications

When an examiner is provided with a limited set of data (e.g. a flash drive or external hard drive), the options for analysis will likely be limited.  Without the common operating system artifacts that we are used to examining, determining activity with regard to a particular file or set of files can be difficult.  However, if an examiner is provided with a media device storing files in Excel 97-2003 format, he or she may be able to determine if and when each file was opened without being saved.  Comparing the Last Saved time in the Summary Information stream to the last modified time of the OLE Root Entry may be revealing.  If the last modified time of the OLE Root Entry is later than the Last Saved time, the file may have been opened and closed without saving after the last time that the file was saved.  This information may be very helpful when the mere fact that a file was opened after a particular date is significant.
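The comparison described above can be expressed as a small check.  This is only a sketch: the function name opened_after_last_save and the two-second tolerance are my own assumptions (the tolerance is meant to absorb small differences between the two timestamps written during a save and should be tuned against your own test data).  Both inputs are expected to be UTC datetimes already extracted from the file.

```python
from datetime import datetime, timedelta, timezone


def opened_after_last_save(last_saved, root_modified,
                           tolerance=timedelta(seconds=2)):
    """Compare the Summary Information Last Saved time to the Root Entry mtime.

    Returns True if the Root Entry modified time is later than the Last Saved
    time by more than the tolerance (the file MAY have been opened and closed
    without saving), False if not, and None if either timestamp is missing
    or zeroed out.
    """
    if last_saved is None or root_modified is None:
        return None
    return (root_modified - last_saved) > tolerance


# Example with made-up timestamps: saved on July 1, Root Entry updated July 15.
saved = datetime(2013, 7, 1, 14, 30, tzinfo=timezone.utc)
root = datetime(2013, 7, 15, 9, 0, tzinfo=timezone.utc)
print(opened_after_last_save(saved, root))
```

A True result is an indicator rather than proof; as noted earlier, behavior depends on the Excel version and on Protected View.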

While this post (and my testing) has focused on Microsoft Office 97-2003 Excel files, it's important to note that the OLE Root Entry last modified and creation timestamps are not limited to Microsoft Office files.  There are a number of other files that use the OLE compound file format, such as jump lists (*.automaticDestinations-ms), thumbs.db files, and sticky notes.  Further research into the behavior of the OLE timestamps with regard to other file types may reveal interesting and useful information for forensic examinations.

Resources
OLE Compound File Format
[MS-CFB]: Compound File Binary File Format
Forensics Wiki: OLE Compound File

10 comments:

  1. Hi,

    You should try the 'metacompound' module in DFF 1.3. It extracts all the OLE streams with their metadata (minifat / difat specific attributes, ...) as well as the metadata specific to the DOC and PPT streams (the "Last saved time", ...). It also automatically extracts pictures and text from DOC and PPT files.

  2. Solal,

    Thanks for the tip, I haven't tried DFF yet. Is the metacompound module part of the free edition?

  3. Jason,

    Great post! It really validates my thoughts that analysts need to know more about the data structures that they encounter, in order to get the most out of them.

    Even though for some, the OLE format MS Office documents may no longer be on their radar, this is valuable information in that the OLE format is in use in multiple file formats on more recent versions of Windows, as well as in more recent applications.

    Again, thanks for writing and posting this...

    Replies
    1. Harlan,

      Thanks, and good point about OLE being used in other formats as well. It will be interesting to see if/how this type of information is helpful in analyzing files from other (and possibly more current) applications.

  4. Yes Jason. You can download it on www.digital-forensic.org.
    (I'm sorry, but most of the distributions haven't updated to the 1.3 package yet, and the module is not in the 1.2 version.)
    To test it, click 'open evidence' and then the 'green cross' to add your .doc file (or other compound document). Then double-click on the document that appears in 'Logical Files', and a tree with a node for each stream will be created. Each node (stream) will have its own metadata (it appears in the right panel), and the document-specific metadata will be added to the root node too. Then you can use the search engine to compare the metadata of multiple documents (or create a Python script).

    Replies
    1. Solal,

      I just tried DFF 1.3.0 and was able to extract a good deal of OLE metadata (version, byte order, sector locations, etc.), but it doesn't appear to extract the last modified date from the Root Entry. I manually verified the date in the hex preview of DFF, but the timestamp doesn't appear to be parsed when applying the compound module to the file...

    2. Jason,

      If you are checking the node with the 'blue cross' (which appears once the module is applied, to indicate that some content was expanded), this is normal.
      On this node the only metadata that appear are:
      * the general metadata from the FAT/DIFAT and minifat tables (under the attribute: metacompound.Compound document)
      * the "metacompound.DocumentSummaryInformation (Root Entry)" and "metacompound.SummaryInformation (Root Entry)" attributes, which correspond to the information of the first embedded document (there could be other Summary Information and Document Summary Information entries if, for example, a Word document is embedded inside another Word document).

      To find the "metacompound.Creation time" and the "metacompound.Modified time", double-click on the node and select the "Root Entry" stream; the metadata should be set on this node. These two metadata values are also available for all other nodes (streams), but most of the time the value is set to 0 (1980-00-01 00:00:00).

      Can you tell me if you find the right result on the "Root Entry" node?
      If not, it would be good if you could share a Word document so I can patch the module.

      Thanks.

    3. Solal,

      Thanks for the follow-up. I just tested this and the modification time listed for the Root Entry node was correct.

  5. Jason.

    Have you seen any occasions where the OLE "Last Opened Time" is updated but the NTFS Last Modified is NOT updated? i.e. OLE date is later than NTFS date.

    Replies
    1. Yes. In my testing, the NTFS Last Modified ($STD_INFO) timestamp was not updated when a file was opened and closed without being saved. In that case, the OLE Root Entry Last Modified (i.e. Last Opened Time) would be later than the NTFS $STD_INFO Last Modified time.
