Monday, July 22, 2013

MS Excel and BIFF Metadata: Last Opened By

In my last post, I discussed using an OLE timestamp to determine the last time an Excel spreadsheet was opened and closed without being saved.  The last opened time can be very helpful, but wouldn't it be nice to know more about who may have opened the file?  The Last Saved By metadata field will help if the file was saved after it was opened, but it provides nothing new if the file was not saved.  However, the file's Workbook stream, which consists of a Binary Interchange File Format (BIFF) data structure, includes a field that records the user name associated with the account that last opened the Excel spreadsheet.  This data is recorded regardless of whether the file is saved and can provide information regarding the last user who opened the file.

The Details

Microsoft Excel spreadsheets saved in the OLE compound file format use the Binary Interchange File Format (BIFF) to store data in the Workbook stream of the spreadsheet.  I'm not going to cover the intricacies of BIFF here; for more information, refer to the Microsoft specification.  There is more than one version of BIFF as well; version 8 is the specific version addressed in this post.

According to this Microsoft KB article, "when you open an Excel workbook, Excel writes the name of the current user to the header of the file" (the article later states that this does not apply to .xlsx files).  The "header" of the file, as it's described, is actually the "Write Access User Name" record within the BIFF data structure that comprises the file's Workbook stream.  It's important to note that the user name is referenced from the UserInfo subkey in the user's NTUSER.DAT, which may not be the same as the user name of the Windows account.  Regardless of whether the file is saved, the user name is written to the Write Access User Name record.  As such, data stored in this record may be different from the Last Saved By metadata field located in the Summary Information stream.  

When the "Protected View" warning bar appears (requiring the user to click "Enable Editing" to edit the spreadsheet), it appears that updates to the Write Access User Name record will depend on the volume from which the file was opened.  Opening a file that was downloaded from the Internet but stored on the local hard disk results in an update to the user name in the record (regardless of whether the "Enable Editing" button is clicked by the user).  Opening a file from a network resource will not update the record unless the user clicks the "Enable Editing" button.  It should be noted though that my testing has been limited with regard to the Protected View functionality.

Interestingly, it appears that as different users open the same spreadsheet, the Write Access User Name record is simply overwritten as opposed to the previous user name being cleared first.  This means that you may find residual data following the end of the most recent user name.  The screenshot below depicts this scenario.  The most recent user name is "Jason", while "e 2010" is still stored in the record (the previous user name was "Office 2010").  This remained consistent in my testing of Excel 2000, 2007, and 2010 (I did not have Excel 2003 or 2013 available to me at the time of testing).

Write Access User Name record

Finding the Record

The Write Access User Name record should be stored near the beginning of the Workbook stream.  You can easily view this stream using a tool such as SSView, although the user name may not be parsed out automatically.  Once you've identified the Workbook stream, the user name should be visible in a hex editor.  The only tool I've currently tested that parses the user name is X-Ways Forensics, so it may be necessary to manually parse this record if you don't have a tool that will do it for you (or if you want to verify the results of your tool or just enjoy manually parsing data structures).  

An easy way to find the Write Access User Name record within the Workbook stream is to search for a run of 0x20 bytes.  According to the MS documentation, this record should be exactly 112 bytes in size and is padded with spaces (0x20) after the end of the user name.  Since most user names will likely be only a few characters long, most of the record is padding, which shows up as a long run of 0x20 bytes.  This method should work for identifying the Write Access User Name record, but I would recommend following along using the binary specification referenced earlier to develop a better understanding of the data structure.  If there is residual data after the current user name in the record, familiarity with the data structure will allow you to easily distinguish between current and previous data.
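For those who would rather script the manual parsing, here is a minimal Python sketch.  It assumes you have already extracted the Workbook stream to a bytes object (with SSView or a similar tool) and that the stream is not encrypted.  The record identifier (0x005C) and the string layout (a 2-byte character count, a 1-byte flag, then the characters) reflect my reading of the Microsoft specification; the function name and return shape are my own and purely illustrative.  Rather than searching for 0x20 bytes, it walks the BIFF records from the start of the stream until it reaches the Write Access User Name record.

```python
import struct

WRITEACCESS_ID = 0x005C  # assumed BIFF8 record type for Write Access User Name


def find_write_access(workbook_stream: bytes):
    """Return (user_name, residual_bytes) from the first Write Access
    User Name record in the stream, or None if no such record is found.

    residual_bytes is whatever follows the current name once the
    trailing 0x20 padding is stripped (e.g. leftovers of an older name).
    """
    pos = 0
    # Each BIFF record is: 2-byte type, 2-byte data size, then the data.
    while pos + 4 <= len(workbook_stream):
        rec_id, size = struct.unpack_from("<HH", workbook_stream, pos)
        data = workbook_stream[pos + 4 : pos + 4 + size]
        if rec_id == WRITEACCESS_ID and len(data) >= 3:
            # 2-byte character count, then a flag byte whose low bit
            # says whether the characters are stored as 2 bytes each.
            cch, flags = struct.unpack_from("<HB", data, 0)
            if flags & 0x01:
                end = 3 + cch * 2
                name = data[3:end].decode("utf-16-le", errors="replace")
            else:
                end = 3 + cch
                name = data[3:end].decode("latin-1")
            residual = data[end:].rstrip(b" ")
            return name, residual
        pos += 4 + size
    return None
```

The residual is returned as raw bytes on purpose: it may be a fragment of an older name in either encoding, so it is left to the examiner to interpret it against the specification.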

Forensic Implications

Parsing the data from the Write Access User Name record within an Excel spreadsheet saved in the OLE compound file format can provide an examiner with a metadata field that may be equated to the "Last Opened By" user.  This can be particularly helpful when only a limited set of data is provided for analysis, or whenever it matters who last opened a spreadsheet.  By combining this data with the OLE Root Entry last modified time, it is possible for an examiner to determine the last time an Excel spreadsheet was opened as well as the user name associated with the account that opened the spreadsheet, even if the file was not saved and nothing other than the file itself is available for analysis.

Resources
Microsoft Excel (xls) Binary File Format Specification 

Monday, July 15, 2013

MS Excel and OLE Metadata: Last Opened Time

It is well known that Microsoft Office files store internal metadata that can be very revealing during forensic examinations (Author, Last Saved By, Creation Time, Last Saved time, etc.).  What may not be as well known are the timestamps maintained within the OLE data structures of the Office 97-2003 files and how these timestamps may be used in a forensic examination.  If a file was opened and closed without saving (thus not updating the internal Last Saved time or potentially any file system timestamp) and you do not have access to operating system artifacts to demonstrate file access, examination options are limited.  However, I've found that in some cases - specifically with Microsoft Excel - OLE timestamps may be used to determine the last time a file was opened, even if the file was closed before saving.

The Details

The OLE compound file format is often referred to as a "file system within a file".  As such, there are multiple data structures maintained and used by the file, one of which is the directory entry.  There are multiple directory entries within each Microsoft Office OLE file, but the one I'm going to discuss here is the root directory entry (labeled "Root Entry" within the file).  The root directory entry within an OLE file functions similarly to the root directory in a FAT file system.  Among other things, it contains creation and last modified timestamps stored in FILETIME format.  So far, I have not found the creation timestamp to reveal much about Microsoft Excel files, as it is typically zeroed out (although this is not always the case).  The last modified timestamp, on the other hand, can be particularly interesting.

When a spreadsheet is saved in the Microsoft Excel 97-2003 format, the last modified time of the Root Entry within the OLE file should either be zeroed out or updated to reflect the time that the file was saved (depending on the version of Excel used to save the file).  This may not be very helpful as the last save information is already available through other known metadata (i.e. the Summary Information stream). However, the last modified timestamp of the Root Entry appears to be updated when the Excel file is opened.  If the file is then closed without being saved, this modification time remains and reflects the last time the file was opened.  This means that it may be possible to detect the last time a Microsoft Excel file in 97-2003 format was opened if the file was not saved and an examiner is provided with nothing more than the file itself.

Updates to the last modified time of the Root Entry directory entry remained consistent in my testing of Excel 2000, Excel 2007, and Excel 2010 (I did not have Excel 2003 or 2013 available to me at the time of testing).  Further, the timestamp was updated regardless of the version of Excel that created or opened the file.  When the "Protected View" warning bar appears (requiring the user to click "Enable Editing" to edit the spreadsheet), it appears that the update to the OLE Root Entry modification timestamp will depend on the volume from which the file was opened.  Opening a file that was downloaded from the Internet but stored on the local hard disk results in an update to the modification time (regardless of whether the "Enable Editing" button is clicked by the user).  Opening a file from a network resource will not update the modification timestamp unless the user clicks the "Enable Editing" button.  It should be noted though that my testing has been limited with regard to the Protected View functionality.

Finding the Timestamp

X-Ways Forensics is currently the only tool I've tested that parses the last modified timestamp from the OLE Root Entry of Microsoft Office documents (I'd be interested in hearing about others though).  For the sake of demonstration, manually finding this timestamp is straightforward.  The easiest way is to simply search for the Unicode string "Root Entry" when the spreadsheet is opened in a hex editor.  Starting from the first byte of the Root Entry (Unicode "R"), simply skip ahead 108 bytes to find the 64-bit FILETIME modification timestamp.  Although this method should work for finding this timestamp, I would encourage you to follow along with the binary specification of the OLE compound file format (see Resources below) so that you have an idea of what fields are present and how the overall OLE file format is structured.    
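The manual steps above can be sketched in a few lines of Python.  This is only an illustration of the search-and-skip approach, not a proper OLE parser: it assumes the first occurrence of the Unicode (UTF-16LE) string "Root Entry" in the file is the name field of the root directory entry, and the function names are my own.  A FILETIME is a 64-bit little-endian count of 100-nanosecond intervals since January 1, 1601 (UTC).

```python
import struct
from datetime import datetime, timedelta, timezone

ROOT_NAME = "Root Entry".encode("utf-16-le")
MODIFIED_OFFSET = 108  # bytes from the "R" of the name to the timestamp


def filetime_to_datetime(filetime: int):
    """Convert a FILETIME value to a UTC datetime (None if zeroed out)."""
    if filetime == 0:
        return None
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    # FILETIME counts 100 ns ticks; datetime resolves to microseconds.
    return epoch + timedelta(microseconds=filetime // 10)


def root_entry_modified(file_bytes: bytes):
    """Return the Root Entry last modified time as a UTC datetime.

    Returns None if "Root Entry" is not found, the data is truncated,
    or the timestamp is zeroed out.
    """
    pos = file_bytes.find(ROOT_NAME)
    if pos == -1 or pos + MODIFIED_OFFSET + 8 > len(file_bytes):
        return None
    (filetime,) = struct.unpack_from("<Q", file_bytes, pos + MODIFIED_OFFSET)
    return filetime_to_datetime(filetime)
```

To use it, read the spreadsheet in binary mode and pass the contents to root_entry_modified().  The result is in UTC, so convert to the relevant time zone before comparing it with other evidence.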

Root Entry with last modified time decoded

Forensic Implications

When an examiner is provided with a limited set of data (e.g. a flash drive or external hard drive), the options for analysis will likely be limited.  Without the common operating system artifacts that we are used to examining, determining activity with regard to a particular file or set of files can be difficult.  However, if an examiner is provided with a media device storing files in Excel 97-2003 format, he or she may be able to determine if and when each file was opened without being saved.  Comparing the Last Saved time in the Summary Information stream to the last modified time of the OLE Root Entry may be revealing.  If the last modified time of the OLE Root Entry is later than the Last Saved time, the file may have been opened and closed without saving after the last time that the file was saved.  This information may be very helpful when the mere fact that a file was opened after a particular date is significant.
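The comparison described above is simple enough to express as a small Python helper.  This is a sketch of the reasoning only: both timestamps are assumed to have been extracted already and converted to timezone-aware datetimes, and the function name and the optional tolerance parameter are hypothetical additions of mine (the tolerance defaults to zero, which matches the plain "later than" test).

```python
from datetime import datetime, timedelta, timezone


def opened_after_last_save(root_modified, last_saved,
                           tolerance=timedelta(0)):
    """Return True if the OLE Root Entry last modified time is later
    than the Last Saved time by more than `tolerance`.

    A True result suggests the file may have been opened and closed
    without saving after its last save. Missing timestamps (None, for
    example a zeroed Root Entry time) give False, since no conclusion
    can be drawn from them.
    """
    if root_modified is None or last_saved is None:
        return False
    return root_modified - last_saved > tolerance
```

A False result does not show that the file was never opened; it only means this particular comparison reveals nothing.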

While this post (and my testing) has focused on Microsoft Office 97-2003 Excel files, it's important to note that the OLE Root Entry last modified and creation timestamps are not limited to Microsoft Office files.  There are a number of other files that use the OLE compound file format, such as jump lists (*.automaticDestinations-ms), thumbs.db files, and sticky notes.  Further research into the behavior of the OLE timestamps with regard to other file types may reveal interesting and useful information for forensic examinations.

Resources
OLE Compound File Format
[MS-CFB]: Compound File Binary File Format
Forensics Wiki: OLE Compound File