E-discovery is not a simple document swap like its paper counterpart. The fragile nature of digital evidence requires more careful consideration of a document's authenticity as well as its relevance to the issues being considered.
In addition, a digital document as well as its storage media typically contains far more data about that document than is presented to the user when simply viewing it. Thus, there is much more available to be considered.
The additional data contained in digital documents and their storage media is called metadata. There are essentially two types of metadata: application metadata and system metadata.
Application metadata is found within the document itself. It is created by the application used to create and manage the document and can include information such as the author, organization, creation date, and last modified date. There are many ways that application metadata can be used to enhance the usefulness of digital evidence. (See Metadata Molds E-discovery Practices.)
The other kind of metadata is known as system metadata. System metadata is created by the computer system on which documents are managed. It provides context for the documents found on a particular storage medium. For example, system metadata may show when and how often a document was used. In addition, its existence can help to authenticate a document and its usage, or highlight data hiding and spoliation schemes. (See Computer Forensics and the Judicial Arms Race.)
System metadata is found in files that are separate and distinct from the documents themselves. As a result, system metadata may persist long after the document itself has disappeared, and even when the document resided on different media. It also means that to fully understand a document, users should request system metadata along with the document itself.
There are numerous types of system metadata, and a full discussion of their sources is beyond the scope of this article. To fully exploit the benefits of system metadata, one should consult a forensic expert. For illustrative purposes, the benefits of file system information, event logs, file pointers, and deleted file data are discussed in the following sections.
If computer documents are like books in a library, then the file system is like the card catalog. Like a catalog, the file system records various attributes of the files on a computer, such as names, locations, types, and date stamps.
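The card catalog analogy can be made concrete with a minimal Python sketch that asks the file system for the attributes it keeps about each file, without opening the files themselves. The function names `catalog_entry` and `build_catalog` and the dictionary keys are illustrative assumptions, not part of any forensic tool. Note also that querying a live system can itself alter some date stamps, so examiners normally work from a forensic copy rather than the original media.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def catalog_entry(path):
    """Return the 'card catalog' attributes the file system keeps for one file.

    Hypothetical helper for illustration; the keys are not a standard schema.
    """
    p = Path(path)
    st = p.stat()  # queries the file system; the file's content is not read

    def to_utc(ts):
        return datetime.fromtimestamp(ts, tz=timezone.utc)

    return {
        "name": p.name,
        "location": str(p.parent.resolve()),
        "type": p.suffix.lower(),
        "size_bytes": st.st_size,
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        # Meaning varies by platform: metadata-change time on Unix,
        # historically the creation time on Windows.
        "changed_or_created": to_utc(st.st_ctime),
    }


def build_catalog(root):
    """Walk a directory tree and return one catalog entry per file."""
    return [
        catalog_entry(os.path.join(folder, name))
        for folder, _dirs, files in os.walk(root)
        for name in files
    ]
```

Calling `build_catalog` on a folder returns a list that can be sorted or filtered by any attribute, for example to list files by last modified date.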
Examining file system information can reveal many things about the data on a storage medium.
Event logs are lists, created by the operating system, recording that certain actions occurred. In addition to the operating system's logs, various programs may create their own logs that memorialize which functions those programs ran and the results of those operations.
In addition to the activities themselves, these logs often capture the date and time when the activities occurred. When the date stamps are read in the order in which they were entered in the event log, analysts can spot out-of-sequence date stamps and instances when the system clock has been changed.
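The out-of-sequence check described above can be sketched in a few lines of Python. This is a simplified illustration, assuming the log has already been exported as (record number, timestamp) pairs in the order the entries were written; the function name `find_out_of_sequence` and the sample records are made up for the example and do not come from a real log.

```python
from datetime import datetime


def find_out_of_sequence(entries):
    """Flag entries whose timestamp is earlier than the entry written before them.

    `entries` is a list of (record_number, timestamp) pairs in the order
    the log recorded them (hypothetical export format).
    """
    suspects = []
    for (_prev_no, prev_ts), (rec_no, ts) in zip(entries, entries[1:]):
        if ts < prev_ts:
            suspects.append((rec_no, prev_ts, ts))
    return suspects


# Made-up sample data: the clock appears to be set back before record 103.
log = [
    (101, datetime(2010, 3, 1, 9, 0)),
    (102, datetime(2010, 3, 1, 9, 5)),
    (103, datetime(2010, 2, 20, 14, 30)),
    (104, datetime(2010, 2, 20, 14, 45)),
    (105, datetime(2010, 3, 1, 9, 20)),
]

for rec_no, prev_ts, ts in find_out_of_sequence(log):
    print(f"record {rec_no}: {ts} is earlier than the preceding entry {prev_ts}")
```

Only a backward jump is flagged here. A forward jump, such as the one between records 104 and 105, looks like ordinary elapsed time, so an analyst would still review the entries surrounding each flagged record.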
Reviewing all of these logs can reveal a lot about how a computer has been used, whether for good or for bad. Event logs are therefore important system metadata: they show not only how the computer was used but also whether there are instances when date stamps should be considered suspect. As a result, event log metadata can betray spoliation schemes as well as other weaknesses in date stamp reliability.
File pointers come in a variety of forms. The two most common types, at least on a Windows system, are link files and browser history. Link files are small files with the LNK extension. Browser history consists of records stored in the internet history cache.
Since file pointers are created whenever documents are opened and viewed, the examination of file pointers can help to differentiate when files were opened and viewed by a user versus simply “touched”.
While link files and browser history capture some similar information, such as date stamps, file name, and full file path, they also capture some different information. Link files, for example, can also capture the volume name of the media on which the file was stored. This can be very useful in confirming the recovery of a particular device when searching for a particular document.
Also, link files and browser history have different time periods over which they can be helpful. While each is typically retained for about 25 days, deleted browser history records can be recovered from freespace for a much longer time--perhaps 8 to 12 months or longer. Link files, on the other hand, are archived in protected areas of the media and can generally be recovered for 3 to 6 months.
File pointers are often overlooked when files are deleted. Thus, searching for unmatched file pointers, that is, pointers whose target files can no longer be found, is an effective means of proving spoliation. Also, since they capture full file path information, they can reveal whether the files being accessed were local or stored on some other device that needs to be located.
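As a rough illustration of the matching step, the Python sketch below compares the target paths recovered from file pointers against a listing of files known to exist, and separately checks whether each target sits on a local drive. It assumes the targets and the listing have already been extracted as plain Windows path strings. The function names, the sample paths, and the assumption that `C:` is the only local drive are all hypothetical.

```python
from pathlib import PureWindowsPath


def _normalize(path):
    # Windows paths are case-insensitive and may use either slash style.
    return str(PureWindowsPath(path)).lower()


def unmatched_pointers(pointer_targets, file_listing):
    """Return pointer targets with no matching entry in the file listing."""
    known = {_normalize(p) for p in file_listing}
    return [t for t in pointer_targets if _normalize(t) not in known]


def is_local(target, local_drives=("C:",)):
    """True if the target's drive is one of the drives assumed to be local."""
    return PureWindowsPath(target).drive.upper() in local_drives


# Made-up sample data for illustration only.
pointers = [
    r"C:\Users\jdoe\Documents\budget.xls",
    r"E:\Backup\contracts.doc",
    r"C:\Users\jdoe\Documents\memo.doc",
]
listing = [
    r"c:\users\jdoe\documents\budget.xls",
    r"C:\Users\jdoe\Documents\notes.txt",
]

for target in unmatched_pointers(pointers, listing):
    where = "local drive" if is_local(target) else "another device"
    print(f"{target}: no matching file found ({where})")
```

In this sample, the pointer to `memo.doc` suggests a local file that has since disappeared, while the pointer to `E:\Backup\contracts.doc` suggests a separate device that may still need to be located.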
While file system information will often include references to deleted files and provide considerable information about those files, on many kinds of computer systems file deletion is a two-step process. The first step simply moves the presentation of a file from the population of other active files to a special grouping of recycled or deleted files.
When files pass through the Windows recycle bin, a few pieces of information are captured about them. In Windows systems prior to Vista, the information is captured in a log file named INFO2. In Windows Vista and later, the information is captured in a file pointer. Whether captured in the INFO2 file or in a file pointer, the data elements include the actual deletion date and the original location where the file was stored. These artifacts are the only place where the actual deletion date and time is captured, even though it may also be reflected in other system metadata artifacts such as the file system.
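To show how such a record yields a deletion date and an original location, here is a hedged Python sketch of a parser for the Vista-and-later recycle bin record. It assumes the layout commonly described in forensic references for these records (often called `$I` files): an 8-byte version, an 8-byte original file size, an 8-byte Windows FILETIME deletion stamp, and then the original path in UTF-16, either in a fixed 520-byte field (version 1) or preceded by a 4-byte character count (version 2). That layout is an assumption to verify against the system being examined, and the function names are illustrative.

```python
import struct
from datetime import datetime, timedelta, timezone

# Windows FILETIME counts 100-nanosecond ticks from this date (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a Windows FILETIME value to a timezone-aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_recycle_record(data):
    """Parse one recycle bin record, under the layout assumed in the lead-in."""
    version, size, deleted = struct.unpack_from("<QQQ", data, 0)
    if version == 1:
        raw_path = data[24:24 + 520]  # fixed-width path field
    elif version == 2:
        (char_count,) = struct.unpack_from("<I", data, 24)
        raw_path = data[28:28 + char_count * 2]  # 2 bytes per UTF-16 unit
    else:
        raise ValueError(f"unrecognized record version: {version}")
    # The path is null-terminated; keep only the text before the terminator.
    path = raw_path.decode("utf-16-le").split("\x00", 1)[0]
    return {
        "original_size": size,
        "deleted_utc": filetime_to_datetime(deleted),
        "original_path": path,
    }
```

Because these records are ordinary files, the same parser could be applied to copies recovered from deleted space as well as to records still present in the recycle bin.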
The recycle bin's INFO2 file and file pointers are themselves files, and they are deleted when the recycle bin is emptied and the computer is shut down. Because they are files, however, they can be recovered like any other deleted file. So, any examination of these recycle bin artifacts should include not only those that are still active but also a search to recover earlier versions that were previously deleted.
Thus, examination of the recycle bin INFO2 file and/or deleted file pointers is another good source for spoliation detection.
In the end, system metadata has considerable evidential value. It can validate the accuracy of other date and time stamp metadata, identify omissions in preservation or production, and betray data hiding or other spoliation schemes.
While application metadata can be obtained simply by requesting documents in native form, system metadata is contained in files that are separate from the documents themselves. Consequently, in order to receive system metadata, discovery requests must include separate items designed to cover this kind of data such as file system file lists, file pointers, and event logs.
Since it is very unlikely that system metadata could contain privileged information, privilege should not be a concern or a basis for withholding it. And because system metadata can be so important, preservation methods that focus only on the documents without collecting the related system metadata could prove fatal.