Gregory L Fordham
There is a paradigm shift occurring in American jurisprudence. The shift is caused by the computer and the digital evidence that it spawns.
For some, a case involving digital evidence is death by a thousand nicks. While it is only pennies a page, there are millions and millions of pages.
Others see things differently. For them an examination of digital evidence is mutually assured destruction.
In either case, it is like repelling a human wave attack with muzzle loading rifles. No matter how inexpensive the ball and powder might be, the end result is costly.
Clearly, a new tactic is needed—a paradigm shift.
Farmers have learned to use good insects to fight bad insects. The solution to digital evidence is somewhat similar. Lawyers must use the computer to fight the computer.
Fighting with the computer does not mean automating document management. That approach is like having a machine gun but loading it manually one bullet at a time.
Naturally, at some point the evidence will need to be converted to paper so that it can be used as a deposition or trial exhibit. The reality, however, is that there are likely to be only a few digital files that will ever get elevated to exhibit status.
So, avoid the cost of conversion and the associated evidence degradation and stick with the native format until the few documents that will be used as exhibits are identified. In the meantime, use native file viewers, native software, or other computerized means of examination.
Since examining digital evidence is like being buried alive in the treasure room of the pharaoh, the next step is to start filtering the collected data into smaller and smaller populations.
The filtering begins by removing duplicates from the population and then continues with various techniques to separate the relevant from the irrelevant. Once the relevant is determined the population is then even further culled to find the key documents.
The first step in filtering the digital data is identifying the duplicates. This can be accomplished with digital fingerprints. Several different algorithms can be used to calculate a document’s digital fingerprint or signature.
Documents with identical fingerprints are duplicates of one another. Those with different values are unique.
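The deduplication step can be sketched in a few lines. This is a minimal illustration, not a production review tool: it hashes each file's contents and keeps only the first file seen for each fingerprint. The function names are hypothetical.

```python
import hashlib
from pathlib import Path

def fingerprint(path: Path, algorithm: str = "md5") -> str:
    """Compute a file's digital fingerprint by hashing its contents."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so large evidence files do not exhaust memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def deduplicate(paths):
    """Keep the first file seen for each unique fingerprint; duplicates drop out."""
    unique = {}
    for p in paths:
        unique.setdefault(fingerprint(p), p)
    return list(unique.values())
```

In practice the fingerprints would be computed once and stored, since the same values are reused in the later filtering steps.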
After narrowing the population to the unique documents, those same fingerprints can be used to further limit the population under consideration.
The population of digital data that was collected likely includes numerous files that are not relevant to the dispute. Likely examples are operating system files and installed application software.
Large databases exist that contain the digital fingerprints of known files like operating system files and application software. These databases can be compared to the fingerprint values of the collected files and irrelevant files excluded.
Likewise, if particular items form the basis of the dispute then only those items matching the digital fingerprints of specific files can be included and all others can be excluded.
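Both directions of this known-file comparison, excluding stock system files and including only the specific files in dispute, reduce to the same set-membership test. A hedged sketch, assuming the known fingerprints have already been loaded into a set (for example, from a reference database such as the NIST NSRL):

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """MD5 fingerprint of a file's contents (fine for small files;
    stream larger ones in chunks)."""
    return hashlib.md5(path.read_bytes()).hexdigest()

def filter_against_known(paths, known_hashes, mode="exclude"):
    """Compare collected files against a set of known fingerprints.

    mode="exclude": drop matches (e.g. stock OS and application files).
    mode="include": keep only matches (e.g. the specific files in dispute).
    """
    keep_on_match = (mode == "include")
    return [p for p in paths if (file_hash(p) in known_hashes) == keep_on_match]
```

The same collected population can be run through both modes: once to strip the known-irrelevant, and again, against a different hash set, to isolate the known-relevant.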
The second step in filtering the documents is performing signature analysis to confirm that documents actually are the file types they claim to be.
It is a means of finding hidden data as well as confirming that irrelevant document types can be excluded from consideration.
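Signature analysis works by comparing a file's extension against the "magic bytes" at the start of its contents. The sketch below uses a deliberately tiny signature table; real forensic tools carry hundreds of signatures, and container formats (a .docx is internally a zip archive, for instance) need extra handling that is omitted here.

```python
from pathlib import Path

# Illustrative subset of magic-byte signatures; real tools carry hundreds.
SIGNATURES = {
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip",              # also the container for docx/xlsx
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpg",
}

def detect_type(path: Path) -> str:
    """Identify a file by its leading bytes rather than its name."""
    header = path.open("rb").read(16)
    for magic, ftype in SIGNATURES.items():
        if header.startswith(magic):
            return ftype
    return "unknown"

def signature_mismatches(paths):
    """Flag files whose extension disagrees with their magic-byte type,
    e.g. a spreadsheet renamed to hide it as a harmless text file."""
    flagged = []
    for p in paths:
        detected = detect_type(p)
        if detected != "unknown" and detected != p.suffix.lstrip(".").lower():
            flagged.append((p, detected))
    return flagged
```

A mismatch is worth a closer look: either someone renamed the file to hide it, or the file can safely be reclassified and possibly excluded.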
At this point the population should be considerably narrowed. Yet there are still techniques that can filter further.
Metadata analysis can be used to exclude items outside the relevant date range, locations, individuals and subject matter. Various search techniques can also be used to further cull the population and confirm metadata based exclusion techniques.
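The date-range portion of metadata analysis is straightforward to illustrate. This sketch filters on the filesystem's last-modified timestamp only; dates embedded inside documents themselves (author, created, last printed) require format-specific parsing and are outside this example.

```python
from datetime import datetime
from pathlib import Path

def within_range(path: Path, start: datetime, end: datetime) -> bool:
    """True if the file's last-modified timestamp falls inside [start, end].

    Filesystem timestamps are only one metadata source, and they can be
    altered by copying or collection tools, so results should be confirmed
    against other evidence.
    """
    mtime = datetime.fromtimestamp(path.stat().st_mtime)
    return start <= mtime <= end

def cull_by_date(paths, start, end):
    """Keep only files whose modified time falls in the relevant period."""
    return [p for p in paths if within_range(p, start, end)]
```

The same pattern extends to other metadata fields: each one becomes a predicate, and the population shrinks as the predicates are applied in sequence.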
In the end, search techniques along with pattern analysis methods are what will identify the nuggets that will ultimately be used as exhibits. Only then should those documents be converted for redaction and/or reduced to paper. Until that time it is best to remember that there is a penalty, however minuscule, each time the data is converted from one form to another. The new paradigm is to delay conversion and limit the number of times it occurs.
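At its simplest, the final search-based cull ranks the surviving files by keyword hits. This is a hypothetical sketch: production review platforms build full-text indexes with stemming and proximity operators rather than scanning each file linearly, but the culling logic is the same.

```python
import re
from pathlib import Path

def keyword_hits(paths, terms):
    """Rank files by keyword hit count, a crude relevance cull.

    Files with no hits fall out of the population entirely; the rest
    come back sorted so reviewers see the densest candidates first.
    """
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    scored = []
    for p in paths:
        hits = len(pattern.findall(p.read_text(errors="ignore")))
        if hits:
            scored.append((p, hits))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Only the handful of files surfacing at the top of such a ranking ever need to leave their native format.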