Commentary:

Information archiving and e-discovery

An effective information archiving system gives long-term benefits, says Paolo Cattolico, e-discovery marketing manager for HP EMEA

The term ‘e-discovery’ comes from the legal world, where it denotes a specific phase of a trial in which the parties gather electronic information as evidence to support their case in court. In recent years, e-discovery has attracted a lot of attention from the press and analysts. Some companies have had to pay millions of dollars in fines when they were unable to retrieve data, and in other cases, reputation and money have been lost or saved depending on the retrieval of a very small piece of information, such as a single-line e-mail.

Even outside litigation scenarios, companies may want to retrieve information for due diligence (to support a government investigation, for instance), transparency (to respond to an audit) or risk management (for example, to deal with people claiming they received e-mails which were never sent, or vice versa). Today, the term e-discovery is often used to cover all of these cases: the common denominator is that the information should be located somewhere in the company and can act as proof that a certain action has been taken.

The most challenging aspect of e-discovery, from an IT perspective, is that it targets a large amount of unstructured information, such as e-mail messages and files stored virtually everywhere in the company. Often, much of this information is not treated as ‘records’ within the enterprise content management (ECM) system. That means it’s not stored centrally, it doesn’t have an associated retention period, and it’s not classified (for example, to distinguish private from business files). Still, the ability to search through this information is crucial.

Take for example the case of a patent infringement investigation, where we should retrieve all e-mail sent from our company to a certain competitor about three years ago. Assume there’s an e-mail that would provide decisive proof of our correct conduct (lawyers call this a ‘smoking gun,’ even when it proves your innocence). Unfortunately, the e-mail was inadvertently deleted by the sender and it cannot be retrieved, even on backup tapes. Or, worse, the e-mail is there, stored on a user’s laptop or on an external USB hard drive, but the user has forgotten that (or left the company) and we may not be able to locate it.

This example contains some of the typical ingredients of an e-discovery case. First, e-mail is often a target for the investigation (analysts estimate 70-80 per cent of cases involve an e-mail search). Second, backup tapes (or any other media) are not a solution, especially for ‘aged’ information. If, say, you take a full e-mail backup at the end of every year (and discard all previous ones), that backup will not contain information that users created and then deleted during that year. Without a retention time associated with e-mail, there’s little chance that backup can help. Third, users tend to store some potentially valuable information locally, on their personal computer or personal storage. They often do this because of limitations and quotas imposed by the IT department (e.g. on the size of mailboxes). This local information (often categorised as ‘PST files’, from the most common suffix of personal e-mail folders) is more difficult to index and retrieve centrally.
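The backup limitation described above can be illustrated with a minimal sketch. The `Message` class, the mailbox contents and the dates are hypothetical and chosen only for illustration; the point is simply that a message created and deleted between two snapshots never appears in either of them.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical e-mail record; not taken from any real mail system."""
    subject: str
    created: date
    deleted: Optional[date] = None  # None means the user never deleted it


def in_snapshot(msg: Message, snapshot_day: date) -> bool:
    """A full backup only captures messages that exist on the backup day."""
    if msg.created > snapshot_day:
        return False
    return msg.deleted is None or msg.deleted > snapshot_day


def snapshot(mailbox: List[Message], snapshot_day: date) -> List[str]:
    """Return the subjects that a full backup on snapshot_day would contain."""
    return [m.subject for m in mailbox if in_snapshot(m, snapshot_day)]


# Illustrative data: one message kept, one deleted a few months after sending.
mailbox = [
    Message("kept contract", date(2007, 2, 1)),
    Message("deleted reply", date(2007, 3, 10), deleted=date(2007, 6, 5)),
]

year_end_backup = snapshot(mailbox, date(2007, 12, 31))
print(year_end_backup)  # the deleted reply is absent from the year-end backup
```

An archive that captures each message when it is sent or received, rather than at a periodic snapshot, does not have this gap.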

To a large extent, before performing the search, an e-discovery project needs to consolidate all the information sources scattered across the company, determine which policies are enforced for document retention or long-term storage, possibly restore a number of former backups, and index all of these. Each of these steps may be very complex, involving a lot of manual work and time-consuming, error-prone procedures. The complexity would be dramatically reduced if all the information to be searched was already present in a central, online, integrated archive which had automatically captured and stored all data that may be relevant from a legal standpoint. In other words, an integrated archive is a key enabler for a successful e-discovery project.

Today, many companies rely on specialised e-discovery firms for help. These firms provide both legal and IT consultancy, and use sophisticated search tools to retrieve and consolidate dispersed information. They often charge by the amount of data that needs to be analysed, and a typical figure is US$1,500-1,800 per gigabyte. At that rate, analysing the e-mail and files of 60 employees can cost US$1 million, a figure that implies roughly 10 GB of data per employee. An analysis across thousands of company mailboxes or desktops may cost several million dollars. These high figures are based on the assumption that a central archive is not in place, and that a lot of work is needed to ‘rescue’ and categorise the information. While having a central archive will not eliminate the need for some specialised e-discovery software and/or consultancy, it can dramatically cut information collection costs, often reducing the above figures by 50-60 per cent. That would mean, in our first example, a cost saving of US$500,000 or more on a single case. A look at some of the list prices for archiving solutions shows that the cost of implementing an archive for e-mail and files would be largely paid back by a single e-discovery case, let alone all the other advantages it would provide, such as regulatory compliance and savings in all subsequent cases.
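The arithmetic behind these figures can be checked with a short sketch. The per-gigabyte rates and the 50-60 per cent reduction come from the text above; the 10 GB of data per employee is an assumption (the article does not state it, but it is roughly what the US$1 million figure for 60 employees implies), and the function names are purely illustrative.

```python
from typing import Tuple

# Rates quoted in the article, in US dollars per gigabyte analysed.
LOW_RATE_PER_GB = 1500
HIGH_RATE_PER_GB = 1800


def cost_range(employees: int, gb_per_employee: int) -> Tuple[int, int]:
    """Low and high analysis cost when no central archive is in place."""
    total_gb = employees * gb_per_employee
    return total_gb * LOW_RATE_PER_GB, total_gb * HIGH_RATE_PER_GB


def savings_range(cost: int, low_pct: int = 50, high_pct: int = 60) -> Tuple[int, int]:
    """Savings if a central archive cuts the cost by 50-60 per cent."""
    return cost * low_pct // 100, cost * high_pct // 100


# ASSUMPTION: about 10 GB of e-mail and files per employee (not in the article).
low, high = cost_range(employees=60, gb_per_employee=10)
print(low, high)  # 900000 1080000, i.e. around US$1 million

print(savings_range(1_000_000))  # (500000, 600000)
```

With a different amount of data per employee the totals scale linearly, so the estimate is only as good as that assumption.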

One such solution, the HP Integrated Archive Platform (IAP), is installed at hundreds of companies worldwide. Designed to provide a single, central archive for e-mail, files and documents, the IAP includes hardware, software and support services, and is based on a modular, scalable architecture.

The IAP comes as an all-in-one ‘box’ including all the required hardware components with software already installed and tested: it provides automatic indexing of all content, and manages retention times and anti-tampering of data. It also has built-in load balancing, security and management features, accessible via a Web interface.

Moving forward, the scope of e-discovery will likely go beyond e-mail and files, and in general beyond text-based information, to target other forms of communication such as instant messaging, voice and pictures. It’s easy to imagine the interest, within legal boundaries, in scanning and analysing phone conversations and chats for certain terms, or searching for a specific graphical pattern in medical or geographical images. The amount and variety of data to be searched will grow, and so will the importance of a single, integrated archive.

In summary, e-discovery not only helps an organisation facing a legal case, but also provides a new way to capitalise on existing information, making business processes more transparent, manageable and diligent. The foundation for successful e-discovery is a well-organised base for unstructured information, such as an enterprise archive. If you are already building one, that’s a great start; make sure you include processes to seamlessly capture e-mail and other data and transfer them to the archive. In parallel, define the basic document types and their retention times. Your first e-discovery case may be on the way. By taking these simple steps, you will be ready to welcome it with confidence and a (respectful) smile.

This article was originally published in the Spring 2008 issue of Finance on Windows magazine

 

