Using XML for Enterprise Email Discovery - Part 1
Novell Cool Solutions: Feature
By Messaging Architects
Posted: 17 Oct 2006
Demystifying XML: XML vs SQL for Enterprise Email Discovery - Part 1 By Greg Smith
Anatomy of an Email Record
Traditionally, information search and access have been the realm of large relational database systems, which provide access to structured records such as personnel records, inventory, financial data, and business transactions. These systems are composed of inter-related tables of rows and columns that together form a relational database. Most of this data is numeric or alphanumeric and is stored in fields of defined length. These characteristics make SQL a natural fit for querying and returning records in cases where the search string is part of the queried field.
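To make the structured case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The personnel table, its columns, and the sample rows are invented for illustration and are not taken from any real system; the point is only that a search string matched against a fixed-length field maps directly onto a SQL WHERE clause.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical personnel table."""
    conn = sqlite3.connect(":memory:")
    # Fixed-length, alphanumeric fields: the kind of data SQL handles well.
    conn.execute(
        "CREATE TABLE personnel ("
        "id INTEGER PRIMARY KEY, surname VARCHAR(30), dept VARCHAR(20))"
    )
    conn.executemany(
        "INSERT INTO personnel (id, surname, dept) VALUES (?, ?, ?)",
        [
            (1, "Smith", "Services"),
            (2, "Jones", "Finance"),
            (3, "Smithson", "Legal"),
        ],
    )
    return conn


def find_by_surname(conn, fragment):
    """Return (id, surname) rows whose surname field contains the fragment."""
    cur = conn.execute(
        "SELECT id, surname FROM personnel WHERE surname LIKE ? ORDER BY id",
        (f"%{fragment}%",),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in find_by_surname(conn, "Smith"):
        print(row)
```

The query either matches a row or it does not; there is no notion of one record being more relevant than another.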
In the electronic age, email has become a key asset for the enterprise. Email records are an important source of data within the enterprise: they are the vast repository of corporate memory and often they document corporate intellectual property, in addition to day-to-day business operations. Email records have attained the same status as printed records and as such are often required as evidence in legal disputes or internal company audits. Email is now a dominant form of evidence requested in lawsuits, placing a heavy burden on businesses to ensure that email is managed consistently and can be produced for discovery.
A recent Osterman Research survey shows that as many as 60% of organizations have been ordered to produce employee email by a court or regulatory body. With recent amendments to the Federal Rules of Civil Procedure, businesses are now required to preserve any email that might be considered relevant as soon as there is a reasonable expectation that a lawsuit might occur. As a result, organizations may have to spend hundreds of thousands of dollars locating email within their messaging systems or tape backups in response to litigation or regulatory compliance requests.
Email records can be considered structured data insofar as they share common fields, such as sender, recipient, message creation date, and other header information. Their body content, however, is largely unstructured: it contains not only text but also embedded graphics, sound, attachments of different kinds, web content, and other data. Fast and accurate retrieval of these records therefore depends less on searching the structured fields than on being able to search the unstructured data. Simple keywords are no longer sufficient, because other parameters, such as relevancy, order, and sentence structure, come into play.
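The split between structured headers and unstructured body can be sketched with Python's standard email package. The sample message, the function names, and the crude term count below are all assumptions made for illustration; in particular, counting occurrences is only a stand-in for the relevancy ranking a real discovery tool would need, and it is not the method used by any product named in this article.

```python
import re
from email import message_from_string

# A hypothetical message, invented for this example.
RAW = """From: alice@example.com
To: bob@example.com
Date: Tue, 17 Oct 2006 09:30:00 -0400
Subject: Contract draft

Bob, the revised contract is attached. Please review the contract terms.
"""


def split_record(raw):
    """Separate the structured header fields from the free-text body."""
    msg = message_from_string(raw)
    headers = {name: msg[name] for name in ("From", "To", "Date", "Subject")}
    body = msg.get_payload()  # plain string for a simple, non-multipart message
    return headers, body


def term_frequency(body, term):
    """Count whole-word, case-insensitive occurrences of term in the body."""
    words = re.findall(r"[a-z0-9]+", body.lower())
    return words.count(term.lower())


if __name__ == "__main__":
    headers, body = split_record(RAW)
    print(headers["From"], "->", headers["To"])
    print("occurrences of 'contract':", term_frequency(body, "contract"))
```

The header fields could be stored and queried like any database column, while the body needs text-oriented handling before a search over it can be ranked in any meaningful way.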
In short, the discovery requirements for electronic records increasingly resemble those of web-based search engines rather than those of database applications.
About the Author: Greg Smith, MCNE and MCNI, has been working in the high-technology field for more than 15 years, predominantly with Novell Platinum integrators and resellers. Greg Smith is one of the main designers of Messaging Architects' GWArchive, the only GroupWise-native email retention solution included in the Gartner Magic Quadrant for active archiving. In his current position as Director of Professional Services at Messaging Architects, he brings his networking and messaging expertise to a company that specializes in GroupWise enhancements and product development. Greg has been active in the area of public speaking, giving technical presentations at GroupWise Advisor Summits, as well as at Novell BrainShare.