A distributed eDiscovery processing engine that coordinates file and metadata extraction, imaging, endorsing, branding, and OCR across multiple machines within a given network.
Performing electronic discovery processing on large data sets is a processor- and memory-intensive exercise that often exceeds the resources available on a single physical computer. Even the most advanced multi-core servers struggle to process data sets of more than a few million documents in a timely manner. Current electronic discovery processing software focuses on distributing the workload across multiple threads within a single instance of the running application. While this helps leverage all of the resources granted to the application, the maximum processing power remains constrained by the physical limitations of the host computer. Because processing so many documents efficiently demands large amounts of CPU and memory, most electronic discovery processing systems reach a bottleneck once resource requirements exceed what the host computer can supply. While these limits are acceptable for thousands or even millions of documents, they are not acceptable as data volumes grow by orders of magnitude.