Keyword searching is an objective search method commonly used to limit data collections to documents containing terms believed to be strong indicators of potential relevance. To accomplish this goal, eDiscovery practitioners create lists of words, or search strings built with proximity connectors, which are then compared against an index of the terms extracted from the documents in the database. The decision tree below can aid in this effort.
1) What is the purpose of the review?
a) Will there be a production? Am I trying to cull the dataset for the review itself, or only to look at the most relevant documents first?
2) Have the parties agreed on final keywords without testing?
a) If yes, it may be okay to go ahead and run the search pre-processing.
b) If not, you need to run the search post-processing, test the keyword results, and refine the terms. Keyword searching is most defensible when run post-processing, as running the search prior to processing presents issues such as:
3) Is the dataset conducive to a keyword search?
a) If not, is Optical Character Recognition needed?
4) Does the review tool index all terms?
a) If not, which words were not indexed by the tool?
5) Does the data set contain foreign languages?
a) If yes, do you need to capture foreign language documents?
6) Tailor the terms narrowly so that they capture as much relevant material as possible while limiting the capture of non-relevant material.
7) Check for terms that have multiple uses and connotations.
8) Check for terms that might be in common use in signature blocks (e.g., “confidential”).
9) Check for terms that might be part of the client’s domain name.
10) Check for common misspellings of words.
11) Broaden your terms with common synonyms.
12) Think broadly about the actual types of records you intend to return.
13) Link terms with search syntax, such as proximity connectors, as much as possible in order to further narrow your search, e.g., “draft* /10 will” as opposed to “will” alone.
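
To illustrate how a proximity string such as “draft* /10 will” narrows a search compared with “will” alone, here is a minimal Python sketch. It is not how any particular review tool evaluates queries: the function names, the sample documents, and the document IDs are all hypothetical, and the sketch assumes that “/10” means the two terms fall within ten words of each other in either order and that a trailing “*” is a prefix wildcard. Review platforms differ on these details, so check your tool's syntax guide before relying on a given interpretation.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_matches(pattern, token):
    """Compare a token to a search term; a trailing * acts as a prefix wildcard."""
    if pattern.endswith("*"):
        return token.startswith(pattern[:-1])
    return token == pattern


def proximity_hit(text, left, right, distance):
    """Return True if `left` and `right` occur within `distance` words
    of each other, in either order (an assumed reading of "/N")."""
    tokens = tokenize(text)
    left_pos = [i for i, t in enumerate(tokens) if term_matches(left, t)]
    right_pos = [i for i, t in enumerate(tokens) if term_matches(right, t)]
    return any(
        i != j and abs(i - j) <= distance
        for i in left_pos
        for j in right_pos
    )


# Hypothetical sample documents, invented for illustration only.
docs = {
    "DOC-001": "Please see the attached draft of the will for your review.",
    "DOC-002": "I will call you tomorrow about the lunch order.",
    "DOC-003": "She drafted a new will last week.",
}

# Broad search: every document containing the word "will".
broad = sorted(d for d, text in docs.items() if "will" in tokenize(text))

# Narrow search: "draft*" within 10 words of "will".
narrow = sorted(
    d for d, text in docs.items() if proximity_hit(text, "draft*", "will", 10)
)

print("will alone:     ", broad)
print("draft* /10 will:", narrow)
```

In this toy set, the bare term returns all three documents, including the one that uses “will” only as a verb, while the proximity string returns just the two that discuss drafting a will. Running both versions against a sample of the real dataset and comparing the hit counts is one way to test and refine terms before they are finalized.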