Best Practices:
eDiscovery Search
Improve Speed and Accuracy of Reviews & Productions with the Latest Tools
February 27, 2014
Karsten Weber
Principal, Lexbe LC
eDiscovery FAST
eDiscovery Webinar Series: Info & Future
○ Takes Place Monthly
○ Covers a Variety of Relevant eDiscovery Topics
○ Presentations Available for Download by Registrants
Next Month:
Legal Timelines and Early Case Assessment
Best Practices for Keyword Search, Lexbe eDiscovery Webinar Series February 27, 2014
Questions & Technical Issues
If you have any questions or technical issues, please e-mail them to:
webinars@lexbe.com
Questions will be forwarded to Karsten and answered during the webinar or via
e-mail if we run out of time.
Karsten Weber Bio
○ Current
- Principal of Lexbe LC
- Principal Architect of Lexbe eDiscovery Suites and Lexbe
eDiscovery Services
○ Prior Experience
- Consulting Expert, Lumin Expert Group
- Director of Software, nLine Corporation
- Software Engineering Manager, KLA-Tencor
○ Education
- MBA, University of Texas
- M.S. Engineering, Danish Technical University
Contact
Karsten Weber
512-686-3469
karsten@lexbe.com
Best Practices for Keyword Search
Use of Keyword Search In Discovery
○ Early Stage Culling - Reduce amount of ESI to be reviewed by using
keywords to cull document collections.
○ Keyword-Based Responsive & Privilege Review - Construct search
queries to return documents that are likely to be responsive, privileged,
or confidential. Search by name and email of counsel, and by ‘privilege’,
‘work product’, ‘confidential’ and related keywords.
○ ID Documents for Depo Prep - Find and assign key documents related
to specific case participants to prepare for depositions. Search by email
addresses used, names and nicknames used, important issues
associated with deponent.
○ ID of Key Docs for Trial - Find and mark key case documents. Code
documents that will be needed for trial.
Pros of Keyword Searching
○ Fast - Keyword search is very fast compared with other document
search methodologies.
○ Inexpensive - Good results can be obtained at little cost compared with
manual review or other computer assisted methodologies.
○ Quality - Search can deliver high quality results, particularly if keyword
terms are carefully developed and tested.
○ Avoids Manual Review Errors/Inconsistencies - Search results are
computer generated, and so avoid known human review errors that
can result from fatigue, inadequate training, lack of focus, etc.
Cons of Keyword Searching
○ Search Can be Over or Under-Inclusive - Search terms can bring back
too many junk results or miss good results. These are known as ‘false
positives’ and ‘false negatives’.
○ Difficulty of Creating Good Search Terms - Constructing good search
terms takes design time, testing, iterations, and analysis.
○ Non-Searchable Text - Search results can only be as good as the
underlying searchable text. ESI collections and review tools can miss
text that a human reviewer might catch for a variety of reasons.
○ Some file types can’t be indexed - There is little consistency in what
files can be indexed across litigation databases.
Construct Quality Searches
○ Start with Request for Production - Translate the demands of the RFP into
a keyword search strategy.
○ Interview Custodians - Ask key case participants / data custodians about
their ESI. Use their insights and their terminology to find obscure key
documents.
○ Include Jargon - Seek out terms specific to the industry, company, or
company sub-culture that you may not be familiar with.
○ Include Misspellings - Include misspelled versions of keywords in your
search string (or use ‘fuzzy search’ settings or boolean wildcards) to
account for emails, etc. with typos.
Use Search Expanders
Search Expanders Enable Easy Expansion to Reduce False Negatives
○ Concept - Thesaurus lookup and synonym search. Conceptually expands
search query.
○ Stemming - Expands query to include derivative terms associated with the
search keywords.
○ Fuzzy - Allows insertion, deletion, or substitution of a character in the
search query to account for search errors, spelling errors within the
document, and potential OCR errors.
○ Phonetic - Returns results that sound similar to the search query.
Concept Search Example
‘Trade’ = ‘Swap’ = ‘quid pro quo’
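As a rough illustration of the thesaurus-lookup idea, the sketch below expands one keyword into an OR query. The `THESAURUS` dictionary and the `expand_concepts` helper are hypothetical names used only for this example; a review platform would supply its own synonym source and query syntax.

```python
# Hypothetical thesaurus, seeded with the synonyms from the slide.
THESAURUS = {
    "trade": {"swap", "quid pro quo"},
}


def expand_concepts(term, thesaurus=THESAURUS):
    """Return the term plus any listed synonyms, joined as an OR query."""
    variants = [term] + sorted(thesaurus.get(term.lower(), set()))
    # Quote multi-word phrases so they stay together in the query string.
    quoted = [f'"{v}"' if " " in v else v for v in variants]
    return " OR ".join(quoted)


print(expand_concepts("trade"))  # trade OR "quid pro quo" OR swap
```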
Stemming Search Example
‘Trade’ = ‘Trading’ = ‘Trades’
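A minimal sketch of how stemming can make these three forms match, assuming a deliberately naive suffix-stripping rule. This is not the algorithm of any particular search engine (production engines typically use something more careful, such as a Porter-style stemmer); `naive_stem` and `stem_matches` are illustrative names.

```python
import re

# Suffixes are tried in order; the first one that fits is removed.
SUFFIXES = ("ing", "es", "ed", "s", "e")


def naive_stem(word):
    """Strip one common suffix, keeping a root of at least three letters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def stem_matches(query, text):
    """Return the words in text that share a stem with the query."""
    target = naive_stem(query)
    return [w for w in re.findall(r"[A-Za-z]+", text) if naive_stem(w) == target]


print(stem_matches("Trade", "Trading desks booked two trades and one trade today"))
# ['Trading', 'trades', 'trade']
```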
Fuzzy Search Example - Misspelling
‘Fastow’ = ‘Fastaw’ = ‘Fasto’
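One common way to implement this kind of fuzzy matching is edit (Levenshtein) distance, which counts character insertions, deletions, and substitutions. The sketch below assumes a tolerance of one edit; the function names and the `max_edits` parameter are illustrative, and real platforms expose their own fuzziness settings.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_match(query, word, max_edits=1):
    """True if word is within max_edits character edits of the query."""
    return edit_distance(query.lower(), word.lower()) <= max_edits


print(fuzzy_match("Fastow", "Fastaw"))  # True (one substitution)
print(fuzzy_match("Fastow", "Fasto"))   # True (one deletion)
print(fuzzy_match("Fastow", "Faster"))  # False (two substitutions)
```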
Boolean Search
○ Basic Boolean Operators:
- AND: returns results including both terms
- OR : looking for at least one of a list of terms
- NOT : exclude terms you don’t want
- ( ) : can be used to separate OR statements from the rest of the
boolean string.
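- PRE/n : the first search term must precede the second term by no more
than n words.
- W/n : the two terms must appear within n words of each other, in either
order.
- Wildcard Characters: ‘*’ replaces a single letter in your search term; ‘!’
appended to a root allows for stemming search within a boolean query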
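Under the hood, AND, OR, and NOT can be thought of as set operations over an index that maps each term to the documents containing it. The sketch below uses a tiny made-up index (`INDEX`, with invented document ids) purely for illustration.

```python
# Hypothetical inverted index: term -> ids of documents containing it.
INDEX = {
    "chewco": {1, 2, 5},
    "fastow": {2, 3},
    "lunch": {3, 5},
}


def docs(term):
    """Documents containing the term (empty set if the term is unknown)."""
    return INDEX.get(term.lower(), set())


both = docs("chewco") & docs("fastow")                       # chewco AND fastow
either = docs("chewco") | docs("fastow")                     # chewco OR fastow
filtered = (docs("chewco") | docs("fastow")) - docs("lunch")  # (... OR ...) NOT lunch

print(both, either, filtered)
```

The parentheses in the last line play the same role as ( ) in a boolean string: the OR is resolved first, then the NOT is applied to its result.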
Use Search Limiters
Search Limiters Reduce False Positives (Noise)
○ Filter Out Unneeded File Types - Some file types are unlikely to lead to
useful information and can be excluded.
○ Use Boolean Modifiers to Limit Overly Expansive Searches - Boolean
modifiers can reduce the number of documents returned from a query
while increasing the relevance of those files. Exclude certain words or
combinations, and specify word order.
Boolean Search Example
‘Lay’! w/25 ‘Chewco’
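A minimal sketch of how a proximity query like this one could be evaluated, assuming that ‘!’ is a root expander and that distance is counted in word tokens. Counting conventions and operator syntax differ between platforms, so `within` and its `ordered` flag are illustrative only.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def positions(tokens, term):
    """Token positions matching term; a trailing '!' matches any word with that root."""
    term = term.lower()
    if term.endswith("!"):
        root = term[:-1]
        return [i for i, t in enumerate(tokens) if t.startswith(root)]
    return [i for i, t in enumerate(tokens) if t == term]


def within(text, first, second, n, ordered=False):
    """True if the terms occur within n words; ordered=True requires first before second."""
    tokens = tokenize(text)
    for i in positions(tokens, first):
        for j in positions(tokens, second):
            gap = j - i
            if gap == 0:
                continue  # same token matched both terms
            if (0 < gap <= n) if ordered else (abs(gap) <= n):
                return True
    return False


text = "Chewco was discussed before Lay's office approved the deal"
print(within(text, "Lay!", "Chewco", 25))                # True
print(within(text, "Lay!", "Chewco", 25, ordered=True))  # False
```

Note that a root expander on a short root is broad: in this sketch ‘Lay!’ would also match words such as ‘layoff’ or ‘layer’, which is one reason to pair it with a proximity limit.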
Test Keyword Searching Results
○ Look at Results Returned - Searching without review and testing may result
in low quality results.
○ Sample & Look for Ways to Limit Search - Create new queries that reduce
false positives.
○ Find New Keywords - Viewing search results may prompt the discovery of
additional keywords that could be used to expand or narrow search
queries.
○ Fuzzy and Concept Search - Find new keywords by running searches that
return synonyms and near-identical words.
Keyword searching becomes an iterative process.
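One way to put numbers on each iteration is to compare what a query returned against a reviewer-coded sample. The sketch below computes precision (the share of returned documents that are responsive, which falls as false positives rise) and recall (the share of responsive documents that were returned, which falls as false negatives rise). The document ids are invented for illustration.

```python
def precision_recall(returned, responsive):
    """Compare documents returned by a query with those coded responsive."""
    returned, responsive = set(returned), set(responsive)
    true_positives = returned & responsive
    precision = len(true_positives) / len(returned) if returned else 0.0
    recall = len(true_positives) / len(responsive) if responsive else 0.0
    return precision, recall


# Hypothetical sample: the query returned docs 1-4; reviewers coded 2, 3, 5 responsive.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 3, 5})
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```

Tracking both figures across query revisions shows whether a new limiter is removing noise or also dropping responsive documents.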
Common Indexing Methods
There Are Traditionally Two Types of Search Indices:
○ Imaged and OCRed - The search text comes from the files after they
have been converted to TIFF / PDF.
○ Extracted Text - The search text comes from text extracted from the
original file.
Both approaches have significant limitations.
Search Index Based on OCR of Imaged Files
○ Description - Native files (email, attachments, spreadsheets, etc.) are
converted to a paginated image file and then OCR is applied to make the
text searchable (e.g., TIFF production with no extracted text).
○ How? - Conversion software uses a ‘print-driver approach’ to virtually
image what would have been physically printed.
○ Data Not Indexed - Headers/footers/notes, comments and revisions,
highlighted text, hidden sheets or text, print selections, applied filters,
Search Index Based on OCR of Imaged Files: Example
[Figure: side-by-side comparison of "How Doc Appears Natively" and "OCR Based Index Will Include", with callouts ‘Chewco 2000 Pro Forma Sheet’ and ‘Body Text’]
Search Index Based on Native Extraction
○ Description - Available text from native files (email, attachments,
spreadsheets, etc.) is extracted and indexed by the search engine using
text parsing (e.g., pure native review).
○ How? - Only available text is used. No OCR is applied.
○ Data Not Indexed - Non-text files (e.g., scanned documents) and embedded
text, objects, or visuals will not be indexed. Different native extraction
methods can also vary in their ability to recognize certain types of text.
Search Index Based on Native Extraction: Example
[Figure: side-by-side comparison of "How Doc Appears Natively" and "Native Extraction Index Will Include", showing ‘Page 1/12’, ‘Chewco 2000 Pro Forma Balance Statement Sheet’ and ‘[S1: CRITICAL ENRON EVIDENCE]’]
Dual Index
Benefits of Dual Index Approach
○ The Lexbe search engine indexes both the text extracted from native files
(email, attachments, spreadsheets, etc.) and the OCR text of a paginated
PDF or TIFF rendering of those same files.
○ This is the most comprehensive approach and minimizes the potential for
lost and unsearchable data.
Index Method        Captures Embedded Text   Captures Text Excluded From Print   Captures Hidden Text
Imaged/OCR          Yes                      No                                  No
Native Extraction   No                       Yes                                 Yes
Lexbe Dual Index    Yes                      Yes                                 Yes
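The dual-index idea can be sketched as the union of two term indexes built from different text sources for the same documents. This is only a conceptual illustration with invented sample text, not a description of how the Lexbe engine is implemented; `build_index` and `dual_index` are hypothetical names.

```python
import re
from collections import defaultdict


def build_index(texts):
    """Map each term to the ids of the documents whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in texts.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def dual_index(extracted, ocr):
    """Union of the native-extraction index and the OCR index."""
    merged = defaultdict(set)
    for index in (build_index(extracted), build_index(ocr)):
        for term, ids in index.items():
            merged[term] |= ids
    return merged


# Hypothetical text for one document from each source.
extracted = {"doc1": "hidden sheet mentions Chewco"}
ocr = {"doc1": "scanned letterhead Enron"}

index = dual_index(extracted, ocr)
print(sorted(index["chewco"]), sorted(index["enron"]))  # ['doc1'] ['doc1']
```

A term that only one source can see (hidden text for extraction, embedded image text for OCR) is still searchable in the merged index.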
Thank You for Attending
About Lexbe and Contact Information
Phone (Toll Free) (800) 401-7809
Webinar Questions: webinars@lexbe.com
Next Month’s Webinar:
Legal Timelines and Early Case Assessment
Lexbe is an eDiscovery software and services provider based in Austin, TX.
