The document discusses how digitizing legal documents and processes can help disrupt the legal industry for positive change. It outlines how digitizing case journals, profiling expert witnesses, and reviewing expert reports using technologies such as optical character recognition, natural language processing, and databases can increase access to information, improve the quality of expert testimony and reports, and help reduce wrongful convictions. It also identifies scope for further automation in areas such as scheduling and case updates, and offers tips on ensuring that digitization turns data into useful, searchable, and distributed information.
MongoDB World 2018: Digitizing Colossal Data: Using Tech to Disrupt the Legal Industry and Bring Forth Positive Change
1. Digitizing Colossal Data
Using Tech to Disrupt the Legal Industry
for Positive Change
Neha Nivedita
Software Engineer, NodeXperts
nivedit.in
@niknivedit
4. Agenda
1. Need for Tech Disruption in Legal Industry
2. Limiting factors & their effects
3. Current situation
4. Digital solutions
5. Scope for Automation
6. Tips for Digitization
12. An expert witness is a person
of specialized knowledge or
skill in a particular field
qualified to present their
opinion about the facts of a
case during legal proceedings.
13. Case Study
Vaccination case involving a severely disabled baby girl:
● Plaintiff was an infant with a history of seizures
● She was given whole cell pertussis vaccination
● Her brain was found to be profoundly damaged
14. The defendant presented fully
credentialed experts
against a scientist with subpar
expert credentials for the
plaintiff
15. The jury ruled in favor of the
plaintiff.
However, the judge set aside
the verdict due to inadequate
proof by the plaintiff’s expert
witness.
16. “It was somewhat disquieting not to be able to reach out to the
scientific community to obtain an expert who could testify as a
‘neutral authority’ in court.
The second thing that troubled me was that when the case was
over, I felt that impartial scientists who knew the field might
well agree that the expert retained by the plaintiff should not be
allowed to testify on this subject again.”
— Judge Jack Weinstein
(1998)
17. “...I did not know, however, what, if anything, I could do about
this. There was no acceptable mechanism for contacting the
relevant professional organizations, nor did I have any
assurance that those organizations would have been receptive
to my communications.”
— Judge Jack Weinstein
(1998)
19. Current Situation
● PACER (1996)
○ Public Access to Court Electronic Records
● CM / ECF (1998)
○ Case Management / Electronic Case Files
● eCourts (2005)
● E-filing of Supreme Court cases (2017)
● Daubert Tracker
20. Hardcopy Records
● Cases before 1999
● Court argument transcripts to date
● Lack of uniform data formats across state courts
Even older case files:
● US: Archived under NARA but in unsearchable formats
● India: Still only on hardcopy records
* NARA = National Archives and Records Administration
21. Problems
● Tedious litigation process
● Inaccessibility of affordable court
transcripts
● Weak evidence and expert reports
27. Digitizing Case Journals
Digitized millions of court case documents
● Used structured format for searchability
● Stored in a live database
● Analyzed to extract meaningful keywords
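The three steps on this slide (structured format, storage, keyword analysis) can be sketched roughly as below. This is a minimal illustration, not the pipeline from the talk: the `CaseRecord` field names, the stop-word list, and the frequency-based `extract_keywords` helper are all assumptions, and a plain in-memory object stands in for the live database.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical stop-word list; a real pipeline would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "was", "for", "by", "with", "on"}


@dataclass
class CaseRecord:
    # Field names are illustrative, not the schema used in the talk.
    citation: str
    judge: str
    location: str
    text: str
    keywords: list = field(default_factory=list)


def extract_keywords(text, top_n=5):
    """Return the most frequent non-stop-words, a crude stand-in for keyword analysis."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


record = CaseRecord(
    citation="Example v. Sample (hypothetical)",
    judge="J. Doe",
    location="New York",
    text="The vaccine was given to the infant. The infant had seizures after the vaccine.",
)
record.keywords = extract_keywords(record.text, top_n=2)
print(record.keywords)
```

Keeping the extracted keywords on the record itself means the search layer can query them without re-reading the full text.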
28. Digitizing Case Journals
Added nifty search features on the keywords:
● Named Entity Recognition algorithm
● Name-wise search using NER
● Search with citation, headnote, judge, location etc.
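Searching by citation, judge, or location amounts to looking records up by field value. The sketch below shows one way this could work with a small in-memory index; the sample records, `build_index`, and `search` are hypothetical, and the talk's actual search ran on Solr rather than on code like this.

```python
from collections import defaultdict

# Hypothetical records; field names mirror the search facets listed above.
CASES = [
    {"id": 1, "citation": "100 Ex. 1", "judge": "Doe", "location": "New York"},
    {"id": 2, "citation": "200 Ex. 2", "judge": "Roe", "location": "Delhi"},
    {"id": 3, "citation": "300 Ex. 3", "judge": "Doe", "location": "Delhi"},
]


def build_index(cases, fields):
    """Map (field, lowercased value) -> set of case ids."""
    index = defaultdict(set)
    for case in cases:
        for name in fields:
            index[(name, case[name].lower())].add(case["id"])
    return index


def search(index, **criteria):
    """Return ids of cases that match every field=value criterion."""
    matches = [index.get((name, value.lower()), set()) for name, value in criteria.items()]
    return set.intersection(*matches) if matches else set()


INDEX = build_index(CASES, ["citation", "judge", "location"])
print(sorted(search(INDEX, judge="doe")))
print(sorted(search(INDEX, judge="Doe", location="Delhi")))
```

Combining criteria is a set intersection, so adding another facet such as headnote only requires indexing one more field.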
29. Digitizing Case Journals
Real time records availability!
● Attorneys can research easily
● Time to build a case reduced
● Execute hundreds of queries per second
● Citations abound!
30. Digitizing Case Journals
● Optical Character Recognition (OCR)
○ To convert text and images from scans into data
objects
● Named Entity Recognition (NER)
○ Labels sequences of words that are names of things
○ Stanford’s Java-based NLP library
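To make "labels sequences of words that are names of things" concrete, here is a deliberately crude sketch that only finds candidate name spans by capitalization. It is not how Stanford's library works (that relies on a trained statistical model and also assigns types such as person or location); the `CANDIDATE` pattern and `candidate_entities` helper are assumptions for illustration only.

```python
import re

# Two or more consecutive capitalized words, e.g. "Jack Weinstein".
# A heuristic only: it assigns no entity types and misses single-word names.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def candidate_entities(text):
    """Return capitalized multi-word spans as candidate named entities."""
    return CANDIDATE.findall(text)


sample = "The verdict was set aside by Judge Jack Weinstein in New York."
print(candidate_entities(sample))
```

Note that the heuristic keeps the title "Judge" as part of the span, which is one of the reasons a trained recognizer is preferable in practice.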
31. Digitizing Case Journals
● Solr Plugin
○ To process the NER requests
○ Return the named entities from the texts
○ Super fast search!
● Node.js + Vue + Zend app
33. Profiling Expert Witnesses
● Earlier it took 3-4 days to create a single
expert profile
● A lot of documents had to be processed and
analyzed manually
● Needed a contextual search tool to easily
browse through the docs
34. Profiling Expert Witnesses
● NER to identify names in a document
● Auto crawling through obscure legal
research databases to get hard-to-find data
● A Zend dashboard to generate expert
profiles
New turnaround time: 3-4 hours!
36. Reviewing Expert Reports
Real time web app:
● Expert witnesses write legal reports
● Verify quality of expert reports
● Peer reviews were done offline
● Need for a digital platform for this type of
service
37. Reviewing Expert Reports
Node.js + Express + Angular
● Platform for blind peer reviews of reports
● Step-wise auditable review process
● Iteration-based approval and rejection
flows for reports
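A step-wise, auditable, iteration-based review flow can be modeled as a small state machine with an append-only log. The sketch below is one possible shape, written in Python for brevity rather than the Node.js stack named on the slide; the `Status` values, the `TRANSITIONS` table, and the `Report` class are all hypothetical. Reviewers are referred to by opaque ids, which is one simple way to keep the review blind.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    IN_REVIEW = "in_review"
    REVISION_REQUESTED = "revision_requested"
    APPROVED = "approved"


# Allowed transitions; any other move is refused.
TRANSITIONS = {
    Status.SUBMITTED: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.APPROVED, Status.REVISION_REQUESTED},
    Status.REVISION_REQUESTED: {Status.SUBMITTED},
    Status.APPROVED: set(),
}


@dataclass
class Report:
    report_id: str
    status: Status = Status.SUBMITTED
    iteration: int = 1
    audit_log: list = field(default_factory=list)

    def move_to(self, new_status, actor):
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"{self.status.value} -> {new_status.value} is not allowed")
        # Log every step so the whole review history can be audited later.
        self.audit_log.append((self.iteration, self.status.value, new_status.value, actor))
        if self.status is Status.REVISION_REQUESTED:
            self.iteration += 1  # a resubmission starts a new review round
        self.status = new_status


report = Report("R-1")
report.move_to(Status.IN_REVIEW, "reviewer-a")
report.move_to(Status.REVISION_REQUESTED, "reviewer-a")
report.move_to(Status.SUBMITTED, "author")
report.move_to(Status.IN_REVIEW, "reviewer-b")
report.move_to(Status.APPROVED, "reviewer-b")
print(report.status.value, report.iteration, len(report.audit_log))
```

Because the transition table is data rather than scattered conditionals, adding an outright rejection state would only require one more entry.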
38. Reviewing Expert Reports
MongoDB
● Report document storage
● Maintains expert & reviewer profiles
● Fast search capabilities
● Real time updates to web app
42. Scope for Automation
● Scheduling can be a pain
● PACER
○ Public Access to Court Electronic Records
○ Uses an RSS-feed-based alerting system
43. Scope for Automation
● Automate alerts for:
○ Court dates
○ Case status changes
● Make life easier for attorneys:
○ Especially underfunded public defenders
& pro-bono attorneys
○ They have to juggle a lot of things!
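Since the alerts in question are delivered over RSS, automating them could start with a poller that filters feed items for watched cases and skips entries already seen. The sketch below assumes a generic RSS 2.0 layout; `SAMPLE_FEED`, the case numbers, and the `new_alerts` helper are made up for illustration, and a real court feed may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; real court feeds will differ in structure and detail.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example court feed</title>
    <item>
      <guid>case-1-entry-10</guid>
      <title>1:00-cv-00001 Example v. Sample</title>
      <description>Hearing scheduled</description>
    </item>
    <item>
      <guid>case-2-entry-4</guid>
      <title>1:00-cv-00002 Doe v. Roe</title>
      <description>Order entered</description>
    </item>
  </channel>
</rss>"""


def new_alerts(feed_xml, watched_cases, seen_guids):
    """Return unseen feed items whose title mentions a watched case number."""
    alerts = []
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        guid = item.findtext("guid", default="")
        title = item.findtext("title", default="")
        if guid in seen_guids:
            continue  # already alerted on this entry
        if any(case in title for case in watched_cases):
            summary = item.findtext("description", default="")
            alerts.append({"guid": guid, "title": title, "summary": summary})
            seen_guids.add(guid)
    return alerts


seen = set()
first = new_alerts(SAMPLE_FEED, {"1:00-cv-00001"}, seen)
second = new_alerts(SAMPLE_FEED, {"1:00-cv-00001"}, seen)
print(len(first), len(second))
```

Tracking seen entry ids is what keeps an attorney from being notified twice about the same docket update when the feed is polled repeatedly.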