This talk describes legal NLP ideas and discusses the following papers:
HLDC: Hindi Legal Documents Corpus https://precog.iiit.ac.in/pubs/HLDC_ACL_2022.pdf
Drug consumption: https://precog.iiit.ac.in/pubs/Effect_oF_Feedback_on_Drug_Consumption_Disclosures_on_Social_Media___ICWSM2023___16Sept1730hrs.pdf
4. Questions from you
Opportunity to work with PK as a postdoc (3)
What are his insights on biology-based technological innovations that are possible with present-day
state-of-the-art technologies?
What new cutting-edge research is happening in the interdisciplinary field of computer and electrical
engineering?
Publishing papers: what should one take care of while writing?
How can we apply for funding while doing a PhD?
Scope of postdoc opportunities in India, and a comparison with abroad
ACM India opportunities
How do you publish as an undergraduate?
How can we prepare for research while studying as undergraduates?
Career opportunities after a PhD
Any anonymisation techniques for handling Personally Identifiable Information (PII) in data visualisation?
5. What is Social Computing?
https://en.wikipedia.org/wiki/Social_computing
7. Legal AI for Indian Context
District courts are usually the first
point of contact between the people
and the judiciary.
Lower courts in India are burdened
with a backlog of cases (~40 million
as of 2021).
Local languages are used in the
documents filed in district courts in
India.
Court hierarchy (top to bottom): Supreme Court, High Courts, District Courts
8. Legal AI / NLP - Data
We collected ~900k district court case documents from Uttar
Pradesh
All documents in Hindi, written in Devanagari
There are legal corpora for the European Court of Justice and Chinese
courts, but none for Indian district courts
9. Legal AI / NLP - Data
There are around 300 different case types; the table shows the prominent
ones
The majority of the case documents correspond to bail applications
Variation in number of case documents per district
Case types in HLDC
10. Legal AI / NLP - Bail Documents
District-wise ratio of number of bail applications to total cases
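The ratio named in this figure caption can be computed with a simple count per district. The sketch below is illustrative only: the function name, the (district, case_type) record layout, the "bail" label, and the toy records are all assumptions, not taken from HLDC.

```python
from collections import Counter


def bail_ratio_by_district(cases):
    """Return {district: bail cases / total cases} for (district, case_type) pairs."""
    total = Counter()
    bail = Counter()
    for district, case_type in cases:
        total[district] += 1
        if case_type == "bail":  # assumed label for bail applications
            bail[district] += 1
    return {district: bail[district] / total[district] for district in total}


# Hypothetical toy records, not real HLDC data.
toy_cases = [
    ("district_a", "bail"),
    ("district_a", "bail"),
    ("district_a", "civil"),
    ("district_b", "bail"),
    ("district_b", "criminal"),
]
ratios = bail_ratio_by_district(toy_cases)
```

Plotting these ratios per district would give a chart of the kind the caption describes.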
11. Legal AI / NLP - Bail Prediction Model
In general, the performance is lower in district-wise settings, possibly due to large
variation across districts
Overall, summarization models perform better than Doc2Vec and simpler
Transformer-based models
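One way to see why district-wise evaluation can be harder is to compare how the two splits are built. The sketch below assumes that "district-wise" means the test districts never appear in training, while the other setting mixes documents from all districts. The function names, the document fields, and the toy data are hypothetical and are not the authors' code.

```python
import random


def district_wise_split(docs, test_fraction=0.2, seed=0):
    """Hold out whole districts so train and test share no district."""
    districts = sorted({doc["district"] for doc in docs})
    random.Random(seed).shuffle(districts)
    n_test = max(1, int(len(districts) * test_fraction))
    test_districts = set(districts[:n_test])
    train = [doc for doc in docs if doc["district"] not in test_districts]
    test = [doc for doc in docs if doc["district"] in test_districts]
    return train, test


def all_district_split(docs, test_fraction=0.2, seed=0):
    """Shuffle documents so any district can appear on both sides."""
    shuffled = docs[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy documents: 5 districts with 4 documents each.
toy_docs = [
    {"district": f"district_{i % 5}", "text": f"doc {i}", "granted": i % 2}
    for i in range(20)
]
train_dw, test_dw = district_wise_split(toy_docs)
train_all, test_all = all_district_split(toy_docs)
```

Under the district-wise split a model is scored only on districts it has never seen, so variation in writing style or case mix across districts directly affects the result.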
12. Legal AI / NLP for Indian Context
HLDC: Hindi Legal Documents Corpus
13. Legal AI / NLP for Indian Context - Takeaways
Indian legal documents are a rich source of domain-specific Indic-language
corpora, readily available online.
Multiple tasks still need attention, especially for Indian settings:
Legal Summarization
Case recommendations
Citation predictions / network
Sleeping beauty
Bias
33. First and continued positive feedback increased
drug consumption by up to 2x
User belief scores: Likes 2.2/5, Comments 2.5/5
(little to moderate impact)
37. Takeaways
No dearth of problems to study…
You should be keen to look for them around you…
Will be happy to discuss anything further…
Happy to host students (PhD, MS, BTech) and faculty…