INVESTIGRAPH: Using Neo4j for Investigative Journalism
1. INVESTIGRAPH: Using Neo4j for
Investigative Journalism
Sarah Blaskey
Manuel Villa
Columbia Journalism School
GraphConnect
11 May 2017
London
2. The Two Uses of Neo4j in Journalism
1. As a presentation tool.
Ideally designed to present findings and to create a curation
tool for public use (Panama Papers)
2. As an internal aid to the investigative process.
6. It all started with one businessman
with holdings all over the world.
Reporting Question: Was it possible that this man was mainly doing
business with a small group of people?
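A minimal sketch of how this reporting question could be checked once company records are in structured form. The company and partner names, and the shape of the records, are hypothetical placeholders, not data from the investigation.

```python
from collections import Counter

# Hypothetical records: company -> people named in its filings.
company_officers = {
    "Alpha Holdings LLC": ["Subject", "Partner A", "Partner B"],
    "Beta Trading Corp": ["Subject", "Partner A"],
    "Gamma Estates LLC": ["Subject", "Partner A", "Partner C"],
    "Delta Imports Inc": ["Subject", "Partner B"],
}

def partner_counts(records, subject):
    """Count how many of the subject's companies each other person appears in."""
    counts = Counter()
    for people in records.values():
        if subject in people:
            counts.update(set(people) - {subject})
    return counts

counts = partner_counts(company_officers, "Subject")
for name, n in counts.most_common():
    print(f"{name}: appears in {n} of the subject's companies")
```

If a handful of names account for most of the counts, that suggests a small recurring group worth reporting on.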
11. The “discovery” phase - Looking
for reporting leads
How do we maximize the usefulness of our Neo4j instance?
15. Queries:
• Shortest-path queries, for when clicking through bubbles is too much.
• Using informal connections to maximize the chance that the
queries will return something.
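The two query ideas above can be sketched in plain Python. Cypher has a built-in shortestPath function for this; the version below is a stand-in breadth-first search, not the authors' queries, and the node names and the "formal"/"informal" edge labels are hypothetical.

```python
from collections import deque

# Hypothetical edges: (node, node, kind). "formal" = documented in filings,
# "informal" = reported ties such as family or social connections.
edges = [
    ("Subject", "Alpha Holdings LLC", "formal"),
    ("Alpha Holdings LLC", "Partner A", "formal"),
    ("Partner A", "Official X", "informal"),
    ("Subject", "Beta Trading Corp", "formal"),
]

def build_graph(edges, include_informal=True):
    """Build an undirected adjacency map, optionally skipping informal ties."""
    graph = {}
    for a, b, kind in edges:
        if kind == "informal" and not include_informal:
            continue
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

formal_only = build_graph(edges, include_informal=False)
with_informal = build_graph(edges, include_informal=True)
print(shortest_path(formal_only, "Subject", "Official X"))
print(shortest_path(with_informal, "Subject", "Official X"))
```

In this toy data the formal records alone give no path between the two nodes; adding the informal tie is what lets the query return something.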
18. Problems
● Name matching
● Queries on a timeline
● Incomplete or poorly curated data. Queries won’t work.
● Understand that data should not be modeled as a story!
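For the name-matching problem in the list above, one common mitigation is to normalize names before comparing them. The sketch below uses only the Python standard library; the threshold of 0.85 and the sample names are assumptions for illustration, and any flagged match would still need manual verification.

```python
import unicodedata
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, strip accents and punctuation, and sort the name tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(sorted(cleaned.lower().split()))

def similarity(a, b):
    """Similarity ratio between 0 and 1 on the normalized forms."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def likely_same(a, b, threshold=0.85):
    """Flag a pair as a candidate match; the threshold is an arbitrary assumption."""
    return similarity(a, b) >= threshold

print(likely_same("Pérez, José", "Jose Perez"))
print(likely_same("Jose Perez", "Maria Gomez"))
```

Sorting the tokens makes "Last, First" and "First Last" compare equal, at the cost of occasionally conflating different people who share the same name parts.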
19. Thank You
● Sarah Blaskey - slb2226@columbia.edu
● Manuel Villa - jmv2104@columbia.edu
Editor's Notes
The Panama Papers opened our eyes to the idea of using graph data for journalism to find stories within big data.
Since then, investigative journalists have found innovative ways to use graph data to map out investigations, and match it with publicly available information to find leads and invisible connections.
This session will explore what works, what hasn’t worked and where we hope this technology will take us.
The most basic step: clips, clips, clips.
Start building a timeline and the first, raw relationship database.
Tens of thousands of documents, analyzed via OCR and entity extraction
Innovative: Documents as connectors
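One way to read "documents as connectors" is that two entities are linked whenever the same document mentions both. The sketch below assumes that reading; the document ids and entity names are hypothetical stand-ins for the output of OCR and entity extraction.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical output of OCR plus entity extraction: document id -> entities.
doc_entities = {
    "doc-001": {"Subject", "Alpha Holdings LLC", "Partner A"},
    "doc-002": {"Partner A", "Official X"},
    "doc-003": {"Subject", "Beta Trading Corp"},
}

def connections_via_documents(doc_entities):
    """Map each alphabetically ordered entity pair to the documents naming both."""
    links = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for a, b in combinations(sorted(entities), 2):
            links[(a, b)].add(doc_id)
    return links

links = connections_via_documents(doc_entities)
for (a, b), docs in sorted(links.items()):
    print(f"{a} <-> {b} via {sorted(docs)}")
```

Keeping the document ids on each link means every connection in the graph can be traced back to the source material that supports it.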
This represents the structure of just one of the roughly 400 LLCs and corporations that the original document showed our subject had interests in.