Working With Data and Humans
Daniel X. O’Neil
@juggernautco
Me
• Daniel X. O’Neil
• Co-founder of EveryBlock
• 2007 Knight News Challenge
• Executive Director of Smart Chicago Collaborative
• 2012 Knight Community Information Challenge
The Data Revolution
• I know about some, but not all of it
• Since about 2005
• Working with the Mayor’s Office in Chicago
• ChicagoWorksForYou.com
• Then at EveryBlock, where I was responsible for data acquisition
The Data Revolution
• 8 Principles of Open Government Data
• Independent Government Observers Task Force
• POTUS Executive Orders on Inauguration Day
• Apps contests
• Municipal ordinances
• Socrata
• Code for America
There’s Data and There’s Humans
• Talk to me about your data and your humans in your projects
Data
• Dense
• Sits by itself
• Not social
• Not self-aware
• Unable to contextualize itself
• Does not have any problems, because it doesn’t care about anything
People
• Naturally social
• Soft
• Have problems
• See everything in context
• Prone to mistakes
People make data
Value from data
• Know more than anyone
• Surfacing from the hidden Web
• Context, context, context
• Even if it is just one data set mashed against another data set
• Did it rain * Did property crime go up or down
• Foreclosures * Retail stores
• Also: the simple act of aggregation + text
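One way to picture "one data set mashed against another data set", as a minimal sketch only: the daily rain flags and property crime counts below are hypothetical sample values, not figures from the talk, and the function names `mash` and `describe` are made up for illustration. The sketch joins the two data sets on date, aggregates, and then turns the result into a sentence (the "aggregation + text" step).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data for illustration; not from any real source.
rain_by_day = {
    "2012-06-01": True,
    "2012-06-02": False,
    "2012-06-03": True,
    "2012-06-04": False,
}
property_crimes_by_day = {
    "2012-06-01": 12,
    "2012-06-02": 20,
    "2012-06-03": 10,
    "2012-06-04": 18,
}


def mash(rain, crimes):
    """Join the two data sets on date and average crime counts by rain status."""
    counts = defaultdict(list)
    for day, rained in rain.items():
        if day in crimes:  # keep only days present in both data sets
            counts[rained].append(crimes[day])
    return {
        ("rain" if rained else "dry"): mean(values)
        for rained, values in counts.items()
    }


def describe(averages):
    """Turn the aggregated numbers into a plain-text sentence."""
    direction = "down" if averages["rain"] < averages["dry"] else "up"
    return (
        f"On rainy days property crime went {direction}: "
        f"{averages['rain']:g} reports per day versus "
        f"{averages['dry']:g} on dry days."
    )


averages = mash(rain_by_day, property_crimes_by_day)
print(describe(averages))
```

The same join-then-describe shape would apply to the foreclosures and retail stores pairing, with a shared geography instead of a shared date as the join key.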
Ten Databases
• Building permits
• Business licenses
• Historic preservation list
• Sanborn maps (1929 and 1950)
• County assessor
• County recorder of deeds
• Original photography
• Google search for news coverage
• New York Times archive
• Walgreens surplus property
We need a machine.
• A generic context engine
• To evenly distribute information
• And tell me what the information means
• I know: that sounds like a “reporter”
• But people used to think that “search engine” sounded a lot like “librarian”, too
• We need humans and machines
It’s easy.
• Find dataset
• Review dataset
• Describe what the data means
• Find another dataset
• Describe what the other dataset means
• Describe what the first dataset means in the context of the second dataset
• Repeat
• Let’s do this thing.
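The steps above can be sketched as a small loop. This is a rough illustration under stated assumptions, not the "generic context engine" itself: the two datasets, the address `123 Example St`, the record text, and the helper names (`describe`, `describe_in_context`, `context_report`) are all hypothetical.

```python
# Hypothetical records keyed by address; names and values are illustrative only.
DATASETS = {
    "building permits": {"123 Example St": "interior renovation permit issued"},
    "business licenses": {"123 Example St": "3-day pop-up retail license"},
}


def describe(name, address):
    """Describe what one dataset says about an address on its own."""
    record = DATASETS[name].get(address)
    return f"{name}: {record}" if record else f"{name}: no record"


def describe_in_context(first, second, address):
    """Describe the first dataset's record in the context of the second's."""
    a = DATASETS[first].get(address)
    b = DATASETS[second].get(address)
    if a and b:
        return f"{address}: {a} ({first}), alongside {b} ({second})"
    return f"{address}: nothing to compare yet"


def context_report(address):
    """Find a dataset, describe it, relate it to the previous one, repeat."""
    lines = []
    previous = None
    for name in DATASETS:  # find dataset
        lines.append(describe(name, address))  # describe what the data means
        if previous is not None:
            # describe the earlier dataset in the context of this one
            lines.append(describe_in_context(previous, name, address))
        previous = name  # repeat
    return lines


for line in context_report("123 Example St"):
    print(line)
```

The "review dataset" step stays with a human here; the code only covers the parts that a machine can repeat.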
Dedicated databases work
Call any time.
• @juggernautco
• (773) 960-6045

Editor's notes

  1. I’m Dan O’Neil, and I run the Smart Chicago Collaborative, an organization devoted to improving lives in Chicago through technology. Among other things, I work with Chicago city government, developers, and community groups to use civic data in new and useful ways. As a co-founder of EveryBlock, I’m also a previous Knight News Challenge grantee. I certainly wouldn’t be doing any of this today if it weren’t for the vision of the Knight Foundation.
  2. Let’s do a level-set
  3. Let’s do a level-set
  4. Explain EveryBlock
  5. Explain EveryBlock
  6. Let’s do a level-set
  7. Explain EveryBlock
  8. Let’s do a level-set
  9. Data has certain characteristics
  10. People have certain characteristics
  11. Something to keep in mind: this data is generated and maintained by humans.
  12. And if you use the default search for crime records, you get this screen. It has records going back to 2005. You fill out the form and you get your answers back. Pretty typical experience.
  13. What you wouldn’t be able to tell, unless you searched the Dallas Police Web site more deeply, is this. The Dallas Police publishes an amazing cache of crime data in flat files. All of it, with no search, no letters or emails, going back 12 years. Why anyone would make any FOIA request, or why the Dallas Police would want anyone to do that, is beyond me. And this data has some of the most amazing crime details (the police narrative) that you can find in crime data anywhere. This is hidden in plain sight.
  14. Lastly, I highly recommend the Data Journalism Handbook, which was created, in part, by many people in this room. It’s a really excellent resource.
  15. Data is often more structured than you think. Over the weekend I participated in the Knight-Mozilla-MIT "Story & Algorithm" Hack Day run by Dan Sinker. I met a couple of Boston developers and we executed on a project I’ve had for about 7 years. Like many of you here, I’m not smart enough to actually make things, so I have to rely on the kindness of developers. What we made was “Condition of Anonymity”, a Web site that automatically pulls the reason that anonymity was granted to an anonymous source by a reporter for the New York Times. We often think about data as the stuff inside spreadsheets and published in flat files to FTP servers, but there is a whole world of semi-structured data like this hidden in plain sight, inside plain text. We used the NYT Search API to review every article in the NYT back to January 1, 2000 for the phrase “condition of anonymity”, then used a natural language processing toolkit to find what I call the “because clauses”. There’s some gold in there. It takes an abundance of data types to tell a story. This story feels like a Walt Whitman poem to me.
  16. The analysis is where it’s at. The most amazing insight I can share is that data is boring. I’ve had a long time to consider why that is true, and I think I have the answer. The reason is that people are boring. We forget that data is made by people. And most people are boring most of the time. Every object should have a page on the Internet (so let’s get to work).
  17. Here’s kind of a master example. I live near this building. It had been empty for a very long time. Then construction started. The construction was heralded by a building permit. But, of course, the building permit was boring. So I looked further.
  18. I searched ten different databases and, lo and behold, more data made it less boring. Why? Because almost all people are interesting some of the time. So if you look hard enough, you’ll find those stories. I found a business license for a 3-day pop-up store. So this place has been empty for decades, but was open for three days. And I missed it. It used to be a bank, and I found out from the NYT archive (in PDF format, the hidden Web) that there was a bank run at this location in 1937. Again, not boring.
  19. This machine can be described as a generic context engine, to evenly distribute information and tell me what the information means. I know: that sounds like a “reporter”. But people used to think that “search engine” sounded a lot like “librarian”, too. We need humans and machines.
  20. Find dataset. Review dataset. Describe what the data means. Find another dataset. Describe what the other dataset means. Describe what the first dataset means in the context of the second dataset. Repeat. Let’s do this thing.
  21. Here’s an example of two things: finding data in unstructured text and finding interesting data. This is an Advanced Search in Google for the word “jimmied” in the Dallas crime data published by EveryBlock. So that site becomes a public, searchable instance of a previously hidden data set. Apparently police have used the word “jimmied” to describe an action taken by suspected criminals 2,430 times. All sorts of things are jimmied, apparently. It’s not boring.
  22. We’ve noticed that custom applications created with dedicated budgets are reliably updated. Getting more connectivity between these established projects and the newer, open data projects is key.
  23. Explain EveryBlock
  24. Hi.
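
Note 15 describes pulling the “because clauses” out of articles that contain the phrase “condition of anonymity”. A minimal sketch of that idea follows. It swaps in a plain regular expression for the natural language processing toolkit the note mentions, and it runs on hypothetical sample sentences rather than results from the NYT Search API, so it is an illustration of the technique and not the project’s actual code.

```python
import re

# Hypothetical sentences for illustration; not actual New York Times text.
SENTENCES = [
    "The official spoke on condition of anonymity because he was not "
    "authorized to discuss the matter.",
    "An aide, who spoke on condition of anonymity because the talks were "
    "private, confirmed the plan.",
    "The mayor announced the budget on Tuesday.",
]

# Capture the text after "condition of anonymity because",
# up to the next comma or period.
BECAUSE_CLAUSE = re.compile(
    r"condition of anonymity\s+because\s+([^,.]+)", re.IGNORECASE
)


def because_clauses(sentences):
    """Return the stated reason for anonymity from each matching sentence."""
    reasons = []
    for sentence in sentences:
        match = BECAUSE_CLAUSE.search(sentence)
        if match:
            reasons.append(match.group(1).strip())
    return reasons


for reason in because_clauses(SENTENCES):
    print(reason)
```

A regular expression like this would miss reasons phrased without “because” or containing internal commas, which is presumably where a fuller language toolkit earns its keep.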