SlideShare ist ein Scribd-Unternehmen logo
1 von 66
Data Craft
@chorn
chorn.com
github.com/chorn
linkedin.com/in/chorn
New York State
Politics
Education
Civics
NYS Education Department
VADIR:
Violent and Disruptive Incidents
Crafting Data
Crafting Data
Excel is an endpoint
Better tools are available
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Thumbs up
Data Convergence
Advanced Google
site:nysed.gov dataset
site:nysed.gov filetype:mdb
site:nysed.gov filetype:xls
related:nysed.gov
Crafting Data
Crafting Data
Crafting Data
Microsoft Access Databases:
•NYSED Directory
•School Report Card
Comma-Separated:
•F-33
•School Classifications
Responsibility
Crafting Data
Repeatable
Traceable
Simple
Data Integrity
MySQL
Postgresql
mdb-tools
bash
Ruby
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Filter
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Missing Data
% High School Completers
% High School Non-Completers
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Whoops
Crafting Data
Crafting Data
Crafting Data
More data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Crafting Data
Horrible
Crafting Data
Crafting Data
Ranking
Crafting Data
Crafting Data
Conclusion
Questions?

Weitere ähnliche Inhalte

Ähnlich wie Data Craft

Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science TJ Stalcup
 
Intro to Data Science
Intro to Data ScienceIntro to Data Science
Intro to Data ScienceTJ Stalcup
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataAndre Freitas
 
Linked Open Government Data: What’s Next?
Linked Open Government Data:  What’s Next?Linked Open Government Data:  What’s Next?
Linked Open Government Data: What’s Next?Li Ding
 
A Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured DataA Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured DataAndre Freitas
 
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Connotate
 
Noshir Contractor's view on the future of Linked Data
Noshir Contractor's view on the future of Linked DataNoshir Contractor's view on the future of Linked Data
Noshir Contractor's view on the future of Linked DataCarlos Pedrinaci
 
An Ontology for K-12 Education and the NIEM
An Ontology for K-12 Education and the NIEMAn Ontology for K-12 Education and the NIEM
An Ontology for K-12 Education and the NIEMOptum
 
Startds9.19.17sd
Startds9.19.17sdStartds9.19.17sd
Startds9.19.17sdThinkful
 
Semantic Search at Yahoo
Semantic Search at YahooSemantic Search at Yahoo
Semantic Search at YahooPeter Mika
 
Implementing the Verifiable Claims data model
Implementing the Verifiable Claims data modelImplementing the Verifiable Claims data model
Implementing the Verifiable Claims data modelDavid Wood
 
Making the Web Searchable - Keynote ICWE 2015
Making the Web Searchable - Keynote ICWE 2015Making the Web Searchable - Keynote ICWE 2015
Making the Web Searchable - Keynote ICWE 2015Peter Mika
 
Getstarteddssd12717sd
Getstarteddssd12717sdGetstarteddssd12717sd
Getstarteddssd12717sdThinkful
 
(Keynote) Peter Mika - “Making the Web Searchable”
(Keynote) Peter Mika - “Making the Web Searchable”(Keynote) Peter Mika - “Making the Web Searchable”
(Keynote) Peter Mika - “Making the Web Searchable”icwe2015
 
Bigdatacooltools
BigdatacooltoolsBigdatacooltools
Bigdatacooltoolssuresh sood
 
ADV Slides: Graph Databases on the Edge
ADV Slides: Graph Databases on the EdgeADV Slides: Graph Databases on the Edge
ADV Slides: Graph Databases on the EdgeDATAVERSITY
 
RDAP 15: You’re in good company: Unifying campus research data services
RDAP 15: You’re in good company: Unifying campus research data servicesRDAP 15: You’re in good company: Unifying campus research data services
RDAP 15: You’re in good company: Unifying campus research data servicesASIS&T
 
Data sci sd-11.6.17
Data sci sd-11.6.17Data sci sd-11.6.17
Data sci sd-11.6.17Thinkful
 

Ähnlich wie Data Craft (20)

Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science
 
Intro to Data Science
Intro to Data ScienceIntro to Data Science
Intro to Data Science
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big data
 
Linked Open Government Data: What’s Next?
Linked Open Government Data:  What’s Next?Linked Open Government Data:  What’s Next?
Linked Open Government Data: What’s Next?
 
A Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured DataA Compositional-distributional Semantic Model over Structured Data
A Compositional-distributional Semantic Model over Structured Data
 
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
 
Noshir Contractor's view on the future of Linked Data
Noshir Contractor's view on the future of Linked DataNoshir Contractor's view on the future of Linked Data
Noshir Contractor's view on the future of Linked Data
 
An Ontology for K-12 Education and the NIEM
An Ontology for K-12 Education and the NIEMAn Ontology for K-12 Education and the NIEM
An Ontology for K-12 Education and the NIEM
 
Repairing Hidden Links in Linked Data
Repairing Hidden Links in Linked DataRepairing Hidden Links in Linked Data
Repairing Hidden Links in Linked Data
 
Raven pack kevin
Raven pack kevinRaven pack kevin
Raven pack kevin
 
Startds9.19.17sd
Startds9.19.17sdStartds9.19.17sd
Startds9.19.17sd
 
Semantic Search at Yahoo
Semantic Search at YahooSemantic Search at Yahoo
Semantic Search at Yahoo
 
Implementing the Verifiable Claims data model
Implementing the Verifiable Claims data modelImplementing the Verifiable Claims data model
Implementing the Verifiable Claims data model
 
Making the Web Searchable - Keynote ICWE 2015
Making the Web Searchable - Keynote ICWE 2015Making the Web Searchable - Keynote ICWE 2015
Making the Web Searchable - Keynote ICWE 2015
 
Getstarteddssd12717sd
Getstarteddssd12717sdGetstarteddssd12717sd
Getstarteddssd12717sd
 
(Keynote) Peter Mika - “Making the Web Searchable”
(Keynote) Peter Mika - “Making the Web Searchable”(Keynote) Peter Mika - “Making the Web Searchable”
(Keynote) Peter Mika - “Making the Web Searchable”
 
Bigdatacooltools
BigdatacooltoolsBigdatacooltools
Bigdatacooltools
 
ADV Slides: Graph Databases on the Edge
ADV Slides: Graph Databases on the EdgeADV Slides: Graph Databases on the Edge
ADV Slides: Graph Databases on the Edge
 
RDAP 15: You’re in good company: Unifying campus research data services
RDAP 15: You’re in good company: Unifying campus research data servicesRDAP 15: You’re in good company: Unifying campus research data services
RDAP 15: You’re in good company: Unifying campus research data services
 
Data sci sd-11.6.17
Data sci sd-11.6.17Data sci sd-11.6.17
Data sci sd-11.6.17
 

Kürzlich hochgeladen

Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 

Kürzlich hochgeladen (20)

Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 

Data Craft

Hinweis der Redaktion

  1. In terms of storytelling, DDJ is usually about the middle parts of the story.
  2. My interest in HHROC
  3. Our last team project
  4. 3,045 schools. Lots of data, good data. No aggregates, just counts.
  5. Clean data, BEDS code, enrollment #’s, relevant to me
  6. What other data can I find that will add value, interest, weight, impact?
  7. 2011 data
  8. What if I find another dataset? What is the journalistic process for DDJ?
  9. F-33 is from 2010, and mostly didn’t match up because it’s by district.
  10. average of per student is bad statistics and unnecessarily watery
  11. I realized that I was avoiding proving that the different kinds of data was related, when an average tax paper is doesn’t care about the statistics, they’d hold with the assumption that dollars should relate to everything.
  12. Almost meaningless
  13. Low is good
  14. This would be better if I had a few degrees in math and statistics. Then again, anything I can create that also seems meaningful might also be worth reporting.