Garbage In
Rainbows Out
Zach Briggs
New Developer
7 Years in Data Analytics




Mike Fidler
Systems Security Specialist
Ex Geologist
His Unix experience is old enough to drink
Amateur Inventor
Validates
This form contains one error.
Screenshot of column view
Screenshot of serial view
Diagram: Raw Data, Application, Database
Diagram: Raw Data, Continuous Process, Application, Database
Coffee
Why your coffee is shit.
Anything but drip
Thank You

Zach - briggszj@gmail.com
@theotherzach
Title of Record



Mike - rockmastermike@gmail.com
@rockmastermike
Unix Neck Beard
Available for hire

Editor's Notes

  1. I wanted to call this talk “dirty inputs.”
  2.
  3. Gatekeeper: unexpected inputs fail and get pushed back to the user.
  4. Fault-tolerant systems. Model validations are the most obvious form of cleansing. They are the gatekeepers.
  5. How about bulk records? I’m using CSV uploads as my example, but it could be any source: consuming external JSON, sharing databases, anything outside of the black rectangle. What’s the downside to relying on validation when we get garbage? Best case is it fails, and we ask the user to fix their stuff and try again. Allow half-fails? “Fix lines X, Y, and Z?”
  6. Super basic example. Unravels a CSV file, turning a potentially wide table into a long one.
  7. Typical data grid, once again from any source.
  8. And now we have a stream of data, which allows for more graceful failures. Since the entire input is in the system, we can prompt the user to fix the errors or devise filters to do it automatically. Is it possible we would get better filters in the future, better methods of cleaning the data? I’m sure none of you have ever seen a database where the columns were shifted by one because of a boneheaded mistake that happened two months ago. Me neither.
  9. The schemaless store is just the landing area for the data, which is moved into our database in batches. The stream could be MongoDB, SQLite, or cave drawings with a webcam where your OCR software processes them into something usable. It doesn’t matter.
  10.
  11. What if it looked more like this? How many of you do fake deletes? Why? How is an update different from a delete? If we automate the input/filter process, why do it only once? Why throw out anything at all? How would that system be different? Here is as far as I am. Ish. That “All data” is a few hundred gigs in MySQL tables, and I have scripts that run when something updates. Add a ZIP and 56 minutes later it shows up in my Rails app.
  12. Nathan Marz had this idea first.
  13. How’s about this? Query is a function of all data. Capture is done in the rawest, most granular way possible, so speed wouldn’t be a consideration. Events rather than “stuff,” so it can be rewound to the beginning of time.
  14. What is coffee? It’s filthy-ass water, that’s what it is. Coffeeologists (board-certified ones) measure the quality of coffee using the same dimensions as clean drinking water: pH, dissolved solids, rat feces. The usual.
  15. Pre-ground grocery store beans have been sitting there for months and have lost their volatile flavor molecules. The drip machine sprays unfiltered water that is too hot into the center of the filter, over-extracting some grounds and leaving others under-extracted. The coffee hits the bottom of the hot glass carafe and is instantly burned. What about the coffee nerds here?
  16. Pour-over fixes the water temperature and the center over-extraction. A press pot goes further and allows for fine-tuning the extraction.
  17. The issue is variables out of your control: bean age and water quality. Press pots can come close, but you’re brewing blind.
  18.
  19.
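
Notes 5 through 9 describe unravelling a CSV grid into a long stream, landing all of it, and cleaning it afterwards. The following is a minimal Python sketch of that idea, not the speakers' code: the function names (`unravel`, `strip_whitespace`, `find_errors`), the record fields, and the sample data are all hypothetical, and a plain list stands in for the schemaless landing area.

```python
import csv
import io


def unravel(csv_text):
    """Turn a wide CSV grid into a long stream of one-cell records."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    # Line 1 is the header. This assumes one record per physical line
    # (no quoted fields that contain newlines).
    for line_number, row in enumerate(rows, start=2):
        for column, value in zip(header, row):
            yield {"line": line_number, "column": column, "value": value}


def strip_whitespace(record):
    """Example filter: returns a cleaned copy and leaves the raw record alone."""
    return {**record, "value": record["value"].strip()}


def find_errors(records, is_valid):
    """Source lines with a failing cell, for a 'fix lines X, Y, and Z' prompt."""
    return sorted({r["line"] for r in records if not is_valid(r)})


# Hypothetical upload: one padded ZIP and one that is plain garbage.
sample = "name,zip\nAda, 60601 \nBob,not-a-zip\nCy,94110\n"

landing = list(unravel(sample))  # everything lands, garbage included
cleaned = [strip_whitespace(r) for r in landing]
bad_lines = find_errors(
    cleaned, lambda r: r["column"] != "zip" or r["value"].isdigit()
)
print(bad_lines)
```

Because `landing` keeps the raw cells, a better filter written later can be run over the same input again without asking anyone to re-upload.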
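
One way to read notes 11 and 13 ("query is a function of all data", "events rather than stuff") is an append-only log that is folded into a view on demand. The Python sketch below is an illustration under that assumption, not a description of the MySQL scripts mentioned in the notes: the `Event` type, the `current_view` function, and the sample ZIP data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int          # position in the log; stands in for a timestamp
    kind: str         # "set" or "delete"
    key: str
    value: str = ""


def current_view(events, up_to=None):
    """Fold the whole log into a key/value view, optionally stopping early."""
    view = {}
    for event in sorted(events, key=lambda e: e.seq):
        if up_to is not None and event.seq > up_to:
            break
        if event.kind == "set":
            view[event.key] = event.value
        elif event.kind == "delete":
            view.pop(event.key, None)
    return view


log = [
    Event(1, "set", "60601", "Chicago"),
    Event(2, "set", "94110", "San Francisco"),
    Event(3, "set", "60601", "Chicago, IL"),  # an update is just a later event
    Event(4, "delete", "94110"),              # so is a delete; nothing is erased
]

print(current_view(log))           # the view as of the latest event
print(current_view(log, up_to=2))  # the same log, rewound to event 2
```

In this sketch an update and a delete differ only in how the fold treats them, and since the log itself is never modified, the view can be recomputed from the beginning whenever the fold logic improves.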