Garbage In
Rainbows Out
Zach Briggs
New Developer
7 Years in Data Analytics




Mike Fidler
Systems Security Specialist
Ex Geologist
His Unix experience is old enough to drink
Amateur Inventor
Validates
This form contains one error.
Screenshot of column view
Screenshot of serial view
Diagram: Application, Raw Data, Database
Diagram: Raw Data, Continuous Process, Database, Application
Coffee
Why your coffee is shit.
Anything but drip
Thank You

Zach - briggszj@gmail.com
@theotherzach
Title of Record



Mike - rockmastermike@gmail.com
@rockmastermike
Unix Neck Beard
Available for hire


Editor's Notes

  1. I wanted to call this talk “dirty inputs.”
  2. (No notes.)
  3. Gatekeeper: unexpected inputs fail and are pushed back to the user.
  4. Fault-tolerant systems. Model validations are the most obvious form of cleansing. They are the gatekeepers.
  5. How about bulk records? I’m using CSV uploads as my example, but it could be any source: consuming external JSON, sharing databases, anything outside of the black rectangle. What’s the downside to relying on validation when we get garbage? Best case, it fails, and we ask the user to fix their stuff and try again. Do we allow half-fails? “Fix lines X, Y, and Z?”
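A minimal sketch of the “half-fail” idea in note 5, assuming a CSV upload with hypothetical `name` and `zip` columns and a made-up rule that a ZIP is five digits. None of these names come from the talk. Good rows are accepted, and bad rows are reported by line number instead of rejecting the whole file.

```python
import csv
import io

def validate_rows(csv_text):
    """Split an upload into accepted rows and per-line error messages."""
    accepted, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # start=2 because line 1 of the file is the header row
    # (assumes no quoted fields with embedded newlines)
    for line_no, row in enumerate(reader, start=2):
        zip_code = (row.get("zip") or "").strip()
        if len(zip_code) == 5 and zip_code.isdigit():
            accepted.append(row)
        else:
            errors.append((line_no, f"bad zip {zip_code!r}"))
    return accepted, errors

upload = "name,zip\nAda,46202\nBob,4620\nCy,abcde\n"
accepted, errors = validate_rows(upload)
print(len(accepted), "rows accepted")
for line_no, message in errors:
    print(f"Fix line {line_no}: {message}")
```

The catch is that the rejected lines are gone unless the user resubmits them, which is the downside the later notes try to get around.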
  6. Super basic example. It unravels a CSV file, turning a potentially wide table into a long one.
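The slide's own code did not survive, so here is one possible sketch of the unraveling in note 6. The record shape `(source, row, column, value)` and the `unravel` name are assumptions for illustration, not the talk's actual schema.

```python
import csv
import io

def unravel(csv_text, source="upload-1"):
    """Yield one (source, row, column, value) record per cell."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:  # empty file: nothing to unravel
        return
    for row_no, row in enumerate(reader, start=1):
        for col_no, value in enumerate(row):
            # Cells beyond the header still get recorded instead of dropped
            column = header[col_no] if col_no < len(header) else f"extra_{col_no}"
            yield (source, row_no, column, value)

wide = "name,zip,plan\nAda,46202,gold\nBob,4620\n"
long_records = list(unravel(wide))
for record in long_records:
    print(record)
```

A short row like Bob's simply produces fewer records, so a ragged file does not have to fail at load time.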
  7. Typical data grid, once again from any source.
  8. And now we have a stream of data, which allows for more graceful failures. Since the entire input is in the system, we can prompt the user to fix the errors or devise filters to do it automatically.

     Is it possible we would get better filters in the future? Better methods of cleaning the data? I’m sure none of you have ever seen a database where the columns were shifted by one because of a boneheaded mistake that happened two months ago. Me neither.
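A rough sketch of what “filters” over the stream in note 8 could look like, assuming the long `(source, row, column, value)` records used for illustration here. The filter functions, including the leading-zero repair, are hypothetical examples. Because the raw records are kept untouched, a better filter written later can be rerun over the same input.

```python
def strip_whitespace(value):
    return value.strip()

def pad_zip(value):
    # Hypothetical repair: restore a leading zero lost by a spreadsheet
    return value.zfill(5) if value.isdigit() and len(value) == 4 else value

def clean(records, filters_by_column):
    """Return cleaned copies; the raw records are never modified."""
    cleaned = []
    for source, row_no, column, value in records:
        for apply_filter in filters_by_column.get(column, []):
            value = apply_filter(value)
        cleaned.append((source, row_no, column, value))
    return cleaned

raw = [("upload-1", 1, "zip", " 46202 "), ("upload-1", 2, "zip", "6032")]

# Today's filters, then a better set of filters applied to the same raw data
first_pass = clean(raw, {"zip": [strip_whitespace]})
second_pass = clean(raw, {"zip": [strip_whitespace, pad_zip]})
```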
  9. The schemaless store is just the landing area for data to be moved into our database in batches. The stream could be MongoDB, SQLite, or cave drawings with a webcam where your OCR software processes it into something usable.

     It doesn’t matter.
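One way to picture the landing area in note 9, sketched with Python's built-in `sqlite3` module purely for convenience. The table names (`landing`, `customers`) and columns are assumptions. Everything lands as text first, and a batch job pivots only the complete rows into the typed table.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE landing (source TEXT, row_no INTEGER, col TEXT, value TEXT)")
db.execute("CREATE TABLE customers (name TEXT NOT NULL, zip TEXT NOT NULL)")

db.executemany("INSERT INTO landing VALUES (?, ?, ?, ?)", [
    ("upload-1", 1, "name", "Ada"), ("upload-1", 1, "zip", "46202"),
    ("upload-1", 2, "name", "Bob"),  # no zip: stays in landing only
])

def load_batch(db):
    """Rebuild customers from every complete row in the landing table."""
    rows = db.execute("""
        SELECT MAX(CASE WHEN col = 'name' THEN value END),
               MAX(CASE WHEN col = 'zip' THEN value END)
        FROM landing GROUP BY source, row_no ORDER BY source, row_no
    """).fetchall()
    complete = [(n, z) for n, z in rows if n is not None and z is not None]
    db.execute("DELETE FROM customers")  # rebuild wholesale so reruns are safe
    db.executemany("INSERT INTO customers VALUES (?, ?)", complete)
    return len(complete)

loaded = load_batch(db)
print(loaded, "customer rows loaded")
```

Bob's incomplete row is not lost. It waits in the landing table until someone fixes it or a smarter batch job comes along.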
  10. (No notes.)
  11. What if it looked more like this? How many of you do fake deletes? Why? How is an update different from a delete? If we automate the input/filter process, why do it only once? Why throw out anything at all? How would that system be different? Here is as far as I am. Ish. That “All data” is a few hundred gigs in MySQL tables, and I have scripts that run when something updates. Add a ZIP and 56 minutes later it shows up in my Rails app.
  12. Nathan Marz had this idea first.
  13. How about this? Query is a function of all data. Capture is done in the rawest, most granular way possible, so speed wouldn’t be a consideration. Events rather than “stuff,” so it can be rewound to the beginning of time.
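A toy illustration of “query is a function of all data” from note 13, assuming a hypothetical append-only log with `set` and `remove` events. The `Event` class and `current_state` function are invented for this sketch. Updates and deletes are both just more events, so nothing is thrown away and the view can be recomputed as of any point in time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    timestamp: int
    kind: str        # "set" or "remove"; hypothetical event kinds
    key: str
    value: str = ""

def current_state(events, as_of=None):
    """Fold over every event up to `as_of` to derive a view."""
    state = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        if as_of is not None and event.timestamp > as_of:
            break
        if event.kind == "set":
            state[event.key] = event.value
        elif event.kind == "remove":
            state.pop(event.key, None)
    return state

log = [
    Event(1, "set", "ada", "46202"),
    Event(2, "set", "bob", "46220"),
    Event(3, "set", "ada", "60601"),   # an "update" is just another event
    Event(4, "remove", "bob"),         # so is a "delete"
]

print(current_state(log))           # the view as of now
print(current_state(log, as_of=2))  # rewound to an earlier moment
```

Recomputing from the full log on every query is slow, which is why the capture side stays dumb and the expensive work moves to batch jobs.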
  14. What is coffee? It’s filthy ass water, that’s what it is. Coffeeologists (board-certified ones) measure the quality of coffee using the same dimensions as clean drinking water: pH, dissolved solids, rat feces. The usual.
  15. Pre-ground grocery store beans have been sitting there for months and have lost their volatile flavor molecules. The drip machine sprays unfiltered water that is too hot into the center of the filter, over-extracting some grounds and leaving others under-extracted. The coffee hits the bottom of the hot glass carafe and is instantly burned. What about the coffee nerds here?
  16. Pour-over fixes the water temp and center over-extraction. Press pot goes further and allows for extraction fine-tuning.
  17. The issue is variables out of your control: bean age and water quality. Press pots can come close, but you’re brewing blind.
  18. (No notes.)
  19. (No notes.)