Slides from Strata 2017 talk, "Data Engineering Efficiency @ Netflix."
Michelle Ufford explains how Netflix’s data engineering and analytics team is using data to find common patterns among the chaos that enable the company to automate repetitive and time-consuming tasks and discover ways to improve data quality, reduce costs, and quickly identify and respond to issues. Michelle provides a quick overview of Netflix’s analytics environment before diving into some of the major challenges facing the company’s data engineers. Along the way, Michelle shares how Netflix is building more intelligent data platform services and tools to improve data quality, automate data maintenance, alert on job optimization opportunities, and more.
Data Engineering Efficiency @ Netflix - Strata 2017
1. Working smarter,
not harder
DATA ENGINEERING
EFFICIENCY @
MICHELLE UFFORD
MANAGER, CORE INNOVATION
DATA ENGINEERING & ANALYTICS
STRATA NYC, FALL 2017
4. Michelle’s Wildly Subjective & Completely Unscientific Observations of Engineering Efforts over Time
[Chart, circa 2007. x-axis: year 1, year 2, year 3; y-axis: engineer time; series: new development, support & maintenance, everything else]
6. Michelle’s Wildly Subjective & Completely Unscientific Observations of Engineering Efforts over Time
[Chart, circa 2007. x-axis: year 1, year 2, year 3; y-axis: engineer time; series: new development, support & maintenance, everything else; annotations: ~10%, ~75%, "when Michelle jumps ship"]
8. Support & Maintenance.
● troubleshooting failed jobs
● investigating data quality issues
● migrating to newer releases
● optimizing job performance
● archiving old data or unused tables
● fixing & reflowing bad data
● documenting lineage & relationships
● etc. etc. etc.
23. Simplify & Automate:
Data Quality.
● identify appropriate level of quality coverage for a given table based upon usage data
● provide initial configuration of quality thresholds based upon table behavior patterns
● simplify integration of quality checks into data pipelines
● etc.
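The second bullet above, seeding quality thresholds from a table's observed behavior, can be illustrated with a small sketch. This is not Netflix's implementation: the function names, the choice of daily row counts as the metric, and the mean plus or minus k standard deviations rule are all assumptions made for illustration.

```python
from statistics import mean, stdev


def initial_thresholds(daily_row_counts, k=3.0):
    """Suggest (lower, upper) row-count bounds from past behavior.

    Hypothetical helper: uses mean +/- k * sample standard deviation,
    which is only one of many possible rules.
    """
    if len(daily_row_counts) < 2:
        raise ValueError("need at least two observations")
    mu = mean(daily_row_counts)
    sigma = stdev(daily_row_counts)
    lower = max(0.0, mu - k * sigma)  # a row count cannot be negative
    upper = mu + k * sigma
    return lower, upper


def check_row_count(count, thresholds):
    """Return True when the count falls inside the suggested bounds."""
    lower, upper = thresholds
    return lower <= count <= upper


# Example with made-up history for a single table.
history = [100, 102, 98, 101, 99]
bounds = initial_thresholds(history)
print(bounds, check_row_count(100, bounds), check_row_count(50, bounds))
```

Bounds produced this way are only a starting configuration; a table owner would still be expected to tighten or loosen them.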
33. Simplify & Automate:
Data Insight.
● provide easy visibility into current state & changes over time
● provide prescriptive guidance on impactful optimization opportunities
● notify users of unexpected conditions which may indicate problems
● etc.
38. Data Engineering @ Netflix.
Support & maintenance: 35%
New development & functionality: 45%
40. A sneak peek at what we’re working on now
Part four.
41. Michelle’s Wildly Subjective & Currently Unproven Theory of the Impact of ‘Smarter’ Solutions
[Chart, circa 2017. x-axis: year 1, year 2, year 3; y-axis: engineer time; series: new development, support & maintenance, everything else; annotations: ~20% ???, ~60% ???]
42. Faster & Smarter:
Data Maintenance.
● multi-node object deprecation
● field-level deprecation
● beyond pattern matching
● etc.
43. Faster & Smarter:
Data Quality.
● additional Metacat statistics
● robust anomaly detection
● aggressively experiment with configurations
● etc.
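The slides do not say which method "robust anomaly detection" refers to. As one common possibility, the sketch below scores values by their distance from the median, scaled by the median absolute deviation (MAD), which is less sensitive to outliers than a mean and standard deviation. The function name, the 3.5 cutoff, and the sample data are assumptions, not details from the talk.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return indexes of values whose modified z-score exceeds the threshold.

    Hypothetical helper. The modified z-score is
    0.6745 * |x - median| / MAD, where 0.6745 rescales the MAD so it is
    comparable to a standard deviation for normally distributed data.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half the values equal the median, so the score is
        # undefined; fall back to flagging anything that differs from it.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Example with made-up daily row counts; the last value is a spike.
counts = [100, 101, 99, 102, 98, 500]
print(mad_anomalies(counts))
```

Because the median and MAD barely move when a single extreme value appears, the spike does not mask itself the way it can with a mean-based rule.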
https://conferences.oreilly.com/strata/strata-ny/user/proposal/status/59963
Working smarter, not harder: Driving data engineering efficiency at Netflix
Complex data structures. Incomplete data. Upstream failures. Cryptic error messages. Rapidly evolving technology. No question, data engineering can be hard. But what if we used the wealth of data and experience at our disposal to make data engineering a little easier?