Diese Präsentation wurde erfolgreich gemeldet.
Die SlideShare-Präsentation wird heruntergeladen. ×

Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty

Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige
Anzeige

Hier ansehen

1 von 33 Anzeige

Weitere Verwandte Inhalte

Diashows für Sie (20)

Ähnlich wie Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty (20)

Anzeige

Aktuellste (20)

Anzeige

Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty

  1. 1. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Suki Dhuphar April 8th, 2020
  2. 2. Today’s Speaker Suki Dhuphar EMEA Lead, Tamr
  3. 3. Agenda ● Introduction ● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before) ● How to adapt your Data Engineering processes to rapid change ● Strategies to keep remote teams aligned and virtually connected ● The importance of using data to drive business decisions ● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
  4. 4. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Data is more important now than ever before!! 4
  5. 5. Agenda ● Introduction ● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before) ● How to adapt your Data Engineering processes to rapid change ● Strategies to keep remote teams aligned and virtually connected ● The importance of using data to drive business decisions ● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
  6. 6. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Large Orgs Can Be Tempted to Put AI/ML Cart Before Data Horse 6
  7. 7. What is DataOps? = Modern Data Engineering Practice DataOps is an automated, process oriented methodology, used by analytic and data teams to improve the quality and reduce the cycle time of data analytics. 7
  8. 8. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Data Science vs. Data Engineering There is a significant overlap between data engineers & data scientists when it comes to skills and responsibilities. The main difference is one of focus. Data Engineers are focused on building infrastructure & architecture for data generation, preparation and publishing. In contrast, data scientists are focused on advanced mathematics and statistical analysis on published data. Some traditional enterprise “data management’ professionals will become data engineers. Link here to detailed infographic from DataCamp Reporting and Visualization Statistical Modeling & Machine Learning Data Movement Data Cleaning, Unification, Alignment Database Performance Optimization Software Engineer Data Scientist Data Engineer Data Engineer Data Scientist 8
  9. 9. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Reality of modern enterprise data scientists. They are constantly and idiosyncratically “fixing” the core data Data Scientist Survey by Figure Eight 9
  10. 10. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Empowering data consumers is core to success of industry disruptors “One of the primary goals of the platform is to enable other teams to focus on business logic, making experimentation, implementation, operation of stream processing jobs easy. By having a platform to abstract the “hard stuff”, removing complexities away from users, this would unleash broader team agility and product innovations.” 10
  11. 11. Agenda ● Introduction ● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before) ● How to adapt your Data Engineering processes to rapid change ● Strategies to keep remote teams aligned and virtually connected ● The importance of using data to drive business decisions ● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
  12. 12. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Traditional “Methods” for data engineering All necessary, none alone is sufficient to solve broader problems either approach alone is sufficient ● Standardization -- one schema to rule them all ● Aggregation -- tends to create more/bigger siloes ● Federation -- always creates significant query performance challenges ● Master Data Management -- “deviations” difficult to handle and too deterministic ● Rationalize Systems -- single vendor = radical compromise ● Throw Bodies at it -- expensive, time-consuming, ineffective & inconsistent 12
  13. 13. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Human/behavioral challenges often primary bottleneck ● Afraid to share data ○ Due to data quality (worry about being judged or having to take on the responsibility of fixing the data consumers’ requests) ● Hoarding data ○ A method of organizational control or job preservation ● Obscuring data complexity ○ Failure to embrace the complexity, diversity, and idiosyncrasy of data generated in a large enterprise ● Limiting access to a small number of users ○ A method of control or as a reflection of insecurity of data quality 13
  14. 14. Traditional companies have significant “legacy drag coefficient” Manage data from their business systems more as “exhaust” than “asset” > “significant data debt” Result: “Random Data Salad” Data debt from constant change/entropy Restructuring Leadership Changes Politics Dynamic Schema DBs - Mongo et al “Data Hoarding” Legacy Burden M&A Problem: Thousands of systems generating data every day that were built over decades to support business processes - idiosyncratic to that time/context. Data is idiosyncratic to each system - creates fundamental “data disconnect” and “data decay” Consequences: 1. Too much time spent on data prep vs. analysis / action. 2. High failure rate of BI / analytics projects 3. Game changing initiatives deemed ‘impossible’ and never start Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 14
  15. 15. Modern, Open Data Engineering Ecosystem (aka DataOps) Sources (tabular) Internal databases Internal apps / systems External endpoints Internal files External files Feedback & Usage Mastering & Quality Movement & Automation Storage & Compute Governance, Privacy & Policy Catalog & Crawling Consumption endpoints Analytics Source Remediation Source / Cloud Migration Custom Apps Consumers Citizens Analysts Data Scientists Developers Publishing & Versioning Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 15
  16. 16. Sources People, Process, Tools Internal Tabular Data External Tabular Data ● Cloud First ● Continuous (assume data will change) ● Agile (deploy quickly and iteratively) ● Highly Automated - automate whenever possible ● Open/Best of Breed (not one platform/vendor) ● Bi-Directional (Feedback) ● Collaborative (Humans at the Core) ● Service Oriented (clear endpoints for data) ● Loosely Coupled (Restful Interfaces Table(s) In/Out) ● Both aggregated AND federated storage ● Both batch AND Streaming ● Lineage/Provenance is essential ● Scale Out/Distributed Modern Enterprise Data Engineering Principles ≈ “DataOps” Consumers Citizens Analysts Data Scientists Developers Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 16
  17. 17. Reality of Data Ecosystem/Landscape : EXTREME NOISE Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 17
  18. 18. Agenda ● Introduction ● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before) ● How to adapt your Data Engineering processes to rapid change ● Strategies to keep remote teams aligned and virtually connected ● The importance of using data to drive business decisions ● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
  19. 19. How many employees work from home? Regular work-at-home has grown 173% since 2005, 11% faster than the rest of the workforce.[Global Workplace Analytics’ analysis of 2018 ACS data] How many people could work- from-home? ● 56% of employees have a job where at least some of what they do could be done remotely [Global Workplace Analytics analysis of BLS data, 2017] ● 62% of employees say they could work remotely [Citrix 2019 poll] ● Studies repeatedly show desks are vacant 50- 60% of the time. Adapting to change - Telecommuting/WFH/Remote working is not a new concept The chart shows the percentage of people who work-at- home by industry. [Global Workplace Analytics’ special analysis of 2016 ACS data] Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 19
  20. 20. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Data Engineering Data Suppliers CIO Source Owner DBA IT Professional CDO Data Engineer Curator Steward Business owners and Other CxOs Data Consumers Data Scientist Data Analyst Data Citizen Developer ELT Professional Key People/Personas in the Modern Data Ecosystem 20
  21. 21. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Key Roles in next-gen Data Ecosystem Role Goals Tools Citizen Use data to make business decisions Viz, CRM, Excel, PowerPoint, Word, Web Search Analyst Deliver insights to the business, typically through dashboards and reports Viz, Excel, SSDP, Web Search Scientist Deliver insights to the business, typically through models and algorithms R, Python, SAS, SSDP Developer Build applications which leverage corporate data Python, Java, JS, SQL, REST Engineer Deliver and manage data pipelines ETL, SQL Curator Ensure consumers have the data they need, in the form they need it Data mastering tools Steward Uses feedback from consumers to improve data broadly Data Feedback Tools Source Owner Define and manage purpose, processes (data creation, consumption) & users (i.e., access) of the data source EDW, SQL, ERWin, LDAP, SAP ConsumersPreparersSuppliers 21
  22. 22. Agenda ● Introduction ● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before) ● How to adapt your Data Engineering processes to rapid change ● Strategies to keep remote teams aligned and virtually connected ● The importance of using data to drive business decisions ● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
  23. 23. Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty Useful comparison between CDO and CFO Tablestakes CHIEF FINANCIAL OFFICER ● What money do we have? ● Where did it come from? ● Where is it going and why? Long term goal: Return on Assets CHIEF DATA OFFICER ● What data do we have? ● Where does it come from? ● Who consumes it and why? Long term goal: Return on Data 23
  24. 24. How Do I Start? You may have already started... ● Leverage existing mastered data as ground truth ● Keep the best parts of your MDM, just enhance the Mastering capability ...But if you haven’t… ● Find a data-rich, analytically valuable problem for which fragmented data and knowledge present a challenge ...Either way, it’s essential to keep an agile... ● ...Mindset - focus on quick wins that have been beyond reach, then build ● ...Skillset - engage the data experts at their current skill level, let machines do the rest ● ...Toolset - simple, collaborative data curation, optimally in the tools they already use Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 24
  25. 25. 25
  26. 26. 212 Sources (tables) - mostly SAP ● Enterprises have hundreds of source systems ● Sources must be combined, consolidated, and classified ● These lists are building blocks for transformational analytics What are Transformational Analytic Outcomes? Question: How many customers do we have? Before After Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 26
  27. 27. Before After What are Transformational Analytic Outcomes? Question: What is our customer distribution by sales totals? ● Analytics begin with sell more and/or spend less ● Transformational analytics aren’t new, they are broader ● Business wants speed and up-to- date information ● Data variety skews answers, creating misinformation instead of clarity Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 27
  28. 28. Agenda ● Introduction ● What is Modern Enterprise Data Engineering? (Why it’s more important than ever before) ● How to adapt your Data Engineering processes to rapid change ● Strategies to keep remote teams aligned and virtually connected ● The importance of using data to drive business decisions ● Why organizations need to modernize data management quickly and effectively (How to accelerate cloud migration)
  29. 29. Why now? 7 years ago, we needed data scientists! But now that we have them - where do they get their data? Data Scientist: The Sexiest Job of the 21st Century Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 29
  30. 30. Today: we have data scientists! (and want to do cool AI stuff) Data Scientist Jobs Indeed.com, % of all postings Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 30
  31. 31. Unique Moment in Time: Enterprise Data as an Asset Vestibulum congueLatent Opportunity of Enterprise “Data As An Asset” Enterprise Migration to the Cloud Generational Change in Enterprise Data Management Fear of disruption by “Data Natives” Low Hanging Analytical Fruit “Competing on Analytics”/Strategic Imperative Popularity of AI but really of Core Data Quality Inability of traditional tech to scale Lack of innovation from old vendors Maturing Big Data Tech (HDFS/Lakes) Democratization of analytics Rise of the CDO Decades of treading data as operational exhaust Deeply Fragmented/Siloed Data Environments Inability to leverage new sources - esp external “AI Cart” before the “Data Horse” Significant “lift & shift” opportunity Potential for behavioral changes New infra good/secure enough Now that we’ve established Data Science as critical component of enterprise: It’s time for each enterprise in the Global 2000 to build their data engineering muscles to enable them to “compete on analytics” over the coming decades. 31
  32. 32. What NOT to do ● Avoid boil the ocean/”waterfall” (projects measured in years/quarters) ○ Build rational long term infra while delivering real analytic value along the way ● Single “Platform”: Don’t overestimate what single piece of software can do ○ Focus on thoughtfully designed ecosystem of loosely coupled best of breed tools ● Single Vendor: Don’t overestimate what single vendor can do ○ Align vendors with APIs and expectations that they MUST work together ● Don’t Underestimate effort required to make FOSS work ○ Just because Google does it doesn’t mean you can do it ● Don’t underestimate human/behavioral challenges with data ○ Most often the reason that projects fail/stall are human/behavioral ● Avoid “Data Engineering/Science Hubris” ○ I Data - therefore I am Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty 32
  33. 33. Thank You Questions Email: suki.dhuphar@tamr.com For Joining this Webinar You’ll receive exclusive access to Tamr’s new guide 33

×