How to Streamline DataOps on AWS

The quest for the insight-driven enterprise has spurred a mass exodus to the cloud. But cloud data ecosystems can be very complex, with multiple data storage and processing options.

These slides, based on the webinar featuring leading IT analyst firm EMA, Amazon Web Services (AWS), and Trifacta, will help you: understand technology trends that simplify your analytics modernization journey; learn best practices to operationalize data management on AWS; establish operational excellence by leveraging AWS data storage and processing; and accelerate time-to-value for analytics projects with data preparation on AWS.

Published in: Technology

  1. IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING. How to Streamline DataOps on AWS: Modernizing Data Management in the Cloud. John Santaferraro, Research Director, EMA; Will Davis, Sr. Director of Product Marketing, Trifacta; Nikki Rouda, Principal Product Marketing Manager, Amazon Web Services
  2. Watch the On-Demand Webinar • EMA How to Streamline DataOps on AWS On-Demand webinar is available here: http://info.enterprisemanagement.com/how-to-streamline-dataops-on-aws-webinar-ws • Check out upcoming webinars from EMA here: http://www.enterprisemanagement.com/freeResearch
  3. Featured Speakers • John Santaferraro, Research Director, EMA: John is the research director for analytics, business intelligence, and data management at EMA. He has 23 years of experience in data and analytics, from startups to executive positions at Fortune 50 companies. His deep understanding of the industry comes from years of leadership in implementation, product and marketing organizations, along with multiple big data imagineering efforts for finance, communications, retail, manufacturing, healthcare, events, oil and gas, and utilities. John's coverage area also includes data integration, data discovery, metadata management, artificial intelligence, machine learning, data science, digital marketing, and innovation. • Will Davis, Sr. Director of Product Marketing, Trifacta: Will drives go-to-market and product marketing efforts at Trifacta, having spent the past ten years managing the marketing initiatives for several high-growth data companies. Prior to Trifacta, Will worked with a variety of companies focused on data infrastructure, analytics and visualization, including GoodData, Greenplum and ClearStory Data. Will leads Trifacta’s marketing strategy to rapidly expand business growth and brand awareness. • Nikki Rouda, Principal Product Marketing Manager, Amazon Web Services: Nikki is the principal product marketing manager for data lakes and big data at AWS. Nikki has spent 20+ years helping enterprises in 40+ countries develop and implement solutions to their analytics and IT infrastructure challenges. Nikki holds an MBA from the University of Cambridge and an ScB in geophysics and math from Brown University.
  4. Logistics for Today’s Webinar. QUESTIONS: • Log questions in the chat panel located on the lower left-hand corner of your screen • Questions will be addressed during the Q&A session of the event. EVENT RECORDING: An archived version of the event recording will be available at www.enterprisemanagement.com. PDF SLIDES: A PDF of the speaker slides will be distributed to all attendees.
  5. Agenda • Top Cloud Challenges and Opportunities • Modernization and DataOps in the Cloud • Data Lakes and Analytics on AWS • Trifacta DataOps on AWS • Question and Answer © 2018 Enterprise Management Associates, Inc.
  6. Top Cloud Challenges and Opportunities
  7. © 2019 Enterprise Management Associates, Inc.
  8. MORE PLATFORMS: Almost 8 of 10 participants indicated they have between 3 and 7 different platforms in their big data environment.
  9. FASTER SPEEDS: Almost 3 of 4 participants indicated they were adopting real-time processing strategies.
  10. FASTER SPEEDS: Streaming platforms take the #1 spot for platforms implemented in 2018.
  11. MORE COMPLEXITY: Almost 3 of 4 respondents indicated they were adopting complex workloads like data science and machine learning.
  13. MORE CLOUD: More than 3 of every 4 big data projects are using some form of cloud implementation.
  14. Modernization and DataOps in the Cloud
  15. EMA Hybrid Data Ecosystems. Diagram nodes: H/S, AP, DW, DM, NS, SP, RS, OS, DP. Legend: AP - Analytic Platforms; DP - Discovery Platforms; H/S - Hadoop/Spark; DW - Data Warehouse; DM - Data Marts; NS - NoSQL; OS - Operational Systems; SP - Streaming Platforms
  16. EMA Hybrid Data Ecosystems - in the cloud. Diagram nodes: H/S, AP, DW, DM, NS, SP, SS, OS, RDS, DP. Legend: AP - Analytic Platforms; DP - Discovery Platforms; H/S - Hadoop/Spark; DW - Data Warehouse; DM - Data Marts; NS - NoSQL; OS - Operational Systems; SP - Streaming Platforms; SS - Simple Storage
  17. EMA Hybrid Data Ecosystems - Example: AWS. H/S: EMR; AP: Redshift; DW: Redshift; DM: RDS; NS: DynamoDB; SP: Kinesis; SS: S3; OS: RDS; DP: Several. Legend: AP - Analytic Platforms; DP - Discovery Platforms; H/S - Hadoop/Spark; DW - Data Warehouse; DM - Data Marts; NS - NoSQL; OS - Operational Systems; SP - Streaming Platforms; SS - Simple Storage
  18. 7 Principles of DataOps for Cloud Data Ecosystems • Multi-model data access replaces single model • Interoperability replaces integration • Data preparation and pipelines replace data cleansing • Automation replaces manual data everything • Elasticity replaces enterprise scalability • Multidimensional agility replaces extensibility • Automated governance replaces simple metadata
  19. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Confidential. Data lakes and analytics on AWS. Nikki Rouda, AWS. February 21, 2019
  20. Driving business with analytics • Fighting fraud • Quoting car & truck prices • Movies & TV on demand • Delivering software quality • Mitigating safety issues • Targeted marketing • Finding new revenue • Improving health • Serving retail customers • Valuing real estate • Reducing advertising costs • Making music. What do all these business challenges have in common? They are solved with AWS data lakes and analytics.
  21. Companies want more value from their data. Data is: • Growing exponentially • From new sources • Increasingly diverse • Used by many people • Analyzed by many applications. Complications: • Siloed approaches don’t work anymore • It’s too expensive and limiting to store data on-premises. Implication: A new approach is needed to extract insights and value.
  22. Cloud data lakes are the future. Customers want: • To eliminate siloes of data • To move to a single store, i.e. a data lake in the cloud • To store data securely in standard formats • To grow to any scale, with low costs • To analyze their data in a variety of ways • To have real-time analytics • To predict future outcomes
  23. Why choose AWS for data lakes and analytics • Most comprehensive • Most secure • Most scalable • Most cost-effective
  24. Our portfolio: Broadest and deepest portfolio, purpose-built for builders. Layers: Data movement | Data lake | Analytics | Visualization & machine learning. Categories: Migration & Streaming Services | Infrastructure | Data Catalog & ETL | Security & Management | Dashboards | Machine Learning | Data Warehousing | Big Data Processing | Interactive Query | Operational Analytics | Real time Analytics | Serverless Data processing
  25. Our portfolio: Broadest and deepest portfolio, purpose-built for builders. Visualization & machine learning: QuickSight | SageMaker. Analytics: Redshift | EMR | Athena | Elasticsearch Service | Kinesis Data Analytics | Glue. Data lake: S3/Glacier | Glue | Lake Formation. Data movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Managed Streaming for Kafka
  26. More data lakes and analytics than anywhere else
  27. Fortnite: 125+ million players. Create a constant feedback loop for game designers. Up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged. Resulting in the most popular game played in the world.
  28. Migrated from on-premises data warehouse. Built a data warehouse with Amazon Redshift and data lake with Amazon S3. Analytics on data lake with Amazon Athena, Amazon Redshift Spectrum, and Amazon EMR. Report delivery went from months to days, at far lower cost.
  29. Needed to analyze data to find insights, identify opportunities and evaluate business performance. The Oracle data warehouse did not scale, was difficult to maintain and costly. Deployed a data lake with Amazon S3, and ran analytics with Amazon Redshift, Amazon Redshift Spectrum, and Amazon EMR. Result: They doubled the data stored (100PB), lowered costs, and were able to gain insights faster. 50 PB of data; 600,000 analytics jobs/day.
  30. Enabling all types of data-driven analytics • Retrospective analysis and reporting • Here-and-now real-time processing and dashboards • Predictions to enable smart applications
  31. Trifacta on AWS. How to Streamline DataOps on AWS: Modernizing Data Management in the Cloud. Will Davis, Sr. Director of Product Marketing, Trifacta
  32. DataOps Definition: “DataOps is the function within an organization that controls the data journey from source to value.” Jarah Euston, What is DataOps? (Nexla, 2017), https://www.nexla.com/define-dataops/ Proprietary & Confidential
  33. What is the Biggest Challenge in Streamlining DataOps? “It’s impossible to overstress this: 80% of the work in any data project is in cleaning the data.” (DJ Patil, Former Chief Data Scientist of the United States) Data sources: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps. Usage Patterns: Data Onboarding, ML/AI, Analytics.
  34. And it Impacts Your Entire Data Team... Roles: Data Analyst, Data Engineer, Data Scientist. Steps: Validating, Discovering, Structuring, Cleaning, Enriching, Deploying. Data sources: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps. Targets: Analysis, Enterprise Data Warehouse, AI, Business Intelligence.
  35. The Rise of Machine Learning & AI Compounds the Problem. “Poor data quality is enemy number one to the widespread, profitable use of machine learning.” (Harvard Business Review) “So, while there is a visible arms race as companies bring on machine learning coders and kick off AI initiatives, there is also a behind-the-scenes, panicked race for new and different data.” (MIT Sloan Management Review)
  36. “The hard part of AI is data wrangling.” (Swami Sivasubramanian, VP, Amazon Machine Learning, #reInvent2018)
  37. WHAT TO DO?
  38. DATA WRANGLING. Data sources: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps. Targets: Analysis, Enterprise Data Warehouse, AI, Business Intelligence.
  39. DATA WRANGLING • Empowers domain experts with intelligent visual interfaces that automate assessment and transformation of data • Enables IT to collaboratively curate and operationalize data pipelines authored by domain experts • Establishes an enterprise-wide platform that refines data from a variety of sources, supporting a range of users and use cases. Data sources: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps. Targets: Analysis, Enterprise Data Warehouse, AI, Business Intelligence.
  40. Use cases (ANALYTIC and OPERATIONAL): Predictive Modeling; Business Intelligence; Data Onboarding; Risk & Compliance; Audit, Testing & Validation; Data Migration. Roles: Data Analyst, Data Engineer, Data Scientist. Steps: Validating, Discovering, Structuring, Cleaning, Enriching, Deploying. Data sources: Data Platforms, Databases, Log Files, Spreadsheets, IoT Sensors, Apps.
  41. And We’re Natively Integrated into AWS • Native storage • Native processing • Native security
  42. 50+ Trifacta Customers Deployed on AWS
  43. Why Trifacta? QUALITY | SPEED | EFFICIENCY
  44. QUALITY | SPEED | EFFICIENCY. Empower the people who know the data best, while maintaining governance and lineage • Intuitive, visual interface • Self-documenting lineage
  45. QUALITY | SPEED | EFFICIENCY. Faster to Design Preparation Workflows • Instant previews, continuous validation • ML-driven suggestions
  46. QUALITY | SPEED | EFFICIENCY. Faster to Put Workflows into Production • Automate data pipelines • Share, test & version control
  47. QUALITY | SPEED | EFFICIENCY • Retire legacy solutions • Utilize native cloud elasticity
  48. Industry Analysts All Rank Trifacta #1 • Self Service Data Preparation Wave • “Customer references can't say enough about Trifacta’s ease of use” • A perfect score in 14 of 17 categories • #1 with Gartner • #1 with Ovum, Dresner, and Bloor
  49. How to Get Started?
  50. Different Editions & Deployments Options for Any Use Case. FOR INDIVIDUALS (Free): • Trifacta Managed Cloud • Works with desktop files • Functional, data volume, and processing limitations • Community support. FOR TEAMS & DEPARTMENTS (Starts at $5K per user*): • Trifacta Managed Cloud • Files, relational, cloud connectivity • Job Scheduling & Collaboration • Phone/email support. FOR ENTERPRISE DEPLOYMENT (Contact Us): • Customer Managed Cloud • Unlimited volume & scalability • Broad connectivity • Advanced security, access controls, and governance • Enterprise support & dedicated customer success manager
  51. Available on the AWS Marketplace
  52. Start Wrangling Today with Free Wrangler
  53. Questions?
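
The wrangling steps named on slides 34 and 40 (discovering, structuring, cleaning, enriching, validating) can be illustrated with a minimal sketch. This is not Trifacta's product or API, and the deck does not show code; it is plain standard-library Python over hypothetical sample data. The sample CSV, the field names, the `REGION_NAMES` lookup, and every function name are assumptions made for illustration, and the order in which the steps run is one plausible choice rather than something the slides prescribe. The deploying step is left out because it depends on the target platform.

```python
import csv
import io
from collections import Counter

# Hypothetical raw export: inconsistent casing, stray whitespace, a missing amount.
RAW_CSV = """customer,region,amount
 Alice ,us-east,120.50
BOB,US-EAST,
carol,eu-west,80
"""

# Hypothetical lookup table used for the enriching step.
REGION_NAMES = {"us-east": "US East", "eu-west": "EU West"}


def structure(raw_text):
    """Structuring: parse delimited text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def discover(rows):
    """Discovering: count empty values per column to see where the problems are."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if not value.strip():
                missing[column] += 1
    return dict(missing)


def clean(rows):
    """Cleaning: trim whitespace, normalize case, drop rows without an amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "region": row["region"].strip().lower(),
            "amount": float(amount),
        })
    return cleaned


def enrich(rows):
    """Enriching: add a readable region name from the lookup table."""
    return [
        {**row, "region_name": REGION_NAMES.get(row["region"], "Unknown")}
        for row in rows
    ]


def validate(rows):
    """Validating: fail loudly if a cleaned row still breaks a basic rule."""
    for row in rows:
        if row["amount"] < 0 or not row["customer"]:
            raise ValueError(f"invalid row: {row}")
    return rows


raw_rows = structure(RAW_CSV)
profile = discover(raw_rows)
prepared = validate(enrich(clean(raw_rows)))
```

Keeping each step as a separate function mirrors the idea on slide 39 that pipelines authored by domain experts can later be curated and operationalized: individual steps can be tested, reordered, or replaced without touching the rest.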
