Introducing the data science sandbox as a service 8.30.18

How can companies integrate data science into their businesses more effectively? Watch this recorded webinar and demonstration to hear more about operationalizing data science with Cloudera Data Science Workbench on Cazena’s fully-managed cloud platform.

Published in: Technology

  1. © Cloudera, Inc. All rights reserved. Enterprise-Ready Data Science: Scaling, Governance, and Operationalization
  2. Introducing Cloudera Data Science Workbench. Mark Chisam, Senior Solution Engineer
  9. Operationalizing Data Science for Enterprises. Dr. Daniel Parton, Lead Data Scientist
  10. Bardess® is a consulting company focused on designing and implementing data analytics solutions. We are a team of data and business professionals who ask insightful questions, extend boundaries, and take action. We transform data into insights and action, every day.
  11. Bardess Data Practices: Management Consulting, Data Ops, Data Science, Data Analytics. Capabilities shown: Requirements Discovery, Strategy + Planning, Solution Design, Ingestion + Shaping, Data Architecture, Storage + Processing, Predictive Analytics, Machine Learning, Artificial Intelligence, Visualization, Data Discovery, Dev / Ops.
  12. AI, MACHINE LEARNING, DATA SCIENCE, ANALYTICS, "BIG DATA"
  13. WHAT IS A DATA SCIENTIST?
  14. WHAT IS A DATA SCIENTIST?
  15. THE GOOD NEWS
      • Data has never been more plentiful.
      • Open source data science and machine learning libraries are rapidly evolving.
      • Commodity (and on-demand) compute makes scalable production machine learning affordable.
      Stages shown: Data Engineering, Data Science (Exploratory), Production (Operational). Outputs shown: Reports, Dashboards, Production Data Pipelines, Batch scoring …
  16. THE BAD NEWS
      • Data needs to move across multiple different systems.
      • Teams have different, conflicting requests for languages and libraries.
      • Most data science is done at small scale, individually, and is difficult to replace.
      • Very few models reach production.
      Stages shown: Data Engineering, Data Science (Exploratory), Production (Operational).
  17. THE CHALLENGE: Balance these needs
      DATA SCIENCE
      • Access to granular data
      • Flexibility
      • Preferred open source tools
      • Elastic provisioning
      • Compute
      • Storage
      • Reproducible research
      • Path to production
      DATA MANAGEMENT
      • Security
      • Governance
      • Standards
      • Low maintenance
      • Low cost
      • Self-service access
  18. THE TYPICAL SOLUTION
      “If I can’t use my favorite tools, I’ll…”
      • Copy data to my laptop
      • Copy data to a data science appliance
      • Copy data to a cloud service
      Why this is a problem:
      • Complicates security
      • Breaks data governance
      • Adds latency to process
      • Makes collaboration more difficult
      • Complicates model management and deployment
      • Creates infrastructure silos
  19. CLOUDERA DATA SCIENCE WORKBENCH: Accelerate Machine Learning from Research to Production
      For data scientists
      • Experiment faster: Use R, Python, or Scala with on-demand compute and secure CDH data access
      • Work together: Share reproducible research with your whole team
      • Deploy with confidence: Get to production repeatably and without recoding
      For IT professionals
      • Bring data science to the data: Give your data science team more freedom while reducing the risk and cost of silos
      • Secure by default: Leverage common security and governance across workloads
      • Run anywhere: On-premises or in the cloud
  20. CASE STUDY: Transforming Business Decision-Making with Machine Learning at Scale
      Background:
      • Retail client aimed to use clustering to understand their most common types of transactions
      • And to find which groups of products tend to be purchased together
      • Cloudera cluster, storing 2 billion rows of historical transaction data
      • Used CDSW to build a custom clustering workflow in Spark and Python
      (Figure: representative image of clustering)
  21. CASE STUDY: Transforming Business Decision-Making with Machine Learning at Scale
      Result:
      • Clusters describe transactions with far more nuance than the simple category-level aggregations that were previously in use
      • Identified major trends in certain types of transaction, worth multiples of $100M
      • Clusters are transforming how the company thinks about its business, from shop floor to board level
      • Clustering workflow is easily maintainable, reproducible, and scalable
      (Figure: representative image of clustering)
  22. CASE STUDY: Transforming Business Decision-Making with Machine Learning at Scale
      Benefits of CDSW:
      • Easy access to big datasets from Cloudera HDFS
      • Access to Spark to apply clustering on the entire 2 billion row dataset
      • Notebook environment allows data scientists to innovate while staying within the secure Cloudera environment
      • Collaborative environment enabling organized project structure and collaboration within a team of data scientists
      (Figure: representative image of clustering)
  23. LIVE DEMO
  24. Introducing the Data Science Sandbox. Lovan Chetty, VP, Product
  25. SOLUTION: Fully-Managed Data Science Sandbox as a Service
      Fully-Managed, Complete Cloud Platform for Analytics and Data Science. DevOps Built-In, Cloudera & Cloud IaaS Included. Fast Setup, Ready in Hours.
      Stack shown:
      • Data Science Workbench
      • EDH Stack + Option for Altus PaaS & More…
      • Cloud IaaS (Fully-Managed) + BYOL options
      End to End Management (Cloud>Cluster>Workload):
      • 24x7 Production DevOps
      • Security, Governance & Compliance
      • Workload Optimization
  26. WHY CLOUD? Fully-Managed Data Science Sandbox
      The Fastest, Most Cost-Effective Way to Expand or Deploy a Modern Platform for Data Science in the Cloud.
      Benefits:
      • Ready Now, with No New Resources: 24x7 Production DevOps & Monitoring
      • Secure, Enterprise-Ready: Hybrid Gateways, Governance, Compliance
      • Simple: All-in-one solutions for agility, flexibility in analytics & tools
      • Cost-Effective: ½ TCO, Best price-performance, SLA Optimization
      www.cazena.com/cloudera
  27. Q&A
  28. Q&A - TECHNICAL PANELISTS
      • Lovan Chetty, VP, Products, lovan@cazena.com
      • Dr. Daniel Parton, Lead Data Scientist, dparton@bardess.com
      • Mark Chisam, Senior Solution Engineer, mchisam@cloudera.com
  29. The Data Science Sandbox as a Service. Try it Now with the FastStart Business Value Pilot: 4 Weeks to a Guaranteed Business Outcome.
      • Bardess: Data Science & Management Consulting. Philip Duplisey, Senior Director of Consulting, pduplisey@bardess.com, Bardess.com
      • Cazena: Fully-Managed Cloudera Solutions for Azure & AWS. Sam Berg, VP Sales, sberg@cazena.com, Cazena.com
      • Cloudera: The Modern Platform for Data Science and Analytics. Tia Watson, Partner Manager, twatson@cloudera.com, Cloudera.com
  30. THANK YOU
