How Data Virtualization Puts Machine Learning into Production (APAC)

Denodo
23 Dec 2020

  1. Data Virtualization APAC Webinar Series: Sessions Covering Key Data Integration Challenges Solved with Data Virtualization
  2. How Data Virtualization Puts Enterprise Machine Learning Programs into Production. Chris Day (Director, APAC Sales Engineering, Denodo) and Sushant Kumar (Product Marketing Manager, Denodo)
  3. Agenda: 1. What are Advanced Analytics? 2. The Data Challenge 3. The Rise of Logical Data Architectures 4. Tackling the Data Pipeline Problem 5. Customer Stories 6. Key Takeaways 7. Q&A 8. Next Steps
  4. 87% of data science projects never make it into production. (VentureBeat AI, July 2019)
  5. Advanced Analytics & Machine Learning Exercises Need Data
      • Improving Patient Outcomes: data includes patient demographics, family history, patient vitals, lab test results, claims data, etc.
      • Predictive Maintenance: maintenance data logs and data coming in from sensors, including temperature, running time, power level duration, etc.
      • Predicting Late Payment: data includes company or individual demographics, payment history, customer support logs, etc.
      • Preventing Fraud: data includes the location where the claim originated, time of day, claimant history and any recent adverse events.
      • Reducing Customer Churn: data includes customer demographics, products purchased, products used, past transactions, company size, history, revenue, etc.
  6. Logical Data Warehouse
  7. “When designed properly, Data Virtualization can speed data integration, lower data latency, offer flexibility and reuse, and reduce data sprawl across dispersed data sources. Due to its many benefits, Data Virtualization is often the first step for organizations evolving a traditional, repository-style data warehouse into a Logical Architecture.” (Gartner, Adopt the Logical Data Warehouse Architecture to Meet Your Modern Analytical Needs, May 2018)
  8. Logical Data Warehouse Reference Architecture
  9. Why a Logical Architecture Is Needed
      • The analytical technology landscape has shifted over time.
      • You need a flexible architecture that will allow you to embrace those shifts rather than tie you down to a monolithic approach.
      • Only a logical architecture, not a physical one, will easily accommodate such changes.
      • IT should be able to adopt newer technologies without impacting business users.
  10. Tackling the Data Pipeline Problem
  11. Typical Data Science Workflow. A typical workflow for a data scientist is:
      1. Gather the requirements for the business problem
      2. Identify useful data
         - Ingest data
      3. Cleanse data into a useful format
      4. Analyze data
      5. Prepare input for your algorithms
      6. Execute data science algorithms (ML, AI, etc.)
         - Iterate steps 2 to 6 until valuable insights are produced
      7. Visualize and share
      Source: http://sudeep.co/data-science/Understanding-the-Data-Science-Lifecycle/
  12. Where Does Your Time Go?
      • 80% of time – Finding and preparing the data
      • 10% of time – Analysis
      • 10% of time – Visualizing data
      Source: http://sudeep.co/data-science/Understanding-the-Data-Science-Lifecycle/
  13. Where Does Your Time Go? A large amount of time and effort goes into tasks not intrinsically related to data science:
      • Finding where the right data may be
      • Getting access to the data
         - Bureaucracy
         - Understanding access methods and technology (NoSQL, REST APIs, etc.)
      • Transforming data into a format that is easy to work with
      • Combining data originally available in different sources and formats
      • Profiling and cleansing data to eliminate incomplete or inconsistent data points
  14. Data Scientist Workflow: Identify useful data → Modify data into a useful format → Analyze data → Prepare for ML algorithm → Execute data science algorithms (ML, AI, etc.)
  15. Identify Useful Data. If the company has a virtual layer with good coverage of its data sources, this task is greatly simplified.
      • A data virtualization tool like Denodo can offer unified access to all data available in the company.
      • It abstracts the technologies underneath, offering a standard SQL interface to query and manipulate data.
      To further simplify the challenge, Denodo offers a Data Catalog to search, find and explore your data assets.
  16. Data Scientist Workflow: Identify useful data → Modify data into a useful format → Analyze data → Prepare for ML algorithm → Execute data science algorithms (ML, AI, etc.)
  17. Ingestion and Data Manipulation Tasks. Data Virtualization offers the unique opportunity of using standard SQL (joins, aggregations, transformations, etc.) to access, manipulate and analyze any data. Cleansing and transformation steps can be easily accomplished in SQL. Its modeling capabilities enable the definition of views that embed this logic to foster reusability.
  18. Customer Story
  19. Prologis Launches Data Analytics Program for Cost Optimization. Background:
      • Create a single governed data access layer to create reusable and consistent analytical assets that could be used by the rest of the business teams to run their own analytics.
      • Save time for data scientists in finding, transforming and analysing data sets without having to learn new skills, and create data models that could be refreshed on demand.
      • Efficiently maintain its new data architecture with minimum downtime and configuration management.
      Prologis is the largest industrial real estate company in the world, serving 5000 customers in over 20 countries, with USD 87 billion in assets under management.
  20. Prologis Architecture Diagram [figure; labels: wc_monthly_occupancy_rpt_f; wc_lease_amendment_d; w_day_d; wc_property_d; MARKET_AVAILABILITY_CURRENT; MARKET_AVAILABILITY_FUTURE; Prologis Snowflake; API Access; Informatica Cloud; ShareHouse; ODBC; JDBC; peoplesoft_gl_actuals; yardi_unit_leasing; p360_property; WAF; AWS Lambda; APIs]
  21. Data Virtualization Benefits Experienced by Prologis
      • The analytics team was able to create business-focused subject areas with consistent data sets, achieving 30% faster speed to analytics.
      • Denodo made it possible for Prologis to quick-start advanced analytics projects.
      • The Denodo Platform’s deployment was as easy as a click of a button, with centralized configuration management. This simplified Prologis’s data architecture and also helped bring down the overall maintenance cost.
  22. “The speed of business is faster than before. It is now critical to be able to make decisions on a dime to pivot the business in its needed direction. This is why Prologis went with the Denodo Platform.” (Luke Slotwinski, VP, IT Data and Analytics at Prologis)
  23. Data Virtualization Benefits for AI and Machine Learning Projects
      • Denodo can play a key role in the data science ecosystem to reduce data exploration and analysis timeframes.
      • Extends and integrates with the capabilities of notebooks, Python, R, etc. to improve the toolset of the data scientist.
      • Provides a modern “SQL-on-Anything” engine.
      • Can leverage Big Data technologies like Spark (as a data source, an ingestion tool and for external processing) to efficiently work with large data volumes.
      • New and expanded tools for data scientists and citizen analysts: “Apache Zeppelin for Denodo” Notebook.
  24. Product Demonstration: Chris Day, Director, APAC Sales Engineering, Denodo
  25. Key Takeaways
      • Information architectures are getting more complex, more diverse, and more distributed.
      • Traditional technologies and data replication don’t cut it anymore.
      • Data virtualization makes it quick and easy to expose data from multiple sources to your users.
      • Data virtualization provides the governance and management infrastructure required for successful data management.
  26. Q&A
  27. Next Steps
  28. bit.ly/2AouQLQ
  29. Thanks! www.denodo.com info@denodo.com © Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
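
Slides 15 and 17 describe a virtual layer that exposes every source through standard SQL and lets cleansing and transformation logic live in reusable views. The idea can be sketched roughly as follows, assuming Python's built-in `sqlite3` as a stand-in for the virtual layer. This is not Denodo and does not reflect Denodo's API. The table names `crm_customers` and `billing_payments`, their columns, the sample rows, and the view name `customer_payment_features` are all hypothetical, chosen to echo the "Predicting Late Payment" use case on slide 5.

```python
import sqlite3

# In-memory SQLite database standing in for the virtual layer (assumption:
# a real deployment would expose separate source systems through one SQL
# endpoint; here two local tables play the role of two sources).
conn = sqlite3.connect(":memory:")

conn.executescript(
    """
    -- Hypothetical "CRM" source: names carry stray whitespace and
    -- inconsistent country casing, and one record has no name at all.
    CREATE TABLE crm_customers (customer_id INTEGER, name TEXT, country TEXT);
    INSERT INTO crm_customers VALUES
        (1, ' Acme ', 'au'),
        (2, 'Globex', 'SG'),
        (3, NULL,     'jp');

    -- Hypothetical "billing" source: days_late is sometimes missing.
    CREATE TABLE billing_payments (customer_id INTEGER, amount REAL, days_late INTEGER);
    INSERT INTO billing_payments VALUES
        (1, 100.0, 0),
        (1, 250.0, 12),
        (2, 80.0,  NULL),
        (3, 40.0,  3);

    -- Reusable view: the join, cleansing, and aggregation are defined once.
    CREATE VIEW customer_payment_features AS
    SELECT c.customer_id,
           TRIM(c.name)                  AS name,          -- strip whitespace
           UPPER(c.country)              AS country,       -- normalize casing
           COUNT(p.amount)               AS n_payments,
           SUM(p.amount)                 AS total_paid,
           AVG(COALESCE(p.days_late, 0)) AS avg_days_late  -- missing -> 0
    FROM crm_customers AS c
    JOIN billing_payments AS p ON p.customer_id = c.customer_id
    WHERE c.name IS NOT NULL                               -- drop incomplete records
    GROUP BY c.customer_id, TRIM(c.name), UPPER(c.country);
    """
)

# A consumer (notebook, training script) only needs to query the view.
rows = conn.execute(
    "SELECT * FROM customer_payment_features ORDER BY customer_id"
).fetchall()

for row in rows:
    print(row)
```

Because the cleansing lives in the view, any consumer can select from `customer_payment_features` without repeating the trimming, case normalization, or null handling, which is the reusability point slide 17 makes.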