DevOps Spain 2019. Olivier Perard-Oracle

Talk. DataOps: The continuous deployment cycle in data analysis

Published in: Technology

  1. Sponsors. Collaborators. DataOps: The continuous deployment cycle in data analysis. Olivier Perard | Data Scientist at Oracle
  2. DataOps Definitions
     • VP Technology Strategy, MapR: DataOps is an agile methodology for developing and deploying data-intensive applications, including data science and machine learning. A DataOps workflow supports cross-functional collaboration and fast time to value.
     • Gartner (http://www.gartner.com/it-glossary/data-ops/): A hub for collecting and distributing data, with a mandate to provide controlled access to systems of record for customer and marketing performance data, while protecting privacy, usage restrictions, and data integrity.
     • Tamr CEO Andy Palmer: DataOps is an enterprise collaboration framework that aligns data-management objectives with data-consumption ideals to maximize data-derived value.
     • Nexla CEO: DataOps is the function within an organization that controls the data journey from source to value.
  3. DataOps. Gartner Data & Analytics Summit 2018: DataOps, private cloud database platform as a service (dbPaaS), and machine-learning-enabled data management. "DataOps is a new practice with no standards or frameworks." Nick Heudecker, research vice president at Gartner
  4. Comparing DevOps and DataOps: what's different or the same? Roles shown: Developers & Architects, Data Engineers, Data Scientists, Security & Governance, Operations. Diagram labels: DevOps, DataOps.
  5. DataOps Brings Flexibility & Focus: expands DevOps to include data-heavy roles; organized around data-related goals; better collaboration and communication between roles.
  6. DataOps: an agile methodology for data-driven organizations
     DataOps goals:
     • Continuous model deployment
     • Promote repeatability
     • Promote productivity (focus on core competencies)
     • Promote agility
     • Promote self-service
     Axioms:
     • Data is central to disruptive enterprise applications. Lightweight, stateless functions do not represent the majority of workloads.
     • Data science and machine learning are an important paradigm. Scientists become active users, no longer just application developers. The workflow is iterative, with different data usage patterns.
     • Data volumes continue to grow.
     • Moving data is a performance bottleneck.
  7. DataOps (diagram). Stages: Connect and Integrate, Store and Process, Analyze and Visualize. Layers: Data Stream, Data Platform, Data Analytics. Labels: structured data; unstructured data; data available from structured and unstructured sources; varying data types; sandboxes; data lakes; data marts / warehouses; focus on algorithms, not infrastructure; quick and actionable business insights.
  9. DataOps Approach Advantages
     • Data self-service: data scientists need to develop use cases quickly using the enterprise's data without any restrictions from IT.
     • Improved efficiency and better use of the team's time: deploy an analytics platform in one click.
     • Faster time-to-value and improved productivity: implement use cases in parallel using the same data, but with a dedicated platform for each analytics team.
     Diagram labels: Storage, Compute, Libraries, Tools, Data Science Platforms.
  10. DataOps Continuous Model Deployment
     Key building blocks for agility:
     • Unified data platform
     • Data governance
     • Self-service data and compute access
     • Multitenancy and resource management
     Lifecycle stages: Data Engineering, Model Development, Model Management, Model Deployment, Model Monitoring & Rescoring.
  11. DataOps (diagram). Labels: Storage, Compute, Data Lab, Sandbox, Data Pod.
  12. DataOps Data Platform Deployment (diagram with steps numbered 1 to 3). Labels: Oracle GitHub, OCI Ansible Modules, Oracle Database 12c, Jupyter, Zeppelin, OML, Data Integration (CDC / ETL), Data Lab.
  13. DataOps Data-Driven Architecture (source: Oracle Insight)
     • Data sources, traditional and modern: Legacy, Custom, Mainframe, SaaS, Microservices, …
     • Modern Data Platform, with Security & Compliance
     • Analytics: Advanced Analytics, Self-service, Predictive
     • Data Science: Machine Learning, Deep Learning
     • Data Applications and Real-time Services
     • Real-time Analytics: Real-time Marketing, Fraud detection, Exec Dashboarding
     • Technologies named: {OOP}, SparklineData, BigData SQL
     Goals: accessing multiple sources of data (Technologies, Silos/Locations, Clouds) … with high performance … for broader cross multi-model queries/algorithms on real-time data as well as historical data.
  14. DataOps Cloud Native & Open Source Community: cloud-native and community-driven innovation. Labels: Artificial Intelligence, Blockchain, Internet of Things, Container Native, Microservices, Open Serverless Computing, DevOps, Prometheus, Istio, Open Source Cloud Native Innovation, Open Source Cloud Native Development, Open Source Managed and Autonomous Cloud Native.
  15. DataOps Data Stream (diagram). Labels: Data Preparation, Data Replication, Data ETL, Logs, Oracle Cloud Infrastructure, Analytics Consumers, Data Platform, BI, NL / AI, Data Integration (CDC / ETL). Data preparation stages: Discovering, Structuring, Cleaning, Enriching, Validating, Deploying.
  16. DataOps Data Stream: Lineage, Pipeline, Quality, Speed, Efficiency.
  17. Oracle Data Science: data science requires a comprehensive platform to simplify operations and deliver value at scale
     • Accelerate use of proper tools, frameworks and infrastructure
     • Overcome restricted skillsets with a simple, collaborative platform
     • Quickly leverage predictive analytics to drive positive business outcomes
     Diagram labels: Collaborate securely; Power business; Work in standardized environments; Manage data and tools.
     A robust, easy-to-use data science platform removes barriers to deploying valuable machine learning models in production.
  18. Oracle Data Science Projects Lifecycle
     • Data Exploration: collaborative data analysis / feature engineering
     • Model Build: train with open source frameworks
     • Model Deployment: operationalize models as scalable APIs
     • Model Management: monitor and optimize model performance
     • Reproducibility: data versioning, code versioning, model versioning, environment management
     Collaborators: Data Scientists, Business Stakeholders, App Developers, IT Admins.
     • Business Analyst/Leader: defines the business problem and the objective of analyses.
     • Data Engineer: prepares data, builds pipelines, and provides data access for analytical or operational uses.
     • IT Admin: oversees underlying process, architecture, operations, and resource constraints.
     • Data Scientist: analyzes data using statistical methods and coding languages like Python, R, Scala.
     • Application Developer: deploys data science models into applications and builds data products.
  19. Oracle Data Science Modules
     • Collaborative: project-driven UI enables teams to easily work together on end-to-end modeling workflows with self-service access to data and resources.
     • Integrated: support for the latest open source tools, version control, and tight integration with OCI and Oracle Big Data Platform.
     • Enterprise-Grade: a fully managed platform built to meet the needs of the modern enterprise.
     Oracle Data Science Cloud: Projects, Notebooks, Open Source Languages & Libraries, Version Control, Use Case Templates, Model Build & Train, Model Deployment, Model Monitoring, Access Controls & Security.
     Oracle PaaS & IaaS: Self-Service Scalable Compute (OCI), Object Store, Catalog, Data Lake, Streaming, Autonomous Data Warehouse.
  20. Oracle Data Science: Environment complexity
  21. Oracle Data Science: Configure, Train & Deploy (diagram)
     • Step 1, Configure (easy setup, autonomous setup): Frameworks, AI libraries, Samples, GPU clusters, Connect to data, Auto scale and updates, HS network and storage.
     • Easy data access: Object Stores, Database CS, Spark.
     • Step 2, Train (easy development): Data Select, Code Notebook, Data Definition, Model Train, Model Test.
     • Step 3, Deploy (easy deployment): Publish API, Model Sharing, Model Library, APIs, Model Analytics.
     Personas: IT / DevOps, Data Scientist. Oracle PaaS labels: Language, Image, Video, HR, Emotion.
  22. Oracle Data Science, Build & Train: DEV, TEST, PROD
  23. Oracle Data Science, Deploy: DEV, TEST, PROD
  24. DataOps Conclusions. Oracle provides, all in one: multi-model data access; interoperability; data preparation and pipeline automation; elasticity; multidimensional agility; automated governance. Next-generation platform for all data: complete, integrated, open, with AI and machine learning.
  25. Sponsors. Collaborators. Thank you very much. Olivier Perard https://twitter.com/oracle_es?lang=es
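
The data preparation stages named on slide 15 (structuring, cleaning, enriching, validating, deploying) and the lineage idea from slide 16 can be illustrated with a small Python pipeline. This is only a sketch under stated assumptions, not anything shown in the talk and not an Oracle API: the stage functions, the `Batch` class, the record fields (`customer`, `amount`), and the 1000 threshold are all hypothetical, and the "discovering" stage is represented simply by the raw input.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """A batch of records plus the lineage of stages applied to it (hypothetical)."""
    records: list
    lineage: list = field(default_factory=list)


def structure(records):
    # Structuring: normalise key names and trim whitespace in string values.
    return [
        {k.strip().lower(): (v.strip() if isinstance(v, str) else v)
         for k, v in r.items()}
        for r in records
    ]


def clean(records):
    # Cleaning: drop records without a customer, coerce amount to float.
    out = []
    for r in records:
        if not r.get("customer"):
            continue
        out.append({**r, "amount": float(r["amount"])})
    return out


def enrich(records):
    # Enriching: add a derived field (the 1000 threshold is an assumption).
    return [{**r, "large_order": r["amount"] >= 1000} for r in records]


def validate(records):
    # Validating: fail fast if a record violates a basic quality rule.
    for r in records:
        if r["amount"] < 0:
            raise ValueError(f"negative amount for {r['customer']}")
    return records


def run_pipeline(raw, stages):
    """Apply each stage in order, recording stage name and output size as lineage."""
    batch = Batch(records=list(raw))
    for stage in stages:
        batch.records = stage(batch.records)
        batch.lineage.append((stage.__name__, len(batch.records)))
    return batch


STAGES = [structure, clean, enrich, validate]

# Made-up sample input standing in for discovered raw data.
raw = [
    {" Customer ": " acme ", "Amount": "1200.50"},
    {"customer": "", "amount": "10"},
    {"customer": "globex", "amount": "75"},
]

batch = run_pipeline(raw, STAGES)
deployed = batch.records  # Deploying: stand-in for publishing to consumers.
```

Keeping each stage as a plain function and recording `(stage name, record count)` pairs is one simple way to make a pipeline repeatable and auditable; a real deployment would more likely delegate lineage and quality checks to the data platform itself.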