Cloud and Analytics
– from Platforms to an Ecosystem
Ming Yuan, Zurich North America
David Carlson, Databricks
Agenda
▪ Data and Analytics at ZNA
▪ Data and Metadata
▪ Data Exploration and ETL
▪ Containerization
▪ DevOps in Analytics
Zurich is a data-enabled innovative company
• Data is used in day-to-day decision-making in key
business domains
• A strong data science team delivers predictive models and
business insights
• We are an early adopter of advanced analytics and cloud
analytics
Multiple Databases
On-premises Data Warehouse
Hadoop Data Lake
Cloud Data Lake
• Governance processes on data access and utilization are established
• Metadata is collected and stored in the repository system
Key capabilities support data analytics life cycle
• Data Discovery
• Data Integration
• Collaboration
• Business Impact (Operationalization)
• Scalability
• Multiple Personas
• Support multiple types of implementations
Ideation Model Build Model Deployment Model Execution Model Monitoring
▪ Support ML and advanced analysis to discover business insights and drive
appropriate actions
▪ Enable cross-domain data sharing, aggregation, and integration
▪ Modernize the technical landscape to handle data sets that were previously
unprocessable
Data foundation and processing power
Data
▪ Optimize data processing and archiving strategies to
reduce operation costs
▪ Apply data governance best practices to manage
utilization
Data lake consists of ADLS and Databricks® clusters
Provisioning Store (zones: Landing, Staging, Active, Archive)
▪ Landing: Change Data Capture (CDC) or full snapshots
▪ Staging: Enrich Landing zone data with additional date format fields and remove special characters
▪ Active: CDC records applied (I, U, D) to copy of previous day's data
▪ Archive: Rolling pointers to previous day's Active…
Curation Layer
▪ Universal Data Model: Enterprise-level curated datasets covering broad utilization
▪ Curated Data Sets: Pertaining to the needs in specific business domain
Data Sources
Data Consumption
Azure Subscription Services
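The Staging and Active steps described on this slide can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation used at ZNA (which runs on Databricks clusters): the record layout `(operation, key, row)`, the field name `policy_date`, the input date format, and the function names `enrich` and `apply_cdc` are all hypothetical.

```python
import re
from datetime import datetime

# Characters outside this set are treated as "special" (an assumption).
SPECIAL = re.compile(r"[^A-Za-z0-9 _./-]")


def enrich(row, raw_date_key="policy_date", raw_format="%m/%d/%Y"):
    """Staging step: strip special characters and add an ISO date field."""
    cleaned = {
        key: SPECIAL.sub("", value) if isinstance(value, str) else value
        for key, value in row.items()
    }
    raw = row.get(raw_date_key)
    if raw:
        # Add a normalized date field next to the original value.
        parsed = datetime.strptime(raw, raw_format).date()
        cleaned[raw_date_key + "_iso"] = parsed.isoformat()
    return cleaned


def apply_cdc(previous_active, cdc_records):
    """Active step: apply I/U/D records to a copy of the previous day's data."""
    active = dict(previous_active)  # the previous day's data stays untouched
    for operation, key, row in cdc_records:
        if operation in ("I", "U"):
            active[key] = row          # insert or update replaces the row
        elif operation == "D":
            active.pop(key, None)      # delete; ignore keys already absent
        else:
            raise ValueError(f"unknown CDC operation: {operation!r}")
    return active
```

Working on a copy keeps the previous day's snapshot intact, which is what allows an archive to point back at earlier Active states.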
Metadata management and data discovery
▪ For metadata administrators
▪ Maintain business glossary for data domains that are owned by function or business units
▪ Import technical metadata and catalog it as data assets
▪ Curate technical metadata by relating it to logical business terms
▪ Maintain data-flows mapping transformations
▪ For data consumers
▪ Search, explore and discover data assets and data lineage
▪ Interpret data with correct meaning and context
▪ Navigate data flows to analyze processes and assess change impact
▪ Evaluate data quality reports and drive improvement actions
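Assessing change impact from data lineage amounts to a graph traversal: starting from the changed asset, collect everything downstream of it. The sketch below assumes lineage is available as a simple mapping from each asset to its direct consumers; the function name `downstream_impact` and the asset names in the usage example are hypothetical and do not reflect any catalog product's API.

```python
from collections import deque


def downstream_impact(lineage, changed_asset):
    """Return all assets reachable downstream of changed_asset (breadth-first)."""
    seen = set()
    queue = deque([changed_asset])
    while queue:
        node = queue.popleft()
        for consumer in lineage.get(node, ()):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    seen.discard(changed_asset)  # guard against cycles leading back to the start
    return sorted(seen)


# Hypothetical lineage: each key feeds the assets in its list.
lineage = {
    "landing.policy": ["staging.policy"],
    "staging.policy": ["active.policy"],
    "active.policy": ["curated.claims_summary", "curated.pricing_features"],
    "curated.pricing_features": ["model.pricing"],
}
impacted = downstream_impact(lineage, "staging.policy")
```

Here `impacted` lists the Active table, both curated data sets, and the model that consumes one of them.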
Alation® Data Catalog manages metadata ingestion
Database
Data Warehouse
Cloud Data Lake
JSON Streams
Ingest and refresh schema, table, and column definitions
Build data lineage, popularity, common queries, and more
Profile and store sample data sets
Collect user information and usage metrics
Open APIs to programmatically import business glossaries
2,053,632
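To make "ingest schema, table, and column definitions" concrete, here is a minimal sketch of harvesting technical metadata from a source database. It uses SQLite from the Python standard library purely as a stand-in source; it is not how the Alation Data Catalog connects to systems, and the function name `harvest_metadata` and the output layout are assumptions.

```python
import sqlite3


def harvest_metadata(conn):
    """Collect table and column definitions from a SQLite connection."""
    catalog = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        for _cid, column, column_type, notnull, _default, pk in columns:
            catalog.append(
                {
                    "table": table,
                    "column": column,
                    "type": column_type,
                    "nullable": not notnull,
                    "primary_key": bool(pk),
                }
            )
    return catalog


# Usage with a throwaway in-memory database (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policy (policy_id INTEGER PRIMARY KEY, "
    "holder TEXT NOT NULL, premium REAL)"
)
catalog = harvest_metadata(conn)
```

Re-running the harvest after a schema change and comparing the two results is one simple way to refresh the stored definitions.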
Intuitive user interfaces to access metadata
▪ Users and Stewards actively curate the pages
▪ Natural-language search to easily discover unknowns
▪ Everyone collaborates and communicates
▪ Query intelligently against source systems
Data exploration and ETL implementations
▪ Explore, validate, and analyze existing data sets
▪ Curate new data sets for model development
▪ Construct ETL flows with embedded AI/modeling components
▪ Release ETL flows to production environment
▪ Provide runtime environments to trigger, manage, and monitor ETL flows in
production
Leverage technical stack and skills across personas
▪ Linux server on Azure Cloud: centralized server to facilitate access to data, and foster collaboration
▪ Browser-based user interfaces: user/task-specific interaction modes
▪ Centralized or ad hoc data sources, data lake; available or spun-up processing resources: leveraging best storage and compute resources
▪ Production systems: Dataiku deployment servers for enterprise-grade operationalization
▪ Integration with metadata system
Containerization in building model API services
▪ Standardize the runtime environment using commonly used ML libraries for
development and production
▪ Elastically scale the system capacity for the development environment
▪ Easily migrate system stacks from development environment to production
▪ Build CI/CD pipelines and deployment environments based on
open standards
▪ Monitor and ensure the health of model implementations in
production
Containerize models as cloud-native applications
[Diagram labels: Client App, Client App, Orchestration]
We observed improved agility in development, more portability in deployment, and better elasticity in production
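A model API service that is easy to containerize is typically a small, stateless web application with a prediction endpoint and a health endpoint that an orchestrator can probe. The sketch below uses only the Python standard library (WSGI) so that it stays self-contained; the deck does not say which framework ZNA uses. The routes `/predict` and `/health`, the port, and the placeholder scoring formula are all assumptions.

```python
import json
from wsgiref.simple_server import make_server


def predict(features):
    """Placeholder model: a hypothetical linear score, not a real model."""
    return 0.5 * features.get("x1", 0.0) + 0.25 * features.get("x2", 0.0)


def app(environ, start_response):
    """WSGI application exposing a health check and a prediction route."""
    path = environ.get("PATH_INFO", "/")
    method = environ.get("REQUEST_METHOD", "GET")
    if path == "/health" and method == "GET":
        status, body = "200 OK", {"status": "ok"}
    elif path == "/predict" and method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        status, body = "200 OK", {"score": predict(payload)}
    else:
        status, body = "404 Not Found", {"error": "not found"}
    data = json.dumps(body).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(data))),
    ]
    start_response(status, headers)
    return [data]


def serve(port=8080):
    """Run the service; a container entry point could call this function."""
    with make_server("0.0.0.0", port, app) as server:
        server.serve_forever()
```

Because the service holds no state between requests, more containers can be started behind the orchestration layer to add capacity, and the same image can move from development to production unchanged.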
DevOps in data & analytics
▪ For platform administrators
▪ Codify the installation and configuration of key components in the ecosystem
▪ Streamline the process of testing and upgrading systems to newer versions
▪ Automate system backup and restoration
▪ For model services developers
▪ Standardize the deployment pipelines to reduce the effort per project
▪ Increase the agility of deploying applications from development to production
▪ Reduce the time to fix bugs after production releases
CI/CD processes accelerate app deployments
[Diagram: three environments (Dev, QA, Prod). Components shown: Azure Code Repos, Azure Pipeline (Build), Azure Container Registry, Azure Pipeline (Release), Azure App Services]
Analytical platforms fitting into different
scenarios are integrated as an ecosystem
Ideation Model Build Model Deployment Model Execution Model Monitoring
Zurich Insurance Group (Zurich), headquartered and founded in Switzerland, is a leading multi-
line insurance group with more than 140 years’ experience serving businesses worldwide,
including over 100 years in North America. We are committed to delivering broad and flexible
insurance solutions to our customers and helping them understand, manage and minimize risk.
Through member companies in North America, Zurich is a leading commercial property-casualty
insurance provider serving small businesses, mid-sized and large companies, including
multinational corporations.
▪ Approximately 55,000 employees
▪ Managing complex risks for 7,600 international programs through our global network
▪ Achieving USD 5.3 billion in business operating profit (BOP) in 2019
▪ Providing comprehensive solutions and insights for 25 industries
▪ Insuring more than 215,500 customers
▪ Insuring more than 90 percent of the Fortune 500
The Alation Data Catalog and its logo is used with kind permission of Alation, Inc.
The Dataiku DSS and its logo is used with kind permission of Dataiku, Inc.
The Domino Data Lab and its logo is used with kind permission of Domino Data Lab, Inc.
Use of them does not endorse the products.

 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 

Cloud and Analytics -- 2020 sparksummit

  • 1. Cloud and Analytics – from Platforms to an Ecosystem. Ming Yuan, Zurich North America; David Carlson, Databricks
  • 2. Agenda ▪ Data and Analytics at ZNA ▪ Data and Metadata ▪ Data Exploration and ETL ▪ Containerization ▪ DevOps in Analytics
  • 3. Zurich is a data-enabled innovative company • Data is used in day-to-day decision making in key business domains • A strong data science team delivers predictive models and business insights • We are an early adopter of advanced analytics and cloud analytics • Governance processes on data access and utilization are established • Metadata is collected and stored in the repository system [Diagram labels: Multiple Databases, On-premises Data Warehouse, Hadoop Data Lake, Cloud Data Lake]
  • 4. Key capabilities support data analytics life cycle • Data Discovery • Data Integration • Collaboration • Business Impact (Operationalization) • Scalability • Multiple Personas • Support multiple types of implementations [Diagram labels: Ideation, Model Build, Model Deployment, Model Execution, Model Monitoring]
  • 5. Data foundation and processing power ▪ Support ML and advanced analysis to discover business insights and drive appropriate actions ▪ Enable cross-domain data sharing, aggregation, and integration ▪ Modernize the technical landscape to handle data sets that were previously unprocessable ▪ Optimize data processing and archiving strategies to reduce operating costs ▪ Apply data governance best practices to manage utilization
  • 6. Data lake consists of ADLS and Databricks® clusters. Provisioning Store zones: Landing (Change Data Capture (CDC) or full snapshots); Staging (enrich Landing zone data with additional date format fields and remove special characters); Active (CDC records (I, U, D) applied to a copy of the previous day's data); Archive (rolling pointers to previous day's Active…). Curation Layer: Universal Data Model (enterprise-level curated datasets covering broad utilization); Curated Data Sets (pertaining to the needs of a specific business domain). [Other diagram labels: Data Sources, Data Consumption, Azure Subscription, Services]
  • 7. Metadata management and data discovery ▪ For metadata administrators: ▪ Maintain the business glossary for data domains that are owned by functions or business units ▪ Import technical metadata and catalog it as data assets ▪ Curate technical metadata by relating it to logical business terms ▪ Maintain data flows mapping transformations ▪ For data consumers: ▪ Search, explore, and discover data assets and data lineage ▪ Interpret data with correct meaning and context ▪ Navigate data flows to analyze processes and assess change impact ▪ Evaluate data quality reports and drive improvement actions
  • 8. Alation® Data Catalog manages metadata ingestion. Sources: Database, Data Warehouse, Cloud Data Lake, JSON, Streams. Capabilities: ingest and refresh schema, table, and column definitions; build data lineage, popularity, common queries, and more; profile and store sample data sets; collect user information and usage metrics; open APIs to programmatically import business glossaries. [Figure shown on slide: 2,053,632]
  • 9. Intuitive user interfaces to access metadata • Users and Stewards actively curate the pages • Natural-language search to easily discover unknowns • Everyone collaborates and communicates • Query intelligently against source systems
  • 10. Data exploration and ETL implementations ▪ Explore, validate, and analyze existing data sets ▪ Curate new data sets for model development ▪ Construct ETL flows with embedded AI/modeling components ▪ Release ETL flows to the production environment ▪ Provide runtime environments to trigger, manage, and monitor ETL flows in production
  • 11. Leverage technical stack and skills across Personas • Linux server on Azure Cloud: centralized server to facilitate access to data and foster collaboration • Centralized or ad-hoc data sources, data lake • Available or spun-up processing resources: leveraging best storage and compute resources • Production systems: Dataiku deployment servers for enterprise-grade operationalization • Browser-based user interfaces: user/task-specific interaction modes • Integration with metadata system
  • 12. Containerization in building model API services ▪ Standardize the runtime environment using commonly used ML libraries for development and production ▪ Elastically scale the system capacity for the development environment ▪ Easily migrate system stacks from development environment to production ▪ Build CI/CD pipelines and deployment environments based on open standards ▪ Monitor and ensure the health of model implementations in production
  • 13. Containerize models as cloud-native applications • We observed improved agility in development, more portability in deployment, and better elasticity in production [Diagram labels: Client App, Client App, Orchestration]
  • 14. DevOps in data & analytics ▪ For platform administrators: ▪ Codify the installation and configuration of key components in the ecosystem ▪ Streamline the process of testing and upgrading systems to newer versions ▪ Automate system backup and restoration ▪ For model services developers: ▪ Standardize the deployment pipelines to reduce the effort per project ▪ Increase the agility of deploying applications from development to production ▪ Reduce the time to fix bugs after production releases
  • 15. CI/CD processes accelerate app deployments [Diagram labels: Dev, QA, and Prod environments; Azure Code Repos; Azure Pipeline (Build); Azure Pipeline (Release); Azure Container Registry; Azure App Services]
  • 16. Analytical platforms fitting into different scenarios are integrated as an ecosystem [Diagram labels: Ideation, Model Build, Model Deployment, Model Execution, Model Monitoring]
  • 17. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.
  • 18. Zurich Insurance Group (Zurich), headquartered and founded in Switzerland, is a leading multi-line insurance group with more than 140 years’ experience serving businesses worldwide, including over 100 years in North America. We are committed to delivering broad and flexible insurance solutions to our customers and helping them understand, manage and minimize risk. Through member companies in North America, Zurich is a leading commercial property-casualty insurance provider serving small businesses, mid-sized and large companies, including multinational corporations. • Approximately 55,000 employees • Managing complex risks for 7,600 international programs through our global network • Achieving USD 5.3 billion in business operating profit (BOP) in 2019 • Providing comprehensive solutions and insights for 25 industries • Insuring more than 215,500 customers • Insuring more than 90 percent of the Fortune 500. The Alation Data Catalog and its logo is used with kind permission of Alation, Inc. The Dataiku DSS and its logo is used with kind permission of Dataiku, Inc. The Domino Data Lab and its logo is used with kind permission of Domino Data Lab, Inc. Use of them does not endorse the products.
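
Slide 6 describes the Active zone as CDC records (I, U, D) applied to a copy of the previous day's data. The sketch below illustrates that idea only and is not the presenters' implementation. It assumes that I, U, and D stand for insert, update, and delete, that every record carries a unique key, and that inserts and updates can both be treated as upserts. The function name `apply_cdc` and the sample policy rows are hypothetical. On a Databricks cluster this step would more likely be expressed as a table merge; plain dictionaries are used here to keep the example self-contained.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]
# Assumed record shape: (operation, key, row), with operation in {"I", "U", "D"}.
CdcRecord = Tuple[str, str, Row]


def apply_cdc(previous_active: Dict[str, Row],
              cdc_records: Iterable[CdcRecord]) -> Dict[str, Row]:
    """Return today's Active snapshot from yesterday's snapshot plus CDC records."""
    # Work on a copy so the previous day's snapshot stays intact.
    active = {key: dict(row) for key, row in previous_active.items()}
    for op, key, row in cdc_records:
        if op in ("I", "U"):
            # Assumption: insert and update are both applied as an upsert.
            active[key] = dict(row)
        elif op == "D":
            # Deleting a key that is already absent is treated as a no-op.
            active.pop(key, None)
        else:
            raise ValueError(f"unknown CDC operation: {op!r}")
    return active


# Hypothetical sample data.
previous = {"p1": {"premium": 100}, "p2": {"premium": 250}}
changes = [
    ("U", "p1", {"premium": 120}),
    ("D", "p2", {}),
    ("I", "p3", {"premium": 90}),
]
today = apply_cdc(previous, changes)
print(today)  # {'p1': {'premium': 120}, 'p3': {'premium': 90}}
```

Copying the previous snapshot before applying changes mirrors the slide's wording ("copy of previous day's data") and leaves the earlier snapshot available, which is what an Archive zone of rolling pointers would need.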