SlideShare ist ein Scribd-Unternehmen logo
1 von 39
Downloaden Sie, um offline zu lesen
Designing Fast Data Architecture
for Big Data using Logical Data
Warehouse and Data Lakes
A Case Study presented by Kurt Jackson
Platform Lead, Autodesk
Kurt Jackson
Platform Lead
Ravi Shankar
Chief Marketing
Officer
Speakers
Agenda
1. Towards a Logical Data Lake – An Autodesk Case Study
2. Performance Considerations in Logical Data
Warehouse/ Lakes
3. Q&A and Next Steps
© 2016 Autodesk | Enterprise Information Services
Towards the Logical Data Lake
Kurt Jackson
Platform Lead
© 2016 Autodesk | Enterprise Information Services 5
What is a Logical Data Warehouse?
 A logical data warehouse is a data system that follows the ideas of
traditional EDW (star or snowflake schemas) and includes, in
addition to one (or more) core DWs, data from external sources.
 The main motivations are improved decision making and/or cost
reduction
© 2016 Autodesk | Enterprise Information Services 6
What about the Logical Data Lake?
 A Data Lake will not have a star or snowflake schema, but rather a
more heterogeneous collection of views with raw data from
heterogeneous sources
 The virtual layer will act as a common umbrella under which these
different sources are presented to the end user as a single system
 However, from the virtualization perspective, a Virtual Data Lake
shares many technical aspects with a LDW and most of these
contents also apply to a Logical Data Lake
© 2016 Autodesk | Enterprise Information Services
Introduction
© 2016 Autodesk | Enterprise Information Services 8
Agile
Data
Architecture
Logical Data
Warehouse
Enterprise
Access Point
Data
Virtualization
Logical Data
Lake
Agile Data Architecture Lifecycle
Rapid, iterative
delivery
Modeling of
business
concepts
Curated,
governed, fit for
purpose data
Stable, system-
independent,
durable views
Single source of
data across the
enterprise
Enterprise
governance and
access control
© 2016 Autodesk | Enterprise Information Services
Business Problem
© 2016 Autodesk | Enterprise Information Services 10
© 2016 Autodesk | Enterprise Information Services 11
Most of us are in the same boat
© 2016 Autodesk | Enterprise Information Services
The Autodesk Agile Data Architecture
© 2016 Autodesk | Enterprise Information Services 13
Philosophy
 Access and refine data
near the source
 Published logical data
interfaces
 Implementing interfaces is
only an IT concern
 Agile and opportunistic
retirement of legacy
systems
© 2016 Autodesk | Enterprise Information Services 14
Autodesk Data Architecture
© 2016 Autodesk | Enterprise Information Services 15
The journey to the Logical Data Lake Data virtualization can be used
throughout your data pipeline!
© 2016 Autodesk | Enterprise Information Services 16
Big Data Ecosystem
© 2016 Autodesk | Enterprise Information Services
Towards the Logical Data Lake
© 2016 Autodesk | Enterprise Information Services 18
Logical Data Warehouse (LDW)
 Usability
 Single repository for schema
definitions
 Integrity
 Only published views are
publically available
 Business ownership
guarantees the quality
Agile
Data
Architectur
e
Logical
Data
Warehouse
Enterprise
Access
Point
Data
Virtualization
Logical
Data Lake
© 2016 Autodesk | Enterprise Information Services 19
The Enterprise Access Point
 Availability
 A single access point to
administer
 Security
 A single point for
authentication,
authorization and audit trail
Agile
Data
Architectur
e
Logical
Data
Warehouse
Enterprise
Access
Point
Data
Virtualization
Logical
Data Lake
© 2016 Autodesk | Enterprise Information Services 20
Data Virtualization
 System-agnostic
 Data contract independent of
implementation
 Enables the agile enterprise
 Aggressively and
opportunistically refactor
enterprise systems
 Manifest support for federation
and data blending
Agile
Data
Architectur
e
Logical
Data
Warehouse
Enterprise
Access
Point
Data
Virtualization
Logical
Data Lake
© 2016 Autodesk | Enterprise Information Services 21
Logical Data Lake
 Beyond the traditional
Enterprise Data Lake
 Combine traditional data lakes
with other data
 Blending of disparate,
heterogeneous data sources
 Internal, external, 3rd party…
 Approaches a true, single
origin for all enterprise data
Agile
Data
Architectur
e
Logical
Data
Warehouse
Enterprise
Access
Point
Data
Virtualization
Logical
Data Lake
© 2016 Autodesk | Enterprise Information Services 22
Towards the Logical Data Lake implements the philosophy
 Access and refine data near the
source
 No painful ETL pipelines for data
derivation
 Published logical data interfaces
 Single access point for all of external
data sets
 Implementing interfaces is only an IT
concern
 Accelerate to best of breed solutions
 Agile and opportunistic retirement of
legacy systems
 Rip and replace becomes almost
transparent
© 2016 Autodesk | Enterprise Information Services
Building the Agile Data Architecture at Autodesk
© 2016 Autodesk | Enterprise Information Services 24
Implementation Approach
 Identify enterprise data sources
 Harder than you think
 All new streaming, highly-available
ingestion mechanism
 Self-service or nearly so
 Stream-based facilitates batch and
streaming data processing
 Leverage highly-redundant cloud
storage for the data lake
 S3
 Leverage best-of breed for individual
components
 Open source, selected commercial
vendors
 Develop canonical representations for your
data sets
 Very difficult – consider a CDO
 Virtualize the data warehouses and marts
with a next generation Logical DW
 New implementations leverage the
LDW
 Legacy migrates opportunistically
© 2016 Autodesk | Enterprise Information Services 25
Architecting the Data Virtualization Layer
Corporate
LDAP
Data Consumer
Data Sources
Data
DV Instance 1
Source
Repository
Code
Logging Infrastructure
Data
Audit
Audit
CI/CD
DV Instance n
…
© 2016 Autodesk | Enterprise Information Services 26
Build an Information Architecture
 Base views to abstract data sources
 Layered derived views to reflect successively refined
derivations
 Create the notion of publication for curated, externally
visible views
 Expose services on top of views to make views more
accessible
 Separate namespaces (schemas) by project or
subject area
 Build the notion of commonality for views shared
across schemas
 Naming conventions for all objects
 Data portal for one-stop shopping for data consumers
© 2016 Autodesk | Enterprise Information Services 27
Towards the Logical Data Lake can
be a liberating journey!
Performance Considerations in
Logical Data Warehouse/ Lakes
Ravi Shankar, CMO
Denodo
Query Optimization in Data Virtualization
Real-time Data Integration
“Data virtualization integrates disparate data sources in real time or near-real time
to meet demands for analytics and transactional data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015
Publishes
the data to applications
Combines
related data into views
Connects
to disparate data sources
2
3
1
Dynamic Query Rewriting for Big Data
30
1. Full Aggregation Pushdown
Total sales for each product in the last 2 years
Dynamic Query Rewriting for Big Data
31
2. Partition Pruning
Total sales for each product in the last 9 months
Dynamic Query Rewriting for Big Data
32
3. Partial Aggregation Pushdown
Total sales for each product category in the last 9 months
2. Total Sales by Product – Data Warehouse1. Products and their Categories – Products DB
3. Join both tables and aggregate – Data Virtualization
Dynamic Query Rewriting for Big Data
33
4. On the Fly Data Movement
Maximum discount for each product in the last 9 months
34
Compares the performance of a federated approach in Denodo with an MPP system where
all the data has been replicated via ETL
Customer Dim.
2 M rows
Sales Facts
290 M rows
Items Dim.
400 K rows
* TPC-DS is the de-facto industry standard benchmark for measuring the performance of
decision support solutions including, but not limited to, Big Data systems.
vs.
Sales Facts
290 M rows
Items Dim.
400 K rows
Customer Dim.
2 M rows
Performance Comparison
Logical Data Warehouse vs. Physical Data Warehouse
35
Performance Comparison
Query Description
Returned
Rows
Time Netezza
Time Denodo
(Federated Oracle,
Netezza & SQL Server)
Optimization Technique
(automatically selected)
Total sales by customer 1.99 M 20.9 sec. 21.4 sec. Full aggregation push-down
Total sales by customer and
year between 2000 and 2004
5.51 M 52.3 sec. 59.0 sec Full aggregation push-down
Total sales by item brand 31.35 K 4.7 sec. 5.0 sec. Partial aggregation push-down
Total sales by item where
sale price less than current
list price
17.05 K 3.5 sec. 5.2 sec On the fly data movement
Logical Data Warehouse vs. Physical Data Warehouse
Dynamic Query Optimizer
36
Delivers Breakthrough Performance for Big Data, Logical Data Warehouse, and
Operational Scenarios
Dynamically determines lowest-cost query execution plan based on
statistics
Factors in all the special characteristics of big data sources such as
number of processing units and partitions
Can easily handle any number of incremental queries
Enables connectivity to the broadest array of big data sources such
as Redshift, Impala, Spark.
Best dynamic query optimization engine in the industry.
Kurt Jackson
Platform Lead
Ravi Shankar
Chief Marketing
Officer
Q&A
Next Steps
View Educational Seminar (on-demand): Logical Data
Warehouse, Data Lakes, and Data Services Marketplace
Visit: www.denodo.com
Read about Data Virtualization for Logical Data Warehouse and
Data Lakes
Visit: denodo.com/en/solutions/horizontal-solutions/logical-data-warehouse
Get Started! Download Denodo Express
Visit: www.denodo.com/en/denodo-platform/denodo-express
Access Denodo Platform on AWS
Visit: www.denodo.com/en/denodo-platform/denodo-platform-for-aws
Kurt Jackson
Platform Lead
Ravi Shankar
Chief Marketing
Officer
Thank You

Weitere ähnliche Inhalte

Was ist angesagt?

Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and moreDenodo
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitecturePerficient, Inc.
 
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your ProductDell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your ProductManuel "Manny" Rodriguez-Perez
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseBui Ha
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Accelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationAccelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationDenodo
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014MapR Technologies
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Denodo
 
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesBig Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesDenodo
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Denodo
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Data Con LA
 
Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Denodo
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprisekayalvizhi kandasamy
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Cloudera, Inc.
 
Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?Denodo
 
The Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameThe Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameCloudera, Inc.
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouseStephen Alex
 
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Denodo
 

Was ist angesagt? (20)

Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and more
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
 
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your ProductDell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data Warehouse
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Accelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationAccelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and Visualization
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
 
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesBig Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data Lakes
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
 
Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprise
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
 
Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?
 
The Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameThe Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the Same
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
Analyst View of Data Virtualization: Conversations with Boulder Business Inte...
 

Andere mochten auch

Building Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
Building Reactive Fast Data & the Data Lake with Akka, Kafka, SparkBuilding Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
Building Reactive Fast Data & the Data Lake with Akka, Kafka, SparkTodd Fritz
 
Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
Reactive Fast Data & the Data Lake with Akka, Kafka, SparkReactive Fast Data & the Data Lake with Akka, Kafka, Spark
Reactive Fast Data & the Data Lake with Akka, Kafka, SparkTodd Fritz
 
Exploring cloud for data warehousing
Exploring cloud for data warehousingExploring cloud for data warehousing
Exploring cloud for data warehousingmark madsen
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouseKrish_ver2
 
Software Architectures, Week 5 - Advanced Architectures
Software Architectures, Week 5 - Advanced ArchitecturesSoftware Architectures, Week 5 - Advanced Architectures
Software Architectures, Week 5 - Advanced ArchitecturesAngelos Kapsimanis
 
Whitepaper-The-Data-Lake-3_0
Whitepaper-The-Data-Lake-3_0Whitepaper-The-Data-Lake-3_0
Whitepaper-The-Data-Lake-3_0Jane Roberts
 
Information Management Training Courses & Certification
Information Management Training Courses & CertificationInformation Management Training Courses & Certification
Information Management Training Courses & CertificationChristopher Bradley
 
Architecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for successArchitecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for successDataWorks Summit
 
Is the Data asset really different?
Is the Data asset really different?Is the Data asset really different?
Is the Data asset really different?Christopher Bradley
 
Information Management best_practice_guide
Information Management best_practice_guideInformation Management best_practice_guide
Information Management best_practice_guideChristopher Bradley
 
Information is at the heart of all architecture disciplines
Information is at the heart of all architecture disciplinesInformation is at the heart of all architecture disciplines
Information is at the heart of all architecture disciplinesChristopher Bradley
 
Information Management Training & Certification
Information Management Training & CertificationInformation Management Training & Certification
Information Management Training & CertificationChristopher Bradley
 
One Large Data Lake, Hold the Hype
One Large Data Lake, Hold the HypeOne Large Data Lake, Hold the Hype
One Large Data Lake, Hold the HypeJared Winick
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...Institute of Contemporary Sciences
 
Information Management Capabilities, Competencies & Staff Maturity Assessment
Information Management Capabilities, Competencies & Staff Maturity AssessmentInformation Management Capabilities, Competencies & Staff Maturity Assessment
Information Management Capabilities, Competencies & Staff Maturity AssessmentChristopher Bradley
 
Selecting Data Management Tools - A practical approach
Selecting Data Management Tools - A practical approachSelecting Data Management Tools - A practical approach
Selecting Data Management Tools - A practical approachChristopher Bradley
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)Spark Summit
 
SimplifyStreamingArchitecture
SimplifyStreamingArchitectureSimplifyStreamingArchitecture
SimplifyStreamingArchitectureMaheedhar Gunturu
 

Andere mochten auch (20)

Code camp2012
Code camp2012Code camp2012
Code camp2012
 
Building Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
Building Reactive Fast Data & the Data Lake with Akka, Kafka, SparkBuilding Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
Building Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
 
Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
Reactive Fast Data & the Data Lake with Akka, Kafka, SparkReactive Fast Data & the Data Lake with Akka, Kafka, Spark
Reactive Fast Data & the Data Lake with Akka, Kafka, Spark
 
Exploring cloud for data warehousing
Exploring cloud for data warehousingExploring cloud for data warehousing
Exploring cloud for data warehousing
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
 
Accumulo design
Accumulo designAccumulo design
Accumulo design
 
Software Architectures, Week 5 - Advanced Architectures
Software Architectures, Week 5 - Advanced ArchitecturesSoftware Architectures, Week 5 - Advanced Architectures
Software Architectures, Week 5 - Advanced Architectures
 
Whitepaper-The-Data-Lake-3_0
Whitepaper-The-Data-Lake-3_0Whitepaper-The-Data-Lake-3_0
Whitepaper-The-Data-Lake-3_0
 
Information Management Training Courses & Certification
Information Management Training Courses & CertificationInformation Management Training Courses & Certification
Information Management Training Courses & Certification
 
Architecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for successArchitecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for success
 
Is the Data asset really different?
Is the Data asset really different?Is the Data asset really different?
Is the Data asset really different?
 
Information Management best_practice_guide
Information Management best_practice_guideInformation Management best_practice_guide
Information Management best_practice_guide
 
Information is at the heart of all architecture disciplines
Information is at the heart of all architecture disciplinesInformation is at the heart of all architecture disciplines
Information is at the heart of all architecture disciplines
 
Information Management Training & Certification
Information Management Training & CertificationInformation Management Training & Certification
Information Management Training & Certification
 
One Large Data Lake, Hold the Hype
One Large Data Lake, Hold the HypeOne Large Data Lake, Hold the Hype
One Large Data Lake, Hold the Hype
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 
Information Management Capabilities, Competencies & Staff Maturity Assessment
Information Management Capabilities, Competencies & Staff Maturity AssessmentInformation Management Capabilities, Competencies & Staff Maturity Assessment
Information Management Capabilities, Competencies & Staff Maturity Assessment
 
Selecting Data Management Tools - A practical approach
Selecting Data Management Tools - A practical approachSelecting Data Management Tools - A practical approach
Selecting Data Management Tools - A practical approach
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
 
SimplifyStreamingArchitecture
SimplifyStreamingArchitectureSimplifyStreamingArchitecture
SimplifyStreamingArchitecture
 

Ähnlich wie Designing Fast Data Architecture for Big Data using Logical Data Warehouse and Data Lakes

Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItDenodo
 
Denodo DataFest 2016: The Governed Data Lake – Putting Big Data to Work
Denodo DataFest 2016: The Governed Data Lake – Putting Big Data to WorkDenodo DataFest 2016: The Governed Data Lake – Putting Big Data to Work
Denodo DataFest 2016: The Governed Data Lake – Putting Big Data to WorkDenodo
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationDenodo
 
Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Denodo
 
Connecting Silos in Real Time with Data Virtualization
Connecting Silos in Real Time with Data VirtualizationConnecting Silos in Real Time with Data Virtualization
Connecting Silos in Real Time with Data VirtualizationDenodo
 
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATIONBig Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATIONMatt Stubbs
 
Enabling Self-Service Analytics with Logical Data Warehouse
Enabling Self-Service Analytics with Logical Data WarehouseEnabling Self-Service Analytics with Logical Data Warehouse
Enabling Self-Service Analytics with Logical Data WarehouseDenodo
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data VirtualizationDAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data VirtualizationDenodo
 
Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure Abhimanyu Singhal
 
Virtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesVirtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesDenodo
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)Denodo
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Denodo
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)Denodo
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Denodo
 
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...Denodo
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionDenodo
 
Data Virtualization for Data Architects (New Zealand)
Data Virtualization for Data Architects (New Zealand)Data Virtualization for Data Architects (New Zealand)
Data Virtualization for Data Architects (New Zealand)Denodo
 
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Denodo
 

Ähnlich wie Designing Fast Data Architecture for Big Data using Logical Data Warehouse and Data Lakes (20)

Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need It
 
Denodo DataFest 2016: The Governed Data Lake – Putting Big Data to Work
Denodo DataFest 2016: The Governed Data Lake – Putting Big Data to WorkDenodo DataFest 2016: The Governed Data Lake – Putting Big Data to Work
Denodo DataFest 2016: The Governed Data Lake – Putting Big Data to Work
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow Presentation
 
Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)Data Virtualization. An Introduction (ASEAN)
Data Virtualization. An Introduction (ASEAN)
 
Connecting Silos in Real Time with Data Virtualization
Connecting Silos in Real Time with Data VirtualizationConnecting Silos in Real Time with Data Virtualization
Connecting Silos in Real Time with Data Virtualization
 
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATIONBig Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
Big Data LDN 2018: CONNECTING SILOS IN REAL-TIME WITH DATA VIRTUALIZATION
 
Enabling Self-Service Analytics with Logical Data Warehouse
Enabling Self-Service Analytics with Logical Data WarehouseEnabling Self-Service Analytics with Logical Data Warehouse
Enabling Self-Service Analytics with Logical Data Warehouse
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data VirtualizationDAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
 
Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure
 
Virtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesVirtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & Bénéfices
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
 
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An Introduction
 
Data Virtualization for Data Architects (New Zealand)
Data Virtualization for Data Architects (New Zealand)Data Virtualization for Data Architects (New Zealand)
Data Virtualization for Data Architects (New Zealand)
 
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
 

Mehr von Denodo

Enterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoEnterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoDenodo
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachDenodo
 
Achieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerAchieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerDenodo
 
What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?Denodo
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeDenodo
 
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Denodo
 
Drive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDrive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDenodo
 
Знакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхЗнакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхDenodo
 
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationData Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationDenodo
 
Denodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo
 
Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Denodo
 
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardIt’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardDenodo
 
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Denodo
 
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Denodo
 
How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?Denodo
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsDenodo
 
Enabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityEnabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityDenodo
 
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo
 
GenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesGenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesDenodo
 

Mehr von Denodo (20)

Enterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoEnterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in Denodo
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
 
Achieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerAchieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services Layer
 
What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business Landscape
 
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
 
Drive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDrive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory Compliance
 
Знакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхЗнакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данных
 
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationData Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
 
Denodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me Anything
 
Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!
 
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardIt’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
 
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
 
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
 
How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
 
Enabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityEnabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usability
 
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
 
GenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesGenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidades
 

Kürzlich hochgeladen

Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 

Kürzlich hochgeladen (20)

Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 

Designing Fast Data Architecture for Big Data using Logical Data Warehouse and Data Lakes

  • 1. Designing Fast Data Architecture for Big Data using Logical Data Warehouse and Data Lakes A Case Study presented by Kurt Jackson Platform Lead, Autodesk
  • 2. Kurt Jackson Platform Lead Ravi Shankar Chief Marketing Officer Speakers
  • 3. Agenda 1. Towards a Logical Data Lake – An Autodesk Case Study 2. Performance Considerations in Logical Data Warehouse/ Lakes 3. Q&A and Next Steps
  • 4. © 2016 Autodesk | Enterprise Information Services Towards the Logical Data Lake Kurt Jackson Platform Lead
  • 5. © 2016 Autodesk | Enterprise Information Services 5 What is a Logical Data Warehouse?  A logical data warehouse is a data system that follows the ideas of traditional EDW (star or snowflake schemas) and includes, in addition to one (or more) core DWs, data from external sources.  The main motivations are improved decision making and/or cost reduction
  • 6. © 2016 Autodesk | Enterprise Information Services 6 What about the Logical Data Lake?  A Data Lake will not have a star or snowflake schema, but rather a more heterogeneous collection of views with raw data from heterogeneous sources  The virtual layer will act as a common umbrella under which these different sources are presented to the end user as a single system  However, from the virtualization perspective, a Virtual Data Lake shares many technical aspects with a LDW and most of these contents also apply to a Logical Data Lake
  • 7. © 2016 Autodesk | Enterprise Information Services Introduction
  • 8. © 2016 Autodesk | Enterprise Information Services 8 Agile Data Architecture Logical Data Warehouse Enterprise Access Point Data Virtualization Logical Data Lake Agile Data Architecture Lifecycle Rapid, iterative delivery Modeling of business concepts Curated, governed, fit for purpose data Stable, system- independent, durable views Single source of data across the enterprise Enterprise governance and access control
  • 9. © 2016 Autodesk | Enterprise Information Services Business Problem
  • 10. © 2016 Autodesk | Enterprise Information Services 10
  • 11. © 2016 Autodesk | Enterprise Information Services 11 Most of us are in the same boat
  • 12. © 2016 Autodesk | Enterprise Information Services The Autodesk Agile Data Architecture
  • 13. © 2016 Autodesk | Enterprise Information Services 13 Philosophy  Access and refine data near the source  Published logical data interfaces  Implementing interfaces is only an IT concern  Agile and opportunistic retirement of legacy systems
  • 14. © 2016 Autodesk | Enterprise Information Services 14 Autodesk Data Architecture
  • 15. © 2016 Autodesk | Enterprise Information Services 15 The journey to the Logical Data Lake Data virtualization can be used throughout your data pipeline!
  • 16. © 2016 Autodesk | Enterprise Information Services 16 Big Data Ecosystem
  • 17. © 2016 Autodesk | Enterprise Information Services Towards the Logical Data Lake
  • 18. © 2016 Autodesk | Enterprise Information Services 18 Logical Data Warehouse (LDW)  Usability  Single repository for schema definitions  Integrity  Only published views are publically available  Business ownership guarantees the quality Agile Data Architectur e Logical Data Warehouse Enterprise Access Point Data Virtualization Logical Data Lake
  • 19. © 2016 Autodesk | Enterprise Information Services 19 The Enterprise Access Point  Availability  A single access point to administer  Security  A single point for authentication, authorization and audit trail Agile Data Architectur e Logical Data Warehouse Enterprise Access Point Data Virtualization Logical Data Lake
  • 20. © 2016 Autodesk | Enterprise Information Services 20 Data Virtualization  System-agnostic  Data contract independent of implementation  Enables the agile enterprise  Aggressively and opportunistically refactor enterprise systems  Manifest support for federation and data blending Agile Data Architectur e Logical Data Warehouse Enterprise Access Point Data Virtualization Logical Data Lake
  • 21. © 2016 Autodesk | Enterprise Information Services 21 Logical Data Lake  Beyond the traditional Enterprise Data Lake  Combine traditional data lakes with other data  Blending of disparate, heterogeneous data sources  Internal, external, 3rd party…  Approaches a true, single origin for all enterprise data Agile Data Architectur e Logical Data Warehouse Enterprise Access Point Data Virtualization Logical Data Lake
  • 22. © 2016 Autodesk | Enterprise Information Services 22 Towards the Logical Data Lake implements the philosophy  Access and refine data near the source  No painful ETL pipelines for data derivation  Published logical data interfaces  Single access point for all of external data sets  Implementing interfaces is only an IT concern  Accelerate to best of breed solutions  Agile and opportunistic retirement of legacy systems  Rip and replace becomes almost transparent
  • 23. © 2016 Autodesk | Enterprise Information Services Building the Agile Data Architecture at Autodesk
  • 24. © 2016 Autodesk | Enterprise Information Services 24 Implementation Approach  Identify enterprise data sources  Harder than you think  All new streaming, highly-available ingestion mechanism  Self-service or nearly so  Stream-based facilitates batch and streaming data processing  Leverage highly-redundant cloud storage for the data lake  S3  Leverage best-of breed for individual components  Open source, selected commercial vendors  Develop canonical representations for your data sets  Very difficult – consider a CDO  Virtualize the data warehouses and marts with a next generation Logical DW  New implementations leverage the LDW  Legacy migrates opportunistically
  • 25. © 2016 Autodesk | Enterprise Information Services 25 Architecting the Data Virtualization Layer Corporate LDAP Data Consumer Data Sources Data DV Instance 1 Source Repository Code Logging Infrastructure Data Audit Audit CI/CD DV Instance n …
  • 26. © 2016 Autodesk | Enterprise Information Services 26 Build an Information Architecture  Base views to abstract data sources  Layered derived views to reflect successively refined derivations  Create the notion of publication for curated, externally visible views  Expose services on top of views to make views more accessible  Separate namespaces (schemas) by project or subject area  Build the notion of commonality for views shared across schemas  Naming conventions for all objects  Data portal for one-stop shopping for data consumers
  • 27. © 2016 Autodesk | Enterprise Information Services 27 Towards the Logical Data Lake can be a liberating journey!
  • 28. Performance Considerations in Logical Data Warehouse/ Lakes Ravi Shankar, CMO Denodo
  • 29. Query Optimization in Data Virtualization Real-time Data Integration “Data virtualization integrates disparate data sources in real time or near-real time to meet demands for analytics and transactional data.” – Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015 Publishes the data to applications Combines related data into views Connects to disparate data sources 2 3 1
  • 30. Dynamic Query Rewriting for Big Data 30 1. Full Aggregation Pushdown Total sales for each product in the last 2 years
  • 31. Dynamic Query Rewriting for Big Data 31 2. Partition Pruning Total sales for each product in the last 9 months
  • 32. Dynamic Query Rewriting for Big Data 32 3. Partial Aggregation Pushdown Total sales for each product category in the last 9 months 2. Total Sales by Product – Data Warehouse1. Products and their Categories – Products DB 3. Join both tables and aggregate – Data Virtualization
  • 33. Dynamic Query Rewriting for Big Data 33 4. On the Fly Data Movement Maximum discount for each product in the last 9 months
  • 34. 34 Compares the performance of a federated approach in Denodo with an MPP system where all the data has been replicated via ETL Customer Dim. 2 M rows Sales Facts 290 M rows Items Dim. 400 K rows * TPC-DS is the de-facto industry standard benchmark for measuring the performance of decision support solutions including, but not limited to, Big Data systems. vs. Sales Facts 290 M rows Items Dim. 400 K rows Customer Dim. 2 M rows Performance Comparison Logical Data Warehouse vs. Physical Data Warehouse
  • 35. 35 Performance Comparison Query Description Returned Rows Time Netezza Time Denodo (Federated Oracle, Netezza & SQL Server) Optimization Technique (automatically selected) Total sales by customer 1.99 M 20.9 sec. 21.4 sec. Full aggregation push-down Total sales by customer and year between 2000 and 2004 5.51 M 52.3 sec. 59.0 sec Full aggregation push-down Total sales by item brand 31.35 K 4.7 sec. 5.0 sec. Partial aggregation push-down Total sales by item where sale price less than current list price 17.05 K 3.5 sec. 5.2 sec On the fly data movement Logical Data Warehouse vs. Physical Data Warehouse
  • 36. Dynamic Query Optimizer 36 Delivers Breakthrough Performance for Big Data, Logical Data Warehouse, and Operational Scenarios Dynamically determines lowest-cost query execution plan based on statistics Factors in all the special characteristics of big data sources such as number of processing units and partitions Can easily handle any number of incremental queries Enables connectivity to the broadest array of big data sources such as Redshift, Impala, Spark. Best dynamic query optimization engine in the industry.
  • 37. Kurt Jackson Platform Lead Ravi Shankar Chief Marketing Officer Q&A
  • 38. Next Steps View Educational Seminar (on-demand): Logical Data Warehouse, Data Lakes, and Data Services Marketplace Visit: www.denodo.com Read about Data Virtualization for Logical Data Warehouse and Data Lakes Visit: denodo.com/en/solutions/horizontal-solutions/logical-data-warehouse Get Started! Download Denodo Express Visit: www.denodo.com/en/denodo-platform/denodo-express Access Denodo Platform on AWS Visit: www.denodo.com/en/denodo-platform/denodo-platform-for-aws
  • 39. Kurt Jackson Platform Lead Ravi Shankar Chief Marketing Officer Thank You