This document provides an overview of a webinar on data virtualization and the Denodo platform. The webinar agenda includes an introduction to adaptive data architectures and data virtualization, benefits of data virtualization, a demo of the Denodo platform, and a question and answer session. Key takeaways are that traditional data integration technologies do not support today's complex, distributed data environments, while data virtualization provides a way to access and integrate data across multiple sources.
2. Speakers
Paul Moxon
SVP Data Architecture & Chief Evangelist
Denodo
Mark Pritchard
Director, EMEA Sales Engineering
Denodo
3. Agenda
1. The Need for Adaptive Data Architectures
2. What is Data Virtualization?
3. Benefits of Data Virtualization
4. Denodo Platform 8.0 Demo
5. Key Takeaways
6. Q&A
7. Next Steps
4. Data Integration – A Journey Through Time…
Diagram labels: Data Sources, Data Ingestion, Staging, Data Transformation, External Data, Consumers
New Data Sources: web logs, click stream, geolocation data, social networks, sensor data, machine-generated data
7. Adaptive Data Architectures
• Organizations need an adaptive data architecture
• An architecture that can flex and adapt to new technologies, new data sources, new formats, new protocols, new data uses, etc. while minimizing the impact on the consumers
• Future-proofs the architecture
• We can’t predict what technologies will emerge in the next 3-5 years (or 5-10 years), but we can build architectures that will accommodate them
• Allows users to access new data and new technologies using existing, familiar tools
• e.g. read data from a Parquet file using Excel (via the Data Virtualization Platform)
• A Data Fabric, built on Data Virtualization, provides this adaptability, protects your existing technology investments, and de-risks the adoption of new, emerging technologies
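The adaptability described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the Denodo platform is implemented: `CsvSource`, `JsonLinesSource`, `VirtualView`, and `consumer_report` are hypothetical names, and CSV and JSON Lines text stand in for the "old" and "new" storage technologies (the slide's own example is a Parquet file read from Excel).

```python
import csv
import io
import json
from typing import Dict, Iterable, List, Protocol

Row = Dict[str, str]


class Source(Protocol):
    """Anything that can produce rows; the only contract the view relies on."""

    def rows(self) -> Iterable[Row]: ...


class CsvSource:
    """Hypothetical legacy source: comma-separated text."""

    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterable[Row]:
        return csv.DictReader(io.StringIO(self._text))


class JsonLinesSource:
    """Hypothetical newer source: one JSON object per line."""

    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterable[Row]:
        for line in self._text.splitlines():
            if line.strip():
                yield {k: str(v) for k, v in json.loads(line).items()}


class VirtualView:
    """Stable view that consumers query; its backing source can be swapped."""

    def __init__(self, source: Source) -> None:
        self._source = source

    def rebind(self, source: Source) -> None:
        self._source = source

    def select(self, *columns: str) -> List[Row]:
        return [{c: row[c] for c in columns} for row in self._source.rows()]


def consumer_report(view: VirtualView) -> List[str]:
    """Consumer code: knows the view and its columns, not the storage format."""
    return sorted(r["country"] for r in view.select("country"))


legacy = CsvSource("id,country\n1,FR\n2,DE\n")
modern = JsonLinesSource('{"id": 1, "country": "FR"}\n{"id": 2, "country": "DE"}\n')

customers = VirtualView(legacy)
before = consumer_report(customers)
customers.rebind(modern)  # the technology changes underneath the view
after = consumer_report(customers)
print(before == after)
```

The consumer function is unchanged across the swap, which is the property the slide calls minimizing the impact on the consumers.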
8. Adaptive Data Architecture
Diagram labels:
Reporting, Analytics, Data Science, Data Market Place, Data Monetization, AI/ML
iPaaS, Kafka, ETL, CDC, Sqoop, Flume
Raw Data Zone, Staging Area, Curated Data Zone, Core DWH model, Data Warehouse, Data Lake
Data Virtualization Platform: Analytical Views, Data Science Views, λ Views, Real-Time Views, DWH Views, Hybrid Views, Cloud Views
Universal Catalog of Data Services, Centralized Access Control
Enterprise Data Fabric
9. Source: “Gartner Market Guide for Data Virtualization, November 16, 2018”
Data virtualization can be used to create virtualized and integrated views of data in-memory rather than executing data movement and physically storing integrated views in a target data structure. It provides a layer of abstraction above the physical implementation of data, to simplify query logic.
10. What is Data Virtualization?
1. Connect to disparate data sources
2. Combine related data into views
3. Consume in business applications
Diagram labels:
DATA CONSUMERS: Enterprise Applications, Reporting, BI, Portals, ESB, Mobile, Web, Users
DISPARATE DATA SOURCES: Databases & Warehouses, Cloud/SaaS Applications, Big Data, NoSQL, Web, XML, Excel, PDF, Word...
Analytical, Operational; Less Structured, More Structured
CONNECT, COMBINE, PUBLISH: Multiple Protocols, Formats; Query, Search, Browse; Request/Reply, Event Driven; Secure Delivery; SQL, MDX; Web Services; Big Data APIs; Web Automation and Indexing
CONNECT: Normalized views of disparate data
COMBINE: Discover, Transform, Prepare, Improve Quality, Integrate
CONSUME: Share, Deliver, Publish, Govern, Collaborate
“Data virtualization integrates disparate data sources in real time or near-real time to meet demands for analytics and transactional data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015
11. How Does It Work?
Diagram: views are layered from the sources up to the consumers.
Data Source Layer: Base Views (abstraction)
Business Layer: Derived Views and Unified Views (transformation & cleansing)
Application Layer: Customer 360 View, Virtual Data Mart View
Access interfaces: JDBC/ODBC/ADO.Net, SOAP / REST WS
Platform services: Development Lifecycle Mgmt, Monitoring & Audit, Governance, Security, Development Tools and SDK, Scheduled Tasks, Data Caching, Query Optimizer
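The layering in this slide (base views, then derived views, then a unified view such as Customer 360) can be approximated with ordinary SQL views. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in: all table and view names are hypothetical, and the SQLite tables hold sample rows locally, whereas in a data virtualization platform the base views would point at remote sources.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Data source layer: stand-ins for base views (hypothetical names).
    CREATE TABLE bv_crm_customer (cust_id INTEGER, name TEXT, country TEXT);
    CREATE TABLE bv_erp_orders_eu (cust_id INTEGER, amount REAL);
    CREATE TABLE bv_erp_orders_us (cust_id INTEGER, amount REAL);
    INSERT INTO bv_crm_customer VALUES (1, ' Ana ', 'ES'), (2, 'Bob', 'US');
    INSERT INTO bv_erp_orders_eu VALUES (1, 100.0), (1, 50.0);
    INSERT INTO bv_erp_orders_us VALUES (2, 70.0);

    -- Business layer: a derived view that cleanses (trims customer names).
    CREATE VIEW dv_customer AS
        SELECT cust_id, TRIM(name) AS name, country FROM bv_crm_customer;

    -- Business layer: a derived view that unions two regional sources.
    CREATE VIEW dv_orders AS
        SELECT cust_id, amount FROM bv_erp_orders_eu
        UNION ALL
        SELECT cust_id, amount FROM bv_erp_orders_us;

    -- Application layer: a unified view built with a join and an aggregation.
    CREATE VIEW customer_360 AS
        SELECT c.cust_id, c.name, c.country, SUM(o.amount) AS total_spend
        FROM dv_customer AS c
        JOIN dv_orders AS o ON o.cust_id = c.cust_id
        GROUP BY c.cust_id, c.name, c.country;
    """
)

# The consumer queries only the top-level view; the views are evaluated at query time.
rows = conn.execute(
    "SELECT name, country, total_spend FROM customer_360 ORDER BY cust_id"
).fetchall()
print(rows)
```

None of the views store data; each query against `customer_360` is resolved down through the derived views to the base tables when it runs.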
12. Data Virtualization Connects the Users to the Data That They Need
Cliffs Notes version (TL;DR)
1. Data Virtualization allows you to connect to (almost) any data source
2. You can combine and transform that data into the format needed by the consumer
3. The data can be exposed to the consumers in a format and through an interface that they can use
• Consumers typically keep the tools that they already use; they don’t have to learn new tools and skills to access the data
4. All of this can be done without copying or moving the data
• The data stays in the original sources (databases, applications, files, etc.) and is retrieved in real time, on demand
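The four points above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: `crm_source`, `billing_source`, and `combine` are hypothetical names, the sources are generator functions rather than real systems, and `fetch_log` exists only to show when the sources are actually read.

```python
import json
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]
fetch_log: List[str] = []  # records each time a source is actually read


def crm_source() -> Iterator[Row]:
    """Connect: hypothetical CRM source, read only when iterated."""
    fetch_log.append("crm")
    yield {"cust_id": 1, "name": "Ana"}
    yield {"cust_id": 2, "name": "Bob"}


def billing_source() -> Iterator[Row]:
    """Connect: hypothetical billing source, read only when iterated."""
    fetch_log.append("billing")
    yield {"cust_id": 1, "balance": 20.0}
    yield {"cust_id": 2, "balance": 0.0}


def combine(
    left: Callable[[], Iterator[Row]],
    right: Callable[[], Iterator[Row]],
    key: str,
) -> Callable[[], List[Row]]:
    """Combine: define a joined view; nothing is fetched until it is queried."""

    def run() -> List[Row]:
        lookup = {row[key]: row for row in right()}
        return [{**row, **lookup[row[key]]} for row in left() if row[key] in lookup]

    return run


customer_balance = combine(crm_source, billing_source, "cust_id")
defined_without_fetching = fetch_log == []  # defining the view read no data

result = customer_balance()  # both sources are read now, on demand
payload = json.dumps(result)  # Consume: hand the rows to a client as JSON
print(payload)
```

Each call to `customer_balance()` goes back to the sources, so the view holds no copy of the data between queries.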
13. Decoupling Business from IT
IT: Flexible Source Architecture
IT can now move at a slower speed without affecting the business
Business: Flexible Tool Choice
Business can now make faster and more sophisticated decisions, as all data is accessible by any tool of choice
14. Benefits of Using Data Virtualization
• For Business Users
• Simplicity: Users don’t need to navigate the complexity of the architecture: where the data is (on-prem, cloud, multi-cloud), how to access it, or which location has priority
• Agility: All data is securely delivered from a single (virtual) system
• Accessibility: Data is accessible through a variety of interfaces (SQL, REST, OData, GraphQL) and in a web-based Data Catalog, regardless of original format and location
• Common Semantic Layer: All users see the same definitions and data, providing data consistency
• Governed Self-Service: Users can use their own tools (BYOT) to access and query data that is governed, secure, and trusted
15. Benefits of Using Data Virtualization
• For IT
• Abstraction: Decouples storage and processing engines from the delivery of data
• Flexibility: Allows IT to change technologies and move data without service interruptions
• Security: Centralized governance and security controls for all data assets
• Governance: The data accessed by the users can be governed, secured, and managed so that users are accessing known, trusted, and approved data sets
• Accelerated Delivery: Because data is not replicated to a staging area or data mart before use, it is significantly quicker (up to 90% quicker) to deliver the data needed by the users
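As one way to picture the centralized security point, the sketch below applies a single column-masking policy inside the view layer, so every consumer gets the same controls regardless of which tool issues the query. The roles, the `POLICY` table, and the `query` function are hypothetical and chosen for illustration; they do not describe the Denodo platform's actual security model.

```python
from typing import Dict, List, Set

Row = Dict[str, object]

# Hypothetical policy: the columns each role may see unmasked.
POLICY: Dict[str, Set[str]] = {
    "analyst": {"cust_id", "country"},
    "support": {"cust_id", "country", "email"},
}


def customer_view() -> List[Row]:
    """Stand-in for a view over the underlying sources (sample rows only)."""
    return [
        {"cust_id": 1, "country": "ES", "email": "ana@example.com"},
        {"cust_id": 2, "country": "US", "email": "bob@example.com"},
    ]


def query(role: str) -> List[Row]:
    """Single entry point: the policy is enforced here, not in each client tool."""
    allowed = POLICY.get(role)
    if allowed is None:
        raise PermissionError(f"role {role!r} has no access to this view")
    return [
        {col: (val if col in allowed else "***") for col, val in row.items()}
        for row in customer_view()
    ]


print(query("analyst")[0])
print(query("support")[0])
```

Because the rule lives in one place, changing it changes what every consumer sees without touching the sources or the client tools.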
16. Data Virtualization Use Cases
From Data Storage & Management, to Data Consumers, going through Data Governance & Security
Diagram bands: Data Consumption; Data Governance, Manipulation & Access; Data Storage & Management
Use cases and capabilities shown: Real-time Decisions, K.Y.C. (Customer 360), Self-Service Analytics, Data Science (ML & AI), Apps (Mobile & web), Mergers & Acquisitions, Data Marketplace, Compliance (IFRS17, GRC), Data Security, APIfication (& SQLification), Semantic Layer, Agility & Simplicity, Real-time Delivery, Data Abstraction, Zero Replication, Data Governance, Sophisticated Optimizations, Logical Data Warehouse, Enterprise Data Fabric, Hybrid Data Fabric, Data Integration, Cloud Modernization, Refactoring & Replatforming
Consumers shown: Sales, HR, Executive, Marketing, Apps/API, Data Science, AI/ML
18. Demo Scenario
Distributed Data:
▪ Historical sales data offloaded to a Hadoop cluster for cheaper storage
▪ Marketing campaigns managed in an external cloud app
▪ Customer details table, stored in the DW
1) On-board and expose distributed data through a single logical layer.
2) Publish a logical view calculating the impact of a new marketing campaign by country.
Diagram labels: Sources (Sales, Campaign, Customer); Base View (Source Abstraction); Combine, Transform & Integrate; Consume (Sales Evolution)
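The shape of the demo's logical view can be sketched in plain Python. Everything here is an assumption made for illustration: the three functions stand in for base views over the Hadoop cluster, the cloud app, and the DW; the sample rows are invented; and "impact" is taken to mean total sales per country inside the campaign's date window, which the slide does not define.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List


def sales_base_view() -> List[dict]:
    """Stand-in for historical sales offloaded to the Hadoop cluster."""
    return [
        {"cust_id": 1, "day": date(2020, 5, 2), "amount": 100.0},
        {"cust_id": 2, "day": date(2020, 5, 3), "amount": 40.0},
        {"cust_id": 1, "day": date(2020, 4, 1), "amount": 999.0},  # before the campaign
        {"cust_id": 3, "day": date(2020, 5, 9), "amount": 60.0},
    ]


def campaign_base_view() -> List[dict]:
    """Stand-in for campaigns managed in the external cloud app."""
    return [{"campaign": "spring", "start": date(2020, 5, 1), "end": date(2020, 5, 31)}]


def customer_base_view() -> List[dict]:
    """Stand-in for the customer details table in the DW."""
    return [
        {"cust_id": 1, "country": "ES"},
        {"cust_id": 2, "country": "US"},
        {"cust_id": 3, "country": "ES"},
    ]


def campaign_sales_by_country(campaign_name: str) -> Dict[str, float]:
    """Logical view: join the three sources at query time and total sales by country."""
    campaign = next(c for c in campaign_base_view() if c["campaign"] == campaign_name)
    country_of = {c["cust_id"]: c["country"] for c in customer_base_view()}
    totals: Dict[str, float] = defaultdict(float)
    for sale in sales_base_view():
        in_window = campaign["start"] <= sale["day"] <= campaign["end"]
        if in_window and sale["cust_id"] in country_of:
            totals[country_of[sale["cust_id"]]] += sale["amount"]
    return dict(totals)


impact = campaign_sales_by_country("spring")
print(impact)
```

The consumer asks one question of one view; which system holds sales, campaigns, or customers stays hidden behind the base-view functions.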
20. Key Takeaways
1. Information architectures are getting more complex, more diverse, and more distributed.
2. Traditional technologies and data replication don’t cut it anymore.
3. Data virtualization makes it quick and easy to expose data from multiple sources to your users while still maintaining governance and security…
4. …and enables a wide range of use cases, from self-service analytics to data marketplaces to regulatory reporting and compliance.