SlideShare a Scribd company logo
1 of 32
1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Modernize Your Existing EDW
With IBM Big SQL &
Hortonworks Data Platform
Nagapriya Tiruthani, Offering Manager, IBM Big SQL
Carter Shanklin, Sr. Director, Product Management, Hortonworks
Roni Fontaine, Director, Product Marketing, Hortonworks
2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Partnership
Modernizing your existing EDW – Big SQL
How Hive helps in the modernization
Using Hive & Big SQL
Resources / Q & A
3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Focus on extending data science, machine learning and Big SQL to analyze the data in
Apache Hadoop systems
Provides Data Science, Machine
Learning & Big SQL
Provides Open Hadoop Data
Platform
Make our clients competitive in their markets
using advanced analytics faster and at scale
+
IBM and Hortonworks Deliver Data Science and Big SQL
Announcement –June 13, 2017
1. IBM standardizes on HDP by
leveraging Hortonworks Data
Platform as the core Hadoop
distribution for Big SQL and DSX
2. Hortonworks introduces New
Product Offerings:
• IBM Data Science
Experience (DSX)
• IBM Big SQL
4
IBM Confidential
Hortonworks Data Platform & IBM BIG SQL
the Bridge for Customers
• #1 Pure Open Source Hadoop
Distribution
• 1000+ customers and 2100+
ecosystem partners
• Compatibility
• Capabilities & Overlap between Hive
(HWX) and Big SQL (IBM)
• Leader in SQL technology for Hadoop
• Leader in on premise and hybrid cloud
data and analytics solutions
• Federation
• High Performance
• Improved Concurrence
• Complex queries
• Enterprise security features
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
IBM Cloud
Modernizing your EDW
IBM Cloud
Advanced Analytics
Speed
Scalability
Cost
Productivity
• Build a centralized data repository (virtual/physical) to support business
decisions.
• Bring together siloed data available either locally or by federation to gain a
variety of benefits
• Modern data warehouse architecture enables deeper analytics and
advanced reporting from diverse sets of data.
Benefits
• Cost efficiency
• Deeper analytics (predictive, proactive
and historical analytics)
• Faster and efficient to do the analytics
• Enrich or augment warehouse data
What is EDW Modernization?
IBM Cloud
Want to modernize
your EDW without
long and costly
migration efforts
Offloading historical
data from Oracle,
Db2, Netezza
because reaching
capacity
Operationalize
machine learning
Need to query,
optimize and
integrate multiple
data sources from
one single endpoint
Slow query
performance for SQL
workloads
Require skill set to
migrate data from
RDBMS to
Hadoop/Hive
Do you have any of these challenges?
IBM Cloud
 Compatible with Oracle, Db2 & Netezza SQL syntax
 Modernizing EDW workloads on Hadoop has never been easier
 Application portability (eg: Cognos, Tableau, MicroStrategy,…)
 Federates all your data behind a single SQL engine
 Query Hive, Spark and HBase data from a single endpoint
 Federate your Hadoop data using connectors to Teradata, Oracle, Db2 & more
 Query data sources that have Spark connectors
 Addresses a skillset gap needed to migrate technologies
 Delivers high performance & concurrency for BI workloads
 Unlock Hadoop data with analytics tools of choice
 Provides greater security while accessing data
 Robust SQL based row filtering, column masking with role-based access control and
Ranger integration
 Operationalize machine learning through integration with Spark
 Bi-directional integration with Spark exploits Spark’s connectors as well as ML
capabilities
Here’s How Big SQL Addresses These Challenges
IBM Cloud
Big SQL queries heterogeneous systems in a single query - only SQL-on-Hadoop that virtualizes more than 10
different data sources: RDBMS, NoSQL, HDFS or Object Store
Big SQL
Fluid Query (federation)
Oracle
SQL
Server
Teradata DB2
Netezza
(PDA) Informix
Microsoft
SQL Server
Hive HBase HDFS
Object Store
(S3)
WebHDFS
Big SQL allows query federation by virtualizing data sources and processing where data resides
Hortonworks Data Platform (HDP)
Data Virtualization
IBM Cloud
 Automatic memory management and workload management handles
complex query execution without failures
 Executes all 99 TPC-DS queries in single stream and multi-stream environment
 Easy porting of enterprise applications
 Ability to work seamlessly with Business Intelligence tools like Cognos,
Tableau, etc. to gain insights
Oracle
SQL
DB2
SQL
Netezza
SQL
Big SQL
SQL syntax tolerance (ANSI SQL Compliant)
Cognos Analytics
InfoSphere Metadata Asset Manager
Big SQL is a synergetic SQL engine that offers SQL compatibility, portability and collaborative
ability to get composite analysis on data
Data Offloading and Analytics
IBM Cloud
BRANCH_A FINANCE
(security admin)BRANCH_B
Role Based Access Control
enables separation
of Duties / Audit
Row Level Security
Row and Column Level Security
Big SQL offers row and column level access control (RBAC) among other security settings
Data Security
IBM Cloud
Big SQL
Head
NM NM NM NM NM NM
HDFS
Slider Client
YARN
Resource
Manager
&
Scheduler
Big SQL
AM
Users
Big
SQL
Work
er
Big
SQL
Work
er
Big
SQL
Work
er
Big
SQL
Work
er
Big
SQL
Work
er
Big
SQL
Work
er Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Worker
Launch Multiple Workers per Host
More Granular Elasticity
For both 1 and 10 TB TPC-DS dataset
2 Workers/Node: 1.6x speedup
4 Workers/Node: 2.2x speedup
In each scenario,
the same TOTAL
CPU/memory is
used
Performance: Big SQL Elastic Boost
IBM Cloud
Enhancements to ORC file format has improved performance when compared to previous releases and it at
par with Parquet file format
(on v5.0.1)(on v4.2)
Performance: Parquet vs ORC File Format
IBM Cloud
BigSQLworker
Sparkexecutor
Share data in
memory
Spark 2.1 is a powerful analytic co-processor that complements
the rich SQL functionality of Big SQL
Tight integration with Spark enables Big SQL
worker and Spark Executor to communicate in
memory without writing to disk
Bi-directional integration allows Spark
jobs can be executed from Big SQL
HDFS
Big SQL is a self-tuning memory management SQL engine that integrates with Spark 2.1
Spark Integration
IBM Cloud
Data Ingestion
Data Transformation/
Data Science/
Machine Learning
Data Visualization
Leverage Big SQL throughout your journey
Virtualize disparate data sources
like Hadoop, RDBMS, and Object
Stores (S3) to join data in a single
query
Manipulate data and
operationalize data science
models written in various
languages
Perform data discovery, analyze,
and visualize business results in
notebooks or other BI tools
Democratize Data Science and Machine Learning
IBM Cloud
Big SQL
Archive Data away from EDW
• Move cold or rarely used data to Hadoop
as active archive
• Store more of data longer
Offload costly ETL process
• Free your EDW to perform high-value functions like analytics
& operations, not ETL
• Use Hadoop for advanced ETL
Optimize the value of your EDW
• Use Hadoop to refine new data sources, such as web and
machine data for new analytical context
• Access old data in traditional RDBMS using Big SQL’s
federation
• Combine data in different silos without duplication
Exploit Spark to avail latest technologies
• Spark integration brings data from NoSQL databases and
other non-traditional sources
• Make Spark’s use cases secure with granular security
defined in Big SQL
Reduce the migration effort & skillset gap
• Use existing investment in Oracle, Db2 and Netezza
technology skills
• Big SQL allows you to migrate applications with out major
code rewrites and additional SQL development
ANALYTICSDATASYSTEMS
Data
Marts
Business
Analytics
Visualization
& Dashboards
Systems of Record
RDBMS
ERP
CRM
Other
Clickstream Web & Social Geolocation Sensor
& Machine
Server
Logs
Unstructured
NEWSOURCES
HDP 2.6
ELT
°
°
°
°
°
°
°
°
°
°
°
°
°
°
°
°
°
°
°
N
Cold Data,
Deeper Archive
& New Sources
Enterprise Data
Warehouse
Hot
Realize Cost Savings Faster with Big SQL
Top Use case – EDW Optimization
18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
How Hive Helps in the
Modernization
19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
HDP 2.6: A Major Milestone for Apache Hive
 Major Improvements:
– Hive LLAP Now GA
– ACID MERGE
– SQL: All 99 TPC-DS out-of-the-box with only
trivial rewrites
– Hive View 2.0: Great Features for DBAs
– Diagnostics: Tez UI Total Timeline View
– Hive OLAP Indexes powered by Druid
 At a High Level:
– 1200+ features, improvements and bug fixes
in Hive since HDP 2.5.
– 400+ of these from outside of Hortonworks.
820
413
From
Hortonworks
From
Community
HDP 2.6 Improvements
Hive LLAP GA+
SQL MERGE+
All TPC-DS Queries+
20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Hive LLAP: MPP Performance at Hadoop Scale
Deep
Storage
YARN Cluster
LLAP Daemon
Query
Executors
LLAP Daemon
Query
Executors
LLAP Daemon
Query
Executors
LLAP Daemon
Query
Executors
Query
Coordinators
Coord-
inator
Coord-
inator
Coord-
inator
HiveServer2
(Query
Endpoint)
ODBC /
JDBC
SQL
Queries In-Memory Cache
(Shared Across All Users)
HDFS and
Compatible
S3 WASB Isilon
21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
HDP 2.6 Makes Hadoop Data Management a Reality with SQL MERGE
 Hive implements ANSI-standard SQL MERGE.
 MERGE makes data maintenance 8x simpler with 5x higher performance.
 Legacy Hive or Spark approaches don’t protect applications against dirty reads or partial
failures.
Complexity of a Type 2 SCD Update with and without MERGE
Number of
Queries
Number of
Full Table Scans
Isolation
Applications
Protected From
Partial Failures
Hive MERGE 1 1 Yes Yes
Old Techniques 8 5 No No
22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Comprehensive SQL in Hive Including All 99 TPC-DS Queries
 Multiple and Scalar Subqueries
 INTERSECT and EXCEPT
 Standard syntax for ROLLUP /
GROUPING
 Syntax improvements for GROUP BY
and ORDER BY
 In HDP 2.6+ Hive runs all 99 TPC-DS
with only trivial re-writes.
Highlights
23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Announcing: Druid is Now GA in HDP 2.6.3
✓ Analyze Streaming and Historical Data with SQL
✓ Powerful Visualization
✓ Simple management and monitoring with Ambari
✓ Fine-grained security
✓ Integrates with Hortonworks SAM for simple development
24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
OLAP Analytics in Milliseconds with Hive over Druid
.0
.5
1.0
1.5
2.0
2.5
3.0
Q1.1 Q1.2 Q1.3 Q2.1 Q2.2 Q2.3 Q3.1 Q3.2 Q3.3 Q3.4 Q4.1 Q4.2 Q4.3
Runtime(s)
Star Schema Benchmark 1TB Scale with Hive over 10 Druid
Nodes
https://hortonworks.com/blog/apache-hive-druid-part-1-3/#comment-24833
25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Runtime Comparison: HDP 2.5 and HDP 2.6 (Lower is Better)
1
10
100
1000
10000 query32
query64
query84
query18
query95
query58
query50
query82
query88
query28
query90
query45
query94
query27
query96
query17
query13
query26
query34
query3
query93
query40
query39
query66
query71
query85
query76
query89
query12
query46
query19
query48
query7
query42
query21
query43
query73
query92
query49
query68
query55
query98
query20
query87
query91
query97
query15
query52
query25
query79
Runtime(s)(LogScale) Runtime Comparison: HDP 2.5 to HDP 2.6
Legacy TPC-DS Queries, 10TB, 9 Nodes
HDP 2.5 HDP 2.6
26 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Using Hive & Big SQL
27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Application Portability & Integration
Data shared with Hadoop ecosystem
Comprehensive file format support
Superior enablement of IBM and Third Party software
Performance
Modern MPP runtime
Powerful SQL query rewriter
Cost based optimizer
Optimized for concurrent user throughput
Results not constrained by memory
Federation
Distributed requests to multiple data sources within a
single SQL statement
Main data sources supported:
DB2, Teradata, Oracle, Netezza,
Informix, SQL Server
Enterprise Features
Advanced security/auditing
Resource and workload management
Self tuning memory management
Comprehensive monitoring
Rich SQL
Comprehensive SQL Support
IBM SQL PL compatibility
Extensive Analytic Functions
How Big SQL Complements HDP
28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Breaking Things Down: Where IBM Big SQL Shines
Run Oracle, DB2 or Netezza Workloads on Hadoop+
Federate Hadoop and Non-Hadoop Data+
Complex SQL Workloads+
29 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Breaking Things Down: Where Apache Hive Shines
Fast SQL That Scales from Terabytes to Petabytes+
Easy to Keep Data Fresh with ACID MERGE+
Join Historical and Streaming Data in Real Time+
30 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Resources
31 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Resources
 HWX Big SQL Web Page: lhttps://hortonworks.com/partners/ibm-bigsql/
 Big SQL Solutions Sheet: https://2xbbhjxc6wk3v21p62t8n4d4-
wpengine.netdna-ssl.com/wp-content/uploads/2017/08/IBM-Big-SQL-
Solution-Sheet_final.pdf
 Big SQL Data Sheet https://2xbbhjxc6wk3v21p62t8n4d4-wpengine.netdna-
ssl.com/wp-content/uploads/2017/08/IBM-Big-SQL-Datasheet_final.pdf
 Big SQL Resources
– Big SQL Web Page/Sandbox https://www.ibm.com/us-en/marketplace/big-sql
– Big SQL Master Class Videos
https://www.youtube.com/playlist?list=PL7FnN5oi7Ez9itAnZ6rs9A30YYjVB1wN_
 Big SQL Blog: https://hortonworks.com/blog/big-sql-apache-hadoop-across-
enterprise/
32 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Questions?

More Related Content

What's hot

Presentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_finalPresentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_finalDiego Alberto Tamayo
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Hortonworks
 
Hadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureHadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureDataWorks Summit
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Data Centric Transformation in Telecom
Data Centric Transformation in TelecomData Centric Transformation in Telecom
Data Centric Transformation in TelecomDataWorks Summit
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?DataWorks Summit
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoptionHortonworks
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsHortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Machine Learning Everywhere
Machine Learning EverywhereMachine Learning Everywhere
Machine Learning EverywhereDataWorks Summit
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakHortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
Powering Big Data Success On-Prem and in the Cloud
Powering Big Data Success On-Prem and in the CloudPowering Big Data Success On-Prem and in the Cloud
Powering Big Data Success On-Prem and in the CloudHortonworks
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the GlobeDataWorks Summit
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
 

What's hot (20)

Presentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_finalPresentacin webinar move_up_to_power8_with_scale_out_servers_final
Presentacin webinar move_up_to_power8_with_scale_out_servers_final
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
Hadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and FutureHadoop Operations – Past, Present, and Future
Hadoop Operations – Past, Present, and Future
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical Enterprise
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Data Centric Transformation in Telecom
Data Centric Transformation in TelecomData Centric Transformation in Telecom
Data Centric Transformation in Telecom
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Machine Learning Everywhere
Machine Learning EverywhereMachine Learning Everywhere
Machine Learning Everywhere
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
Powering Big Data Success On-Prem and in the Cloud
Powering Big Data Success On-Prem and in the CloudPowering Big Data Success On-Prem and in the Cloud
Powering Big Data Success On-Prem and in the Cloud
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data Analytics
 
10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
OpenPOWER Update
OpenPOWER UpdateOpenPOWER Update
OpenPOWER Update
 

Similar to Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform

TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...
TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...
TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...tdc-globalcode
 
Big SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor LandscapeBig SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor LandscapeNicolas Morales
 
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive sessionMicrosoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive sessionTravis Wright
 
The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)Cloudera, Inc.
 
Accelerating Big Data Analytics
Accelerating Big Data AnalyticsAccelerating Big Data Analytics
Accelerating Big Data AnalyticsAttunity
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Precisely
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSStéphane Fréchette
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016MLconf
 
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoWhat's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoDataWorks Summit
 
Presentation big dataappliance-overview_oow_v3
Presentation   big dataappliance-overview_oow_v3Presentation   big dataappliance-overview_oow_v3
Presentation big dataappliance-overview_oow_v3xKinAnx
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the OrganizationSeeling Cheung
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanJim Kaskade
 
Ibm integrated analytics system
Ibm integrated analytics systemIbm integrated analytics system
Ibm integrated analytics systemModusOptimum
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseRizaldy Ignacio
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Martin Bém
 
Edw Optimization Solution
Edw Optimization Solution Edw Optimization Solution
Edw Optimization Solution Hortonworks
 

Similar to Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform (20)

TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...
TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...
TDC2017 | POA Trilha BigData - IBM BigSQL - Engine de consulta de dados de al...
 
Big SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor LandscapeBig SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor Landscape
 
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive sessionMicrosoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive session
 
The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)
 
Accelerating Big Data Analytics
Accelerating Big Data AnalyticsAccelerating Big Data Analytics
Accelerating Big Data Analytics
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
 
What's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - TokyoWhat's New in Apache Hive 3.0 - Tokyo
What's New in Apache Hive 3.0 - Tokyo
 
Presentation big dataappliance-overview_oow_v3
Presentation   big dataappliance-overview_oow_v3Presentation   big dataappliance-overview_oow_v3
Presentation big dataappliance-overview_oow_v3
 
Ibm db2 big sql
Ibm db2 big sqlIbm db2 big sql
Ibm db2 big sql
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps Ironfan
 
Twitter with hadoop for oow
Twitter with hadoop for oowTwitter with hadoop for oow
Twitter with hadoop for oow
 
Ibm integrated analytics system
Ibm integrated analytics systemIbm integrated analytics system
Ibm integrated analytics system
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
 
Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27Prague data management meetup 2018-03-27
Prague data management meetup 2018-03-27
 
Azure Big data
Azure Big data Azure Big data
Azure Big data
 
Edw Optimization Solution
Edw Optimization Solution Edw Optimization Solution
Edw Optimization Solution
 

More from Hortonworks

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's NewHortonworks
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidHortonworks
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseHortonworks
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationHortonworks
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementHortonworks
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive DataHortonworks
 
5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of DataHortonworks
 
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateExploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateHortonworks
 

More from Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data
 
5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data
 
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateExploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
 

Recently uploaded

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Recently uploaded (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform

  • 1. 1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Modernize Your Existing EDW With IBM Big SQL & Hortonworks Data Platform Nagapriya Tiruthani, Offering Manager, IBM Big SQL Carter Shanklin, Sr. Director, Product Management, Hortonworks Roni Fontaine, Director, Product Marketing, Hortonworks
  • 2. 2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda Partnership Modernizing your existing EDW – Big SQL How Hive helps in the modernization Using Hive & Big SQL Resources / Q & A
  • 3. 3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Focus on extending data science, machine learning and Big SQL to analyze the data in Apache Hadoop systems Provides Data Science, Machine Learning & Big SQL Provides Open Hadoop Data Platform Make our clients competitive in their markets using advanced analytics faster and at scale + IBM and Hortonworks Deliver Data Science and Big SQL Announcement –June 13, 2017 1. IBM standardizes on HDP by leveraging Hortonworks Data Platform as the core Hadoop distribution for Big SQL and DSX 2. Hortonworks introduces New Product Offerings: • IBM Data Science Experience (DSX) • IBM Big SQL
  • 4. 4 IBM Confidential Hortonworks Data Platform & IBM BIG SQL the Bridge for Customers • #1 Pure Open Source Hadoop Distribution • 1000+ customers and 2100+ ecosystem partners • Compatibility • Capabilities & Overlap between Hive (HWX) and Big SQL (IBM) • Leader in SQL technology for Hadoop • Leader in on premise and hybrid cloud data and analytics solutions • Federation • High Performance • Improved Concurrence • Complex queries • Enterprise security features
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  • 7. IBM Cloud Advanced Analytics Speed Scalability Cost Productivity • Build a centralized data repository (virtual/physical) to support business decisions. • Bring together siloed data available either locally or by federation to gain a variety of benefits • Modern data warehouse architecture enables deeper analytics and advanced reporting from diverse sets of data. Benefits • Cost efficiency • Deeper analytics (predictive, proactive and historical analytics) • Faster and efficient to do the analytics • Enrich or augment warehouse data What is EDW Modernization?
  • 8. IBM Cloud Want to modernize your EDW without long and costly migration efforts Offloading historical data from Oracle, Db2, Netezza because reaching capacity Operationalize machine learning Need to query, optimize and integrate multiple data sources from one single endpoint Slow query performance for SQL workloads Require skill set to migrate data from RDBMS to Hadoop/Hive Do you have any of these challenges?
  • 9. IBM Cloud  Compatible with Oracle, Db2 & Netezza SQL syntax  Modernizing EDW workloads on Hadoop has never been easier  Application portability (eg: Cognos, Tableau, MicroStrategy,…)  Federates all your data behind a single SQL engine  Query Hive, Spark and HBase data from a single endpoint  Federate your Hadoop data using connectors to Teradata, Oracle, Db2 & more  Query data sources that have Spark connectors  Addresses a skillset gap needed to migrate technologies  Delivers high performance & concurrency for BI workloads  Unlock Hadoop data with analytics tools of choice  Provides greater security while accessing data  Robust SQL based row filtering, column masking with role-based access control and Ranger integration  Operationalize machine learning through integration with Spark  Bi-directional integration with Spark exploits Spark’s connectors as well as ML capabilities Here’s How Big SQL Addresses These Challenges
  • 10. IBM Cloud Big SQL queries heterogeneous systems in a single query - only SQL-on-Hadoop that virtualizes more than 10 different data sources: RDBMS, NoSQL, HDFS or Object Store Big SQL Fluid Query (federation) Oracle SQL Server Teradata DB2 Netezza (PDA) Informix Microsoft SQL Server Hive HBase HDFS Object Store (S3) WebHDFS Big SQL allows query federation by virtualizing data sources and processing where data resides Hortonworks Data Platform (HDP) Data Virtualization
  • 11. IBM Cloud  Automatic memory management and workload management handles complex query execution without failures  Executes all 99 TPC-DS queries in single stream and multi-stream environment  Easy porting of enterprise applications  Ability to work seamlessly with Business Intelligence tools like Cognos, Tableau, etc. to gain insights Oracle SQL DB2 SQL Netezza SQL Big SQL SQL syntax tolerance (ANSI SQL Compliant) Cognos Analytics InfoSphere Metadata Asset Manager Big SQL is a synergetic SQL engine that offers SQL compatibility, portability and collaborative ability to get composite analysis on data Data Offloading and Analytics
  • 12. IBM Cloud BRANCH_A FINANCE (security admin)BRANCH_B Role Based Access Control enables separation of Duties / Audit Row Level Security Row and Column Level Security Big SQL offers row and column level access control (RBAC) among other security settings Data Security
  • 13. IBM Cloud Big SQL Head NM NM NM NM NM NM HDFS Slider Client YARN Resource Manager & Scheduler Big SQL AM Users Big SQL Work er Big SQL Work er Big SQL Work er Big SQL Work er Big SQL Work er Big SQL Work er Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Worker Launch Multiple Workers per Host More Granular Elasticity For both 1 and 10 TB TPC-DS dataset 2 Workers/Node: 1.6x speedup 4 Workers/Node: 2.2x speedup In each scenario, the same TOTAL CPU/memory is used Performance: Big SQL Elastic Boost
  • 14. IBM Cloud Enhancements to ORC file format has improved performance when compared to previous releases and it at par with Parquet file format (on v5.0.1)(on v4.2) Performance: Parquet vs ORC File Format
  • 15. IBM Cloud BigSQLworker Sparkexecutor Share data in memory Spark 2.1 is a powerful analytic co-processor that complements the rich SQL functionality of Big SQL Tight integration with Spark enables Big SQL worker and Spark Executor to communicate in memory without writing to disk Bi-directional integration allows Spark jobs can be executed from Big SQL HDFS Big SQL is a self-tuning memory management SQL engine that integrates with Spark 2.1 Spark Integration
  • 16. IBM Cloud Data Ingestion Data Transformation/ Data Science/ Machine Learning Data Visualization Leverage Big SQL throughout your journey Virtualize disparate data sources like Hadoop, RDBMS, and Object Stores (S3) to join data in a single query Manipulate data and operationalize data science models written in various languages Perform data discovery, analyze, and visualize business results in notebooks or other BI tools Democratize Data Science and Machine Learning
  • 17. IBM Cloud Big SQL Archive Data away from EDW • Move cold or rarely used data to Hadoop as active archive • Store more of data longer Offload costly ETL process • Free your EDW to perform high-value functions like analytics & operations, not ETL • Use Hadoop for advanced ETL Optimize the value of your EDW • Use Hadoop to refine new data sources, such as web and machine data for new analytical context • Access old data in traditional RDBMS using Big SQL’s federation • Combine data in different silos without duplication Exploit Spark to avail latest technologies • Spark integration brings data from NoSQL databases and other non-traditional sources • Make Spark’s use cases secure with granular security defined in Big SQL Reduce the migration effort & skillset gap • Use existing investment in Oracle, Db2 and Netezza technology skills • Big SQL allows you to migrate applications with out major code rewrites and additional SQL development ANALYTICSDATASYSTEMS Data Marts Business Analytics Visualization & Dashboards Systems of Record RDBMS ERP CRM Other Clickstream Web & Social Geolocation Sensor & Machine Server Logs Unstructured NEWSOURCES HDP 2.6 ELT ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° N Cold Data, Deeper Archive & New Sources Enterprise Data Warehouse Hot Realize Cost Savings Faster with Big SQL Top Use case – EDW Optimization
  • 18. 18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved How Hive Helps in the Modernization
  • 19. 19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDP 2.6: A Major Milestone for Apache Hive  Major Improvements: – Hive LLAP Now GA – ACID MERGE – SQL: All 99 TPC-DS out-of-the-box with only trivial rewrites – Hive View 2.0: Great Features for DBAs – Diagnostics: Tez UI Total Timeline View – Hive OLAP Indexes powered by Druid  At a High Level: – 1200+ features, improvements and bug fixes in Hive since HDP 2.5. – 400+ of these from outside of Hortonworks. 820 413 From Hortonworks From Community HDP 2.6 Improvements Hive LLAP GA+ SQL MERGE+ All TPC-DS Queries+
  • 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Hive LLAP: MPP Performance at Hadoop Scale Deep Storage YARN Cluster LLAP Daemon Query Executors LLAP Daemon Query Executors LLAP Daemon Query Executors LLAP Daemon Query Executors Query Coordinators Coord- inator Coord- inator Coord- inator HiveServer2 (Query Endpoint) ODBC / JDBC SQL Queries In-Memory Cache (Shared Across All Users) HDFS and Compatible S3 WASB Isilon
  • 21. 21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDP 2.6 Makes Hadoop Data Management a Reality with SQL MERGE  Hive implements ANSI-standard SQL MERGE.  MERGE makes data maintenance 8x simpler with 5x higher performance.  Legacy Hive or Spark approaches don’t protect applications against dirty reads or partial failures. Complexity of a Type 2 SCD Update with and without MERGE Number of Queries Number of Full Table Scans Isolation Applications Protected From Partial Failures Hive MERGE 1 1 Yes Yes Old Techniques 8 5 No No
  • 22. 22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Comprehensive SQL in Hive Including All 99 TPC-DS Queries  Multiple and Scalar Subqueries  INTERSECT and EXCEPT  Standard syntax for ROLLUP / GROUPING  Syntax improvements for GROUP BY and ORDER BY  In HDP 2.6+ Hive runs all 99 TPC-DS with only trivial re-writes. Highlights
  • 23. 23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Announcing: Druid is Now GA in HDP 2.6.3 ✓ Analyze Streaming and Historical Data with SQL ✓ Powerful Visualization ✓ Simple management and monitoring with Ambari ✓ Fine-grained security ✓ Integrates with Hortonworks SAM for simple development
  • 24. 24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved OLAP Analytics in Milliseconds with Hive over Druid .0 .5 1.0 1.5 2.0 2.5 3.0 Q1.1 Q1.2 Q1.3 Q2.1 Q2.2 Q2.3 Q3.1 Q3.2 Q3.3 Q3.4 Q4.1 Q4.2 Q4.3 Runtime(s) Star Schema Benchmark 1TB Scale with Hive over 10 Druid Nodes https://hortonworks.com/blog/apache-hive-druid-part-1-3/#comment-24833
  • 25. 25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Runtime Comparison: HDP 2.5 and HDP 2.6 (Lower is Better) 1 10 100 1000 10000 query32 query64 query84 query18 query95 query58 query50 query82 query88 query28 query90 query45 query94 query27 query96 query17 query13 query26 query34 query3 query93 query40 query39 query66 query71 query85 query76 query89 query12 query46 query19 query48 query7 query42 query21 query43 query73 query92 query49 query68 query55 query98 query20 query87 query91 query97 query15 query52 query25 query79 Runtime(s)(LogScale) Runtime Comparison: HDP 2.5 to HDP 2.6 Legacy TPC-DS Queries, 10TB, 9 Nodes HDP 2.5 HDP 2.6
  • 26. 26 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Using Hive & Big SQL
  • 27. 27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Application Portability & Integration Data shared with Hadoop ecosystem Comprehensive file format support Superior enablement of IBM and Third Party software Performance Modern MPP runtime Powerful SQL query rewriter Cost based optimizer Optimized for concurrent user throughput Results not constrained by memory Federation Distributed requests to multiple data sources within a single SQL statement Main data sources supported: DB2, Teradata, Oracle, Netezza, Informix, SQL Server Enterprise Features Advanced security/auditing Resource and workload management Self tuning memory management Comprehensive monitoring Rich SQL Comprehensive SQL Support IBM SQL PL compatibility Extensive Analytic Functions How Big SQL Complements HDP
  • 28. 28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Breaking Things Down: Where IBM Big SQL Shines Run Oracle, DB2 or Netezza Workloads on Hadoop+ Federate Hadoop and Non-Hadoop Data+ Complex SQL Workloads+
  • 29. 29 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Breaking Things Down: Where Apache Hive Shines Fast SQL That Scales from Terabytes to Petabytes+ Easy to Keep Data Fresh with ACID MERGE+ Join Historical and Streaming Data in Real Time+
  • 30. 30 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Resources
  • 31. 31 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Resources  HWX Big SQL Web Page: lhttps://hortonworks.com/partners/ibm-bigsql/  Big SQL Solutions Sheet: https://2xbbhjxc6wk3v21p62t8n4d4- wpengine.netdna-ssl.com/wp-content/uploads/2017/08/IBM-Big-SQL- Solution-Sheet_final.pdf  Big SQL Data Sheet https://2xbbhjxc6wk3v21p62t8n4d4-wpengine.netdna- ssl.com/wp-content/uploads/2017/08/IBM-Big-SQL-Datasheet_final.pdf  Big SQL Resources – Big SQL Web Page/Sandbox https://www.ibm.com/us-en/marketplace/big-sql – Big SQL Master Class Videos https://www.youtube.com/playlist?list=PL7FnN5oi7Ez9itAnZ6rs9A30YYjVB1wN_  Big SQL Blog: https://hortonworks.com/blog/big-sql-apache-hadoop-across- enterprise/
  • 32. 32 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Questions?

Editor's Notes

  1. On June 13, 2017, we were very excited to announce an expansion to our partnership with Hortonworks to bring Data Science to the world on an Open platform. This partnership will bring together IBM's Data Science Experience,  IBM Machine Learning and IBM Big SQL on top of Hortonworks Data Platform,  to form new integrated solutions. These solutions will help extend IBM's leadership in data science and machine learning to everyone from data scientists to business leaders to analyze and manage their data. We specifically announced : Hortonworks adopts IBM Data Science Experience (DSX) as its Data Science Platform. Hortonworks adopts IBM BigSQL as its SQL engine for complex analytics on Hadoop. IBM will leverage Hortonworks HDP as the core Hadoop distribution for BigSQL and DSX
  2. Hello, my name is Jessica Lee and I am the Big SQL Offering Manager. You can see my email on this slide. Today I am going to walk through the pricing and packaging for IBM Big SQL 5.0, and HDP and HDF for IBM (Hortonworks resell products) The last section of this deck will also provide information on BigInsights Entitlement migration
  3. Some of these challenges might sound familiar Moderning your existing EDW without long and costly migration efforts Querying across multiple types of data management Lack ability to query across traditional data warehouse and cloud based data systems Lack of skill set to migrate data from RDBMS to Hadoop/Hive Optimzing and integrating external data sources with existing data sources Offloading and porting workloads from Oracle, Db2 & Netezza Difficult to offload workloads from EDW platforms Performance Slows down once too many users access the system Interactive query performance is unacceptable Operational efficiency around older data warehouse environments Requires tools to help with ease of use and automation to manage workloads and schedule jobs
  4. 4 things to remember Compatible with Oracle, Db2, Netezza Modernizing EDW Workloads on Hadoop App portability Federates data behind a single SQL engine Uses connectors to Teradata, Oracle, Db2, Netezza Addresses a skill gap needed to migrated technologies People hate migrations on rewrite code – interrupts business process – takes other resources to do this work Skills gap to do this. Want an engine that is 100% compatible With Big SQL – you can go into your accounts with Netezza, Oracle, DB2 – they can move part of their RBDMS  Dataware house to hadoop without having to change any code Ask your customer – do you want to optimize your oracle, netezza, db2 workloads by moving them to Hive Sick of Oracle, Netezza?  We got a solution to help you get them out. Finally – it delivers that high performance for complex SQL workloads
  5. Federation consists of the following: a federated server, federated database, data sources, clients which can be users and applications to access both local database and database from data sources. Federation is known for its strengths: - Transparent: Correlate data from local tables and remote data sources, as if all the data is stored locally in the federated database - Extensible: Update data in relational data sources, as if the data is stored in the federated database - Autonomous: Move data to and from relational data sources without interruptions - High Function: Take advantage of the data source processing strengths, by sending requests to the data sources for processing - High Performance: Compensate for SQL limitations at the data source by processing parts of a distributed request at the federated server Below is a list of data sources supported for Big SQL. For the latest information, visit https://www-304.ibm.com/support/entdocview.wss?uid=swg27044495 Data Source Supported Versions Notes DB2 for z/OS® 8.x, 9.x, 10.x DB2®DB2 for LUW. 9.7, 9.8, 10.1, 10.5 Oracle 11g, 11gR1, 11g R2, 12c Teradata 12, 13, 14, 15 Not supported on POWER systems. Netezza 4.6, 5.0, 6.0, 7.2 Not supported on POWER systems. Informix 11.5 Microsoft SQL Server 2012, 2014 ODBC 3.0 or later Big SQL now comes pre-bundled with DataDirect drivers from Progress that eliminates the need for downloading drivers and enables easy setup Spark connectors enhances Big SQL’s connectivity to other data sources and also operationalize machine learning models
  6. Federation consists of the following: a federated server, federated database, data sources, clients which can be users and applications to access both local database and database from data sources. Federation is known for its strengths: - Transparent: Correlate data from local tables and remote data sources, as if all the data is stored locally in the federated database - Extensible: Update data in relational data sources, as if the data is stored in the federated database - Autonomous: Move data to and from relational data sources without interruptions - High Function: Take advantage of the data source processing strengths, by sending requests to the data sources for processing - High Performance: Compensate for SQL limitations at the data source by processing parts of a distributed request at the federated server Below is a list of data sources supported for Big SQL. For the latest information, visit https://www-304.ibm.com/support/entdocview.wss?uid=swg27044495 Data Source Supported Versions Notes DB2 for z/OS® 8.x, 9.x, 10.x DB2®DB2 for LUW. 9.7, 9.8, 10.1, 10.5 Oracle 11g, 11gR1, 11g R2, 12c Teradata 12, 13, 14, 15 Not supported on POWER systems. Netezza 4.6, 5.0, 6.0, 7.2 Not supported on POWER systems. Informix 11.5 Microsoft SQL Server 2012, 2014 ODBC 3.0 or later Big SQL is ANSI compliant, therefore it can run Oracle SQL, DB2 SQL and Netezza SQL. Big SQL tables can be created for data residing in HDFS, Hive, Hbase, Object Store and WebHDFS. Big SQL is aware of remote indexes and table statistics to optimize federated queries. Big SQL provides: a unified view of all your tables, with federated query support to external databases stored optimally as Hive, HBase, or read with Spark - optimized for the expected workload secured under a single security model (including row/column security across all) with the ability to join across all datasets using standard ANSI SQL across all types of tables with Oracle, NZ, DB2 extensions if you prefer using a single database connection and driver No other SQL engine for Hadoop even comes close.
  7. For Albert, from Branch A, only rows from his branch are listed because he has access to records with same branch name as his. Bonnie, from Branch B, can only view records from her branch and does have access to view salary. But Cindy, a security administrator, can view records from all branches and also view all columns. When customers take the time to review the security capabilities of other SQL Engines on Hadoop, they realize a lot of things they are used to are missing. Big SQL supports all of the important features. Separation of Duties – an important feature if you really want to operationalize the environment. Most enterprise customers don’t like having a database “super user” who can manage the environment and also see all data (like root access for the SQL engine). Big SQL can define Database Administrator roles that allow administration of the environment without permitting access to the data, for example. Big SQL also provide row level and column level security using row permissions (CREATE PERMISSION) or column masking (CREATE MASK) on the tables. Note that both Big SQL ROLES (same concept as DB2 or Oracle roles) and OS Groups (Local and LDAP) are supported.
  8. Instead of running one (potentially very large memory) worker per host, elastic boost enables the ability to run multiple smaller Big SQL workers per host. We can start and stop workers fully independent of each other. Therefore, while the same amount of maximum memory/CPU is consumed when all workers are on…. we get much more granularity of resource usage in terms of scaling up and down capacity with more workers. We have found that for INSERT… SELECT FROM workloads (common for EL-T type workloads), elastic boost can improve performance by 1.6X for 2 workers/node and 2.2x for 4 workers per node. Note that for each of these bars, physical CPU, memory, disk, and network is fixed. We are only increasing # of workers. #comment: this chart may be confusing as it looks like we’re comparing ORC vs. Parquet. This is not the case as we’re actually showing two different scale factors (1TB, 10TB) and just trying to get test coverage with the least amount of work. Both formats showed similar improvements in performance. https://developer.ibm.com/hadoop/2017/03/15/logical-big-sql-workers-boost-big-sql-performance/
  9. Big SQL Head node can actually start its own driver program and spawn Spark executors. From there, Spark Executors can “pair up” with Big SQL workers and now workers can communicate with each other, in memory, without first writing to disk. Further, RDD partitions can be fed directly to Big SQL workers without passing through the driver or performing disk I/O. In Big SQL 5.0, Big SQL is tightly integrated with Spark. This integration enables the development of hybrid applications where Spark jobs can be executed from Big SQL SELECT statements and the results efficiently streamed in parallel from Spark to Big SQL. Big SQL applications can treat Spark as a powerful analytic co-processor that complements the rich SQL functionality that is available in Big SQL. Big SQL users, who already enjoy the best SQL performance, richest SQL language, and extensive enterprise capabilities, can also leverage Spark for its non-relational distributed processing capabilities and rich analytic libraries. The capabilities of both engines are seamlessly blended in a single SQL statement, with large volumes of data flowing efficiently between them. A built-in table function can make Spark functionality directly available at the SQL language level, and a Big SQL SELECT statement can invoke Spark jobs directly and process the results of those jobs as though they were tables. For example, you can use the EXECSPARK table function to invoke Spark jobs from Big SQL
  10. This is the most important use case for using Big SQL – for EDW optimization End Result – your customer will realized cost savings faster with Big SQL One of the most unique values that Big SQL brings- Helps reduce the migration effort and skill set gap. What does this mean? Customers can use their existing investment in Oracle, Db2 and Netezza technology skills Big SQL allows customers to migrate their existing applications without major code rewrites and additional SQL development. This is huge!
  11. What distinguishes Big SQL from other SQL-on-Hadoop offerings? Well, I’ve tried to summarize the key characteristics here, categorizing them into 4 broad areas. We’ll dig into many of these features in greater detail later in this presentation. Note to IBMers: Separate deep-dive presentations and other assets are available on ZACS for federation, performance, and other Big SQL features.