SlideShare ist ein Scribd-Unternehmen logo
1 von 12
© 2014 MapR Technologies 1#NoSQLNow @apachedrill © 2014 MapR Technologies#NoSQLNow
Drilling on JSON
© 2014 MapR Technologies 2#NoSQLNow @apachedrill
NoSQL
We don't need no transaction
We don't need no ACID control
No schema in the tables
No limit to the scale out
DBA, leave them JSON alone
Hey DBA, leave them JSON alone
All in all it's just another data in the BASE
All in all it’s just another shard into cloud.
…With apologies to Roger Waters
© 2014 MapR Technologies 3
Martin Fowler says: “aggregate-
oriented”
What you're most likely to access as
a unit.
Key Value Store
 Couchbase
 Riak
 Citrusleaf
 Redis
 BerkeleyDB
 Membrain
 ...
Document
 MongoDB
 CouchDB
 RavenDB
 Couchbase
 ... Graph
 OrientDB
 DEX
 Neo4j
 GraphBase
 ...Wide Column
 HBase
 Hypertable
 Cassandra
 MapR-DB
 ...
NoSQL Landscape
© 2014 MapR Technologies 4
Data landscape is changing
New types of applications
• Social, mobile, Web, “Internet
of Things”, Cloud…
• Iterative/Agile in nature
• More users, more data
New data models & data types
• Flexible Schema/Schema less
• Rapidly changing
• Semi-structured/Nested data
{
"data": [
"id": "X999_Y999",
"from": {
"name": "Tom Brady", "id": "X12"
},
"message": "Looking forward to 2014!",
"actions": [
{
"name": "Comment",
"link": "http://www.facebook.com/X99/posts Y999"
},
{
"name": "Like",
"link": "http://www.facebook.com/X99/posts Y999"
}
],
"type": "status",
"created_time": "2013-08-02T21:27:44+0000",
"updated_time": "2013-08-02T21:27:44+0000"
}
}
JSON
© 2014 MapR Technologies 5
• Pioneering Data Agility for Hadoop
• Apache open source project
• Scale-out execution engine for low-latency queries
• Unified SQL-based API for analytics & operational applications
APACHE DRILL
40+ contributors
150+ years of experience building
databases and distributed systems
© 2014 MapR Technologies 6#NoSQLNow @apachedrill
Zero to Results in 2 Minutes (3 Commands)
$ tar xzf apache-drill.tar.gz
$ apache-drill/bin/sqlline -u jdbc:drill:zk=local
0: jdbc:drill:zk=local>
SELECT DISTINCT users.name as name, users.emails.work as email
FROM dfs.logs.`/data/logs` logs,
dfs.users.`/profiles.json` users
WHERE logs.uid = users.id AND
logs.errorLevel > 5;
+------------+------------+
| name | email |
+------------+------------+
| john | john@gmail.com|
| jack | jack@yahoo.com|
| Ronn | ronn@mapr.com |
| Pat | pat@hotmail.com|
...
Install
Launch shell
(embedded
mode)
Query
Query
© 2014 MapR Technologies 7
Drill Supports Schema Discovery On-The-Fly
• Fixed schema
• Leverage schema in centralized
repository (Hive Metastore)
• Fixed schema, evolving schema or
schema-less
• Leverage schema in centralized
repository or self-describing data
2Schema Discovered On-The-FlySchema Declared In Advance
SCHEMA ON
WRITE
SCHEMA
BEFORE READ
SCHEMA ON THE
FLY
© 2014 MapR Technologies 8#NoSQLNow @apachedrill
Self-Describing Data is Ubiquitous
Flat files in DFS
• Complex data (Thrift, Avro, protobuf)
• Columnar data (Parquet, ORC)
• Loosely defined (JSON)
• Traditional files (CSV, TSV)
Data stored in NoSQL stores
• Relational-like (rows, columns)
• Sparse data (NoSQL maps)
• Embedded blobs (JSON)
• Document stores (nested objects)
{
name: {
first: Michael,
last: Smith
},
hobbies: [ski, soccer],
district: Los Altos
}
{
name: {
first: Jennifer,
last: Gates
},
hobbies: [sing],
preschool: CCLC
}
© 2014 MapR Technologies 9#NoSQLNow @apachedrill
Drill’s Data Model is Flexible
HBase
JSON
BSON
CSV
TSV
Parquet
Avro
Schema-lessFixed schema
Flat
Complex
Flexibility
Flexibility
Name Gender Age
Michael M 6
Jennifer F 3
{
name: {
first: Michael,
last: Smith
},
hobbies: [ski, soccer],
district: Los Altos
}
{
name: {
first: Jennifer,
last: Gates
},
hobbies: [sing],
preschool: CCLC
}
RDBMS/SQL-on-Hadoop table
Apache Drill table
© 2014 MapR Technologies 10#NoSQLNow @apachedrill
Core Modules within a Drillbit
SQL
Parser Optimizer
PhysicalPlan
DFS
HBase
RPC Endpoint
Distributed Cache
StoragePlugins
LogicalPlan Execution Hive
MongoDB
CouchBase
Cassandra
RDBMS
© 2014 MapR Technologies 11#NoSQLNow @apachedrill
Processing
in Files
MapReduce
Generic
fileformats
Rows/Columns
in files (tables)
Hive – Pig - etc
Query
Impala
Tez
Hive
NoSQL
MongoDB
Hbase
Cassandra
Riak
Redis
HADOOPDisk &
Storage
RDBMS
Highly Structured Data
ANSI-
SQL
SQL++
R, etc
bits,bytes,blocks
$100K – $200K / TB$1K/TB$10K/TB
Semi Structured & Self describingNo Structure
OLTP EDW
Apache
Drill
© 2014 MapR Technologies 12#NoSQLNow @apachedrill
NoSQL NoETL
Drill, Baby, Drill: Self-Service Data Exploration using Apache Drill
Thursday, August 21st. 9.30 AM
Apache Drill

Weitere ähnliche Inhalte

Was ist angesagt?

Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Simplilearn
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
Varun Narang
 
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, ClouderaSolr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Lucidworks
 

Was ist angesagt? (20)

Hadoop Ecosystem Overview
Hadoop Ecosystem OverviewHadoop Ecosystem Overview
Hadoop Ecosystem Overview
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 
Hadoop Tutorial
Hadoop TutorialHadoop Tutorial
Hadoop Tutorial
 
Hadoop presentation
Hadoop presentationHadoop presentation
Hadoop presentation
 
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
 
Hadoop
Hadoop Hadoop
Hadoop
 
Hadoop-Quick introduction
Hadoop-Quick introductionHadoop-Quick introduction
Hadoop-Quick introduction
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, ClouderaSolr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
Solr on HDFS - Past, Present, and Future: Presented by Mark Miller, Cloudera
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
 
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeReal Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
 
Redis
RedisRedis
Redis
 
Lecture 2 part 1
Lecture 2 part 1Lecture 2 part 1
Lecture 2 part 1
 
Hadoop overview
Hadoop overviewHadoop overview
Hadoop overview
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 
SQOOP - RDBMS to Hadoop
SQOOP - RDBMS to HadoopSQOOP - RDBMS to Hadoop
SQOOP - RDBMS to Hadoop
 
Hadoop, MapReduce and R = RHadoop
Hadoop, MapReduce and R = RHadoopHadoop, MapReduce and R = RHadoop
Hadoop, MapReduce and R = RHadoop
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Welcome to Hadoop2Land!
Welcome to Hadoop2Land!Welcome to Hadoop2Land!
Welcome to Hadoop2Land!
 

Andere mochten auch

Bringing SQL to NoSQL: Rich, Declarative Query for NoSQL
Bringing SQL to NoSQL: Rich, Declarative Query for NoSQLBringing SQL to NoSQL: Rich, Declarative Query for NoSQL
Bringing SQL to NoSQL: Rich, Declarative Query for NoSQL
Keshav Murthy
 

Andere mochten auch (12)

Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and Couchbase
 
Utilizing Arrays: Modeling, Querying and Indexing
Utilizing Arrays: Modeling, Querying and IndexingUtilizing Arrays: Modeling, Querying and Indexing
Utilizing Arrays: Modeling, Querying and Indexing
 
Understanding N1QL Optimizer to Tune Queries
Understanding N1QL Optimizer to Tune QueriesUnderstanding N1QL Optimizer to Tune Queries
Understanding N1QL Optimizer to Tune Queries
 
Query in Couchbase. N1QL: SQL for JSON
Query in Couchbase.  N1QL: SQL for JSONQuery in Couchbase.  N1QL: SQL for JSON
Query in Couchbase. N1QL: SQL for JSON
 
Bringing SQL to NoSQL: Rich, Declarative Query for NoSQL
Bringing SQL to NoSQL: Rich, Declarative Query for NoSQLBringing SQL to NoSQL: Rich, Declarative Query for NoSQL
Bringing SQL to NoSQL: Rich, Declarative Query for NoSQL
 
N1QL workshop: Indexing & Query turning.
N1QL workshop: Indexing & Query turning.N1QL workshop: Indexing & Query turning.
N1QL workshop: Indexing & Query turning.
 
Tuning for Performance: indexes & Queries
Tuning for Performance: indexes & QueriesTuning for Performance: indexes & Queries
Tuning for Performance: indexes & Queries
 
Deep dive into N1QL: SQL for JSON: Internals and power features.
Deep dive into N1QL: SQL for JSON: Internals and power features.Deep dive into N1QL: SQL for JSON: Internals and power features.
Deep dive into N1QL: SQL for JSON: Internals and power features.
 
Couchbase Day
Couchbase DayCouchbase Day
Couchbase Day
 
Couchbase @ Big Data France 2016
Couchbase @ Big Data France 2016Couchbase @ Big Data France 2016
Couchbase @ Big Data France 2016
 
SDEC2011 Using Couchbase for social game scaling and speed
SDEC2011 Using Couchbase for social game scaling and speedSDEC2011 Using Couchbase for social game scaling and speed
SDEC2011 Using Couchbase for social game scaling and speed
 
Accelerating analytics on the Sensor and IoT Data.
Accelerating analytics on the Sensor and IoT Data. Accelerating analytics on the Sensor and IoT Data.
Accelerating analytics on the Sensor and IoT Data.
 

Ähnlich wie Drilling on JSON

Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
Tomer Shiran
 
The Future of Hadoop: MapR VP of Product Management, Tomer Shiran
The Future of Hadoop: MapR VP of Product Management, Tomer ShiranThe Future of Hadoop: MapR VP of Product Management, Tomer Shiran
The Future of Hadoop: MapR VP of Product Management, Tomer Shiran
MapR Technologies
 
Drilling into Data with Apache Drill
Drilling into Data with Apache DrillDrilling into Data with Apache Drill
Drilling into Data with Apache Drill
MapR Technologies
 

Ähnlich wie Drilling on JSON (20)

Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeReal Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
 
Apache Drill - Why, What, How
Apache Drill - Why, What, HowApache Drill - Why, What, How
Apache Drill - Why, What, How
 
Drill into Drill – How Providing Flexibility and Performance is Possible
Drill into Drill – How Providing Flexibility and Performance is PossibleDrill into Drill – How Providing Flexibility and Performance is Possible
Drill into Drill – How Providing Flexibility and Performance is Possible
 
Analyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache DrillAnalyzing Real-World Data with Apache Drill
Analyzing Real-World Data with Apache Drill
 
The Future of Hadoop: MapR VP of Product Management, Tomer Shiran
The Future of Hadoop: MapR VP of Product Management, Tomer ShiranThe Future of Hadoop: MapR VP of Product Management, Tomer Shiran
The Future of Hadoop: MapR VP of Product Management, Tomer Shiran
 
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop
 
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
Apache Drill: Building Highly Flexible, High Performance Query Engines by M.C...
 
Webinar: Selecting the Right SQL-on-Hadoop Solution
Webinar: Selecting the Right SQL-on-Hadoop SolutionWebinar: Selecting the Right SQL-on-Hadoop Solution
Webinar: Selecting the Right SQL-on-Hadoop Solution
 
Drill 1.0
Drill 1.0Drill 1.0
Drill 1.0
 
2014 08-20-pit-hug
2014 08-20-pit-hug2014 08-20-pit-hug
2014 08-20-pit-hug
 
Drilling into Data with Apache Drill
Drilling into Data with Apache DrillDrilling into Data with Apache Drill
Drilling into Data with Apache Drill
 
PhillyDB Talk - Beyond Batch
PhillyDB Talk - Beyond BatchPhillyDB Talk - Beyond Batch
PhillyDB Talk - Beyond Batch
 
Self-Service Data Exploration with Apache Drill
Self-Service Data Exploration with Apache DrillSelf-Service Data Exploration with Apache Drill
Self-Service Data Exploration with Apache Drill
 
Drilling into Data with Apache Drill
Drilling into Data with Apache DrillDrilling into Data with Apache Drill
Drilling into Data with Apache Drill
 
Putting Apache Drill into Production
Putting Apache Drill into ProductionPutting Apache Drill into Production
Putting Apache Drill into Production
 
The Heterogeneous Data lake
The Heterogeneous Data lakeThe Heterogeneous Data lake
The Heterogeneous Data lake
 
Free Code Friday: Drill 101 - Basics of Apache Drill
Free Code Friday: Drill 101 - Basics of Apache DrillFree Code Friday: Drill 101 - Basics of Apache Drill
Free Code Friday: Drill 101 - Basics of Apache Drill
 
Big Data Everywhere Chicago: Getting Real with the MapR Platform (MapR)
Big Data Everywhere Chicago: Getting Real with the MapR Platform (MapR)Big Data Everywhere Chicago: Getting Real with the MapR Platform (MapR)
Big Data Everywhere Chicago: Getting Real with the MapR Platform (MapR)
 
NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
 

Mehr von Keshav Murthy

Mehr von Keshav Murthy (20)

N1QL New Features in couchbase 7.0
N1QL New Features in couchbase 7.0N1QL New Features in couchbase 7.0
N1QL New Features in couchbase 7.0
 
Couchbase Tutorial: Big data Open Source Systems: VLDB2018
Couchbase Tutorial: Big data Open Source Systems: VLDB2018Couchbase Tutorial: Big data Open Source Systems: VLDB2018
Couchbase Tutorial: Big data Open Source Systems: VLDB2018
 
N1QL+GSI: Language and Performance Improvements in Couchbase 5.0 and 5.5
N1QL+GSI: Language and Performance Improvements in Couchbase 5.0 and 5.5N1QL+GSI: Language and Performance Improvements in Couchbase 5.0 and 5.5
N1QL+GSI: Language and Performance Improvements in Couchbase 5.0 and 5.5
 
XLDB Lightning Talk: Databases for an Engaged World: Requirements and Design...
XLDB Lightning Talk: Databases for an Engaged World: Requirements and Design...XLDB Lightning Talk: Databases for an Engaged World: Requirements and Design...
XLDB Lightning Talk: Databases for an Engaged World: Requirements and Design...
 
Couchbase 5.5: N1QL and Indexing features
Couchbase 5.5: N1QL and Indexing featuresCouchbase 5.5: N1QL and Indexing features
Couchbase 5.5: N1QL and Indexing features
 
N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli
N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram VemulapalliN1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli
N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli
 
Couchbase N1QL: Language & Architecture Overview.
Couchbase N1QL: Language & Architecture Overview.Couchbase N1QL: Language & Architecture Overview.
Couchbase N1QL: Language & Architecture Overview.
 
Couchbase Query Workbench Enhancements By Eben Haber
Couchbase Query Workbench Enhancements  By Eben Haber Couchbase Query Workbench Enhancements  By Eben Haber
Couchbase Query Workbench Enhancements By Eben Haber
 
Mindmap: Oracle to Couchbase for developers
Mindmap: Oracle to Couchbase for developersMindmap: Oracle to Couchbase for developers
Mindmap: Oracle to Couchbase for developers
 
Couchbase N1QL: Index Advisor
Couchbase N1QL: Index AdvisorCouchbase N1QL: Index Advisor
Couchbase N1QL: Index Advisor
 
N1QL: What's new in Couchbase 5.0
N1QL: What's new in Couchbase 5.0N1QL: What's new in Couchbase 5.0
N1QL: What's new in Couchbase 5.0
 
From SQL to NoSQL: Structured Querying for JSON
From SQL to NoSQL: Structured Querying for JSONFrom SQL to NoSQL: Structured Querying for JSON
From SQL to NoSQL: Structured Querying for JSON
 
Extended JOIN in Couchbase Server 4.5
Extended JOIN in Couchbase Server 4.5Extended JOIN in Couchbase Server 4.5
Extended JOIN in Couchbase Server 4.5
 
SQL for JSON: Rich, Declarative Querying for NoSQL Databases and Applications 
SQL for JSON: Rich, Declarative Querying for NoSQL Databases and Applications SQL for JSON: Rich, Declarative Querying for NoSQL Databases and Applications 
SQL for JSON: Rich, Declarative Querying for NoSQL Databases and Applications 
 
Introducing N1QL: New SQL Based Query Language for JSON
Introducing N1QL: New SQL Based Query Language for JSONIntroducing N1QL: New SQL Based Query Language for JSON
Introducing N1QL: New SQL Based Query Language for JSON
 
Enterprise Architect's view of Couchbase 4.0 with N1QL
Enterprise Architect's view of Couchbase 4.0 with N1QLEnterprise Architect's view of Couchbase 4.0 with N1QL
Enterprise Architect's view of Couchbase 4.0 with N1QL
 
You know what iMEAN? Using MEAN stack for application dev on Informix
You know what iMEAN? Using MEAN stack for application dev on InformixYou know what iMEAN? Using MEAN stack for application dev on Informix
You know what iMEAN? Using MEAN stack for application dev on Informix
 
Informix SQL & NoSQL: Putting it all together
Informix SQL & NoSQL: Putting it all togetherInformix SQL & NoSQL: Putting it all together
Informix SQL & NoSQL: Putting it all together
 
Informix SQL & NoSQL -- for Chat with the labs on 4/22
Informix SQL & NoSQL -- for Chat with the labs on 4/22Informix SQL & NoSQL -- for Chat with the labs on 4/22
Informix SQL & NoSQL -- for Chat with the labs on 4/22
 
NoSQL Deepdive - with Informix NoSQL. IOD 2013
NoSQL Deepdive - with Informix NoSQL. IOD 2013NoSQL Deepdive - with Informix NoSQL. IOD 2013
NoSQL Deepdive - with Informix NoSQL. IOD 2013
 

Kürzlich hochgeladen

%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
shinachiaurasa2
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 

Kürzlich hochgeladen (20)

Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 

Drilling on JSON

  • 1. © 2014 MapR Technologies 1#NoSQLNow @apachedrill © 2014 MapR Technologies#NoSQLNow Drilling on JSON
  • 2. © 2014 MapR Technologies 2#NoSQLNow @apachedrill NoSQL We don't need no transaction We don't need no ACID control No schema in the tables No limit to the scale out DBA, leave them JSON alone Hey DBA, leave them JSON alone All in all it's just another data in the BASE All in all it’s just another shard into cloud. …With apologies to Roger Waters
  • 3. © 2014 MapR Technologies 3 Martin Fowler says: “aggregate- oriented” What you're most likely to access as a unit. Key Value Store  Couchbase  Riak  Citrusleaf  Redis  BerkeleyDB  Membrain  ... Document  MongoDB  CouchDB  RavenDB  Couchbase  ... Graph  OrientDB  DEX  Neo4j  GraphBase  ...Wide Column  HBase  Hypertable  Cassandra  MapR-DB  ... NoSQL Landscape
  • 4. © 2014 MapR Technologies 4 Data landscape is changing New types of applications • Social, mobile, Web, “Internet of Things”, Cloud… • Iterative/Agile in nature • More users, more data New data models & data types • Flexible Schema/Schema less • Rapidly changing • Semi-structured/Nested data { "data": [ "id": "X999_Y999", "from": { "name": "Tom Brady", "id": "X12" }, "message": "Looking forward to 2014!", "actions": [ { "name": "Comment", "link": "http://www.facebook.com/X99/posts Y999" }, { "name": "Like", "link": "http://www.facebook.com/X99/posts Y999" } ], "type": "status", "created_time": "2013-08-02T21:27:44+0000", "updated_time": "2013-08-02T21:27:44+0000" } } JSON
  • 5. © 2014 MapR Technologies 5 • Pioneering Data Agility for Hadoop • Apache open source project • Scale-out execution engine for low-latency queries • Unified SQL-based API for analytics & operational applications APACHE DRILL 40+ contributors 150+ years of experience building databases and distributed systems
  • 6. © 2014 MapR Technologies 6#NoSQLNow @apachedrill Zero to Results in 2 Minutes (3 Commands) $ tar xzf apache-drill.tar.gz $ apache-drill/bin/sqlline -u jdbc:drill:zk=local 0: jdbc:drill:zk=local> SELECT DISTINCT users.name as name, users.emails.work as email FROM dfs.logs.`/data/logs` logs, dfs.users.`/profiles.json` users WHERE logs.uid = users.id AND logs.errorLevel > 5; +------------+------------+ | name | email | +------------+------------+ | john | john@gmail.com| | jack | jack@yahoo.com| | Ronn | ronn@mapr.com | | Pat | pat@hotmail.com| ... Install Launch shell (embedded mode) Query Query
  • 7. © 2014 MapR Technologies 7 Drill Supports Schema Discovery On-The-Fly • Fixed schema • Leverage schema in centralized repository (Hive Metastore) • Fixed schema, evolving schema or schema-less • Leverage schema in centralized repository or self-describing data 2Schema Discovered On-The-FlySchema Declared In Advance SCHEMA ON WRITE SCHEMA BEFORE READ SCHEMA ON THE FLY
  • 8. © 2014 MapR Technologies 8#NoSQLNow @apachedrill Self-Describing Data is Ubiquitous Flat files in DFS • Complex data (Thrift, Avro, protobuf) • Columnar data (Parquet, ORC) • Loosely defined (JSON) • Traditional files (CSV, TSV) Data stored in NoSQL stores • Relational-like (rows, columns) • Sparse data (NoSQL maps) • Embedded blobs (JSON) • Document stores (nested objects) { name: { first: Michael, last: Smith }, hobbies: [ski, soccer], district: Los Altos } { name: { first: Jennifer, last: Gates }, hobbies: [sing], preschool: CCLC }
  • 9. © 2014 MapR Technologies 9#NoSQLNow @apachedrill Drill’s Data Model is Flexible HBase JSON BSON CSV TSV Parquet Avro Schema-lessFixed schema Flat Complex Flexibility Flexibility Name Gender Age Michael M 6 Jennifer F 3 { name: { first: Michael, last: Smith }, hobbies: [ski, soccer], district: Los Altos } { name: { first: Jennifer, last: Gates }, hobbies: [sing], preschool: CCLC } RDBMS/SQL-on-Hadoop table Apache Drill table
  • 10. © 2014 MapR Technologies 10#NoSQLNow @apachedrill Core Modules within a Drillbit SQL Parser Optimizer PhysicalPlan DFS HBase RPC Endpoint Distributed Cache StoragePlugins LogicalPlan Execution Hive MongoDB CouchBase Cassandra RDBMS
  • 11. © 2014 MapR Technologies 11#NoSQLNow @apachedrill Processing in Files MapReduce Generic fileformats Rows/Columns in files (tables) Hive – Pig - etc Query Impala Tez Hive NoSQL MongoDB Hbase Cassandra Riak Redis HADOOPDisk & Storage RDBMS Highly Structured Data ANSI- SQL SQL++ R, etc bits,bytes,blocks $100K – $200K / TB$1K/TB$10K/TB Semi Structured & Self describingNo Structure OLTP EDW Apache Drill
  • 12. © 2014 MapR Technologies 12#NoSQLNow @apachedrill NoSQL NoETL Drill, Baby, Drill: Self-Service Data Exploration using Apache Drill Thursday, August 21st. 9.30 AM Apache Drill

Hinweis der Redaktion

  1. With other technologies you have to do this, then this, then this, …
  2. TODO: Add Impala and Splunk logos
  3. Need an example or analogy to explain self-describing data.
  4. All SQL engines (traditional or SQL-on-Hadoop) view tables as spreadsheet-like data structures with rows and columns. All records have the same structure, and there is no support for nested data or repeating fields. Drill views tables conceptually as collections of JSON (with additional types) documents. Each record can have a different structure (hence, schema-less). This is revolutionary and has never been done before. If you consider the four data models shown in the 2x2, all models can be represented by the complex, no schema model (JSON) because it is the most flexible. However, no other data model can be represented by the flat, fixed schema model. Therefore, when using any SQL engine except Drill, the data has to be transformed before it can be available to queries.