© 2014 MapR Technologies
The Internet of Things and Big Data: Intro
What This Is; What This Is Not
• It’s not specific to IoT
– It’s not about any specific type of data or protocol
– It’s not specific to any particular industry
• It’s about processing big data
– IoT data can be big data
– IoT might be the biggest data of the coming decade
– But it’s just big data
– Same strategies & technologies apply
When Does Data Become “Big”?
• When the size of the data, itself, becomes a problem
• When the “old way” of processing data just doesn’t work effectively
• It’s “big” when we have to rethink:
– How we store that much data
– How we move that much data
– How we extract, load & transform that much data
– How we explore and analyze that much data
– How we process and get meaningful insights from that much data
C’mon! What does that mean in size?
• Not gigabytes
• Most likely not a few terabytes
• Possibly not tens of terabytes
• Probably hundreds of terabytes
• Definitely petabytes
So How Do We Handle Big Data?
• Distribute & parallelize!
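A minimal sketch of the distribute-and-parallelize idea in Python: split the data into chunks, process each chunk independently, then combine the partial results. The chunk size, the worker count and the `partial_sum` helper are illustrative assumptions, and a thread pool on one machine only stands in for a cluster of machines.

```python
from concurrent.futures import ThreadPoolExecutor

def chunks(data, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_sum(chunk):
    """Work done independently on one chunk (hypothetical per-node task)."""
    return sum(chunk)

def parallel_sum(data, size=1000, workers=4):
    """Distribute chunks to workers, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks(data, size)))
    return sum(partials)

readings = list(range(10_000))
total = parallel_sum(readings)
```

Each chunk is processed without knowing about the others, which is what lets the same pattern spread across many machines.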
MPP Analytic Databases or Hadoop
Big Data Analytics
Bridging classic & big data worlds
Classic Method: Structured & Repeatable Analysis
• “Capture only what’s needed”
• Business determines what questions to ask
• IT structures the data to answer those questions
• SQL performance and structure
Big Data Method: Multi-structured & iterative analysis
• “Capture in case it’s needed”
• IT delivers a platform for storing, refining, and analyzing all data sources
• Business explores data for questions worth answering
• Hadoop scale and flexibility
Philosophical Differences
Traditional Methods
• More power
• Summarize data
• Transform and store
• Pre-defined schema
• Move data -> compute
• Less data / more complex algorithms
Big Data
• More machines
• Keep all data
• Transform on demand
• Flexible / no schema
• Move compute -> data
• More data / simple algorithms
answer = f(all data)
• Save all raw data
• Data immutability
• Transform as needed
• Result is based on the raw data
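The idea of answer = f(all data) can be sketched in a few lines of Python. The `Reading` record, the sample values and both query functions are hypothetical; the point is only that the raw records are never modified and every answer is recomputed from them.

```python
from typing import NamedTuple

class Reading(NamedTuple):  # immutable raw record (hypothetical shape)
    sensor: str
    celsius: float

# Raw data is kept as-is; a tuple cannot be edited in place.
RAW = (
    Reading("a", 20.0),
    Reading("a", 22.0),
    Reading("b", 30.0),
)

def average_by_sensor(raw):
    """answer = f(all data): recompute the result from the raw records."""
    totals = {}
    for r in raw:
        count, total = totals.get(r.sensor, (0, 0.0))
        totals[r.sensor] = (count + 1, total + r.celsius)
    return {s: total / count for s, (count, total) in totals.items()}

def max_by_sensor(raw):
    """A different question asked later of the same untouched raw data."""
    result = {}
    for r in raw:
        result[r.sensor] = max(result.get(r.sensor, r.celsius), r.celsius)
    return result
```

Because nothing was summarized away at load time, the second question can be answered without re-collecting anything.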
Q&A
@mapr maprtech
jberns@mapr.com
Engage with us!
MapR
maprtech
mapr-technologies
IoT and Big Data:
Hadoop as a Data Platform
Hadoop: The Disruptive Technology at the Core of Big Data
Forces of Adoption
Hadoop TAM comes from disrupting enterprise data warehouse and storage spending
• Data growing at 40%
• IT budgets growing at 2.5%
• $ per terabyte:
– Enterprise storage: $9,000
– Database warehouse: $40,000
– Hadoop: <$1,000
[Chart: data growth vs. IT budgets, 2013 to 2017]
Sources:
• Gartner, “Forecast Analysis: Enterprise IT Spending by Vertical Industry Market, Worldwide, 2010-2016, 3Q12 Update.”
• Wall Street Journal, “Financial Services Companies Firms See Results from Big Data Push,” Jan. 27, 2014
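To see how quickly the two growth rates on this slide diverge, here is a small Python calculation. It assumes that both rates are annual and compound, and that the chart spans four yearly steps from 2013 to 2017; the slide itself does not spell this out.

```python
def compound(rate, years):
    """Growth factor after `years` of compounding at annual `rate`."""
    return (1 + rate) ** years

YEARS = 2017 - 2013  # assumed: four yearly steps across the chart's time axis

data_factor = compound(0.40, YEARS)     # data growing at 40%
budget_factor = compound(0.025, YEARS)  # IT budgets growing at 2.5%

print(f"data:   x{data_factor:.2f}")    # data:   x3.84
print(f"budget: x{budget_factor:.2f}")  # budget: x1.10
```

Under these assumptions data grows almost fourfold while budgets grow by about a tenth, which is the gap the cost-per-terabyte comparison speaks to.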
Hadoop 101 (External Presentation)
Hadoop Hardware
Typical Compute Node
• Two CPUs, each with 4-8 cores
• 32-128 GB memory
• 6-24 hard disks
• 2-4 10 GbE network cards
Hadoop Ecosystem
Ecosystem of Projects Built on Hadoop
SQL On Hadoop
SQL on Hadoop
• Generally data has no inherent “schema”
• Schema is defined by user / interpreted from structure
• Schema is applied during processing
• One file can have many schemas applied
• Works for many kinds of data—but not all
– Temperature sensor data? Sure
– Video feeds? Not really
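A rough Python illustration of applying the schema during processing: the raw text is stored untouched, and a schema is supplied only when the file is read. The sample data, the `read_with_schema` helper and both schemas are made up for illustration and do not represent any specific SQL-on-Hadoop engine.

```python
import csv
import io

# Raw file content stored as-is; no schema is attached to it (made-up sample data).
RAW = "s1,2014-04-01T10:00,21.5\ns2,2014-04-01T10:00,19.0\n"

def read_with_schema(raw, schema):
    """Apply a schema (column name -> converter) while reading, not while storing."""
    rows = []
    for fields in csv.reader(io.StringIO(raw)):
        rows.append({name: convert(value)
                     for (name, convert), value in zip(schema.items(), fields)})
    return rows

# Two different schemas applied to the same file.
as_temperature = {"sensor": str, "time": str, "celsius": float}
as_text = {"id": str, "stamp": str, "raw_value": str}

typed = read_with_schema(RAW, as_temperature)
untyped = read_with_schema(RAW, as_text)
```

The same bytes yield a numeric `celsius` column under one schema and a plain string under the other, which is the sense in which one file can have many schemas.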
Key Use Cases
Big Data Exploration
• Exploratory analysis on large-scale raw data
• Unknown value
• No defined schema
• Variety of data types
Big Data Analysis
• Large-scale SQL queries on long history
• Well-defined schema
• Known value, but high cost in existing systems
What is Driving the Need for SQL-on-Hadoop?
Organizations are looking for
• Reuse of existing tools and skills to unlock Hadoop data for a broader audience
• Analysis on new types of data
• More complete data analysis
• More up-to-date and real-time data analysis (not just “after the fact”)
SQL on Hadoop: Many Options
Flexibility to choose when to use which based on use case
| | Drill 1.0 | Hive 0.13 with Tez | Impala 1.x | Presto 0.56 | Shark 0.8 | Vertica |
| Latency | Low | Medium | Low | Low | Medium | Low |
| Files | Yes (all Hive file formats) | Yes (all Hive file formats) | Yes (Parquet, Sequence, …) | Yes (RC, Sequence, Text) | Yes (all Hive file formats) | Yes (all Hive file formats) |
| HBase/M7 | Yes | Yes | Various issues | No | Yes | No |
| Schema | Hive or schema-less | Hive | Hive | Hive | Hive | Proprietary or Hive |
| SQL support | ANSI SQL | HiveQL | HiveQL (subset) | ANSI SQL | HiveQL | ANSI SQL + advanced analytics |
| Client support | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC, ADO.NET, … |
| Large joins | Yes | Yes | No | No | No | Yes |
| Nested data | Yes | Limited | No | Limited | Limited | Limited |
| Hive UDFs | Yes | Yes | Limited | No | Yes | No |
| Transactions | No | No | No | No | No | Yes |
| Optimizer | Limited | Limited | Limited | Limited | Limited | Yes |
| Concurrency | Limited | Limited | Limited | Limited | Limited | Yes |
Proven Hadoop Production Success
Enterprise Data Hub
• Multi-structured data staging & archive
• ETL / DW optimization
• Mainframe optimization
• Data exploration
Marketing Analytics
• Recommendation engines & targeting
• Ad optimization
• Pricing analysis
• Lead scoring
Risk Analytics
• Network security monitoring
• Security information & event management
• Fraudulent behavioral analysis
Operations Intelligence
• Supply chain & logistics
• System log analysis
• Manufacturing quality assurance
• Preventative maintenance
• Sensor analysis
Other Tools & Frameworks of Note
Pig
• Procedural Language
• Loops, if-then statements
Cascading
• MapReduce framework
• Lingual: SQL-like operations
• Pattern: Machine Learning Applications
• Scalding: Cascading for Scala
• Cascalog: Cascading for Clojure
Spark
• Python, Scala and Java APIs
• Spark powers a stack of high-level tools including
– Shark for SQL,
– MLlib for machine learning,
– GraphX, and
– Spark Streaming.
• You can combine these frameworks seamlessly in the same
application.
• Machine Learning / Predictive Analytics
– Collaborative Filtering
– Linear / Logistic Regression
– Naïve Bayes
– Random Forests
– K-Means Clustering
– Canopy Clustering
– Principal Component Analysis
• Database on Hadoop
• Highly scalable
• Columnar – Flexible schema
• Data source for Map Reduce and Spark jobs
IoT and Big Data:
Architectures & Use Cases
NoSQL
NoSQL Databases
• NoSQL: “No SQL” or “Not only SQL”
• Give up some of the functionality of traditional relational databases for speed and scalability
• Types
– Key-Value
– Columnar
– Document
– Graph
• NoSQL databases favor flexible schemas
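To make the four types a little more concrete, the sketch below shows one sensor reading laid out in each model, using plain Python structures as stand-ins. The keys, field names and the `neighbors` helper are invented for illustration and do not correspond to any particular database's API.

```python
# One sensor reading expressed in each of the four models (illustrative data only).

# Key-value: an opaque value looked up by a single key.
key_value = {"sensor:s1:latest": "21.5"}

# Columnar (column-family): row key -> column family -> column -> value.
columnar = {"s1": {"reading": {"celsius": 21.5}, "meta": {"site": "plant-a"}}}

# Document: a self-describing nested record; fields can differ per document.
document = {"_id": "s1", "site": "plant-a", "readings": [{"celsius": 21.5}]}

# Graph: nodes plus labeled edges between them.
graph = {
    "nodes": {"s1", "plant-a"},
    "edges": [("s1", "LOCATED_IN", "plant-a")],
}

def neighbors(g, node):
    """Follow outgoing edges from a node in the toy graph."""
    return [dst for src, _label, dst in g["edges"] if src == node]
```

None of the structures declares its fields up front, which mirrors the flexible-schema point above.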
HBase
Queues
Queues
• Just like a queue at an amusement park
• First-in, first-out (FIFO)
• Queues messages or events
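A minimal in-memory illustration of first-in, first-out ordering, using Python's `collections.deque`. The event names are invented, and a real message queue would add persistence and networking that this sketch leaves out.

```python
from collections import deque

queue = deque()

# Producers append events at the back of the line.
for event in ("door_open", "temp_high", "door_closed"):
    queue.append(event)

# A consumer takes events from the front, in arrival order.
processed = []
while queue:
    processed.append(queue.popleft())
```

The consumer sees the events in exactly the order they were queued, like visitors leaving the amusement-park line.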
Message Queue
Stream Processing
Stream Processing
• Handles data at high velocity
• If Hadoop is the ocean, streams are the firehose
• Processing in near real-time
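As a small sketch of the streaming style, the Python generator below handles each reading as it arrives and keeps only a short window in memory, rather than storing everything first and processing it later. The `sensor_feed` source, its values and the window size are assumptions for illustration; this is not how any particular stream processor is implemented.

```python
from collections import deque

def moving_average(stream, window=3):
    """Yield the average of the last `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # old readings fall out automatically
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)

def sensor_feed():
    """Stand-in for an unbounded source; a real feed would never end."""
    yield from (20.0, 22.0, 24.0, 26.0)

averages = list(moving_average(sensor_feed()))
```

A result is available after every reading, which is the near-real-time property the slide refers to.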
Storm
Batch Processing
Combination Architectures
Lambda Architecture
Complex Architectures Using Many Big Data Technologies
Wanna Play?
• http://www.mapr.com/products/mapr-sandbox-hadoop
Empowering Decisions A Guide to Embedded AnalyticsEmpowering Decisions A Guide to Embedded Analytics
Empowering Decisions A Guide to Embedded AnalyticsGain Insights
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsNeo4j
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTimothy Spann
 
Unleashing Datas Potential - Mastering Precision with FCO-IM
Unleashing Datas Potential - Mastering Precision with FCO-IMUnleashing Datas Potential - Mastering Precision with FCO-IM
Unleashing Datas Potential - Mastering Precision with FCO-IMMarco Wobben
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j
 
PPT for Presiding Officer.pptxvvdffdfgggg
PPT for Presiding Officer.pptxvvdffdfggggPPT for Presiding Officer.pptxvvdffdfgggg
PPT for Presiding Officer.pptxvvdffdfggggbhadratanusenapati1
 
Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...
Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...
Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...ferisulianta.com
 
Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...
Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...
Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...Neo4j
 
The market for cross-border mortgages in Europe
The market for cross-border mortgages in EuropeThe market for cross-border mortgages in Europe
The market for cross-border mortgages in Europe321k
 
STOCK PRICE ANALYSIS Furkan Ali TASCI --.pptx
STOCK PRICE ANALYSIS  Furkan Ali TASCI --.pptxSTOCK PRICE ANALYSIS  Furkan Ali TASCI --.pptx
STOCK PRICE ANALYSIS Furkan Ali TASCI --.pptxFurkanTasci3
 
Air Con Energy Rating Info411 Presentation.pdf
Air Con Energy Rating Info411 Presentation.pdfAir Con Energy Rating Info411 Presentation.pdf
Air Con Energy Rating Info411 Presentation.pdfJasonBoboKyaw
 
Understanding the Impact of video length on student performance
Understanding the Impact of video length on student performanceUnderstanding the Impact of video length on student performance
Understanding the Impact of video length on student performancePrithaVashisht1
 
Data Analytics Fundamentals: data analytics types.potx
Data Analytics Fundamentals: data analytics types.potxData Analytics Fundamentals: data analytics types.potx
Data Analytics Fundamentals: data analytics types.potxEmmanuel Dauda
 
How to Build an Experimentation Culture for Data-Driven Product Development
How to Build an Experimentation Culture for Data-Driven Product DevelopmentHow to Build an Experimentation Culture for Data-Driven Product Development
How to Build an Experimentation Culture for Data-Driven Product DevelopmentAggregage
 
Brain Tumor Detection with Machine Learning.pptx
Brain Tumor Detection with Machine Learning.pptxBrain Tumor Detection with Machine Learning.pptx
Brain Tumor Detection with Machine Learning.pptxShammiRai3
 
Using DAX & Time-based Analysis in Data Warehouse
Using DAX & Time-based Analysis in Data WarehouseUsing DAX & Time-based Analysis in Data Warehouse
Using DAX & Time-based Analysis in Data WarehouseThinkInnovation
 
Bengaluru Tableau UG event- 2nd March 2024 Q1
Bengaluru Tableau UG event- 2nd March 2024 Q1Bengaluru Tableau UG event- 2nd March 2024 Q1
Bengaluru Tableau UG event- 2nd March 2024 Q1bengalurutug
 
Data Collection from Social Media Platforms
Data Collection from Social Media PlatformsData Collection from Social Media Platforms
Data Collection from Social Media PlatformsMahmoud Yasser
 

Último (20)

Empowering Decisions A Guide to Embedded Analytics
Empowering Decisions A Guide to Embedded AnalyticsEmpowering Decisions A Guide to Embedded Analytics
Empowering Decisions A Guide to Embedded Analytics
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI Pipelines
 
Unleashing Datas Potential - Mastering Precision with FCO-IM
Unleashing Datas Potential - Mastering Precision with FCO-IMUnleashing Datas Potential - Mastering Precision with FCO-IM
Unleashing Datas Potential - Mastering Precision with FCO-IM
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
PPT for Presiding Officer.pptxvvdffdfgggg
PPT for Presiding Officer.pptxvvdffdfggggPPT for Presiding Officer.pptxvvdffdfgggg
PPT for Presiding Officer.pptxvvdffdfgggg
 
Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...
Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...
Prediction Of Cryptocurrency Prices Using Lstm, Svm And Polynomial Regression...
 
Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...
Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...
Deloitte+RedCross_Talk to your data with Knowledge-enriched Generative AI.ppt...
 
The market for cross-border mortgages in Europe
The market for cross-border mortgages in EuropeThe market for cross-border mortgages in Europe
The market for cross-border mortgages in Europe
 
STOCK PRICE ANALYSIS Furkan Ali TASCI --.pptx
STOCK PRICE ANALYSIS  Furkan Ali TASCI --.pptxSTOCK PRICE ANALYSIS  Furkan Ali TASCI --.pptx
STOCK PRICE ANALYSIS Furkan Ali TASCI --.pptx
 
Target_Company_Data_breach_2013_110million
Target_Company_Data_breach_2013_110millionTarget_Company_Data_breach_2013_110million
Target_Company_Data_breach_2013_110million
 
Air Con Energy Rating Info411 Presentation.pdf
Air Con Energy Rating Info411 Presentation.pdfAir Con Energy Rating Info411 Presentation.pdf
Air Con Energy Rating Info411 Presentation.pdf
 
Understanding the Impact of video length on student performance
Understanding the Impact of video length on student performanceUnderstanding the Impact of video length on student performance
Understanding the Impact of video length on student performance
 
Data Analytics Fundamentals: data analytics types.potx
Data Analytics Fundamentals: data analytics types.potxData Analytics Fundamentals: data analytics types.potx
Data Analytics Fundamentals: data analytics types.potx
 
How to Build an Experimentation Culture for Data-Driven Product Development
How to Build an Experimentation Culture for Data-Driven Product DevelopmentHow to Build an Experimentation Culture for Data-Driven Product Development
How to Build an Experimentation Culture for Data-Driven Product Development
 
Brain Tumor Detection with Machine Learning.pptx
Brain Tumor Detection with Machine Learning.pptxBrain Tumor Detection with Machine Learning.pptx
Brain Tumor Detection with Machine Learning.pptx
 
Using DAX & Time-based Analysis in Data Warehouse
Using DAX & Time-based Analysis in Data WarehouseUsing DAX & Time-based Analysis in Data Warehouse
Using DAX & Time-based Analysis in Data Warehouse
 
Bengaluru Tableau UG event- 2nd March 2024 Q1
Bengaluru Tableau UG event- 2nd March 2024 Q1Bengaluru Tableau UG event- 2nd March 2024 Q1
Bengaluru Tableau UG event- 2nd March 2024 Q1
 
Data Collection from Social Media Platforms
Data Collection from Social Media PlatformsData Collection from Social Media Platforms
Data Collection from Social Media Platforms
 

IoT and Big Data - IoT Asia 2014

  • 1. © 2014 MapR Technologies. The Internet of Things and Big Data: Intro
  • 2. What This Is; What This Is Not • It’s not specific to IoT – It’s not about any specific type of data or protocol – It’s not specific to any particular industry • It’s about processing big data – IoT data can be big data – IoT might be the biggest data of the coming decade – But it’s just big data – Same strategies & technologies apply
  • 3.
  • 4.
  • 5. When Does Data Become “Big”? • When the size of the data itself becomes a problem • When the “old way” of processing data just doesn’t work effectively • It’s “big” when we have to rethink: – How we store that much data – How we move that much data – How we extract, load & transform that much data – How we explore and analyze that much data – How we process and get meaningful insights from that much data
  • 6. C’mon! What does that mean in size? • Not gigabytes • Most likely not a few terabytes • Possibly not 10s of terabytes • Probably 100s of terabytes • Definitely petabytes
  • 7. So How Do We Handle Big Data? • Distribute & parallelize!
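
"Distribute & parallelize" can be illustrated with a small sketch: split the data into partitions, let each worker compute a partial result on its own partition, then combine the partials. The function names (`partition`, `local_sum`, `distributed_sum`) and the thread pool standing in for a cluster are assumptions made for illustration, not anything shown on the slide.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into n_parts chunks (round-robin), one per worker."""
    return [records[i::n_parts] for i in range(n_parts)]


def local_sum(chunk):
    """Work each worker does independently on its own partition."""
    return sum(chunk)


def distributed_sum(records, n_parts=4):
    """Scatter the data, compute partial sums in parallel, then combine."""
    chunks = partition(records, n_parts)
    # Threads stand in for cluster nodes here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(local_sum, chunks))
    return sum(partials)
```

The same shape (partition, compute locally, combine) is what lets a job scale by adding machines rather than buying a bigger one.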
  • 8. MPP Analytic Databases or Hadoop
  • 9. Big Data Analytics: Bridging classic & big data worlds • Classic Method (structured & repeatable analysis): – Business determines what questions to ask – IT structures the data to answer those questions – “Capture only what’s needed” – SQL performance and structure • Big Data Method (multi-structured & iterative analysis): – IT delivers a platform for storing, refining, and analyzing all data sources – Business explores data for questions worth answering – “Capture in case it’s needed” – Hadoop scale and flexibility
  • 10. Philosophical Differences • Traditional Methods: – More power – Summarize data – Transform and store – Pre-defined schema – Move data -> compute – Less data / more complex algorithms • Big Data: – More machines – Keep all data – Transform on demand – Flexible / no schema – Move compute -> data – More data / simple algorithms
  • 11. answer = f(all data) • Save all raw data • Data immutability • Transform as needed • Result is based on the raw data
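
The "answer = f(all data)" idea can be sketched as an append-only log of raw readings plus pure functions that derive results from the whole log on demand. The `Reading` type, the sensor names, and the two example functions are hypothetical, chosen only to make the idea concrete.

```python
from typing import Dict, NamedTuple, Tuple


class Reading(NamedTuple):
    """One raw, immutable sensor reading (hypothetical record layout)."""
    sensor_id: str
    celsius: float


def append(log: Tuple[Reading, ...], reading: Reading) -> Tuple[Reading, ...]:
    """Raw data is only ever appended; existing entries are never changed."""
    return log + (reading,)


def mean_by_sensor(log: Tuple[Reading, ...]) -> Dict[str, float]:
    """One possible f: average temperature per sensor, from the raw log."""
    totals: Dict[str, Tuple[float, int]] = {}
    for r in log:
        total, count = totals.get(r.sensor_id, (0.0, 0))
        totals[r.sensor_id] = (total + r.celsius, count + 1)
    return {sensor: total / count for sensor, (total, count) in totals.items()}


def max_by_sensor(log: Tuple[Reading, ...]) -> Dict[str, float]:
    """A different f over the same raw data, added later at no storage cost."""
    result: Dict[str, float] = {}
    for r in log:
        result[r.sensor_id] = max(r.celsius, result.get(r.sensor_id, r.celsius))
    return result
```

Because the raw readings are kept and never rewritten, a new question only needs a new function, not a new ingest.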
  • 12. Q&A @mapr maprtech jberns@mapr.com Engage with us! MapR maprtech mapr-technologies
  • 13. IoT and Big Data: Hadoop as a Data Platform
  • 14. Hadoop: The Disruptive Technology at the Core of Big Data
  • 15. Forces of Adoption: Hadoop TAM comes from disrupting enterprise data warehouse and storage spending • Chart, 2013-2017: data growing at 40% vs. IT budgets growing at 2.5% • $ per terabyte: enterprise storage $9,000; database warehouse $40,000; Hadoop <$1,000 • Sources: Gartner, “Forecast Analysis: Enterprise IT Spending by Vertical Industry Market, Worldwide, 2010-2016, 3Q12 Update”; Wall Street Journal, “Financial Services Companies Firms See Results from Big Data Push”, Jan. 27, 2014
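
The gap between the two growth rates on this slide is easier to see with a little arithmetic. The sketch below assumes both rates are annual and compound over the four steps from 2013 to 2017; the slide gives only the percentages and the year axis, so the compounding model is an assumption.

```python
def compound(rate, years):
    """Growth factor after compounding `rate` once per year for `years` years."""
    return (1 + rate) ** years


def growth_gap(data_rate=0.40, budget_rate=0.025, years=4):
    """How many times faster data has grown than budget after `years` years."""
    return compound(data_rate, years) / compound(budget_rate, years)


data_factor = compound(0.40, 4)     # data volume relative to 2013
budget_factor = compound(0.025, 4)  # IT budget relative to 2013
```

Under these assumptions data grows to roughly 3.8 times its 2013 size while the budget grows by about 10%, so each budget dollar has to cover about 3.5 times as much data, which is the pressure toward lower cost per terabyte.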
  • 16. Hadoop 101 (External Presentation)
  • 17. Hadoop Hardware
  • 18. Typical Compute Node • Two CPUs, each with 4-8 cores • 32-128 GB memory • 6-24 hard disks • 2-4 10 Gb network cards
  • 19. Hadoop Ecosystem
  • 20. Ecosystem of Projects Built on Hadoop
  • 21. SQL On Hadoop
  • 22. SQL on Hadoop • Generally data has no inherent “schema” • Schema is defined by user / interpreted from structure • Schema is applied during processing • One file can have many schemas applied • Works for many kinds of data, but not all – Temperature sensor data? Sure – Video feeds? Not really
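
A minimal sketch of "schema is applied during processing" and "one file can have many schemas applied": the raw lines stay untouched, and each reader supplies its own column names and types when it reads them. The sample lines, the schema lists, and `apply_schema` are hypothetical, and this is plain Python rather than the API of any particular SQL-on-Hadoop engine.

```python
# Raw data as stored: plain text lines with no schema attached.
RAW_LINES = [
    "2014-04-01T00:00:00,sensor-1,21.5",
    "2014-04-01T00:01:00,sensor-2,19.0",
]

# Two different schemas for the same file: (column name, converter) pairs.
READINGS_SCHEMA = [("ts", str), ("sensor", str), ("celsius", float)]
AUDIT_SCHEMA = [("ts", str), ("source", str)]  # ignores the third field


def apply_schema(lines, schema):
    """Interpret raw lines at read time using the caller's schema."""
    rows = []
    for line in lines:
        fields = line.split(",")
        # zip stops at the shorter input, so extra fields are simply skipped.
        rows.append({name: convert(value)
                     for (name, convert), value in zip(schema, fields)})
    return rows
```

Calling `apply_schema(RAW_LINES, READINGS_SCHEMA)` and `apply_schema(RAW_LINES, AUDIT_SCHEMA)` gives two different tabular views without rewriting the stored file.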
  • 23. Key Use Cases • Big Data Exploration: exploratory analysis on large scale raw data – Unknown value – No defined schema – Variety of data types • Big Data Analysis: large-scale SQL queries on long history – Well defined schema – Known value, but high cost in existing systems
  • 24. What is Driving the Need for SQL-on-Hadoop? Organizations are looking for: • Reuse of existing tools and skills to unlock Hadoop data for a broader audience • Analysis on new types of data • More complete data analysis • More up-to-date and real-time data analysis (not just “after the fact”)
  • 25. SQL on Hadoop: Many Options. Flexibility to choose when to use which based on use case
      Feature | Drill 1.0 | Hive 0.13 with Tez | Impala 1.x | Presto 0.56 | Shark 0.8 | Vertica
      Latency | Low | Medium | Low | Low | Medium | Low
      Files | Yes (all Hive file formats) | Yes (all Hive file formats) | Yes (Parquet, Sequence, …) | Yes (RC, Sequence, Text) | Yes (all Hive file formats) | Yes (all Hive file formats)
      HBase/M7 | Yes | Yes | Various issues | No | Yes | No
      Schema | Hive or schema-less | Hive | Hive | Hive | Hive | Proprietary or Hive
      SQL support | ANSI SQL | HiveQL | HiveQL (subset) | ANSI SQL | HiveQL | ANSI SQL + advanced analytics
      Client support | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC | ODBC/JDBC, ADO.NET, …
      Large joins | Yes | Yes | No | No | No | Yes
      Nested data | Yes | Limited | No | Limited | Limited | Limited
      Hive UDFs | Yes | Yes | Limited | No | Yes | No
      Transactions | No | No | No | No | No | Yes
      Optimizer | Limited | Limited | Limited | Limited | Limited | Yes
      Concurrency | Limited | Limited | Limited | Limited | Limited | Yes
  • 26. Proven Hadoop Production Success • Enterprise Data Hub: – Multi-structured data staging & archive – ETL / DW optimization – Mainframe optimization – Data exploration • Marketing Analytics: – Recommendation engines & targeting – Ad optimization – Pricing analysis – Lead scoring • Risk Analytics: – Network security monitoring – Security information & event management – Fraudulent behavioral analysis • Operations Intelligence: – Supply chain & logistics – System log analysis – Manufacturing quality assurance – Preventative maintenance – Sensor analysis
  • 27. Other Tools & Frameworks of Note
  • 28. Pig • Procedural language • Loops, if-then statements
  • 29. Cascading • Map Reduce framework • Lingual: SQL-like operations • Pattern: Machine Learning Applications • Scalding: Cascading for Scala • Cascalog: Cascading for Clojure
  • 30. Spark • Python, Scala and Java • Spark powers a stack of high-level tools including – Shark for SQL, – MLlib for machine learning, – GraphX, and – Spark Streaming. • You can combine these frameworks seamlessly in the same application.
  • 31. Machine Learning / Predictive Analytics – Collaborative Filtering – Linear / Logistic Regression – Naïve Bayes – Random Forests – K-Means Clustering – Canopy Clustering – Principal Component Analysis
  • 32. Database on Hadoop • Highly scalable • Columnar – Flexible schema • Data source for Map Reduce and Spark jobs
  • 33. Q&A @mapr maprtech jberns@mapr.com Engage with us! MapR maprtech mapr-technologies
  • 34. IoT and Big Data: Architectures & Use Cases
  • 35. NoSQL
  • 36. NoSQL Databases • No-SQL or “Not only” SQL • Give up some of the functionality of traditional relational databases for speed and scalability • Types – Key-Value – Columnar – Document – Graph • NoSQL databases favor flexible schemas
  • 37. HBase
  • 38. Queues
  • 39. Queues • Just like a queue at an amusement park • First-in, first-out • Queues messages or events
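
The first-in, first-out behaviour can be shown with a tiny in-memory sketch. `MessageQueue`, `publish`, and `consume` are hypothetical names for illustration; a production message queue would add persistence, acknowledgements, and network access on top of the same ordering rule.

```python
from collections import deque


class MessageQueue:
    """Minimal in-memory FIFO queue of messages or events."""

    def __init__(self):
        self._items = deque()

    def publish(self, message):
        """Producers add messages at the back of the line."""
        self._items.append(message)

    def consume(self):
        """Consumers take the oldest message first; None if the queue is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self):
        return len(self._items)
```

The queue decouples producers from consumers: devices can publish at their own pace while downstream processing drains events in arrival order.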
  • 40. Message Queue
  • 41. Stream Processing
  • 42. Stream Processing • Handles data at high velocity • If Hadoop is the ocean, streams are the firehose • Processing in near real-time
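
As a rough illustration of near-real-time processing, the sketch below updates a moving average as each event arrives instead of waiting for a batch job. The `SlidingAverage` class and its window size are assumptions for this example; it is plain Python, not the API of Storm or any other streaming framework.

```python
from collections import deque


class SlidingAverage:
    """Running average over the most recent `size` values of a stream."""

    def __init__(self, size):
        self._window = deque(maxlen=size)  # oldest value drops out when full
        self._total = 0.0

    def update(self, value):
        """Process one event and return the current windowed average."""
        if len(self._window) == self._window.maxlen:
            self._total -= self._window[0]  # value about to be evicted
        self._window.append(value)
        self._total += value
        return self._total / len(self._window)
```

Each call does a constant amount of work and keeps only a bounded window in memory, which is what makes this style workable at high event rates.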
  • 43. Storm
  • 44. Batch Processing
  • 45. Combination Architectures
  • 46. Lambda Architecture
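
The slide shows only a title, so what follows is a generic sketch of the commonly described Lambda Architecture rather than the diagram from the deck: a batch layer recomputes a view from the full raw log, a speed layer covers events that arrived since the last batch run, and queries merge the two. All function names and the per-sensor event count are hypothetical.

```python
def batch_view(master_log):
    """Batch layer: recompute per-sensor event counts from the full raw log."""
    counts = {}
    for sensor_id in master_log:
        counts[sensor_id] = counts.get(sensor_id, 0) + 1
    return counts


def speed_view(recent_events):
    """Speed layer: counts for events not yet covered by the last batch run."""
    # Same logic as the batch layer here, applied to a much smaller input.
    return batch_view(recent_events)


def query(batch, speed, sensor_id):
    """Serving step: merge the batch and real-time views at read time."""
    return batch.get(sensor_id, 0) + speed.get(sensor_id, 0)
```

Once the next batch run has absorbed the recent events, the speed view for that period can be discarded.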
  • 47. Complex Architectures Using Many Big Data Technologies
  • 48. Wanna Play? • http://www.mapr.com/products/mapr-sandbox-hadoop
  • 49. Q&A @mapr maprtech jberns@mapr.com Engage with us! MapR maprtech mapr-technologies

Editor's Notes

  1. Let’s start with this chart, to reinforce that you’re in the right room and picked the right session. Not only is Hadoop the fastest-growing Big Data technology, it is one of the fastest-growing technologies, period. Hadoop adoption is happening across industries and across a wide range of application areas. What’s driving this adoption?
  2. Need a platform that serves the broadest set of use cases.
  3. Large media company: 30 days’ worth of data in GP; 90 days in Hadoop (5 petabytes). They want to make all data available for analysis, which they cannot do with GP (400 nodes required). They want to make SQL-on-Hadoop available; if people are happy with performance, they will transition workloads to Hadoop. They have 200 nodes of GP today (analytics platform); aggregates in the DW are 40 nodes.
  4. Hadoop is being used in lots of different use cases across a variety of industries. One way to think of this is by functional areas of an organization (from left to right: CIO/chief data officer; CMO (marketing); CSO or CRO (chief security or risk officer); or the COO, head of quality, or IT operations). We have many customers in each of these areas. Here are some example customers of MapR (give example snippets of each).