Grab some coffee and enjoy the pre-show banter before the top of the hour!
The Briefing Room
Full Speed Ahead: Hadoop and Spark for Big Data Applications
Twitter Tag: #briefr The Briefing Room
Welcome
Host:
Eric Kavanagh
eric.kavanagh@bloorgroup.com
@eric_kavanagh
Mission
•  Reveal the essential characteristics of enterprise software, good and bad
•  Provide a forum for detailed analysis of today’s innovative technologies
•  Give vendors a chance to explain their product to savvy analysts
•  Allow audience members to pose serious questions... and get answers!
Topics
September: HADOOP 2.0
October: DATA MANAGEMENT
November: ANALYTICS
The Age of Big Data
Analyst: John Myers
John Myers is Managing Research Director at Enterprise Management Associates
MapR
•  MapR develops Apache Hadoop-related software
•  Its Hadoop distribution boasts data protection, no single point of failure and industry-leading performance
•  The MapR distribution also features the complete Apache Spark stack, including Spark SQL, Spark Streaming, MLlib and GraphX
Guest: Sameer Nori
Sameer Nori is the Senior Product Marketing Manager for MapR
© 2015 MapR Technologies
Sameer Nori
Sep 29, 2015
Agenda
1.  Customer Requirements
2.  Hadoop ecosystem and The MapR Data Platform
3.  Evolution of SQL-on-Hadoop
4.  Customer Examples
MapR Architected A Platform For The Age Of Big Data
[Figure: timeline (1980s, 2000s, 2010s) of apps, databases, app platform and storage. Labels include monolithic, web and big data apps; RDBMS; UNIX and Linux; SAN/NAS and scale-out; structured and unstructured; operational and analytics.]
What MapR Customers Demand
1.  Efficiency at scale
–  Multi-tenancy: Ability to support multiple teams/projects on one platform
–  Resource management: MUST support Hadoop and non-Hadoop workloads
2.  Real-time: MUST support real-time and batch workloads on one cluster
3.  Reliable: business continuity, MUST meet SLAs
4.  Secure: MUST integrate with existing security & data governance standards
5.  Agile: MUST support governed and exploratory BI on one platform
Architecting for Production Success
Timeline (2004, 2006, 2009, 2011, 2013, 2015):
•  Google publishes details of GFS
•  Hadoop developed at Yahoo!
•  MapR in stealth
•  MapR becomes Hadoop technology leader
•  MapR-DB – real-time, in-Hadoop DB
•  MapR 5.0 – Extending Real-time beyond Hadoop for Big Data Apps
Built for the enterprise
Built for today’s use cases
Built for as-it-happens, agile businesses
The Power of the Open Source Community
High Availability (HA) Everywhere
No NameNode architecture
•  Distributed metadata can self-heal
•  No practical limit on # of files
MapReduce/YARN HA
•  Jobs are not impacted by failures
•  Meet your data processing SLAs
NFS HA
•  High throughput and resilience for NFS-based data ingestion, import/export and multi-client access
Instant recovery
•  Files and tables are accessible within seconds of a node failure or cluster restart
Rolling upgrades
•  Upgrade the software with no downtime
HA is built in
•  No special configuration to enable HA
•  All MapR customers operate with HA
Disaster Recovery: Mirroring
•  Flexible
–  Choose the volumes/directories to mirror
–  You don’t need to mirror the entire cluster
–  Any remote cluster can run active volumes
mirrored to other clusters
–  Scheduled/incremental to set low RPO
–  Promotable mirrors to set low RTO
•  Fast
–  No performance impact
–  Block-level (8KB) deltas
–  Automatic compression
•  Safe
–  Point-in-time consistency
–  End-to-end checksums
•  Easy
–  Graceful handling of network issues
–  No third-party software
–  Takes less than two minutes to configure!
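The "block-level (8KB) deltas" point can be pictured with a small Python sketch. It is only an illustration of the general idea (compare fixed-size blocks and ship the ones that changed), not MapR's actual mirroring implementation; the function names and the use of SHA-256 checksums are assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # 8 KB, the delta granularity quoted on the slide


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return one checksum per block."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks in `new` that differ from, or are absent in, `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        index
        for index, digest in enumerate(new_digests)
        if index >= len(old_digests) or digest != old_digests[index]
    ]


if __name__ == "__main__":
    source = bytes(3 * BLOCK_SIZE)       # three zero-filled blocks
    updated = bytearray(source)
    updated[BLOCK_SIZE + 10] = 0xFF      # touch one byte in the second block
    # Only the block containing the modified byte would need to be transferred.
    print(changed_blocks(source, bytes(updated)))
```

Under these assumptions, a one-byte change costs one 8 KB block on the wire rather than a full file copy, which is the intuition behind incremental mirrors and a low RPO.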
[Diagram: mirroring between production and research clusters across Datacenter 1, Datacenter 2 and EC2 over WAN links]
Multi-tenancy
Isolation
•  Tasks sandboxed so they don’t impact other tasks or system daemons
•  System resources protected from runaway jobs
•  Volume-based data placement
•  Label-based job scheduling
Quotas
•  Storage quotas by volume/user/group
•  CPU and memory quotas by queue/user/group
Security and delegation
•  Wire-level authentication and encryption (Kerberos not required)
•  Fine-grained administration permissions including volume-level delegation
•  Authenticate users to AD, LDAP and Kerberos via Linux PAM
Reporting
•  Detailed reporting on resource usage (75+ different metrics)
•  All reports are available via UI, CLI and REST API
Data Increasingly Stored in Non-Relational Datastores
Timeline: 1980, 1990, 2000, 2010, 2020
Relational databases vs. non-relational datastores:
•  Volume: GBs-TBs vs. TBs-PBs
•  Database: fixed schema, DBA controls structure vs. dynamic/flexible schema, application controls structure
•  Structure: structured vs. structured, semi-structured and unstructured
•  Development: planned (release cycle = months-years) vs. iterative (release cycle = days-weeks)
Drill’s Role in the Enterprise Data Architecture
Raw data
•  JSON, CSV, ...
“Optimized” data
•  Parquet, …
Centrally-structured data
•  Schemas in Hive Metastore
Relational data
•  Highly-structured data
Hive, Impala, Spark SQL
Oracle, Teradata
Exploration
(known and unknown questions)
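To make "exploration" of raw data more concrete, here is a minimal Python sketch of reading JSON without declaring a schema first. It is not Drill code and does not use any Drill API; the sample records and the helper names `discover_schema` and `select` are hypothetical, chosen only to show how structure can be discovered at read time rather than fixed in advance.

```python
import json
from collections import defaultdict

# Hypothetical raw records: newline-delimited JSON whose fields vary per record.
RAW_LINES = [
    '{"id": 1, "name": "router", "price": 120.5}',
    '{"id": 2, "name": "switch", "tags": ["lan", "poe"]}',
    '{"id": "3", "price": 80}',
]


def discover_schema(lines):
    """Map each field name to the sorted type names observed for it."""
    schema = defaultdict(set)
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            schema[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in schema.items()}


def select(lines, fields):
    """Project the requested fields, yielding None where a record lacks one."""
    for line in lines:
        record = json.loads(line)
        yield tuple(record.get(field) for field in fields)


if __name__ == "__main__":
    print(discover_schema(RAW_LINES))
    print(list(select(RAW_LINES, ["id", "price"])))
```

The point of the sketch is that missing fields and mixed types are tolerated at query time, which is what makes "unknown questions" askable before anyone has modelled the data.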
Drill is Designed for a Wide Set of Use Cases
•  Use cases: Raw Data Exploration, JSON Analytics, Data Hub Analytics, …
•  Data sources: Hive, HBase, Files, Directories, …
•  Formats: {JSON}, Parquet, Text Files, …
Cisco: 360° Customer View
Cisco uses integrated customer data to increase revenues
OBJECTIVES
•  Create shared view of customer & operations across 75,000 employees
•  Increase revenue opportunities with sales partners
CHALLENGES
•  Customer information was siloed in different divisions
•  Customer interactions were inconsistent and not satisfying
•  Missed opportunities for upselling/cross selling
SOLUTION
•  Use MapR to collect customer information across touch points
•  Integrate billing, support, manufacturing, social media, websites, dial-in data
•  Generate new sales leads internally and for partners
Business Impact
Cisco was able to analyze service sales opportunities in 1/10 the time, at 1/10 the cost, and generated $40 million in incremental service bookings in the first year.
[Diagram: Architecture for Sales Partner Opportunities]
Cisco Data Platforms Reference Architecture
“The entire market is starting to realize that data is everywhere and an agile ecosystem is paramount. The marketplace demands the flexibility to meet specific needs and decisions are being made based on how well the ecosystem players are integrated.”
Arvind Bedi, Director IT, Cisco Systems
[Diagram: Data Sources, Data Storage and Processing, Data Consumption]
•  Data Sources: databases; docs, cases, content, social media, clickstream; ERP; SFDC; all other sources
•  Data Storage and Processing: SAP HANA on UCS; MapR Data Platform with MapR Distribution for Hadoop (Streaming: Spark Streaming, Storm; Batch: MR, Spark, Hive, Pig, …; Interactive: Drill, Impala; MapR-DB; MapR-FS)
•  Other labels: agile analytics; big data platform; mission critical reporting; stable core; controlled change; Network of Trust; data security, infrastructure; customer network, product usage; Internet of Everything (IoE); self service dashboard; rapid business model; data exploration; real time predictive; mission critical operational reports; financial reporting & extract; data analysis, text analytics; machine learning, statistical analysis; machine data insights; financials; mobile/browser/data service
comScore: Internet Analytics and Ad Optimization
comScore delivers insights about online consumer behavior
OBJECTIVES
•  Provide digital analytics services: syndicated and custom solutions in audience measurement, e-commerce, advertising, video & mobile
CHALLENGES
•  Keeping up with data. In the past 5 years, comScore’s volume of new data/month has grown from 100 billion to 1.7 trillion records
SOLUTION
•  comScore chose MapR for NFS, performance, operational efficiency
•  MapR processes over 1.7 trillion Internet and mobile records/month, reaching more than 90% of the Internet population
•  MapR streaming writes eliminated Cassandra staging cluster cost
Business Impact
“HDFS is great internally, but to get data in and out of Hadoop, you have to do some kind of HDFS export. With MapR, you can just mount [HDFS] as NFS and then use native tools whether they’re in Windows, Unix, Linux or whatever.”
Mike Brown, comScore CTO
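The comScore quote about mounting the cluster over NFS and using native tools can be illustrated with a short Python sketch. Once a cluster directory is mounted, ordinary file I/O is all that is required; the mount path, directory name and `*.log` pattern below are hypothetical placeholders, not values taken from the presentation.

```python
from pathlib import Path


def count_records(directory: Path, pattern: str = "*.log") -> dict[str, int]:
    """Count lines in each matching file using ordinary file I/O."""
    counts = {}
    for path in sorted(directory.glob(pattern)):
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            counts[path.name] = sum(1 for _ in handle)
    return counts


if __name__ == "__main__":
    # Hypothetical NFS mount point and directory; substitute your own.
    mounted = Path("/mapr/example.cluster/data/clickstream")
    if mounted.is_dir():
        print(count_records(mounted))
    else:
        print(f"{mounted} is not mounted on this machine")
```

Nothing in the function is Hadoop-specific, which is the point: the same code runs against a local directory or an NFS-mounted one without an export step.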
Getting Started with MapR
On-Demand Training
https://www.mapr.com/training
MapR Sandbox
https://www.mapr.com/sandbox
Perceptions & Questions
Analyst:
John Myers
Importance of Low-Latency in Next Generation Data Management
Disparate Data Sources
© 2015 Enterprise Management Associates, Inc.
Empowering the Line of Business
Latency of Processing
Obstacles Implementing Analytics
Managing Processing Latency
Questions
Discussion Questions
•  What sets Apache Drill above other SQL-on-Hadoop options? There are several either in “development” or available with standard distributions.
•  How does Spark work with MapReduce to provide both the “high speed” and the “high capacity”? Many business users “want it all and they want it now”…
Discussion Questions
•  Working without a “structure”, or with variable, multi-structured data sets, causes issues for SQL toolsets. How does MapR approach the ingestion of those variable sources before they are “finalized” or during times of flux?
•  Continuous data streams are becoming more important as a part of sensor and IoT use cases. How does MapR handle the truly real-time aspects of data ingestion as well as data query?
Discussion Questions
•  EMA research is showing the growth of data democratization, or the penetration of data “work” and decision making in organizations. How many users of MapR environments are business stakeholders vs. technologists?
Upcoming Topics
www.insideanalysis.com
September: HADOOP 2.0
October: DATA MANAGEMENT
November: ANALYTICS
THANK YOU for your ATTENTION!
Some images provided courtesy of Wikimedia Commons

More Related Content

More from Inside Analysis

Ahead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time AnalyticsAhead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time AnalyticsInside Analysis
 
All Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of EverythingAll Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of EverythingInside Analysis
 
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETLGoodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETLInside Analysis
 
The Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global LevelThe Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global LevelInside Analysis
 
Structurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your ArchitectureStructurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your ArchitectureInside Analysis
 
SQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the RiskSQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the RiskInside Analysis
 
The Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataThe Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataInside Analysis
 
A Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data WarehouseA Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data WarehouseInside Analysis
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopInside Analysis
 
Rethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile WorldRethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile WorldInside Analysis
 
DisrupTech - Dave Duggal
DisrupTech - Dave DuggalDisrupTech - Dave Duggal
DisrupTech - Dave DuggalInside Analysis
 
Phasic Systems - Dr. Geoffrey Malafsky
Phasic Systems - Dr. Geoffrey MalafskyPhasic Systems - Dr. Geoffrey Malafsky
Phasic Systems - Dr. Geoffrey MalafskyInside Analysis
 
Red Hat - Sarangan Rangachari
Red Hat - Sarangan RangachariRed Hat - Sarangan Rangachari
Red Hat - Sarangan RangachariInside Analysis
 
DisrupTech - Robin Bloor (2)
DisrupTech - Robin Bloor (2)DisrupTech - Robin Bloor (2)
DisrupTech - Robin Bloor (2)Inside Analysis
 
DisrupTech - Robin Bloor (1)
DisrupTech - Robin Bloor (1)DisrupTech - Robin Bloor (1)
DisrupTech - Robin Bloor (1)Inside Analysis
 
Big Data Refinery: Distilling Value for User-Driven Analytics
Big Data Refinery: Distilling Value for User-Driven AnalyticsBig Data Refinery: Distilling Value for User-Driven Analytics
Big Data Refinery: Distilling Value for User-Driven AnalyticsInside Analysis
 
Understanding What’s Possible: Getting Business Value from Big Data Quickly
Understanding What’s Possible: Getting Business Value from Big Data QuicklyUnderstanding What’s Possible: Getting Business Value from Big Data Quickly
Understanding What’s Possible: Getting Business Value from Big Data QuicklyInside Analysis
 

More from Inside Analysis (20)

Ahead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time AnalyticsAhead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time Analytics
 
All Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of EverythingAll Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of Everything
 
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETLGoodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
 
The Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global LevelThe Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global Level
 
Structurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your ArchitectureStructurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your Architecture
 
SQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the RiskSQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the Risk
 
The Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataThe Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big Data
 
A Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data WarehouseA Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data Warehouse
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of Hadoop
 
Rethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile WorldRethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile World
 
DisrupTech - Dave Duggal
DisrupTech - Dave DuggalDisrupTech - Dave Duggal
DisrupTech - Dave Duggal
 
Modus Operandi
Modus OperandiModus Operandi
Modus Operandi
 
Phasic Systems - Dr. Geoffrey Malafsky
Phasic Systems - Dr. Geoffrey MalafskyPhasic Systems - Dr. Geoffrey Malafsky
Phasic Systems - Dr. Geoffrey Malafsky
 
Red Hat - Sarangan Rangachari
Red Hat - Sarangan RangachariRed Hat - Sarangan Rangachari
Red Hat - Sarangan Rangachari
 
WebAction-Sami Abkay
WebAction-Sami AbkayWebAction-Sami Abkay
WebAction-Sami Abkay
 
DisrupTech 2015ek
DisrupTech 2015ekDisrupTech 2015ek
DisrupTech 2015ek
 
DisrupTech - Robin Bloor (2)
DisrupTech - Robin Bloor (2)DisrupTech - Robin Bloor (2)
DisrupTech - Robin Bloor (2)
 
DisrupTech - Robin Bloor (1)
DisrupTech - Robin Bloor (1)DisrupTech - Robin Bloor (1)
DisrupTech - Robin Bloor (1)
 
Big Data Refinery: Distilling Value for User-Driven Analytics
Big Data Refinery: Distilling Value for User-Driven AnalyticsBig Data Refinery: Distilling Value for User-Driven Analytics
Big Data Refinery: Distilling Value for User-Driven Analytics
 
Understanding What’s Possible: Getting Business Value from Big Data Quickly
Understanding What’s Possible: Getting Business Value from Big Data QuicklyUnderstanding What’s Possible: Getting Business Value from Big Data Quickly
Understanding What’s Possible: Getting Business Value from Big Data Quickly
 

Recently uploaded

WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch TuesdayIvanti
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxFIDO Alliance
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...FIDO Alliance
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGDSC PJATK
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfSrushith Repakula
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxFIDO Alliance
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfFIDO Alliance
 
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?Paolo Missier
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe中 央社
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform EngineeringMarcus Vechiato
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024Stephen Perrenod
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...FIDO Alliance
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPTiSEO AI
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Skynet Technologies
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FIDO Alliance
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Hiroshi SHIBATA
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfFIDO Alliance
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024Lorenzo Miniero
 
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfBreaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfUK Journal
 

Recently uploaded (20)

WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfBreaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
 

Full Speed Ahead: The Briefing Room with John Myers and MapR

  • 1. Grab some coffee and enjoy the pre-­show banter before the top of the hour!
  • 2. The Briefing Room Full Speed Ahead: Hadoop and Spark for Big Data Applications
  • 3. Twitter Tag: #briefr The Briefing Room Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com @eric_kavanagh
  • 4. Twitter Tag: #briefr The Briefing Room   Reveal the essential characteristics of enterprise software, good and bad   Provide a forum for detailed analysis of today s innovative technologies   Give vendors a chance to explain their product to savvy analysts   Allow audience members to pose serious questions... and get answers! Mission
  • 5. Twitter Tag: #briefr The Briefing Room Topics September: HADOOP 2.0 October: DATA MANAGEMENT November: ANALYTICS
  • 6. Twitter Tag: #briefr The Briefing Room The Age of Big Data
  • 7. Twitter Tag: #briefr The Briefing Room Analyst: John Myers John Myers is Managing Research Director at Enterprise Management Associates
  • 8. Twitter Tag: #briefr The Briefing Room MapR   MapR develops Apache Hadoop-related software   Its Hadoop distribution boasts data protection, no single point of failure and industry leading performance   The MapR distribution also features the complete Apache Spark stack, including Spark SQL, Spark Streaming, MLLib and GraphX
  • 9. Twitter Tag: #briefr The Briefing Room Guest: Sameer Nori Sameer Nori is the Senior Product Marketing Manager for MapR
  • 10. ® © 2015 MapR Technologies 1 ® © 2015 MapR Technologies Sameer Nori Sep 29, 2015
  • 11. ® © 2015 MapR Technologies 2 Agenda 1.  Customer Requirements 2.  Hadoop ecosystem and The MapR Data Platform 3.  Evolution of SQL-on-Hadoop 4.  Customer Examples
  • 12. ® © 2015 MapR Technologies 3 MapR Architected A Platform For The Age Of Big Data Apps Databases Operational App platform Storage 1980s 2000s 2010s Big data apps RDBMs SAN/NAS Monolithic UNIX Linux RDBMs Scale out Web Structured Unstructured Operational Analytics
  • 13. ® © 2015 MapR Technologies 4 What MapR Customers Demand 1.  Efficiency at scale –  Multi-tenancy: Ability to support multiple teams/projects on one platform –  Resource management: MUST support Hadoop and non-Hadoop workloads 2.  Real-time: MUST support real-time and batch workloads on one cluster 3.  Reliable – Business continuity – must meet SLA’s 4.  Secure – MUST integrate with existing security & data governance standards 5.  Agile - MUST support governed and exploratory BI on one platform
  • 14. ® © 2015 MapR Technologies 5 2004 2006 2009 2011 2013 2015 Architecting for Production Success MapR in stealth MapR 5.0 – Extending Real-time beyond Hadoop for Big Data Apps MapR becomes Hadoop technology leader MapR-DB – real-time, in-Hadoop DB Google publishes details of GFS Hadoop developed at Yahoo! Built for the enterprise Built for today’s use cases Built for as-it-happens, agile businesses
  • 15. ® © 2015 MapR Technologies 6 The Power of the Open Source Community
  • 16. ® © 2015 MapR Technologies 7 No NameNode architecture MapReduce/YARN HA NFS HA Instant recovery Rolling upgrades HA is built in •  Distributed metadata can self-heal •  No practical limit on # of files •  Jobs are not impacted by failures •  Meet your data processing SLAs •  High throughput and resilience for NFS-based data ingestion, import/export and multi-client access •  Files and tables are accessible within seconds of a node failure or cluster restart •  Upgrade the software with no downtime •  No special configuration to enable HA •  All MapR customers operate with HA High Availability (HA) Everywhere
  • 17. ® © 2015 MapR Technologies 8 Disaster Recovery: Mirroring •  Flexible –  Choose the volumes/directories to mirror –  You don’t need to mirror the entire cluster –  Any remote cluster can run active volumes mirrored to other clusters –  Scheduled/incremental to set low RPO –  Promotable mirrors to set low RTO •  Fast –  No performance impact –  Block-level (8KB) deltas –  Automatic compression •  Safe –  Point-in-time consistency –  End-to-end checksums •  Easy –  Graceful handling of network issues –  No third-party software –  Takes less than two minutes to configure! Production WAN Production Research Datacenter  1   Datacenter  2   WAN EC 2
  • 18. Multi-tenancy
      •  Isolation
          –  Tasks sandboxed so they don’t impact other tasks or system daemons
          –  System resources protected from runaway jobs
          –  Volume-based data placement
          –  Label-based job scheduling
      •  Quotas
          –  Storage quotas by volume/user/group
          –  CPU and memory quotas by queue/user/group
      •  Security and delegation
          –  Wire-level authentication and encryption (Kerberos not required)
          –  Fine-grained administration permissions including volume-level delegation
          –  Authenticate users to AD, LDAP and Kerberos via Linux PAM
      •  Reporting
          –  Detailed reporting on resource usage (75+ different metrics)
          –  All reports are available via UI, CLI and REST API
  • 19. Database Data Increasingly Stored in Non-Relational Datastores
      [Timeline diagram, years marked: 1980, 1990, 2000, 2010, 2020]
      •  RELATIONAL DATABASES: fixed schema, DBA controls structure
          –  Volume: GBs-TBs
          –  Structure: structured
          –  Development: planned (release cycle = months-years)
      •  NON-RELATIONAL DATASTORES: dynamic/flexible schema, application controls structure
          –  Volume: TBs-PBs
          –  Structure: structured, semi-structured and unstructured
          –  Development: iterative (release cycle = days-weeks)
  • 20. Drill’s Role in the Enterprise Data Architecture
      •  Raw data: JSON, CSV, ...
      •  “Optimized” data: Parquet, …
      •  Centrally-structured data: schemas in Hive Metastore
      •  Relational data: highly-structured data
      [Diagram labels: Hive, Impala, Spark SQL; Oracle, Teradata; Exploration (known and unknown questions)]
  • 21. Drill is Designed for a Wide Set of Use Cases
      •  Use cases: Raw Data Exploration; JSON Analytics; Data Hub Analytics; …
      [Diagram labels: Hive; HBase; Files; Directories; …; {JSON}, Parquet; Text Files; …]
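To make the raw-data exploration use case concrete, here is a minimal sketch of submitting a SQL query over a JSON file to Apache Drill through its REST interface. The file path, field names, and host are hypothetical, and the sketch assumes a Drillbit is reachable on the default web port; nothing here is taken from the slides beyond the idea of querying JSON files in place.

```python
import json
import urllib.request

# Assumption: a Drillbit exposing its REST interface on the default web port.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical file path and field names, for illustration only.
# The table alias `t` lets the query reach into the nested `address` field.
EXPLORATION_SQL = (
    "SELECT t.customer_id, t.address.city AS city "
    "FROM dfs.`/data/raw/customers.json` t "
    "LIMIT 10"
)


def build_drill_request(sql: str, url: str = DRILL_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a SQL query for Drill."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, url: str = DRILL_URL) -> list:
    """Send the query and return the result rows; requires a running Drillbit."""
    with urllib.request.urlopen(build_drill_request(sql, url)) as response:
        return json.load(response).get("rows", [])
```

No table definition or schema registration appears anywhere in the sketch: the query names the file directly, which is the property the slide highlights for exploring raw data.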
  • 22. Cisco: 360° Customer View
      Cisco uses integrated customer data to increase revenues
      •  OBJECTIVES
          –  Create shared view of customer & operations across 75,000 employees
          –  Increase revenue opportunities with sales partners
      •  CHALLENGES
          –  Customer information was siloed in different divisions
          –  Customer interactions were inconsistent and not satisfying
          –  Missed opportunities for upselling/cross selling
      •  SOLUTION
          –  Use MapR to collect customer information across touch points
          –  Integrate billing, support, manufacturing, social media, websites, dial-in data
          –  Generate new sales leads internally and for partners
      •  Business Impact
          –  Cisco was able to analyze service sales opportunities in 1/10 the time, at 1/10 the cost, and generated $40 million in incremental service bookings in the first year.
      [Diagram: Architecture for Sales Partner Opportunities]
  • 23. Cisco Data Platforms Reference Architecture
      “The entire market is starting to realize that data is everywhere and an agile ecosystem is paramount. The marketplace demands the flexibility to meet specific needs and decisions are being made based on how well the ecosystem players are integrated.” Arvind Bedi, Director IT, Cisco Systems
      [Architecture diagram labels: DATABASES; DOCS, CASES, CONTENT, SOCIAL MEDIA, CLICKSTREAM; Data Storage and Processing; ERP; SFDC; SAP HANA ON UCS; AGILE ANALYTICS; MAPR DISTRIBUTION FOR HADOOP; Streaming (Spark Streaming, Storm); MapR-DB; MAPR DISTRIBUTION FOR HADOOP; Batch (MR, Spark, Hive, Pig, …); MapR-FS; BIG DATA PLATFORM; MISSION CRITICAL REPORTING; DATA SECURITY, INFRASTRUCTURE; CUSTOMER NETWORK, PRODUCT USAGE; INTERNET OF EVERYTHING (IoE); SELF SERVICE DASHBOARD; RAPID BUSINESS MODEL; DATA EXPLORATION; REAL TIME PREDICTIVE; MISSION CRITICAL OPERATIONAL REPORTS; FINANCIAL REPORTING & EXTRACT; DATA ANALYSIS, TEXT ANALYTICS; MACHINE LEARNING, STATISTICAL ANALYSIS; MACHINE DATA INSIGHTS; FINANCIALS; STABLE CORE; CONTROLLED CHANGE; Network of Trust; MapR Data Platform; Data Consumption; Data Sources; ALL Other Sources; Data Bases (Mobile/Browser/Data Service); Interactive (Drill, Impala)]
  • 24. comScore: Internet Analytics and Ad Optimization
      comScore delivers insights about online consumer behavior
      •  OBJECTIVES
          –  Provide digital analytics services: syndicated and custom solutions in audience measurement, e-commerce, advertising, video & mobile
      •  CHALLENGES
          –  Keeping up with data. In the past 5 years, comScore’s volume of new data/month has grown from 100 billion to 1.7 trillion records
      •  SOLUTION
          –  comScore chose MapR for NFS, performance, operational efficiency
          –  MapR processes over 1.7 trillion Internet and mobile records/month, reaching more than 90% of the Internet population
          –  MapR streaming writes eliminated Cassandra staging cluster cost
      •  Business Impact
          –  “HDFS is great internally, but to get data in and out of Hadoop, you have to do some kind of HDFS export. With MapR, you can just mount [HDFS] as NFS and then use native tools whether they’re in Windows, Unix, Linux or whatever.” (Mike Brown, comScore CTO)
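The comScore quote turns on one practical point: once the cluster is mounted over NFS, ordinary file APIs work against it with no export step. The sketch below illustrates that idea with plain Python file I/O. The mount point and file names are hypothetical, and the helper functions are illustrative only; they are not part of any MapR tooling.

```python
from pathlib import Path

# Hypothetical NFS mount point for the cluster; adjust to the actual mount.
CLUSTER_MOUNT = Path("/mapr/my.cluster.com")


def append_records(mount_root: Path, relative_path: str, records: list[str]) -> Path:
    """Append newline-terminated records to a file under the mount using plain file I/O."""
    target = mount_root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as handle:
        for record in records:
            handle.write(record + "\n")
    return target


def count_records(path: Path) -> int:
    """Count the records in a file, again with nothing but the standard library."""
    with path.open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)
```

Because the functions take the mount root as a parameter, the same code runs unchanged against a local directory, which is the sense in which "native tools" need no Hadoop-specific client.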
  • 25. Getting Started with MapR
      •  On-Demand Training: https://www.mapr.com/training
      •  MapR Sandbox: https://www.mapr.com/sandbox
  • 26. Perceptions & Questions Analyst: John Myers
  • 27. Importance of Low-Latency in Next Generation Data Management
  • 28. Disparate Data Sources © 2015 Enterprise Management Associates, Inc.
  • 29. Empowering the Line of Business
  • 30. Latency of Processing
  • 31. Obstacles Implementing Analytics
  • 32. Managing Processing Latency
  • 34. Discussion Questions
      •  What sets Apache Drill above other SQL-on-Hadoop options? There are several, either in “development” or available with standard distributions.
      •  How does Spark work with MapReduce to provide both the “high speed” and the “high capacity”? Many business users “want it all and they want it now”…
  • 35. Discussion Questions
      •  Data sets without a “structure”, or variable, multi-structured data sets, cause issues for SQL toolsets. How does MapR approach the ingestion of those variable sources before they are “finalized” or during times of flux?
      •  Continuous data streams are becoming more important as part of sensor and IoT use cases. How does MapR handle the truly real-time aspects of data ingestion as well as data query?
  • 36. Discussion Questions
      •  EMA research is showing the growth of data democratization, or the penetration of data “work” and decision making in organizations. How many users of MapR environments are business stakeholders vs. technologists?
  • 38. Upcoming Topics (www.insideanalysis.com): September: HADOOP 2.0; October: DATA MANAGEMENT; November: ANALYTICS
  • 39. THANK YOU for your ATTENTION! Some images provided courtesy of Wikimedia Commons