Mobile, Big Data, Cloud, Security, Virtualization
http://www.exceleratesystems.com
David Bennett - CEO
2
• Founded in 2008
• Excelerate Systems is a leading company in the Americas focusing on Big
Data, Cloud, IT Operations and Security.
• With offices in the US, Mexico, Chile and France, as well as individual
contributors in Brazil, Uruguay, Argentina, Canada, Spain, China and India,
we have a global delivery capability.
• 125 customers in 25 countries
3
• 4 Technology Areas:
• Security
• IT Operations & The Data Center
• Cloud Services: IaaS and SaaS
• Big Data
5
Big Data and Hadoop
6
The Problems with Current Data Systems
[Diagram: Instrumentation → Collection (mostly append) → Storage Only Grid (original raw data) → ETL Compute Grid → RDBMS (aggregated data) → BI Reports + Interactive Apps]
1. Moving Data To Compute Doesn’t Scale
2. Archiving = Premature Data Death
3. Can’t Explore Original High Fidelity Raw Data
7
The Solution: A Combined Storage/Compute Layer
[Diagram: Instrumentation → Collection (mostly append) → Hadoop: Storage + Compute Grid → RDBMS (aggregated data) → BI Reports + Interactive Apps]
1. Scalable Throughput For ETL & Aggregation (ETL Acceleration)
2. Keep Data Alive Forever (Active Archive)
3. Data Exploration & Advanced Analytics
So What is Apache Hadoop?
• A scalable fault-tolerant distributed system for data storage and
processing (open source under the Apache license).
• Core Hadoop has two main systems:
• Hadoop Distributed File System: self-healing high-bandwidth clustered
storage.
• MapReduce: distributed fault-tolerant resource management and scheduling
coupled with a scalable data programming abstraction.
• Key business values:
• Flexibility – Store any data, Run any analysis.
• Scalability – Start at 1 TB/3 nodes, grow to petabytes/1000s of nodes.
• Economics – Cost per TB at a fraction of traditional options.
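The "data programming abstraction" mentioned above can be illustrated with a minimal single-process sketch of the map, shuffle and reduce steps. This is not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, run_word_count) are hypothetical, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Pair = Tuple[str, int]


def map_phase(records: Iterable[str],
              mapper: Callable[[str], Iterable[Pair]]) -> Iterator[Pair]:
    """Apply the mapper to every input record and emit (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs: Iterable[Pair]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]],
                 reducer: Callable[[str, List[int]], int]) -> Dict[str, int]:
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line: str) -> Iterator[Pair]:
    """Emit (word, 1) for every word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word: str, counts: List[int]) -> int:
    """Sum the partial counts for one word."""
    return sum(counts)


def run_word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Wire the three steps together for the classic word-count example."""
    pairs = map_phase(lines, word_count_mapper)
    return reduce_phase(shuffle(pairs), word_count_reducer)
```

The developer only writes the mapper and the reducer; grouping, distribution and retries are the framework's job, which is why the same two functions can run on 3 nodes or on thousands.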
8
The Hadoop Big Bang
9
• Fastest sort of a TB: 62 seconds over 1,460 nodes
• Sorted a PB in 16.25 hours over 3,658 nodes
• Hadoop World 2009: 500 attendees
The Key Benefit: Agility/Flexibility
10
Schema-on-Write (RDBMS):
• Schema must be created before any data can be loaded.
• An explicit load operation has to take place which transforms data to the DB's internal serialization format.
• New columns must be added explicitly before new data for such columns can be loaded into the database.
Pros:
• OLAP is Fast
• Standards/Governance
Schema-on-Read (Hadoop):
• Data is simply copied to the file store; no transformation is needed.
• A SerDe (Serializer/Deserializer) is applied at read time to extract the required columns (late binding).
• New data can start flowing anytime and will appear retroactively once the SerDe is updated to parse it.
Pros:
• Load is Fast
• Flexibility/Agility
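A minimal sketch of the schema-on-read idea, assuming JSON lines as the raw format. The names load, read_all, serde_v1 and serde_v2 are hypothetical and only stand in for a real SerDe; the point is that the raw lines are stored untouched and the schema is applied when they are read.

```python
import json
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]


def load(store: List[str], raw_line: str) -> None:
    """'Load' is just a copy: the raw line is kept exactly as it arrived."""
    store.append(raw_line)


def serde_v1(raw_line: str) -> Row:
    """First version of the reader: only knows about the 'user' column."""
    record = json.loads(raw_line)
    return {"user": record.get("user")}


def serde_v2(raw_line: str) -> Row:
    """Updated reader: also extracts 'device', including from old raw lines."""
    record = json.loads(raw_line)
    return {"user": record.get("user"), "device": record.get("device")}


def read_all(store: List[str], serde: Callable[[str], Row]) -> List[Row]:
    """Apply the schema at read time (late binding)."""
    return [serde(line) for line in store]
```

Because nothing is discarded at load time, switching from serde_v1 to serde_v2 makes the device field appear for every row that ever carried it, without reloading any data.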
Scalability: Scalable Software Development
11
Grows without requiring developers to
re-architect their algorithms/application.
AUTO SCALE
Economics: Return on Byte
• Return on Byte (ROB) = value to be extracted from that
byte divided by the cost of storing that byte
• If ROB is < 1, the data will be buried in the tape wasteland; thus
we need more economical active storage.
12
Low ROB
High ROB
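The Return on Byte ratio can be written as a small calculation. This is a sketch only: the function names are hypothetical, the inputs are whatever units the business uses for value and cost (they only need to match), and the threshold of 1 is the one stated on the slide.

```python
def return_on_byte(value_extracted: float, storage_cost: float) -> float:
    """ROB = value extracted from the data / cost of storing that data."""
    if storage_cost <= 0:
        raise ValueError("storage_cost must be positive")
    return value_extracted / storage_cost


def worth_keeping_active(value_extracted: float, storage_cost: float) -> bool:
    """Data with ROB below 1 is the data that tends to end up on tape."""
    return return_on_byte(value_extracted, storage_cost) >= 1.0
```

With illustrative numbers, data worth 50 that costs 200 to store has a ROB of 0.25 and would be archived; the same data on storage that costs 40 has a ROB of 1.25 and stays active. Lowering the cost per TB is what moves data from the first case to the second.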
The Big Data Platform: CDH5
16
CDH in the Enterprise Data Stack
[Diagram: CDH at the center of the enterprise data stack]
• Data sources: Logs, Files, Web Data, Relational Databases
• Data movement: Flume, Sqoop
• Connected systems: Enterprise Data Warehouse, Online Serving Systems, Web/Mobile Applications
• Tools: Cloudera Manager, IDEs, BI/Analytics, Enterprise Reporting, Modeling Tools, Meta Data/ETL Tools
• Users: System Operators, Engineers, Analysts, Business Users, Data Scientists, Data Architects, Customers
• Interfaces: ODBC, JDBC, NFS, HTTP
17
HBase versus HDFS
HDFS:
Optimized For:
• Large Files
• Sequential Access (Hi Throughput)
• Append Only
Use For:
• Fact tables that are mostly append only and require sequential full table scans.
HBase:
Optimized For:
• Small Records
• Random Access (Lo Latency)
• Atomic Record Updates
Use For:
• Dimension tables which are updated frequently and require random low-latency lookups.
Not Suitable For:
• Low Latency Interactive OLAP.
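The two access patterns can be contrasted with a toy in-memory model. Neither class below is the HDFS or HBase API; AppendOnlyLog, KeyedTable and total_sales_by_region are hypothetical names chosen to show a fact table that is appended to and scanned in full, joined against a dimension table that is updated in place and looked up by key.

```python
from typing import Dict, Iterator, List, Optional


class AppendOnlyLog:
    """Toy stand-in for append-only storage: cheap appends, full sequential scans."""

    def __init__(self) -> None:
        self._rows: List[dict] = []

    def append(self, row: dict) -> None:
        self._rows.append(row)

    def scan(self) -> Iterator[dict]:
        # One sequential pass over every row; there is no lookup by key.
        yield from self._rows


class KeyedTable:
    """Toy stand-in for a keyed store: point lookups and per-record updates."""

    def __init__(self) -> None:
        self._rows: Dict[int, dict] = {}

    def put(self, key: int, row: dict) -> None:
        # Overwrites the single record stored under this key.
        self._rows[key] = row

    def get(self, key: int) -> Optional[dict]:
        return self._rows.get(key)


def total_sales_by_region(facts: AppendOnlyLog, stores: KeyedTable) -> Dict[str, int]:
    """Scan the fact table once, resolving each store through a keyed lookup."""
    totals: Dict[str, int] = {}
    for sale in facts.scan():
        store = stores.get(sale["store_id"])
        region = store["region"] if store else "unknown"
        totals[region] = totals.get(region, 0) + sale["amount"]
    return totals
```

Reassigning a store to another region is a single put on the keyed table; the sales history is never rewritten, and the next full scan simply picks up the new value.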
18
Use Case Examples
• Retail: Price Optimization
• Media: Content Targeting
• Finance: Fraud Detection
• Manufacturing: Diagnostics
• Info Services: Satellite Imagery
• Agriculture: Seed Optimization
• Power: Smart Consumption
19
Core Benefits of the Platform for Big Data
20
1. FLEXIBILITY
STORE ANY DATA
RUN ANY ANALYSIS
KEEPS PACE WITH THE RATE OF CHANGE OF INCOMING DATA
2. SCALABILITY
PROVEN GROWTH TO PBs/1,000s OF NODES
NO NEED TO REWRITE QUERIES, AUTOMATICALLY SCALES
KEEPS PACE WITH THE RATE OF GROWTH OF INCOMING DATA
3. ECONOMICS
COST PER TB AT A FRACTION OF OTHER OPTIONS
KEEP ALL OF YOUR DATA ALIVE IN AN ACTIVE ARCHIVE
POWERING THE DATA BEATS ALGORITHM MOVEMENT
How do I start?
21
4 Options
I. Cloudera cluster up and running in the cloud in 24 hours.
II. Use an Excelerate Systems Data Scientist to set the customer’s data strategy.
III. Get an on-premise Cloudera cluster up and running in 5 days with 5 nodes and up to 10 TB of data.
IV. Training: Customers who invest in training are generally more successful than those who do not.
Cloudera from Excelerate Systems
22
There is a worldwide shortage of Big Data skills,
especially in Latin America. Excelerate Systems has
invested heavily in building a global network of
certified Cloudera specialists who can design,
implement, configure, develop and support Big Data
solutions. No other company in the region has these
skills yet.
Excelerate Systems is Cloudera’s primary partner in
the region.
Excelerate Systems Big Data Resources
• 8 Certified Cloudera Developers
• 6 Certified Cloudera Administrators
• 2 HBase Developers
• 2 Hadoop Developers
• 2 Data Scientists
25
Questions and next steps
David Bennett, CEO, David.bennett@exceleratesystems.net
Victor Pichardo, President, Victor.pichardo@exceleratesystems.net
Alex Campos, Systems Engineer, alex.campos@exceleratesystems.net
Plus consulting resources in various countries
26

Weitere ähnliche Inhalte

Was ist angesagt?

Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...
Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...
Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...Databricks
 
Accelerating Big Data Analytics
Accelerating Big Data AnalyticsAccelerating Big Data Analytics
Accelerating Big Data AnalyticsAttunity
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architectureSudheer Kondla
 
Managed Cluster Services
Managed Cluster ServicesManaged Cluster Services
Managed Cluster ServicesAdam Doyle
 
Multi-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASMulti-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASDataWorks Summit
 
Revolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business RequirementsRevolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business RequirementsNetApp
 
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Databricks
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company PresentationAndrewJiang18
 
Event Sponsor NetApp - CSO- Jon Kissane
Event Sponsor NetApp - CSO- Jon Kissane  Event Sponsor NetApp - CSO- Jon Kissane
Event Sponsor NetApp - CSO- Jon Kissane Hostway|HOSTING
 
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Data
 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation Brett VanderPlaats
 
Get Savvy with Snowflake
Get Savvy with SnowflakeGet Savvy with Snowflake
Get Savvy with SnowflakeMatillion
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InSnapLogic
 
Introduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & ApplicationsIntroduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & ApplicationsNguyen Cao
 
Denodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo
 
Delivering digital transformation and business impact with io t, machine lear...
Delivering digital transformation and business impact with io t, machine lear...Delivering digital transformation and business impact with io t, machine lear...
Delivering digital transformation and business impact with io t, machine lear...Robert Sanders
 
Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Harald Erb
 

Was ist angesagt? (20)

Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...
Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...
Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...
 
Accelerating Big Data Analytics
Accelerating Big Data AnalyticsAccelerating Big Data Analytics
Accelerating Big Data Analytics
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
 
Managed Cluster Services
Managed Cluster ServicesManaged Cluster Services
Managed Cluster Services
 
BigData Hadoop
BigData Hadoop BigData Hadoop
BigData Hadoop
 
Multi-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASMulti-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLAS
 
Revolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business RequirementsRevolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business Requirements
 
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company Presentation
 
Event Sponsor NetApp - CSO- Jon Kissane
Event Sponsor NetApp - CSO- Jon Kissane  Event Sponsor NetApp - CSO- Jon Kissane
Event Sponsor NetApp - CSO- Jon Kissane
 
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time Analytics
 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation
 
Hadoop for the Masses
Hadoop for the MassesHadoop for the Masses
Hadoop for the Masses
 
Get Savvy with Snowflake
Get Savvy with SnowflakeGet Savvy with Snowflake
Get Savvy with Snowflake
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump In
 
Introduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & ApplicationsIntroduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & Applications
 
Denodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the Cloud
 
Delivering digital transformation and business impact with io t, machine lear...
Delivering digital transformation and business impact with io t, machine lear...Delivering digital transformation and business impact with io t, machine lear...
Delivering digital transformation and business impact with io t, machine lear...
 
Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020
 
The Ecosystem is too damn big
The Ecosystem is too damn big The Ecosystem is too damn big
The Ecosystem is too damn big
 

Andere mochten auch

Big Data: NoSQL & the DBA
Big Data: NoSQL & the DBABig Data: NoSQL & the DBA
Big Data: NoSQL & the DBAAswani Kumar
 
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan OttTrivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan OttTrivadis
 
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Data Con LA
 
Cloudera cluster setup and configuration
Cloudera cluster setup and configurationCloudera cluster setup and configuration
Cloudera cluster setup and configurationSudheer Kondla
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsMark Rittman
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysiszafarali1981
 
Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...
Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...
Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...Trivadis
 
Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...
Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...
Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...Trivadis
 
Cluster management and automation with cloudera manager
Cluster management and automation with cloudera managerCluster management and automation with cloudera manager
Cluster management and automation with cloudera managerChris Westin
 
Transforming ISV's to Azure
Transforming ISV's to AzureTransforming ISV's to Azure
Transforming ISV's to AzureTrivadis
 
Big data processing using Cloudera Quickstart
Big data processing using Cloudera QuickstartBig data processing using Cloudera Quickstart
Big data processing using Cloudera QuickstartIMC Institute
 

Andere mochten auch (12)

Big Data: NoSQL & the DBA
Big Data: NoSQL & the DBABig Data: NoSQL & the DBA
Big Data: NoSQL & the DBA
 
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan OttTrivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan Ott
 
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
 
Cloudera cluster setup and configuration
Cloudera cluster setup and configurationCloudera cluster setup and configuration
Cloudera cluster setup and configuration
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysis
 
Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...
Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...
Trivadis TechEvent 2016 DWH Modernization – in the Age of Big Data by Gregor ...
 
Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...
Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...
Trivadis TechEvent 2016 Big Data Privacy and Security Fundamentals by Florian...
 
Cluster management and automation with cloudera manager
Cluster management and automation with cloudera managerCluster management and automation with cloudera manager
Cluster management and automation with cloudera manager
 
Transforming ISV's to Azure
Transforming ISV's to AzureTransforming ISV's to Azure
Transforming ISV's to Azure
 
Big data processing using Cloudera Quickstart
Big data processing using Cloudera QuickstartBig data processing using Cloudera Quickstart
Big data processing using Cloudera Quickstart
 

Ähnlich wie Big Data/Cloudera from Excelerate Systems

Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013 Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013 IBM Sverige
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game ChangerCaserta
 
New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013IBM Sverige
 
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantageFueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantagePrecisely
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Denodo
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataSpringPeople
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...DATAVERSITY
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the OrganizationSeeling Cheung
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Precisely
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeeling Cheung
 
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 GenoaHadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoalarsgeorge
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisNetAppUK
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
 
The Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameThe Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameCloudera, Inc.
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015
Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015 Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015
Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015 Vladi Vexler
 

Ähnlich wie Big Data/Cloudera from Excelerate Systems (20)

Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013 Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 
New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013New Innovations in Information Management for Big Data - Smarter Business 2013
New Innovations in Information Management for Big Data - Smarter Business 2013
 
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantageFueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
 
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 GenoaHadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis Kapsalis
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
The Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameThe Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the Same
 
Big data Question bank.pdf
Big data Question bank.pdfBig data Question bank.pdf
Big data Question bank.pdf
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015
Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015 Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015
Data Modeling and Scale Out - ScaleBase + 451-Group webinar 30.4.2015
 

Kürzlich hochgeladen

Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Brighton SEO | April 2024 | Data Storytelling

Big Data/Cloudera from Excelerate Systems

  • 1. Mobile, Big Data, Cloud, Security, Virtualization http://www.exceleratesystems.com David Bennett - CEO
  • 2. • Founded in 2008 • Excelerate Systems is a leading company in the Americas focusing on Big Data, Cloud, IT Operations and Security. • With offices in the US, Mexico, Chile and France, as well as individual contributors in Brazil, Uruguay, Argentina, Canada, Spain, China and India, we have a global delivery capability. • 125 customers in 25 countries
  • 5. Big Data and Hadoop
  • 6. The Problems with Current Data Systems • Diagram labels: Instrumentation; Collection; Storage Only Grid (original raw data, mostly append); ETL Compute Grid; RDBMS (aggregated data); BI Reports + Interactive Apps • 1. Moving data to compute doesn't scale • 2. Archiving = premature data death • 3. Can't explore original high-fidelity raw data
  • 7. The Solution: A Combined Storage/Compute Layer • Diagram labels: Instrumentation; Collection; Hadoop: Storage + Compute Grid (mostly append); RDBMS (aggregated data); BI Reports + Interactive Apps • 1. Scalable throughput for ETL & aggregation (ETL acceleration) • 2. Keep data alive forever (active archive) • 3. Data exploration & advanced analytics
  • 8. So What is Apache Hadoop? • A scalable, fault-tolerant distributed system for data storage and processing (open source under the Apache license). • Core Hadoop has two main systems: • Hadoop Distributed File System: self-healing, high-bandwidth clustered storage. • MapReduce: distributed, fault-tolerant resource management and scheduling coupled with a scalable data programming abstraction. • Key business values: • Flexibility: store any data, run any analysis. • Scalability: start at 1 TB on 3 nodes, grow to petabytes on 1000s of nodes. • Economics: cost per TB at a fraction of traditional options.
  • 9. The Hadoop Big Bang • Fastest sort of a TB: 62 seconds over 1,460 nodes • Sorted a PB in 16.25 hours over 3,658 nodes • Hadoop World 2009: 500 attendees
  • 10. The Key Benefit: Agility/Flexibility • Schema-on-Write (RDBMS): • Schema must be created before any data can be loaded. • An explicit load operation has to take place which transforms data to the DB's internal serialization format. • New columns must be added explicitly before new data for such columns can be loaded into the database. • Pros: OLAP is fast; standards/governance. • Schema-on-Read (Hadoop): • Data is simply copied to the file store; no transformation is needed. • A SerDe (Serializer/Deserializer) is applied at read time to extract the required columns (late binding). • New data can start flowing at any time and will appear retroactively once the SerDe is updated to parse it. • Pros: load is fast; flexibility/agility.
  • 11. Scalability: Scalable Software Development • Auto scale: grows without requiring developers to re-architect their algorithms or applications.
  • 12. Economics: Return on Byte • Return on Byte (ROB) = the value to be extracted from a byte divided by the cost of storing that byte. • If ROB is < 1, the data gets buried in the tape wasteland, so we need more economical active storage. • Figure labels: Low ROB, High ROB.
  • 13.
  • 14.
  • 15.
  • 16. The Big Data Platform: CDH5
  • 17. CDH in the Enterprise Data Stack • Diagram labels, data sources and systems: Logs; Files; Web Data; Relational Databases; Enterprise Data Warehouse; Online Serving Systems • Ingestion: Flume; Sqoop • Tools: Cloudera Manager; IDEs; BI / Analytics; Enterprise Reporting; Modeling Tools; Meta Data / ETL Tools; Web/Mobile Applications • Users: System Operators; Engineers; Analysts; Business Users; Customers; Data Scientists; Data Architects • Interfaces: ODBC, JDBC, NFS, HTTP
  • 18. HBase versus HDFS • HDFS: • Optimized for: large files; sequential access (high throughput); append only. • Use for: fact tables that are mostly append-only and require sequential full table scans. • HBase: • Optimized for: small records; random access (low latency); atomic record updates. • Use for: dimension tables which are updated frequently and require random low-latency lookups. • Not suitable for: low-latency interactive OLAP.
  • 19. Use Case Examples • Retail: Price Optimization • Media: Content Targeting • Finance: Fraud Detection • Manufacturing: Diagnostics • Info Services: Satellite Imagery • Agriculture: Seed Optimization • Power: Smart Consumption
  • 20. Core Benefits of the Platform for Big Data • 1. Flexibility: store any data; run any analysis; keeps pace with the rate of change of incoming data. • 2. Scalability: proven growth to PBs and 1,000s of nodes; no need to rewrite queries, scales automatically; keeps pace with the rate of growth of incoming data. • 3. Economics: cost per TB at a fraction of other options; keep all of your data alive in an active archive; powering the "data beats algorithm" movement.
  • 21. How do I start? 4 options: • Get a Cloudera cluster up and running in the cloud in 24 hours. • Use an Excelerate Systems data scientist to set the customer's data strategy. • Get an on-premise Cloudera cluster up and running in 5 days with 5 nodes and up to 10 TB of data. • Training: customers who invest in training are generally more successful than those who do not.
  • 22. Cloudera from Excelerate Systems • There is a worldwide shortage of Big Data skills, especially in Latin America. Excelerate Systems has invested heavily in building a global network of Cloudera-certified specialists who can design, implement, configure, develop and support Big Data solutions. No other company in the region has these skills yet. Excelerate Systems is Cloudera's primary partner in the region.
  • 23.
  • 24. Excelerate Systems Big Data Resources • 8 Certified Cloudera Developers • 6 Certified Cloudera Administrators • 2 HBase Developers • 2 Hadoop Developers • 2 Data Scientists
  • 25. Questions and next steps • David Bennett, CEO, David.bennett@exceleratesystems.net • Victor Pichardo, President, Victor.pichardo@exceleratesystems.net • Alex Campos, Systems Engineer, alex.campos@exceleratesystems.net • Plus consulting resources in various countries
  • 26. 26
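
The schema-on-read idea from slide 10 can be illustrated with a minimal sketch. This is plain Python, not a Hadoop or Hive API: the names RAW_LINES and read_with_schema and the sample records are hypothetical, and the JSON parsing step only stands in for what a SerDe does at read time. Raw records are stored untouched, and the list of columns is applied only when the data is read.

```python
import json

# Raw records are kept exactly as they arrived (no transformation at load time).
# The sample data is invented for illustration.
RAW_LINES = [
    '{"user": "ana", "amount": 12.5}',
    '{"user": "luis", "amount": 7.0, "country": "MX"}',
]

def read_with_schema(lines, columns):
    """Apply a schema at read time (late binding).

    Each raw line is parsed only when it is read, and just the requested
    columns are extracted. A column missing from a record yields None.
    """
    rows = []
    for line in lines:
        record = json.loads(line)
        rows.append(tuple(record.get(col) for col in columns))
    return rows

# First version of the read-time schema: two columns.
v1 = read_with_schema(RAW_LINES, ["user", "amount"])

# Updated schema: the "country" field becomes visible for records that
# already carried it, without reloading or rewriting the stored data.
v2 = read_with_schema(RAW_LINES, ["user", "amount", "country"])

print(v1)
print(v2)
```

Under schema-on-write, by contrast, the second record's country value would have needed a new column to exist before the record could be loaded.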