Join Cloudian, Hortonworks and 451 Research for a panel-style Q&A discussion about the latest trends and technology innovations in Big Data and Analytics. Matt Aslett, Data Platforms and Analytics Research Director at 451 Research, John Kreisa, Vice President of Strategic Marketing at Hortonworks, and Paul Turner, Chief Marketing Officer at Cloudian, will answer your toughest questions about data storage, data analytics, log data, sensor data and the Internet of Things. Bring your questions or just come and listen!
Cloudian 451-hortonworks - webinar
1. Big Data Storage and Analytics Q&A
Matthew Aslett, research director
2. Webinar Logistics
● Be on the lookout for polling questions
● You may ask questions at any time during the presentation by using the Q&A box
● On-demand viewers: please tweet us your questions @cloudianstorage
● At the end of the presentation, please provide feedback and rate us
3. 451 Research is an information technology research & advisory company
Founded in 2000
210+ employees, including over 100 analysts
1,000+ clients: technology & service providers, corporate advisory, finance, professional services, and IT decision makers
12,500+ senior IT professionals in our research community
Over 52 million data points each quarter
4,500+ reports published each year covering 2,000+ innovative technology & service providers
Headquartered in New York City with offices in London, Boston, San Francisco, and Washington D.C.
451 Research and its sister company Uptime Institute comprise the two divisions of The 451 Group
Research & Data
Advisory Services
Events
Copyright (C) 2015 451 Research LLC
4. Our Speakers
Paul Turner leads marketing, product planning and strategy at Cloudian. A storage industry expert, he joined Cloudian from NetApp, where he ran the Product Strategy Office, guiding its investments in FlashRay, Iongrid and CacheIQ. Paul has more than 23 years of development and management leadership, including 15 years at Oracle.
Matt Aslett, Research Director for the data platforms and analytics research channel, has overall responsibility for the coverage of operational and analytic databases, data integration, data quality, and business intelligence. Matt's own primary area of focus is on relational and non-relational databases (including NoSQL and NewSQL), data warehousing, data caching, and Hadoop. Matt is also an expert in open source software and regularly contributes to 451 Research's open source-related research.
John Kreisa, a veteran of the enterprise marketing industry, has worked on products at every level of the IT stack, from the depths of storage through to the insight of business intelligence and analytics. John currently leads partner and strategic marketing initiatives at open source leader Hortonworks, which develops, distributes and supports Apache Hadoop.
5. Big data: cause and effect
• Apache Hadoop
• Object storage
• NoSQL
• Stream processing
• Predictive analytics
• Data wrangling
CAUSE?
6. Big data: cause and effect
• Apache Hadoop
• Object storage
• NoSQL
• Stream processing
• Predictive analytics
• Data wrangling
CAUSE?
• Volume
• Velocity
• Variety
EFFECT
7. Big data: cause and effect
• Apache Hadoop
• Object storage
• NoSQL
• Stream processing
• Predictive analytics
• Data wrangling
• Volume
• Velocity
• Variety
EFFECT
EFFECTED
CAUSE
8. Big data: cause and effect
• Apache Hadoop
• Object storage
• NoSQL
• Stream processing
• Predictive analytics
• Data wrangling
• Volume
• Velocity
• Variety
Economics:
• Commodity hardware
• Open source software
EFFECT
EFFECTED
CAUSE
9. Big data is driven by economics
“Big data is what happened when the cost of keeping information became less than the cost of throwing it away.” – George Dyson
“Big data: New business insights based on storing, processing and analyzing data that was previously ignored due to the cost and functional limitations of traditional data management technologies.” – 451 Research
10. Big data is driven by economics
What happened when the cost of keeping information became less than the cost of throwing it away?
11. Big data is driven by economics
What happened when the cost of keeping information became less than the cost of throwing it away?
• The processing and analysis of very large data sets in their entirety
• Increased adoption of massively parallel processing approaches
• Storage and analysis of both structured and multi-structured data
• Integration of external (social) and corporate data for more complete perspective
• Schema-free and schema-on-read approaches to data storage/analysis
• Adoption of exploratory analytic approaches to identify new patterns in data
• Predictive analytics as a fundamental component of BI strategies
• Machine-learning algorithms automate the reflection of collective intelligence
• Increased adoption of in-memory databases for rapid data ingestion
• Real-time analysis of data prior to storage within the data warehouse/Hadoop
• Interactive, native, SQL-based analysis of data in Hadoop and HBase
• Large-scale processing of sensor and other machine-generated data/events
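The schema-on-read approach named in the list above can be illustrated with a minimal Python sketch. It is not taken from the webinar: the sample events, the field names (`device`, `kwh`, `temp_c`) and the helper `read_with_schema` are all hypothetical, chosen only to show that raw records are stored untouched and a schema is applied when a reader asks for specific fields.

```python
import json

# Hypothetical raw events, stored exactly as they arrived.
# No schema is enforced at write time, so records may differ in shape.
RAW_EVENTS = [
    '{"device": "meter-1", "kwh": "3.2", "ts": "2015-06-01T10:00:00"}',
    '{"device": "meter-2", "temp_c": 21.5}',
    '{"device": "meter-1", "kwh": 4.1, "extra": "ignored"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: select and cast only the requested fields."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            # Missing fields become None instead of failing the whole read.
            row[field] = cast(value) if value is not None else None
        yield row


# One reader's view of the data: energy usage per device.
energy_schema = {"device": str, "kwh": float}
rows = list(read_with_schema(RAW_EVENTS, energy_schema))
total_kwh = sum(r["kwh"] for r in rows if r["kwh"] is not None)

print(rows)
print(round(total_kwh, 1))
```

A different reader could pass another schema (for example `{"device": str, "temp_c": float}`) over the same stored lines, which is the point of deferring the schema to read time.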
12. Big data: cause and effect
• Apache Hadoop
• Object storage
• NoSQL
• Stream processing
• Predictive analytics
• Data wrangling
• Volume
• Velocity
• Variety
Economics:
• Commodity hardware
• Open source software
EFFECT
EFFECTED
CAUSE
IoT
16. Your Data at Webscale Economics
HyperStore: Software Defined Storage
REPLICATION (RF=1,2,3,4)
ERASURE CODING (N+1,2,3,4)
COMPRESSION (Zlib, lz4)
Commodity Servers
Scale Out
Durable
Simple to Use
CPU, Disks, Network
Heterogeneous Node
100TB
300TB
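To make the trade-off between the replication and erasure coding options on this slide concrete, here is a rough Python sketch of the raw capacity each needs for a given amount of usable data. It rests on assumptions that are not in the slide: "N+k" is read as N data fragments plus k parity fragments, the value N=4 is purely illustrative, and compression and metadata overhead are ignored. The function names are hypothetical and are not part of any HyperStore API.

```python
def raw_capacity_replication(usable_tb, replication_factor):
    """Raw capacity when every object is stored replication_factor times."""
    return usable_tb * replication_factor


def raw_capacity_erasure(usable_tb, data_fragments, parity_fragments):
    """Raw capacity when each object is split into data_fragments pieces
    and parity_fragments extra pieces are added (assumed meaning of N+k)."""
    total_fragments = data_fragments + parity_fragments
    return usable_tb * total_fragments / data_fragments


USABLE_TB = 100  # illustrative amount of user data
ASSUMED_N = 4    # hypothetical data fragment count; the slide does not give N

for rf in (1, 2, 3, 4):
    print(f"RF={rf}: {raw_capacity_replication(USABLE_TB, rf)} TB raw")

for k in (1, 2, 3, 4):
    raw = raw_capacity_erasure(USABLE_TB, ASSUMED_N, k)
    print(f"{ASSUMED_N}+{k}: {raw} TB raw")
```

Under these assumptions, a 4+2 layout tolerates the loss of two fragments while needing half the raw capacity of three-way replication, which tolerates the loss of two copies.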
17. Smart Data
Consumer Activity (Events, GPS, WiFi)
Social Media
Device Tracking and Logs
Cloudian HyperStore
INTERNET OF THINGS
BIG DATA
Event processing platform
✓ Analyze more: allows for efficient bulk data analysis in place
✓ Faster time-to-decision
✓ HyperStore scales out with your data, adding nodes for I/O
Analytics
Result of Analysis
20. Use Cases
Hadoop for Internet of Things
Clickstream data
Analysis of what people click on: individual web pages and in what order. Clickstream analysis can reveal how users research products and also how they complete their online purchases.
✓ Internet Marketing
✓ Online Commerce
Sentiment data
Unstructured data on opinions, emotions, and attitudes from sources like social media posts, blogs, online product reviews and customer support interactions. Organizations use sentiment analysis to understand how the public feels about something and track how those opinions change over time.
✓ Retail
✓ Media & Entertainment
Server log data
Large enterprises build, manage and protect their own proprietary, distributed information networks. Server logs are the computer-generated records that report data on the operations of those networks. When there is a problem, they are one of the first places the IT team looks for a diagnosis.
✓ IT Organizations
✓ Customer Support
Sensor data
From refrigerators and coffee makers to energy-measuring smart meters, sensor data is everywhere. It is created by the machinery that runs assembly lines and the cell towers that route our phone calls. It is net new data that is increasing exponentially in the information age.
✓ Manufacturing
✓ Industrial
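As a small illustration of the clickstream use case above (which pages people click on, and in what order), the following Python sketch counts page-to-page transitions per session. The sample clicks, the tuple layout `(session_id, timestamp, page)` and the function `page_transitions` are hypothetical; a real deployment on Hadoop would run an equivalent aggregation at far larger scale.

```python
from collections import Counter, defaultdict

# Hypothetical sample clicks: (session_id, timestamp, page)
CLICKS = [
    ("s1", 1, "/home"), ("s1", 2, "/product"), ("s1", 3, "/checkout"),
    ("s2", 1, "/home"), ("s2", 2, "/product"),
    ("s3", 1, "/search"), ("s3", 2, "/product"), ("s3", 3, "/checkout"),
]


def page_transitions(clicks):
    """Count how often users move from one page directly to another."""
    sessions = defaultdict(list)
    for session_id, timestamp, page in clicks:
        sessions[session_id].append((timestamp, page))

    transitions = Counter()
    for events in sessions.values():
        # Order each session by timestamp, then pair consecutive pages.
        pages = [page for _, page in sorted(events)]
        transitions.update(zip(pages, pages[1:]))
    return transitions


transitions = page_transitions(CLICKS)
for (source, target), count in transitions.most_common():
    print(f"{source} -> {target}: {count}")
```

Transitions that end at a checkout page hint at how purchases are completed, while sessions that stop earlier show where users drop off.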