Matthew Johnston - Big Data Futures Outlook BCM
1. Big Data overview & future considerations for the BCM Professional
Matthew Johnston
Managing Director, South Asia
Dell | Software
10/31/2013
4. The world of data is changing
4.3 connected devices per adult
85% from new data types
10X increase every five years
27% use social media
By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent.
Gartner, “Information Management in the 21st Century”
5. Data is created and consumed at a rapid pace…
$232 billion will be spent on Big Data through 2016.
70% of data is created by consumers. But enterprises are responsible for storing and managing 80% of it.
247 billion emails are sent every day. 80% are spam.
4.4 million IT jobs globally will be created to support big data. Only 1/3 will be filled.
$600 billion is wasted annually on bad or poor-quality data.
48 hours of video are uploaded to YouTube every minute, resulting in 8 years of content daily.
37.5% of large organizations said that analyzing big data is their biggest challenge.
1.8 zettabytes of business data were in use in 2011, up 30% from 2010.
200 million photos are uploaded to Facebook every day. That’s 6 billion pictures every month.
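The YouTube and Facebook figures above follow from simple arithmetic. The sketch below reproduces them. It assumes a 365-day year and a 30-day month, neither of which the slide states, and the function names are illustrative only.

```python
HOURS_UPLOADED_PER_MINUTE = 48    # figure quoted on the slide
PHOTOS_PER_DAY = 200_000_000      # figure quoted on the slide


def years_of_video_per_day(hours_per_minute):
    """Convert an upload rate (hours of video per minute) to years of content per day."""
    hours_per_day = hours_per_minute * 60 * 24
    return hours_per_day / 24 / 365   # assumption: 365-day year


def photos_per_month(photos_per_day, days=30):
    """Scale a daily photo count to a month (assumption: 30 days)."""
    return photos_per_day * days


print(round(years_of_video_per_day(HOURS_UPLOADED_PER_MINUTE), 1))  # 7.9
print(photos_per_month(PHOTOS_PER_DAY))                             # 6000000000
```

Under these assumptions the upload rate works out to just under 8 years of video per day, which matches the rounded figure on the slide.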
6. …generating new sets of questions
• Why is our product more popular with teenagers?
• How do I capture, analyze, and manage all of this data?
• How will my social media campaign impact my product launch?
• Will monsoons impact my sales in Indonesia and parts availability from my suppliers next quarter?
• How do I turn this data into operational intelligence?
• How do I make the connections?
Advanced Analytics
Social and Web Analytics
Live Data Feeds
7. What is Big Data?
Information of increasing…
• Volume: A large amount of data, growing at rapid rates
• Variety: Wide range of data types and structure
• Velocity: Data that must be processed at high speed to facilitate rapid decisions
Results in datasets too large or complex for typical database or data management tools to capture, manage, and analyze
8. Big Data versus traditional database
Traditional Database
“Schema-on-Write”
• Schema must be created before any data can be loaded
• An explicit load operation has to take place in order to transform data to the DB internal structure
• New columns must be added explicitly before new data for those columns can be loaded into the database
Pros
1. Reads are fast
2. Standards and governance
Big Data (Hadoop)
“Schema-on-Read”
• Data is simply copied to the file store (HDFS), no transformation is needed
• As data is read from the file store (HDFS), the required columns are extracted during the process
• New data can start flowing at any time since the schema is created as part of the process
Pros
1. Loads are fast
2. Flexibility and agility
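The contrast between the two approaches can be illustrated with a minimal sketch. It is not Hadoop code: an in-memory SQLite table stands in for the traditional database, a plain Python list stands in for the HDFS file store, and the sample records and function names are hypothetical.

```python
import json
import sqlite3

# Hypothetical raw records; the second carries a field ("device")
# that the first one does not have.
RAW_EVENTS = [
    '{"user": "alice", "minutes": 12}',
    '{"user": "bob", "minutes": 7, "device": "tablet"}',
]


def schema_on_write(raw_events):
    """Create the schema first, then transform each record to fit it at load time."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE events (user TEXT, minutes INTEGER)")
    for line in raw_events:
        record = json.loads(line)
        # Fields outside the declared schema ("device") are dropped here;
        # keeping them would require adding the column explicitly first.
        db.execute(
            "INSERT INTO events VALUES (?, ?)",
            (record["user"], record["minutes"]),
        )
    return db.execute("SELECT user, minutes FROM events ORDER BY user").fetchall()


def schema_on_read(raw_events, columns):
    """Copy records untouched; extract only the requested columns at read time."""
    stored = list(raw_events)  # the "load" is a plain copy, no transformation
    return [tuple(json.loads(line).get(c) for c in columns) for line in stored]


print(schema_on_write(RAW_EVENTS))
print(schema_on_read(RAW_EVENTS, ["user", "device"]))
```

In the schema-on-read path the new "device" field is available as soon as a reader asks for it, while the schema-on-write path silently loses it until the table definition is changed.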
9. Forces driving Big Data advancement
HPC: Enabling exascale computing on massive data sets
Cloud: Helping enterprises build open interoperable clouds
Open Source: Contributing code and fostering ecosystem
10. Think Big Data when…
• Impossible / impractical to perform data analysis with existing technology stack
• Relevant data exists across multiple data sources and various formats
• Streams of data are being generated, but capturing, storing and processing presents challenges
• Cost to scale is prohibitively high
• Large volumes of useful archived data reside on tapes (unrecoverable after a certain period of time)
• Most of the data needs to be analyzed rather than just a small subset of the data
12. Big Data in action: Mobile subscriber QoS
• Measure, save, replicate, then compare and understand what factors influence the number of people visiting a location at any time
• Use analysis to improve subscriber quality of service
13. Big Data in action: IP TV subscriber recommendation engine
• Collect subscriber clickstreams and viewing history
• Add subscriber metadata from web-based movie database
• Provide viewing recommendations to subscribers
[Diagram: Clickstreams, EPG, VoD]
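The three steps above can be sketched as a small genre-matching recommender. This is an illustrative assumption, not the method used in the deployment the slide describes: the viewing history, the title metadata, and the `recommend` function are all hypothetical stand-ins.

```python
from collections import Counter

# Step 1 (hypothetical data): viewing history collected from clickstreams.
VIEWING_HISTORY = {
    "sub-1": ["Title A", "Title B"],
    "sub-2": ["Title D"],
}

# Step 2 (hypothetical data): metadata standing in for a web-based movie database.
TITLE_GENRES = {
    "Title A": {"crime", "thriller"},
    "Title B": {"action", "thriller"},
    "Title C": {"crime", "thriller"},
    "Title D": {"comedy", "romance"},
    "Title E": {"drama", "romance"},
}


def recommend(subscriber, top_n=1):
    """Step 3: rank unseen titles by overlap with the subscriber's genre profile."""
    watched = set(VIEWING_HISTORY.get(subscriber, []))
    # Count how often each genre appears in what the subscriber has watched.
    profile = Counter(g for t in watched for g in TITLE_GENRES.get(t, ()))
    scores = {
        title: sum(profile[g] for g in genres)
        for title, genres in TITLE_GENRES.items()
        if title not in watched
    }
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return [t for t in ranked if scores[t] > 0][:top_n]


print(recommend("sub-1"))  # ['Title C']
print(recommend("sub-2"))  # ['Title E']
```

A production system would typically work from far larger clickstream volumes and richer signals; the sketch only shows how history and metadata combine into a ranking.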
14. Big Data in action: Financial services
• Overcome increasingly cumbersome data volumes scaling into petabytes that hinder analysis
• Gain operational efficiency by moving jobs to technology designed to process multiple data types
• Empower business users to ask different questions to improve decision making
16. What is BCM and why Big Data?
Business Continuity Management (BCM)
Is a holistic management process which helps to assess, plan and strengthen the resilience of your value chain.
Business Continuity National Focal Point - http://www.bcm.org.sg/
Big Data
Enables the assessment & analysis of large disparate data sets.
17. Use cases for Big Data (Hadoop)
Predictive Analytics (bigger questions)
• Customer 360 View
• Content Optimization
• Recommendation Engine
• Network Analytics
• Fraud Detection
Operational Data Processing (data pain points)
• EDW Augmentation
• ETL Offload
• Batch Processing
• Data Reservoir
• Log Processing
18. Use cases relevant to BCM
Predictive Analytics (bigger questions)
• Customer 360 View
• Content Optimization
• Recommendation Engine
• Network Analytics
• Fraud Detection
Operational Data Processing (data pain points)
• EDW Augmentation
• ETL Offload
• Batch Processing
• Data Reservoir
• Log Processing
20. Information Management
• Database management: Manage data and databases across structured and non-structured data sources, cloud or on-premise
• Application & data integration: Integrate disparate data stores, cloud and on-prem
• Business intelligence and Big data analytics: Discover trends and make informed decisions with analysis based on all data
21. Information Management
Database management
Application & data integration
Business intelligence and Big data analytics
Data type agnostic
Vendor agnostic
Data location agnostic
One Vendor – Complete Tool Chain - All Data
22. Dell Software Solutions
• Data center & cloud management: Client management, Performance management, Virtualization & cloud mgmt, Windows server mgmt
• Information management: Database management, Application & data integration, Business intelligence/analytics
• Security: Identity & access management, Network security, Endpoint security, Email security
• Mobile workforce management: Mobile device mgmt, Desktop virtualization, Application/data access, Secure remote access
• Data protection: Enterprise backup & recovery, Virtual protection, Application protection, Disaster recovery
23. Summary
• BCM is about risk mitigation
• Big Data is about understanding data
• BCM + Big Data = Better understanding & analysis of the risk