http://www.spiral16.com Spiral16 product architect Aaron Weber presented this keynote at DST Systems' 2012 Transfer Agency Executives' Forum in Dallas, TX on Nov. 7, 2012. Big data is a big buzzword right now, but what does it really mean? Big data allows companies to identify trends, target customers more efficiently, run predictive analysis (see Nate Silver for recent proof), and make better use of what they already know.
No one industry has the market cornered on big data. Financial services, healthcare, retail, politics, and marketing can all benefit from aggregating and analyzing big data.
What is big data? Inevitable. And that's a good thing.
Demystifying Big Data
1. Demystifying
Big Data
Aaron Weber, Product Architect
Spiral16
2. Since 2007, Spiral16's group of developers, data
analysts and researchers have been quietly building
one of the most powerful data mining platforms on
the planet, backed up by one of the most
experienced social data analysis teams in the
industry.
Spiral16's social media and web research
platform and data analysis services are designed to
help everyone from CEOs to CMOs, marketing and
PR agencies, and researchers.
Demystifying Big Data
11/8/2012 2
3. So what is Big Data?
4. So what is Big Data?
“A collection of data sets so large and
complex that it becomes difficult to
process using on-hand database
management tools.”
- Wikipedia
5. So what is Big Data?
[Chart: Exabytes Created By Year, 2006 to 2012; vertical axis from 0 to 3000 exabytes]
6. So what is Big Data?
In 2008 we were generating, every two days,
as much stored data as was created from the
dawn of civilization to 2003.
And that rate is predicted to
double every two years.
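A rough sketch of what "double every two years" implies, assuming smooth exponential growth from a 2008 baseline. The function name and the choice of years are illustrative only and do not come from the talk.

```python
def projected_multiplier(years_elapsed: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor after `years_elapsed` if the rate doubles every `doubling_period_years`."""
    return 2.0 ** (years_elapsed / doubling_period_years)


# Assumption: 2008 is the baseline year, as on the slide.
for year in range(2008, 2021, 2):
    factor = projected_multiplier(year - 2008)
    print(f"{year}: {factor:g}x the 2008 rate")
```

Under this assumption the rate in 2012 would be four times the 2008 rate, and each further two years doubles it again.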
7. So what is Big Data?
8. So what is Big Data?
Inevitable
9. Big Data vs Little Data
Little Data =
Relational Databases
10. Big Data vs Little Data
Big Data =
Non-Relational Databases
11. Big Data vs Little Data
15. Big Data vs Little Data
• Structured
• Organized
• Hierarchical
• Rigid
16. Big Data vs Little Data
20. Big Data vs Little Data
• Unstructured
• Disparate
• Non-Hierarchical
• Reusable
24. So why Big Data?
• Find trends in existing data
• Better consumer targeting
• Predictive analysis
• Making better use of what you
already know
25. Who is using Big Data?
26. Who is using Big Data?
o Financial Services
o Healthcare
o Retail
o Marketing
o Politics
28. The Business of Big Data
Vendor | Founded | Funding (in $US mil.) | # of Institutional Rounds | Investors
Palantir | 2004 | $301 | 7 | SAC Capital, The Founders Fund, Glynn Capital, In-Q-Tel, Reed Elsevier Ventures, Ulu Ventures, Youniversity Ventures and Jeremy Stoppelman
Mu Sigma | 2004 | $133 | 2 | General Atlantic and Sequoia Capital
Opera Solutions | 2004 | $84 | 1 | Silver Lake Sumeru, Accel-KKR, Invus Financial Advisors, JGE Capital and Tola Capital
Cloudera | 2008 | $81 | 4 | Accel Partners, Greylock Partners and Meritech Capital Partners
10gen | 2008 | $73.4 | 5 | New Enterprise Associates, Sequoia Capital, Flybridge Capital and Union Square Ventures
ParAccel | 2005 | $73 | 5 | Amazon, Menlo Ventures, Mohr Davidow Ventures, Bay Partners, Walden International, Tao Venture Capital Partners and Silicon Valley Bank
GoodData | 2007 | $53.5 | 3 | Andreessen Horowitz, General Catalyst, O’Reilly AlphaTech Ventures, Windcrest Partners, Tenaya Capital and Next World Capital
Splunk(1) | 2003 | $40 | 3 | Ignition Partners, August Capital, JK&B and Sevin Rosen Funds
DataStax | 2010 | $38.7 | 3 | Meritech Capital, Lightspeed Venture Partners, Sequoia Capital and Crosslink Capital
1010data | 2000 | $35 | 1 | Norwest Venture Partners
31. So what is Big Data?
Inevitable
And that’s a good thing.
32. Big Data’s Big Questions
o What about the data storage and
utilization we already have?
o How do we know if our data is Big Data?
o What are the primary costs of big data?
o What answers can I get from our data?
o Where do we begin?
33. Better data. Better decisions.
7171 West 95th Street
Suite 310
Overland Park, KS 2208
913.944.4500
www.spiral16.com
Aaron Weber
aaron.weber@spiral16.com
Editor's Notes
So how large and complex are we talking about?
In fact we’ve moved past exabytes into zettabytes. To put this in perspective: That’s 2.7 TRILLION gigabytes of data.
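The conversion in the note above can be checked with a small sketch. It assumes decimal (SI) units, and it assumes the underlying figure is 2.7 zettabytes. That value is inferred from the "2.7 trillion gigabytes" in the note and is not stated directly. The function names are illustrative.

```python
from decimal import Decimal

# Decimal (SI) byte units; binary units (2**30 and so on) would give slightly different numbers.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_EXABYTE = 10**18
BYTES_PER_ZETTABYTE = 10**21


def zettabytes_to_gigabytes(zettabytes: Decimal) -> Decimal:
    """Convert zettabytes to gigabytes using SI prefixes."""
    return zettabytes * BYTES_PER_ZETTABYTE / BYTES_PER_GIGABYTE


def zettabytes_to_exabytes(zettabytes: Decimal) -> Decimal:
    """Convert zettabytes to exabytes using SI prefixes."""
    return zettabytes * BYTES_PER_ZETTABYTE / BYTES_PER_EXABYTE


# Assumption: 2.7 ZB is the figure behind the note's "2.7 trillion gigabytes".
total = Decimal("2.7")
print(f"{zettabytes_to_gigabytes(total):,.0f} GB")
print(f"{zettabytes_to_exabytes(total):,.0f} EB")
```

Under these assumptions, 2.7 zettabytes is 2,700 exabytes, which is in line with the top of the "Exabytes Created By Year" chart.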
So what is Big Data?
There is simply no way with our current levels of technology to process data in the manner we’re used to. Big Data is the technological path we have to take if we have any desire to make sense of the world we’re creating every day.
Little Data is a misnomer here. In the real world we’re still talking about massive sets of data. Warehouses full of it, in fact. What we’re really talking about is non-relational vs relational databases. An illustration:
But more importantly, Big Data is something else:
This last one is the important one: Non-structured data left in its original state is infinitely reusable. Instead of dozens or hundreds of silos of information, you can reduce data (and management) duplication for a unified pool of information that can be used for vastly different ends.
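To make the relational versus non-relational contrast concrete, here is a minimal sketch using only the Python standard library. The table, the field names, and the sample records are invented for illustration. A plain list of dictionaries stands in for a non-relational store and is not how any particular product works.

```python
import sqlite3

# "Little Data": a relational table with a fixed, predeclared schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, state TEXT)")
conn.execute("INSERT INTO customers (id, name, state) VALUES (1, 'Ada', 'KS')")


def insert_with_extra_column() -> bool:
    """Try to store a field the schema does not declare; report whether it worked."""
    try:
        conn.execute("INSERT INTO customers (id, name, twitter) VALUES (2, 'Bob', '@bob')")
        return True
    except sqlite3.OperationalError:
        return False  # rigid: the table would have to be altered first


# "Big Data": records kept in their original, loosely structured form.
documents = [
    {"source": "crm", "name": "Ada", "state": "KS"},
    {"source": "twitter", "handle": "@bob", "text": "Loving the new fund app"},
    {"source": "web", "url": "http://example.com/review", "text": "Fees seem high"},
]


def by_source(docs, source):
    """One use of the pool: select records by where they came from."""
    return [d for d in docs if d.get("source") == source]


def mentioning(docs, word):
    """A different use of the same pool: find records whose text mentions a word."""
    return [d for d in docs if word.lower() in d.get("text", "").lower()]


print(insert_with_extra_column())
print(by_source(documents, "twitter"))
print(mentioning(documents, "fees"))
```

The relational insert fails because the schema has no such column, while the same unmodified pool of documents answers two unrelated questions. That is the reuse the note describes.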
Gartner’s Hype Cycle – The Peak of Inflated Expectations