Introduction to Big Data (non-technical) and the importance of Data Science to create meaning.
First we define Big Data in light of the 3 Vs: volume, velocity and variety. Next we redefine Big Data and touch on the topic of the data lake. We then look at how Big Data will become mainstream for small organisations as well, what Big Data enables, how to tackle Big Data projects, which challenges lie ahead and which opportunities there are to reap. Finally, we stress how important data science is for finding the meaning in all that data.
3. What is Big Data?
“Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures.”
–Edd Dumbill
http://radar.oreilly.com/2012/01/what-is-big-data.html
4. The 3 V’s of Big Data
• Volume
• Velocity
• Variety
• (Veracity)
9. What is Big Data? (revisited)
New tools and technologies to store and process all data on a cluster of commodity hardware, so that the system acts as one, is resilient and scales linearly.
10. So what?
The data lake is a large data pool in which the schema and data requirements are not defined until the data is queried, processed, analysed or delivered as information to the end user.
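For readers who want to see what "schema not defined until the data is queried" can look like in practice, here is a minimal sketch in Python. It is an illustration only: the sample events, the function name `read_with_schema` and the field names are all hypothetical and are not taken from any particular data lake product.

```python
import json

# Hypothetical raw events, stored exactly as they arrived.
# Nothing checked or enforced a schema when they were written.
RAW_EVENTS = [
    '{"user": "an", "amount": "12.50", "ts": "2014-10-31"}',
    '{"user": "bob", "clicks": 3}',
    '{"user": "an", "amount": "7.25"}',
]

def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: keep only records that contain every
    requested field, and cast each field to the requested type."""
    for line in raw_lines:
        record = json.loads(line)
        if all(field in record for field in schema):
            yield {field: cast(record[field]) for field, cast in schema.items()}

# The schema is chosen by the question being asked, not by the storage layer.
spend_schema = {"user": str, "amount": float}

total_spend = {}
for row in read_with_schema(RAW_EVENTS, spend_schema):
    total_spend[row["user"]] = total_spend.get(row["user"], 0.0) + row["amount"]
```

A different question, such as counting clicks per user, would simply pass a different schema over the same untouched raw data.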
11. So what?
“We don’t do Hadoop because we have Big Data; we do Big Data because we have Hadoop.”
–???
12. So what?
“In the years ahead, the same power that big data awards enterprise companies will be the norm for small business.”
–Matt Ehrlichman
http://blogs.wsj.com/accelerators/2014/10/31/matt-ehrlichman-big-data-for-small-firms/
13. What does Big Data enable?
• Combine data from inside and outside your
organisation
• Build new products and services
• Analyse all data (e.g. 5 TB of historic event data at rest in an Oracle database)
14. Big Data is no panacea
• First decide what problem you want to solve; pick a
real business problem to add immediate value
• Start small: the technology is made for linear
scalability (a 3-node cluster is a cluster!)
• Then become lean: learn through experimentation
15. Big Data challenges
• Beware of hype, Big Data-washing and fads
• Tech infancy
• IT | Biz
• Data is hard
• Lack of skills!
shameless self plug: BigBoards!
16. Big Data opportunity
• Big Data is here to stay
• Vendor market is HUGE and will grow massively as
Big Data blends into the datacenter
• However, the Practitioner market can deliver
EXPONENTIALLY more value
17. Call for Action
It is time to band together
and build these systems that deliver this kind of value
for fun
for profit
for good
for Belgium?