SlideShare ist ein Scribd-Unternehmen logo
1 von 26
Downloaden Sie, um offline zu lesen
The Reality of Big Data
#beltech2014
#1 – What problem are you
trying to solve?
Most of SME’s problems aren’t
Big Data, it’s just data.
Without a question you are
wasting your time.
#2 – Data will need cleaning
Roughly 80% of your data project
will be getting the data into shape
before processing.
Btiany Spears
#3 – Hadoop, on it’s own, will
NOT give you the answers.
#3 – Hadoop, on it’s own, will
NOT give you the answers.
(The Big Data version of “putting it in
the cloud”)
If anyone says, “will Hadoop just
give us the answers” or “put it in the
cloud”, do this….
Spit on one, or both, of their feet
and bite your thumb while shouting:
“The fig of Spain!”.
#4 – Do you actually need
Hadoop?
A well crafted algorithm may give
you more benefit.
It’s about knowing the right
questions.
And refining and refining and
refining…..
The first run won't work at all
The second only makes you wonder
The third will have you on your
knees.....
#5 – Data changes
…especially when you don’t own
it.
If you feel your data has value
then retain it.
If your data passes over the “creepy
line” then definitely retain it.
#6 – Skills are in short supply
Work with what you have.
Play with data, it’s the best way
to learn.
Collaborate with others to fill the
skills gaps.
Thank you
http://about.me/jasebell
@hadooping

Weitere ähnliche Inhalte

Ähnlich wie The Reality of Bigdata - #Beltech2014

Be a Top Notch PM Using Data Science by Farfetch Product Leader
Be a Top Notch PM Using Data Science by Farfetch Product LeaderBe a Top Notch PM Using Data Science by Farfetch Product Leader
Be a Top Notch PM Using Data Science by Farfetch Product LeaderProduct School
 
Want a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths Now
Want a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths NowWant a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths Now
Want a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths NowDana Gardner
 
Future Proofing Your IT Operating Model for Digital
Future Proofing Your IT Operating Model for DigitalFuture Proofing Your IT Operating Model for Digital
Future Proofing Your IT Operating Model for DigitalDavid Favelle
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products Dataiku
 
Big data & hadoop Introduction
Big data & hadoop IntroductionBig data & hadoop Introduction
Big data & hadoop IntroductionMohit Shukla
 
Semantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - IntroSemantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - IntroStephen Lahanas
 
Is That Normal? Behaviour Modelling On The Cheap
Is That Normal? Behaviour Modelling On The CheapIs That Normal? Behaviour Modelling On The Cheap
Is That Normal? Behaviour Modelling On The CheapMark Nunnikhoven
 
DataDay 2023 Presentation - Notes
DataDay 2023 Presentation - NotesDataDay 2023 Presentation - Notes
DataDay 2023 Presentation - NotesMax De Marzi
 
Data Mining and Data Warehouse
Data Mining and Data WarehouseData Mining and Data Warehouse
Data Mining and Data WarehouseAnupam Sharma
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)mark madsen
 
Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online caniceconsulting
 
Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsIBM Analytics
 
Building a Data Driven Organization
Building a Data Driven OrganizationBuilding a Data Driven Organization
Building a Data Driven OrganizationIT Weekend
 
The 7 steps to escape spreadsheet hell
The 7 steps to escape spreadsheet hellThe 7 steps to escape spreadsheet hell
The 7 steps to escape spreadsheet hellVicky Bradford
 
Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science teamAshish Bansal
 
The Myths of Big Data
The Myths of Big DataThe Myths of Big Data
The Myths of Big DataProphet
 
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) Dataiku
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectPAPIs.io
 

Ähnlich wie The Reality of Bigdata - #Beltech2014 (20)

Be a Top Notch PM Using Data Science by Farfetch Product Leader
Be a Top Notch PM Using Data Science by Farfetch Product LeaderBe a Top Notch PM Using Data Science by Farfetch Product Leader
Be a Top Notch PM Using Data Science by Farfetch Product Leader
 
Want a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths Now
Want a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths NowWant a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths Now
Want a Data-Driven Culture? Start Sorting Out the BI and Big Data Myths Now
 
Future Proofing Your IT Operating Model for Digital
Future Proofing Your IT Operating Model for DigitalFuture Proofing Your IT Operating Model for Digital
Future Proofing Your IT Operating Model for Digital
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products
 
Analyzing social media with Python and other tools (1/4)
Analyzing social media with Python and other tools (1/4)Analyzing social media with Python and other tools (1/4)
Analyzing social media with Python and other tools (1/4)
 
Big data & hadoop Introduction
Big data & hadoop IntroductionBig data & hadoop Introduction
Big data & hadoop Introduction
 
Semantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - IntroSemantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - Intro
 
Is That Normal? Behaviour Modelling On The Cheap
Is That Normal? Behaviour Modelling On The CheapIs That Normal? Behaviour Modelling On The Cheap
Is That Normal? Behaviour Modelling On The Cheap
 
DataDay 2023 Presentation - Notes
DataDay 2023 Presentation - NotesDataDay 2023 Presentation - Notes
DataDay 2023 Presentation - Notes
 
Data Mining and Data Warehouse
Data Mining and Data WarehouseData Mining and Data Warehouse
Data Mining and Data Warehouse
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online Module 6 The Future of Big and Smart Data- Online
Module 6 The Future of Big and Smart Data- Online
 
Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data Scientists
 
ANALYTICS OF DATA USING HADOOP-A REVIEW
ANALYTICS OF DATA USING HADOOP-A REVIEWANALYTICS OF DATA USING HADOOP-A REVIEW
ANALYTICS OF DATA USING HADOOP-A REVIEW
 
Building a Data Driven Organization
Building a Data Driven OrganizationBuilding a Data Driven Organization
Building a Data Driven Organization
 
The 7 steps to escape spreadsheet hell
The 7 steps to escape spreadsheet hellThe 7 steps to escape spreadsheet hell
The 7 steps to escape spreadsheet hell
 
Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science team
 
The Myths of Big Data
The Myths of Big DataThe Myths of Big Data
The Myths of Big Data
 
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
 

Kürzlich hochgeladen

CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)Data & Analytics Magazin
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024Becky Burwell
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.JasonViviers2
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptaigil2
 

Kürzlich hochgeladen (17)

CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .ppt
 

The Reality of Bigdata - #Beltech2014