The Evolution of Data Analytics
about:
how to grok data with machines
and keep up with changing times
The origins (40s, 50s, 60s)
Operational Research during World War II
First Predictive Weather Model on ENIAC
The origins (40s, 50s, 60s)
● Operational Research
● Collision loss vs Anti-Aircraft loss
● Optimization (Statistical) problems
● Scheduling and resource allocation
The origins (40s, 50s, 60s)
● ENIAC predicting weather
● Barometric equations
● 24 hours compute time (mostly manual work)
Analytics goes Mainstream
(70s, 80s)
● The Relational Database is born!
1972: E.F. Codd relational database model, normalization:
(free from insertion, deletion and update anomalies)
1976: Peter Chen, The entity-relationship model
● 1982: IBM DB2, Oracle v3, Sybase (SAP)
● 1986: First standardized SQL
● 1987: Commercial use of Decision Support Systems:
Texas Air Traffic Expert system
http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/system360/impacts/
Exploratory Data Analysis
In 1977, Tukey published Exploratory Data Analysis,
arguing that more emphasis needed to be placed on using
data to suggest hypotheses to test and that Exploratory
Data Analysis and Confirmatory Data Analysis “can—and
should—proceed side by side.”
The Internet goes Global
(90s)
● 1995: Amazon
● 1995: eBay
● 1996: Hotmail
● 1998: Google
● 1998: PayPal
Knowledge Discovery in Databases (1996)
What is all the excitement about? This article provides an overview of
this emerging field, clarifying how data mining and knowledge
discovery in databases are related both to each other and to related
fields, such as machine learning, statistics, and databases.
AI Magazine Volume 17 Number 3 (1996) (© AAAI)
http://www.aaai.org/ojs/index.php/aimagazine/article/view/1230/1131
The Internet goes Global
(90s)
● Analytics (OLAP):
Long queries, aggregations, data mining, reporting, models
● Operations (OLTP):
Fast transactions, ACID, consistent, available, fault-tolerant
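The OLAP/OLTP split above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the orders table, its columns, and the sample rows are hypothetical, and a single in-memory SQLite database stands in for what would normally be two separate systems.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# OLTP-style work: a short transaction touching a few rows.
# The `with` block commits on success and rolls back on error.
with conn:
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 30.0))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("bob", 12.5))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 7.5))

# OLAP-style work: a read-only aggregation that scans the whole table.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The inserts are small and must be atomic; the aggregation is read-only but touches every row, which is why the two workloads tend to be served by different engines.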
Data warehouses and ETLs (90s)
● Building the Data Warehouse by
William Inmon (John Wiley - QED,
1992)
The World goes Social
(00s)
Web apps go into hyper-growth
● 2003: LinkedIn
● 2003: Skype
● 2004: Facebook
● 2006: Twitter
The advent of MPP OLAPs (Early 00s)
● Massive multi-rack systems
● 100s of computing cores
● 100s of terabytes of storage
● Distributed computing
● Advanced Query Plans
● Columnar Data Models
● Re-programmable hardware
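To illustrate the "Columnar Data Models" bullet, here is a minimal sketch in plain Python. The sales records and field names are hypothetical, and real MPP engines add compression, vectorized execution, and distribution across nodes, none of which is shown here.

```python
# Hypothetical sales records, for illustration only.
rows = [
    {"region": "EU", "units": 3, "price": 10.0},
    {"region": "US", "units": 5, "price": 8.0},
    {"region": "EU", "units": 2, "price": 12.0},
]

# Row layout: aggregating one field still walks every complete record.
row_total = sum(record["units"] for record in rows)

# Columnar layout: each field is stored as its own contiguous list,
# so an aggregation reads only the column it needs.
columns = {name: [record[name] for record in rows] for name in rows[0]}
col_total = sum(columns["units"])

print(row_total, col_total)
```

Both layouts give the same answer; the columnar one simply avoids reading the region and price values when only units are aggregated.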
● Vertica (HP)
● Greenplum (Pivotal)
● Netezza (IBM)
● Exadata (Oracle)
● Exasol (Exasol)
Map-Reduce and Hadoop (Early 00s)
● Simpler programming paradigm
● Distributed, Replicated File System
Map-Reduce and Hadoop (Early 00s)
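The "simpler programming paradigm" can be sketched as a single-process word count. This is not Hadoop's API: the function names map_phase, shuffle, and reduce_phase are hypothetical, and a real cluster would run the map and reduce steps on many machines over a distributed file system.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

# Hypothetical input lines, for illustration only.
lines = ["big data", "fast data", "big fast data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

The programmer writes only the map and reduce functions; grouping by key, distribution, and fault tolerance are left to the framework.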
Hadoop or MPPs or both?
Hadoop and MPPs (00s)
● MPP
for speed and accuracy,
well structured data
● Hadoop
for size, flexibility, raw files
The rise of the data scientist (late 00s)
http://flowingdata.com/2009/06/04/rise-of-the-data-scientist/
http://medriscoll.com/post/4740157098/the-three-sexy-skills-of-data-geeks
Fast Data, APIs, Mobile and IoT (10s)
● WhatsApp: in a day
● 31 billion messages sent
● 700 million photos sent
Fast Data, APIs, Mobile and IoT (10s)
New Problems:
● Hadoop is too slow (File -> File)
● Productivity of Data Science goes down
● SQL is not enough
● Distributed Machine Learning algorithms?
Streaming and Real-Time Analytics (10s)
The RAM is the new Disk (10s)
Spark is a new framework for in-memory computing
Unify in a Distributed Computing paradigm:
SQL, Machine Learning, Map-Reduce, Graph Analytics
Spark
Generality
Combine SQL, streaming, and
complex analytics.
Runs Everywhere
Spark runs on Hadoop, Mesos,
standalone, or in the cloud.
Multiple Data Sources
It can access diverse data
sources including HDFS,
Cassandra, HBase, and S3.
https://spark.apache.org/
Popular Analytical Stacks (10s)
Hadoop Hive + MPP
Spark + Cassandra (no Hadoop!)
Spark + HDFS + Elastic(Search)
Future (10s, 20s)
Micro-Batch and Event Streaming Analytics
- Micro-Batch (Spark Streaming)
- Log Oriented (Kafka, Samza)
- NewSQL (VoltDB)
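The difference between micro-batch and per-event processing can be sketched in plain Python. The micro_batches helper and the event values are hypothetical, and batches are cut by event count here for simplicity; Spark Streaming cuts them by time interval.

```python
from itertools import islice

def micro_batches(events, batch_size):
    """Yield consecutive lists of up to batch_size events (hypothetical helper)."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

# Hypothetical event stream, for illustration only.
events = [4, 8, 15, 16, 23, 42, 7]

# Micro-batch: one result per small batch of events.
batch_sums = [sum(batch) for batch in micro_batches(events, 3)]

# Event streaming: state is updated as each individual event arrives.
running = 0
running_totals = []
for event in events:
    running += event
    running_totals.append(running)

print(batch_sums, running_totals[-1])
```

Micro-batching trades some latency for the ability to reuse batch-style operators, while per-event processing produces a result after every event.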
Takeaways
1) SQL is here to stay
2) Data Science must be easy to program
3) Memory is King
4) Spark is the new Hadoop