DATA INTELLIGENCE FOR ALL

Adatao Live Demo
at the First Spark Summit
Dec 2, 2013, San Francisco
(Video at the end of this deck)
Christopher Nguyen, PhD
Co-Founder & CEO
Big-Data Compute Engines, Google Apps
Engineering Director, Google Founders’ Award,
HKUST Prof, 2 successful enterprise exits,
Stanford PhD

Deep engineering &
business experience from
Google, Yahoo et al.
PhDs in DM & ML from
UIUC, Georgia Tech,
Stanford, Berkeley, ...

Hadoop distributed/streaming analytics, Yahoo
Hadoop Eng, UIUC PhD

Machine learning & machine vision, US Army
Research Lab, Johns Hopkins PhD
Business Users
Data Scientists
Data Engineers
ONE Integrated Platform for Business & Data Science & Engineering

BIG
INSIGHTS


Visually Beautiful
Interactive Data Exploration
Narrative Web App

BIG
COMPUTE

Powerful In-Memory Data Mining	

Machine Learning Big Analytics Platform	


(Hadoop HDFS, Cassandra, SQL DBMS, Streaming Data)

BIG
DATA
Architecture Design
One Integrated Platform
for Business & Data Science & Engineering
Business Users

Data Scientists

Data Engineers


Business Users

VS

Data Scientists

Data Engineers

stack for business users
stack for data science
stack for data eng

OTHERS

for Data Scientists & Engineers
Big Data Mining & Machine Learning

- Powerful In-Memory Data Mining & Machine Learning: Model Terabytes in Seconds
- Interactive, Cluster-Scale Data Munging & Modeling with Native R, RStudio, Python, SQL, and Java Front-ends
- Real-Time Scoring Directly From Trained Models
- Share reproducible, live data analysis documents

Hadoop, Cassandra, RDBMS, Streaming Data
for Business Users
Predictive Decision Making

- A Beautiful New Way to Create & Share Visual Narratives of Your Analysis
- Perform Ad Hoc Queries in Plain English
- Publish Streaming, Interactive Dashboards
- Collaborate With Others In Real Time

Query Terabytes in Seconds.
Demo Deployment Diagram

[Diagram labels: CLIENT, MASTER, WORKER (x4)]
Demo Config
Cluster: 8-node x 8-core x 30GB RAM x 1TB Disk
Data Sets: 12GB-100GB, 100M-1B rows
Airline Arrival Data, 1988-2008 from DoT
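
As a rough sanity check on the configuration above, the sketch below totals the cluster's resources and checks whether the largest data set would fit in aggregate RAM. It assumes the slide's figures are per node (8 cores, 30 GB RAM, 1 TB disk each), which the deck does not state outright. The per-row estimate pairs the smallest size with the smallest row count and the largest with the largest, which is also an assumption. All function and constant names are hypothetical.

```python
# Back-of-the-envelope totals for the demo cluster.
# Assumption (not stated in the deck): the figures are per node.
NODES = 8
CORES_PER_NODE = 8
RAM_GB_PER_NODE = 30
DISK_TB_PER_NODE = 1


def cluster_totals(nodes=NODES, cores=CORES_PER_NODE,
                   ram_gb=RAM_GB_PER_NODE, disk_tb=DISK_TB_PER_NODE):
    """Aggregate cores, RAM and disk across all nodes."""
    return {
        "cores": nodes * cores,
        "ram_gb": nodes * ram_gb,
        "disk_tb": nodes * disk_tb,
    }


def fits_in_memory(dataset_gb, totals):
    """True if the raw data size is at most the aggregate RAM.

    This ignores in-memory overhead and memory reserved for the OS,
    so it is an optimistic upper bound.
    """
    return dataset_gb <= totals["ram_gb"]


def bytes_per_row(size_gb, rows):
    """Average row width in bytes, using decimal gigabytes."""
    return size_gb * 1e9 / rows


totals = cluster_totals()
print(totals)                       # {'cores': 64, 'ram_gb': 240, 'disk_tb': 8}
print(fits_in_memory(100, totals))  # True
print(bytes_per_row(12, 100e6))     # 120.0
print(bytes_per_row(100, 1e9))      # 100.0
```

Under these assumptions, even the 100 GB data set is well below the 240 GB of aggregate RAM, which is consistent with the deck's in-memory positioning.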
Algorithms
- LM & supporting statistics (AIC, log-likelihood, R², cross-validation)

- Binning

- Classification metrics: confusion matrix, ROC, AUC, F1

- Logistic Regression with Ref Level for Categorical Vars

- k-Means

- Random Forest

- Naive Bayes

- Linear SVM
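
To make the classification metrics in the list above concrete, here is a minimal single-machine sketch of a binary confusion matrix, F1, and AUC in plain Python. It is not Adatao's implementation, and the function names, labels, and scores are made-up toy values for illustration only. AUC is computed as the fraction of (positive, negative) pairs in which the positive example has the higher score, with ties counted as half.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    tp, fp, fn, _ = confusion_matrix(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (O(P*N) time)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for this example): 1 = positive class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(f1_score(y_true, y_pred))          # 0.75
print(auc(y_true, scores))               # 0.8125
```

The pairwise AUC is quadratic in the number of examples, so it only suits small samples. At the row counts quoted in the demo, a sort-based or distributed formulation would be needed.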
Algorithm Roadmap
- Hierarchical Clustering

- Text Mining (token, POS, LDA, …)

- SVD

- Markov Chain Models

- Ensemble Models

-…
Thank you!
See demo video at
http://youtu.be/5UAdk7oHoPE?t=7m
