DATA INTELLIGENCE FOR ALL

Adatao Live Demo
at the First Spark Summit
Dec 2, 2013, San Francisco
(Video at the end of this deck)
Christopher Nguyen, PhD
Co-Founder & CEO
Big-Data Compute Engines, Google Apps Engineering Director, Google Founders’ Award, HKUST Prof, 2 successful enterprise exits, Stanford PhD

Deep engineering & business experience from Google, Yahoo et al.
PhDs in DM & ML from UIUC, Georgia Tech, Stanford, Berkeley, ...

Hadoop distributed/streaming analytics, Yahoo Hadoop Eng, UIUC PhD

Machine learning & machine vision, US Army Research Lab, Johns Hopkins PhD
ONE Integrated Platform for Business & Data Science & Engineering
Business Users, Data Scientists, Data Engineers

BIG INSIGHTS
Visually Beautiful
Interactive Data Exploration
Narrative Web App

BIG COMPUTE
Powerful In-Memory Data Mining
Machine Learning Big Analytics Platform

BIG DATA
(Hadoop HDFS, Cassandra, SQL DBMS, Streaming Data)
Architecture Design
One Integrated Platform for Business & Data Science & Engineering
Business Users, Data Scientists, Data Engineers

VS

OTHERS
Business Users: stack for business users
Data Scientists: stack for data science
Data Engineers: stack for data eng
for Data Scientists & Engineers
Big Data Mining & Machine Learning

- Powerful In-Memory Data Mining & Machine Learning: Model Terabytes in Seconds
- Interactive, Cluster-Scale Data Munging & Modeling with Native R, RStudio, Python, SQL, and Java Front-ends
- Real-Time Scoring Directly From Trained Models
- Share reproducible, live data analysis documents

Hadoop, Cassandra, RDBMS, Streaming Data
for Business Users
Predictive Decision Making

- A Beautiful New Way to Create & Share Visual Narratives of Your Analysis
- Perform Ad Hoc Queries in Plain English
- Publish Streaming, Interactive Dashboards
- Collaborate With Others In Real Time
- Query Terabytes in Seconds
Demo Deployment Diagram
[Diagram: CLIENT, MASTER, and WORKER nodes]
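The CLIENT / MASTER / WORKER layout above can be illustrated with a small sketch. This is not Adatao's implementation: threads in a single process stand in for cluster workers, and the function names (`partial_stats`, `distributed_mean`) and the sample delay values are hypothetical. It shows one common pattern for such a layout, where each worker summarises its own partition and the master combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(partition):
    """Worker step: summarise one partition as (row count, sum of values)."""
    return len(partition), sum(partition)


def distributed_mean(values, n_workers=4):
    """Master step: split the data, fan out to workers, combine the partials."""
    # Round-robin split so every worker gets a similar share of the rows.
    partitions = [values[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_stats, partitions))
    total_count = sum(count for count, _ in partials)
    total_sum = sum(part_sum for _, part_sum in partials)
    if total_count == 0:
        raise ValueError("no data to aggregate")
    return total_sum / total_count


# Hypothetical arrival delays in minutes (illustrative values only).
delays = [5, -3, 12, 0, 47, 8, -1, 20]
mean_delay = distributed_mean(delays)
print(mean_delay)  # 11.0
```

Only the small (count, sum) pairs travel back to the master, which is why this style of aggregation scales to data sets far larger than any single node's memory.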
Demo Config
Cluster: 8-node x 8-core x 30GB RAM x 1TB Disk
Data Sets: 12GB-100GB, 100M-1B rows
Airline Arrival Data, 1988-2008 from DoT
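A quick back-of-the-envelope check of the configuration above, as a minimal sketch. It assumes decimal gigabytes (1 GB = 10^9 bytes) and that the smallest data set (12 GB) corresponds to 100M rows and the largest (100 GB) to 1B rows; the deck gives only the ranges, so that pairing is an assumption.

```python
# Per-node figures taken from the demo configuration.
NODES = 8
CORES_PER_NODE = 8
RAM_GB_PER_NODE = 30
DISK_TB_PER_NODE = 1

total_cores = NODES * CORES_PER_NODE        # aggregate CPU cores
total_ram_gb = NODES * RAM_GB_PER_NODE      # aggregate memory in GB
total_disk_tb = NODES * DISK_TB_PER_NODE    # aggregate disk in TB

# Assumption: 12 GB <-> 100M rows and 100 GB <-> 1B rows.
GB = 10**9
bytes_per_row_small = 12 * GB / 100_000_000
bytes_per_row_large = 100 * GB / 1_000_000_000

# Share of aggregate RAM the largest raw data set would occupy.
ram_fraction_largest = 100 / total_ram_gb

print(total_cores, total_ram_gb, total_disk_tb)   # 64 240 8
print(bytes_per_row_small, bytes_per_row_large)   # 120.0 100.0
print(round(ram_fraction_largest, 2))             # 0.42
```

Under these assumptions the largest raw data set takes up well under half of the cluster's combined RAM, which is consistent with the in-memory processing the deck describes.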
Algorithms
- Linear models (LM) & supporting statistics (AIC, log-likelihood, R², cross-validation)

- Binning

- Classification metrics: confusion matrix, ROC, AUC, F1

- Logistic Regression with Ref Level for Categorical Vars

- k-Means

- Random Forest

- Naive Bayes

- Linear SVM
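The classification metrics named in the list above can be sketched in a few lines of plain Python. This is a single-machine illustration, not the platform's API: the function names, the 0.5 threshold, and the sample labels and scores are all hypothetical. AUC is computed here as the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half; this is quadratic in the number of rows and only suitable for small inputs.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores (illustrative values only).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print(tp, fp, fn, tn)                      # 2 1 1 2
print(round(f1_score(tp, fp, fn), 3))      # 0.667
print(round(roc_auc(y_true, scores), 3))   # 0.667
```

The confusion matrix and F1 depend on the chosen threshold, while AUC summarises ranking quality across all thresholds, which is why both kinds of metric are usually reported together.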
Algorithm Roadmap
- Hierarchical Clustering

- Text Mining (token, POS, LDA, …)

- SVD

- Markov Chain Models

- Ensemble Models

-…
Thank you!
See demo video at http://youtu.be/5UAdk7oHoPE?t=7m
