FRONTIERS OF OPEN DATA SCIENCE RESEARCH
Ani Aghababyan
OPEN DATA SCIENCE CONFERENCE
BOSTON 2015
@opendatasci
Ani Aghababyan, Ph.D.
Data Scientist
McGraw-Hill Education
Analytics
Frontiers of Open Data Science Research
Data and Analytics
Saturday, May 30, 2015
Word cloud: Big Data, Spark, Analytics, Data Science, Learning Science, Visualization, Learning Analytics, Reporting, Elastic MapReduce, Scala, NoSQL, MongoDB, Hadoop, Privacy, Anonymization, Open, Caliper, BI, Smart Data, Internet of Things, data lifecycle, prescriptive, descriptive, data analytics, nudge, Cassandra
EXCITING POSSIBILITIES
What if my Fitbit could tell me:
Whether I will fail my test?
Whether I am ready for the test?
Whether I truly have test anxiety?
Whether I should delay taking this take-home exam?
SOBERING QUESTIONS
Whose data is it?
Can I even access my data—all my data?
Who else can access my data?
Can the data be used against me?
Is the data even accurate?
How good is the science?
Research Studies
The 2-sigma problem
Group 1: conventional classroom instruction (the baseline)
Group 2 (mastery learning): 1 sigma above Group 1
Group 3 (one-to-one tutoring): 2 sigmas above Group 1
The average tutored student outperformed 98% of traditional students
BENJAMIN BLOOM
2𝞂
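The link between "2 sigmas" and "98%" on the slide above can be checked with a short calculation. This is only a sketch under an assumption not stated on the slide: that scores in the comparison group are roughly normally distributed, so a student sitting z standard deviations above that group's mean is ahead of the fraction given by the standard normal CDF. The function name `normal_cdf` is illustrative.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Share of the comparison group that falls below a student who is
# 1 or 2 standard deviations above that group's mean (normality assumed).
for sigmas in (1.0, 2.0):
    share = normal_cdf(sigmas)
    print(f"{sigmas:.0f} sigma above the mean: ahead of {share:.1%} of the group")
```

Under that assumption, 1 sigma corresponds to roughly the 84th percentile and 2 sigmas to roughly the 98th, which matches the figure quoted on the slide.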
QUESTIONS + CONCLUSIONS
How do we achieve a 1- or 2-sigma improvement in outcomes?
How do we encourage self-regulation in the learner?
How do we provide targeted, real-time feedback (nudges)?
How do we create a personalized path for the learner?
HINT
Learning Analytics
Adaptive Learning
Learning Analytics
Stages of Analytics
Figure: analytics maturity plotted against competitive advantage. Stages, in order: Raw Data, Cleaned Data, Standard Reports, Ad hoc Reports & OLAP, Generic Predictive Analytics, Predictive Modeling.
Phases: DESCRIPTION, PREDICTION, PRESCRIPTION
What happened?
What correlates to what happened?
What might happen?
What is the best that could happen?
Accepted standards for learning
Aligned curricula and assessments
Measurement and reports
Course correction
Descriptive
Predictions
Prescriptive
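The three kinds of analytics named above can be told apart with a toy example. The sketch below is purely illustrative: the practice hours and quiz scores are made-up numbers, the function names are hypothetical, and a straight-line fit stands in for whatever predictive model a real system would use.

```python
from statistics import mean

# Hypothetical toy data: weekly practice hours and quiz scores for one class.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 65.0, 70.0, 75.0]


def describe(values):
    """Descriptive: summarize what happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def fit_line(xs, ys):
    """Predictive: ordinary least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def hours_needed(target, slope, intercept):
    """Prescriptive: practice time at which the fitted line reaches the target."""
    return (target - intercept) / slope


slope, intercept = fit_line(hours, scores)
print("Descriptive:", describe(scores))
print("Predictive: expected score at 6 hours =", round(slope * 6 + intercept, 1))
print("Prescriptive: hours suggested for a score of 80 =",
      round(hours_needed(80.0, slope, intercept), 1))
```

The prescriptive step here simply inverts the fitted line, so it inherits every weakness of the prediction; a correlation between practice and scores does not by itself show that more practice causes higher scores.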
WHAT IS LEARNING ANALYTICS
The measurement, collection, analysis and reporting of data
about learners and their contexts, for purposes of
understanding and optimizing learning and the environments in
which it occurs.
How could we achieve that?
HINT
Open Architecture
Open Architecture
Open Analytics Architecture
Inputs: Data Sources 1-4, via Input APIs (Caliper Data Capture Specification)
Core: Learning Analytics Store (Learning Events + Context)
Outputs: Output API, serving Products 1-3
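The architecture above can be sketched in a few lines: several data sources push learning events with context through an input API into one store, and products read from it through an output API. Everything in this sketch is an assumption made for illustration. The `LearningEvent` fields are only loosely inspired by the actor/action/object shape of event specifications and are not the actual Caliper schema, and `LearningAnalyticsStore`, `ingest`, and `events_for` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class LearningEvent:
    """One learning event plus its context (hypothetical shape, not Caliper)."""
    actor: str            # who did it
    action: str           # what they did
    obj: str              # what it was done to
    event_time: datetime  # when it happened
    source: str           # which data source reported it
    context: Dict[str, Any] = field(default_factory=dict)


class LearningAnalyticsStore:
    """In-memory stand-in for the shared learning analytics store."""

    def __init__(self) -> None:
        self._events: List[LearningEvent] = []

    def ingest(self, event: LearningEvent) -> None:
        """Input API: any data source can submit events in the shared format."""
        self._events.append(event)

    def events_for(self, actor: str) -> List[LearningEvent]:
        """Output API: any product can read one learner's events."""
        return [e for e in self._events if e.actor == actor]


store = LearningAnalyticsStore()
when = datetime(2015, 5, 30, 9, 0, tzinfo=timezone.utc)
store.ingest(LearningEvent("student-1", "Started", "quiz-7", when,
                           "data-source-1", {"course": "algebra-1"}))
store.ingest(LearningEvent("student-1", "Viewed", "chapter-3", when,
                           "data-source-2"))
store.ingest(LearningEvent("student-2", "Completed", "quiz-7", when,
                           "data-source-1"))

# A product sees one learner's activity across sources through the same API.
for event in store.events_for("student-1"):
    print(event.source, event.action, event.obj)
```

The point of the design is that sources and products never talk to each other directly; they only agree on the event format, so either side can be added or replaced independently.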
MCGRAW-HILL EDUCATION
THANK YOU.

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

Frontiers of Open Data Science Research

  • 1. FRONTIERS OF OPEN DATA SCIENCE RESEARCH. Ani Aghababyan. OPEN DATA SCIENCE CONFERENCE, BOSTON 2015. @opendatasci
  • 2. Ani Aghababyan, Ph.D., Data Scientist, McGraw-Hill Education, Data and Analytics. Frontiers of Open Data Science Research. Saturday, May 30, 2015
  • 3. Big Data, Spark, Analytics, Data Science, Learning Science, Visualization, Learning Analytics, Reporting, Elastic Map Reduce, Scala, NoSQL, MongoDB, Hadoop, Privacy, Anonymization, Open, Caliper, BI, Smart Data, Internet of Things, data lifecycle, prescriptive, descriptive, data analytics, nudge, Cassandra
  • 8. EXCITING POSSIBILITIES: What if my FitBit could predict if I will fail my test? Whether I am ready for the test? Whether I truly have test anxiety? Should I delay taking this take-home exam? SOBERING QUESTIONS: Whose data is it? Can I even access my data, all my data? Who else can access my data? Can the data be used against me? Is the data even accurate? How good is the science?
  • 10. Research Studies: The 2-sigma problem (BENJAMIN BLOOM). Group 2: 1 sigma above Group 1. Group 3: 2 sigmas above Group 1. The average tutored student outperformed 98% of traditional students. 2𝞂
  • 11. QUESTIONS + CONCLUSIONS: How do we achieve a 1- or 2-sigma improvement in outcomes? How do we encourage self-regulation in the learner? How do we provide targeted, real-time feedback (nudges)? How do we create a personalized path for the learner? HINT: Learning Analytics, Adaptive Learning
  • 13. Stages of Analytics (Analytics Maturity vs. Competitive Advantage): Raw Data, Cleaned Data, Standard Reports, Ad hoc Reports & OLAP, Generic Predictive Analytics, Predictive Modeling. DESCRIPTION: What happened? PREDICTION: What correlates to what happened? What might happen? PRESCRIPTION: What is the best that could happen?
  • 14. Accepted standards for learning. Aligned curricula and assessments. Measurement and reports. Course correction. Descriptive, Predictions, Prescriptive
  • 15. WHAT IS LEARNING ANALYTICS: The measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs. How could we achieve that? HINT: Open Architecture
  • 17. Open Analytics Architecture: Data Source 1, Data Source 2, Data Source 3, Data Source 4 -> Input APIs (Learning Events + Context; Caliper Data Capture Specification) -> Learning Analytics Store -> Output API -> Product 1, Product 2, Product 3

Editor's Notes

  1. Frontiers of Open Data Science Research. Whenever I see a presentation title such as the one I am giving today, the words that come to my mind are something like this:
  2. Big Data, Data Analytics, Data Science, Learning Science, Visualization, Reporting, Hadoop, Elastic Map Reduce, Spark, Scala, NoSQL, etc. Everyone seems to explain big data or data science in different words, so my goal for today is to clarify these terms in the context of education and learning. But first, why do we care? What is so important and noteworthy about data and data science, particularly as applied to learning and education? I ask because I represent a learning sciences company and I am a learning scientist myself.
  3. Nowadays our lives seem to be filled with gadgets and tools that spit out data, and most of them do some pretty cool analytics and reporting for their users. Here are some examples of these everyday gadgets. Some seem trivial, but the questions we could ask and answer through these data can be very sophisticated and fascinating. These are things that we couldn't do easily before. An example would be this Fitbit.
  4. Fitbit provides a phone app through which you can see charts and graphs of various information. It can be very simple, such as your steps for the day, the mileage you covered, elevation information, etc.
  5. Some models can even provide the user with their heart rate information.
  6. What brings this into data analytics is that you can build analytics on these seemingly trivial data. For example, you could compare your heart rate across circumstances and see whether a pattern emerges: compare your resting heart rate on days when you are battling a cold with days when you are healthy and strong. If there is a difference (which there was in this case), you can then analyze whether the Fitbit could have predicted your illness before the day you were unable to leave your bed. This is a simple case, but there are many more we could apply in a learning context. For example, you could check whether your academic performance differs with your physical condition.
  7. So the exciting possibilities are that I could predict things like whether I am ready for my test or whether I have test anxiety. However, this excitement comes at the price of some sobering questions: Is my data safe? Who else can see my data? Will I be judged based on this data?
  8. So let's move closer to education. Let's consider a research study that grounds my talk.
  9. The 2-sigma problem. Back in 1984, Benjamin Bloom compared the performance of students learning in three different contexts. In the first group, students were taught in a traditional classroom setting. In the second group, students were taught using mastery-learning techniques and a formative feedback loop. In the third group, students were in one-on-one tutoring sessions. Bloom found that the performance of students in the second group was 1 sigma (standard deviation) higher than that of students in the first group, and the performance of students in the third group was 2 sigmas higher than that of students in the first group. Put another way, the average tutored student outperformed 98% of the traditionally taught students!
  10. This and other similar studies raise some very important questions for us: How do we achieve a 1- or 2-sigma improvement in student outcomes? How do we encourage self-regulation in the learner? How do we provide targeted, real-time feedback (nudges)? How do we create a personalized path for the learner? The hint: learning analytics and adaptive learning.
  11. So what is analytics? How does it differ from our reports? And how can we apply it to learning? Data analytics and learning analytics, broadly put, are systems of analysis applied to data and to learning events. Yet a definition of that breadth is not immediately practical. So let's look at the stages of analytics.
  12. Descriptive. In this stage of analytics we are concerned with a presentation of the past. What has happened? What patterns of past behavior can be observed? This type of information, presented well, can be very powerful. Predictive. In the predictive stage, we begin to shift our time horizon towards the future. What trends do we see? What events correlate with what happened? And even, what might happen? In this stage of analytics, we create predictive models, grounded in past data, of what might happen in the future. Prescriptive. In the last stage of analytics (the holy grail of analytics) we move from predictions to prescriptions. Given where I am, and given where I want to go, what should I do? What is the optimal path for me to take? This is where adaptive learning comes in.
  13. Analytics applied in a learning context allows us to make sure that assignments are aligned to curricula, and also allows students to follow their individual paths, avoiding disengagement or ceiling effects.
  14. Finally, the last concept I will introduce is open architecture.
  15. Here at McGraw-Hill we have created an Open Analytics Architecture. What does this mean?
  16. In an open system, data and learning events can be sourced from many data sources. Why is this important? Because I can guarantee you that no one system or product has a complete picture of a student's learning. The content tools you use "know" about only a certain set of learning interactions.
  17. We use a specification from a standards body that provides a set of requirements for how learning-event data should be formatted and structured. This way we can guarantee communication between different systems.
  18. Here we transform and store the data collected from learning environments.
  19. Finally, the last piece of our open architecture is the set of products and platforms that consume data from the analytics platform. These could be user-facing visualization products or any other system.
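
The 98% figure in note 9 can be checked with a little arithmetic. The sketch below assumes that scores in the traditional group are roughly normally distributed (an assumption made here for illustration; the notes do not state it) and converts a shift of 1 or 2 standard deviations into the percentile of the traditional group that the average shifted student would reach. The function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF: fraction of a normal population scoring below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_of_shift(sigmas: float) -> float:
    """Percentile of the baseline group reached by a student `sigmas` above its mean."""
    return 100.0 * normal_cdf(sigmas)


# Assumption: baseline (traditional classroom) scores are approximately normal.
for label, shift in [("mastery learning", 1.0), ("one-on-one tutoring", 2.0)]:
    print(f"{label}: +{shift:.0f} sigma -> {percentile_of_shift(shift):.1f}th percentile")
# mastery learning: +1 sigma -> 84.1th percentile
# one-on-one tutoring: +2 sigma -> 97.7th percentile
```

Under that assumption, a 2-sigma shift lands at about the 97.7th percentile, which rounds to the 98% quoted in the talk.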
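
The flow described in notes 16 to 19 (many data sources, a shared event format, a store, and an output API feeding products) might be sketched as follows. This is a minimal illustration only: the class names, field names, and sample events are all hypothetical, and the event shape is not taken from the Caliper specification or from McGraw-Hill's actual implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical shared format that every data source must follow.
REQUIRED_FIELDS = ("source", "learner_id", "action", "object_id", "timestamp")


@dataclass
class LearningEvent:
    source: str      # which tool emitted the event
    learner_id: str
    action: str      # e.g. "viewed", "submitted"
    object_id: str   # what the learner acted on
    timestamp: str   # ISO 8601 in UTC, so string order matches time order
    context: dict = field(default_factory=dict)


class LearningAnalyticsStore:
    def __init__(self) -> None:
        self._events_by_learner = defaultdict(list)

    def ingest(self, raw: dict) -> LearningEvent:
        """Input API: validate a raw event against the shared format, then store it."""
        missing = [name for name in REQUIRED_FIELDS if name not in raw]
        if missing:
            raise ValueError(f"event is missing required fields: {missing}")
        event = LearningEvent(
            **{name: raw[name] for name in REQUIRED_FIELDS},
            context=raw.get("context", {}),
        )
        self._events_by_learner[event.learner_id].append(event)
        return event

    def events_for(self, learner_id: str) -> list:
        """Output API: one learner's events across all sources, in time order."""
        events = self._events_by_learner.get(learner_id, [])
        return sorted(events, key=lambda e: e.timestamp)


# Made-up sample events from two different (hypothetical) data sources.
store = LearningAnalyticsStore()
store.ingest({"source": "homework_tool", "learner_id": "s1", "action": "submitted",
              "object_id": "quiz-3", "timestamp": "2015-05-30T10:15:00Z"})
store.ingest({"source": "ebook_reader", "learner_id": "s1", "action": "viewed",
              "object_id": "chapter-2", "timestamp": "2015-05-30T09:00:00Z",
              "context": {"course": "algebra-1"}})

timeline = store.events_for("s1")
print([(e.source, e.action) for e in timeline])
# [('ebook_reader', 'viewed'), ('homework_tool', 'submitted')]
```

Because both tools emit the same fields, a downstream product can read a single merged timeline for a learner without knowing which tool produced each event, which is the point the notes make about no single system having the complete picture.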