Frontiers of Open Data Science Research

The document discusses open data science research topics presented at a conference, including opportunities and challenges with learning analytics and adaptive learning using open data. It describes how learning analytics can help achieve large improvements in student outcomes through targeted feedback and personalized learning paths. An open analytics architecture is proposed to integrate different data sources and applications using common data standards.

Technology

Frontiers of Open Data Science Research

1. FRONTIERS OF OPEN DATA SCIENCE RESEARCH. Ani Aghababyan. Open Data Science Conference, Boston 2015. @opendatasci

2. Ani Aghababyan, Ph.D.; Data Scientist; McGraw-Hill Education; Analytics; Frontiers of Open Data Science Research; Data and Analytics; Saturday, May 30, 2015

3. Word cloud: Big Data, Spark, Analytics, Data Science, Learning Science, Visualization, Learning Analytics, Reporting, Elastic Map Reduce, Scala, NoSQL, MongoDB, Hadoop, Privacy, Anonymization, Open, Caliper, BI, Smart Data, Internet of Things, data lifecycle, prescriptive, descriptive, data analytics, nudge, Cassandra

8. EXCITING POSSIBILITIES: What if my FitBit could predict if I will fail my test, or whether I am ready for the test? Whether I truly have test anxiety? Whether I should delay taking this take-home exam? SOBERING QUESTIONS: Whose data is it? Can I even access my data, all my data? Who else can access my data? Can the data be used against me? Is the data even accurate? How good is the science?

9. Research Studies

10. Research Studies: The 2-sigma problem. Group 2: 1 sigma above Group 1. Group 3: 2 sigmas above Group 1. The average tutored student outperformed 98% of traditional students. BENJAMIN BLOOM, 2σ

11. QUESTIONS + CONCLUSIONS: How do we achieve a 1- or 2-sigma improvement in outcomes? How do we encourage self-regulation in the learner? How do we provide targeted, real-time feedback (nudges)? How do we create a personalized path for the learner? HINT: Learning Analytics, Adaptive Learning

12. Learning Analytics

13. Stages of Analytics (chart of Competitive Advantage against Analytics Maturity). Stages: Raw Data, Cleaned Data, Standard Reports, Ad hoc Reports & OLAP, Generic Predictive Analytics, Predictive Modeling. DESCRIPTION: What happened? PREDICTION: What correlates to what happened? What might happen? PRESCRIPTION: What is the best that could happen?

14. Accepted standards for learning; Aligned curricula and assessments; Measurement and reports; Course correction. Descriptive, Predictions, Prescriptive

15. WHAT IS LEARNING ANALYTICS? The measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs. How could we achieve that? HINT: Open Architecture

16. Open Architecture

17. Open Analytics Architecture (diagram). Data Source 1, Data Source 2, Data Source 3, Data Source 4; Input APIs; Caliper Data Capture Specification; Learning Events + Context; Learning Analytics Store; Output API; Product 1, Product 2, Product 3

22. MCGRAW-HILL EDUCATION THANK YOU.

Editor's notes

Frontiers of Open Data Science Research. Whenever I see a presentation title such as the one I am giving today, the words that come to my mind are something like this:
Big Data, Data Analytics, Data Science, Learning Science, Visualization, Reporting, Hadoop, Elastic Map Reduce, Spark, Scala, NoSQL, etc. Everyone seems to explain big data or data science in different words, so my goal for today is to bring clarity to these words in the context of education and learning. But first, why do we care? What is so important and noteworthy about data and data science, particularly as they apply to learning and education? I ask because I represent a learning sciences company and I am a learning scientist myself.
Nowadays our lives seem to be filled with gadgets and tools that spit out data, and most of them do some pretty cool analytics and reporting for their users. Here are some examples of these everyday gadgets. Some seem trivial, but the questions we could ask and answer with their data can be very sophisticated and fascinating, things that we could not easily do before. One example is this Fitbit.
Fitbit provides a phone app through which you can see charts and graphs of various information. It can be very simple, such as your steps for the day, the mileage you covered, elevation information, etc.
Some models can even provide the user with heart rate information.
What brings this into data analytics is that you can build usage analytics on these trivial data. For example, you could compare your heart rate under different circumstances and see whether a pattern emerges: compare your resting heart rate on days when you are battling a cold with days when you are healthy and strong. If there is a difference (which was the case in this situation), you can try to analyze whether Fitbit could have predicted your illness before the day you were unable to leave your bed. This is a simple case, but there are many more we could apply in a learning context. For example, you could identify whether your academic performance differs with your physical condition.
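The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration: the heart rate values are invented for the example (they are not real Fitbit data), and `resting_rate_shift` is a hypothetical helper, not part of any Fitbit API.

```python
from statistics import mean, stdev

# Hypothetical resting heart rates in beats per minute (invented, not real Fitbit data).
healthy_days = [58, 60, 59, 61, 57, 60, 59]
sick_days = [66, 68, 65, 70]


def resting_rate_shift(baseline, recent):
    """Return how many baseline standard deviations `recent` sits above `baseline`."""
    return (mean(recent) - mean(baseline)) / stdev(baseline)


shift = resting_rate_shift(healthy_days, sick_days)
print(f"healthy mean: {mean(healthy_days):.1f} bpm")
print(f"sick mean:    {mean(sick_days):.1f} bpm")
print(f"shift:        {shift:.1f} baseline standard deviations")
```

A large positive shift on a given morning would be the kind of signal one could check against later illness; whether it actually predicts anything is an empirical question this sketch does not answer.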
So the exciting possibilities are that I could predict things like whether I am ready for my test or whether I have test anxiety. However, this excitement comes at the price of sobering questions: Is my data safe? Who else can see my data? Will I be judged based on this data?
So let's move closer to education. Let's consider a research case that grounds my talk.
The 2-sigma problem. Back in 1984, Benjamin Bloom compared the performance of students learning in three different contexts. In the first group, students were taught in a traditional classroom setting. In the second group, students were taught using mastery-learning techniques and a formative feedback loop. In the third group, students were in one-on-one tutoring sessions. Bloom discovered that the performance of students in the second group was 1 sigma (standard deviation) higher than that of students in the first group, and the performance of students in the third group was 2 sigmas higher than that of the first group. Put another way, the average tutored student outperformed about 98% of the traditionally taught students!
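The step from "2 sigmas" to "98%" can be checked with a short calculation. The sketch below assumes that scores in the traditional group are roughly normally distributed (an assumption made here for illustration, not a claim from the talk); under that assumption, a student sitting 2 standard deviations above the traditional mean is at about the 97.7th percentile, which rounds to 98%.

```python
from statistics import NormalDist

# Assumption: traditional-classroom scores follow a normal distribution,
# expressed here in standard-deviation units (mean 0, sigma 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_outperformed(sigmas_above_mean):
    """Fraction of the traditional group scoring below a student at this offset."""
    return standard_normal.cdf(sigmas_above_mean)


for label, sigmas in [("mastery learning", 1.0), ("one-on-one tutoring", 2.0)]:
    print(f"{label}: average student outperforms {share_outperformed(sigmas):.1%}")
```

By the same reasoning, a 1-sigma gain places the average mastery-learning student ahead of roughly 84% of the traditional group.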
This and other similar studies raise some very important questions for us: How do we achieve a 1- or 2-sigma improvement in student outcomes? How do we encourage self-regulation in the learner? How do we provide targeted, real-time feedback (nudges)? How do we create a personalized path for the learner? The hint: learning analytics and adaptive learning.
So what is analytics? How does it differ from our reports? And how can we apply it to learning? Data analytics and learning analytics are, broadly put, systems of analysis applied to data and to learning events. Yet a definition of that breadth is not eminently practical. So let's look at the stages of analytics.
Descriptive. In this stage of analytics we are concerned with a presentation of the past. What has happened? What patterns of past behavior can be observed? This type of information, presented well, can be very powerful. Predictive. In the predictive stage, we begin to shift our time horizon towards the future. What trends do we see? What events correlate with what happened? And even, what might happen? In this stage, we create predictive models, grounded in past data, of what might happen in the future. Prescriptive. In the last stage, the holy grail of analytics, we move from predictions to prescriptions. Given where I am, and given where I want to go, what should I do? What is the optimal path for me to take? This is where adaptive learning comes in.
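To make the three stages concrete, here is a toy sketch for a single learner. Everything in it is an assumption made for illustration: the quiz scores are invented, the straight-line trend stands in for a real predictive model, and the `ACTION_GAINS` table of estimated score gains is hypothetical.

```python
from statistics import mean

# Hypothetical weekly quiz scores for one learner (percent correct).
scores = [62.0, 65.0, 67.0, 70.0, 74.0]

# Hypothetical estimated score gain for each possible intervention.
ACTION_GAINS = {"no change": 0.0, "extra practice set": 3.0, "tutoring session": 6.0}


def describe(history):
    """Descriptive: summarize what has already happened."""
    return {"mean": mean(history), "latest": history[-1],
            "change": history[-1] - history[0]}


def fit_line(history):
    """Least-squares line through (week index, score); returns (slope, intercept)."""
    xs = range(len(history))
    x_bar, y_bar = mean(xs), mean(history)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def predict(history, weeks_ahead=1):
    """Predictive: extrapolate the fitted trend into the future."""
    slope, intercept = fit_line(history)
    return intercept + slope * (len(history) - 1 + weeks_ahead)


def prescribe(history, target):
    """Prescriptive: least intensive action whose forecast reaches the target."""
    forecast = predict(history)
    for action, gain in sorted(ACTION_GAINS.items(), key=lambda item: item[1]):
        if forecast + gain >= target:
            return action
    return max(ACTION_GAINS, key=ACTION_GAINS.get)  # best effort if none suffices


print(describe(scores))
print(f"forecast for next week: {predict(scores):.1f}")
print(f"suggested action for a target of 80: {prescribe(scores, 80.0)}")
```

The point of the sketch is the change in question at each stage (what happened, what might happen, what should I do), not the particular model, which in practice would be far richer.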
Analytics applied in a learning context lets us make sure that assignments align with curricula, while also letting students follow their individual paths and avoid disengagement or ceiling effects.
Finally, the last concept I will introduce is open architecture.
Here at McGraw-Hill we have created an Open Analytics Architecture. What does this mean?
In an open system, data and learning events can be sourced from many data sources. Why is this important? Because I can guarantee you that no one system or product has a complete picture of a student's learning. The content tools you use "know" about a certain set of learning interactions.
We use a standard, maintained by a standards body, that provides a set of requirements for how learning event data should be formatted and structured. This way we can guarantee communication between different systems.
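As a rough illustration of what "formatted and structured" can mean in practice, the sketch below checks incoming events against a small set of required fields. The field names (`actor`, `action`, `object`, `eventTime`) are only loosely inspired by actor-action-object event models and are assumptions for this example; they should not be read as the actual schema of the specification used in the architecture.

```python
from datetime import datetime

# Illustrative required fields only; NOT the schema of any real specification.
REQUIRED_FIELDS = ("actor", "action", "object", "eventTime")


def validate_event(event):
    """Return a list of problems; an empty list means the event is accepted."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if name not in event]
    if "eventTime" in event:
        try:
            datetime.fromisoformat(event["eventTime"])
        except (TypeError, ValueError):
            problems.append("eventTime is not an ISO 8601 timestamp")
    return problems


# Hypothetical events from two different tools.
good_event = {
    "actor": "learner-123",
    "action": "Completed",
    "object": "quiz-7",
    "eventTime": "2015-05-30T10:15:00+00:00",
}
bad_event = {"actor": "learner-123", "eventTime": "yesterday"}

print(validate_event(good_event))
print(validate_event(bad_event))
```

Because every source is held to the same shape, a downstream system can consume events without knowing which tool produced them.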
Here we transform and store the data collected from learning environments.
Finally, the last pieces of our open architecture are the products and platforms that consume data from the analytics platform. These could be user-facing visualization products or any other system.
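The flow just described (many sources in, one store, many products out) can be sketched as a tiny in-memory class. This is a hypothetical illustration of the idea, not the McGraw-Hill implementation: the class name, method names, and source labels are all invented for the example.

```python
from collections import defaultdict


class LearningAnalyticsStore:
    """Toy stand-in for the store that sits between data sources and products."""

    def __init__(self):
        self._events_by_learner = defaultdict(list)

    def ingest(self, source, event):
        """Input API: accept an event from any data source and tag its origin."""
        record = dict(event, source=source)
        self._events_by_learner[event["actor"]].append(record)

    def events_for(self, learner):
        """Output API: every stored event for one learner, across all sources."""
        return list(self._events_by_learner.get(learner, []))

    def activity_counts(self, learner):
        """Output API: number of events per source, e.g. for a dashboard product."""
        counts = defaultdict(int)
        for record in self.events_for(learner):
            counts[record["source"]] += 1
        return dict(counts)


store = LearningAnalyticsStore()
# Two hypothetical data sources reporting on the same learner.
store.ingest("reading_tool", {"actor": "learner-123", "action": "Viewed", "object": "chapter-2"})
store.ingest("quiz_tool", {"actor": "learner-123", "action": "Completed", "object": "quiz-7"})
store.ingest("quiz_tool", {"actor": "learner-123", "action": "Completed", "object": "quiz-8"})

print(store.activity_counts("learner-123"))
```

Neither tool alone sees all three events, which is the argument for sourcing from many systems into one store.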
