Voyagers and Voyeurs
Supporting Social Data Analysis

Jeffrey Heer
Computer Science Department
Stanford University

CIDR 2009 – Monterey, CA
5 January 2009
A Tale of Two Visualizations
vizster
Observations
Groups spent more time in front of the
visualization than individuals.

Friends encouraged each other to unearth
relationships, probe community boundaries, and
challenge reported information.

Social play resulted in informal analysis, often
driven by story-telling of group histories.
NameVoyager
The Baby Name Voyager
Social Data Analysis
Visual sensemaking can be social as
well as cognitive.
Analysis of data coupled with social
interpretation and deliberation.

How can user interfaces catalyze and
support collaborative visual analysis?
sense.us
A Web Application for Collaborative
Visualization of Demographic Data
Voyagers and Voyeurs
Complementary faces of analysis
Voyager – focus on visualized data
Active engagement with the data
Serendipitous comment discovery

Voyeur – focus on comment listings
Investigate others’ explorations
Find people and topics of interest
Catalyze new explorations
Out of the Lab,
 Into the Wild
Wikimapia.org
DecisionSite posters




Spotfire Decision Site Posters
Tableau Server
Many-Eyes
Social Data Analysis In Action
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming

For each, some thoughts on future directions.
I asked my colleagues: if you could give database
researchers a wish list, what would it be?
Discussion and Debate
Tableau X-Box / Quest Diag?

              “Valley of Death”
Content Analysis of Comments
[Figure: bar charts of comment feature prevalence (percentage, 0 to 80) for sense.us and Many-Eyes. Features: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.]

Feature prevalence from content analysis (min Cohen’s κ = .74)
 High co-occurrence of Observations, Questions, and Hypotheses
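
The agreement figure quoted above is a Cohen's kappa value. As a minimal sketch of how such a score is computed for one coded feature, assuming two coders each labeled the same comments (the function name and the sample labels are hypothetical, not data from the study):

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(coder_a)
    # Observed agreement: fraction of items given the same label by both coders.
    p_observed = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = counts_a.keys() | counts_b.keys()
    p_chance = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical labels for four comments, for illustration only.
a = ["observation", "observation", "question", "question"]
b = ["observation", "observation", "question", "observation"]
kappa = cohens_kappa(a, b)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement when some labels are much more common than others.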
WANTED: Structured Conversation

Reduce the cost of synthesizing contributions

Wikipedia: Shared Revisions
NASA ClickWorkers: Statistics

Can we represent data, visualizations, and social
activity in a unified data model?
Text is Data, Too
Visualization Popularity
[Figure: bar charts of visualization type popularity (percentage, 0.0 to 0.5) for Many-Eyes and Swivel. Types: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram.]

Over 1/3 of Many-Eyes visualizations use free text
Alberto Gonzales
WANTED: Better Tools for Text

Statistical Analysis of text (with ties to source!)
Entity Extraction
Aggregation and Comparison of texts
  Get a “global” view of documents

We can do better than Tag Clouds (!?)
Use text analysis tools to enable analysis of
structured conversation by the community.
Data Integrity and Cleaning
No cooks in 1910? … There may have
been cooks then. But maybe not.
The great postmaster scourge of 1910?
Or just a bug in the data?
Content Analysis of Comments
[Figure: bar charts of comment feature prevalence (percentage, 0 to 80) for sense.us and Many-Eyes, with the Data Integrity category highlighted by the caption below.]

 16% of sense.us comments and 10% of Many-Eyes comments
 reference data quality or integrity.
WANTED: Data Cleaning Tools

Reshape data, reformat rows & columns
Handle missing data: label, repair, interpolate
Entity resolution and de-duplication
Group related values into aggregates
Assist table lookups & data transforms

Provide tools in situ to leverage collective
Transparency requires provenance
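
A minimal sketch of two items from the wish list above: interpolating missing values while labeling every repaired cell, and de-duplicating value labels. Function names and sample values are hypothetical. The sketch uses simple linear interpolation and case/whitespace normalization as stand-ins for real repair and entity-resolution methods.

```python
from typing import List, Optional, Tuple


def interpolate_missing(series: List[Optional[float]]) -> List[Tuple[float, bool]]:
    """Fill gaps linearly; return (value, was_repaired) so repairs stay visible."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    out = []
    for i, v in enumerate(series):
        if v is not None:
            out.append((float(v), False))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:        # leading gap: copy the nearest later value
            out.append((float(series[right]), True))
        elif right is None:     # trailing gap: copy the nearest earlier value
            out.append((float(series[left]), True))
        else:                   # interior gap: linear interpolation
            t = (i - left) / (right - left)
            out.append((series[left] + t * (series[right] - series[left]), True))
    return out


def dedupe_labels(labels: List[str]) -> List[str]:
    """Collapse labels that differ only in case or whitespace (first spelling wins)."""
    seen = {}
    for label in labels:
        key = " ".join(label.split()).casefold()
        seen.setdefault(key, label.strip())
    return list(seen.values())


# Made-up counts with one missing decade, for illustration only.
repaired = interpolate_missing([10.0, None, 30.0])
occupations = dedupe_labels(["Cook", "cook ", "COOK", "Postmaster"])
```

Returning the `was_repaired` flag alongside each value keeps a small record of provenance, so a viewer can see which points were measured and which were filled in.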
Integrating Data in Context
College Drug Use
Harry Potter is Freaking Popular
WANTED: In-Situ Data Integration

Search for and suggest related data or views
User input for types, schema matching, or data
Apply in context of the current task
 But record mappings for future use
Record provenance: chain of data sources

Examples: Google Web Tables, Pay-As-You-Go,
  Stanford Vispedia, Utah VisTrails
Pointing and Naming
“Look at that spike.”
“Look at the spike for Turkey.”
“Look at the spike in the middle.”
Free-form   Data-aware
Visual Queries
Model selections as declarative queries over
interface elements or underlying data




  (-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089)

Applicable to dynamic, time-varying data
Retarget selection across visual encodings
Support social navigation and data mining
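
A minimal sketch of the idea, assuming a selection is stored as a predicate over data fields rather than as screen coordinates. The class names and sample rows are hypothetical; only the longitude and latitude bounds come from the query shown on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RangeQuery:
    """lo <= row[field] <= hi"""
    field: str
    lo: float
    hi: float

    def matches(self, row: Dict) -> bool:
        return self.lo <= row[self.field] <= self.hi


@dataclass(frozen=True)
class And:
    """Conjunction of sub-queries."""
    clauses: Tuple

    def matches(self, row: Dict) -> bool:
        return all(c.matches(row) for c in self.clauses)


def select(query, rows: List[Dict]) -> List[Dict]:
    """Re-evaluate the stored query against whatever data is current."""
    return [r for r in rows if query.matches(r)]


# The brushed map region from the slide, expressed over the underlying data.
selection = And((
    RangeQuery("lon", -118.371, -118.164),
    RangeQuery("lat", 33.915, 34.089),
))

# Hypothetical rows, for illustration only.
rows = [
    {"name": "a", "lon": -118.25, "lat": 34.05},
    {"name": "b", "lon": -122.40, "lat": 37.80},
]
hits = select(selection, rows)
```

Because the selection refers to data fields and not to pixels, the same query can be re-run when the data changes or applied to a different chart of the same table.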
WANTED: Data-Aware Annotation

Meta-queries linking annotations to views
Visually specifying notification triggers
Annotating data aggregates (use lineage?)
Unified model (again!) to facilitate reference
How to make it work at scale?

How else to use machine-readable annotations?
Can annotations be used to steer data mining?
Conclusion
Social Data Analysis
Collective analysis of data supported
by social interaction.
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming
Summary
As visualization becomes common on the web,
opportunities for collaborative analysis abound.
Weave visualizations into the web: data access,
visualization creation, view sharing and pointing.
Support discovery, discussion, and integration
of contributions to leverage the collective.
Improve both processes and technologies for
communication and dissemination.
Parting Thoughts
Visualizations may have a catalytic effect
on social interaction around data.

Encourage participation by minimizing or
offsetting interaction costs.

Provide incentives by fostering the
personal relevance of the data.
Acknowledgements

@ Berkeley: Maneesh Agrawala, Wes Willett,
  danah boyd, Marti Hearst, Joe Hellerstein
@ IBM: Martin Wattenberg, Fernanda Viégas
@ PARC: Stu Card
@ Tableau: Jock Mackinlay, Chris Stolte,
  Christian Chabot
Voyagers and Voyeurs
Supporting Social Data Analysis

Jeffrey Heer Stanford University
jheer@stanford.edu
http://jheer.org
With a collaborative spirit, with a collaborative platform
where people can upload data, explore data, compare
solutions, discuss the results, build consensus, we can
engage passionate people, local communities, media and
this will raise - incredibly - the amount of people who can
understand what is going on.

And this would have fantastic outcomes: the engagement of
people, especially new generations; it would increase
knowledge, unlock statistics, improve transparency and
accountability of public policies, change culture, increase
numeracy, and in the end, improve democracy and welfare.

       Enrico Giovannini, Chief Statistician, OECD. June 2007.

Weitere ähnliche Inhalte

Andere mochten auch

Andere mochten auch (6)

Cidr
CidrCidr
Cidr
 
C I D R
C I D RC I D R
C I D R
 
Cidr.ppt
Cidr.pptCidr.ppt
Cidr.ppt
 
Unicast multicast & broadcast
Unicast multicast & broadcastUnicast multicast & broadcast
Unicast multicast & broadcast
 
Ch05
Ch05Ch05
Ch05
 
Classless addressing
Classless addressingClassless addressing
Classless addressing
 

Ähnlich wie CIDR 2009: Jeff Heer Keynote

Coping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing ApproachCoping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing ApproachAndre Freitas
 
Research on collaborative information sharing systems
Research on collaborative information sharing systemsResearch on collaborative information sharing systems
Research on collaborative information sharing systemsDavide Eynard
 
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent SystemsFrom Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent SystemsMathieu d'Aquin
 
Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...
Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...
Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...Stephanie Steinhardt
 
Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2Stefano A Gazziano
 
Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...ACMBangalore
 
Re-Empower the Public with Data Visualization and Game Design
Re-Empower the Public with Data Visualization and Game DesignRe-Empower the Public with Data Visualization and Game Design
Re-Empower the Public with Data Visualization and Game DesignSam Pottinger
 
Querying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data WebQuerying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data WebEdward Curry
 
Spark Social Media
Spark Social Media Spark Social Media
Spark Social Media suresh sood
 
Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...
Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...
Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...DataWorks Summit/Hadoop Summit
 
Open Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing WorkOpen Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing WorkResearch Data Alliance
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015Marianne Sweeny
 
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020P2Pvalue
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?Elena Simperl
 
Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Fernando de Assis Rodrigues
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so farElena Simperl
 

Ähnlich wie CIDR 2009: Jeff Heer Keynote (20)

Coping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing ApproachCoping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing Approach
 
Show me the data! Actionable insight from open courses
Show me the data! Actionable insight from open coursesShow me the data! Actionable insight from open courses
Show me the data! Actionable insight from open courses
 
Research on collaborative information sharing systems
Research on collaborative information sharing systemsResearch on collaborative information sharing systems
Research on collaborative information sharing systems
 
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent SystemsFrom Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
 
Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...
Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...
Designing for Collaboration: Challenges & Considerations of Multi-Use Informa...
 
Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2Digital cultural heritage spring 2015 day 2
Digital cultural heritage spring 2015 day 2
 
Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...
 
data, big data, open data
data, big data, open datadata, big data, open data
data, big data, open data
 
Re-Empower the Public with Data Visualization and Game Design
Re-Empower the Public with Data Visualization and Game DesignRe-Empower the Public with Data Visualization and Game Design
Re-Empower the Public with Data Visualization and Game Design
 
Why Data Science is a Science
Why Data Science is a ScienceWhy Data Science is a Science
Why Data Science is a Science
 
Big Data Trends
Big Data TrendsBig Data Trends
Big Data Trends
 
Querying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data WebQuerying Heterogeneous Datasets on the Linked Data Web
Querying Heterogeneous Datasets on the Linked Data Web
 
Spark Social Media
Spark Social Media Spark Social Media
Spark Social Media
 
Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...
Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...
Using Machine Learning to Capture Data Meaning and Wrangle it to Liberate its...
 
Open Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing WorkOpen Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing Work
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015
 
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?
 
Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so far
 

Mehr von infoblog

CIDR 2009: James Hamilton Keynote
CIDR 2009: James Hamilton KeynoteCIDR 2009: James Hamilton Keynote
CIDR 2009: James Hamilton Keynoteinfoblog
 
Claremont Report on Database Research: Research Directions (Le Gruenwald)
Claremont Report on Database Research: Research Directions (Le Gruenwald)Claremont Report on Database Research: Research Directions (Le Gruenwald)
Claremont Report on Database Research: Research Directions (Le Gruenwald)infoblog
 
Claremont Report on Database Research: Research Directions (Eric A. Brewer)
Claremont Report on Database Research: Research Directions (Eric A. Brewer)Claremont Report on Database Research: Research Directions (Eric A. Brewer)
Claremont Report on Database Research: Research Directions (Eric A. Brewer)infoblog
 
Claremont Report on Database Research: Research Directions (Rakesh Agrawal)
Claremont Report on Database Research: Research Directions (Rakesh Agrawal)Claremont Report on Database Research: Research Directions (Rakesh Agrawal)
Claremont Report on Database Research: Research Directions (Rakesh Agrawal)infoblog
 
Claremont Report on Database Research: Research Directions (Gerhard Weikum)
Claremont Report on Database Research: Research Directions (Gerhard Weikum)Claremont Report on Database Research: Research Directions (Gerhard Weikum)
Claremont Report on Database Research: Research Directions (Gerhard Weikum)infoblog
 
Claremont Report on Database Research: Research Directions (Beng Chin Ooi)
Claremont Report on Database Research: Research Directions (Beng Chin Ooi)Claremont Report on Database Research: Research Directions (Beng Chin Ooi)
Claremont Report on Database Research: Research Directions (Beng Chin Ooi)infoblog
 
Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)
Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)
Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)infoblog
 
Claremont Report on Database Research: Research Directions (Donald Kossmann)
Claremont Report on Database Research: Research Directions (Donald Kossmann)Claremont Report on Database Research: Research Directions (Donald Kossmann)
Claremont Report on Database Research: Research Directions (Donald Kossmann)infoblog
 
Claremont Report on Database Research: Research Directions (Johannes Gehrke)
Claremont Report on Database Research: Research Directions (Johannes Gehrke)Claremont Report on Database Research: Research Directions (Johannes Gehrke)
Claremont Report on Database Research: Research Directions (Johannes Gehrke)infoblog
 
Claremont Report on Database Research: Research Directions (Alon Y. Halevy)
Claremont Report on Database Research: Research Directions (Alon Y. Halevy)Claremont Report on Database Research: Research Directions (Alon Y. Halevy)
Claremont Report on Database Research: Research Directions (Alon Y. Halevy)infoblog
 
Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)
Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)
Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)infoblog
 
Database Research Principles Revealed (Small Size)
Database Research Principles Revealed (Small Size)Database Research Principles Revealed (Small Size)
Database Research Principles Revealed (Small Size)infoblog
 
Database Research Principles Revealed
Database Research Principles RevealedDatabase Research Principles Revealed
Database Research Principles Revealedinfoblog
 

Mehr von infoblog (14)

CIDR 2009: James Hamilton Keynote
CIDR 2009: James Hamilton KeynoteCIDR 2009: James Hamilton Keynote
CIDR 2009: James Hamilton Keynote
 
Claremont Report on Database Research: Research Directions (Le Gruenwald)
Claremont Report on Database Research: Research Directions (Le Gruenwald)Claremont Report on Database Research: Research Directions (Le Gruenwald)
Claremont Report on Database Research: Research Directions (Le Gruenwald)
 
Claremont Report on Database Research: Research Directions (Eric A. Brewer)
Claremont Report on Database Research: Research Directions (Eric A. Brewer)Claremont Report on Database Research: Research Directions (Eric A. Brewer)
Claremont Report on Database Research: Research Directions (Eric A. Brewer)
 
Claremont Report on Database Research: Research Directions (Rakesh Agrawal)
Claremont Report on Database Research: Research Directions (Rakesh Agrawal)Claremont Report on Database Research: Research Directions (Rakesh Agrawal)
Claremont Report on Database Research: Research Directions (Rakesh Agrawal)
 
Claremont Report on Database Research: Research Directions (Gerhard Weikum)
Claremont Report on Database Research: Research Directions (Gerhard Weikum)Claremont Report on Database Research: Research Directions (Gerhard Weikum)
Claremont Report on Database Research: Research Directions (Gerhard Weikum)
 
Claremont Report on Database Research: Research Directions (Beng Chin Ooi)
Claremont Report on Database Research: Research Directions (Beng Chin Ooi)Claremont Report on Database Research: Research Directions (Beng Chin Ooi)
Claremont Report on Database Research: Research Directions (Beng Chin Ooi)
 
Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)
Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)
Claremont Report on Database Research: Research Directions (Yannis E. Ioannidis)
 
Claremont Report on Database Research: Research Directions (Donald Kossmann)
Claremont Report on Database Research: Research Directions (Donald Kossmann)Claremont Report on Database Research: Research Directions (Donald Kossmann)
Claremont Report on Database Research: Research Directions (Donald Kossmann)
 
Claremont Report on Database Research: Research Directions (Johannes Gehrke)
Claremont Report on Database Research: Research Directions (Johannes Gehrke)Claremont Report on Database Research: Research Directions (Johannes Gehrke)
Claremont Report on Database Research: Research Directions (Johannes Gehrke)
 
Claremont Report on Database Research: Research Directions (Alon Y. Halevy)
Claremont Report on Database Research: Research Directions (Alon Y. Halevy)Claremont Report on Database Research: Research Directions (Alon Y. Halevy)
Claremont Report on Database Research: Research Directions (Alon Y. Halevy)
 
Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)
Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)
Claremont Report on Database Research: Research Directions (Anastasia Ailamaki)
 
Spot Sigs
Spot SigsSpot Sigs
Spot Sigs
 
Database Research Principles Revealed (Small Size)
Database Research Principles Revealed (Small Size)Database Research Principles Revealed (Small Size)
Database Research Principles Revealed (Small Size)
 
Database Research Principles Revealed
Database Research Principles RevealedDatabase Research Principles Revealed
Database Research Principles Revealed
 

Kürzlich hochgeladen

What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Kürzlich hochgeladen (20)

What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
CIDR 2009: Jeff Heer Keynote

  • 1. Voyagers and Voyeurs Supporting Social Data Analysis Jeffrey Heer Computer Science Department Stanford University CIDR 2009 – Monterey, CA 5 January 2009
  • 2. A Tale of Two Visualizations
  • 4. Observations Groups spent more time in front of the visualization than individuals. Friends encouraged each other to unearth relationships, probe community boundaries, and challenge reported information. Social play resulted in informal analysis, often driven by story-telling of group histories.
  • 10. Social Data Analysis Visual sensemaking can be social as well as cognitive. Analysis of data coupled with social interpretation and deliberation. How can user interfaces catalyze and support collaborative visual analysis?
  • 11. sense.us A Web Application for Collaborative Visualization of Demographic Data
  • 13. Voyagers and Voyeurs Complementary faces of analysis Voyager – focus on visualized data Active engagement with the data Serendipitous comment discovery Voyeur – focus on comment listings Investigate others’ explorations Find people and topics of interest Catalyze new explorations
  • 14. Out of the Lab, Into the Wild
  • 22. Social Data Analysis In Action 1. Discussion and Debate 2. Text is Data, Too 3. Data Integrity and Cleaning 4. Integrating Data in Context 5. Pointing and Naming For each, some thoughts on future directions. I asked my colleagues: if you could give database researchers a wish list, what would it be?
  • 27. Tableau X-Box / Quest Diag? “Valley of Death”
  • 31. Content Analysis of Comments [Figure: bar charts of comment feature prevalence (percentage) by service, Sense.us and Many-Eyes; categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation] Feature prevalence from content analysis (min Cohen’s κ = .74). High co-occurrence of Observations, Questions, and Hypotheses.
  • 32. WANTED: Structured Conversation Reduce the cost of synthesizing contributions Wikipedia: Shared Revisions NASA ClickWorkers: Statistics
  • 33. WANTED: Structured Conversation Reduce the cost of synthesizing contributions Can we represent data, visualizations, and social activity in a unified data model?
  • 35. Visualization Popularity [Figure: bar charts of visualization type popularity (percentage) by service, Many-Eyes and Swivel; types: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram] Over 1/3 of Many-Eyes visualizations use free text.
  • 38. WANTED: Better Tools for Text Statistical Analysis of text (with ties to source!) Entity Extraction Aggregation and Comparison of texts Get a “global” view of documents We can do better than Tag Clouds (!?) Use text analysis tools to enable analysis of structured conversation by the community.
  • 39. Data Integrity and Cleaning
  • 40. No cooks in 1910? … There may have been cooks then. But maybe not.
  • 41. The great postmaster scourge of 1910? Or just a bug in the data?
  • 44. Content Analysis of Comments [Figure: bar charts of comment feature prevalence (percentage) by service, Sense.us and Many-Eyes; categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation] 16% of sense.us comments and 10% of Many-Eyes comments reference data quality or integrity.
  • 45. WANTED: Data Cleaning Tools Reshape data, reformat rows & columns Handle missing data: label, repair, interpolate Entity resolution and de-duplication Group related values into aggregates Assist table lookups & data transforms Provide tools in situ to leverage collective Transparency requires provenance
  • 51. Harry Potter is Freaking Popular
  • 53. WANTED: In-Situ Data Integration Search for and suggest related data or views User input for types, schema matching, or data Apply in context of the current task But record mappings for future use Record provenance: chain of data sources Examples: Google Web Tables, Pay-As-You-Go, Stanford Vispedia, Utah VisTrails
  • 55. “Look at that spike.”
  • 56. “Look at the spike for Turkey.”
  • 57. “Look at the spike in the middle.”
  • 58. Free-form Data-aware
  • 59. Visual Queries Model selections as declarative queries over interface elements or underlying data (-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089)
  • 60. Visual Queries Model selections as declarative queries over interface elements or underlying data Applicable to dynamic, time-varying data Retarget selection across visual encodings Support social navigation and data mining
  • 61. WANTED: Data-Aware Annotation Meta-queries linking annotations to views Visually specifying notification triggers Annotating data aggregates (use lineage?) Unified model (again!) to facilitate reference How to make it work at scale? How else to use machine-readable annotations? Can annotations be used to steer data mining?
  • 63. Social Data Analysis Collective analysis of data supported by social interaction. 1. Discussion and Debate 2. Text is Data, Too 3. Data Integrity and Cleaning 4. Integrating Data in Context 5. Pointing and Naming
  • 64. Summary As visualization becomes common on the web, opportunities for collaborative analysis abound. Weave visualizations into the web: data access, visualization creation, view sharing and pointing. Support discovery, discussion, and integration of contributions to leverage the collective. Improve both processes and technologies for communication and dissemination.
  • 65. Parting Thoughts Visualizations may have a catalytic effect on social interaction around data. Encourage participation by minimizing or offsetting interaction costs. Provide incentives by fostering the personal relevance of the data.
  • 66. Acknowledgements @ Berkeley: Maneesh Agrawala, Wes Willett, danah boyd, Marti Hearst, Joe Hellerstein @ IBM: Martin Wattenberg, Fernanda Viégas @ PARC: Stu Card @ Tableau: Jock Mackinlay, Chris Stolte, Christian Chabot
  • 67. Voyagers and Voyeurs Supporting Social Data Analysis Jeffrey Heer Stanford University jheer@stanford.edu http://jheer.org
  • 68. With a collaborative spirit, with a collaborative platform where people can upload data, explore data, compare solutions, discuss the results, build consensus, we can engage passionate people, local communities, media and this will raise - incredibly - the amount of people who can understand what is going on. And this would have fantastic outcomes: the engagement of people, especially new generations; it would increase knowledge, unlock statistics, improve transparency and accountability of public policies, change culture, increase numeracy, and in the end, improve democracy and welfare. Enrico Giovannini, Chief Statistician, OECD. June 2007.
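
The Visual Queries slides (59 and 60) propose modeling a selection as a declarative query over the underlying data rather than as a region of pixels. The following is a minimal Python sketch of that idea, not the talk's implementation: the record layout, the `Bounds` type, and the `matches` and `select` helpers are all hypothetical names chosen for illustration. Only the longitude and latitude bounds come from slide 59.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a selection is a conjunction of closed
# intervals, one per data field (field name -> (low, high)).
Bounds = Dict[str, Tuple[float, float]]

# The bounding box shown on slide 59, expressed as data rather than pixels.
BOX: Bounds = {"lon": (-118.371, -118.164), "lat": (33.915, 34.089)}


def matches(record: dict, bounds: Bounds) -> bool:
    """Return True if the record satisfies every interval in the query."""
    return all(lo <= record[field] <= hi for field, (lo, hi) in bounds.items())


def select(records: List[dict], bounds: Bounds) -> List[dict]:
    """Re-evaluate the stored query against whatever data is current."""
    return [r for r in records if matches(r, bounds)]


def names(records: List[dict]) -> List[str]:
    """Helper for the example: pull out the (made-up) record names."""
    return [r["name"] for r in records]


# Made-up records standing in for a time-varying data set.
snapshot_t0 = [
    {"name": "a", "lon": -118.25, "lat": 34.05},
    {"name": "b", "lon": -122.42, "lat": 37.77},
]
snapshot_t1 = snapshot_t0 + [{"name": "c", "lon": -118.30, "lat": 33.95}]

selected_t0 = names(select(snapshot_t0, BOX))
selected_t1 = names(select(snapshot_t1, BOX))
```

Because the annotation stores the query and not a list of marks, the same selection can be re-run when the data changes (the later snapshot picks up the new record) and could in principle be re-applied under a different visual encoding of the same fields, which is the retargeting behavior slide 60 describes.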