Visual Analytics for the Digital Humanities:
Combining Analytics and Visualization for Gaining
Insights into Linguistic Data

Daniel A. Keim
Data Analysis and Information
Visualization Group
University of Konstanz, Germany

Herrenhausen Conference, Hannover, Germany
December 5, 2013

1
Visual Analytics

"Computers are incredibly fast,
accurate, and stupid; humans are
incredibly slow, inaccurate, and
brilliant; together they are powerful
beyond imagination."
attributed to Albert Einstein

Visual Analytics
Tight Integration of Visual and Automatic Data Analysis Methods
for Information Exploration and Scalable Decision Support

Visual Data Exploration
Visualization
Data

Knowledge
Models
Automated Data Analysis
Feedback loop

2
Visual Analytics

Roadmap from the
VisMaster EU Project

www.visual-analytics.eu

Video

3
Why Visualization for the Digital Humanities?
• Automated techniques not sufficient
– Data ambiguous and incomplete
– Complex relationships
– Semantic gap
– Limited Accuracy

• Human Interaction is central for
– Exploration of Data
– Generation of Hypotheses
– Interpretation of Results
– Steering of the Analysis

Outline
• Visual Analytics
– Motivation and Definition
– Visualization for the e-Humanities

• Visual Analytics Examples
– Literature Analysis
– Language Analysis
– Political Analysis

• Perspectives

4
Authorship Attribution

Books of Mark Twain

Books of Jack London

Authorship Attribution

Average
or
Development
over the text?

5
Literature Fingerprinting

Book of
Jack
London
Book of
Mark
Twain

One Book

One block of 10000 words
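
A minimal sketch of the block idea above, assuming average sentence length as the per-block feature (the slide does not state which feature is plotted). The helper names `split_into_blocks`, `avg_sentence_length`, and `fingerprint` are hypothetical. In a fingerprint view, each returned value would be mapped to the color of one block.

```python
import re


def split_into_blocks(text, block_size=10000):
    """Split a text into consecutive blocks of `block_size` words."""
    words = text.split()
    return [words[i:i + block_size] for i in range(0, len(words), block_size)]


def avg_sentence_length(words):
    """Average words per sentence in one block (assumed feature, not from the slides)."""
    sentences = [s for s in re.split(r"[.!?]+", " ".join(words)) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def fingerprint(text, block_size=10000):
    """Return one feature value per block of the book."""
    return [avg_sentence_length(block) for block in split_into_blocks(text, block_size)]
```

Comparing the per-block values of two books, rather than one average per book, is what makes a development over the text visible.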

6
7
Age Suitability Analysis

Features

Characters (Part of Harry Potter)

– Character Detection
– Topic Detection
– Emotion Detection
– Story Complexity
– Book Features
– Readability

Characters (Part of Stephen King’s “It”)

Characters are, for example,
(1) Named Entities (2) often agents of verbs
(3) usually not after prepositions indicating a location
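
The three cues above can be combined into a rough score. This is only a sketch under assumptions: the input is a list of named-entity mentions already produced by some upstream NLP pipeline, the preposition list is illustrative, and the scoring formula and the name `character_candidates` are hypothetical rather than the method used in the talk.

```python
from collections import defaultdict

# Assumption: a small illustrative set of prepositions that often introduce a location.
LOCATION_PREPOSITIONS = {"in", "at", "to", "from", "near"}


def character_candidates(mentions, threshold=0.0):
    """Score named entities as likely characters.

    `mentions` holds one (name, is_agent, previous_word) tuple per
    named-entity mention: whether the entity is the agent of a verb,
    and which word directly precedes it.
    """
    total = defaultdict(int)
    agent = defaultdict(int)
    after_location = defaultdict(int)
    for name, is_agent, previous_word in mentions:
        total[name] += 1
        if is_agent:
            agent[name] += 1
        if previous_word.lower() in LOCATION_PREPOSITIONS:
            after_location[name] += 1
    # Share of agent mentions minus share of mentions after a location preposition.
    scores = {n: (agent[n] - after_location[n]) / total[n] for n in total}
    return {n: s for n, s in scores.items() if s > threshold}
```

For example, an entity that mostly acts ("Harry said") scores high, while one that mostly follows "at" or "to" scores low and is filtered out as a probable place.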

Age Suitability Analysis

8

Cross-Language Analysis

9
Cross-Language Analysis

Languages from Papua New Guinea with leaves showing features
ordered to maximize (left) and minimize (right) the pairwise leaf similarity

Cross-Language Analysis

10
Vowel Harmony: Cross-linguistic Comparison
of Complex Language Features

“two-level” Vowel Harmony

i and u avoid each other

“one-level” Vowel Harmony

syllable reduplication

Vowel succession patterns in 42 languages (automatically
sorted by significance) [2]

Vowel Harmony: Cross-linguistic Comparison
of Complex Language Features

Comparing Swedish and Norwegian: Vowel
transitions according to their position within words
based on at least 50 Bible types.

Vowel transitions according to their position within
words. Only transitions based on at least 200 Bible types
are plotted (interactive filter).
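
A minimal sketch of counting vowel transitions with a minimum-support filter, assuming that "types" means distinct word forms and using a simplified five-vowel inventory. It ignores the position within the word that the slides also show, and the function name `vowel_transitions` is hypothetical.

```python
from collections import Counter

# Assumption: a simplified vowel inventory; real languages need their own sets.
VOWELS = set("aeiou")


def vowel_transitions(word_types, min_types=1):
    """Count, for each pair of successive vowels, the number of word types containing it.

    Consonants between the two vowels are skipped. Pairs supported by fewer
    than `min_types` distinct word types are dropped, mirroring the filter
    thresholds (50 or 200 types) mentioned on the slides.
    """
    counts = Counter()
    for word in set(word_types):
        vowels = [ch for ch in word.lower() if ch in VOWELS]
        # Count each distinct transition once per word type.
        for pair in set(zip(vowels, vowels[1:])):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_types}
```

Raising `min_types` trades coverage for reliability: rare transitions disappear, and the remaining ones rest on more evidence.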

11
Tracking Semantic Change
Frequency development of different word senses automatically induced from word contexts with topic modeling.
Data: NYT Annotated Corpus, 1.8 million articles from daily newspaper editions 1987-2007

Reprinted from [3], © 2011 Association for Computational Linguistics

Analyzing Prosodic Features: Intonation

12
Analyzing Prosodic Features: Intonation


13
One day of the
Stuttgart 21
mediations

BMBF Project
VisArgue

Presidential
Debate
Analysis

BMBF Project
VisArgue

14
Presidential
Debate
Analysis

Topic Shifts
BMBF Project
VisArgue

Presidential
Debate
Analysis

Crosstalk
BMBF Project
VisArgue

15
Comparison of
US-Presidential
Debates

Obama vs. McCain
2008

Obama vs. Romney
2012

BMBF Project
VisArgue

Stuttgart 21
Discourse
Analysis

BMBF Project
VisArgue

16
Analysis of Policy Networks

Parallel Tag Clouds to Show Differences
across US Court Circuits

Reprinted from Collins et al. [9], © 2009 IEEE

17
Voronoi Treemaps [10] in NYT

http://www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html?_r=0


18
Visualization in the Digital Humanities
• Visualization is central to allow humans and computers
to cooperate effectively
– allow the computer to process large data
– allow the human to understand and interact with large data

• Interactive Visualization is central for
– Exploration of Data
– Interpretation of Results
– Generation of Hypotheses
– Steering of the Analysis

19
Thank you for your attention.
Questions?
“Anyone who claims to know all the answers
doesn't really know very much.”
Apostle Paul, 1 Cor. 8:2

infovis.uni-konstanz.de

20
