Training workshop for the CHASE Arts and Humanities in the Digital Age programme.
This session will give you an overview of a variety of techniques and tools available for data visualisation and analysis in the humanities. You will learn about common types of visualisations and the role of exploratory and explanatory visualisations, explore examples of scholarly visualisations, try some visualisation tools, and know where to find further information about analysing and building data visualisations.
2. While we're getting started...
• Check that you can get online with the browsers Firefox or Chrome
• The Exercises page contains all the links you need during the day
• Check you can view it now: http://bit.ly/2kYtGx4
• Check you can log in to Viewshare with your new account
http://viewshare.org/
• Timetable
• 11am Tea and coffee
• 1 - 1:45pm Lunch
• 3 - 3:15pm Tea and coffee
• 4:30pm Finish; free working time until 5pm
3. Overview
• What is information visualisation and why use
it?
• The building blocks of visualisations
• Exploring and critiquing interactive
visualisations
• Getting from the data you have to the
visualisation you want
5. Visualisation is the graphical display of
quantitative or qualitative information to create
insights by highlighting patterns, trends,
variations and anomalies.
10. Why visualise information?
For 'sense-making (also called data analysis) and
communication' (Stephen Few)
'…showing quantitative and qualitative information
so that a viewer can see patterns, trends, or
anomalies, constancy or variation' (Michael
Friendly)
'…interactive, visual representations of abstract
data to amplify cognition' (Card et al.)
'Distant reading' (Moretti) - focus on the shape
rather than detail of a collection
11. Introductions
• In a sentence or two, what's your interest in
data visualisation?
– What kinds of data do you work with?
– What's the goal of any visualisations you're
interested in creating?
– Do you have any potential users in mind?
17. Charles Minard's figurative map, 1869
'Figurative Map of the successive losses in men of the French Army in the Russian campaign
1812-1813'. Drawn up by M. Minard, Inspector General of Bridges and Roads in retirement.
Paris, November 20, 1869.
18. Web 2.0 and the mashup, 2006
http://www.bombsight.org
22. Exercise: compare n-gram tools
http://bit.ly/2kYtGx4
• Think of two words or phrases you'd like to
compare over time (e.g. Burma, Burmah).
• Open two browser windows
• In one, go to http://books.google.com/ngrams
• In the other, go to http://benschmidt.org/OL/
• Enter your words or phrases in each and compare
the results
• Discuss with your neighbour: what differences
did you find, and why?
26. Networks
Every point on this diagram represents a male film producer. The pink dots represent men who worked exclusively with other men in the period
surveyed, and the green dots represent those who worked with women.
https://theconversation.com/women-arent-the-problem-in-the-film-industry-men-are-68740 Deb Verhoeven and Stuart Palmer
27. Visualising images and video
http://www.flickr.com/photos/culturevis/5883371358/
'Mondrian vs. Rothko', Lev Manovich, 2010. Image preparation: Xiaoda Wang
30. Data types
• Quantitative
• Qualitative
• Geographic
• Temporal
• Media
• Entities (people, places, events, concepts,
things)
31. How do you get data to visualise?
• Make it
– Type it into a spreadsheet or database
• Automate it
– Extract it from text, images, audio or video
• Find it
– Lots of freely available data to practise with
39. Scholarly data visualisations
• Visualisations as 'distant reading' where
distance is 'a specific form of knowledge:
fewer elements, hence a sharper sense of
their overall interconnection' (Moretti, 2005)
• Inspiring curiosity and research questions
• But - which questions do they privilege and
what do they leave out?
40. Exercise: critiquing scholarly visualisations
Go to http://bit.ly/2kYtGx4 and follow the steps
for Exercise 3
Pair up and discuss together before reporting
back.
59. Considerations for humanities data
Commercial tools often assume complete,
born-digital datasets – no missing fields or
changes in data entry over time
• Historical records often contain uncertainty
and fuzziness (e.g. date ranges, multiple
values, uncertain or unavailable information)
• Humanities data includes metadata, data and
digital surrogates
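One way to keep the uncertainty in a date rather than throwing it away is to store an earliest year, a latest year and an "uncertain" flag. The sketch below is only an illustration: the function name `parse_date` and the rule that "?", "c." or "circa" signal uncertainty are assumptions, not part of the workshop material.

```python
import re

def parse_date(raw):
    """Turn a free-text date into earliest/latest years plus an uncertainty flag.

    Assumption: any "?", "c." or "circa" in the text marks the date as uncertain.
    """
    # Collect every four-digit number as a candidate year
    years = [int(y) for y in re.findall(r"\d{4}", raw)]
    if not years:
        return None  # nothing usable, e.g. "unknown"
    uncertain = bool(re.search(r"\?|\bc\.|\bcirca\b", raw, flags=re.IGNORECASE))
    return {"earliest": min(years), "latest": max(years), "uncertain": uncertain}

print(parse_date("1812-1813"))
print(parse_date("c. 1850"))
```

Keeping the range and the flag in separate columns means a tool can still plot the record while you decide how to show the doubt visually.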
60. Messiness in historical data
• 'Begun in Kiryu, Japan, finished in France'
• 'Bali? Java? Mexico?'
• Variations on USA:
– U.S.
– U.S.A
– U.S.A.
– USA
– United States of America
– USA ?
– United States (case)
• Inconsistency in uncertainty
– U.S.A. or England
– U.S.A./England ?
– England & U.S.A.
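The variations above could be tidied with a small lookup table. This is a minimal sketch rather than a recommended method: the `CANONICAL` table and `normalise_place` function are hypothetical, and treating "?" or "or" as a sign of uncertainty (and "&" or "/" as separating several places) is an assumption you would need to check against your own records.

```python
import re

# Hypothetical lookup: lower-cased, punctuation-free variant -> preferred form
CANONICAL = {
    "us": "USA",
    "usa": "USA",
    "united states": "USA",
    "united states of america": "USA",
    "england": "England",
}

def normalise_place(raw):
    """Return (list of preferred names, uncertain flag) for one messy cell."""
    # Assumption: "?" or the word "or" means the cataloguer was unsure
    uncertain = "?" in raw or " or " in raw.lower()
    # Split alternatives on "or", "/" and "&"
    parts = re.split(r"\s+or\s+|/|&", raw.replace("?", ""), flags=re.IGNORECASE)
    names = []
    for part in parts:
        # Strip punctuation and extra spaces before looking the value up
        key = re.sub(r"[^a-z ]", "", part.lower())
        key = re.sub(r"\s+", " ", key).strip()
        if key:
            # Unknown values are kept as typed so nothing is silently lost
            names.append(CANONICAL.get(key, part.strip()))
    return names, uncertain

for cell in ["U.S.", "USA ?", "U.S.A./England ?", "England & U.S.A."]:
    print(cell, "->", normalise_place(cell))
```

Tools such as OpenRefine do this kind of clustering interactively; the code only shows the underlying idea.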
63. Preparing data for visualisations
Historical data often needs manual cleaning to:
• remove rows where vital information is missing
• tidy inconsistencies in term lists or spelling
• convert words to numbers (e.g. dates)
• remove hard returns and non-ASCII characters (or
change data format)
• split multiple values in one field into other
columns (e.g. author name, date in single field)
• expand coded values (e.g. countries, language)
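Several of these cleaning steps can be scripted once the rules are clear. The following is a sketch under stated assumptions: the column names (`title`, `author_date`, `language`), the `LANGUAGE_CODES` table and the `clean_row` function are all invented for illustration, and the "author, year" pattern will not fit every catalogue.

```python
import re

# Hypothetical code list; a real project would use its own authority file
LANGUAGE_CODES = {"eng": "English", "fre": "French", "ger": "German"}

def clean_row(row):
    """Clean one record (a dict of strings); return None if vital data is missing."""
    # Remove rows where vital information is missing (here: no title)
    if not row.get("title", "").strip():
        return None
    cleaned = {}
    # Remove hard returns and collapse runs of whitespace
    cleaned["title"] = re.sub(r"\s+", " ", row["title"]).strip()
    # Split "author name, date" held in a single field into two columns
    author_date = row.get("author_date", "").strip()
    match = re.match(r"^(.*?),\s*(\d{4})$", author_date)
    if match:
        cleaned["author"] = match.group(1)
        cleaned["year"] = int(match.group(2))  # store the year as a number
    else:
        cleaned["author"] = author_date
        cleaned["year"] = None
    # Expand coded values, leaving unknown codes unchanged
    code = row.get("language", "").strip()
    cleaned["language"] = LANGUAGE_CODES.get(code, code)
    return cleaned

print(clean_row({
    "title": "A  Tale\nof Two Cities",
    "author_date": "Dickens, Charles, 1859",
    "language": "eng",
}))
```

Rows that return `None` are worth reviewing by hand before discarding, since "missing" information in historical records is sometimes recoverable.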
73. Key format decisions
• Static or interactive?
• Print or digital?
• Narrative or 'factual'?
• Shape (distant view) or detail (close view)?
74. Purpose, data, audience, structure
• Intersections of format and purpose
• Data types: quantitative, qualitative,
geographic, time series, media, entities
(people, places, events, concepts, things)
• Static, interactive; print, digital; product,
process
• Exploratory, explanatory: find new insights, or
tell a story? Pragmatic, emotive?
75. Dealing with complex data
• Find a visualisation type that can harbour the
data in a meaningful way or reduce the data in
a meaningful way.
– e.g. go from individual values to distribution of
values
– e.g. introduce interaction: overview, zoom and
filter, details on demand (Ben Shneiderman)
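Going "from individual values to distribution of values" can be as simple as counting items per decade. This sketch assumes your data is already a list of years; the function names `decade_distribution` and `text_histogram` are made up for the example.

```python
from collections import Counter

def decade_distribution(years):
    """Reduce individual years to a count of items per decade."""
    counts = Counter((year // 10) * 10 for year in years)
    return dict(sorted(counts.items()))

def text_histogram(distribution):
    """Draw one row of '#' marks per decade as a rough overview."""
    return [f"{decade}s {'#' * count}" for decade, count in distribution.items()]

years = [1851, 1855, 1862, 1869, 1869, 1901]
for line in text_histogram(decade_distribution(years)):
    print(line)
# 1850s ##
# 1860s ###
# 1900s #
```

The reduced view shows the shape of the collection; the individual records remain available for "details on demand".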
76. Exercise: 10 minute Viewshare tutorial
Instructions http://bit.ly/2kYtGx4
Discuss: what did you learn about preparing
data and using visualisation software?
83. Data Preparation
• Generally needs to be in tables, one row per
item, one column per value
• Aggregate or individual values - might need to
calculate totals in advance
• Data should be made as consistent as possible
with tools like Excel, OpenRefine
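If a tool expects totals rather than individual rows, they can be calculated in advance from a one-row-per-item table. The sample data, column names and `totals_by` function below are invented for illustration; a spreadsheet pivot table would do the same job.

```python
import csv
import io
from collections import defaultdict

# Invented sample: one row per item, one column per value
SAMPLE = """item,country,quantity
Shawl,France,2
Scarf,Japan,5
Kimono,Japan,1
"""

def totals_by(csv_text, group_column, value_column):
    """Add up value_column for each distinct entry in group_column."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_column]] += int(row[value_column])
    return dict(totals)

print(totals_by(SAMPLE, "country", "quantity"))
# {'France': 2, 'Japan': 6}
```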
85. Sample advice
From Viewshare, on spreadsheets:
• Remove any data that is not in a solid rectangular area.
This includes white space, page titles, scattered cells,
and additional worksheets.
• Check that your formatting is consistent throughout
each column (e.g. column is all in date format, currency
format, etc. as appropriate).
• Make sure that data of the same type but in different
columns is formatted consistently (e.g. dates in
different columns are in the same date format).
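A quick check for inconsistent formatting within a column might look like the sketch below. It assumes the column should hold dates written as year-month-day; the function name `inconsistent_dates` and the chosen format are illustrative, not something Viewshare prescribes.

```python
from datetime import datetime

def inconsistent_dates(values, date_format="%Y-%m-%d"):
    """Return (row number, value) pairs that do not match the expected format."""
    problems = []
    for row_number, value in enumerate(values, start=1):
        try:
            datetime.strptime(value.strip(), date_format)
        except ValueError:
            # Blank cells and differently formatted dates both end up here
            problems.append((row_number, value))
    return problems

column = ["1869-11-20", "20/11/1869", "1812-06-24", ""]
print(inconsistent_dates(column))
# [(2, '20/11/1869'), (4, '')]
```

Running the same check on every date column, with the same format, also covers the advice about keeping dates consistent across columns.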
86. If all else fails...
• Sketch out your visualisation on paper to test
it
• Iteration is key, and...
• Stubbornness is a virtue!
87. Exercise: try views and widgets in
Viewshare
Instructions http://bit.ly/2kYtGx4
Views
• Lists, maps, pie charts, bar charts, scatter plots, tables,
timelines or galleries
Widgets
• Search boxes, lists, tag clouds, sliders, ranges, logos or
text
How might you apply these with your own data?
94. Publishing visualisations
• How can you contextualise and explain any
limitations of your visualisations? e.g.
– provenance and qualities of original dataset;
– what you needed to do to it to get it into software
(how transformed, how cleaned);
– what's left out of the visualisation, and why?
95. Best practice for design
• How effectively does the visualisation support
cognitive tasks?
• The most important and frequent visual
queries/pattern finding should be supported
with the most visually distinct objects
• Question: which examples did this well?
96. Do you really need a visualisation?
• Use tables when:
– the document will be used to look up individual values
– it will be used to compare individual values
– precise values are required
– the quantitative information to be communicated
involves more than one unit of measure
• Use graphs when:
– the message is contained in the shape of the values
– the document will be used to reveal relationships
among values
98. Tools that don't require programming
• Excel
• Google Fusion Tables, Google Drive
• Viewshare
• Tableau Public
NB: be careful about sensitive data on cloud
platforms