
Conversations with data

4,056 views

#dalmooc 27/10/12 slides

Published in: Education

  1. Conversations with Data. Tony Hirst, Computing and Communications, The Open University
  2. (Recognising and addressing a skills gap)
  3. John Tukey, “journeyman carpenter of data-analytical tools”: “The Technical Tools of Statistics”, read at the 125th Anniversary Meeting of the American Statistical Association, Boston, November 1964; published in The American Statistician, April 1965. http://cm.bell-labs.com/cm/ms/departments/sia/tukey/memo/techtools.html (via Adam Cooper, “Exploratory Data Analysis”, http://blogs.cetis.ac.uk/adam/2012/05/18/exploratory-data-analysis/)
  4. “A Boy's Work is Never Done”, KellyB. (flickr: foreverphoto/2467694199/)
  5. “Exploratory data analysis is an attitude, a flexibility, and reliance on display, not a bundle of techniques and should be so taught.” -- John Tukey. Source: Tukey, John W. "We need both exploratory and confirmatory." The American Statistician 34.1 (1980): 23-25. http://www.ece.rice.edu/~fk1/classes/ELEC697/TukeyEDA.pdf
  6. “I … cannot disagree strongly enough with statements about the dangers of putting powerful tools in the hands of novices. Computer algebra, statistics, and graphics systems provide plenty of rope for novices to hang themselves and may even help to inhibit the learning of essential skills needed by researchers. The obvious problems caused by this situation do not justify blunting our tools, however. They require better education in the imaginative and disciplined use of these tools. And they call for more attention to the way powerful and sophisticated tools are presented to novice users.” -- Leland Wilkinson, The Grammar of Graphics, Springer-Verlag, 1999, ISBN 0-387-98774-6, pp. 15-16.
  7. Data accessibility. Data sensemaking.
  8. Clean. Shape. Augment. Look.
  9. Dirty Data
  10. openrefine.org
  11. Shapes…
  12. I see trees…
  13. See also: IPython notebook demo http://nbviewer.ipython.org/gist/psychemedia/9c54721e853403b43d21/pivotTable_demo.ipynb
  14. “There is no more reason to expect one graph to ‘tell all’ than to expect one number to do the same.” -- John Tukey
  15. If quantities are conserved, can you think of them in terms of flow?
  16. “[T]he picture examining eye is the best finder we have of the wholly unanticipated.” -- John Tukey. Source: Tukey, John W. "We need both exploratory and confirmatory." The American Statistician 34.1 (1980): 23-25. http://www.ece.rice.edu/~fk1/classes/ELEC697/TukeyEDA.pdf
  17. How can we look at data?
  18. How do we ask questions of data?
  19. Search limits: underspend filetype:xls site:gov.uk
  20. Structured queries. Search form: underspend filetype:xls site:gov.uk. SQL form: select webPages where text like “%underspend%” and filetype=“xls” and domain=“gov.uk”
  21. Count things. Sort things.
  22. http://www.coolinfographics.com/blog/2014/8/29/false-visualizations-sizing-circles-in-infographics.html
  23. How do we interpret the answers?
  24. Look for outliers. Top 3… …bottom 3
  25. Outliers may be rare occurrences over time too… Streaks and runs…
  26. Look for similarities & differences
  27. Look for trends
  28. Look for patterns & structure
  29. “Hand-drawing of graphs, except perhaps for reproduction in books and in some journals, is now economically wasteful, slow, and on the way out.” -- John Tukey
  30. Recording your conversations
  31. Rstudio.org
  32. IPython Notebook
  33. “I know of no person or group that is taking nearly adequate advantage of the graphical potentialities of the computer.” -- John Tukey
  34. Hopefully, that contained some ouseful.info -- @psychemedia
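
The question-asking slides in the transcript (the structured query, "Count things / Sort things", "Top 3… …bottom 3" and "Streaks and runs") can be sketched in a few lines of Python. This is a minimal illustration and not code from the talk: the sample records, the table layout and every function name are assumptions made up for this example, and the slide's pseudo-SQL is rewritten as a valid SQLite query.

```python
import sqlite3
from collections import Counter
from itertools import groupby

# Hypothetical records standing in for a web index (not real data).
PAGES = [
    ("Departmental underspend 2013", "xls", "gov.uk"),
    ("Annual report", "pdf", "gov.uk"),
    ("Budget underspend by ward", "xls", "example.org"),
    ("Capital underspend summary", "xls", "gov.uk"),
]


def structured_query(pages):
    """Run the structured form of the search against an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE webPages (text TEXT, filetype TEXT, domain TEXT)")
    conn.executemany("INSERT INTO webPages VALUES (?, ?, ?)", pages)
    rows = conn.execute(
        "SELECT text FROM webPages "
        "WHERE text LIKE '%underspend%' "
        "AND filetype = 'xls' AND domain = 'gov.uk' "
        "ORDER BY text"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def count_and_sort(values):
    """Count things, then sort by descending count (ties broken by value)."""
    return sorted(Counter(values).items(), key=lambda kv: (-kv[1], kv[0]))


def top_and_bottom(values, n=3):
    """Return the n largest and the n smallest values, both in descending order."""
    ordered = sorted(values, reverse=True)
    return ordered[:n], ordered[-n:]


def longest_run(results, target):
    """Length of the longest unbroken run of `target` in a sequence."""
    return max(
        (len(list(group)) for key, group in groupby(results) if key == target),
        default=0,
    )


if __name__ == "__main__":
    print(structured_query(PAGES))
    print(count_and_sort(page[1] for page in PAGES))
    print(top_and_bottom([5, 1, 9, 3, 7, 2]))
    print(longest_run("WWLWWWL", "W"))
```

Each function corresponds to one way of questioning a dataset: filter it with explicit conditions, count and rank categories, look at the extremes, and look for runs over time.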
