How and why study big cultural data v2

Visualizing large image and video collections: techniques, examples, theory.

More info: softwarestudies.com

Published in: Education

Transcript of "How and why study big cultural data v2"

  1. How and why study big visual cultural data. Dr. Lev Manovich, Professor, CUNY Graduate Center. manovich.lev@gmail.com, softwarestudies.com. Fall 2012 version.
  2. (image only)
  3. Software Studies Initiative - 2007; NEH Office for Digital Humanities - 2008; NEH Humanities High Performance Computing - 2008; NEH/NSF Digging Into Data competition - 2009; Computational Social Science - 2009; Culturomics and Google n-gram viewer - 2010; New York Times: “The next big idea in language, history and the arts? Data.” - 2010
  4. How can we take advantage of unprecedented amounts of cultural data available on the web and digitized cultural heritage to begin analyzing cultural processes in new ways? How can computational analysis of massive cultural datasets and real-time flows help us develop theories and methods in the humanities adequate for the scale and speed of 21st century global networked digital culture?
  5. NEH/NSF Digging into Data competition (2009): “How does the notion of scale affect humanities and social science research? Now that scholars have access to huge repositories of digitized data - far more than they could read in a lifetime - what does that mean for research?”
  6. Why study big cultural data?
  7. 1) study societies through social media traces (social computing); 2) more inclusive understanding of cultural history and the present (using much larger samples); 3) detect large-scale cultural patterns
  8. 4) generate multiple maps of the same cultural data sets (multiple “landscapes”); 5) the best way to follow global professionally produced digital culture and understand newly developed cultural fields (“X” design); 6) map cultural variability and diversity
  9. (image only)
  10. Example - graph from Ted Underwood, “The Differentiation of Literary and nonliterary diction, 1700-1900.” Data: 3,724 18th century volumes, using 10,000 most frequent words (excluding proper nouns).
  11. Modern (19th-20th centuries) social and cultural theory: describe what is similar (classes, structures, types) / statistics (reduction). Computational humanities and social science should focus on describing what is different / variability / diversity. “From data to knowledge” is wrong. In the study of culture, we need to go from our (incomplete, biased) knowledge to actual cultural data.
  12. “We are no longer interested in the conformity of an individual to an ideal type; we are now interested in the relation of an individual to the other individuals with which it interacts... Relations will be more important than categories; functions, which are variable, will be more important than purposes; transitions will be more important than boundaries; sequences will be more important than hierarchies.” Louis Menand on Darwin, 2001.
  13. Visualization: Thinking without “large” categories
  14. Manuel De Landa: “The ontological status of assemblages, large and small, is always that of unique, singular individuals.” “Unlike taxonomic essentialism in which genus, species and individuals are separate ontological categories, the ontology of assemblages is flat since it contains nothing but differently scaled individual singularities.” Source: A New Philosophy of Society.
  15. Bruno Latour: “The ‘whole’ is now nothing more than a provisional visualization which can be modified and reversed at will, by moving back to the individual components, and then looking for yet other tools to regroup the same elements into alternative assemblages.” Source: “Tarde’s idea of quantification.” In The Social After Gabriel Tarde: Debates and Assessments.
  16. How to study big cultural visual data in practice? How to explore massive visual collections (exploratory media analysis)? Which data analysis and visualization techniques are appropriate for non-technical users? How to democratize data analysis?
  17. Our methodology: media visualization. Display complete collection sorted using metadata and/or extracted features.
  18. infovis: data into pictures; mediavis: pictures into pictures
  19. left: scatter plot; right: media visualization (image plot) of the same data
  20. our media visualization software on 287 megapixel display (image: 1 million manga pages)
  21. our media visualization software on newer display wall with thin bezels (data: 4535 Time magazine covers)
  22. mediavis - related research: M. Worring, G.P. Nguyen. Interactive access to large image collections using similarity-based visualization. Journal of Visual Languages and Computing 19 (2008) (submitted 2005). Gerald Schaefer. Interactive Browsing of Image Repositories. ICVG 2012. Jing et al., Google Inc. Google Image Swirl: A Large-Scale Content-Based Image Visualization System. WWW 2012.
  23. mediavis vs. normal computer science approach: borrow techniques from media art, digital art, information visualization / for non-technical users; explore the possibilities of simplest techniques by using them with media collections from every area of humanities; use mediavis to challenge existing concepts and assumptions of humanities
  24. Basic media visualization techniques: 1) montage: sort images using metadata; 2) slice: sample images and arrange using metadata; 3) image plot: automatically measure image properties (features) and organize in 2D using these measurements and metadata
  25. 1 montage: sort images using metadata. 4535 Time covers, 1923-2009.
  26. 1 montage close-up: Time magazine covers, 1920s
  27. 1 montage close-up: Time magazine covers, 1990s-2000s
  28. 2 slice: sample images and arrange using metadata. 4535 Time covers, 1923-2009. Each line is a vertical slice through the center of an image.
  29. Time covers slice close-up
  30. 3 image plot: organize images using features and (optionally) metadata. Image plots of 4535 Time covers, 1923-2009. X-axis = date; Y-axis = saturation mean.
  31. Time covers image plot close-up
  32. Comparing a number of image sets with image plots. Selected paintings by six impressionist artists. X-axis = mean saturation. Y-axis = median hue. Megan O’Rourke, 2012.
  33. (image only)
  34. visualizing video collections: use media visualization with a set of keyframes; automatic selection of keyframes (for example, using free shot detection software)
  35. Kingdom Hearts video game. 62.5 hr. of game play, 29 sessions over 20 days. Montage: 1 frame per 3 sec (22500 frames in total).
  36. (image only)
  37. (image only)
  38. 11th Year (Dziga Vertov, 1928): first frame of every shot
  39. 11th Year (Dziga Vertov, 1928): comparing first and last frame in every shot (close-ups from the larger visualization)
  40. Why use numbers? Using numbers to describe cultural artifacts allows replacing discrete categories (words) with continuous descriptions (curves).
  41. 1) from timelines to graphs; 2) better represent analog attributes of cultural artifacts; 3) map cultural landscapes (fuzzy / overlapping / hard clusters?); 4) visualize cultural variability; 5) discover new groupings
  42. 1 from timelines to curves. Mark Rothko, 393 paintings (1927-1970). X - year. Y - brightness mean. Hao Wang and Mayra Vasquez.
  43. 2 better represent analog attributes of cultural artifacts. Next slide: close-up of a visualization showing average amount of visual change (bar graph) in every shot in Vertov’s 11th Year. Images above the bar: first frame of every shot. To measure visual change per shot: 1) calculate brightness mean of the difference image between each two frames in the shot; 2) add all means; 3) divide by number of frames in the shot.
  44. (image only)
  45. 3 the maps of cultural landscapes reveal fuzzy and overlapping clusters - rather than discrete categories with hard boundaries
  46. 4 visualize the space of variations. 600 variations of Google Logo, 1998-2009.
  47. (image only)
  48. Studying massive data sets challenges our existing theoretical concepts and assumptions. Example: what is “style”?
  49. image plot of one million manga pages. x - standard deviation; y - entropy
  50. (image only)
  51. distribution of one million manga pages. x - standard deviation; y - entropy
  52. single short manga series (< 1000 pages)
  53. 776 Vincent van Gogh paintings. X - year/month. Y - brightness mean.
  54. Current / recent projects at softwarestudies.com: 6000+ paintings of French Impressionists; 7000 year old stone arrowheads (with UCSD anthropologist)
  55. samples from 4.7 million newspaper pages collection from Library of Congress (UCSD undergraduate students); virtual world / game analytics (funded by NSF EAGER, with UCSD Experimental Games Lab); comparing Art Now & Graphic design Flickr groups (340,000 images) (with CS collaborator from Lawrence Berkeley National Laboratory)
  56. Big project supported by Mellon Foundation grant, 2012-2015: tools and workflows for working with image and video collections using SEASR / MEANDRE digital humanities workflow platform. Applications: 1) 1+ million images + millions of metadata records from deviantArt (the largest social network for user-created art - 20 M users, 240 M artworks); 2) 1+ million manga pages; 3) thousands of hours of TV political news and online video.
  57. Postscript: digital humanities (working with digitized collections of historical artifacts) vs. computational humanities (using social web data)
  58. “The capacity to collect and analyze massive amounts of data has transformed such fields as biology and physics. But the emergence of a data-driven computational social science has been much slower. Leading journals in economics, sociology, and political science show little evidence of this field. But computational social science is occurring in Internet companies such as Google and Yahoo, and in government agencies such as the U.S. National Security Agency.” “Computational Social Science.” Science, vol. 323, no. 6, February 2009.
  59. Massive amounts of cultural content and online conversations, opinions, and cultural activities (general and specialized social media networks; personal and professional web sites). This data offers us unprecedented opportunities to understand cultural processes and their dynamics and develop new concepts and models which can be also used to better understand the past. Currently only analyzed by Google, Facebook, YouTube, Bluefin labs, Echonest, and other companies, and computer scientists working in “social computing” - not yet by humanists.
  60. manovich.lev@gmail.com, softwarestudies.com
  61. Our free open source software tools for analyzing and visualizing large image and video collections, publications and projects: softwarestudies.com. The tools run on Mac, PC, Unix. All media visualizations in this presentation were created by members of Software
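
Slide 43 gives a three-step recipe for measuring visual change per shot. The sketch below is one possible reading of it, not the authors' implementation. It assumes that "each two frames" means consecutive frames, that the difference image is the absolute per-pixel difference, and that frames are grayscale grids stored as nested lists. The function names are hypothetical.

```python
def difference_mean(frame_a, frame_b):
    """Brightness mean of the absolute difference image of two same-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)  # assumption: absolute per-pixel difference
            count += 1
    return total / count


def visual_change(shot):
    """Visual change for one shot, given as a list of grayscale frames."""
    if len(shot) < 2:
        return 0.0  # a single frame has nothing to compare against
    # Step 1: difference mean for each pair of consecutive frames (assumption).
    means = [difference_mean(prev, curr) for prev, curr in zip(shot, shot[1:])]
    # Steps 2 and 3: add all means, then divide by the number of frames in the shot.
    return sum(means) / len(shot)
```

The division uses the number of frames, as the slide states, rather than the number of frame pairs. For long shots the two differ only slightly.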
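
The image plots on slides 30 and 49 place each image at coordinates derived from measured features (saturation mean, standard deviation, entropy). The slides do not say how these features are computed, so the following is only a minimal sketch. It assumes that standard deviation and entropy are taken over 0-255 grayscale values, and that saturation is the S channel of HSV. All function names are hypothetical.

```python
import colorsys
import math
from collections import Counter


def brightness_stats(gray_pixels):
    """Return (mean, standard deviation, entropy in bits) for a flat list of gray values."""
    n = len(gray_pixels)
    mean = sum(gray_pixels) / n
    variance = sum((p - mean) ** 2 for p in gray_pixels) / n
    counts = Counter(gray_pixels)
    # Shannon entropy of the gray-level histogram (assumption about "entropy").
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return mean, math.sqrt(variance), entropy


def saturation_mean(rgb_pixels):
    """Mean HSV saturation (0 to 1) for a flat list of (r, g, b) tuples in 0-255."""
    sats = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in rgb_pixels]
    return sum(sats) / len(sats)


def image_plot_positions(images):
    """Map each image name to (x, y), with x = standard deviation and y = entropy."""
    positions = {}
    for name, gray_pixels in images.items():
        _, std, entropy = brightness_stats(gray_pixels)
        positions[name] = (std, entropy)
    return positions
```

An image plot would then draw each image at its position instead of a point. Using date for x and `saturation_mean` for y would correspond to the axes named on slide 30.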
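
Slide 28 describes the slice technique: each line of the visualization is a vertical slice through the center of one image, arranged using metadata. A minimal sketch follows. It assumes images are grids of pixel values of equal height, stored as nested lists, and that the metadata is a sortable date. The function names are hypothetical and are not taken from the authors' tools.

```python
def center_column(image):
    """Return the middle column of an image given as a list of pixel rows."""
    middle = len(image[0]) // 2
    return [row[middle] for row in image]


def slice_visualization(dated_images):
    """Build a slice image from (date, image) pairs.

    Columns are ordered by date, one column per source image.
    """
    ordered = sorted(dated_images, key=lambda pair: pair[0])
    columns = [center_column(image) for _, image in ordered]
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]
```

Sorting by a different metadata field only requires changing the sort key. The montage technique on slide 25 uses the same ordering step but keeps whole images instead of single columns.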