Visualization and the New
      Epistemology

      Prof. Alvarado
     MDST 3703/7703
     23 October 2012
Business

• Midterms graded
• Office hours
  – Tomorrow 11:00AM—3:00PM
  – Friday 1:00PM—3:00PM
• Project homework
  – Mark-up text for paragraphs and quotes
  – Quotes are SPAN elements with CLASS attributes
    of either ‘quote’ or ‘extract’
  – Make sure file and directories are named properly
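
The quote markup described above can be checked mechanically. The following is a minimal sketch using Python's standard `html.parser`; the sample markup, the `SpanClassCollector` class, and the `check_spans` helper are illustrative assumptions, not part of the assignment.

```python
from html.parser import HTMLParser

# Hypothetical sample of marked-up text; the wording is invented for illustration.
SAMPLE = """
<p>He called it <span class="quote">a short quoted phrase</span>.</p>
<p><span class="extract">A longer passage set off from the running text.</span></p>
"""

# The two class values named in the homework description.
ALLOWED = {"quote", "extract"}


class SpanClassCollector(HTMLParser):
    """Collects the class attribute of every SPAN element it sees."""

    def __init__(self):
        super().__init__()
        self.span_classes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so SPAN and span both match.
        if tag == "span":
            # attrs is a list of (name, value) pairs; class may be missing.
            self.span_classes.append(dict(attrs).get("class"))


def check_spans(html_text):
    """Return (all span classes found, those that are not allowed)."""
    collector = SpanClassCollector()
    collector.feed(html_text)
    bad = [c for c in collector.span_classes if c not in ALLOWED]
    return collector.span_classes, bad


classes, bad = check_spans(SAMPLE)
print(classes, bad)
```

A span with any other class, or with no class at all, ends up in the second list, which makes it easy to spot markup that needs fixing.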
Review

• Web 2.0
  – Post-Google era of the web
  – Massive participation in social media
  – Social production of knowledge
  – New models of how knowledge is
    produced, maintained, organized
• Tags
  – One example of this shift
  – A new kind of knowledge “product”
A del.icio.us “folksonomy” visualized
A Flickr “folksonomy” visualized
http://anthonyflo.tumblr.com/post/7590868323/photographer-and-self-described-geek-of-maps
Eric Fischer creates maps that merge geographic locations with
geotagged photos from Flickr and tweets from Twitter. Red dots
pinpoint the locations of Flickr pictures, blue dots show tweets, white
dots mark places that have been posted to both. This map of
Washington, D.C., shows messages concentrating around the national
landmarks and power corridors of the city’s federal zone.
An algorithm generates a virtual Rome in 3D from
150,000 Flickr Users' Photos
http://www.popsci.com/gear-amp-gadgets/article/2009-09/building-virtual-cities-automatically-150000-flickr-photos
Flickr Photos Yield Tourist Trails. An algorithm
uses images from millions of tourists to suggest
ways for visitors to spend their time.
      http://www.technologyreview.com/computing/25549/page1/
Trends Map for #OWS




                http://trendsmap.com
These visualizations are created out of
              “Big Data”
What is Big Data? What are some
           examples?
What is distinctive about the form of this
kind of knowledge generated by Big Data?
Organic

    Rhizomic

Socially generated

  Transductive
What about the content of this kind of
            knowledge?

  What does it tell us about what?
?
A new epistemology?

    A new science?

(media determinism again)
Francis Bacon in 1620 described a
new kind of knowledge based on
observation and induction
(empiricism). This view can be
partly traced to the successes of
exploration and instruments in
learning about the world.
Anderson argues that a similar shift is
           happening now

With the era of the “cloud” and massive
                  data

           the Petabyte Age

   comes a new kind of knowledge
The database is not just a symbolic
                form

It is the pervasive and standard form in
    which our knowledge is organized
Anderson

• The end of theory
  – Positivism (see definition)
  – It’s algorithms all the way down
• No need for models and causality
  – Correlation is enough
• More is different
  – The “Petabyte Age”
  – The sheer amount of data makes it valuable
  – Quality does not matter
Some Definitions

• Petabyte (PB) = 2^50 bytes
  = 1,125,899,906,842,624 bytes
  = 1,024 terabytes
• Positivism (my definition)
  – A theory of knowledge that views physical laws and
    models as more or less stable patterns
  – Regards statistics and pattern recognition as more
    authentic forms of knowledge than laws
  – Radical empiricism (nothing “behind” the observed)
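
The petabyte figures above can be verified with a few lines of arithmetic. This is a small sketch assuming the binary (power-of-two) units used on the slide; the constant names are chosen here for illustration.

```python
# Binary (power-of-two) units, as used in the definition above.
TERABYTE = 2 ** 40
PETABYTE = 2 ** 50

print(f"{PETABYTE:,} bytes")                 # 1,125,899,906,842,624 bytes
print(PETABYTE // TERABYTE, "terabytes")     # 1024 terabytes
```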
The PageRank algorithm visualized




Google does not care about what is on a page; it
just cares about this: the links between pages
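
To make the point concrete, here is a minimal sketch of the textbook form of PageRank, computed by power iteration over nothing but link structure. It is a simplified illustration, not Google's production system; the four-page web, the damping value of 0.85, and the iteration count are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Page content is never consulted, only the link structure.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on evenly to the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outgoing links): spread its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked to by A, B, and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy web, page C ranks highest because most links point to it, and page D ranks lowest because nothing links to it, regardless of what any page says.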
Same approach to advertising
“AdWords analyzes every Google search to
determine which advertisers get each of up
to 11 ‘sponsored links’ on every results page.
It’s the world’s biggest, fastest auction, a
never-ending, automated, self-service
version of Tokyo’s boisterous Tsukiji fish
market, and it takes place, Varian
says, ‘every time you search.’ ”
Steven Levy, “Secret of Googlenomics: Data-Fueled Recipe Brews
Profitability,” WIRED 17.06.
http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics
It’s all about the algorithm

There is no real theory behind the
              formula

     It just happens to work
Sometimes this approach is called
     “the physics of clicks”
Manovich’s experiments
                explore this concept

   (examples from Mapping Time Exhibit)


http://www.flickr.com/photos/culturevis/sets/72157624959121129/detail/
Time Magazine
  1923—2009
Science and Popular Science Magazines




           1872-1922
Anna Karenina


This visualization of Anna
Karenina is inspired by a common
reading practice of underlining
important lines and passages in a
text using magic markers. To
create this visualization we
designed a program that reads the
text from a file and renders it in a
series of columns running from top
to bottom and from left to right as
a single image. It also checks
whether text lines contain
particular words (this version
checks for the word Anna) and
highlights the found matches.
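
The procedure described above can be sketched in a few lines. This is a plain-text approximation, not the program used for the exhibit: it lays lines out in columns (top to bottom, then left to right) and flags matching lines with an asterisk instead of rendering an image. The function name, its parameters, and the sample lines (invented, not taken from the novel) are assumptions.

```python
def render_columns(lines, keyword, rows_per_column=3, width=30):
    """Lay out text lines in columns, filled top to bottom and then left to
    right, and flag every line that contains `keyword` with a leading '*'."""
    cells = []
    for line in lines:
        flag = "*" if keyword in line else " "
        # Truncate or pad so every cell is exactly `width` characters wide.
        cells.append(flag + line[:width - 1].ljust(width - 1))
    # Cut the flat list of cells into columns of equal height.
    columns = [cells[i:i + rows_per_column]
               for i in range(0, len(cells), rows_per_column)]
    rows = []
    for r in range(rows_per_column):
        row = [col[r] if r < len(col) else " " * width for col in columns]
        rows.append(" | ".join(row).rstrip())
    return "\n".join(rows)


# Invented sample lines, used only to exercise the layout.
sample_lines = [
    "Anna stepped off the train.",
    "The platform was crowded.",
    "Snow fell on the rails.",
    "A letter came for Anna.",
    "Nobody spoke at dinner.",
]
out = render_columns(sample_lines, "Anna")
print(out)
```

Swapping the asterisk for a background color in an image renderer would give the "magic marker" effect the description refers to.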
Manga Style Space

           X: Variation
           Y: Entropy
MONDRIAN                               ROTHKO


           13 years of data for each
           X: Brightness
           Y: Saturation
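
Placing an image on a brightness/saturation plot requires two numbers per image. The sketch below shows one way to compute them, assuming the HSV color model from Python's standard `colorsys` module (V as brightness, S as saturation); the actual image processing software used for these plots is not specified in the slides, and the tiny pixel lists stand in for real paintings.

```python
import colorsys


def mean_brightness_saturation(pixels):
    """Return (mean brightness, mean saturation) for RGB pixels in 0-255.

    Brightness and saturation are taken from the HSV model (V and S),
    each in the range 0.0 to 1.0.
    """
    total_v = 0.0
    total_s = 0.0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_s += s
        total_v += v
    n = len(pixels)
    return total_v / n, total_s / n


# Hypothetical "paintings", each reduced to two pixels for illustration.
paintings = {
    "bright_red": [(255, 0, 0), (255, 0, 0)],
    "dark_gray": [(64, 64, 64), (32, 32, 32)],
}

# Each image becomes one (x, y) point: x = brightness, y = saturation.
coords = {name: mean_brightness_saturation(px) for name, px in paintings.items()}
for name, (x, y) in coords.items():
    print(name, round(x, 3), round(y, 3))
```

Once every image has such a coordinate pair, the plot is simply the set of thumbnails drawn at those positions.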
ImagePlot of Vertov’s film, The Eleventh Year

X: BRIGHTNESS
Y: NUM SHAPES
Visualization
Algorithm
   Big Data
Visualizing Text
Co-presence of characters in Les Miserables
Same data, different display algorithm
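
The idea of one data set feeding two displays can be sketched as follows. The scene lists are hypothetical (real data would record which characters appear together in each chapter), and the function names are chosen here for illustration. The same co-presence counts are shown first as an edge list and then as a matrix.

```python
from collections import Counter
from itertools import combinations

# Hypothetical scenes: invented groupings, used only to illustrate the idea.
scenes = [
    ["Valjean", "Javert"],
    ["Valjean", "Cosette"],
    ["Valjean", "Cosette", "Marius"],
    ["Javert", "Valjean"],
]


def co_presence(scenes):
    """Count how often each pair of characters appears in the same scene."""
    counts = Counter()
    for scene in scenes:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(scene)), 2):
            counts[(a, b)] += 1
    return counts


def as_edge_list(counts):
    """Display 1: one line per pair, as in a node-link diagram."""
    return [f"{a} -- {b} ({n})" for (a, b), n in sorted(counts.items())]


def as_matrix(counts):
    """Display 2: the same counts as a character-by-character matrix."""
    names = sorted({name for pair in counts for name in pair})
    rows = [" " * 8 + "".join(name[:7].ljust(8) for name in names).rstrip()]
    for a in names:
        cells = []
        for b in names:
            key = (a, b) if a < b else (b, a)
            cells.append(str(counts.get(key, 0)).ljust(8))
        rows.append(a[:7].ljust(8) + "".join(cells).rstrip())
    return "\n".join(rows)


counts = co_presence(scenes)
print("\n".join(as_edge_list(counts)))
print(as_matrix(counts))
```

Nothing about the data changes between the two printouts; only the display algorithm differs.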

Visualizing Knowledge from Big Data


Editor's Notes

  1. Meeting notes (10/23/12 12:16): Review posts and comments in studio; lots of interesting thoughts
  2. Source: http://www.metablake.com/advml/map1.png
  3. Source: http://www.flickr.com/photos/27318782@N03/2549628406/sizes/m/in/photostream/
  4. Meeting notes (10/23/12 12:16): Dynamic
  5. http://fora.tv/2011/05/03/Googles_Susan_Wojcicki_The_Advertising_Algorithm#chapter_04
  6. Original: http://www.flickr.com/photos/culturevis/4038907270/sizes/o/in/set-72157624959121129/
     Mapping Time. Jeremy Douglass and Lev Manovich, 2009.
     Data: The covers of every issue of Time magazine published from 1923 to summer 2009. Total number of covers: 4535. A large percentage of the covers included red borders. We cropped these borders and scaled all images to the same size to allow a user to see more clearly the temporal patterns across all covers.
     Timescale: 1923-2009.
     Mapping: Time covers appear in order of publication (i.e., from 1923 to 2009), arranged in a grid layout (left to right and top to bottom).
     Mapping 4535 Time covers into a grid organized by publication date reveals a number of historical patterns. Here are some of them:
     Medium: In the 1920s and 1930s Time covers use mostly photography. After 1941, the magazine switches to paintings. In the later decades photography gradually comes to dominate again. In the 1990s we see the emergence of the contemporary software-based visual language, which combines manipulated photography, graphic and typographic elements.
     Color vs. black and white: The shift from early black and white to full color covers happens gradually, with both types coexisting for many years.
     Hue: Distinct “color periods” appear in bands: green, yellow/brown, red/blue, yellow/brown again, yellow, and a lighter yellow/blue in the 2000s.
     Brightness: The changes in brightness (the mean of all pixels’ grayscale values for each cover) follow a similar cyclical pattern.
     Contrast and Saturation: Both gradually increase throughout the 20th century. However, since the end of the 1990s, this trend is reversed: recent covers have less contrast and less saturation.
     Content: Initially most covers are portraits of individuals set against neutral backgrounds. Over time, portrait backgrounds change to feature compositions representing concepts. Later, these two different strategies come to co-exist: portraits return to neutral backgrounds, while concepts are now represented by compositions which may include both objects and people – but not particular individuals.
     The visualization also reveals an important “metapattern”: almost all changes are gradual. Each of the new communication strategies emerges slowly over a number of months, years or even decades.
  7. Original:http://www.flickr.com/photos/culturevis/4564315478/sizes/o/in/photostream/
  8. Source: http://www.flickr.com/photos/culturevis/5107682969/sizes/l/in/set-72157624959121129/
     This visualization of Anna Karenina is inspired by a common reading practice of underlining important lines and passages in a text using magic markers. To create this visualization we designed a program that reads the text from a file and renders it in a series of columns running from top to bottom and from left to right as a single image. It also checks whether text lines contain particular words (this version checks for the word Anna) and highlights the found matches.
  9. Source: http://www.flickr.com/photos/culturevis/5109394222/in/set-72157624959121129
     Manga Style Space. Lev Manovich and Jeremy Douglass, 2010.
     Data: 883 Manga series from the scanlation site OneManga.com. Total number of pages: 1,074,790. In fall 2009, we downloaded 883 Manga series containing 1,074,790 unique pages from this site. We then used our custom software system, installed on a supercomputer at the National Department of Energy Research Center (NERSC), to analyze visual features of these pages.
     Timescale: The longest running Manga series has been published continuously since 1976. The most popular series on OneManga.com are Naruto (1999-; 8835 pages) and One Piece (1997-; 10562 pages). Along with such long Manga series, our data set also contains shorter series that appeared in the 2000s and only ran for 1-3 years.
     Mapping: X axis: standard deviation of pixels’ grayscale values in a page. Y axis: entropy measured over all pixels’ grayscale values in a page.
     The visualization shows 1,074,790 unique pages from 883 distinct manga series from Japan, Korea and China. The series include both very popular long-running titles such as Naruto and One Piece and also many short-lived titles. The visualization maps the pages according to some of their visual characteristics that were measured automatically on supercomputers at the U.S. National Department of Energy Research Center using custom software developed by Software Studies Initiative. (X-axis: standard deviation. Y-axis: entropy.) The pages in the bottom part of the visualization are the most graphic and have the least amount of detail. The pages in the upper right have lots of detail and texture. The pages with the highest contrast are on the right, while pages with the least contrast are on the left. In between these four extremes, we find every possible stylistic variation.
     This suggests that our basic concept of “style” may not be appropriate when we consider large cultural data sets. The concept assumes that we can partition a set of cultural artifacts into a small number of discrete categories. In the case of our one million pages set, we find practically infinite graphical variations. If we try to divide this space into discrete stylistic categories, any such attempt will be arbitrary. The visualization also shows which graphical choices are more commonly used by manga artists (the central part of the “cloud” of pages) and which appear much more rarely (bottom and left parts).
     Note: some of the pages, such as all covers, are in color. However, in order to be able to fit all images into a single large image (the original is 44,000x44,000 pixels, scaled to 10,000x10,000 for posting to Flickr), we rendered everything in greyscale. Because pages are rendered on top of each other, you don't actually see 1 million distinct pages; the visualization shows a distribution of all pages with typical examples appearing on the top.
  10. Mondrian vs. Rothko. Lev Manovich, 2010. Image preparation: Xiaoda Wang.
     Data: 128 paintings by Piet Mondrian (1905-1917). 151 paintings by Mark Rothko (1944-1957).
     Mapping: X-axis: brightness mean. Y-axis: saturation mean. The two image plots are placed side by side so they share the Y-axis.
     This visualization demonstrates how image plots can be used to compare multiple data sets. In this case, the goal is to compare a similar number of paintings by Piet Mondrian and Mark Rothko (produced over comparable time periods of 13 years) along particular visual dimensions. We have selected particular periods in the career of each artist which are structurally similar. In the beginning of a period each artist was imitating his predecessors and contemporaries. By the end of the period each developed the mature style for which he became famous. In between, each gradually moved from figurative representation to pure abstraction.
     The left image plot shows 128 paintings by Mondrian; the right shows 151 paintings by Rothko. The paintings are organized according to their brightness mean (X-axis) and saturation mean (Y-axis). These measurements were obtained with digital image processing software.
     Projecting sets of paintings of these two artists into the same coordinate space reveals their comparative "footprints": the parts of the space of visual possibilities they explored. We can see the relative distributions of their works: the more dense and the more sparse areas, the presence or absence of clusters, the outliers, etc.
     The visualizations also show how Mark Rothko, the abstract artist of the generation which followed Mondrian’s, was exploring the parts of brightness/hue space which Mondrian did not reach (highly saturated and bright paintings in the upper right corner, and desaturated dark paintings in the left part).
     Another interesting pattern revealed by the visualization is that all paintings of one artist are sufficiently different from each other – no two occupy the same point in brightness/saturation space. This makes sense given the ideology of modern art on unique original works – if we were to map works from earlier centuries, when it was common for artists to make copies of successful works which were considered to be equally valuable, we might expect to see a different pattern. However, what could not be predicted is that the distances between any two paintings which are next to each other are similar – i.e., while each image occupies its own unique position, it's not very far from its neighbours.
     To see how each artist moved through brightness/saturation space during the 13-year periods we are comparing, we can visualize the paintings as color circles. The colors indicate the position of each painting within the time period, running from blue to red. To make the patterns even easier to see, we also vary the size of the circles from smallest to largest.
     www.flickr.com/photos/culturevis/4728910768/in/set-721576...
     This visualization reveals another interesting pattern. Rothko starts his explorations in late 1930-1940s in the same part of brightness/saturation space where Mondrian arrives by 1917: the high brightness/low saturation area (the right bottom corner of the plot). But as he develops, he is able to move beyond the areas already “marked” by his European predecessors such as Mondrian.
  11. Original: http://www.flickr.com/photos/culturevis/4048646419/sizes/o/in/set-72157622608431194/
     Film: The Eleventh Year. Data: every shot of the film is represented by 1 frame. The frames are arranged by brightness kurtosis (X) and number of shapes (Y). (Note that frames overlap, so not all of them are visible.)
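
The two measurements named in the Manga Style Space note (standard deviation and entropy of a page's grayscale values) can be sketched with the standard library. This is an illustrative approximation, not the Software Studies Initiative's software; the function name and the two tiny "pages" are assumptions, and entropy is taken here to mean Shannon entropy of the grayscale histogram, in bits.

```python
import math
from collections import Counter
from statistics import pstdev


def grayscale_stats(gray_values):
    """Return (standard deviation, Shannon entropy in bits) of a page's
    grayscale values, given as a flat list of integers in 0-255."""
    n = len(gray_values)
    histogram = Counter(gray_values)
    # Entropy is low when a few tones dominate and high when many tones
    # are used about equally often.
    entropy = sum((c / n) * math.log2(n / c) for c in histogram.values())
    return pstdev(gray_values), entropy


# Hypothetical pages reduced to eight pixels each.
flat_page = [128] * 8           # a single gray tone
two_tone_page = [0, 255] * 4    # pure black and pure white

for name, page in [("flat", flat_page), ("two-tone", two_tone_page)]:
    spread, entropy = grayscale_stats(page)
    print(name, spread, entropy)
```

A page with a single tone sits at the origin of such a plot, while a stark black-and-white page has a large standard deviation but only one bit of entropy; pages with rich texture score high on both.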