We will discuss four misunderstandings often connected to the use of digital traces:
1) the use of a notion of digital traces that is both too narrow and too ambitious;
2) the alternation of oblivion and paranoia about the conditions of digital traces' production;
3) the tendency to confuse digital and automatic;
4) the hope that digital traces are easily handled by conventional methods.
We will try to show that when these misunderstandings are avoided, digital methods can renew the vision of social sciences and help them to overcome the classic divide between qualitative and quantitative methods.
2. 4+1 misunderstandings
0. [ Digital mediation traces society ]
1. Digital traces are not sociological data
2. Quantity is less interesting than variety
3. Digital does not mean automatic
4. More quantification demands more qualification
5. The media as carbon paper
Chris Harrison, 2004
Internet connections
6. The rise of digital methods
Virtual reality
Late ’80s to early ’90s (Barlow, Turkle, Negroponte, Rheingold)
Virtual society?
1997-2002 (Steve Woolgar et al.)
7. Digital traceability
Once you can get information as bores, bytes, modem,
sockets, cables and so on, you have actually a more material
way of looking at what happens in Society.
Virtual Society thus, is not a thing of the future, it’s the
materialisation, the traceability of society. It renders visible
because of the obsessive necessity of materialising
information into cables, into data.
Latour, B. 1998
“Thought Experiments in Social Science: from
the Social Contract to Virtual Society”
8. From digital traceability …
Bruno Latour (1998), argued that the Web is mainly of importance
to social science insofar as it makes possible new types of
descriptions of social life. According to Latour, the social integration
of the Web constitutes an event for social science because the
social link becomes traceable in this medium. Thus, social relations
are established in a tangible form as a material network
connection. We take Latour’s claim of the tangibility of the social as
a point of departure in our search (p. 342).
Rogers, R., and Marres, N. 2002
“French scandals on the Web, and on the streets:
A small experiment in stretching the limits of reported reality.”
Asian Journal of Social Science 66: 339-353.
9. The rise of digital methods
Virtual reality
Late ’80s to early ’90s (Barlow, Turkle, Negroponte, Rheingold)
Virtual society?
1997-2002 (Steve Woolgar et al.)
Digital methods
2009 (Richard Rogers)
https://soundcloud.com/mit-cmsw/richard-rogers-digital-methods
11. Media acceleration
[Media] amplify or accelerate existing processes.
For the "message" of any medium or technology is the
change of scale or pace or pattern that it introduces into
human affairs.
The railway did not introduce movement or transportation or
wheel or road into human society, but it accelerated and
enlarged the scale of previous human functions, creating
totally new kinds of cities and new kinds of work and leisure.
McLuhan, M. 1964
Understanding Media
13. Tracing collective life is not cheaper
(the price is paid elsewhere)
Cable industry investments
(cumulative unadjusted data
source: www.ncta.com)
Cable industry investments
(de-inflated rate
source: www.techdirt.com)
17. Are we mapping the media or the content?
http://contropedia.net
E. Borra, E. Weltevrede, P. Ciuccarelli, A. Kaltenbrunner, D. Laniado,
G. Magni, M. Mauri, R. Rogers, T. Venturini.
Societal Controversies in Wikipedia Articles
CHI'15: 33rd Annual ACM Conference on Human Factors in Computing Systems
Proceedings, 2015.
18. Redistribution of research methods
• Methods as usual (ex. Andrew Abbott)
The techniques used by digital platforms have been long used in social
sciences.
• Big methods (ex. Newman et al, 2007)
Digital traceability increases the quantity of social data thereby demanding use
of mathematical techniques of analysis.
• Virtual methods (ex. Christine Hine, 2000, 2005)
Digital media transform the quality of social practices and demand therefore
increased efforts of observations and interpretation.
• Platform repurposing (ex. Richard Rogers, 2009)
Digital platforms have their own methods that need to be understood and repurposed for social research.
• Re-mediation of sociological methods (ex. Noortje Marres, 2011)
The techniques used by digital platforms have been long used in social
sciences, but are radically transformed by the new context of their use.
Marres, N. (2011).
Re-distributing Methods:
Interventions in Digital Social Research.
More redistribution
Less redistribution
19. On digital traceability
Venturini, Tommaso, and Bruno Latour. 2010.
“The Social Fabric: Digital Traces and Quali-Quantitative Methods.”
in Proceedings of Future En Seine 2009. Paris, pp. 87–101
Venturini, Tommaso. 2012.
“Building on Faults: How to Represent Controversies with Digital Methods.”
in Public Understanding of Science 21(7):796–812.
Venturini, Tommaso, and Daniele Guido. 2012.
“Once Upon a Text : An ANT Tale in Text Analysis.”
in Sociologica 3.
33. Finding a needle in a needlestack
Drawing credit – Frits Ahlefeldt
34. This is a world where massive amounts of data and applied mathematics replace every
other tool that might be brought to bear. Out with every theory of human behavior, from
linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why
people do what they do? The point is they do it, and we can track and measure it with
unprecedented fidelity. With enough data, the numbers speak for themselves…
Petabytes allow us to say: ‘‘Correlation is enough.’’ We can stop looking for models. We
can analyze the data without hypotheses about what it might show. We can throw the
numbers into the biggest computing clusters the world has ever seen and let statistical
algorithms find patterns.
Chris Anderson
http://www.wired.com/science/discoveries/
magazine/16-07/pb_theory
The end of theory?
44. Venturini, Tommaso et al. 2014.
“Three Maps and Three Misunderstandings:
A Digital Mapping of Climate Diplomacy.”
in Big Data & Society 1(2).
Venturini, T. et al. 2014
Climaps by EMAPS in 2 Pages
(A Summary For Policymakers and Busy People in General).
in SSRN, December 2, 2014.
If you want to know more
45. M. 4 More quantification demands
more qualification
53. On datascape navigation
Latour, Bruno, Pablo Jensen, Tommaso Venturini,
Sébastian Grauwin and Dominique Boullier, 2012.
“‘The Whole Is Always Smaller than Its Parts’:
A Digital Test of Gabriel Tardes’ Monads.”
The British Journal of Sociology 63(4), pp. 590–615
Venturini, Tommaso, Pablo Jensen, and Bruno Latour (forthcoming),
“Fill in the Gap. A New Alliance for Social and Natural Sciences.”
Journal of Artificial Societies and Social Simulations.
56. Venturini, T. (2010).
Diving in magma: how to explore controversies with actor-network theory.
in Public Understanding of Science, 19(3), 258–273.
Jacomy, M., Venturini, T., Heymann, S. & Bastian, M. (2014)
ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization
Designed for the Gephi Software.
PlosONE, 9:6
Venturini, T., Jacomy, M and De Carvalho Pereira, D. (working paper)
Visual Network Analysis: The example of the rio+20 online debate
Beyond micro/macro
The honorable enterprise of channeling the novelty of new media within the tradition of social sciences has been somewhat defeated by the speed at which digital technologies have infiltrated modernity. Electronic media have become so pervasive that they can no longer be conceived as a separate social space. As such, they offer much more than just another field of application for old theories and methods: they offer a chance to renovate the social sciences.
Digital media have a very interesting feature: all the interactions that they mediate become traceable, and often are actually traced. Besides the obvious consequences for individual privacy, this characteristic of electronic media may have a huge impact on social science (Lazer et al., 2009). The more the digital infiltrates modern society, the more collective life becomes traceable (Mitchell, 2009). Every day, new digital archives are made available to researchers: public databases are swallowed by computer memory; economic transactions migrate online; social networks take root in the web. Digital traceability spreads like an immense carbon paper, offering social sciences more data than they ever dreamt of.
The first trace of the encounter between ANT and digital methods is to be found in 1998, on the occasion of the first conference of the Virtual Society? program [Woolgar 2002]. In his speech, Bruno Latour [1998] presented for the first time the idea that digital traces could provide the materialization of interactions that ANT was looking for:
SEE THE SLIDE
In the audience of the conference were two young sociologists, Richard Rogers and Noortje Marres, who in the following years developed a series of tools and methods to put digital traces at the service of social sciences (see Rogers, 2005, 2009 and digitalmethods.net). The most famous of these tools, the IssueCrawler, was explicitly developed to materialize ANT ideas [Rogers and Marres 2002]:
SEE THE SLIDE
Even more importantly, the IssueCrawler marked a key turn in the relation between social sciences and digital media. Up until a few years ago, social scientists conceived electronic media as nothing more than new terrains for old methods. Notions such as “cyber-culture” (Negroponte, 1996), “virtual communities” (Rheingold, 2000) and “online identities” (Turkle, 1995) were introduced to channel the novelty of new media within the tradition of social sciences.
This work is sometimes called ‘data mining’ and this metaphor should be taken very seriously. Everyone who has ever visited a gold mine knows that what is striking about this type of landscape is the feeling of absence that dominates it. Where a mountain is supposed to be, there is instead a huge hole. Describing mining as the act of collecting gold and other precious materials is mistaking the aim for the practice. 0.1% of mining is about collecting precious substances; 99.9% of it is about removing tons and tons of rock, sand and earth. Gold is the product of such absence, what is left when everything else is gone.
An example will make my argument clearer. Some years ago, I was struggling with some colleagues to make sense of Google Insights for Search data and use them for social research. Reading the literature, we stumbled on an amazing discussion paper by Askitas and Zimmermann (2011), in which the two economists claimed to have found a striking correlation between the unemployment rate and searches for antidepressants’ side effects. The result was compelling: when the unemployment rate begins to rise because of the economic crisis of 2008, so do the queries for antidepressants’ side effects.
Trying to reproduce these findings, however, we noticed something strange: it was not the name of the antidepressants that matched unemployment, but the expression ‘side effects’. At first we thought that people might take more medicines in general when they lose their job, but then we found out that other words had the same curve, in particular the word ‘template’, which also started being searched more at the end of 2008.
We were struggling to make sense of this, when it occurred to us that in late 2008 Google enabled its ‘suggest’ feature by default. This feature is meant to auto-complete common search expressions: when you ask Google about a dish, it will ask you if you want to know about its recipe; when you ask about a motivational letter, it will ask you if you are looking for a template; and when you ask about a drug, it will ask you if you want to know about its side effects.
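The confound can be illustrated with a small Python sketch on purely synthetic numbers (none of them are real Google or unemployment data). It assumes a hypothetical, steadily rising unemployment series and two unrelated search terms whose volume jumps in the same month because of a platform change; `STEP_MONTH`, `with_step` and `pearson` are names introduced here for illustration only.

```python
import math

STEP_MONTH = 9  # hypothetical month (0 = January) when auto-complete is switched on


def with_step(baseline, boost, months=12):
    """Flat search volume that jumps by `boost` from STEP_MONTH onward."""
    return [baseline + (boost if m >= STEP_MONTH else 0) for m in range(months)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic series: a steadily rising unemployment rate (in %) ...
unemployment = [5.0 + 0.2 * m for m in range(12)]
# ... and two unrelated queries boosted by the same interface change.
side_effects = with_step(baseline=40, boost=25)
template = with_step(baseline=30, boost=20)

r_side_effects = pearson(unemployment, side_effects)
r_template = pearson(unemployment, template)

print(f"unemployment vs 'side effects': {r_side_effects:.2f}")
print(f"unemployment vs 'template':     {r_template:.2f}")
```

Both queries correlate with the synthetic unemployment series to the same degree, although by construction neither depends on it: the shared jump comes from the platform, not from the job market.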
When the layout algorithm is launched, the nodes are moved by opposing forces until they reach a state of equilibrium.
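A minimal Python sketch of this idea follows. It is a toy force-directed layout, not ForceAtlas2 itself: it assumes that every pair of nodes repels with a force inversely proportional to their distance and that linked nodes attract with a force proportional to their distance. The function name `force_layout` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def force_layout(nodes, edges, repulsion=1.0, attraction=0.1,
                 step=0.05, max_move=0.1, tol=1e-6, max_iter=5000, seed=0):
    """Toy force-directed layout. `nodes` is a list, `edges` a list of pairs.

    Returns a dict {node: (x, y)}.
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}

    for _ in range(max_iter):
        force = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart, more weakly with distance.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist2 = dx * dx + dy * dy or 1e-12  # avoid division by zero
                fx = repulsion * dx / dist2
                fy = repulsion * dy / dist2
                force[a][0] += fx
                force[a][1] += fy
                force[b][0] -= fx
                force[b][1] -= fy

        # Attraction: linked nodes pull together, more strongly with distance.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += attraction * dx
            force[a][1] += attraction * dy
            force[b][0] -= attraction * dx
            force[b][1] -= attraction * dy

        # Move each node along its net force, capping the jump for stability.
        largest = 0.0
        for n in nodes:
            mx, my = step * force[n][0], step * force[n][1]
            length = math.hypot(mx, my)
            if length > max_move:
                mx, my = mx * max_move / length, my * max_move / length
                length = max_move
            pos[n][0] += mx
            pos[n][1] += my
            largest = max(largest, length)

        if largest < tol:  # equilibrium: the opposing forces cancel out
            break

    return {n: (x, y) for n, (x, y) in pos.items()}


layout = force_layout(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

Under these assumptions, two linked nodes settle at the distance where attraction and repulsion balance, that is `sqrt(repulsion / attraction)`. A node with no links is only ever repelled, so `max_iter` bounds the run in that case.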
A good example of the importance of selection may come from the comparison of two maps of the Web. The first is the so-called Internet map (http://internet-map.net). This impressive map is, to our knowledge, the largest publicly available map of the Web. Aiming at exhaustiveness, this map is both vain (because the Web is so big and changes so quickly that no map will ever capture more than a tiny fraction of it) and useless (because little knowledge can be extracted from it). All that we can see is that the Web is polarized by language (the color of the nodes) and that some nodes are (far) more connected than others (the size of the nodes). None of this is a surprise.
Information mining works like gold mining: it is not about collecting as much data as possible (that should be called ‘compulsive hoarding’); it is about getting rid of most of it. This is important, because the current ‘data deluge’ ideology, obsessed as it is with the question of collecting, storing and exploiting data, forgets that the careful selection of data is the most important part of every scientific protocol.
A good map of the Web is always limited in its ambition: it tries to represent a limited portion of the Web, and the better this portion is delimited, the better the map. A convincing example of this strategy is the map of the French political blogosphere, realized by Linkfluence for Le Monde (politicosphere.blog.lemonde.fr).
Because the selection of the websites has been done carefully, it is possible to use this map as a research tool and discover, for example, that the extreme left and the extreme right occupy two very different positions in French online politics: the first is small, spread out and central; the second is massive, clustered and off-center.
A few years ago, Chris Anderson published a controversial article in the magazine Wired, in which he argued for The End of Theory:
“At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later. For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn't pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right. Google's founding philosophy is that we don't know why this page is better than that one: If the statistics of incoming links say it is, that's good enough. No semantic or causal analysis is required. That's why Google can translate languages without actually "knowing" them (given equal corpus data, Google can translate Klingon into Farsi as easily as it can translate French into German). And why it can match ads to content without any knowledge or assumptions about the ads or the content”.
This argument is misleading for the reason I gave in the previous paragraph: learning something from digital traces requires separating information from noise. But things are even more complicated, because there is no way to tell what is information and what is noise without knowing how the traces have been constructed.
The main aim of this course is to teach you how to avoid jumping from the frying pan of positivism into the fire of relativism.
Or, as they say in Thailand, escape a tiger, meet a crocodile.