SlideShare ist ein Scribd-Unternehmen logo
1 von 17
Downloaden Sie, um offline zu lesen
P E O P L E , W O R D S , D ATA : H A R V E S T I N G
C O L L E C T I V E I N T E L L I G E N C E
A L B E R T O C O T T I C A
Public domain
Most of you are familiar with this. It’s called La scuola di Atene: 500 years ago, at the peak of the Renaissance and his own powers, Raphael paints it as a tribute to the
wisdom and courage of humans seeking truth. The whole Top 40 chart of Greek philosophers is depicted: Pythagoras is right in the front left, studying a large tome.
Epicurus also to the left, facing us, crowned with laurel. Socrates. Heraclitus. And in the center of the action: Plato and Aristotle. The two mighty philosophers are deep in
discussion as they stride towards us. And this is only appropriate, because discussion, “dialogue” as Plato liked to call it is what powers their knowledge. Look around
the fresco: debate is everywhere. Twenty-five centuries after Plato, and five after Raphael, this is still how science works.
Photo: Vitaliy Smolygin
Dialogue – conversation – is not just about accumulating information. It’s a process that augments information, by setting it in a richer context. A well-run debate can feel
like walking into a room with a fragment of map to find others that have its other pieces and ending up with a whole map. The whole is more, much more of the sum of its
parts. And it never ends: questions beget answers and new questions, and it starts all over again, like a living thing. In fact, life itself works in a similar way, with genetic
information being continuously traded amongst organism to give rise to always new, wonderful life forms.
Photo: Geraint Rowland
This is why Edgeryders encourages truth-seeking, result-oriented conversation as a knowledge engine. Nobody is smarter than everybody, and the smarts is in the
interaction – the conversation. But this raises a problem: conversations don’t scale well. A hundred people cannot have a conversation, in the sense that they all keep a
reasonably similar outlook on what is being discussed and what conclusions are being reached. They have to splinter into smaller groups. A great community convener,
like John, or Noemi, can get a great many interesting people involved in discussing a problem. A lot of insights are generated and validated; but they are validated locally,
by subsets of a few people. How to generalise? How can we tell which insights are solid, and which are the product of a branch of the conversation that simply misfired? 

At Edgeryders, we do it in three moves.
O N L I N E E T H N O G R A P H Y
Photo: sam.romilly
The first move is online ethnography. Many of you will be familiar with it; for the rest of us, ethnography is a qualitative research technique that results in the description of
a group that encodes the point of view of that group. Ethnographers prefer to work with no hypothesis to prove, simply letting their respondents take the study to
whatever they think is important. So, the first thing we do when trying to make sense of an online conversation, is deploy one or more ethnographers.
Photo: Medhin PaolosPhoto: Medhin Paolos
Ethnographers aggregate knowledge by interviewing people, and annotating the transcripts of their interviews. Annotations consist of a quotation from
what the respondent has said, associated to one or more keywords. As she goes through the testimonies, the ethnographer builds and maintains an
ontology of concepts and facts relevant to the issue being studied, as seen by respondents as a group. This is process is known as “coding”. Nowadays
this is done with computers, and the results entered in a kind of database. So, the researcher can call back all the quotations that have been coded with a
keyword, say “greentech”, and quickly get a grasp of what different people are saying about greentech. Do they agree? Is their controversy? What is
important? What other concepts is greentech connected to? Etcetera.
In Edgeryders, we have an advantage: conversation happens on our platform, so ethnographers can start to code it right away – no need for interviewing
and transcribing. We can be much faster and cheaper this way.
The Edgeryders platform features its own open source module to do ethnographic coding of the content therein. We call it OpenEthnographer, because
we have a vision of ethnography as a collaborative discipline.
S O C I A L N E T W O R K A N A LY S I S
The second move is social network analysis. The Edgeryders conversation is encoded in a database – it has to be, for the platform to work at all. So, we can represent
the discussion as a social network, where nodes represent people and arcs represent interactions – comments, essentially. At that point the pattern of connectivity in our
conversation is described by a graph, and graphs are well understood mathematical objects. What’s more, there is a rich literature in social science that associates
measurable graph metrics to social phenomena like brokerage and authoritativeness. In general, the graph can help us tentatively assign reliability scores to individuals.
In this one, for example, there is a densely connected core of about 40 people in the center, who can be assumed to have at least been exposed to what other people
have to say. Their quotations carry more weight, because they are validating and correcting each other. Then there is a periphery of people that have only one or two
connections, and even a few completely isolated nodes to the southwest. What these people have to say might be very wise and relevant, but it lacks in-group validation. 

We developed software to do real-time social network analysis of the conversation on the Edgeryders platform. We call it Edgesense.
Thinking about collective intelligence in terms of networks has been fruitful for us. It generates intuition, and lets us do quick, simple checks on the general state of the
community. For example, this visualisation comes from a project done two years ago. Like in the previous slide, dots represent people, and arcs represent comments. In
orange, you see the new project; in blue, the Edgeryders conversation outside of the project. As you can see, the new conversation is well connected to the pre-existing
one, through individual that participate in both; but is still maintains structural cohesion – you see that because the orange edges are grouped to the west of the centre.
This is what we would hope to see: there is a healthy balance of specialists, that presumably bring their specific knowledge and interest; and generalists, who carry
memory of previous debates and can connect the newcomers to the old-timers relevant to the new project.
S E M A N T I C S O C I A L N E T W O R K S
posts/comments induce network interaction
Ayman Ben
code1
code2
Comment
(free text)
timestamp
The third move is to pull in together ethnographic data and social network data to build something we call semantic social networks. The “atom” of a SSN is an
interaction like this, where Ayman addresses a comment to Ben. In your social networking platform, the comment is made of some text; it has a source, Ayman, and a
target, Ben; and it has a timestamp. After the ethnographer has worked her magic, it also has one or more ethnographic codes, so we know what this is about.
Ayman Ben
Chris
Eric
Gazbia
Hegazy
Dan
Jay
Fayez
#diabetes
#arduino
#dementia
#3d printing
#accessibility
#design
#design
#childcare
#childcare
#3d printing
#design
When you generalise this approach to a whole conversation, you get a fairly complex graph. You can do many things with it. Consider, for example, this toy network of a
conversation about care.
Ayman
Chris
Eric
Dan
Jay
Fran
#diabetes
#arduino
#dementia
#3d printing
#childcare
#3d printing
Dan
Bob
Gea
Hera
#accessibility
#design
#design
#childcare
#design
Jay
Fran
#dementia
#3d printing
#childcare
#3d printing
Ben
Gazbia
Hegazy
#accessibility
#design
#design
#childcare
#design
Jay
Fran
A simple one is to filter it by ethnographic code. In our toy example, I have pulled out the network of design. I now know that Jay, Hegazy, Gazbia, Ben and Fran are
interested in design. I notice that they are not really talking to each other; Jay, Hegazy and Ben form a group, Gazbia and Fran another. I know that the discussion on
design in the context of care risks being incoherent, or balkanised. If the study is ongoing, I can even let Gazbia and Fran know that there is a group around Hegazy
which shares their interest for design, maybe they will get in touch, share notes, and change both the content of what is being said and the structure of the graph.
#diabetes
#dementia
#accessibility
#childcare
#3d printing
#design
#arduino
Hegazy,
Jay
Dan,
Jay
Dan,
Fayez
Ayman,
Ben
Gazbia,
Fayez
I can also reverse the idea, and build a network of ethnographic codes. Now the arcs represent co-occurrences: two codes are connected when they appear together in
one comment. In this toy network, it turns out, I have two disconnected clusters of concepts.
We are still inventing the math for SSNA, but already at an intuitive level we see that the method is very powerful. I can use it, for example, to identify the emergent
groups of specialists gathering around certain ethno codes. For example, here I have a SSNA from one of our previous studies. I decide I want to look into the subset of
this conversation which is about education. I find and select the relevant codes, and am referred to the people that have been using those codes. As you can see, people
in this group are part of a connected graph, but only some are directly talking to each other about education issues. They have education as a common interest, but they
are not all comparing notes around it yet. I can also decide to ask my data what ELSE the people talking about education are also talking about. And here it is: this is like
doing association patterns, but with a group. This works also if the people in question do not know they are part of the group. 

We use a metrics called entanglement to measure each code’s contribution to the cohesiveness of a group of codes. This is very new, it first appeared in a PhD thesis
less than 3 years ago. As I said, the method is new. At each project we improve and iterate.
We are still inventing the math for SSNA, but already at an intuitive level we see that the method is very powerful. I can use it, for example, to identify the emergent
groups of specialists gathering around certain ethno codes. For example, here I have a SSNA from one of our previous studies. I decide I want to look into the subset of
this conversation which is about education. I find and select the relevant codes, and am referred to the people that have been using those codes. As you can see, people
in this group are part of a connected graph, but only some are directly talking to each other about education issues. They have education as a common interest, but they
are not all comparing notes around it yet. I can also decide to ask my data what ELSE the people talking about education are also talking about. And here it is: this is like
doing association patterns, but with a group. This works also if the people in question do not know they are part of the group. 

We use a metrics called entanglement to measure each code’s contribution to the cohesiveness of a group of codes. This is very new, it first appeared in a PhD thesis
less than 3 years ago. As I said, the method is new. At each project we improve and iterate.
I N T E L L I G E N C E , N O T S E N T I M E N T
Photo: alex_inside
I would like to dedicate the last two minutes of this talk to lift our gaze from the data themselves and thing about what exactly we are trying to do with them. First, we are
not trying to do sentiment analysis. We are not, because we model people differently. Sentiment analysis was developed in the context of commercial communication:
you need to get good at pressing the behavioural buttons that trigger purchase decisions. It models people as desire machines – use a certain shade of blue on your
website and increase infinitesimally the probability that the person will buy an insurance policy. We ask people to be smart, committed, hard-working: citizen experts, not
“consumers”, “beneficiaries”, “target groups” etc. And we model them as such. This generates high quality, nuanced contributions. We need fairly sophisticated
analytical instruments to do justice to that quality.

An obvious application of this method to what the World Bank does is as a risk assessment support tool. Situated knowledge, validated by debate, can monitor the
situation of projects on the ground in real time, or close enough. When operating in unstable contexts, that can be especially valuable.
O P E N D ATA
Photo: Chaos Computer Club
Second, we could not do this on Facebook. This kind of data analysis requires technical and legal means to access and process the data – in other words, open data.
The Edgeryders platform has both. We consider ourselves part of the open data community, and some of us, myself included, are open data activists in our spare time.
And third. I’m a data guy, and I like data. But at the end of the day, conversation graphs are just shorthand. The real thing behind the data is collective intelligence, people
in conversation; and that is incredibly rich and more interesting than any shorthand, like all emergent phenomena. In our community, we are committed to put data
analysis at the service of this ever-changing, ongoing dialogue. We are mapmakers, not landscapers, and we do not confuse the map with the territory. If we work
together, yes, we will make graphs with you. And then we will encourage you to use them to find the most interesting people, and engage with them and what they have
to say. Our data describe people, and people are really there. You can engage them, look them up, challenge them, hire them, become their friends. Hans Rosling got it
just right: we don’t look at the data. We look thought the data, to better see the people behind them.

Weitere ähnliche Inhalte

Was ist angesagt?

2000 - CSCW - Conversation Trees and Threaded Chats
2000  - CSCW - Conversation Trees and Threaded Chats2000  - CSCW - Conversation Trees and Threaded Chats
2000 - CSCW - Conversation Trees and Threaded Chats
Marc Smith
 
2009-Social computing-First steps to netviz nirvana
2009-Social computing-First steps to netviz nirvana2009-Social computing-First steps to netviz nirvana
2009-Social computing-First steps to netviz nirvana
Marc Smith
 

Was ist angesagt? (20)

Memes meaning and definitions
Memes meaning and definitionsMemes meaning and definitions
Memes meaning and definitions
 
15 Network Visualization and Communities
15 Network Visualization and Communities15 Network Visualization and Communities
15 Network Visualization and Communities
 
Frontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter Design
 
Frontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text Analysis
 
2000 - CSCW - Conversation Trees and Threaded Chats
2000  - CSCW - Conversation Trees and Threaded Chats2000  - CSCW - Conversation Trees and Threaded Chats
2000 - CSCW - Conversation Trees and Threaded Chats
 
2009-Social computing-First steps to netviz nirvana
2009-Social computing-First steps to netviz nirvana2009-Social computing-First steps to netviz nirvana
2009-Social computing-First steps to netviz nirvana
 
Web & Social Media Analystics - Workshop Semantica
Web & Social Media Analystics - Workshop SemanticaWeb & Social Media Analystics - Workshop Semantica
Web & Social Media Analystics - Workshop Semantica
 
Small Worlds Social Graphs Social Media
Small Worlds Social Graphs Social MediaSmall Worlds Social Graphs Social Media
Small Worlds Social Graphs Social Media
 
Chatbot in a Campus Enviroment: Design of LiSA, a Virtual Assistant To Help S...
Chatbot in a Campus Enviroment: Design of LiSA, a Virtual Assistant To Help S...Chatbot in a Campus Enviroment: Design of LiSA, a Virtual Assistant To Help S...
Chatbot in a Campus Enviroment: Design of LiSA, a Virtual Assistant To Help S...
 
Data Ethics for Mathematicians
Data Ethics for MathematiciansData Ethics for Mathematicians
Data Ethics for Mathematicians
 
How to map networks
How to map networks How to map networks
How to map networks
 
How to map networks
How to map networks How to map networks
How to map networks
 
CS6010 Social Network Analysis Unit I
CS6010 Social Network Analysis Unit ICS6010 Social Network Analysis Unit I
CS6010 Social Network Analysis Unit I
 
04 Data Visualization (2017)
04 Data Visualization (2017)04 Data Visualization (2017)
04 Data Visualization (2017)
 
2009 - Connected Action - Marc Smith - Social Media Network Analysis
2009 - Connected Action - Marc Smith - Social Media Network Analysis2009 - Connected Action - Marc Smith - Social Media Network Analysis
2009 - Connected Action - Marc Smith - Social Media Network Analysis
 
Web 3.0
Web 3.0Web 3.0
Web 3.0
 
Mathematics and Social Networks
Mathematics and Social NetworksMathematics and Social Networks
Mathematics and Social Networks
 
Information Networks and Semantics
Information Networks and SemanticsInformation Networks and Semantics
Information Networks and Semantics
 
How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...How to conduct a social network analysis: A tool for empowering teams and wor...
How to conduct a social network analysis: A tool for empowering teams and wor...
 
CSE509 Lecture 6
CSE509 Lecture 6CSE509 Lecture 6
CSE509 Lecture 6
 

Ähnlich wie Harvesting collective intelligence.

Summary Of Defending Against The Indefensible Essay
Summary Of Defending Against The Indefensible EssaySummary Of Defending Against The Indefensible Essay
Summary Of Defending Against The Indefensible Essay
Brenda Zerr
 
Thesis_Digest
Thesis_DigestThesis_Digest
Thesis_Digest
Wei Xu
 
Social networks and Social capital
Social networks and Social capitalSocial networks and Social capital
Social networks and Social capital
p11230428
 

Ähnlich wie Harvesting collective intelligence. (20)

Summary Of Defending Against The Indefensible Essay
Summary Of Defending Against The Indefensible EssaySummary Of Defending Against The Indefensible Essay
Summary Of Defending Against The Indefensible Essay
 
Digital humanities
Digital humanitiesDigital humanities
Digital humanities
 
Notational systems and cognitive evolution
Notational systems and cognitive evolutionNotational systems and cognitive evolution
Notational systems and cognitive evolution
 
Questions For Essays.pdf
Questions For Essays.pdfQuestions For Essays.pdf
Questions For Essays.pdf
 
Human-machine Inter-agencies
Human-machine Inter-agenciesHuman-machine Inter-agencies
Human-machine Inter-agencies
 
Steim patterns&pleasures nina_wenhart_presentation
Steim patterns&pleasures nina_wenhart_presentationSteim patterns&pleasures nina_wenhart_presentation
Steim patterns&pleasures nina_wenhart_presentation
 
Usabilidad y diseno
Usabilidad y disenoUsabilidad y diseno
Usabilidad y diseno
 
Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...Making our mark: the important role of social scientists in the ‘era of big d...
Making our mark: the important role of social scientists in the ‘era of big d...
 
Transliteracy 3 t's conference
Transliteracy 3 t's conference Transliteracy 3 t's conference
Transliteracy 3 t's conference
 
Thesis_Digest
Thesis_DigestThesis_Digest
Thesis_Digest
 
Social networks and Social capital
Social networks and Social capitalSocial networks and Social capital
Social networks and Social capital
 
ICOTS 2018
ICOTS 2018ICOTS 2018
ICOTS 2018
 
Thinking in networks: what it means for policy makers – PDF 2014
Thinking in networks: what it means for policy makers – PDF 2014Thinking in networks: what it means for policy makers – PDF 2014
Thinking in networks: what it means for policy makers – PDF 2014
 
Cloud Computing Essays.pdf
Cloud Computing Essays.pdfCloud Computing Essays.pdf
Cloud Computing Essays.pdf
 
Semantic web and information graph
Semantic web and information graphSemantic web and information graph
Semantic web and information graph
 
Making data more human
Making data more humanMaking data more human
Making data more human
 
020610
020610020610
020610
 
Platforms and Analytical Gestures
Platforms and Analytical GesturesPlatforms and Analytical Gestures
Platforms and Analytical Gestures
 
Sea Grant: New Tools for Outreach and Engagement
Sea Grant: New Tools for Outreach and EngagementSea Grant: New Tools for Outreach and Engagement
Sea Grant: New Tools for Outreach and Engagement
 
Communication between open source developers
Communication between open source developersCommunication between open source developers
Communication between open source developers
 

Kürzlich hochgeladen

development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
MohamedFarag457087
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 

Kürzlich hochgeladen (20)

Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 

Harvesting collective intelligence.

  • 1. P E O P L E , W O R D S , D ATA : H A R V E S T I N G C O L L E C T I V E I N T E L L I G E N C E A L B E R T O C O T T I C A
  • 2. Public domain Most of you are familiar with this. It’s called La scuola di Atene: 500 years ago, at the peak of the Renaissance and his own powers, Raphael paints it as a tribute to the wisdom and courage of humans seeking truth. The whole Top 40 chart of Greek philosophers is depicted: Pythagoras is right in the front left, studying a large tome. Epicurus also to the left, facing us, crowned with laurel. Socrates. Heraclitus. And in the center of the action: Plato and Aristotle. The two mighty philosophers are deep in discussion as they stride towards us. And this is only appropriate, because discussion, “dialogue” as Plato liked to call it is what powers their knowledge. Look around the fresco: debate is everywhere. Twenty-five centuries after Plato, and five after Raphael, this is still how science works.
  • 3. Photo: Vitaliy Smolygin Dialogue – conversation – is not just about accumulating information. It’s a process that augments information, by setting it in a richer context. A well-run debate can feel like walking into a room with a fragment of map to find others that have its other pieces and ending up with a whole map. The whole is more, much more of the sum of its parts. And it never ends: questions beget answers and new questions, and it starts all over again, like a living thing. In fact, life itself works in a similar way, with genetic information being continuously traded amongst organism to give rise to always new, wonderful life forms.
  • 4. Photo: Geraint Rowland This is why Edgeryders encourages truth-seeking, result-oriented conversation as a knowledge engine. Nobody is smarter than everybody, and the smarts is in the interaction – the conversation. But this raises a problem: conversations don’t scale well. A hundred people cannot have a conversation, in the sense that they all keep a reasonably similar outlook on what is being discussed and what conclusions are being reached. They have to splinter into smaller groups. A great community convener, like John, or Noemi, can get a great many interesting people involved in discussing a problem. A lot of insights are generated and validated; but they are validated locally, by subsets of a few people. How to generalise? How can we tell which insights are solid, and which are the product of a branch of the conversation that simply misfired? At Edgeryders, we do it in three moves.
  • 5. O N L I N E E T H N O G R A P H Y Photo: sam.romilly The first move is online ethnography. Many of you will be familiar with it; for the rest of us, ethnography is a qualitative research technique that results in the description of a group that encodes the point of view of that group. Ethnographers prefer to work with no hypothesis to prove, simply letting their respondents take the study to whatever they think is important. So, the first thing we do when trying to make sense of an online conversation, is deploy one or more ethnographers.
  • 6. Photo: Medhin PaolosPhoto: Medhin Paolos Ethnographers aggregate knowledge by interviewing people, and annotating the transcripts of their interviews. Annotations consist of a quotation from what the respondent has said, associated to one or more keywords. As she goes through the testimonies, the ethnographer builds and maintains an ontology of concepts and facts relevant to the issue being studied, as seen by respondents as a group. This is process is known as “coding”. Nowadays this is done with computers, and the results entered in a kind of database. So, the researcher can call back all the quotations that have been coded with a keyword, say “greentech”, and quickly get a grasp of what different people are saying about greentech. Do they agree? Is their controversy? What is important? What other concepts is greentech connected to? Etcetera. In Edgeryders, we have an advantage: conversation happens on our platform, so ethnographers can start to code it right away – no need for interviewing and transcribing. We can be much faster and cheaper this way. The Edgeryders platform features its own open source module to do ethnographic coding of the content therein. We call it OpenEthnographer, because we have a vision of ethnography as a collaborative discipline.
  • 7. S O C I A L N E T W O R K A N A LY S I S The second move is social network analysis. The Edgeryders conversation is encoded in a database – it has to be, for the platform to work at all. So, we can represent the discussion as a social network, where nodes represent people and arcs represent interactions – comments, essentially. At that point the pattern of connectivity in our conversation is described by a graph, and graphs are well understood mathematical objects. What’s more, there is a rich literature in social science that associates measurable graph metrics to social phenomena like brokerage and authoritativeness. In general, the graph can help us tentatively assign reliability scores to individuals. In this one, for example, there is a densely connected core of about 40 people in the center, who can be assumed to have at least been exposed to what other people have to say. Their quotations carry more weight, because they are validating and correcting each other. Then there is a periphery of people that have only one or two connections, and even a few completely isolated nodes to the southwest. What these people have to say might be very wise and relevant, but it lacks in-group validation. We developed software to do real-time social network analysis of the conversation on the Edgeryders platform. We call it Edgesense.
  • 8. Thinking about collective intelligence in terms of networks has been fruitful for us. It generates intuition, and lets us do quick, simple checks on the general state of the community. For example, this visualisation comes from a project done two years ago. Like in the previous slide, dots represent people, and arcs represent comments. In orange, you see the new project; in blue, the Edgeryders conversation outside of the project. As you can see, the new conversation is well connected to the pre-existing one, through individual that participate in both; but is still maintains structural cohesion – you see that because the orange edges are grouped to the west of the centre. This is what we would hope to see: there is a healthy balance of specialists, that presumably bring their specific knowledge and interest; and generalists, who carry memory of previous debates and can connect the newcomers to the old-timers relevant to the new project.
  • 9. S E M A N T I C S O C I A L N E T W O R K S posts/comments induce network interaction Ayman Ben code1 code2 Comment (free text) timestamp The third move is to pull in together ethnographic data and social network data to build something we call semantic social networks. The “atom” of a SSN is an interaction like this, where Ayman addresses a comment to Ben. In your social networking platform, the comment is made of some text; it has a source, Ayman, and a target, Ben; and it has a timestamp. After the ethnographer has worked her magic, it also has one or more ethnographic codes, so we know what this is about.
  • 10. Ayman Ben Chris Eric Gazbia Hegazy Dan Jay Fayez #diabetes #arduino #dementia #3d printing #accessibility #design #design #childcare #childcare #3d printing #design When you generalise this approach to a whole conversation, you get a fairly complex graph. You can do many things with it. Consider, for example, this toy network of a conversation about care.
  • 11. Ayman Chris Eric Dan Jay Fran #diabetes #arduino #dementia #3d printing #childcare #3d printing Dan Bob Gea Hera #accessibility #design #design #childcare #design Jay Fran #dementia #3d printing #childcare #3d printing Ben Gazbia Hegazy #accessibility #design #design #childcare #design Jay Fran A simple one is to filter it by ethnographic code. In our toy example, I have pulled out the network of design. I now know that Jay, Hegazy, Gazbia, Ben and Fran are interested in design. I notice that they are not really talking to each other; Jay, Hegazy and Ben form a group, Gazbia and Fran another. I know that the discussion on design in the context of care risks being incoherent, or balkanised. If the study is ongoing, I can even let Gazbia and Fran know that there is a group around Hegazy which shares their interest for design, maybe they will get in touch, share notes, and change both the content of what is being said and the structure of the graph.
  • 12. #diabetes #dementia #accessibility #childcare #3d printing #design #arduino Hegazy, Jay Dan, Jay Dan, Fayez Ayman, Ben Gazbia, Fayez I can also reverse the idea, and build a network of ethnographic codes. Now the arcs represent co-occurrences: two codes are connected when they appear together in one comment. In this toy network, it turns out, I have two disconnected clusters of concepts.
  • 13. We are still inventing the math for SSNA, but already at an intuitive level we see that the method is very powerful. I can use it, for example, to identify the emergent groups of specialists gathering around certain ethno codes. For example, here I have a SSNA from one of our previous studies. I decide I want to look into the subset of this conversation which is about education. I find and select the relevant codes, and am referred to the people that have been using those codes. As you can see, people in this group are part of a connected graph, but only some are directly talking to each other about education issues. They have education as a common interest, but they are not all comparing notes around it yet. I can also decide to ask my data what ELSE the people talking about education are also talking about. And here it is: this is like doing association patterns, but with a group. This works also if the people in question do not know they are part of the group. We use a metrics called entanglement to measure each code’s contribution to the cohesiveness of a group of codes. This is very new, it first appeared in a PhD thesis less than 3 years ago. As I said, the method is new. At each project we improve and iterate.
  • 14. We are still inventing the math for SSNA, but already at an intuitive level we see that the method is very powerful. I can use it, for example, to identify the emergent groups of specialists gathering around certain ethno codes. For example, here I have a SSNA from one of our previous studies. I decide I want to look into the subset of this conversation which is about education. I find and select the relevant codes, and am referred to the people that have been using those codes. As you can see, people in this group are part of a connected graph, but only some are directly talking to each other about education issues. They have education as a common interest, but they are not all comparing notes around it yet. I can also decide to ask my data what ELSE the people talking about education are also talking about. And here it is: this is like doing association patterns, but with a group. This works also if the people in question do not know they are part of the group. We use a metrics called entanglement to measure each code’s contribution to the cohesiveness of a group of codes. This is very new, it first appeared in a PhD thesis less than 3 years ago. As I said, the method is new. At each project we improve and iterate.
  • 15. I N T E L L I G E N C E , N O T S E N T I M E N T Photo: alex_inside I would like to dedicate the last two minutes of this talk to lift our gaze from the data themselves and thing about what exactly we are trying to do with them. First, we are not trying to do sentiment analysis. We are not, because we model people differently. Sentiment analysis was developed in the context of commercial communication: you need to get good at pressing the behavioural buttons that trigger purchase decisions. It models people as desire machines – use a certain shade of blue on your website and increase infinitesimally the probability that the person will buy an insurance policy. We ask people to be smart, committed, hard-working: citizen experts, not “consumers”, “beneficiaries”, “target groups” etc. And we model them as such. This generates high quality, nuanced contributions. We need fairly sophisticated analytical instruments to do justice to that quality. An obvious application of this method to what the World Bank does is as a risk assessment support tool. Situated knowledge, validated by debate, can monitor the situation of projects on the ground in real time, or close enough. When operating in unstable contexts, that can be especially valuable.
  • 16. O P E N D ATA Photo: Chaos Computer Club Second, we could not do this on Facebook. This kind of data analysis requires technical and legal means to access and process the data – in other words, open data. The Edgeryders platform has both. We consider ourselves part of the open data community, and some of us, myself included, are open data activists in our spare time.
  • 17. And third. I’m a data guy, and I like data. But at the end of the day, conversation graphs are just shorthand. The real thing behind the data is collective intelligence, people in conversation; and that is incredibly rich and more interesting than any shorthand, like all emergent phenomena. In our community, we are committed to put data analysis at the service of this ever-changing, ongoing dialogue. We are mapmakers, not landscapers, and we do not confuse the map with the territory. If we work together, yes, we will make graphs with you. And then we will encourage you to use them to find the most interesting people, and engage with them and what they have to say. Our data describe people, and people are really there. You can engage them, look them up, challenge them, hire them, become their friends. Hans Rosling got it just right: we don’t look at the data. We look thought the data, to better see the people behind them.

Hinweis der Redaktion

  1. Most of you are familiar with this. It’s called La scuola di Atene: 500 years ago, at the peak of the Renaissance and his own powers, Raphael paints it as a tribute to the wisdom and courage of humans seeking truth. The whole Top 40 chart of Greek philosophers is depicted: Pythagoras is right in the front left, studying a large tome. Epicurus also to the left, facing us, crowned with laurel. Socrates. Heraclitus. And in the center of the action: Plato and Aristotle. The two mighty philosophers are deep in discussion as they stride towards us. And this is only appropriate, because discussion, “dialogue” as Plato liked to call it is what powers their knowledge. Look around the fresco: debate is everywhere. Twenty-five centuries after Plato, and five after Raphael, this is still how science works.
  2. Dialogue – conversation – is not just about accumulating information. It’s a process that augments information, by setting it in a richer context. A well-run debate can feel like walking into a room with a fragment of map to find others that have its other pieces and ending up with a whole map. The whole is more, much more of the sum of its parts. And it never ends: questions beget answers and new questions, and it starts all over again, like a living thing. In fact, life itself works in a similar way, with genetic information being continuously traded amongst organism to give rise to always new, wonderful life forms.
  3. This is why Edgeryders encourages truth-seeking, result-oriented conversation as a knowledge engine. Nobody is smarter than everybody, and the smarts is in the interaction – the conversation. But this raises a problem: conversations don’t scale well. A hundred people cannot have a conversation, in the sense that they all keep a reasonably similar outlook on what is being discussed and what conclusions are being reached. They have to splinter into smaller groups. A great community convener, like John, or Noemi, can get a great many interesting people involved in discussing a problem. A lot of insights are generated and validated; but they are validated locally, by subsets of a few people. How to generalise? How can we tell which insights are solid, and which are the product of a branch of the conversation that simply misfired? At Edgeryders, we do it in three moves.
  4. The first move is online ethnography. Many of you will be familiar with it; for the rest of us, ethnography is a qualitative research technique that results in the description of a group that encodes the point of view of that group. Ethnographers prefer to work with no hypothesis to prove, simply letting their respondents take the study to whatever they think is important. So, the first thing we do when trying to make sense of an online conversation, is deploy one or more ethnographers.
  5. Ethnographers aggregate knowledge by interviewing people, and annotating the transcripts of their interviews. Annotations consist of a quotation from what the respondent has said, associated to one or more keywords. As she goes through the testimonies, the ethnographer builds and maintains an ontology of concepts and facts relevant to the issue being studied, as seen by respondents as a group. This is process is known as “coding”. Nowadays this is done with computers, and the results entered in a kind of database. So, the researcher can call back all the quotations that have been coded with a keyword, say “greentech”, and quickly get a grasp of what different people are saying about greentech. Do they agree? Is their controversy? What is important? What other concepts is greentech connected to? Etcetera. In Edgeryders, we have an advantage: conversation happens on our platform, so ethnographers can start to code it right away – no need for interviewing and transcribing. We can be much faster and cheaper this way. The Edgeryders platform features its own open source module to do ethnographic coding of the content therein. We call it OpenEthnographer, because we have a vision of ethnography as a collaborative discipline.
  6. The second move is social network analysis. The Edgeryders conversation is encoded in a database – it has to be, for the platform to work at all. So, we can represent the discussion as a social network, where nodes represent people and arcs represent interactions – comments, essentially. At that point the pattern of connectivity in our conversation is described by a graph, and graphs are well understood mathematical objects. What’s more, there is a rich literature in social science that associates measurable graph metrics to social phenomena like brokerage and authoritativeness. In general, the graph can help us tentatively assign reliability scores to individuals. In this one, for example, there is a densely connected core of about 40 people in the center, who can be assumed to have at least been exposed to what other people have to say. Their quotations carry more weight, because they are validating and correcting each other. Then there is a periphery of people that have only one or two connections, and even a few completely isolated nodes to the southwest. What these people have to say might be very wise and relevant, but it lacks in-group validation. We developed software to do real-time social network analysis of the conversation on the Edgeryders platform. We call it Edgesense.
  7. Thinking about collective intelligence in terms of networks has been fruitful for us. It generates intuition, and lets us do quick, simple checks on the general state of the community. For example, this visualisation comes from a project done two years ago. Like in the previous slide, dots represent people, and arcs represent comments. In orange, you see the new project; in blue, the Edgeryders conversation outside of the project. As you can see, the new conversation is well connected to the pre-existing one, through individual that participate in both; but is still maintains structural cohesion – you see that because the orange edges are grouped to the west of the centre. This is what we would hope to see: there is a healthy balance of specialists, that presumably bring their specific knowledge and interest; and generalists, who carry memory of previous debates and can connect the newcomers to the old-timers relevant to the new project.
  8. The third move is to pull in together ethnographic data and social network data to build something we call semantic social networks. The “atom” of a SSN is an interaction like this, where Ayman addresses a comment to Ben. In your social networking platform, the comment is made of some text; it has a source, Ayman, and a target, Ben; and it has a timestamp. After the ethnographer has worked her magic, it also has one or more ethnographic codes, so we know what this is about.
  9. When you generalise this approach to a whole conversation, you get a fairly complex graph. You can do many things with it. Consider, for example, this toy network of a conversation about care.
  10. A simple one is to filter it by ethnographic code. In our toy example, I have pulled out the network of design. I now know that Jay, Hegazy, Gazbia, Ben and Fran are interested in design. I notice that they are not really talking to each other; Jay, Hegazy and Ben form a group, Gazbia and Fran another. I know that the discussion on design in the context of care risks being incoherent, or balkanised. If the study is ongoing, I can even let Gazbia and Fran know that there is a group around Hegazy which shares their interest for design, maybe they will get in touch, share notes, and change both the content of what is being said and the structure of the graph.
  11. I can also reverse the idea, and build a network of ethnographic codes. Now the arcs represent co-occurrences: two codes are connected when they appear together in one comment. In this toy network, it turns out, I have two disconnected clusters of concepts.
  12. We are still inventing the math for SSNA, but already at an intuitive level we see that the method is very powerful. I can use it, for example, to identify the emergent groups of specialists gathering around certain ethno codes. For example, here I have a SSNA from one of our previous studies. I decide I want to look into the subset of this conversation which is about education. I find and select the relevant codes, and am referred to the people that have been using those codes. As you can see, people in this group are part of a connected graph, but only some are directly talking to each other about education issues. They have education as a common interest, but they are not all comparing notes around it yet. I can also decide to ask my data what ELSE the people talking about education are also talking about. And here it is: this is like doing association patterns, but with a group. This works also if the people in question do not know they are part of the group. We use a metrics called entanglement to measure each code’s contribution to the cohesiveness of a group of codes. This is very new, it first appeared in a PhD thesis less than 3 years ago. As I said, the method is new. At each project we improve and iterate.
  13. I would like to dedicate the last two minutes of this talk to lift our gaze from the data themselves and thing about what exactly we are trying to do with them. First, we are not trying to do sentiment analysis. We are not, because we model people differently. Sentiment analysis was developed in the context of commercial communication: you need to get good at pressing the behavioural buttons that trigger purchase decisions. It models people as desire machines – use a certain shade of blue on your website and increase infinitesimally the probability that the person will buy an insurance policy. We ask people to be smart, committed, hard-working: citizen experts, not “consumers”, “beneficiaries”, “target groups” etc. And we model them as such. This generates high quality, nuanced contributions. We need fairly sophisticated analytical instruments to do justice to that quality. An obvious application of this method to what the World Bank does is as a risk assessment support tool. Situated knowledge, validated by debate, can monitor the situation of projects on the ground in real time, or close enough. When operating in unstable contexts, that can be especially valuable.
  14. Second, we could not do this on Facebook. This kind of data analysis requires technical and legal means to access and process the data – in other words, open data. The Edgeryders platform has both. We consider ourselves part of the open data community, and some of us, myself included, are open data activists in our spare time.
  15. And third. I’m a data guy, and I like data. But at the end of the day, conversation graphs are just shorthand. The real thing behind the data is collective intelligence, people in conversation; and that is incredibly rich and more interesting than any shorthand, like all emergent phenomena. In our community, we are committed to put data analysis at the service of this ever-changing, ongoing dialogue. We are mapmakers, not landscapers, and we do not confuse the map with the territory. If we work together, yes, we will make graphs with you. And then we will encourage you to use them to find the most interesting people, and engage with them and what they have to say. Our data describe people, and people are really there. You can engage them, look them up, challenge them, hire them, become their friends. Hans Rosling got it just right: we don’t look at the data. We look thought the data, to better see the people behind them.