Even Before the Cradle
mapping online debates on
c-section and family planning
Tommaso Venturini
tommaso.venturini@sciencespo.fr
An example of a medium-size
project in digital methods
1 expert partner:
2 research partners:
Resources (@médialab):
- Donato Ricci (designer - 3 months full time)
- Audrey Baneyx (developer - 3 weeks)
- Support from the médialab team
Rationale for the project
The WHO issues recommendations on health-related practices to medical
institutions and to the general public, and needs ways to monitor the spread and
efficacy of such recommendations
No, it's not just pushing
a button
Actual research protocol
1. Briefing with the issue experts
2. Draft of possible mapping approaches (by students)
3. Choice of datasets and methods with the issue experts
4. Data extraction and cleaning
5. Data treatment
6. Exploration and definition of specific research questions
7. Sketch of data visualizations
8. Meeting between data experts and designers
9. Data refinement
10. Development of the visualizations
11. Interpretation of the visualizations with the issue experts
12. Development of the public atlas
What we have done so far
1. Briefing with the issue experts
2. Draft of possible mapping approaches (by students)
3. Choice of datasets and methods with the issue experts
4. Data extraction and cleaning
5. Data treatment
6. Exploration and definition of specific research questions
7. Sketch of data visualizations
8. Meeting between data experts and designers
9. Data refinement
10. Development of the visualizations
11. Interpretation of the visualizations with the issue experts
12. Development of the public atlas
1. Briefing with
the issue experts
• July 2012
• Meeting with 2 experts from the WHO
• Mario Merialdi (Coordinator, Reproductive Health and
Research; Family, Women's and Children's Health)
• Ana Pilar Betran Lazaga (Assistant coordinator…)
• 1 afternoon (at the médialab) presentation on digital methods
and communication design
2. Draft maps
http://www.densitydesign.org/courses/integrated-course-final-synthesis-studio-2/
The Pill (poster)
http://www.flickr.com/photos/densitydesign/8187992346/lightbox/
The Pill (report)
http://issuu.com/densitydesign/docs/02_-_the_pill/1?e=0
The Pill (video)
http://vimeo.com/62162885
3. Choice of
datasets and methods
• 1 day meeting in Geneva
• Presentation of the students’ draft-maps
• Identification of methodological problems
• Choice of the two case studies
• After 20 days, list of precise research questions
(by the experts)
• After 20 days, definition of the operationalization (with the
experts)
Two case-studies
Caesarean section (also C-section, Cesarean section) “a
surgical procedure in which one or more incisions are
made through a mother's abdomen (laparotomy) and
uterus (hysterotomy) to deliver one or more babies”
(Wikipedia 05/07/13)
Family planning
“the planning of when to have children and the use of birth
control and other techniques to implement such plans”
(Wikipedia 05/07/13)
3. Choice of datasets and methods
C-section
1. Websites hyperlinks
- cartography of the topology of the hyperlink connections
- analysis of the penetration of the WHO messages
2. Websites texts
- analysis of the expressions used by the different types of websites
3. Online images
- analysis of the representations used by the different types of websites
4. Discussions in AuFeminin
- analysis of the agenda of the online discussion
Family Planning
1. Websites texts
- analysis of the expressions used by the different types of websites
2. Wikipedia
- analysis of the edit history of the pages on family planning
4. Data extraction and cleaning
C-section
1. Websites hyperlinks
- Query de-personalized Google with various queries
- Harvest the first 100 results
- Manually clean the results (366 seed-URLs)
2. Websites texts
- Harvest the source-code of the 366 seed-URLs and extract the textual content
3. Online images
- Query de-personalized Google (.co.uk, .fr, .it, .com) with translated queries
- Harvest the first 100 results for each query and engine (about 4,000 images)
- Harvest the source-code of the URLs containing the images
- Automatically extract the textual content
4. Discussions in AuFeminin
- Select some of the forums
- Search all the discussions containing the translated queries
- Harvest the discussions of the last year with at least 2 replies
- Automatically extract the textual content
Family Planning
1. Websites texts
- Query the de-personalized Google.com with various queries
- Harvest the first 100 results
- Manually clean the results (553 seed-URLs)
- Harvest the source-code of the 553 URLs and extract the textual content
2. Wikipedia
- Select the most relevant Wikipedia pages related to family planning
- Extract the complete edit history of the page via the Wikipedia API
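The Wikipedia step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it assumes the English Wikipedia endpoint and the standard MediaWiki `action=query` / `prop=revisions` request, and the function names (`revisions_url`, `parse_batch`, `fetch_history`) are hypothetical. The network call is passed in as `fetch`, so the paging logic can be tried without going online.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia; other language editions use their own host.
API_URL = "https://en.wikipedia.org/w/api.php"


def revisions_url(title, cont=None):
    """Build a MediaWiki API URL asking for one batch of a page's revisions."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": "max",
        "format": "json",
    }
    if cont:
        # Continuation tokens returned with the previous batch.
        params.update(cont)
    return API_URL + "?" + urlencode(params)


def parse_batch(payload):
    """Return (revisions, continuation) from one decoded JSON response."""
    revisions = []
    for page in payload.get("query", {}).get("pages", {}).values():
        revisions.extend(page.get("revisions", []))
    return revisions, payload.get("continue")


def fetch_history(title, fetch):
    """Collect every revision of a page; `fetch` maps a URL to decoded JSON."""
    history, cont = [], None
    while True:
        revisions, cont = parse_batch(fetch(revisions_url(title, cont)))
        history.extend(revisions)
        if not cont:
            return history
```

For a live run, `fetch` could be a small function that opens the URL with `urllib.request.urlopen` and decodes the body with `json.load`.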
4. Data extraction and cleaning
C-section
1. Websites hyperlinks
- Query the de-personalized Google.com with various queries
(C-section, Cs delivery, Surgical delivery, Abdominal delivery,
Cesarean delivery, Caesarean delivery, Operative delivery, Caesarean, Cesarean)
- Harvest the first 100 results of each query
- Manually clean the results (366 seed-URLs)
2. Websites texts
- Harvest the source-code of the 366 seed-URLs
- Extract textual content (through Url2Text)
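The text-extraction step can be illustrated with a small stand-in. The sketch below does not reproduce Url2Text; it only assumes that "extracting the textual content" means keeping the visible text of a harvested page and dropping scripts and styles. The names `TextExtractor` and `html_to_text` are hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the content of script and style tags."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(source):
    """Return the visible text of an HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(source)
    parser.close()
    return parser.text()
```

Real pages also carry menus, footers and advertising, which is one reason the manual cleaning steps in the protocol matter.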
5. Data treatment
C-section
1. Websites hyperlinks
- Crawl the 366 seed-URLs through Hyphe
(https://github.com/medialab/Hypertext-Corpus-Initiative/)
- Manually and automatically clean the neighboring URLs (614 URLs)
- Extract the hyperlink networks through Hyphe
2. Websites texts
- Categorize the 366 seed-URLs
Theme: Pros, cons and worries; Involvement of the father; Ethical issues
Type: Health care providers; Government; Medical and scientific groups; NGOs / non-profit; Feminist groups / female associations / moms; Rights groups; Natural & holistic delivery groups; Media; Blogs & Discussions; Institutions; Hospitals & Clinics; Products
- Extract the noun-phrases through Pattern (www.clips.ua.ac.be/pattern)
- Cluster and clean the noun-phrases (through Google Refine)
Hyphe https://github.com/medialab/Hypertext-Corpus-Initiative/
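To make the idea of a hyperlink network concrete, here is a minimal sketch. It is not Hyphe and does not use its API: it assumes the pages have already been harvested into a dictionary of URL to HTML source, treats each host name (minus a leading "www.") as one site, and counts links between distinct sites. All names (`LinkCollector`, `site_of`, `hyperlink_network`, `sites_linking_to`) are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def site_of(url):
    """Reduce a URL to its host name, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def hyperlink_network(pages):
    """Map (source site, target site) to the number of links between them."""
    edges = Counter()
    for url, markup in pages.items():
        collector = LinkCollector()
        collector.feed(markup)
        origin = site_of(url)
        for href in collector.hrefs:
            target = site_of(urljoin(url, href))  # resolve relative links
            if target and target != origin:       # skip internal and non-web links
                edges[(origin, target)] += 1
    return edges


def sites_linking_to(edges, target):
    """List the sites that link to `target`, e.g. to gauge WHO penetration."""
    return sorted({src for (src, dst) in edges if dst == target})
```

Calling `sites_linking_to(edges, "who.int")` gives a first, crude reading of how far WHO pages are cited within the corpus.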
Pattern
www.clips.ua.ac.be/pattern
Google Refine
https://github.com/OpenRefine/OpenRefine
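Clustering noun-phrases in Google Refine typically relies on key-collision methods. The sketch below imitates the general idea of a "fingerprint" key (lowercase, strip accents and punctuation, sort the unique tokens) in plain Python. It is an approximation written for illustration, not Refine's own code, and the function names are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(phrase):
    """Normalize a phrase so that trivial variants share the same key."""
    text = unicodedata.normalize("NFKD", phrase.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")  # drop accents
    text = re.sub(r"[^\w\s]", "", text)                    # drop punctuation
    return " ".join(sorted(set(text.split())))             # order-insensitive


def cluster(phrases):
    """Group phrases by fingerprint; return only groups with several variants."""
    groups = defaultdict(list)
    for phrase in phrases:
        groups[fingerprint(phrase)].append(phrase)
    return [variants for variants in groups.values() if len(set(variants)) > 1]
```

Spelling variants such as "Caesarean" and "Cesarean" do not collide under this key, so they still need the manual cleaning mentioned in the protocol.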
6. Exploration
C-section
1. Websites hyperlinks
- Visual network analysis in Gephi (gephi.org)
- Egocenter heatmaps in Heatgraph
(tools.medialab.sciences-po.fr/heatgraph/)
2. Websites texts
- Language intake analysis in Sven (sven.densitydesign.org)
Gephi
Gephi.org
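Gephi can import an edge table from a CSV spreadsheet whose columns are named Source, Target and, optionally, Weight. As a minimal sketch, assuming the hyperlink network is held as a dictionary mapping (source site, target site) pairs to link counts (a hypothetical in-memory format, not Hyphe's export), such a table could be written like this:

```python
import csv
import io


def write_edge_table(edges, handle):
    """Write edges as a CSV edge table with Source, Target, Weight columns."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])


def edge_table_text(edges):
    """Return the edge table as a string (handy for quick inspection)."""
    buffer = io.StringIO()
    write_edge_table(edges, buffer)
    return buffer.getvalue()
```

To produce a file instead, pass a handle opened with `open("edges.csv", "w", newline="")` to `write_edge_table`.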
Heatgraph
tools.medialab.sciences-po.fr/heatgraph
SVEN
http://sven.densitydesign.org
A) Pros, cons and worries
B) Involvement of the father
C) Health care providers
D) Government
E) Ethical issues
F) Medical and scientific groups
G) NGOs / non-profit
H) Feminist groups / female associations / moms
I) Rights groups
J) Natural & holistic delivery groups
K) Media
L) Blogs & Discussions
M) Institutions
N) Hospitals & Clinics
O) Products
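One simple way to compare the expressions used by these categories of websites is to count, per category, how many sites use each expression, and then rank expressions by how concentrated they are in one category. The sketch below is a hypothetical illustration of that idea, not the method implemented in Sven. It assumes each site has already been reduced to a category label and a list of cleaned noun-phrases.

```python
from collections import Counter, defaultdict


def expressions_by_category(documents):
    """Count, for each category, the number of sites using each expression.

    `documents` is an iterable of (category, list_of_expressions) pairs.
    """
    counts = defaultdict(Counter)
    for category, expressions in documents:
        counts[category].update(set(expressions))  # one vote per site
    return counts


def distinctive(counts, category, top=3):
    """Rank a category's expressions by the share of their uses it accounts for."""
    total = Counter()
    for counter in counts.values():
        total.update(counter)
    own = counts[category]
    ranked = sorted(own, key=lambda e: (-own[e] / total[e], -own[e], e))
    return ranked[:top]
```

With only a few hundred sites, shares computed this way are noisy, so the ranking is best read as a prompt for closer reading rather than as a result.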
Other datasets and methods
C-section
1. Websites hyperlinks
- cartography of the topology of the hyperlink connections
- analysis of the penetration of the WHO messages
2. Websites texts
- analysis of the expressions used by the different types of websites
3. Online images
- analysis of the representations used by the different types of websites
4. Discussions in AuFeminin
- analysis of the agenda of the online discussion
Family Planning
1. Websites texts
- analysis of the expressions used by the different types of websites
2. Wikipedia
- analysis of the edit history of the pages on family planning
What remains to do
1. Briefing with the issue experts
2. Draft of possible mapping approaches (by students)
3. Choice of datasets and methods with the issue experts
4. Data extraction and cleaning
5. Data treatment
6. Exploration and definition of specific research questions
7. Sketch of data visualizations
8. Meeting between data experts and designers
9. Data refinement
10. Development of the visualizations
11. Interpretation of the visualizations with the issue experts
12. Development of the public atlas
What we have learned
1. Digital methods are not easier or quicker
2. More data always entails more noise
3. Results quality depends heavily on data cleaning
4. No a priori distinction exists between noise and information
5. An iterative approach is necessary
6. Exchanges with experts and expertise building are necessary
7. Digital methods are a form of field work
tommasoventurini.it
Venturini, T. (2012). Great expectations: méthodes quali-quantitative et
analyse des réseaux sociaux.
In J.-P. Fourmentraux (Ed.), L’Ère Post-Media. Humanités digitales et
Cultures numériques (Hermann., Vol. 104, pp. 39–51). Paris.
Venturini, T., & Latour, B. (2010). Le tissu social : trace numérique et
méthodes quali-quantitatives. Proceedings of Future En Seine 2009. Paris:
Editions Futur en Seine.
Latour, B., Jensen, P., Venturini, T., Grauwin, S., & Boullier, D. (2012). “The
Whole is Always Smaller Than Its Parts” A Digital Test of Gabriel
Tarde’s Monads. British Journal of Sociology, 63(4), 591–615.

From Before the Cradle: mapping online debates on c-section and family planning

  • 1. Even Before the Cradle mapping online debates on c-section and family planning Tommaso Venturini tommaso.venturini@sciencespo.fr
  • 2. An example of a medium-size project in digital methods 1 expert partner: 2 research partners: Resources (@médialab): - Donato Ricci (designer - 3 months full time) - Audrey Baneyx (developer - 3 weeks ) - Support from the médialab team Rationale for the project The WHO issues recommendations on health related practices to medical institutions and to the public opinion and need ways to monitor the spread and efficacy of such recommendations
  • 3. No, its not just pushing a button
  • 4. Actual research protocol 1. Briefing with the issue experts 2. Draft of possible mapping approaches (by students) 3. Choice of datasets and methods with the issue experts 4. Data extraction and cleaning 5. Data treatment 6. Exploration and definition of specific research questions 7. Sketch of data visualizations 8. Meeting between data experts and designers 9. Data refinement 10. Development of the visualizations 11. Interpretation of the visualizations with the issue experts 12. Development of the public atlas
  • 5. What we have done so far 1. Briefing with the issue experts 2. Draft of possible mapping approaches (by students) 3. Choice of datasets and methods with the issue experts 4. Data extraction and cleaning 5. Data treatment 6. Exploration and definition of specific research questions 7. Sketch of data visualizations 8. Meeting between data experts and designers 9. Data refinement 10. Development of the visualizations 11. Interpretation of the visualizations with the issue experts 12. Development of the public atlas
  • 6. 1. Briefing with the issue experts • July 2012 • Meeting with 2 experts from the WHO • Mario Merialdi(Coordinator of Coordinator Reproductive Health and Research Family, Women's and Children's Health) • Ana PilarBetranLazaga(Assistant coordinator…) • 1 afternoon (at the médialab) presentation on digital methods and communication design
  • 7. 2. Maps drafts http://www.densitydesign.org/courses/integrated- course-final-synthesis-studio-2/
  • 8. The Pill (poster) http://www.flickr.com/photos/densitydesign /8187992346/lightbox/
  • 9. The Pill (report) http://issuu.com/densitydesign/docs/02_- _the_pill/1?e=0
  • 11. 3. Choice of datasets and methods • 1 day meeting in Geneva • Presentation of the students’ draft-maps • Identification of methodological problems • Choice of the two case studies • After 20 days, list of precise research questions (by the experts) • After 20 days, definition of the operationalization (with the experts)
  • 12. Two case-studies Caesarean section (also C-section, Cesarean section) “a surgical procedure in which one or more incisions are made through a mother's abdomen (laparotomy) and uterus (hysterotomy) to deliver one or more babies” (Wikipedia 05/07/13) Family planning “the planning of when to have children and the use of birth control and other techniques to implement such plans” (Wikipedia 05/07/13)
  • 14. 3. Choice of datasets and methods C-section 1. Websites hyperlinks - cartography of the topology of the hyperlink connections - analysis of the penetration of the WHO messages 2. Websites texts - analysis of the expressions used by the different type of websites 3. Online images - analysis of the representations used by the different type of websites 4. Discussions in AuFeminin - analysis of the agenda of the online discussion Family Planning 1. Websites texts - analysis of the expressions used by the different type of websites 2. Wikipedia - Analysis of the edit history of the pages on family planning
  • 15. 3. Choice of datasets and methods C-section 1. Websites hyperlinks - cartography of the topology of the hyperlink connections - analysis of the penetration of the WHO messages 2. Websites texts - analysis of the expressions used by the different type of websites 3. Online images - analysis of the representations used by the different type of websites 4. Discussions in AuFeminin - analysis of the agenda of the online discussion Family Planning 1. Websites texts - analysis of the expressions used by the different type of websites 2. Wikipedia - Analysis of the edit history of the pages on family planning
  • 16. 4. Data extraction and cleaningC-section 1. Websites hyperlinks - Query de-personalized Google with various queries - Harvest the first 100 results - Manually clean the results (366 seed-URLs) 2. Websites texts -Harvest the source-code of the 366 seed-URLs and extract the textual content 3. Online images - Query de-personalized Google +co.ok, +fr, +it, +com with translated queries - Harvest the first 100 results for each query and engine (about 4.000 img) - Harvest the source-code of the URLs containing the images - Automatically extract the textual content 4. Discussions in AuFeminin - Select some the forums - Search all the discussions containing the translated queries - Harvest the discussions of last year with at least 2 replies - Automatically extract the textual content Family Planning 1. Websites texts - Query the de-personalized Google.com with various queries - Harvest the first 100 results - Manually clean the results (553 seed-URLs) - Harvest the source-code of the 553 URLs and extract the textual content 2. Wikipedia - Select the most relevant Wikipedia pages related to family planning - Extract the complete edit history of the page via the Wikipedia API
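The Wikipedia step (extracting a page's complete edit history via the API) can be sketched as follows. This is a minimal illustration, not the project's actual script: it assumes the English Wikipedia endpoint and the standard MediaWiki `action=query&prop=revisions` module, and it parses a hand-written sample response rather than live data. The helper names `revisions_url` and `parse_revisions` are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: the English Wikipedia endpoint; the slides do not name one.
API_URL = "https://en.wikipedia.org/w/api.php"


def revisions_url(title, rvcontinue=None):
    """Build a MediaWiki API URL asking for one batch of a page's revisions."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": "max",
        "format": "json",
    }
    if rvcontinue:
        # Token returned by the previous batch; requests the next one.
        params["rvcontinue"] = rvcontinue
    return API_URL + "?" + urlencode(params)


def parse_revisions(payload):
    """Return (revisions, continuation token or None) from one JSON response."""
    data = json.loads(payload)
    revisions = []
    for page in data.get("query", {}).get("pages", {}).values():
        revisions.extend(page.get("revisions", []))
    token = data.get("continue", {}).get("rvcontinue")
    return revisions, token


# Hand-written sample response for illustration only (not real data).
SAMPLE = json.dumps({
    "continue": {"rvcontinue": "20120101000000|123", "continue": "||"},
    "query": {"pages": {"1": {"title": "Family planning", "revisions": [
        {"revid": 2, "parentid": 1, "user": "ExampleUser",
         "timestamp": "2012-07-01T10:00:00Z", "comment": "example edit"},
    ]}}},
})
```

To collect a full history, one would fetch `revisions_url(title)`, parse it, and repeat with the returned token until it is `None`.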
  • 18. 4. Data extraction and cleaning. C-section: 1. Website hyperlinks - Query the de-personalized Google.com with various queries (C-section, Cs delivery, Surgical delivery, Abdominal delivery, Cesarean delivery, Caesarean delivery, Operative delivery, Caesarean, Cesarean) - Harvest the first 100 results of each query - Manually clean the results (366 seed-URLs) 2. Website texts - Harvest the source code of the 366 seed-URLs - Extract the textual content (through Url2Text)
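The harvesting step above (nine queries, first 100 results each, then cleaning) implies merging overlapping result lists before any manual review. A minimal sketch of that merge, assuming the result URLs have already been collected by some other means; the normalization rule (lowercase host, drop `www.` and trailing slash) and the function names are illustrative assumptions, and the manual cleaning that led to the 366 seed-URLs is not reproduced here.

```python
from urllib.parse import urlsplit

# Query list as given on the slide.
QUERIES = [
    "C-section", "Cs delivery", "Surgical delivery", "Abdominal delivery",
    "Cesarean delivery", "Caesarean delivery", "Operative delivery",
    "Caesarean", "Cesarean",
]
RESULTS_PER_QUERY = 100  # "first 100 results of each query"


def normalize(url):
    """Reduce a URL to a comparison key (assumed rule, for illustration)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(results_by_query):
    """Merge per-query result lists, keeping the first URL seen for each key."""
    seen = {}
    for urls in results_by_query.values():
        for url in urls[:RESULTS_PER_QUERY]:
            seen.setdefault(normalize(url), url)
    return list(seen.values())
```

With nine queries this yields at most 900 candidate URLs; the merged list is what a human reviewer would then clean by hand.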
  • 19. 5. Data treatment. C-section: 1. Website hyperlinks - Crawl the 366 seed-URLs through Hyphe (https://github.com/medialab/Hypertext-Corpus-Initiative/) - Manually and automatically clean the neighboring URLs (614 URLs) - Extract the hyperlink network through Hyphe 2. Website texts - Categorize the 366 seed-URLs by Theme (Pros, cons and worries; Involvement of the father; Ethical issues) and by Type (Health care providers; Government; Medical and scientific groups; NGOs / non-profit; Feminist groups / female associations / moms; Rights groups; Natural & holistic delivery groups; Media; Blogs & Discussions; Institutions; Hospitals & Clinics; Products) - Extract the noun phrases through Pattern (www.clips.ua.ac.be/pattern) - Cluster and clean the noun phrases (through Google Refine)
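The hyperlink-network extraction was done with Hyphe. As a rough stand-in for what such an extraction produces, the sketch below uses only the Python standard library to turn one page's HTML into site-to-site edges. It is not Hyphe's algorithm: treating a "site" as the host name minus `www.` is a simplifying assumption, and `LinkCollector`, `site_of` and `site_edges` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def site_of(url):
    """Map a URL to a site label: lowercase host without a leading 'www.'."""
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def site_edges(page_url, html):
    """Return the set of (source site, target site) links found in one page."""
    collector = LinkCollector()
    collector.feed(html)
    source = site_of(page_url)
    edges = set()
    for href in collector.hrefs:
        # Resolve relative links against the page URL before reading the host.
        target = site_of(urljoin(page_url, href))
        if target and target != source:  # skip mailto:, internal links, etc.
            edges.add((source, target))
    return edges
```

Running `site_edges` over the harvested source code of every seed URL and taking the union of the results gives an edge list that can then be cleaned and explored.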
  • 27. 6. Exploration. C-section: 1. Website hyperlinks - Visual network analysis in Gephi (gephi.org) - Egocenter heatmaps in Heatgraph (tools.medialab.sciences-po.fr/heatgraph/) 2. Website texts - Language intake analysis in Sven (sven.densitydesign.org)
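Before visual analysis in Gephi, an edge list has to be handed over in a format Gephi can read; its spreadsheet importer accepts a CSV with `Source` and `Target` columns. The sketch below writes such a file and also counts incoming links per site. Using in-degree toward a given site as a crude indicator of how widely it is cited is an assumption made here for illustration, not the measure used in the project.

```python
import csv
import io
from collections import Counter


def edges_to_csv(edges):
    """Serialize (source, target) pairs as a Gephi-style edge table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    for source, target in sorted(edges):
        writer.writerow([source, target])
    return buffer.getvalue()


def in_degree(edges):
    """Count how many distinct edges point at each target site."""
    return Counter(target for _, target in set(edges))
```

The string returned by `edges_to_csv` can be saved to a `.csv` file and loaded through Gephi's spreadsheet import.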
  • 34. SVEN http://sven.densitydesign.org A) Pros, cons and worries B) Involvement of the father C) Health care providers D) Government E) Ethical issues F) Medical and scientific groups G) NGOs / non-profit H) Feminist groups / female associations / moms I) Rights groups J) Natural & holistic delivery groups K) Media L) Blogs & Discussions M) Institutions N) Hospitals & Clinics O) Products
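Comparing the expressions used by different types of websites comes down to tallying noun phrases per category. The sketch below shows one simple way to do that, the share of each phrase within each category. It is not a description of how Sven works; the function name and the sample input in the usage note are hypothetical.

```python
from collections import Counter, defaultdict


def phrase_shares(documents):
    """Compute, per category, each phrase's share of that category's phrases.

    documents: iterable of (category, list of noun phrases) pairs.
    Returns {category: {phrase: share between 0 and 1}}.
    """
    counts = defaultdict(Counter)
    for category, phrases in documents:
        counts[category].update(p.lower() for p in phrases)
    shares = {}
    for category, counter in counts.items():
        total = sum(counter.values())
        shares[category] = {p: n / total for p, n in counter.items()}
    return shares
```

For instance, calling `phrase_shares([("Media", ["risk", "elective surgery"]), ("Hospitals & Clinics", ["recovery"])])` returns one dictionary of shares per category, which can then be compared across categories.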
  • 35. Other datasets and methods. C-section: 1. Website hyperlinks - cartography of the topology of the hyperlink connections - analysis of the penetration of the WHO messages 2. Website texts - analysis of the expressions used by the different types of websites 3. Online images - analysis of the representations used by the different types of websites 4. Discussions in AuFeminin - analysis of the agenda of the online discussion. Family Planning: 1. Website texts - analysis of the expressions used by the different types of websites 2. Wikipedia - analysis of the edit history of the pages on family planning
  • 36. What remains to do 1. Briefing with the issue experts 2. Draft of possible mapping approaches (by students) 3. Choice of datasets and methods with the issue experts 4. Data extraction and cleaning 5. Data treatment 6. Exploration and definition of specific research questions 7. Sketch of data visualizations 8. Meeting between data experts and designers 9. Data refinement 10. Development of the visualizations 11. Interpretation of the visualizations with the issue experts 12. Development of the public atlas
  • 37. What we have learned 1. Digital methods are not easier or quicker 2. More data always entails more noise 3. Results quality depends heavily on data cleaning 4. No a priori distinction exists between noise and information 5. An iterative approach is necessary 6. Exchanges with experts and expertise building are necessary 7. Digital methods are a form of field work
  • 38. tommasoventurini.it Venturini, T. (2012). Great expectations: méthodes quali-quantitative et analyse des réseaux sociaux. In J.-P. Fourmentraux (Ed.), L’Ère Post-Media. Humanités digitales et Cultures numériques (Vol. 104, pp. 39–51). Paris: Hermann. Venturini, T., & Latour, B. (2010). Le tissu social : trace numérique et méthodes quali-quantitatives. Proceedings of Future En Seine 2009. Paris: Editions Futur en Seine. Latour, B., Jensen, P., Venturini, T., Grauwin, S., & Boullier, D. (2012). “The Whole Is Always Smaller Than Its Parts”: A Digital Test of Gabriel Tarde’s Monads. British Journal of Sociology, 63(4), 591–615.