Matthew S. Weber
Rutgers University
@docmattweber
Presented At
130th Annual Meeting of
Big Data, Big Theory &
The Thread of Recent History
Credit: Flickr @ilovecology
Pekin Daily Times, Pekin, IL, October 8, 2013
Credit: Pekin Daily Times
What’s in the data?
Link Data:
Source | Destination | Date | Frequency | Content Type | Bytes | Content

http://gawker.com/5953665/mitt-romneys-staff-played-the-media-covering-them-in-a-friendly-game-of-flag-football
Mitt Romney's Staff Played the Media Covering Them in a Friendly Game of Flag
http://gawker.com
2012-10-22
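
As a rough illustration of the link-data layout above, the sketch below parses one pipe-delimited record into named fields. The field order follows the slide; the pipe delimiter as a storage format, the `LinkRecord` and `parse_link_record` names, and every sample value other than `http://gawker.com` and `2012-10-22` are assumptions made for this example, not details taken from the dataset.

```python
from dataclasses import dataclass

# Field order copied from the slide. Treating "|" as the on-disk delimiter
# is an assumption made for illustration.
FIELD_COUNT = 7


@dataclass
class LinkRecord:
    source: str
    destination: str
    date: str          # kept as text, e.g. "2012-10-22"
    frequency: int     # number of times the link was observed
    content_type: str
    size_bytes: int    # the slide's "Bytes" column
    content: str


def parse_link_record(line: str) -> LinkRecord:
    """Split one pipe-delimited line into a LinkRecord.

    The split is capped so that any "|" inside the final content
    field is preserved rather than treated as a delimiter.
    """
    parts = [p.strip() for p in line.split("|", maxsplit=FIELD_COUNT - 1)]
    if len(parts) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(parts)}")
    source, destination, date, frequency, content_type, size, content = parts
    return LinkRecord(source, destination, date, int(frequency),
                      content_type, int(size), content)


# Hypothetical record: only the source and the date come from the slide.
sample = ("http://gawker.com | http://example.org/story | 2012-10-22 | "
          "3 | text/html | 20480 | Example page text")
record = parse_link_record(sample)
```

Keeping the date as a string avoids guessing at the archive's timestamp format; it could be converted with `datetime.date.fromisoformat` if the values are known to be ISO dates.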
News Media on the Web
(Weber, Ognyanova, Kosterich & Nguyen, 2015)
NJ Local News: 2007 - 2012
[Figure: NJ.com Domain Analysis, 2007 to 2012. Series: Number of Pages (axis: Avg. Number of Webpages, 0 to 1000) and Avg MB (axis: Avg. MB per Webpage, 0 to 7).]
Dataset | Research Potential | Dates | Captures | Unique URLs
Hurricane Katrina | Online networks and organizational resilience (Chewning, Lai and Doerfel, 2012; Perry, Taylor and Doerfel, 2003) in the wake of disasters; information dissemination | 2003 – 2012 | 1,694,236 | 663,740
Superstorm Sandy | | 2003 – 2012 | 41,703,112 | 20,013,455
US Senate | Study the growth of political activity in online environments (Adamic & Glance, 2005; Bruns, 2007; Chang & Park, 2012); polarization & media discourse | 109th – 112th Congresses | 26,965,770 | 8,674,397
US House | | | 51,840,777 | 12,410,014
Occupy Wall Street | Previous research on NGOs in the online environment (Bach & Stark, 2004; Shumate, 2003, 2012; Shumate, Fulk, & Monge, 2005); use of hyperlink data to study the formation and role of alliances between SMOs | 2010 – 2012 | 247,928,272 | 11,3259,655
US Media | Previous studies of news media organizations (Greer & Mensing, 2006; Weber, 2012; Weber & Monge, In Press); focus on evolutionary patterns | 2008 – 2012 | 1,315,132,555 | 539,184,823
To what degree are large-scale datasets reliable?
[Figure: Potential vs. Actual URLs. Axes: t; Count of URLs (Count of Pages). Series: Potential, Actual, Difference.]
[Figure: Changes in Crawl Completeness. Axes: t; Count of URLs (Count of Pages). Series: OWS, House, Senate, Katrina.]
$b = \frac{\text{existing}}{\text{potential}}$
Set a unit of time for analysis, c, choosing n periods across a total time T.
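
The ratio of existing to potential pages can be computed per period once a unit of time has been chosen. The sketch below is only an illustration under stated assumptions: the function name `completeness_by_period` and the counts for n = 4 periods are hypothetical, and the slides do not say how zero-potential periods were handled (here they yield `None`).

```python
def completeness_by_period(existing, potential):
    """Return existing / potential for each period.

    Periods with a potential count of zero yield None instead of
    raising, since no ratio is defined for them.
    """
    if len(existing) != len(potential):
        raise ValueError("existing and potential must cover the same periods")
    return [e / p if p else None for e, p in zip(existing, potential)]


# Hypothetical counts for n = 4 periods (not taken from the slides).
existing_counts = [900, 800, 650, 500]
potential_counts = [1000, 1000, 1000, 1000]
ratios = completeness_by_period(existing_counts, potential_counts)
```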
In the ideal case, it would be possible to create a factor that corrects for data degradation: $b_t$
How does this help?
Each of the illustrated cases fits against an exponential function ~ $e^{bt}$, with b:
• Senate: 0.13
• House: 0.13
• Katrina: 0.02
• OWS: 0.10
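
The slides do not state how the exponential fits were obtained. One common approach, shown here purely as an assumption, is ordinary least squares on the logarithm of the counts, which estimates b in y ≈ a·e^{bt}. The function name `fit_exponential_rate` and the synthetic series are hypothetical; the series is generated with b = 0.13 only so that the recovered rate can be compared with a known value.

```python
import math


def fit_exponential_rate(t, y):
    """Fit y ~ a * exp(b * t) by least squares on ln(y); return (a, b)."""
    if len(t) != len(y) or len(t) < 2:
        raise ValueError("need at least two (t, y) pairs of equal length")
    if any(v <= 0 for v in y):
        raise ValueError("all y values must be positive to take logarithms")
    logs = [math.log(v) for v in y]
    n = len(t)
    mean_t = sum(t) / n
    mean_log = sum(logs) / n
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("t values must not all be identical")
    sxy = sum((ti - mean_t) * (li - mean_log) for ti, li in zip(t, logs))
    b = sxy / sxx                          # slope of the log-linear fit
    a = math.exp(mean_log - b * mean_t)    # intercept, back on the original scale
    return a, b


# Synthetic, noise-free series built with b = 0.13 (hypothetical data).
periods = list(range(10))
counts = [1000.0 * math.exp(0.13 * ti) for ti in periods]
a_hat, b_hat = fit_exponential_rate(periods, counts)
```

A log-linear fit weights relative rather than absolute errors, so it can differ from a direct nonlinear fit when the counts are noisy.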
Challenges are not unique to these data
Courtesy of Marc Smith, NodeXL
Research support from:
NSF Award #1244727; Additional support from the NetSCI Lab @ Rutgers


Analyzing Big Data from News Media and Social Movements


Editor's notes

  1. Emperor penguins… huddling together for survival... Population... Interacting in a large ecosystem with other animals.
  2. Emperor penguins… huddling together for survival... Population... Interacting in a large ecosystem with other animals.
  3. WhiteHouse.gov press release from May 1, 2003, archived on May 6, 2003
  4. WhiteHouse.gov press release from May 1, 2003, archived on October 1, 2003  
  5. July 14, 2006
  6. July 14, 2006
  7. February 25, 2011
  8. Correlations between outgoing link vectors to show profile similarities
  9. 20th Century Collection = 9 TB of metadata. Media Seed List = 4,891. For instance, researchers have proposed focusing archival efforts on capturing the data that changes most frequently, in order to capture the majority of new content [36]. Elsewhere, researchers have suggested that crawling strategies should prioritize archival efforts based on the size and relative position of websites within their larger ecosystems [37].
  10. Driscoll and Walker (2014): For instance, a comparison of Twitter data collected via a public API and data collected from a “fire hose” provided by GNIP PowerTrack found significant differences between the two datasets. In most cases the PowerTrack data proved to be more powerful,
  11. 3-month windows of time…