Using Tags and Clustering to Identify Topic-specific Blogs
Conor Hayes, Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland
Paolo Avesani, Bruno Kessler Institute (ITC-IRST), Trento, Italy
Outline
Tag clouds
The Long Tail
Data
Clustering
Clustering: Tags vs. Content
Partitioning the tag space
Tag frequency distribution per cluster
A-bloggers
Intrablog similarity: A- vs. C-blogs
Similarity to centroid: A- vs. C-blogs
A-bloggers are
Relevance?
Verification 2: by Google
Similarity to pages from Google
Consistency over time?
Blogger Entropy
Measured for the users of a cluster r between window win t and a later window win t+n, where:
q: number of clusters at win t+n containing users from cluster r
n_r^i: number of users from cluster r contained in cluster i at win t+n
n_r: number of users from cluster r available at win t+n
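The entropy formula itself did not survive on this slide; only the symbol definitions did. A minimal sketch of one reading that is consistent with those definitions: the Shannon entropy of the proportions n_r^i / n_r over the q clusters that receive users from cluster r at win t+n. Dividing by log q so that the value lies in [0, 1] is an assumption, not something the slide confirms, and the function name `blogger_entropy` is hypothetical.

```python
import math


def blogger_entropy(counts):
    """Entropy of the users of one cluster r, observed at window t+n.

    counts[i] is n_r^i: the number of users from cluster r found in
    cluster i at window t+n. Clusters with a zero count are ignored,
    so q is the number of clusters that actually contain such users.

    Assumption: the result is normalized by log(q) to fall in [0, 1].
    """
    nonzero = [c for c in counts if c > 0]
    n_r = sum(nonzero)  # users from cluster r still available at t+n
    q = len(nonzero)    # clusters at t+n containing users from r

    # With no users, or all users in a single cluster, there is no dispersion.
    if n_r == 0 or q <= 1:
        return 0.0

    h = -sum((c / n_r) * math.log(c / n_r) for c in nonzero)
    return h / math.log(q)


if __name__ == "__main__":
    print(blogger_entropy([10]))    # all users stay together
    print(blogger_entropy([5, 5]))  # users split evenly over two clusters
    print(blogger_entropy([9, 1]))  # most users stay together
```

Under this reading, a low value means the bloggers of cluster r are still grouped together at the later window, and a value near 1 means they have scattered evenly across clusters.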
Entropy: A-blogs vs. C-blogs
Example of A-blogs and C-blogs (panels: A-blogs, C-blogs)
Conclusions
References
Appendix
spam
