BotNetBM

          A Benchmark for Social Networks


                                       CWI
                            Project Meeting@Innsbruck
                              Feb 28 - Mar 04, 2011




Wednesday, March 02, 2011
Motivation
     —   Highly linked data

     —   No (good) benchmark yet for social
          networks




BotNetBM
     —   A benchmark for social networks

     —   Simulates an RDF OLTP backend

     —   Simulates random activities of a large number of users

     —   Simulates on-site “analyst” ➠ weekly
          “analytic report”

     —   One parameter: scale (number of user accounts at
          benchmark start)


BotNetBM Queries
     —   SPARQL 1.1 + SPARUL

     —   User Actions

          ◦ Interactive queries (80%)

          ◦ Update transactions (20%)

     —   Measurement: successful #clicks/min.

          ◦ Transactions must commit; penalty for > 3 sec.

          ◦ Interactive queries: response time < 3 sec.

     —   Analytic queries (must finish within simulated weekend)
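The measurement rule above can be sketched in Python. This is only an illustration under stated assumptions: the slide does not define the penalty for slow transactions, so the sketch simply does not count them, and the names `Click`, `is_successful`, and `successful_clicks_per_minute` are hypothetical.

```python
from dataclasses import dataclass

LIMIT_SEC = 3.0  # the 3-second threshold named on the slide


@dataclass
class Click:
    kind: str               # "query" (interactive) or "update" (transaction)
    seconds: float          # measured response or commit time
    committed: bool = True  # only meaningful for updates


def is_successful(click: Click) -> bool:
    """Decide whether one click counts towards the score.

    Assumption: an update that takes longer than 3 seconds is not counted
    at all; the slide only says it is penalised.
    """
    if click.kind == "update":
        return click.committed and click.seconds <= LIMIT_SEC
    return click.seconds < LIMIT_SEC


def successful_clicks_per_minute(clicks, wall_clock_seconds: float) -> float:
    """Successful clicks, normalised to a one-minute interval."""
    ok = sum(1 for c in clicks if is_successful(c))
    return ok * 60.0 / wall_clock_seconds


sample = [
    Click("query", 1.0),                    # counts
    Click("query", 3.5),                    # too slow
    Click("update", 2.0),                   # counts
    Click("update", 2.0, committed=False),  # did not commit
]
score = successful_clicks_per_minute(sample, wall_clock_seconds=60.0)
```

A real driver would also have to enforce the 80%/20% split between interactive queries and update transactions; that part is left out here.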

Limitations
     —   Data generator: too uniform, not realistic for social networks

          ◦ 10 operations / user / simulated day

          ◦ all users are equally active

          ◦ some queries have no “meaningful” relation to each other

          ◦ read/write contention unrealistically frequent
          ◦ ...

     —   Query mix:

          ◦ Does not exploit SPARQL 1.1 advanced features
          ◦ No link to other RDF datasets

     —   Queries do not run on the open-source edition of Virtuoso Server

Our Goals
     —   Exploit SPARQL 1.1 features in queries

          ◦ “Property Path Expressions”

     —   Add links to well-known RDF data sets into the queries

          ◦ DBpedia

     —   Use real-life analysis info (e.g., Twitter)

          ◦ redesign data generator

          ◦ distribution of interactive/update queries

     —   Use real-life social network data

          ◦ Twitter, Facebook, Orkut, MySpace, ...

     —   Migration to MonetDB

Done
     —   Loaded into the Virtuoso Server (commercial ed.)

     —   Design of new query mix

     —   Twitter datasets

          ◦ http://infochimps.com/collections/twitter-census

          ◦ http://an.kaist.ac.kr/traces/WWW2010.html

          ◦ http://snap.stanford.edu/data/twitter7.html

          ◦ http://twitter.mpi-sws.org/

     —   Analysis information

          ◦ “The Man Your Man Could Smell Like: Twitter Analytics Report”

          ◦ “Characterizing user behavior in online social networks”

          ◦ “User Interactions in Social Networks and their Implications”


Interactive & Analytic Queries

     Q1 - Q8: Information of Profiles & Friends
     1.   Find all users whose first names contain a particular string, e.g., “Minh”.

     2.   Return the names of people who study at the same school and are the same age as a given user. These
          people may be the user's classmates.

     3.   Find people who studied at the same school and are connected to you by a path of friend relationships.
          (Use a SPARQL 1.1 “Property Path Expression” with an arbitrary-length path.)

     4.   Find all friends who like an action movie starring Tom Cruise. (Use DBpedia for information about
          the movie and the actor Tom Cruise.)

     5.   Find all people living in a specific location, e.g., Amsterdam, who can be reached from a user in at
          most 3 friend-relationship steps.

     6.   Show all of your friends who live in Europe. (Use information from DBpedia, for example that
          Amsterdam and London are cities in Europe.)

     7.   Find the top-10 suggested friends for a user: people who are not currently your friends but are friends
          of many of your friends. (Get all friends of your friends and order them by the number of people in
          your friends list who are connected to them.)

     8.   Return all users who have not joined a specific group but have more than 5 friends who joined it.
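To make the intended semantics of queries 5 and 7 concrete, here is a minimal Python sketch over an in-memory friendship graph. It is not the benchmark's SPARQL, and the function names and toy graph are hypothetical; it only shows the bounded traversal and the ranking by mutual friends that the queries describe.

```python
from collections import Counter, deque


def suggest_friends(friends, user, k=10):
    """Query 7: rank non-friends by how many of `user`'s friends know them."""
    mine = friends.get(user, set())
    counts = Counter()
    for f in mine:
        for candidate in friends.get(f, set()):
            if candidate != user and candidate not in mine:
                counts[candidate] += 1
    # Highest mutual-friend count first; ties broken by name for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def within_steps(friends, user, max_steps=3):
    """Query 5 (traversal part): everyone reachable in at most `max_steps` hops."""
    seen = {user}
    reached = set()
    frontier = deque([(user, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_steps:
            continue  # do not expand beyond the step limit
        for nxt in friends.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                reached.add(nxt)
                frontier.append((nxt, depth + 1))
    return reached


# Toy undirected friendship graph (made-up data).
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat", "fay"},
    "eve": {"bob"},
    "fay": {"dan", "gus"},
    "gus": {"fay", "hal"},
    "hal": {"gus"},
}
suggestions = suggest_friends(graph, "ann")
nearby = within_steps(graph, "ann", max_steps=3)
```

Query 5 would additionally filter the reached people by location; the sketch leaves that out because it depends on the profile schema.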



Interactive & Analytic Queries

     Q9 - Q14: Posts or Tweets
     9.   Show the 10 latest posts/tweets from your friends or from friends of your friends. (Order by posting time.)

     10. Show active posts/tweets: the 10 most recently commented posts/tweets from your friends. (Order
         by the timestamp of the last comment on each post.)

     11. Return the top-10 most interesting posts from your friends: order first by the number of
         “likes” (or, in Twitter, the number of “re-tweets”) on each post, then by the number of
         comments.

     12. Return all posts about an event (e.g., unrest in Tunisia) from the last 10 days. (Use the
         hash tags if they are available. If a post has no tags, check whether its content
         contains the search terms for the event.)

     13. Show all posts about a specific location, e.g., Egypt, from the last 10 days. (Use information
         from DBpedia to identify the location of a post. For example, Cairo is the capital of Egypt and
         Tahrir Square is in Cairo.)

     14. Find the number of inactive users: users who activated their accounts at least 30 days ago but
         have never posted, or users who have not posted for 60 days.



Interactive & Analytic Queries

     Q15 - Q17: Hash tags
     15.Show all photos posted by my friends in which I was tagged.

     16.Find the top-10 friends, or friends of friends, who share common
        interests with you. (Based on the similarity between the tags in
        your posts and the tags in their posts.)

     17.What are the current hottest events/problems? (Get the hash tags
        from posts and order them by the number of appearances in the
        last 10 days.)
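Query 17 amounts to a windowed frequency count over hash tags. A minimal Python sketch follows, assuming posts are available as (timestamp, text) pairs and that tags are matched case-insensitively; both are assumptions, and `hottest_tags` is a hypothetical name.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

HASHTAG = re.compile(r"#(\w+)")


def hottest_tags(posts, now, days=10, k=10):
    """Count hash tag appearances in posts from the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for posted_at, text in posts:
        if posted_at >= cutoff:
            # Every appearance counts, including repeats within one post.
            counts.update(tag.lower() for tag in HASHTAG.findall(text))
    # Most frequent first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


now = datetime(2011, 3, 2, 12, 0)
posts = [
    (datetime(2011, 3, 1, 9, 0), "Protests in #Tunisia and #Egypt"),
    (datetime(2011, 2, 28, 18, 30), "#egypt update"),
    (datetime(2011, 2, 1, 8, 0), "#Egypt old news"),  # outside the window
    (datetime(2011, 2, 25, 8, 0), "no tags here"),
]
hot = hottest_tags(posts, now)
```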




Interactive & Analytic Queries

     Q18 - Q19: other information
     18.Which area is the most active? (Order locations by the total number of
        posts in each over the last 5 days.)

     19.Return the top-10 locations with the fastest growth in the
        number of users. (Count the people who joined more than 10
        days ago and those who joined during the last 10 days, then
        compute the growth rate.)
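The slide does not give a formula for the growth rate in query 19. One plausible reading, sketched below in Python, is the number of recent joiners divided by the number of earlier joiners per location. That ratio, the choice to skip locations with no earlier users, and the name `fastest_growing` are all assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta


def fastest_growing(users, today, window_days=10, k=10):
    """Rank locations by (joined in window) / (joined before window)."""
    cutoff = today - timedelta(days=window_days)
    before = defaultdict(int)
    recent = defaultdict(int)
    for location, joined in users:
        if joined < cutoff:
            before[location] += 1
        else:
            recent[location] += 1
    # Assumption: locations with no earlier users are skipped, because the
    # ratio is undefined for them.
    rates = {loc: recent.get(loc, 0) / n for loc, n in before.items()}
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


today = date(2011, 3, 2)
users = [
    ("Amsterdam", date(2011, 1, 10)),
    ("Amsterdam", date(2011, 2, 25)),
    ("Amsterdam", date(2011, 3, 1)),
    ("Cairo", date(2010, 12, 1)),
    ("Cairo", date(2010, 12, 15)),
    ("Cairo", date(2011, 2, 28)),
    ("Innsbruck", date(2011, 3, 1)),  # no earlier users, so skipped
]
growth = fastest_growing(users, today)
```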




SPARQL/Update Queries
     1. Update user profile

     2. Posts/Tweets:

           2.1. Add a post (Popularity: high)

           2.2. Remove a post (Popularity: low)

           2.3. Add tags for your friends

           2.4. Add/Remove a comment

     3. Friends

           3.1. Add a friend (Popularity: high)

           3.2. Remove a friend (Popularity: low)

SPARQL/Update Queries
     4. Group, Event

           4.1. Join/Leave a group/event

           4.2. Add/Delete post in the group/event

     5. Photos

           5.1. Add/Delete a photo

           5.2. Add/Remove tags in the photo

           5.3. Add/Remove a comment
           5.4. Remove tags of me from all of my friends' pictures

                            Feb 28 - Mar 4, 2011 ProjectMeeting@Innsbruck


Wednesday, March 02, 2011

Weitere ähnliche Inhalte

Andere mochten auch (9)

Privacy-Preserving Schema Reuse
Privacy-Preserving Schema ReusePrivacy-Preserving Schema Reuse
Privacy-Preserving Schema Reuse
 
Dl2014 slides
Dl2014 slidesDl2014 slides
Dl2014 slides
 
Arrays in database systems, the next frontier?
Arrays in database systems, the next frontier?Arrays in database systems, the next frontier?
Arrays in database systems, the next frontier?
 
Exposing Real World Information for the Web of Things
Exposing Real World Information for the Web of ThingsExposing Real World Information for the Web of Things
Exposing Real World Information for the Web of Things
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
 
Tractor Pulling on Data Warehouse
Tractor Pulling on Data WarehouseTractor Pulling on Data Warehouse
Tractor Pulling on Data Warehouse
 
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching NetworksOn Leveraging Crowdsourcing Techniques for Schema Matching Networks
On Leveraging Crowdsourcing Techniques for Schema Matching Networks
 
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data StreamsEfficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
 
Planetdata
PlanetdataPlanetdata
Planetdata
 

Ähnlich wie BotNetBenchmark - A Benchmark for Social Network

Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook:  case study of BlogTOPredicting what gets ‘Likes’ on Facebook:  case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOToronto Metropolitan University
 
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOPredicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOPriya Kumar
 
Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar Symeon Papadopoulos
 
Social media as a tool for terminological research
Social media as a tool for terminological researchSocial media as a tool for terminological research
Social media as a tool for terminological researchTERMCAT
 
Flux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semesterFlux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semesterthomas alisi
 
Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...John Domingue
 
2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emkeDr Martina Emke
 
Social Media Analysis... according to Net7
Social Media Analysis... according to Net7Social Media Analysis... according to Net7
Social Media Analysis... according to Net7Net7
 
Analyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogsAnalyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogsStefan Sommer
 
Text mining on Twitter information based on R platform
Text mining on Twitter information based on R platformText mining on Twitter information based on R platform
Text mining on Twitter information based on R platformFayan TAO
 
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social NetworksBang Hui Lim
 
Anticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsAnticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsMatthew Rowe
 
Mapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of TwitterMapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of TwitterAxel Bruns
 
IAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptxIAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptxssuseraae9cd
 
Open Annotation Collaboration Briefing
Open Annotation Collaboration BriefingOpen Annotation Collaboration Briefing
Open Annotation Collaboration BriefingTimothy Cole
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteDeep Kayal
 
Accessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptxAccessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptxLadduAnanu
 

Ähnlich wie BotNetBenchmark - A Benchmark for Social Network (20)

Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook:  case study of BlogTOPredicting what gets ‘Likes’ on Facebook:  case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
 
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOPredicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
 
Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar Social Media Crawling & Mining Seminar
Social Media Crawling & Mining Seminar
 
CSE509 Lecture 5
CSE509 Lecture 5CSE509 Lecture 5
CSE509 Lecture 5
 
Social media as a tool for terminological research
Social media as a tool for terminological researchSocial media as a tool for terminological research
Social media as a tool for terminological research
 
Flux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semesterFlux of MEME - DOW 1st semester
Flux of MEME - DOW 1st semester
 
Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...Developing rich interactive eBooks to teach linked open data to professionals...
Developing rich interactive eBooks to teach linked open data to professionals...
 
2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke2016 09-28 social network analysis with node-xl_emke
2016 09-28 social network analysis with node-xl_emke
 
Social Media Analysis... according to Net7
Social Media Analysis... according to Net7Social Media Analysis... according to Net7
Social Media Analysis... according to Net7
 
Analyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogsAnalyzing customer sentiments in microblogs
Analyzing customer sentiments in microblogs
 
Text mining on Twitter information based on R platform
Text mining on Twitter information based on R platformText mining on Twitter information based on R platform
Text mining on Twitter information based on R platform
 
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
 
Anticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsAnticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community Forums
 
Social Media in Japan (Panel in Blogtalk2009)
Social Media in Japan (Panel in Blogtalk2009)Social Media in Japan (Panel in Blogtalk2009)
Social Media in Japan (Panel in Blogtalk2009)
 
Mapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of TwitterMapping Online Publics: Researching the Uses of Twitter
Mapping Online Publics: Researching the Uses of Twitter
 
IAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptxIAT334-Lec04-DesignIdeasPrinciples.pptx
IAT334-Lec04-DesignIdeasPrinciples.pptx
 
Twitter in Academic Conferences
Twitter in Academic ConferencesTwitter in Academic Conferences
Twitter in Academic Conferences
 
Open Annotation Collaboration Briefing
Open Annotation Collaboration BriefingOpen Annotation Collaboration Briefing
Open Annotation Collaboration Briefing
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ Deloitte
 
Accessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptxAccessing and analysing your own social media data.pptx
Accessing and analysing your own social media data.pptx
 

Mehr von PlanetData Network of Excellence

A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoPlanetData Network of Excellence
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingPlanetData Network of Excellence
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamPlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...PlanetData Network of Excellence
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchPlanetData Network of Excellence
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSPlanetData Network of Excellence
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...PlanetData Network of Excellence
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsPlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...PlanetData Network of Excellence
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsPlanetData Network of Excellence
 

Mehr von PlanetData Network of Excellence (20)

A Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about TrentinoA Contextualized Knowledge Repository for Open Data about Trentino
A Contextualized Knowledge Repository for Open Data about Trentino
 
Towards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory SensingTowards Enabling Probabilistic Databases for Participatory Sensing
Towards Enabling Probabilistic Databases for Participatory Sensing
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
 
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
 
CLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data ArchitectureCLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data Architecture
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of Endpoints
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
 
OntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image CollectionsOntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image Collections
 

Kürzlich hochgeladen

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
BotNetBenchmark - A Benchmark for Social Network

  • 1. BotNetBM A Benchmark for Social Network CWI Project Meeting@Innsbruck Feb 28 - Mar 04, 2011 Wednesday, March 02, 2011
  • 2. Motivation
     — Highly linked data
     — No (good) benchmark yet for social networks
  • 3. BotNetBM
     — A benchmark for social networks
     — Simulates an RDF OLTP backend
     — Simulates random activities of large #users
     — Simulates on-site “analyst” ➠ weekly “analytic report”
     — One parameter: scale (#user accounts to start BM)
  • 4. BotNetBM Queries
     — SPARQL 1.1 + SPARUL
     — User Actions
          ◦ Interactive queries (80%)
          ◦ Update transactions (20%)
     — Measurement: successful #clicks/min.
          ◦ Transactions commit, penalty for > 3 sec.
          ◦ Interactive queries response time < 3 sec.
     — Analytic queries (must finish within simulated weekend)
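The measurement rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say what the "penalty" for a slow transaction is, so the sketch simply treats any click that takes 3 seconds or longer as unsuccessful. The names `Click`, `is_successful`, and `clicks_per_minute` are hypothetical and not part of BotNetBM.

```python
from dataclasses import dataclass

LIMIT_SEC = 3.0  # response-time threshold taken from the slide


@dataclass
class Click:
    kind: str               # "query" (interactive) or "update" (transaction)
    latency_sec: float      # measured response time
    committed: bool = True  # only meaningful for update transactions


def is_successful(click: Click) -> bool:
    """A click counts if it finished under the limit; updates must also commit.

    Assumption: the unspecified "penalty" for slow transactions is modelled
    as simply not counting the click.
    """
    if click.kind == "update" and not click.committed:
        return False
    return click.latency_sec < LIMIT_SEC


def clicks_per_minute(clicks, wall_clock_sec: float) -> float:
    """Successful clicks normalised to a one-minute window."""
    ok = sum(1 for c in clicks if is_successful(c))
    return ok * 60.0 / wall_clock_sec


sample = [
    Click("query", 1.0),
    Click("query", 3.5),                     # too slow
    Click("update", 0.5, committed=True),
    Click("update", 0.5, committed=False),   # rolled back
]
score = clicks_per_minute(sample, wall_clock_sec=60.0)
```

A real driver would presumably weight slow transactions rather than drop them; the binary rule here is the simplest reading of the slide.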
  • 5. Limitations
     — Data generator: too uniform, not realistic for social networks
          ◦ 10 operations / user / simulated day
          ◦ all users are equally active
          ◦ some queries have no “meaningful” relation to each other
          ◦ read/write contention unrealistically frequent
          ◦ ...
     — Query mix:
          ◦ Does not exploit SPARQL 1.1 advanced features
          ◦ No link to other RDF datasets
     — Queries do not run with the open source ed. of Virtuoso Server
  • 6. Our Goals
     — Exploit SPARQL 1.1 features in queries
          ◦ “Property Path Expressions”
     — Add links to well-known RDF data sets into the queries
          ◦ DBpedia
     — Use real-life analysis info (e.g., Twitter)
          ◦ redesign data generator
          ◦ distribution of interactive/update queries
     — Use real-life social network data
          ◦ Twitter, Facebook, Orkut, MySpace, ...
     — Migration to MonetDB
  • 7. Done
     — Loaded into the Virtuoso Server (commercial ed.)
     — Design of new query mix
     — Twitter datasets
          ◦ http://infochimps.com/collections/twitter-census
          ◦ http://an.kaist.ac.kr/traces/WWW2010.html
          ◦ http://snap.stanford.edu/data/twitter7.html
          ◦ http://twitter.mpi-sws.org/
     — Analysis information
          ◦ “The Man Your Man Could Smell Like: Twitter Analytics Report”
          ◦ “Characterizing user behavior in online social networks”
          ◦ “User Interactions in Social Networks and their Implications”
  • 8. Interactive & Analytic Queries
     Q1 - Q8: Information of Profiles & Friends
     1. Find all users whose first names contain a particular string, e.g., “Minh”.
     2. Return the names of people who study at the same school and have the same age as a user. These people may be the user's classmates.
     3. Find people who studied at the same school and are connected to you by a path of friend relationships. (Use the “Property Path Expression” in SPARQL 1.1 with an arbitrary-length path.)
     4. Find all friends who like an action movie starring Tom Cruise. (Use the information from DBpedia for the movie and the actor Tom Cruise.)
     5. Find all people living in a specific location, e.g., Amsterdam, who can be reached from a user by at most 3 steps of the friend relationship.
     6. Show all of your friends who are living in Europe. (Use the information from DBpedia. For example, Amsterdam is a city in Europe, London is a city in Europe.)
     7. Find the top-10 suggested friends for a user: people who are currently not your friends but are friends of many of your friends. (Get all friends of your friends and order them by the number of people in your friends list who are connected to them.)
     8. Return all users who have not joined a specific group but have more than 5 friends who joined the group.
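The logic of Q7 (friend suggestions) can be illustrated over a plain in-memory adjacency map. This is a sketch of the ranking rule described in the query text, not the benchmark's SPARQL formulation; the function name `suggest_friends`, the dictionary layout, and the alphabetical tie-break are assumptions made for the example.

```python
from collections import Counter


def suggest_friends(friends, user, k=10):
    """Rank non-friends of `user` by the number of mutual friends.

    `friends` maps each user to the set of users they are friends with
    (hypothetical layout). Ties are broken alphabetically, which is an
    assumption; the slide does not specify a tie-break.
    """
    mine = friends.get(user, set())
    counts = Counter()
    for friend in mine:
        for candidate in friends.get(friend, set()):
            # Skip the user themselves and people who are already friends.
            if candidate != user and candidate not in mine:
                counts[candidate] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


graph = {
    "a": {"b", "c"},
    "b": {"a", "d", "e"},
    "c": {"a", "d"},
    "d": {"b", "c"},
    "e": {"b"},
}
suggestions = suggest_friends(graph, "a")
```

Here `d` is reachable through both `b` and `c`, so it ranks ahead of `e`, which is reachable only through `b`.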
  • 9. Interactive & Analytic Queries
     Q9 - Q14: Posts or Tweets
     9. Show the 10 latest posts/tweets from your friends or from their friends. (Order by posting time.)
     10. Show active posts/tweets: the 10 most recently commented posts/tweets from your friends. (Order by the timestamp of the last comment on each post.)
     11. Return the top-10 most interesting posts from your friends: order first by the number of “likes” (or, in Twitter, the number of “re-tweets”) on the posts from your friends, then by the number of comments.
     12. Return all posts about an event (e.g., unrest in Tunisia) from the last 10 days. (Based on the hash tags if they are available. If no tag appears in the post, check whether the content of the post contains the terms of the searched event.)
     13. Show all posts about a specific location, e.g., Egypt, from the last 10 days. (Use the information from DBpedia to identify the location of the post. For example, Cairo is the capital of Egypt, Tahrir Square is in Cairo.)
     14. Find the number of inactive users: all users activated for at least 30 days who have never posted, or all users who have not posted for 60 days.
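The two inactivity conditions in Q14 are easy to misread, so a small sketch may help. It assumes each user has an activation timestamp and, optionally, the timestamp of their most recent post; the function name `count_inactive` and both dictionaries are hypothetical.

```python
from datetime import datetime, timedelta


def count_inactive(activated, last_post, now):
    """Count users matching either Q14 condition.

    activated: user -> activation time (hypothetical layout)
    last_post: user -> time of most recent post; absent if never posted
    """
    inactive = 0
    for user, since in activated.items():
        last = last_post.get(user)
        if last is None:
            # Never posted: inactive once the account is at least 30 days old.
            if now - since >= timedelta(days=30):
                inactive += 1
        elif now - last >= timedelta(days=60):
            # Has posted before, but not within the last 60 days.
            inactive += 1
    return inactive


now = datetime(2011, 3, 2)
activated = {
    "u1": datetime(2011, 1, 1),   # 60 days old, never posted
    "u2": datetime(2011, 2, 20),  # 10 days old, never posted
    "u3": datetime(2010, 1, 1),   # last post about 3 months ago
    "u4": datetime(2010, 1, 1),   # posted last week
}
last_post = {"u3": datetime(2010, 12, 1), "u4": datetime(2011, 2, 25)}
inactive_users = count_inactive(activated, last_post, now)
```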
  • 10. Interactive & Analytic Queries
     Q15 - Q17: Hash tags
     15. Show all photos posted by my friends in which I was tagged.
     16. Find the top-10 friends, or friends of friends, who share common interests with you. (Based on the similarity between the tags in your posts and the tags in their posts.)
     17. What are the current hottest events/problems? (Get the hash tags from posts and order them by the number of their appearances in the last 10 days.)
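Q17 amounts to a windowed frequency count over hash tags. The sketch below shows that computation over a list of `(timestamp, text)` pairs; the regular expression, the case-insensitive matching, and the name `hottest_tags` are assumptions for illustration and are not taken from the benchmark.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

HASHTAG = re.compile(r"#(\w+)")


def hottest_tags(posts, now, days=10, k=10):
    """Return the k most frequent hash tags among posts from the last `days` days.

    Tags are lower-cased before counting (an assumption, so that
    #Tunisia and #tunisia count as the same event).
    """
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for posted_at, text in posts:
        if cutoff <= posted_at <= now:
            counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


posts = [
    (datetime(2011, 3, 1), "#Tunisia unrest #news"),
    (datetime(2011, 2, 28), "more on #tunisia"),
    (datetime(2011, 1, 1), "#old story"),  # outside the 10-day window
]
hot = hottest_tags(posts, now=datetime(2011, 3, 2))
```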
  • 11. Interactive & Analytic Queries
     Q18 - Q19: Other information
     18. Which area is the most active? (Order by the total number of posts in each location in the last 5 days.)
     19. Return the top-10 locations with the fastest growth in the number of users. (Count the number of people who joined more than 10 days ago and those who joined during the last 10 days, and then compute the growth rate.)
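One possible reading of the growth rate in Q19 is the ratio of recent joiners to earlier joiners per location. The sketch below uses that reading; the slide does not give a formula, so the ratio, the handling of locations with no earlier users (they are skipped), and the name `fastest_growing` are all assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta


def fastest_growing(joins, now, days=10, k=10):
    """Rank locations by (users joined in the last `days`) / (users joined earlier).

    joins: iterable of (location, joined_at) pairs (hypothetical layout).
    Locations with no earlier users are skipped to avoid division by zero.
    """
    cutoff = now - timedelta(days=days)
    before, recent = Counter(), Counter()
    for location, joined_at in joins:
        if joined_at > cutoff:
            recent[location] += 1
        else:
            before[location] += 1
    rates = {loc: recent[loc] / n for loc, n in before.items()}
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


now = datetime(2011, 3, 2)
old, new = datetime(2011, 1, 1), datetime(2011, 3, 1)
joins = [
    ("Amsterdam", old), ("Amsterdam", old), ("Amsterdam", new),
    ("Innsbruck", old), ("Innsbruck", new), ("Innsbruck", new),
    ("Cairo", new),  # no earlier users, so skipped
]
growth = fastest_growing(joins, now)
```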
  • 12. SPARQL/Update Queries
     1. Update user profile
     2. Posts/Tweets:
          2.1. Add a post (Popularity: high)
          2.2. Remove a post (Popularity: low)
          2.3. Add tags for your friends
          2.4. Add/Remove a comment
     3. Friends
          3.1. Add a friend (Popularity: high)
          3.2. Remove a friend (Popularity: low)
  • 13. SPARQL/Update Queries
     4. Group, Event
          4.1. Join/Leave a group/event
          4.2. Add/Delete a post in the group/event
     5. Photos
          5.1. Add/Delete a photo
          5.2. Add/Remove tags in the photo
          5.3. Add/Remove a comment
          5.4. Remove tags of me from all the pictures of my friends