Enabling Semantic Analysis of User Browsing Patterns
in the Web of Data
M.Sc. Julia Hoxha
Institute of Applied Informatics and Formal Description Methods (AIFB)
Karlsruhe Institute of Technology

USEWOD Workshop @WWW2012
Lyon, France
KIT – University of the State of Baden-Württemberg and
National Laboratory of the Helmholtz Association

www.kit.edu
Paper
▪ Hoxha, J., Junghans, M., and Agarwal, S. (2012). Enabling Semantic Analysis of User Browsing Patterns in the Web of Data. In 2nd International Workshop on Usage Analysis and the Web of Data (USEWOD), 21st International World Wide Web Conference (WWW2012), Lyon, France, vol. CoRR, abs/1204.2713.
▪ http://arxiv.org/abs/1204.2713
Outline
▪ Introduction
▪ Framework for Behavior Analysis
▪ Semantic Modeling of Cross-site Browsing Behavior
▪ Web Browsing Activity Model (WAM)
▪ Formalization Approach
▪ Querying Behavioral Patterns
▪ Evaluation
▪ Conclusions

J. Hoxha – USEWOD Workshop, Lyon, 2012

Introduction
▪ Understanding user behavior in accessing Web resources helps site providers/domain experts:
• Discover user preferences or detect bottlenecks
• Build adaptive Web sites
• Make appropriate recommendations to users, etc.

▪ How to facilitate the analysis of usage patterns?
• Provide formal, semantic description of usage logs
• Offer techniques to expressively query patterns

HTTP Requests of Usage Logs
ID  Time           User Action
1   [17:11:49:21]  http://www.google.de/search?q=Lyon+www2012
1   [17:11:49:33]  http://dbpedia.org/page/Lyon
1   [17:11:49:39]  http://data.semanticweb.org/conference/www/2011/demo/a-demo-search-engine-for-products

[Figure: SWDF Domain Ontology. Node labels: swrc:Publication, swrc:Proceedings, InProceedings, swrc:ConferenceEvent, foaf:Person, dbpedia:PopulatedPlace, literal. Edge labels: isA, ns2:relatedToEvent, dc:creator, ns3:based_near, ns1:name.]

Modeling and Analysis Framework
[Figure: Modeling and Analysis Framework]
• Monitoring: a Web Browsing Behavior Monitoring System records the cross-site browsing activities of User 1 ... User n
• Browsing Activity Formalization: Selection (Target Data), Preprocessing (Preprocessed Data), Transformation (Transformed Data), Semantic Formalization by annotation with Domain Ontology (Semantic Activity Model: Event A, Event B, Event C, ..., Event K, Event N)
• Repository: Domain Ontologies, Semantic Activity Models
• Analysis: Querying Capabilities, Pattern Mining

User Session of browsing Events
s: <l1, l2, l3, ..., ln>
Event e1 = (A1, I1, t1): Type Ai = {content, function}, Input I1 = {i1,...,ik}, URL l1, Time t1
...
Event en = (An, In, tn): Type An, Input In = {i1,...,ik}, URL ln, Time tn

Definitions
▪ Event
• e = (l, T, P, t): l full URL invoked, T types, P parameters, t timestamp

▪ Event types
• Tc content type of an event
• Tf function type of an event

▪ Session
• s is an ordered sequence of events
• s = <e1, ..., en>, s.t. i is the event order in s
• Ts start time and Te end time of s

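The definitions above can be mirrored in a small data structure. The following is a minimal sketch, not the authors' implementation: the field names are hypothetical, the timestamps are illustrative, and taking Ts and Te to be the times of the first and last event is an assumption, since the slide's exact constraint on Ts and Te is not legible.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    url: str                   # l: full URL invoked
    content_types: frozenset   # Tc: content types of the event
    function_types: frozenset  # Tf: function types of the event
    params: tuple              # P: (name, value) pairs taken from the URL
    time: datetime             # t: timestamp


@dataclass
class Session:
    events: list = field(default_factory=list)

    def add(self, event):
        # Keep events ordered by timestamp so that list position
        # corresponds to the event order i in s.
        self.events.append(event)
        self.events.sort(key=lambda e: e.time)

    @property
    def start_time(self):
        # Assumption: Ts is the time of the first event.
        return self.events[0].time

    @property
    def end_time(self):
        # Assumption: Te is the time of the last event.
        return self.events[-1].time


# Illustrative data only; the date is made up.
t0 = datetime(2012, 4, 17, 11, 49, 21)
search = Event("http://www.google.de/search?q=Lyon+www2012",
               frozenset(), frozenset({"SearchEngine"}),
               (("q", "Lyon www2012"),), t0)
page = Event("http://dbpedia.org/page/Lyon",
             frozenset({"PopulatedPlace"}), frozenset(),
             (), t0 + timedelta(seconds=12))

session = Session()
session.add(page)
session.add(search)
```

Adding the events out of order still yields a session whose first element is the search request, because `add` re-sorts by timestamp.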
Web browsing Activity Model (WAM)
Example URLs:
http://www.avis.com/car-rental/reservation/start-reservation.ac?resForm.pickUpLocation=Lyon
http://data.semanticweb.org/person/julia-hoxha

[Figure: WAM ontology diagram]
• Classes: wam:Session, wam:Event, wam:StartEvent, wam:EndEvent, event:Event, wam:EventType, wam:FunctionType, wam:ContentType, wam:EventURL, wam:BaseURL, wam:Parameter, owa:ParameterName, wam:InputVariable, wam:OutputVariable, wam:User, time:TemporalEntity, time:Instant, time:Interval, Literal
• Properties: wam:hasEvent, wam:hasStartEvent, wam:hasEndEvent, wam:order, wam:hasTime, wam:inInterval, wam:hasUser, wam:userID, wam:userIP, wam:hasParameter, wam:hasInput, wam:hasName, wam:hasValue, wam:functionType, wam:contentType, wam:eventURL, wam:fullURL, wam:baseURL, rdfs:subClassOf
• Annotations: event types are based on function and content; a Domain Ontology is used for semantic enrichment

Prefixes:
wam: <http://greenlinkeddata.org/wam.owl#>
time: <http://www.w3.org/2006/time#>
event: <http://purl.org/NET/c4dm/event.owl#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>

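To make the vocabulary more concrete, here is a rough sketch of how one session could be written out as N-Triples using the WAM namespace. The class and property names are taken from the slide, but which subject each property attaches to is an assumption, because the diagram's edges did not survive. The session IRI under example.org and the helper names are hypothetical.

```python
WAM = "http://greenlinkeddata.org/wam.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    # Wrap an IRI in angle brackets, as N-Triples requires.
    return f"<{value}>"


def literal(value):
    # Plain string literal with backslashes and double quotes escaped.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def session_to_ntriples(session_iri, events):
    """events: list of (full_url, content_type_iri) in browsing order."""
    triples = [(iri(session_iri), iri(RDF_TYPE), iri(WAM + "Session"))]
    for order, (full_url, content_type) in enumerate(events, start=1):
        event = iri(f"{session_iri}/event{order}")
        # Assumed attachment points: properties hang directly off the event.
        triples += [
            (event, iri(RDF_TYPE), iri(WAM + "Event")),
            (iri(session_iri), iri(WAM + "hasEvent"), event),
            (event, iri(WAM + "order"), literal(order)),
            (event, iri(WAM + "fullURL"), literal(full_url)),
            (event, iri(WAM + "contentType"), iri(content_type)),
        ]
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical session IRI; the content type uses the FOAF namespace.
doc = session_to_ntriples(
    "http://example.org/session/1",
    [("http://data.semanticweb.org/person/julia-hoxha",
      "http://xmlns.com/foaf/0.1/Person")],
)
print(doc)
```

N-Triples is used here only because it needs no library; any RDF serialization would serve the same purpose.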
Formalization Approach
▪ Formalization based on WAM ontology
• Step 1. Semantic Enrichment
• Step 2. Extend Knowledge Base (ABox assertions for events & domain ontology)
• Step 3. RDF Serialization

[Figure: formalization pipeline. Selection (Target Data), Preprocessing (Preprocessed Data), Transformation (Transformed Data), Semantic Formalization by annotation with Domain Ontology (Semantic Activity Models: Event A, Event B, Event C, ..., Event K, Event N)]

▪ Semantic Enrichment
• For each link in logs, find URI of Web resource
• Find RDF representation of the resource (via a Mapping Template)
• Extract ontology classes to which it belongs, used as ContentType of event (Person, ResearchGroup, Publication, MusicGroup, etc.)

e.g. SWDF:
http://data.semanticweb.org/person/julia-hoxha/html - HTML
http://data.semanticweb.org/person/julia-hoxha - URI
http://data.semanticweb.org/person/julia-hoxha/rdf - RDF/XML

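The SWDF example suggests what a mapping template does: it relates the logged link to the resource URI and to its RDF representation. A minimal sketch follows, assuming the `/html` and `/rdf` suffix convention shown in the example holds for every SWDF resource. The function name and the returned dictionary layout are hypothetical, and fetching the RDF to extract the ontology classes is left out.

```python
SWDF_BASE = "http://data.semanticweb.org/"


def swdf_resource_forms(logged_url):
    """Map a logged SWDF link to its URI, HTML and RDF/XML forms.

    Returns None for links outside SWDF, which would need their own template.
    """
    if not logged_url.startswith(SWDF_BASE):
        return None
    uri = logged_url.rstrip("/")
    # Strip a trailing representation suffix, if present.
    for suffix in ("/html", "/rdf"):
        if uri.endswith(suffix):
            uri = uri[: -len(suffix)]
            break
    return {"uri": uri, "html": uri + "/html", "rdf": uri + "/rdf"}


forms = swdf_resource_forms("http://data.semanticweb.org/person/julia-hoxha/html")
print(forms["uri"])
print(forms["rdf"])
```

The RDF document at `forms["rdf"]` would then be parsed to read the resource's `rdf:type` values, which become the ContentType of the event.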
Semantic Analysis
▪ Querying with semantic constraints
Example:
- In how many sessions within Mar-Apr 2011 did users search in Google and afterwards visit a page in SWDF?
Various levels of abstraction:
e.g. instead of Google -> any search engine
or instead of any page -> WWW2011 page
or even higher abstraction -> Conference page

[Figure: type hierarchy ("WWW2011" isA "Conference") above a session s: <e1, ..., e2, ef>, with event attributes e1.time, e1.urlBase, e1.type]

▪ Address also temporal constraints regarding the dynamics of user browsing behavior

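Querying at different levels of abstraction amounts to matching an event's type against any of its generalizations. The sketch below uses a hand-written isA table as a stand-in for the domain ontology. The type names and the `IS_A` table are illustrative assumptions; a real system would ask a reasoner for subsumption instead.

```python
# Hypothetical isA table: each type points to its direct generalization.
IS_A = {
    "WWW2011": "Conference",
    "Google": "SearchEngine",
    "Bing": "SearchEngine",
}


def generalizations(type_name):
    """Return the type followed by all of its ancestors, most specific first."""
    chain = [type_name]
    while chain[-1] in IS_A and IS_A[chain[-1]] not in chain:
        chain.append(IS_A[chain[-1]])
    return chain


def matches(event_type, query_type):
    # An event satisfies a query type if the query names the event's
    # own type or any generalization of it.
    return query_type in generalizations(event_type)


print(matches("WWW2011", "Conference"))
print(matches("Google", "SearchEngine"))
```

With this check, a query written against "Conference" also selects events typed "WWW2011", while a query written against "WWW2011" does not select every conference page.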
Temporal Constraints
▪ Consider real time (timestamps) and abstract time (order of events) to query usage patterns
Q: find sessions with start time Ts and end time Te containing an event e1 with URL www.ex1.org, eventually succeeded by another event e2 in the session with URL www.ex2.org

▪ We address temporal logics capable of ontological reasoning
• apply temporal operators, e.g. next, eventually, always (based on Linear Temporal Logic, LTL)
• query formulated as LTL formula extended with DL axioms
• LTL + DL: proposition A as a set of ABox assertions

[Figure: LTL Formula in a State Transition System]
• next (X): A is true at the next state after the initial state s1
• eventually: A is true at some state on the path
• always: A is true at all states along the path

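For reference, the three operators can be stated compactly. The block below gives the standard LTL semantics over a path \pi = s_1 s_2 \ldots evaluated at position i. The symbols F (eventually) and G (always) are the conventional ones and are an assumption here, since only X is legible on the slide.

```latex
\begin{align*}
\pi, i \models \mathbf{X}\,A &\iff \pi, i+1 \models A \\
\pi, i \models \mathbf{F}\,A &\iff \exists j \ge i:\ \pi, j \models A \\
\pi, i \models \mathbf{G}\,A &\iff \forall j \ge i:\ \pi, j \models A
\end{align*}
```

In the DL-LTL setting, the atomic proposition A is replaced by a set of ABox assertions that must be entailed at the given state.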
DL-LTL Query Formulation
▪ Queries formulate
• 1) certain conditions on the session itself
• 2) temporal patterns in the events within the session

▪ Query:
Q(s): find sessions
1) Conditions on the session itself: with start time Ts and end time Te
2) Temporal patterns within a session: containing an event e1 with content type “publication”, eventually succeeded by another e2 with function type “search engine”

expressed as a DL-LTL formula

Query Answering Approach
▪ Step 1. Check constraints on the session itself
▪ Step 2. Verify temporal constraints by applying a model checking technique
Iterate over sessions S = {S1, S2, …, Sn}
(a) build a finite state automaton (FSA) for each Si
(b) verify the DL-LTL formula: iterate over the states of the FSA to determine whether a condition holds in the respective state

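The two steps can be illustrated with a deliberately simplified sketch. It treats each session's event list as a finite path and evaluates the formula directly on it, instead of building an explicit FSA. Plain Python predicates stand in for the ABox entailment checks that an OWL reasoner would perform. The formula encoding, the helper names, and the sample sessions are all assumptions made for illustration.

```python
def holds(formula, trace, i=0):
    """Evaluate a formula on a finite trace, starting at position i."""
    op = formula[0]
    if op == "atom":
        return i < len(trace) and formula[1](trace[i])
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "next":
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")


def content(type_name):
    return ("atom", lambda e: e.get("content") == type_name)


def function(type_name):
    return ("atom", lambda e: e.get("function") == type_name)


def answer(sessions, session_ok, formula):
    # Step 1: filter on the session itself. Step 2: check the temporal pattern.
    return [s["id"] for s in sessions
            if session_ok(s) and holds(formula, s["events"])]


# "An event with content type publication, eventually succeeded by
# another event with function type search engine."
query = ("eventually",
         ("and", content("Publication"),
          ("next", ("eventually", function("SearchEngine")))))

sessions = [
    {"id": "s1", "start": 10, "end": 20,
     "events": [{"content": "Publication"}, {"content": "Person"},
                {"function": "SearchEngine"}]},
    {"id": "s2", "start": 10, "end": 20,
     "events": [{"function": "SearchEngine"}, {"content": "Publication"}]},
    {"id": "s3", "start": 500, "end": 600,
     "events": [{"content": "Publication"}, {"function": "SearchEngine"}]},
]

result = answer(sessions, lambda s: 0 <= s["start"] and s["end"] <= 100, query)
print(result)
```

Only s1 is returned: s2 has the two events in the wrong order, and s3 matches the pattern but fails the time window in step 1.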
Evaluation
▪ Validate feasibility of the formalization approach
▪ Show feasibility of the query answering approach
• Query sessions with different patterns
• Measure performance

▪ Formalization
                     SWDF 2009              DBPedia 3-3
Monitoring Period    01.Jul.09 - 12.Jul.09  01.Jul.09 - 12.Jul.09
avg. #sessions/day   235.9                  2899
#sessions            2831                   31893

• Only 1.46% of daily sessions containing SPARQL queries

[Figures: "SWDF 2009: % of sessions initiated in the domain" and "DBPedia 2009: % of sessions initiated in the domain". Surviving chart labels: Google 97%, Bing 2.7%]

Evaluation (II)
▪ Querying
• answering time varies slightly for the queries (~0.15 seconds)
• for up to 1000 sessions, below 1.4 seconds
• model checking time is small
• OWL reasoning takes ~94% of the overall answering time

[Figure: answering time (sec) against nr. sessions; series label Q1]

Conclusions
▪ Propose a framework for behavior modeling and analysis:
• Approach for semantic formalization of logs
• Techniques of querying patterns with temporal and semantic constraints

▪ Challenges and Future Work
• Find datasets of client-side navigation logs at multiple sites
• Domain Ontology acquisition
• Classification Techniques to find FunctionType
• Optimization of Query Answering
