COPYRIGHT © 2011 ALCATEL-LUCENT. ALL RIGHTS RESERVED.
ALCATEL-LUCENT — INTERNAL PROPRIETARY — USE PURSUANT TO COMPANY INSTRUCTION
Profiling User Activities With Minimal Traffic Traces
Tiep Mai, Deepak Ajwani and Alessandra Sala
Bell Laboratories, Ireland
Outline
• Micro-action burst decomposition
• Representative URL selection
End-to-End View of the Telecom Network
• Mobile user: client-side data
• Web services: server-side data
• Telecom data: huge data but with limited features
Empower telecom data analysis with this data
Providing Personalized Services
• Personalized services require user activity profiling
- Traditional approaches rely on features extracted from rich data sources
- Server-side data: full URLs of visited pages, page categories, transaction data, search queries, click-through rate, etc.
- Client-side data: full URLs (cookies), application data (web browsing), etc.
- Network-side data: full URLs, HTTP packet content, etc.
• Our goal: provide medium-grained user profiling for a large user pool from a limited, privacy-preserving dataset
Mobile Web Traces
User Behavioral Analysis from Timestamped Data
• Mobile traces provide valuable insights into user behavior
- Critical to enable service personalization and enrich users' online experience
• Complete mobile web traces risk revealing sensitive information
- http://finance.yahoo.com/q?s=BAC → Bank of America Corp. stock price
- https://www.google.ie/#q=postnatal+depression → sensitive health condition
- http://www.amazon.com/Dell-Inspiron-i15R-15-6-inch-Laptop/dp/B009US2BKA → specific purchased product
Removing Sensitive Data from URL Traces
• Telecom operators are subject to restrictive privacy legislation
• Conservative approach to sharing data
- Anonymized, truncated and sampled data
- Traces from 10,000 anonymized users over 30 days, i.e. more than 130 million records
• Focus on the dataset of truncated URLs or IP addresses
• Resulting data:
1. Truncated: www.amazon.com/Dell-Inspiron-i15R-15-6-inch-Laptop/dp/B009US2BKA is reduced to www.amazon.com
2. Noisy: unintentional web traffic such as advertisements, web analytics, etc.
Quality of behavior analysis depends on effectively separating unintentional traffic from user activities in truncated URL data
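The truncation step can be illustrated with a minimal sketch. It assumes that truncation keeps only the host name and drops the path, query and fragment; the slides do not spell out the exact rule, and the function name `truncate_url` is hypothetical.

```python
from urllib.parse import urlsplit


def truncate_url(url: str) -> str:
    """Keep only the host of a URL (assumed truncation rule, not the authors' exact one)."""
    # urlsplit only recognises the host when a scheme or a leading '//' is present,
    # so bare inputs such as 'www.amazon.com/...' get a '//' prefix first.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = urlsplit(url).hostname
    return host or ""


if __name__ == "__main__":
    for full in (
        "http://finance.yahoo.com/q?s=BAC",
        "https://www.google.ie/#q=postnatal+depression",
        "www.amazon.com/Dell-Inspiron-i15R-15-6-inch-Laptop/dp/B009US2BKA",
    ):
        print(truncate_url(full))
```

After truncation, the stock symbol, the search query and the product identifier from the earlier examples are no longer present in the record.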
Web Browsing Behaviors Across Time & Users
• Collection of web traces of several URL types
• Aim: filter out traces that do not represent explicit user action
- Identify features to drive detection of unintentional traces
- Validate across different users
• Diversity of web domains:
[Figure: two panels of download size (log scale) versus time (secs), labelled "Domain" and "Domain (gaming)"; captions: "High diversity in user activities" and "High diversity across users"]
Methodology Approach
• User activities as a collection of micro user actions, i.e. bursts
- Web clicks, chat replies
• Assumption: each burst represents an atomic user activity
- Combination of intended and unintended web traffic
• Methodology
1. Burst decomposition
2. Activity extraction:
- Domain classification: leverage specialized features of domain appearance in the burst
- Online representative URL selection and activity association
Increases prediction accuracy by 20%
Burst Decomposition – Statistical Parametric Distribution Fitting
• Goal: decompose the web trace back into its constituent data bursts
• A threshold on packet inter-arrival time (IAT) is needed to separate traces into bursts
• Study the inter-arrival time distribution
• No single parametric distribution matches most user traces
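Once an IAT threshold is known, separating a trace into bursts is a single pass over the timestamps. The sketch below assumes timestamps in seconds for one user; the function name and the sample values are illustrative only.

```python
from typing import List, Sequence


def split_into_bursts(timestamps: Sequence[float], threshold: float) -> List[List[float]]:
    """Group timestamps into bursts: a gap larger than `threshold` starts a new burst."""
    bursts: List[List[float]] = []
    for t in sorted(timestamps):
        # Extend the current burst while the inter-arrival time stays within the threshold.
        if bursts and t - bursts[-1][-1] <= threshold:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


if __name__ == "__main__":
    trace = [0.0, 0.2, 0.5, 10.0, 10.1, 30.0]
    print(split_into_bursts(trace, threshold=1.0))
```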
Burst Decomposition Algorithm
• Robust burst decomposition algorithm that is independent of the distribution shape
• Starting from the smallest IAT value, find the value at which the extra probability gained by moving the decay point further out is insignificant compared with the probability accumulated up to that point
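One possible reading of this threshold search, as a sketch rather than the authors' algorithm: walk the observed IATs from smallest to largest and stop at the first value where the empirical probability gained by extending the threshold is small relative to the probability already accumulated. The multiplicative step `growth` and the tolerance `eps` are assumptions introduced here; the slides give neither.

```python
from bisect import bisect_right
from typing import Sequence


def select_iat_threshold(iats: Sequence[float], growth: float = 2.0, eps: float = 0.05) -> float:
    """Pick an IAT threshold from the empirical distribution, with no parametric fit.

    `growth` (how far the candidate is extended) and `eps` (what counts as
    insignificant) are hypothetical parameters, not values from the slides.
    """
    values = sorted(x for x in iats if x > 0)
    if not values:
        raise ValueError("need at least one positive inter-arrival time")
    n = len(values)

    def cdf(x: float) -> float:
        # Empirical cumulative probability P(IAT <= x).
        return bisect_right(values, x) / n

    for t in values:
        accumulated = cdf(t)
        extended = cdf(t * growth) - accumulated
        # Stop when extending the threshold adds little compared with what is already covered.
        if extended <= eps * accumulated:
            return t
    # Fallback for degenerate parameters (for example a negative eps).
    return values[-1]


if __name__ == "__main__":
    sample = [0.1] * 50 + [0.2] * 30 + [0.3] * 15 + [60.0] * 5
    print(select_iat_threshold(sample))
```

Because the rule only compares empirical probabilities, it does not depend on the shape of the IAT distribution.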
Domain Classification – Initial Insight
• Goal: automatically identify URLs representing user activities
• Measurements are aggregated over all users for each domain
- Record-level measurements
- Burst-level measurements
Domain Classification - Methodology
• Logistic regression
• Validation error and AIC, BIC
• Two discriminating features
- o_{b,j=1} - u_{b,j=1} (~ 22.87): probability that a domain comes first in bursts with more than one unique domain
- u_{b,j=2} (~ -9.51): probability that a domain appears in bursts with two unique domains
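The two burst-level features can be estimated per domain as in the sketch below. It assumes each burst is a list of truncated domains in arrival order, and that both probabilities are taken over the bursts in which the domain appears; the slides do not state the normalisation, and the domain names are placeholders.

```python
from collections import defaultdict
from typing import Dict, Sequence


def burst_features(bursts: Sequence[Sequence[str]]) -> Dict[str, Dict[str, float]]:
    """Per-domain estimates of the two burst-level features (assumed normalisation)."""
    appear = defaultdict(int)       # bursts containing the domain
    first_multi = defaultdict(int)  # bursts with >1 unique domain where the domain comes first
    two_unique = defaultdict(int)   # bursts with exactly two unique domains containing the domain
    for burst in bursts:
        if not burst:
            continue
        unique = set(burst)
        for domain in unique:
            appear[domain] += 1
            if len(unique) == 2:
                two_unique[domain] += 1
        if len(unique) > 1:
            first_multi[burst[0]] += 1
    return {
        d: {
            "first_in_multi": first_multi[d] / appear[d],
            "in_two_unique": two_unique[d] / appear[d],
        }
        for d in appear
    }


if __name__ == "__main__":
    example = [
        ["news.example", "ads.example", "cdn.example"],
        ["news.example"],
        ["shop.example", "ads.example"],
        ["news.example", "ads.example"],
    ]
    for domain, feats in sorted(burst_features(example).items()):
        print(domain, feats)
```

In this toy input the advertising domain never comes first, while the content domains do, which is the kind of separation the first feature is meant to capture.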
Trade-offs of Domain Classification Results
• Trade-off between accuracy, sensitivity, precision
and specificity
- Maximizing accuracy
- Maximizing sensitivity and specificity
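The trade-off can be made concrete by computing the four metrics at a given decision threshold and choosing the threshold under either objective. This is a generic sketch with made-up labels and scores, not the evaluation behind the slides.

```python
from typing import Callable, Dict, Sequence


def confusion_metrics(labels: Sequence[int], scores: Sequence[float], threshold: float) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and precision at one decision threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Report 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


def best_threshold(labels: Sequence[int], scores: Sequence[float],
                   objective: Callable[[Dict[str, float]], float]) -> float:
    """Return the observed score that maximises `objective` when used as threshold."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: objective(confusion_metrics(labels, scores, t)))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.6, 0.4, 0.1]
    print(best_threshold(labels, scores, lambda m: m["accuracy"]))
    print(best_threshold(labels, scores, lambda m: m["sensitivity"] + m["specificity"]))
```

On imbalanced data the two objectives generally select different thresholds, which is the trade-off named on this slide.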
Future Work
• Mapping domains to activities (reading, shopping, browsing) and identifying user activities online
• Activity query and recommendation
• Correlating truncated URL data with user location data
- Spatio-temporal study of user activities
Conclusions and Remarks
• Telecom data: Huge but limited; Strict privacy regulations
• URL trace data:
- Privacy preservation with truncation
- Noisy data
- Burst property of micro user actions
• Goal: Perform activity extraction and behaviour analysis for a large user-pool with
limited and noisy data
• Method:
- Burst decomposition and feature extraction
- Representative URL identification and activity extraction
Medium-grained behavior analysis is feasible with limited, noisy and privacy-preserving URL data
Thank you
• Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Lehigh Carbon NG911 / E911 Review
Lehigh Carbon NG911 / E911 ReviewLehigh Carbon NG911 / E911 Review
Lehigh Carbon NG911 / E911 Review
Mark Fletcher, ENP
 

Was ist angesagt? (20)

2012 ah vegas remote networking fundamentals
2012 ah vegas   remote networking fundamentals2012 ah vegas   remote networking fundamentals
2012 ah vegas remote networking fundamentals
 
Aruba wireless and clear pass 6 integration guide v1 1.3
Aruba wireless and clear pass 6 integration guide v1 1.3Aruba wireless and clear pass 6 integration guide v1 1.3
Aruba wireless and clear pass 6 integration guide v1 1.3
 
2012 ah vegas mobile device fundamentals
2012 ah vegas   mobile device fundamentals2012 ah vegas   mobile device fundamentals
2012 ah vegas mobile device fundamentals
 
2012 ah apj deploying byod
2012 ah apj   deploying byod2012 ah apj   deploying byod
2012 ah apj deploying byod
 
2012 ah vegas top10 tips from aruba tac
2012 ah vegas   top10 tips from aruba tac2012 ah vegas   top10 tips from aruba tac
2012 ah vegas top10 tips from aruba tac
 
Enable your networks to support enterprise mobility
Enable your networks to support enterprise mobilityEnable your networks to support enterprise mobility
Enable your networks to support enterprise mobility
 
Airheads barcelona 2010 securing wireless la ns
Airheads barcelona 2010   securing wireless la nsAirheads barcelona 2010   securing wireless la ns
Airheads barcelona 2010 securing wireless la ns
 
Airheads barcelona 2010 rf design for retail warehousing manufacturing
Airheads barcelona 2010   rf design for retail warehousing manufacturingAirheads barcelona 2010   rf design for retail warehousing manufacturing
Airheads barcelona 2010 rf design for retail warehousing manufacturing
 
Building an aruba proof of concept lab javier urtubia
Building an aruba proof of concept lab javier urtubiaBuilding an aruba proof of concept lab javier urtubia
Building an aruba proof of concept lab javier urtubia
 
Unleash the power, intelligence, and analytics of your networks with a flexib...
Unleash the power, intelligence, and analytics of your networks with a flexib...Unleash the power, intelligence, and analytics of your networks with a flexib...
Unleash the power, intelligence, and analytics of your networks with a flexib...
 
Lehigh Carbon NG911 / E911 Review
Lehigh Carbon NG911 / E911 ReviewLehigh Carbon NG911 / E911 Review
Lehigh Carbon NG911 / E911 Review
 
Next generation remote networks aruba instant gokul rajagopalan
Next generation remote networks aruba instant gokul rajagopalanNext generation remote networks aruba instant gokul rajagopalan
Next generation remote networks aruba instant gokul rajagopalan
 
2012 ah apj top 10 tips from aruba tac
2012 ah apj   top 10 tips from aruba tac2012 ah apj   top 10 tips from aruba tac
2012 ah apj top 10 tips from aruba tac
 
5 steps to a faster, smarter wlan
5 steps to a faster, smarter wlan5 steps to a faster, smarter wlan
5 steps to a faster, smarter wlan
 
2012 ah apj mobile device fundamentals
2012 ah apj   mobile device fundamentals2012 ah apj   mobile device fundamentals
2012 ah apj mobile device fundamentals
 
2012 ah vegas unified access fundamentals
2012 ah vegas   unified access fundamentals2012 ah vegas   unified access fundamentals
2012 ah vegas unified access fundamentals
 
Mobile Devices and Wi-Fi
Mobile Devices and Wi-FiMobile Devices and Wi-Fi
Mobile Devices and Wi-Fi
 
2012 ah vegas wlan design for high density
2012 ah vegas   wlan design for high density2012 ah vegas   wlan design for high density
2012 ah vegas wlan design for high density
 
BYOD with ClearPass
BYOD with ClearPassBYOD with ClearPass
BYOD with ClearPass
 
2 top10 tips from aruba tac rizwan shaikh
2 top10 tips from aruba tac rizwan shaikh2 top10 tips from aruba tac rizwan shaikh
2 top10 tips from aruba tac rizwan shaikh
 

Andere mochten auch

Telco 2.0 Report Summary: Telcos' Role in Advertising Value Chain
Telco 2.0 Report Summary:  Telcos' Role in Advertising Value ChainTelco 2.0 Report Summary:  Telcos' Role in Advertising Value Chain
Telco 2.0 Report Summary: Telcos' Role in Advertising Value Chain
bazza1664
 
Customer segmentation
Customer segmentationCustomer segmentation
Customer segmentation
weave Belgium
 
Monetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service ProvidersMonetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service Providers
DataWorks Summit
 

Andere mochten auch (20)

Marketing campaign to sell long term deposits
Marketing campaign to sell long term depositsMarketing campaign to sell long term deposits
Marketing campaign to sell long term deposits
 
Roadmap to realizing the value of telco data – opportunities, challenges, use...
Roadmap to realizing the value of telco data – opportunities, challenges, use...Roadmap to realizing the value of telco data – opportunities, challenges, use...
Roadmap to realizing the value of telco data – opportunities, challenges, use...
 
Customer Segmentation Principles
Customer Segmentation PrinciplesCustomer Segmentation Principles
Customer Segmentation Principles
 
AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...
AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...
AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...
 
Customer segmentation approach
Customer segmentation approachCustomer segmentation approach
Customer segmentation approach
 
Mobile Communication and Big Data by Prof. Richard Ling
Mobile Communication and Big Data by Prof. Richard LingMobile Communication and Big Data by Prof. Richard Ling
Mobile Communication and Big Data by Prof. Richard Ling
 
Role of Analytics in Customer Management
Role of Analytics in Customer ManagementRole of Analytics in Customer Management
Role of Analytics in Customer Management
 
Telco churn presentation
Telco churn presentationTelco churn presentation
Telco churn presentation
 
A Big Data Telco Solution by Dr. Laura Wynter
A Big Data Telco Solution by Dr. Laura WynterA Big Data Telco Solution by Dr. Laura Wynter
A Big Data Telco Solution by Dr. Laura Wynter
 
Patient Powered Research with Big Data and Connected Communities by Assoc. P...
Patient Powered Research with Big Data and Connected Communities  by Assoc. P...Patient Powered Research with Big Data and Connected Communities  by Assoc. P...
Patient Powered Research with Big Data and Connected Communities by Assoc. P...
 
Telco 4.0 Business Operating Model Value Proposition Overview
Telco 4.0 Business Operating Model Value Proposition   OverviewTelco 4.0 Business Operating Model Value Proposition   Overview
Telco 4.0 Business Operating Model Value Proposition Overview
 
Telco 2.0 Report Summary: Telcos' Role in Advertising Value Chain
Telco 2.0 Report Summary:  Telcos' Role in Advertising Value ChainTelco 2.0 Report Summary:  Telcos' Role in Advertising Value Chain
Telco 2.0 Report Summary: Telcos' Role in Advertising Value Chain
 
獲利世代Business Model Generation
獲利世代Business Model Generation獲利世代Business Model Generation
獲利世代Business Model Generation
 
Telco Paper by Blueocean Market Intelligence
Telco Paper by Blueocean Market IntelligenceTelco Paper by Blueocean Market Intelligence
Telco Paper by Blueocean Market Intelligence
 
Brand Building in the Age of Big Data by Mr. Gavin Coombes
Brand Building in the Age of Big Data by Mr. Gavin CoombesBrand Building in the Age of Big Data by Mr. Gavin Coombes
Brand Building in the Age of Big Data by Mr. Gavin Coombes
 
Layering Common Sense on Top of all that Rocket Science by Prof. Sharon Dunwoody
Layering Common Sense on Top of all that Rocket Science by Prof. Sharon DunwoodyLayering Common Sense on Top of all that Rocket Science by Prof. Sharon Dunwoody
Layering Common Sense on Top of all that Rocket Science by Prof. Sharon Dunwoody
 
Customer segmentation
Customer segmentationCustomer segmentation
Customer segmentation
 
Words and More Words: Challenges of Big Data by Prof. Edie Rasmussen
Words and More Words: Challenges of Big Data by Prof. Edie RasmussenWords and More Words: Challenges of Big Data by Prof. Edie Rasmussen
Words and More Words: Challenges of Big Data by Prof. Edie Rasmussen
 
FAST Digital Telco
FAST Digital TelcoFAST Digital Telco
FAST Digital Telco
 
Monetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service ProvidersMonetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service Providers
 

Ähnlich wie (Mobile Web Applications track) "Profiling User Activities with Minimal Traffic Traces" - Tiep Mai, Deepak Ajwani and Alessandra SalaIcwe v3 b

Rethink the core_webcast_download_22_may2012
Rethink the core_webcast_download_22_may2012Rethink the core_webcast_download_22_may2012
Rethink the core_webcast_download_22_may2012
informer13
 
Living objects network performance_management_v2
Living objects network performance_management_v2Living objects network performance_management_v2
Living objects network performance_management_v2
Yoan SMADJA
 
Smallcellsforumforjason 150318164026-conversion-gate01
Smallcellsforumforjason 150318164026-conversion-gate01Smallcellsforumforjason 150318164026-conversion-gate01
Smallcellsforumforjason 150318164026-conversion-gate01
Terra Sacrifice
 
BridgingTheGap-Atlanta-final
BridgingTheGap-Atlanta-finalBridgingTheGap-Atlanta-final
BridgingTheGap-Atlanta-final
Mark Niehus, RCDD
 
Delivering Application Analytics for an Application Fluent Network
Delivering Application Analytics for an Application Fluent NetworkDelivering Application Analytics for an Application Fluent Network
Delivering Application Analytics for an Application Fluent Network
Benjamin Eggerstedt
 
Pangpse training q12011
Pangpse training q12011Pangpse training q12011
Pangpse training q12011
Joe Palo Alto
 

Ähnlich wie (Mobile Web Applications track) "Profiling User Activities with Minimal Traffic Traces" - Tiep Mai, Deepak Ajwani and Alessandra SalaIcwe v3 b (20)

Young Enterprise Day 2014 – Unified Access
Young Enterprise Day 2014 – Unified AccessYoung Enterprise Day 2014 – Unified Access
Young Enterprise Day 2014 – Unified Access
 
Right size your core network
Right size your core networkRight size your core network
Right size your core network
 
How Alcatel-Lucent Enterprise Makes Universities State-of-the-Art
How Alcatel-Lucent Enterprise Makes Universities State-of-the-ArtHow Alcatel-Lucent Enterprise Makes Universities State-of-the-Art
How Alcatel-Lucent Enterprise Makes Universities State-of-the-Art
 
Architecting IoT by Mathew - Alcatel Lucent @ MIMOS IoT TWG Day1
Architecting IoT by Mathew - Alcatel Lucent @ MIMOS IoT TWG Day1Architecting IoT by Mathew - Alcatel Lucent @ MIMOS IoT TWG Day1
Architecting IoT by Mathew - Alcatel Lucent @ MIMOS IoT TWG Day1
 
Rethink the core_webcast_download_22_may2012
Rethink the core_webcast_download_22_may2012Rethink the core_webcast_download_22_may2012
Rethink the core_webcast_download_22_may2012
 
Living objects network performance_management_v2
Living objects network performance_management_v2Living objects network performance_management_v2
Living objects network performance_management_v2
 
OmniSwitch 6860/E Overview
OmniSwitch 6860/E Overview OmniSwitch 6860/E Overview
OmniSwitch 6860/E Overview
 
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their BusinessHow eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
 
Smallcellsforumforjason 150318164026-conversion-gate01
Smallcellsforumforjason 150318164026-conversion-gate01Smallcellsforumforjason 150318164026-conversion-gate01
Smallcellsforumforjason 150318164026-conversion-gate01
 
Alcatel Lucent: Field Insights from Real-World Deployment of Multi-Vendor Het...
Alcatel Lucent: Field Insights from Real-World Deployment of Multi-Vendor Het...Alcatel Lucent: Field Insights from Real-World Deployment of Multi-Vendor Het...
Alcatel Lucent: Field Insights from Real-World Deployment of Multi-Vendor Het...
 
BridgingTheGap-Atlanta-final
BridgingTheGap-Atlanta-finalBridgingTheGap-Atlanta-final
BridgingTheGap-Atlanta-final
 
Alcatel lucent Enterprise LAN Portfolio Overview
Alcatel lucent Enterprise LAN Portfolio OverviewAlcatel lucent Enterprise LAN Portfolio Overview
Alcatel lucent Enterprise LAN Portfolio Overview
 
Addressing Needs of BYOD for Enterprises with Unified Access
Addressing Needs of BYOD for Enterprises with Unified AccessAddressing Needs of BYOD for Enterprises with Unified Access
Addressing Needs of BYOD for Enterprises with Unified Access
 
Delivering Application Analytics for an Application Fluent Network
Delivering Application Analytics for an Application Fluent NetworkDelivering Application Analytics for an Application Fluent Network
Delivering Application Analytics for an Application Fluent Network
 
PDF Transforming Your Infrastructure into a Utility-Grade Network
PDF Transforming Your Infrastructure into a Utility-Grade NetworkPDF Transforming Your Infrastructure into a Utility-Grade Network
PDF Transforming Your Infrastructure into a Utility-Grade Network
 
Gregory Touretsky - Intel IT- Open Cloud Journey
Gregory Touretsky - Intel IT- Open Cloud JourneyGregory Touretsky - Intel IT- Open Cloud Journey
Gregory Touretsky - Intel IT- Open Cloud Journey
 
Alcatel - 7750 SR & CGNAT SR-OS Fundamental
Alcatel - 7750 SR & CGNAT SR-OS FundamentalAlcatel - 7750 SR & CGNAT SR-OS Fundamental
Alcatel - 7750 SR & CGNAT SR-OS Fundamental
 
Pangpse training q12011
Pangpse training q12011Pangpse training q12011
Pangpse training q12011
 
Introduction to Software Defined WANs
Introduction to Software Defined WANsIntroduction to Software Defined WANs
Introduction to Software Defined WANs
 
A stepped approach to unified access
A stepped approach to unified access A stepped approach to unified access
A stepped approach to unified access
 

Mehr von icwe2015

(Web User Interfaces track) "Getting the Query Right: User Interface Design o...
(Web User Interfaces track) "Getting the Query Right: User Interface Design o...(Web User Interfaces track) "Getting the Query Right: User Interface Design o...
(Web User Interfaces track) "Getting the Query Right: User Interface Design o...
icwe2015
 
(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...
(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...
(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...
icwe2015
 
(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...
(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...
(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...
icwe2015
 
(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...
(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...
(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...
icwe2015
 
(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...
(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...
(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...
icwe2015
 
(Linked Data Development and Exploitation track) "Generating the Semantic Sna...
(Linked Data Development and Exploitation track) "Generating the Semantic Sna...(Linked Data Development and Exploitation track) "Generating the Semantic Sna...
(Linked Data Development and Exploitation track) "Generating the Semantic Sna...
icwe2015
 
(Industry track) "Interactive networks for digital cultural heritage collecti...
(Industry track) "Interactive networks for digital cultural heritage collecti...(Industry track) "Interactive networks for digital cultural heritage collecti...
(Industry track) "Interactive networks for digital cultural heritage collecti...
icwe2015
 
(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...
(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...
(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...
icwe2015
 
(Mobile Web Applications track) "Mobile-IDM: A Design Method for Modeling the...
(Mobile Web Applications track) "Mobile-IDM: A Design Method for Modeling the...(Mobile Web Applications track) "Mobile-IDM: A Design Method for Modeling the...
(Mobile Web Applications track) "Mobile-IDM: A Design Method for Modeling the...
icwe2015
 
(SoWeMine Workshop) "Retrieving Relevant and Interesting Tweets during Live T...
(SoWeMine Workshop) "Retrieving Relevant and Interesting Tweets during Live T...(SoWeMine Workshop) "Retrieving Relevant and Interesting Tweets during Live T...
(SoWeMine Workshop) "Retrieving Relevant and Interesting Tweets during Live T...
icwe2015
 
(NLPIT Workshop) (Keynote) Nathan Schneider - “Hacking a Way Through the Twit...
(NLPIT Workshop) (Keynote) Nathan Schneider - “Hacking a Way Through the Twit...(NLPIT Workshop) (Keynote) Nathan Schneider - “Hacking a Way Through the Twit...
(NLPIT Workshop) (Keynote) Nathan Schneider - “Hacking a Way Through the Twit...
icwe2015
 
(PEWET Workshop) (Keynote) Vincenzo De Florio - “Fractally-organized Connecti...
(PEWET Workshop) (Keynote) Vincenzo De Florio - “Fractally-organized Connecti...(PEWET Workshop) (Keynote) Vincenzo De Florio - “Fractally-organized Connecti...
(PEWET Workshop) (Keynote) Vincenzo De Florio - “Fractally-organized Connecti...
icwe2015
 
(Web Application Design track) "Liquid Stream Processing across Web Browsers ...
(Web Application Design track) "Liquid Stream Processing across Web Browsers ...(Web Application Design track) "Liquid Stream Processing across Web Browsers ...
(Web Application Design track) "Liquid Stream Processing across Web Browsers ...
icwe2015
 
(Web Composition and Mashups track) "REST Web Service Description for Graph-B...
(Web Composition and Mashups track) "REST Web Service Description for Graph-B...(Web Composition and Mashups track) "REST Web Service Description for Graph-B...
(Web Composition and Mashups track) "REST Web Service Description for Graph-B...
icwe2015
 
(Semantic Web Technologies and Applications track) "A Quantitative Comparison...
(Semantic Web Technologies and Applications track) "A Quantitative Comparison...(Semantic Web Technologies and Applications track) "A Quantitative Comparison...
(Semantic Web Technologies and Applications track) "A Quantitative Comparison...
icwe2015
 
(Keynote) Peter Mika - “Making the Web Searchable”
(Keynote) Peter Mika - “Making the Web Searchable”(Keynote) Peter Mika - “Making the Web Searchable”
(Keynote) Peter Mika - “Making the Web Searchable”
icwe2015
 
(Keynote) Mike Thelwall - “Sentiment Strength Detection for Social Media Text...
(Keynote) Mike Thelwall - “Sentiment Strength Detection for Social Media Text...(Keynote) Mike Thelwall - “Sentiment Strength Detection for Social Media Text...
(Keynote) Mike Thelwall - “Sentiment Strength Detection for Social Media Text...
icwe2015
 

Mehr von icwe2015 (19)

Mikkonen liquid-sw-icwe2015
Mikkonen liquid-sw-icwe2015Mikkonen liquid-sw-icwe2015
Mikkonen liquid-sw-icwe2015
 
(Web User Interfaces track) "Getting the Query Right: User Interface Design o...
(Web User Interfaces track) "Getting the Query Right: User Interface Design o...(Web User Interfaces track) "Getting the Query Right: User Interface Design o...
(Web User Interfaces track) "Getting the Query Right: User Interface Design o...
 
(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...
(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...
(Web Application Design track) "Two Factor Authentication Made Easy" - Alex Q...
 
(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...
(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...
(Semantic Web Technologies and Applications track) "MIRROR: Automatic R2RML M...
 
(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...
(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...
(Linked Data Development and Exploitation track) "YQL as a Platform for Linke...
 
(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...
(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...
(Linked Data Interfaces and Querying track) "SUMMA: A Common API for Linked D...
 
(Linked Data Development and Exploitation track) "Generating the Semantic Sna...
(Linked Data Development and Exploitation track) "Generating the Semantic Sna...(Linked Data Development and Exploitation track) "Generating the Semantic Sna...
(Linked Data Development and Exploitation track) "Generating the Semantic Sna...
 
(Industry track) "Interactive networks for digital cultural heritage collecti...
(Industry track) "Interactive networks for digital cultural heritage collecti...(Industry track) "Interactive networks for digital cultural heritage collecti...
(Industry track) "Interactive networks for digital cultural heritage collecti...
 
(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...
(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...
(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information ...
 
(Mobile Web Applications track) "Mobile-IDM: A Design Method for Modeling the...

(Mobile Web Applications track) "Profiling User Activities with Minimal Traffic Traces" - Tiep Mai, Deepak Ajwani and Alessandra Sala

  • 1. Profiling User Activities With Minimal Traffic Traces
    Tiep Mai, Deepak Ajwani and Alessandra Sala
    Bell Laboratories, Ireland
  • 2. Outline
    • Micro-action burst decomposition
    • Representative URL selection
  • 3. End-to-End View of the Telecom Network
    [Diagram labels: Mobile user; Web services; Client-side data; Server-side data; Telecom data]
    • Telecom data: huge data but with limited features
    • Empower telecom data analysis with this data
  • 4. Providing Personalized Services
    • Personalized services require user activity profiling
      - Traditional approaches rely on features extracted from rich data sources
      - Server-side data: full URLs of visited pages, page categories, transaction data, search queries, click-through rate, etc.
      - Client-side data: full URLs (cookies), application data (web browsing), etc.
      - Network-side data: full URLs, HTTP packet content, etc.
    • Our goal: provide medium-grained user profiling for a large user pool from a limited, privacy-preserving dataset
  • 5. Mobile Web Traces: User Behavioral Analysis from Timestamped Data
    • Mobile traces provide valuable insights into user behavior
      - Critical to enable service personalization and enrich the user's online experience
    • Complete mobile web traces risk revealing sensitive information
      - http://finance.yahoo.com/q?s=BAC → Bank of America Corp. stock price
      - https://www.google.ie/#q=postnatal+depression → sensitive health condition
      - http://www.amazon.com/Dell-Inspiron-i15R-15-6-inch-Laptop/dp/B009US2BKA → specific purchased product
  • 6. Removing Sensitive Data from URL Traces
    • Telecom operators are subject to restrictive privacy legislation
    • Conservative approach to sharing data
      - Anonymized, truncated and sampled data
      - Traces from 10,000 anonymized users over 30 days, i.e. more than 130 million records
    • Focus on the dataset of truncated URLs or IP addresses
    • Resulting data:
      1. Truncated: www.amazon.com/Dell-Inspiron-i15R-15-6-inch-Laptop/dp/B009US2BKA
      2. Noisy: unintentional web traffic such as advertisements, web analytics, etc.
    • The quality of behavior analysis depends on effectively separating unintentional traffic from user activities on truncated URLs
  • 7. Web Browsing Behaviors Across Time & Users
    • Collection of web traces of several URL types
    • Aim: filter out traces that do not represent explicit user actions
      - Identify features to drive the detection of unintentional traces
      - Validate across different users
    • Diversity of web domains:
      [Figure: download size (log scale) vs. time (secs), two panels: "Domain" and "Domain (gaming)"]
    • High diversity in user activities
    • High diversity across users
  • 8. Methodology Approach
    • User activities as a collection of micro user actions, i.e. bursts
      - Web clicks, chat replies
    • Assumption: each burst represents an atomic user activity
      - A combination of intended and unintended web traffic
    • Methodology
      1. Burst decomposition
      2. Activity extraction:
         - Domain classification: leverage specialized features of domain appearance in the burst
         - Online representative URL selection and activity association
    • Increases prediction accuracy by 20%
  • 9. Burst Decomposition – Statistical Parametric Distribution Fitting
    • Goal: decompose the web trace back into its constituent data bursts
    • A threshold on the packet inter-arrival time (IAT) is needed to separate traces into bursts
    • Study the inter-arrival time distribution
    • No single parametric distribution matches most user traces
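To make the role of the IAT threshold on slide 9 concrete, here is a minimal Python sketch of splitting one user's timestamps into bursts. The function name `split_into_bursts`, the sample trace and the threshold value of 2 seconds are illustrative assumptions, not taken from the slides.

```python
from typing import List, Sequence


def split_into_bursts(timestamps: Sequence[float],
                      iat_threshold: float) -> List[List[float]]:
    """Group record timestamps (in seconds) into bursts.

    A new burst starts whenever the gap to the previous record exceeds
    iat_threshold. The threshold is a hypothetical, caller-supplied value.
    """
    bursts: List[List[float]] = []
    for t in sorted(timestamps):
        if bursts and t - bursts[-1][-1] <= iat_threshold:
            bursts[-1].append(t)   # gap is small: same burst
        else:
            bursts.append([t])     # gap is large (or first record): new burst
    return bursts


# Made-up trace: three records close together, two more, then a lone record.
trace = [0.0, 0.2, 0.5, 10.0, 10.1, 30.0]
bursts = split_into_bursts(trace, iat_threshold=2.0)
print(bursts)
```

The quality of the split depends entirely on the threshold, which is why the slides go on to study the IAT distribution.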
  • 10. Burst Decomposition Algorithm
    • Robust burst decomposition algorithm that is independent of the distribution shape
    • Starting from the smallest IAT value, find the point at which the extra probability gained by moving the decay point further is insignificant compared to the probability accumulated up to that point
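The slide describes the threshold search only in one sentence, so the following Python sketch is one possible reading of it, not the authors' algorithm. It walks candidate thresholds upward from the smallest observed IAT and stops when the empirical probability mass added by the next step is small relative to the mass already accumulated. The geometric step `growth` and the tolerance `epsilon` are hypothetical parameters introduced here.

```python
from bisect import bisect_right
from typing import Sequence


def pick_iat_threshold(iats: Sequence[float],
                       growth: float = 2.0,
                       epsilon: float = 0.01) -> float:
    """Pick a burst threshold from empirical inter-arrival times.

    Assumed interpretation: stop at the first candidate t where the mass in
    (t, growth * t] is at most epsilon times the mass in (0, t].
    """
    xs = sorted(x for x in iats if x > 0)
    if not xs:
        raise ValueError("need at least one positive inter-arrival time")
    n = len(xs)
    t = xs[0]
    while t < xs[-1]:
        accumulated = bisect_right(xs, t) / n               # empirical P(IAT <= t)
        extended = bisect_right(xs, t * growth) / n - accumulated
        if extended <= epsilon * accumulated:
            return t
        t *= growth
    return xs[-1]  # no flat region found: fall back to the largest IAT


# Made-up IATs: many short gaps inside bursts, a few long gaps between them.
iats = [0.1] * 50 + [0.2] * 30 + [0.4] * 15 + [60.0] * 3 + [300.0] * 2
threshold = pick_iat_threshold(iats)
print(threshold)
```

Because the rule only compares empirical masses, it makes no assumption about the shape of the distribution, which matches the stated goal of the slide.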
  • 11. Domain Classification – Initial Insight
    • Goal: automatically identify URLs representing user activities
    • Measurements are aggregated over all users for each domain
      - Record-level measurements
      - Burst-level measurements
  • 12. Domain Classification – Methodology
    • Logistic regression
    • Validation error and AIC, BIC
    • Two discriminating features
      - o_{b,j=1} − u_{b,j=1} (~ 22.87): probability that a domain comes first in bursts with more than one unique domain
      - u_{b,j=2} (~ -9.51): probability that a domain appears in bursts with two unique domains
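As a rough illustration of slide 12, the Python sketch below computes the two burst-level features for a domain and plugs them into a logistic model. Several things here are assumptions: that the quoted values 22.87 and -9.51 are the fitted coefficients, that each probability is normalized by the number of bursts containing the domain, and that the intercept (not given on the slide) defaults to 0. The domain names are made up.

```python
import math
from typing import List, Tuple


def burst_features(bursts: List[List[str]], domain: str) -> Tuple[float, float]:
    """Return (first_multi, two_unique) for a domain.

    first_multi: share of the domain's bursts where it comes first and the
                 burst has more than one unique domain.
    two_unique:  share of the domain's bursts with exactly two unique domains.
    """
    containing = [b for b in bursts if domain in b]
    if not containing:
        return 0.0, 0.0
    n = len(containing)
    first_multi = sum(1 for b in containing
                      if b[0] == domain and len(set(b)) > 1)
    two_unique = sum(1 for b in containing if len(set(b)) == 2)
    return first_multi / n, two_unique / n


def user_activity_probability(first_multi: float, two_unique: float,
                              intercept: float = 0.0) -> float:
    """Logistic score; the intercept is a hypothetical placeholder."""
    z = intercept + 22.87 * first_multi - 9.51 * two_unique
    return 1.0 / (1.0 + math.exp(-z))


# Each burst lists domains in arrival order (illustrative data only).
bursts = [
    ["news.example", "ads.example", "cdn.example"],
    ["news.example", "ads.example"],
    ["shop.example", "ads.example", "stats.example"],
    ["news.example"],
]
p_news = user_activity_probability(*burst_features(bursts, "news.example"))
p_ads = user_activity_probability(*burst_features(bursts, "ads.example"))
print(round(p_news, 3), round(p_ads, 3))
```

In this toy data the domain that tends to open multi-domain bursts scores high, while the domain that only ever follows others scores low, which is the intuition behind the two features.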
  • 13. Trade-offs of Domain Classification Results
    • Trade-off between accuracy, sensitivity, precision and specificity
      - Maximizing accuracy
      - Maximizing sensitivity and specificity
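The trade-off on slide 13 can be sketched in Python by computing the four metrics at a given decision threshold and then choosing the threshold under either criterion. The labels, scores and the use of the sum of sensitivity and specificity as the second objective are illustrative assumptions; the slides do not state how the two criteria were implemented.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], scores: Sequence[float],
                      threshold: float) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and precision at one threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(a: int, b: int) -> float:
        return a / b if b else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),
    }


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   criterion: str = "accuracy") -> float:
    """Pick the observed score that maximizes the chosen criterion."""
    def objective(t: float) -> float:
        m = confusion_metrics(y_true, scores, t)
        if criterion == "accuracy":
            return m["accuracy"]
        return m["sensitivity"] + m["specificity"]

    return max(sorted(set(scores)), key=objective)


# Made-up labels (1 = user-activity domain) and classifier scores.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
metrics = confusion_metrics(y_true, scores, threshold=0.5)
print(metrics)
print(best_threshold(y_true, scores, "accuracy"),
      best_threshold(y_true, scores, "sens+spec"))
```

On imbalanced data the two criteria can select different thresholds, since accuracy favors the majority class while the second objective weights both classes equally.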
  • 14. Future Work
    • Mapping domains to activities (reading, shopping, browsing) and identifying user activities online
    • Activity query and recommendation
    • Correlating truncated URL data with user location data
      - Spatio-temporal study of user activities
  • 15. Conclusions and Remarks
    • Telecom data: huge but limited; strict privacy regulations
    • URL trace data:
      - Privacy preservation with truncation
      - Noisy data
      - Burst property of micro user actions
    • Goal: perform activity extraction and behaviour analysis for a large user pool with limited and noisy data
    • Method:
      - Burst decomposition and feature extraction
      - Representative URL identification and activity extraction
    • Medium-grained behavior analysis is feasible with limited, noisy and privacy-preserving URL data
  • 16. Thank you
    • Questions?