IEEE 12th Annual Conference on Privacy, Security and Trust (PST 2014)
MindYourPrivacy: Design and Implementation of a Visualization System for Third-Party Web Tracking
Yuuki Takano, Satoshi Ohta, Takeshi Takahashi, Ruo Ando, Tomoya Inoue
1
Introduction
❖ Third-party web tracking is growing each year.
❖ Online privacy is now a significant issue.
❖ SNSs and targeted ads can associate individuals' real names with tracking information.
❖ We propose MindYourPrivacy, a system that visualizes third-party web tracking.
❖ It uses a deep-packet-inspection (DPI) based architecture
❖ to support heterogeneous browsers and devices.
❖ We evaluated MindYourPrivacy at a workshop (WIDE Camp 2013 Autumn in Japan) with 129 attendees.
❖ The experiment revealed that clustering the web graph built from user traffic helps to detect ad sites.
❖ Some graph-theoretic features also help to detect ad sites heuristically.
2
Related Work
Web Tracking Mechanism
❖ A third-party web tracker typically tracks users with cookies, ETags, or Flash storage.
[Figure: a browser receives contents from first-party web servers and from a third-party web tracker (web bugs (1x1 picture), ads, social widgets); a tracking ID (cookie, ETag, Flash storage, etc.) is exchanged with the tracker.]
3
platform.twitter.com
guest_id=v1%3A135875454567229819
twll=l%3D1363156464
YES. Twitter knows our tendency.
5
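The cookie values shown for platform.twitter.com are percent-encoded. As a small illustration (not part of the original slides), they can be decoded with Python's standard library; the reading of guest_id as a per-browser identifier is an assumption based on its name and value.

```python
from urllib.parse import unquote

# Cookie values copied from the slide (percent-encoded).
cookies = {
    "guest_id": "v1%3A135875454567229819",
    "twll": "l%3D1363156464",
}

# Decode %3A -> ":" and %3D -> "=" so the raw values become readable.
decoded = {name: unquote(value) for name, value in cookies.items()}

for name, value in decoded.items():
    print(f"{name} = {value}")
```

If the same identifier is sent with every request to a third party, that party can link the pages on which its widget is embedded.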
Related Work
Web Tracking Detection Techniques
❖ ShareMeNot
❖ replaces links to known data-collection sites such as Facebook
❖ Roesner et al., “Detecting and defending against third-party tracking on the web”, USENIX NSDI 2012
❖ Lightbeam
❖ visualizes the web graph between first-party and third-party sites
❖ https://www.mozilla.org/lightbeam/
❖ AdBlock Plus
❖ signature-based ad detection and blocking
❖ https://adblockplus.org/en/firefox
6
Related Work
Measurements
❖ Several researchers have reported measurements of third-party web trackers.
❖ One study measured third-party trackers on Alexa’s top 500 domains.
❖ Roesner et al., “Detecting and defending against third-party tracking on the web”, USENIX NSDI 2012
[Figure: Top 20 trackers on Alexa’s top 500 domains, from Roesner et al., NSDI 2012 (their Figure 6, “Prevalence of Trackers on Top 500 Domains”). Trackers are counted on domains, i.e., if a particular tracker appears on two pages of a domain, it is counted once.]
7
MindYourPrivacy
Design Principle
❖ We designed and implemented MindYourPrivacy, a visualization system for third-party web tracking.
❖ Its goal is to show third-party web trackers clearly to users.
❖ Design principles of MindYourPrivacy
❖ Independence from browsers and devices
❖ the variety of OSes and devices (Linux, Windows, Mac OS, and smartphone OSes such as Android and iOS) complicates the problem
❖ a deep-packet-inspection based approach is adopted to support heterogeneous browsers and devices
❖ Accessibility and comprehensibility of the analysis results
❖ easy to access: MindYourPrivacy serves the analysis results as an HTML file via an HTTP server
❖ easy to understand: trackers are visualized as a tag cloud, and web graph files are provided for further analysis
8
Design and Implementation
Web Tracker Identification Methodology (1)
❖ HTTP Referrer Web Graph Analysis
❖ generate a web graph from the HTTP Referer header
❖ if a site is referred to by many other sites, MindYourPrivacy treats it as a site suspected of tracking users
❖ Domain Aggregation
❖ to show users which organizations track them, MindYourPrivacy aggregates domains at either the second or the third level
❖ e.g., platform.twitter.com and platform0.twitter.com are aggregated into twitter.com
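The domain aggregation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rule for choosing between the second and third level, and the SECOND_LEVEL_SUFFIXES set, are assumptions made here. A production system would more likely consult the Public Suffix List.

```python
# Hypothetical set of second-level labels that behave like public suffixes
# under two-letter country-code TLDs (e.g. example.co.jp). This list is an
# assumption for illustration, not the one used by MindYourPrivacy.
SECOND_LEVEL_SUFFIXES = {"co", "ne", "or", "ac", "go", "com", "net", "org"}


def aggregate_domain(host: str) -> str:
    """Reduce a host name to its second- or third-level domain."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) <= 2:
        return ".".join(labels)
    tld, second = labels[-1], labels[-2]
    # Keep three labels for names such as i-mobile.co.jp, otherwise two.
    if len(tld) == 2 and second in SECOND_LEVEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


print(aggregate_domain("platform.twitter.com"))
print(aggregate_domain("platform0.twitter.com"))
print(aggregate_domain("www.i-mobile.co.jp"))
```

Both Twitter host names collapse to twitter.com, matching the example on the slide.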
9
Design and Implementation
Web Tracker Identification Methodology (2)
❖ DNS-SOA-Record-Based Grouping
❖ aggregate domains by their DNS SOA record
❖ e.g., facebook.com and facebook.net are aggregated into dns.facebook.com, which is their DNS SOA record
❖ Krishnamurthy and Wills, “Privacy diffusion on the web: a longitudinal perspective”, WWW 2009
❖ Weighted Site Ranking of User Data Leakage
❖ MindYourPrivacy shows not only web trackers but also the sites that leak data to trackers
❖ leaking sites are scored; the details are omitted here (see our paper)
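SOA-based grouping can be sketched as below. To keep the example runnable offline, the DNS query is replaced by a stub table: STUB_SOA and the soa.example.invalid value are hypothetical, and only the facebook.com and facebook.net entries come from the slide. A real deployment would resolve the SOA record of each domain instead.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_soa(domains: Iterable[str],
                 soa_lookup: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group domains that share the same SOA value."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for domain in domains:
        groups[soa_lookup(domain)].append(domain)
    return dict(groups)


# Stub standing in for a DNS SOA query (illustrative values only).
STUB_SOA = {
    "facebook.com": "dns.facebook.com",
    "facebook.net": "dns.facebook.com",
    "twitter.com": "soa.example.invalid",  # hypothetical placeholder
}

groups = group_by_soa(STUB_SOA, STUB_SOA.__getitem__)
for soa, members in groups.items():
    print(soa, members)
```

Passing the lookup as a function keeps the grouping logic independent of how the SOA record is actually obtained.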
10
Design and Implementation
System Model
❖ MindYourPrivacy captures the traffic of users’ web access
❖ analysis results are shown via MindYourPrivacy’s web server
❖ users need not install or configure any specific application
[Figure: users access the Internet through a router; MindYourPrivacy captures the outgoing traffic and returns the analyzed results to users via HTTP.]
11
Design and Implementation
Implementation Architecture
❖ Catenaccio DPI
❖ captures traffic from the network interface
❖ reconstructs TCP streams and stores the captured data in the NoSQL DB
❖ written in C++
❖ NoSQL DB
❖ MongoDB is used as the database
❖ Tracking Analyzer
❖ analyzes the measurement data
❖ written in JavaScript and Python
❖ HTML/Graph File Generator
❖ generates the visualized results
❖ written in Python
❖ HTML Server
❖ serves the HTML/graph files to users
[Figure: data flow among the components above (Catenaccio DPI, NoSQL DB, Tracking Analyzer, HTML/Graph File Generator, HTML Server), with edges labeled L2 datagram (from the network interface), measurement data, analyzed result, and HTML/graph files.]
12
Design and Implementation
Web User Interface
❖ suspicious web trackers are visualized as a tag cloud
❖ domains are grouped by DNS SOA record
❖ referring sites are shown in the right pane
Experiment at WIDE Camp 2013 Autumn
❖ We evaluated MindYourPrivacy at WIDE Camp 2013 Autumn.
❖ WIDE Camp 2013 Autumn (Sep. 10 - Sep. 13)
❖ a workshop for Internet researchers, operators, and developers
❖ 129 attendees, most of whom were either IT specialists or students majoring in IT
❖ every attendee agreed to the experiment (for research purposes only)
❖ We captured and analyzed the attendees’ web browsing traffic.
14
Experiment
User Traffic Analysis (1)
❖ We observed 734,194 HTTP requests and 1,661 individual source IP addresses (IPv4 and IPv6).
❖ A directed web graph was generated from the HTTP Referer header.
❖ The graph has 3,966 nodes and 12,941 edges.
❖ We analyzed this web graph to find web trackers.
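Building the directed web graph from Referer headers can be sketched as follows. The input format (pairs of requested host and Referer URL), the example host names, and the choices to deduplicate edges and skip self-references are assumptions for illustration; the slides do not specify them.

```python
from urllib.parse import urlsplit


def build_referer_graph(requests):
    """Build a directed graph from (requested_host, referer_url) pairs.

    An edge (src, dst) means a page on src caused a request to dst.
    """
    edges = set()
    for host, referer in requests:
        if not referer:
            continue  # request without a Referer header
        src = urlsplit(referer).hostname
        if src and src != host:  # skip self-references (an assumption)
            edges.add((src, host))
    nodes = {node for edge in edges for node in edge}
    return nodes, edges


# Hypothetical sample traffic.
sample = [
    ("platform.twitter.com", "http://news.example/article"),
    ("www.google-analytics.com", "http://news.example/article"),
    ("www.google-analytics.com", "http://blog.example/post"),
    ("news.example", None),
    ("news.example", "http://news.example/"),
]

nodes, edges = build_referer_graph(sample)
print(len(nodes), "nodes,", len(edges), "edges")
```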
15
Experiment
User Traffic Analysis (2)
❖ To find web trackers, we extracted the most-referred sites from the web graph.
❖ Advertisement and social sites, which tend to track users, have many incoming links.
TABLE II: Top-five Most-referred Sites
Site                    # of incoming links
google-analytics.com    847
facebook.com            437
twitter.com             393
doubleclick.net         380
google.com              356
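A ranking like Table II can be derived by counting incoming edges per node. The sketch below assumes the graph is given as a collection of (source, destination) pairs; the site names are hypothetical.

```python
from collections import Counter


def most_referred(edges, top_n=5):
    """Return the top_n nodes by number of distinct incoming links."""
    in_degree = Counter(dst for _src, dst in set(edges))
    return in_degree.most_common(top_n)


# Hypothetical edges: (referring site, referred site).
sample_edges = [
    ("a.example", "tracker.example"),
    ("b.example", "tracker.example"),
    ("c.example", "tracker.example"),
    ("a.example", "social.example"),
    ("b.example", "social.example"),
    ("a.example", "cdn.example"),
]

ranking = most_referred(sample_edges, top_n=2)
for site, count in ranking:
    print(site, count)
```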
16
Experiment
User Traffic Analysis (3)
❖ We then applied a clustering technique (MCODE) to the web graph.
❖ As a result, many ad sites were found in one cluster.
[Excerpts from the paper shown on the slide; words clipped at the slide edges are restored only where the context makes them clear.]
Referred Graph Pane: This pane provides the referred graph in .dot and .sif formats. Users can download these files and analyze or visualize the referred graph with Graphviz, Cytoscape, etc. Figures 5 and 6 show examples using Cytoscape.
IV. Experiment: To demonstrate the usability and effectiveness of the proposed system, we conducted an experiment at WIDE camp, September 10–13 2013. The WIDE project [19] is a research and development project in Japan aimed at developing a widely integrated distributed environment. It organizes camps every spring and autumn, with many researchers, developers, and students taking part and discussing Internet technologies. There were 129 attendees, most of whom are either IT specialists or students majoring in IT. We conducted two types of experiments: user traffic analysis and questionnaire-based use analysis.
In total we observed 2,309 and 2,671 requests for platform.twitter.com and www.facebook.com, respectively. However, we found only about 100 unique values for each cookie, though the number for fr of www.facebook.com is 397. fr thus does not seem to be a tracking cookie, and the 100 likely indicates the number of attendees (which was also around 100) or devices. The results reveal that tracking cookies can also be used for per-user analysis and visualization.
We then applied MCODE clustering [20] to the graph of Figure 5 to find further features. This allowed us to observe many ad sites clustered into the rank 1 cluster by MCODE. The following domains were ad sites found in the rank 1 cluster of Figure 6:
doubleclick.net, amazon-adsystem.com, googleadservices.com, i-mobile.co.jp, advg.jp, adingo.jp, iogous.com, admeld.com, criteo.com.
Ad sites generally tend to collect user information for business purposes. We therefore should be concerned with the privacy issues they present. This discovery should help further analysis and visualization concerning such sites. Table IV lists the feature vector of ad sites and other sites.
[Figure callout: ad sites in the cluster]
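MCODE is considerably more involved than can be shown here, and the experiment used an existing MCODE implementation with Cytoscape. The sketch below therefore illustrates only one building block that MCODE relies on, the k-core of a graph (the maximal subgraph in which every node has at least k neighbours). It is not an implementation of MCODE, and the sample graph is hypothetical.

```python
def k_core(adjacency, k):
    """Return the node set of the k-core of an undirected graph.

    adjacency maps each node to the set of its neighbours (symmetric).
    Nodes with fewer than k remaining neighbours are removed repeatedly.
    """
    alive = set(adjacency)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if len(adjacency[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

print(sorted(k_core(graph, 2)))
```

With k = 2 the pendant node is pruned and the triangle remains; with k = 3 nothing survives.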
17
Experiment
User Traffic Analysis (4)
❖ We analyzed the cluster in terms of graph-theoretic features.
❖ We found that ad sites differ markedly from other sites in the number of incoming links, the number of outgoing links, and neighborhood connectivity.
❖ ad sites have many incoming links but few outgoing links
❖ the neighborhood connectivity of ad sites is relatively low
18
Fig. 6: Rank 1 Cluster by MCODE (include loops = false, degree cutoff = 2, haircut = true, fluff = false, node score cutoff = 0.2, k-core = 2, and max. depth = 100)
TABLE IV: Feature Vector of Rank 1 Cluster’s Edge (Average and Unbiased Variance)
            # of incoming links    # of outgoing links    Neighborhood connectivity
            avg.     var.          avg.     var.          avg.     var.
ad sites    90.2     12405.4       15.2     3972.9        46.0     3972.9
others      30.2     3972.9        29.7     569.3         130.2    5212.0
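The three features in Table IV can be computed per node as sketched below. Neighborhood connectivity is taken here as the average number of neighbours of a node's neighbours, with edge direction ignored; that reading, the function names, and the sample graph are assumptions for illustration. The summary uses the sample (unbiased) variance, as in the table.

```python
from statistics import mean, variance


def node_features(edges):
    """Return {node: (in_links, out_links, neighborhood_connectivity)}."""
    edges = set(edges)
    nodes = {node for edge in edges for node in edge}
    incoming = {node: set() for node in nodes}
    outgoing = {node: set() for node in nodes}
    for src, dst in edges:
        outgoing[src].add(dst)
        incoming[dst].add(src)
    # Undirected neighbourhood, ignoring self-loops.
    neighbours = {n: (incoming[n] | outgoing[n]) - {n} for n in nodes}
    features = {}
    for n in nodes:
        degrees = [len(neighbours[m]) for m in neighbours[n]]
        connectivity = mean(degrees) if degrees else 0.0
        features[n] = (len(incoming[n]), len(outgoing[n]), connectivity)
    return features


def summarize(values):
    """Average and unbiased variance (needs at least two values)."""
    return mean(values), variance(values)


# Hypothetical graph: three sites refer to one ad site.
sample_edges = [
    ("a.example", "ad.example"),
    ("b.example", "ad.example"),
    ("c.example", "ad.example"),
    ("a.example", "b.example"),
]

features = node_features(sample_edges)
print(features["ad.example"])
print(summarize([f[0] for f in features.values()]))
```

In this toy graph the ad site has three incoming links, no outgoing links, and neighbours with few connections of their own, which mirrors the pattern reported on the slide.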
[Clipped excerpt of the paper's questionnaire discussion: the most popular countermeasure among attendees is to use multiple browsers; the remaining text is truncated at the slide edge.]
Experiment
User Traffic Analysis (5)
❖ The Do Not Track (DNT) flag announces to third-party trackers that the user does not wish to be tracked.
❖ However, only 40,650 (40,605/734,194 ≈ 6 %) DNT-enabled requests were observed.
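Counting DNT-enabled requests amounts to checking for a `DNT: 1` request header. The sketch below assumes each request is available as a dictionary of lower-cased header names; that representation and the sample data are hypothetical.

```python
def dnt_share(requests):
    """Return (number of DNT-enabled requests, their share of all requests)."""
    requests = list(requests)
    enabled = sum(1 for headers in requests if headers.get("dnt") == "1")
    share = enabled / len(requests) if requests else 0.0
    return enabled, share


# Hypothetical captured request headers.
sample_requests = [
    {"host": "news.example", "dnt": "1"},
    {"host": "news.example"},
    {"host": "ads.example", "dnt": "0"},
    {"host": "ads.example"},
]

enabled, share = dnt_share(sample_requests)
print(f"{enabled} DNT-enabled requests ({share:.0%})")
```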
19
Conclusion and Future Work
❖ We proposed MindYourPrivacy, a visualization system for third-party web tracking.
❖ browser- and device-independent architecture
❖ web trackers are visualized as a tag cloud
❖ We evaluated MindYourPrivacy at WIDE Camp 2013 Autumn and analyzed users’ web browsing traffic.
❖ generated a web graph from the HTTP Referer header and analyzed it
❖ revealed that graph clustering and some graph-theoretic features are useful for finding web trackers
❖ Future work: adopting the more sophisticated approaches revealed by the experiment, as well as a signature-based approach.
20
EOF
21

Weitere ähnliche Inhalte

Ähnlich wie MindYourPrivacy: Design and Implementation of a Visualization System for Third-Party Web Tracking

Operating System Upgrade Implementation Report And...
Operating System Upgrade Implementation Report And...Operating System Upgrade Implementation Report And...
Operating System Upgrade Implementation Report And...
Julie Kwhl
 

Ähnlich wie MindYourPrivacy: Design and Implementation of a Visualization System for Third-Party Web Tracking (20)

What is web scraping?
What is web scraping?What is web scraping?
What is web scraping?
 
CIS1203 Web Design Principles - Part 1
CIS1203 Web Design Principles - Part 1CIS1203 Web Design Principles - Part 1
CIS1203 Web Design Principles - Part 1
 
IRJET- Phishing Website Detection System
IRJET- Phishing Website Detection SystemIRJET- Phishing Website Detection System
IRJET- Phishing Website Detection System
 
Yelpcamp: A review based website for campgrounds
Yelpcamp: A review based website for campgroundsYelpcamp: A review based website for campgrounds
Yelpcamp: A review based website for campgrounds
 
We are Digital Puppets
We are Digital PuppetsWe are Digital Puppets
We are Digital Puppets
 
Making Web Analytics actionable with Web Content Management
Making Web Analytics actionable with Web Content ManagementMaking Web Analytics actionable with Web Content Management
Making Web Analytics actionable with Web Content Management
 
Web Engineering
Web EngineeringWeb Engineering
Web Engineering
 
Detecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and LinkuriousDetecting eCommerce Fraud with Neo4j and Linkurious
Detecting eCommerce Fraud with Neo4j and Linkurious
 
Open / Public APIs - From Implementation to Digital Business Model
Open / Public APIs - From Implementation to Digital Business ModelOpen / Public APIs - From Implementation to Digital Business Model
Open / Public APIs - From Implementation to Digital Business Model
 
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
 
Smart Crawler Automation with RMI
Smart Crawler Automation with RMISmart Crawler Automation with RMI
Smart Crawler Automation with RMI
 
Advanced internet technologies
Advanced internet technologiesAdvanced internet technologies
Advanced internet technologies
 
Operating System Upgrade Implementation Report And...
Operating System Upgrade Implementation Report And...Operating System Upgrade Implementation Report And...
Operating System Upgrade Implementation Report And...
 
Detection of Phishing Websites
Detection of Phishing WebsitesDetection of Phishing Websites
Detection of Phishing Websites
 
Deep Web
Deep WebDeep Web
Deep Web
 
Trends in front end engineering_handouts
Trends in front end engineering_handoutsTrends in front end engineering_handouts
Trends in front end engineering_handouts
 
DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM FOR E-COMMERCE WEBSITES USERS USING ...
DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM  FOR E-COMMERCE WEBSITES USERS USING ...DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM  FOR E-COMMERCE WEBSITES USERS USING ...
DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM FOR E-COMMERCE WEBSITES USERS USING ...
 
DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM FOR E-COMMERCE WEBSITES USERS USING H...
DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM FOR E-COMMERCE WEBSITES USERS USING H...DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM FOR E-COMMERCE WEBSITES USERS USING H...
DEVELOPING PRODUCTS UPDATE-ALERT SYSTEM FOR E-COMMERCE WEBSITES USERS USING H...
 
Search Engine Scrapper
Search Engine ScrapperSearch Engine Scrapper
Search Engine Scrapper
 
A Novel Interface to a Web Crawler using VB.NET Technology
A Novel Interface to a Web Crawler using VB.NET TechnologyA Novel Interface to a Web Crawler using VB.NET Technology
A Novel Interface to a Web Crawler using VB.NET Technology
 

Mehr von Yuuki Takano

Mehr von Yuuki Takano (16)

アクターモデル
アクターモデルアクターモデル
アクターモデル
 
π計算
π計算π計算
π計算
 
FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine
FARIS: Fast and Memory-efficient URL Filter by Domain Specific MachineFARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine
FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine
 
リアクティブプログラミング
リアクティブプログラミングリアクティブプログラミング
リアクティブプログラミング
 
Transactional Memory
Transactional MemoryTransactional Memory
Transactional Memory
 
Tutorial of SF-TAP Flow Abstractor
Tutorial of SF-TAP Flow AbstractorTutorial of SF-TAP Flow Abstractor
Tutorial of SF-TAP Flow Abstractor
 
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
SF-TAP: Scalable and Flexible Traffic Analysis Platform (USENIX LISA 2015)
 
CUDAメモ
CUDAメモCUDAメモ
CUDAメモ
 
【やってみた】リーマン多様体へのグラフ描画アルゴリズムの実装【実装してみた】
【やってみた】リーマン多様体へのグラフ描画アルゴリズムの実装【実装してみた】【やってみた】リーマン多様体へのグラフ描画アルゴリズムの実装【実装してみた】
【やってみた】リーマン多様体へのグラフ描画アルゴリズムの実装【実装してみた】
 
SF-TAP: L7レベルネットワークトラフィック解析器
SF-TAP: L7レベルネットワークトラフィック解析器SF-TAP: L7レベルネットワークトラフィック解析器
SF-TAP: L7レベルネットワークトラフィック解析器
 
SF-TAP: 柔軟で規模追従可能なトラフィック解析基盤の設計
SF-TAP: 柔軟で規模追従可能なトラフィック解析基盤の設計SF-TAP: 柔軟で規模追従可能なトラフィック解析基盤の設計
SF-TAP: 柔軟で規模追従可能なトラフィック解析基盤の設計
 
Measurement Study of Open Resolvers and DNS Server Version
Measurement Study of Open Resolvers and DNS Server VersionMeasurement Study of Open Resolvers and DNS Server Version
Measurement Study of Open Resolvers and DNS Server Version
 
Security workshop 20131220
Security workshop 20131220Security workshop 20131220
Security workshop 20131220
 
Security workshop 20131213
Security workshop 20131213Security workshop 20131213
Security workshop 20131213
 
Security workshop 20131127
Security workshop 20131127Security workshop 20131127
Security workshop 20131127
 
A Measurement Study of Open Resolvers and DNS Server Version
A Measurement Study of Open Resolvers and DNS Server VersionA Measurement Study of Open Resolvers and DNS Server Version
A Measurement Study of Open Resolvers and DNS Server Version
 

Kürzlich hochgeladen

VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
@Chandigarh #call #Girls 9053900678 @Call #Girls in @Punjab 9053900678
 
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
@Chandigarh #call #Girls 9053900678 @Call #Girls in @Punjab 9053900678
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
imonikaupta
 
Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...
Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...
Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...
Call Girls In Delhi Whatsup 9873940964 Enjoy Unlimited Pleasure
 
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
nirzagarg
 
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Chandigarh Call girls 9053900678 Call girls in Chandigarh
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
ydyuyu
 
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Kürzlich hochgeladen (20)

VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
 
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
 
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency""Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
 
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
 
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort ServiceBusty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
 
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
 
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
 
Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...
Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...
Thalassery Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call G...
 
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirt
 
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
 
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
 
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
 
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
 
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 

MindYourPrivacy: Design and Implementation of a Visualization System for Third-Party Web Tracking

  • 1. IEEE, 12th Annual Conference on Privacy Security Trust, PST 2014 MindYourPrivacy: Design and Implementation of a Visualization System for Third-Party Web Tracking Yuuki Takano, Satoshi Ohta, Takeshi Takahashi, Ruo Ando, Tomoya Inoue 1
  • 2. Introduction ❖ The number of third-party Web tracking is growing each year.! ❖ online privacy is now significant issue! ❖ SNSs and targeted ads can associate real names of individuals with tracking information! ❖ Propose MindYourPrivacy to visualize and show third-party web tracking.! ❖ deep-packet-inspection based architecture! ❖ to support heterogeneous browsers and devices! ❖ Experimented MindYourPrivacy at the Workshop (WIDE Camp 2014 Autumn in JAPAN), which has 129 attendees.! ❖ reveal that clustering web graph helps to detect ads’ sites by analyzing user traffic! ❖ some graph theory features also help to heuristically detect ads sites 2
  • 3. Related Work Web Tracking Mechanism ❖ Third-party Web tracker typically tracks by cookie, Etags or flash storage web bug (1x1 pict) ads social widgets First-party Web servers Third-party Web tracker tracking id (cookie, Etags, flash storage, etc...) contents contents 3
  • 6. Related Work Web Tracking Detection Techniques ❖ ShareMeNot! ❖ swap a link to known data-collection sites such as Facebook! ❖ Roesner et al. “Detecting and defending against third-party tracking on the web”, USENIX NSDI 2012! ❖ Lightbeam! ❖ visualize web graph between first and third-party sites! ❖ https://www.mozilla.org/lightbeam/! ❖ AdBlock Plus! ❖ signature based ads detection and blocking! ❖ https://adblockplus.org/en/firefox 6
  • 7. Related Work Measurements ❖ Several researchers reported on third party web tracker.! ❖ One of the research reported third-party trackers within Alexa’s top 500 domains.! ❖ Roesner et al, “Detecting and defending against third-party tracking on the web”, USENIX NSDI 2012! e fact that the tracking t it is thus difficult to or policy solutions. s classification is ag- on of the mechanisms e storage may be done , and information may ker in any way. This anism makes the clas- evolution of specific by trackers. ework, we created a tomatically classifies rved on the client-side. Figure 6: Prevalence of Trackers on Top 500 Domains. Trackers are counted on domains, i.e., if a particular tracker appears on two pages of a domain, it is counted once. Top 20 Trackers on Alexa’s Top 500 Domains! [Roesner et al. NSDI 2012] 7
  • 8. MindYourPrivacy Design Principle ❖ We designed and implemented a visualization system for third-party web tracking called MindYourPrivacy.! ❖ To clearly show third-party web trackers to users.! ❖ Design Principles of MindYourPrivacy! ❖ Independence from browsers and devices! ❖ the existence of various OSes or devices such as Linux, Windows, MacOS, and smartphone OSes such as Android and iOS complicates the problem! ❖ adopt a deep-packet-inspection based approach to support heterogeneous browsers and devices! ❖ Accessibility and comprehensiveness of the analysis results! ❖ easy to access: MindYourPrivacy provides analysis results in the form of an HTML file via an HTTP server to facilitate users’ access to them! ❖ easy to understand: visualize trackers by tag cloud fashion, and provide web graph’s file further analysis 8
  • 9. Design and Implementation Web Tracker Identification Methodology (1) ❖ HTTP Referrer Web Graph Analysis! ❖ generate a web graph by using HTTP referrer tag! ❖ if an site is referred by many other sites, MindYourPrivacy assumes that it is a suspicious site tracking users! ❖ Domain Aggregation! ❖ to show users which organizations track them, MindYourPrivacy aggregates domains as either second or third level! ❖ platform.twitter.com and platform0.twitter.com are aggregated to twitter.com 9
  • 10. Design and Implementation Web Tracker Identification Methodology (2) ❖ DNS-SOA-Record-Based Grouping! ❖ aggregate domains by DNS SOA record! ❖ facebook.com and facebook.net are aggregated into dns.facebook.com, which is their DNS SOA record! ❖ Balanchander et al., “Privacy diffusion on the web: a longitudinal perspective”, WWW 2009! ❖ Weighted site Ranking of User Data Leakage! ❖ MindYourPrivacy shows not only web trackers but also leaking sites to trackers! ❖ leaking sites are scored, but the details are omitted here. see our paper 10
  • 11. Design and Implementation System Model ❖ MindYourPrivacy captures traffic of users’ web access! ❖ show analyzed results via MindYourPrivacy’s web server! ❖ users need not install or configure specific applications MindYourPrivacy The Internet Traffic Capture Web Access Analyzed Result via HTTP Outgoing Traffic Router・・・ Users 11
  • 12. Design and Implementation Implementation Architecture ❖ Catenaccio DPI! ❖ capture traffic from network IF! ❖ reconstruct TCP stream and store captured data into NoSQL DB! ❖ written in C++! ❖ NoSQL DB! ❖ use MongoDB as a database! ❖ Tracking Analyzer! ❖ analyze measurement data! ❖ written in JavaScript and Python! ❖ HTML/Graph File Generator! ❖ generate visualized results! ❖ written in Python! ❖ HTML Server! ❖ serve HTML/Graph files to users Catenaccio DPI NoSQL DB Tracking Analyzer HTML/Graph File Generator HTML Server NW/IF L2 Datagram Measurement Data Analyzed Result Measurement Data HTML/Graph Files Analyzing Result 12
  • 13. Design and Implementation Web User Interface ❖ visualize suspicious web trackers as tag cloud fashion! ❖ domains are grouped by DNS SOA records! ❖ referring sites are shown in right pane
  • 14. Experiment at WIDE Camp 2013 Autumn ❖ We experimented MindYourPrivacy at WIDE camp 2013 autumn.! ❖ WIDE Camp 2013 Autumn (Sep. 10 - Sep. 13)! ❖ a workshop for Internet researchers, operators and developers! ❖ 129 attendees, most of whom are either IT specialists or students majoring IT! ❖ the experiment is agreed by every attendees (for only research purpose)! ❖ We captured the attendees’ web browsing traffic and analyzed. 14
  • 15. Experiment User Traffic Analysis (1) ❖ We observed 734,194 HTTP requests and 1,661 individual source IP addresses (IPv4 and IPv6). ❖ A directed web graph is generated from the HTTP Referer header. ❖ The graph has 3,966 nodes and 12,941 edges. ❖ We analyze this web graph to find web trackers.
  • 16. Experiment User Traffic Analysis (2) ❖ To find web trackers, we extracted the most-referred sites from the web graph. ❖ Advertisement and social sites, which tend to track users, have many incoming links.
    TABLE II: Top-five Most-referred Sites
    Site                  # of incoming links
    google-analytics.com  847
    facebook.com          437
    twitter.com           393
    doubleclick.net       380
    google.com            356
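The referrer graph and the most-referred ranking described on these two slides might look roughly like the sketch below. It is an illustration under stated assumptions, not the system's code: nodes are taken to be hostnames, self-references are skipped, and the helper names `build_edges` and `most_referred` as well as the sample requests are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def build_edges(requests):
    """Return the set of directed edges (referring host -> requested host)."""
    edges = set()
    for host, referer in requests:
        if not referer:
            continue  # no Referer header, so no edge can be drawn
        src = urlsplit(referer).hostname
        # Assumption: ignore links from a host to itself.
        if src and src != host:
            edges.add((src, host))
    return edges


def most_referred(edges, k):
    """Rank hosts by their number of distinct incoming links."""
    indegree = Counter(dst for _, dst in edges)
    return indegree.most_common(k)


# Made-up sample of (Host header, Referer header) pairs.
sample = [
    ("tracker.example", "http://news.example/a"),
    ("tracker.example", "http://blog.example/b"),
    ("tracker.example", "http://news.example/c"),
    ("cdn.example", "http://news.example/a"),
    ("news.example", None),
]
edges = build_edges(sample)
top = most_referred(edges, 1)
print(top)
```

Storing edges in a set counts each referring host once per target, so a host that is embedded on many pages of one site is not ranked above a host embedded on many different sites.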
  • 17. Experiment User Traffic Analysis (3) ❖ We then applied a clustering technique (MCODE) to the web graph. ❖ As a result of the clustering, many ad sites were found in one cluster.
    [Slide image: cropped excerpt from the paper; the recoverable text follows.]
    In total we observed 2,309 and 2,671 requests for platform.twitter.com and www.facebook.com, respectively. However, we found only about 100 unique values for each cookie, though fr of www.facebook.com has 397. fr thus does not seem to be a tracking cookie, and the 100 likely indicates the number of attendees (which was also around 100) or devices. The results reveal that tracking cookies can also be used for per-user analysis and visualization.
    We then applied MCODE clustering [20] to the graph in Figure 5 to find further features. This allowed us to observe many ad sites clustered into the rank 1 cluster by MCODE. The following domains were ad sites found in the rank 1 cluster of Figure 6: doubleclick.net, amazon-adsystem.com, googleadservices.com, i-mobile.co.jp, advg.jp, adingo.jp, iogous.com, admeld.com, criteo.com. Ad sites generally tend to collect user information for business purposes. We therefore should be concerned with the privacy issues they present. This discovery should help further analysis and visualization concerning such sites. Table IV lists the feature vector of ad sites and other sites.
  • 18. Experiment User Traffic Analysis (4) ❖ We analyzed the cluster in terms of graph-theoretic features. ❖ As a result, we found that ad sites differ markedly from other sites in their numbers of incoming and outgoing links and in neighborhood connectivity. ❖ ad sites have many incoming links but few outgoing links ❖ ad sites’ neighborhood connectivity is relatively low
    Fig. 6: Rank 1 Cluster by MCODE (include loops = false, degree cutoff = 2, haircut = true, fluff = false, node score cutoff = 0.2, k-core = 2, and max. depth = 100)
    TABLE IV: Feature Vector of Rank 1 Cluster’s Edge (Average and Unbiased Variance)
                # of incoming links   # of outgoing links   Neighborhood connectivity
                avg.    var.          avg.    var.          avg.     var.
    ad sites    90.2    12405.4       15.2    3972.9        46.0     3972.9
    others      30.2    3972.9        29.7    569.3         130.2    5212.0
    [Slide image: cropped fragment of the paper's questionnaire results; text not recoverable.]
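The three features compared on this slide can be computed per node along the following lines. This is a hedged sketch: the slides do not define neighborhood connectivity, so the code assumes the common definition (the average number of neighbors of a node's neighbors, ignoring edge direction). The `node_features` helper and the sample edges are hypothetical.

```python
from collections import defaultdict


def node_features(edges):
    """Compute in-degree, out-degree and neighborhood connectivity per node."""
    incoming = defaultdict(set)
    outgoing = defaultdict(set)
    neighbors = defaultdict(set)
    for src, dst in edges:
        outgoing[src].add(dst)
        incoming[dst].add(src)
        # Neighbor sets ignore direction (assumption, see lead-in).
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    features = {}
    for node, adjacent in neighbors.items():
        # Average number of neighbors over this node's neighbors.
        connectivity = sum(len(neighbors[n]) for n in adjacent) / len(adjacent)
        features[node] = {
            "in": len(incoming[node]),
            "out": len(outgoing[node]),
            "neighborhood_connectivity": connectivity,
        }
    return features


# Made-up graph: three sites embed one ad host, and one site links to another.
sample_edges = [("a", "ad"), ("b", "ad"), ("c", "ad"), ("a", "b")]
features = node_features(sample_edges)
print(features["ad"])
```

In this toy graph the ad host has the most incoming links, no outgoing links, and a lower neighborhood connectivity than site "a", which mirrors the pattern the slide reports.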
  • 19. Experiment User Traffic Analysis (5) ❖ The Do Not Track (DNT) flag announces a user's wish not to be tracked to third-party trackers. ❖ However, only about 40,600 DNT-enabled requests, roughly 5.5 % of the 734,194 total, were observed.
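Measuring the share of DNT-enabled requests could be sketched as below. The sketch assumes each request is available as a dictionary of headers with normalized names. The `dnt_share` helper and the sample requests are hypothetical, and the last line only checks the slide's percentage using a rounded count of 40,600.

```python
def dnt_share(requests):
    """Return (count, fraction) of requests that carry the header 'DNT: 1'."""
    enabled = sum(1 for headers in requests if headers.get("DNT") == "1")
    fraction = enabled / len(requests) if requests else 0.0
    return enabled, fraction


# Made-up sample: one of four requests has DNT enabled.
sample = [{"DNT": "1"}, {}, {"DNT": "0"}, {}]
count, share = dnt_share(sample)
print(count, f"{share:.1%}")

# Arithmetic check with a rounded count against the 734,194 observed requests.
slide_share = 40_600 / 734_194
print(f"{slide_share:.1%}")
```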
  • 20. Conclusion and Future Work ❖ Proposed MindYourPrivacy, a visualization system for third-party web tracking. ❖ browser- and device-independent architecture ❖ visualizes web trackers as a tag cloud ❖ Tested MindYourPrivacy at WIDE Camp 2013 Autumn and analyzed users’ web browsing traffic. ❖ generated a web graph from the HTTP Referer header and analyzed it ❖ revealed that graph clustering and some graph-theoretic features are useful for finding web trackers ❖ Future work: adopting the more sophisticated approaches revealed by the experiment, and a signature-based approach.