Complex Networks Class Project
Location Correlation in Human Mobility
Marcello Tomasini
Bio-Complex Lab
Department of Computer Sciences
Florida Tech
Twitter Miner Implementation 
The application that mines Twitter is written in Python and uses the
following libraries:
• twitter (Python Twitter Tools): data is collected through the Twitter
streaming API and appended to a local buffer
• pymongo (MongoDB): data is stored on the BioComplex Lab MongoDB
instance
• logging: the Python logging facility keeps track of code exceptions and
of non-standard Twitter messages in the stream (warning, limit,
disconnect), mostly for debugging. In most cases an exception does not
stop program execution; the program tries to recover instead, to avoid
manual intervention
• collections: collections.deque provides thread-safe, high-performance
local buffering, which reduces network I/O and the load on the BioComplex
Lab MongoDB server
• threading: data is pushed to the BioComplex Lab MongoDB instance by a
separate thread. The thread pops a fixed number of elements from the
deque and attempts the insert operation. If the insert fails, the
operation is reverted, so no tweets are lost. The Python GIL is not an
issue here because the thread is I/O-bound
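The buffering scheme described in the last two bullets can be sketched as follows. This is a minimal illustration, not the project's code: `flush_batch`, `writer_loop`, `BATCH_SIZE` and the `insert_many` callable are hypothetical names, and `insert_many` stands in for whatever database call performs the bulk insert.

```python
import threading
from collections import deque

BATCH_SIZE = 3  # hypothetical value; the slides only say "a fixed number"


def flush_batch(buffer, insert_many, batch_size=BATCH_SIZE):
    """Pop up to batch_size items from the buffer and try to store them.

    Returns the number of items stored. If insert_many raises, the popped
    items are put back at the head of the deque in their original order
    and 0 is returned, so nothing is lost.
    """
    batch = []
    try:
        for _ in range(batch_size):
            batch.append(buffer.popleft())
    except IndexError:
        pass  # the buffer ran empty before the batch was full
    if not batch:
        return 0
    try:
        insert_many(batch)
    except Exception:
        # Revert: extendleft reverses its input, so feed it the reversed batch.
        buffer.extendleft(reversed(batch))
        return 0
    return len(batch)


def writer_loop(buffer, insert_many, stop_event, idle_wait=0.01):
    """Body of the writer thread: flush batches until asked to stop."""
    while not stop_event.is_set():
        if flush_batch(buffer, insert_many) == 0:
            stop_event.wait(idle_wait)  # nothing stored; back off briefly
    # Final drain; stops at the first failed or empty flush.
    while flush_batch(buffer, insert_many) > 0:
        pass
```

The collector thread would call `buffer.append(tweet)` while a `threading.Thread(target=writer_loop, ...)` consumes from the other end; single `append` and `popleft` calls on a deque are thread-safe, so no explicit lock is used in this sketch.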
The code runs on an Amazon EC2 t2.micro instance for high reliability
(99.95% SLA).
Performance: the code easily handles a ~8 Mbps Twitter stream (the
worldwide stream of geotagged tweets), corresponding to ~2000 tweets/s.
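As a quick sanity check on those two figures, the average size per tweet they imply can be worked out directly. The calculation assumes that "Mbps" means megabits per second and that both numbers are rough averages.

```python
STREAM_BITS_PER_SECOND = 8_000_000  # ~8 Mbps, read as megabits per second
TWEETS_PER_SECOND = 2_000           # ~2000 tweets/s

bytes_per_second = STREAM_BITS_PER_SECOND / 8
bytes_per_tweet = bytes_per_second / TWEETS_PER_SECOND
print(bytes_per_tweet)  # 500.0
```

Under these assumptions each tweet occupies about 500 bytes on the wire.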
Network Builder Implementation 
The application that builds the network is written in Python and uses the
following libraries:
• pymongo (MongoDB): filters tweets with a bounding box (needed because of
a Twitter bug) and retrieves data from the BioComplex Lab MongoDB
instance. Query projections reduce the amount of data transferred over
the network
• scikit-learn: provides functions to compute k-means clustering of
coordinate points. The clusters represent locations
• numpy: provides fast array and matrix data structures
• matplotlib.pyplot: plots graphs
• igraph: creates and exports the network structure
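A bounding-box filter with a projection, as mentioned in the pymongo bullet, might look like the sketch below. The field names are assumptions: they follow the standard tweet JSON layout, where `coordinates.coordinates` holds `[longitude, latitude]`, and the slides do not say which fields were actually stored or queried. `bounding_box_query` is a hypothetical helper that only builds the two query documents.

```python
def bounding_box_query(west, south, east, north):
    """Build (filter, projection) documents for a bounding-box query.

    Assumes each stored tweet has coordinates.coordinates == [lon, lat].
    The box must not cross the antimeridian in this simple form.
    """
    query_filter = {
        "coordinates.coordinates.0": {"$gte": west, "$lte": east},    # longitude
        "coordinates.coordinates.1": {"$gte": south, "$lte": north},  # latitude
    }
    # Return only the fields needed downstream to cut network transfer.
    projection = {
        "_id": 0,
        "user.id": 1,
        "created_at": 1,
        "coordinates.coordinates": 1,
    }
    return query_filter, projection


# Illustrative box only; the slides do not give the region used.
query_filter, projection = bounding_box_query(-88.0, 24.0, -79.0, 31.5)
```

With pymongo, the two documents would be passed as `collection.find(query_filter, projection)`.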
Clustering needs a distance metric. Coordinates lie on a sphere rather than
in a Euclidean space, so the distance between two points is the
great-circle distance [1], which can be computed with the haversine
formula [2].
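For reference, the haversine formula can be written in a few lines of Python. This is a generic sketch, not code from the project; the function name and the mean Earth radius of 6371 km are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1 to guard against rounding slightly above the valid range.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))
```

Two antipodal points on the equator, for example `haversine_km(0, 0, 0, 180)`, are half a circumference apart, that is, pi times the radius.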
However, most implementations build a distance matrix when supplied with a
non-standard metric, which requires O(n^2) space. Given the size of the
dataset, that is impractical, so we use the Mercator projection [3] to
project the coordinates into a Euclidean space and then apply the standard
k-means algorithm.
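The trade-off can be illustrated with a short sketch: one helper estimates the memory a dense n-by-n distance matrix would need, and another applies the spherical Mercator projection. Both function names are hypothetical, the point count used below is illustrative (the slides do not state the dataset size), and 8 bytes per entry assumes double-precision floats.

```python
import math


def distance_matrix_bytes(n_points, bytes_per_entry=8):
    """Memory needed for a dense n x n matrix of pairwise distances."""
    return n_points * n_points * bytes_per_entry


def mercator(lat_deg, lon_deg, radius=1.0):
    """Spherical Mercator projection of a (lat, lon) pair given in degrees.

    Returns planar (x, y). Undefined at the poles, and distances are
    increasingly stretched at high latitudes.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# One million points would need 8 * 10**12 bytes (about 8 TB) as a dense matrix.
print(distance_matrix_bytes(1_000_000))  # 8000000000000
```

The projected (x, y) pairs can then be stacked into a numpy array and clustered with a Euclidean k-means implementation such as scikit-learn's `sklearn.cluster.KMeans`, which needs memory proportional to the number of points rather than its square.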
[1] http://en.wikipedia.org/wiki/Great-circle_distance 
[2] http://en.wikipedia.org/wiki/Haversine_formula 
[3] http://en.wikipedia.org/wiki/Mercator_projection

CSE5656 Complex Networks - Location Correlation in Human Mobility, Implementation
