Social Maps for a City
Taha Kachwala
4408225
TU Delft
taha.kachwala21@gmail.com
ABSTRACT
There has been large-scale migration towards cities in many
countries around the globe. Traditional methods of categorizing the
citizens of a city by race, religion, age or nationality no longer
work. To manage a city effectively, governments need to know what
kinds of communities live in their city, and in which areas, so that
they can involve the relevant communities when taking specific
decisions. This is possible if the people living in a city can be
grouped effectively on the basis of their interests, background,
knowledge etc. This paper suggests two methods based on social
media for grouping people in this way. Both methods eventually
create Social Maps, which visualize clusters of communities as well
as how these communities are related to each other. These maps can
help city as well as national governments to collectively improve
the “Social Progress Index” of the nation.
1. INTRODUCTION
In the real world, a person can maintain roughly 150
relationships. This number is called the “Dunbar Number”. Some
people may have more and some fewer. Online, however, people may
have many more relationships, perhaps a few thousand. The offline
relationships that people have will roughly be a subset of the
relationships they have online. Online relationships are more
flexible, as they can operate globally and at all times of the day.
If we know the online relationships between people in a specific
geographical area, we can roughly estimate their real-world
relationships as well. The principle of ‘homophily’ is the powerful
tendency of people to organize themselves into groups of people who
are similar to themselves, whether online or offline.
So if we accept the notion that people do, in fact, have
relationships that both shape and are shaped by their interactions,
then it follows that there may be ways to measure these
relationships with some level of fidelity. Social networks can offer
some information about these relationships, though with some
biases. The relationships and interactions of people in these social
networks are used extensively today for recommendation systems, for
finding people you may know, for security purposes etc. So why not
for managing a city effectively?
This paper suggests a methodology by which user-generated data
on Twitter, Facebook, LinkedIn, Xing and SoundCloud can be used to
map users into communities based on their interests. The
relationships between different communities are then found, along
with how strongly they are connected to each other, e.g. the ‘Tech
geek’ community will be closely related to the ‘Web Developer’
community. These communities can further be used to obtain relevant
information from the data generated by these users. The people of a
city can then be categorized by their social construct rather than
by ‘race’, ‘religion’, ‘ethnicity’ etc., which have proven to be
poor proxies for diversity. City governments and municipalities can
consult specific or related communities for valuable input on
certain decisions.
Using user modelling on their social media platforms, city users
are first categorized into communities such as politics, tech geeks,
radio and newspapers, sports, travel, religious ideologies, web
developers and coders, bloggers, activists, age groups, etc.
Relations between these groups of people are then created.
If we succeed in providing this data, governments can learn more
about the social construct of the people living in a city, what they
like to do and what they can do about it. This will contribute to
the development of diversity, which can be used to tackle some
intractable problems of society in a new way, including many urban
challenges regarding the environment, transportation, buildings etc.
2. RELATED WORK
2.1 Social Progress Index
GDP is usually used as a measure of the development of a nation,
and it has defined and shaped our lives for the last 80 years. The
concept was introduced by Simon Kuznets in a report titled
“National Income, 1929-1932”. But in that first report Kuznets
himself warned that the ‘welfare of a nation can, therefore,
scarcely be inferred from a measurement of national income’. In
other words, GDP is a tool to help us measure economic performance,
but it is not a measure of our well-being.
Social Progress Index (SPI) is a new tool which helps measure the
social progress of people living in a city or country. It provides a
rich framework for measuring the multiple dimensions of social
progress, benchmarking success, and catalyzing greater human
well-being.
Social Progress Index is defined as
The capacity of a society to meet the basic human needs of its
citizens, establish the building blocks that allow citizens and
communities to enhance and sustain the quality of their lives, and
create the conditions for all individuals to reach their full
potential.
Figure 1 (source: [7]) below details the attributes that define the
‘Social Progress Index’.
More information about the Social Progress Index is available in
[8], and a TED talk about it in [9]. To summarize, Table 1 lists the
top 20 countries by SPI together with their GDP figures.
Table 1
Rank  Country          SPI     GDP
1     New Zealand      88.24   25,857
2     Switzerland      88.19   39,293
3     Iceland          88.07   33,880
4     Netherlands      87.37   36,438
5     Norway           87.12   47,547
6     Sweden           87.08   34,945
7     Canada           86.95   35,936
8     Finland          86.91   31,610
9     Denmark          86.55   32,363
10    Australia        86.10   35,669
11    Austria          85.11   36,200
12    Germany          84.61   34,819
13    United Kingdom   84.56   32,671
14    Japan            84.21   31,425
15    Ireland          84.05   36,723
16    United States    82.77   45,336
17    Belgium          82.63   32,639
18    Slovenia         81.65   24,483
19    Estonia          81.28   18,927
20    France           81.11   29,819
As we can see, the United States stands 16th according to SPI, even
though its GDP is among the highest in the table (only Norway's is
higher). Conversely, New Zealand has a GDP far lower than that of
the US, but it is ranked #1 according to SPI. This suggests that, by
the measure of the SPI, people in New Zealand are better off than
those living in the US despite the lower GDP.
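The comparison between the two rankings can be checked with a few lines of Python. This is only an illustrative sketch: the rows are copied from Table 1, while the `rank_by` helper and the variable names are assumptions made for this example.

```python
# (country, SPI, GDP) rows copied from Table 1.
TABLE_1 = [
    ("New Zealand", 88.24, 25857), ("Switzerland", 88.19, 39293),
    ("Iceland", 88.07, 33880), ("Netherlands", 87.37, 36438),
    ("Norway", 87.12, 47547), ("Sweden", 87.08, 34945),
    ("Canada", 86.95, 35936), ("Finland", 86.91, 31610),
    ("Denmark", 86.55, 32363), ("Australia", 86.10, 35669),
    ("Austria", 85.11, 36200), ("Germany", 84.61, 34819),
    ("United Kingdom", 84.56, 32671), ("Japan", 84.21, 31425),
    ("Ireland", 84.05, 36723), ("United States", 82.77, 45336),
    ("Belgium", 82.63, 32639), ("Slovenia", 81.65, 24483),
    ("Estonia", 81.28, 18927), ("France", 81.11, 29819),
]


def rank_by(rows, column):
    """Return {country: rank}, where rank 1 is the largest value in `column`."""
    ordered = sorted(rows, key=lambda row: row[column], reverse=True)
    return {row[0]: position for position, row in enumerate(ordered, start=1)}


spi_rank = rank_by(TABLE_1, 1)  # column 1 holds the SPI score
gdp_rank = rank_by(TABLE_1, 2)  # column 2 holds the GDP figure

for country in ("New Zealand", "United States", "Norway"):
    print(country, "SPI rank:", spi_rank[country], "GDP rank:", gdp_rank[country])
```

Within these 20 countries the two orderings clearly diverge, which is the point the table is meant to make.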
One of the main goals of this paper is to help governments
increase their SPI by giving them a tool to improve the well-being
of their citizens, rather than focusing solely on growth in GDP.
User information is spread across many platforms. With the help of
the Social Web and Web 2.0, we can try to merge this user
information from different platforms into one model. Two types of
models can be used for such cross-system collaborative approaches:
● A centralized approach with standardized models that
can aggregate the distributed user information over
different platforms.
● A decentralized approach where dedicated software
components transfer user information from one
application’s representation into another.
In this paper, we rely on the former, centralized approach. Within
this approach, two main submodels exist. The first submodel relies
on standardized user models on which all involved applications must
agree. This involves using generalized ontologies such as the
General User Modeling Ontology (GUMO) [1] or Friend-of-a-Friend
(FOAF). For this paper we rely on the second submodel, which builds
meta-models that define how application-dependent user data
corresponds to user data from another application. The advantage of
this approach is that the applications need not use the same generic
user model, as they must in the first case. The ontology also allows
relationships between the data to be defined, so that the data can
be aggregated. It is therefore possible to get a set of user
interests from Facebook and merge it with music-related interests
from SoundCloud. The music-related interests will be a subset of the
user's interests, but more detailed and specific [2].
An assumption of this paper is that the resulting system will be
used by city governments and municipalities. The governments can
therefore obtain the required permissions from their citizens for
applications like Facebook, Twitter, LinkedIn, Xing and SoundCloud.
These permissions are chosen so that they do not violate the user’s
privacy.
3. SOCIAL MAPS USING SWUM
To build a successful Social Web User Model (SWUM) based on
various platforms, we first need to analyze what kind of data we
can capture from different social platforms without invading the
privacy of the citizens. Thanks to the widespread use of the OAuth
protocol, many successful web platforms provide their own
authorization and basic profile information to external
applications. Facebook, Twitter, LinkedIn etc. also provide more
data about their users to external applications through their APIs.
Table 2 lists the relevant information that we can obtain from
these social platforms.
Table 2
Platform     Required User Permissions
Facebook     id, name, gender, locale, user_friends (only the
             friends living in the same city), email,
             user_actions (books, fitness, music, news),
             user_activities, user_interests, user_location,
             user_education_history
Twitter      Read tweets from timeline, who you follow
LinkedIn     Basic Profile Fields, language fields, skills fields,
             certification fields, Education fields, Position fields
Xing         Basic Profile Fields, professional_experience,
             active_email
SoundCloud   music interests
The above websites are chosen because they provide relevant and
useful information that can collectively be used to build a
detailed user model for every citizen. Table 3 lists the semantics
that we can obtain from each of the platforms.
To create a social web user model, we need to analyze which types
of information and which user model dimensions should be part of
the model, and which attributes in these dimensions should be
supported.
Table 3
Platform     Semantics obtained
Facebook     1. Generic information about the user
             2. User’s connections within the city (this only
                provides the list of users who access the same
                platform)
             3. Daily activities of the user
             4. User’s interests
             5. Behavioural analysis
Twitter      Twitter can give us ample information about the user:
             1. Current activities of the user
             2. Political viewpoints
             3. Logical viewpoints
             4. Behavioural analysis
LinkedIn     Basic Profile Fields, language fields, skills fields,
             certification fields, Education fields, Position fields
Xing         Basic Profile Fields, professional_experience,
             active_email
SoundCloud   music interests
3.1 User Model Dimensions
After collecting the above-mentioned information about our users,
we need to model the dimensions of this information so that we can
create our user model. The following is the taxonomy of dimensions
that we need to use.
1. Personal Characteristics and Demographics- Basic
information like age, gender, name, address, location,
contact information. We can collect this data from
facebook, LinkedIn and Xing.
2. Interests- The type of hobbies and interests a user has
e.g. news, politics, gaming, online shopping etc.
Facebook and twitter can collectively give us an
accurate set of users’ interests.
3. Mental and Physical well-being - Describes individual
characteristics like physical limitations, health or mental
states like stress, cognitive load. This can be derived
from information received from facebook page likes, the
people the user follows on twitter, job profile from
LinkedIn and Xing, and music interests from
soundcloud.
4. Knowledge- This describes how socially active a user is
in certain fields, their educational status, skills etc. This
type of information can be derived from the tweets a
user makes, the universities they attended, their
qualifications and the people they follow on Twitter, as
well as from social business applications like LinkedIn
and Xing. It is dynamic and needs to be re-analyzed
periodically, because knowledge of a given topic may
increase or decrease over time.
5. Individual Behaviour- This is certainly one of the most
important characteristics that can define a user. This
dimension has a direct impact on the previously
specified dimensions and can also be used to infer
information about them. Deriving user behaviour is a
complicated process, as it is usually an implicit feature
and is not available on a user’s social profile. We
discuss the analysis of this dimension in more detail in
section 3.6.
6. Context: In computer science this term generally refers
to “any information that can be used to characterize the
situation of an entity” [3]. In the area of user modelling,
this term focuses on the user’s environment (location
and time, devices the user uses). According to research,
‘context’ is a very important area as far as user
modelling is concerned [4], but it has a very limited
application for this research.
This means that, to generate an effective user model for our
application, it is important to cover the dimensions of Personal
Characteristics and Demographics, Interests, Mental and Physical
well-being, Knowledge and Individual Behaviour [6].
3.2 User Model Attributes
We have selected the required dimensions, so now we need to
define the attributes of these dimensions that our user model
supports. Table 4 shows an example of attributes in Personal
Characteristic dimension.
Table 4
3.3 WordNet
Similar information about the same user is stored by many
applications, e.g. Facebook, Twitter and LinkedIn all store basic
information about the same user, but under different names.
Facebook uses the term ‘username’ while Twitter uses ‘handle’ to
identify unique users. This problem of attribute name heterogeneity
complicates aggregation using the meta-model strategy. To solve
this problem, WordNet is used.
WordNet defines word sense relations between words; more details
about WordNet can be found in [5]. To summarize, if a word
represents a user attribute, the relatedness between different
attributes can be obtained through WordNet. Since we assumed
earlier that users themselves grant our application permission to
access their profiles in the different applications, aggregating
personal characteristics is not a problem. For aggregating
interests and mental and physical well-being, however, WordNet is a
useful aid.
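A minimal sketch of how attribute names could be reconciled before aggregation is given below. It does not call WordNet itself: the small hand-written `SYNONYMS` table merely stands in for the word-sense lookups that WordNet would provide, and all names in the block are assumptions made for illustration.

```python
# Hand-written stand-in for WordNet sense relations (illustrative only):
# each platform-specific attribute name maps to one canonical name.
SYNONYMS = {
    "username": "user_name",
    "handle": "user_name",
    "screen_name": "user_name",
    "email": "email",
    "e-mail": "email",
    "mail": "email",
}


def canonical(attribute):
    """Map a platform-specific attribute name to its canonical name."""
    key = attribute.strip().lower()
    return SYNONYMS.get(key, key)  # unknown names are kept as they are


def normalise_profile(profile):
    """Rewrite all attribute names of one platform profile canonically."""
    return {canonical(name): value for name, value in profile.items()}


facebook_profile = {"username": "jdoe", "E-mail": "jdoe@example.org"}
twitter_profile = {"Handle": "jdoe"}

print(normalise_profile(facebook_profile))
print(normalise_profile(twitter_profile))
```

Once both profiles use the same attribute names, their values can be compared and merged field by field.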
3.4 Use Case: Profile Aggregation
The figure above shows an example of how user profiles from
Facebook and LinkedIn can be aggregated to form our Social Web User
Model. To derive our first dimension, Personal Characteristics, we
need to merge the data we get from Facebook and LinkedIn. Name,
email, contact information, educational qualifications and current
employment can be derived from both Facebook and LinkedIn. However,
the dimension is more reliable if we use professional data from
LinkedIn, because on Facebook many people claim to be studying at
‘Hogwarts school of Wizardry’ and working at ‘Mah Lyf, Mah Rulez’.
Demographic information from LinkedIn should sometimes be preferred
as well, but I find it more relevant to extract it from Facebook,
because people update basic information on Facebook more frequently
than on LinkedIn (once they have found a good job, they tend not to
update their location, email etc.).
Demographics
● Gender: string
○ Male: bool
○ Female: bool
● Birthdate: Date
● Language: string
● Education: string
○ High-school: bool
○ Bachelor’s: bool
○ Master’s: bool
○ PhD: bool
● Employment: string
○ employed: bool
Contact Information
● Name: string
● Mobile number: int
● e-mail: string
● Places lived: list of locations
● Current City: Location
Location
● Country: string
● State: string
● City: string
● address: string
For our second dimension, ‘Interests’, Facebook can clearly provide
more concrete details. We can learn about user preferences from
what the user likes and from the interests that Facebook has
already extracted. Similarly, for ‘Knowledge’, we can extract data
from LinkedIn regarding work experience, past jobs, skills etc.
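The source preferences described in this use case can be sketched as a small priority table. This is only one possible reading of the text: the attribute names, the `SOURCE_PRIORITY` table and the sample profiles are hypothetical, and a real system would work on the attributes of Table 4.

```python
# For each attribute, the platforms to try in order of preference.
# Professional data prefers LinkedIn; demographic data prefers Facebook.
SOURCE_PRIORITY = {
    "education": ["linkedin", "facebook"],
    "employment": ["linkedin", "facebook"],
    "current_city": ["facebook", "linkedin"],
    "email": ["facebook", "linkedin"],
}


def aggregate(profiles):
    """Merge per-platform profiles into one record using SOURCE_PRIORITY."""
    merged = {}
    for attribute, sources in SOURCE_PRIORITY.items():
        for source in sources:
            value = profiles.get(source, {}).get(attribute)
            if value:  # take the first non-empty value and stop
                merged[attribute] = value
                break
    return merged


profiles = {
    "facebook": {
        "education": "Hogwarts school of Wizardry",
        "current_city": "Delft",
        "email": "user@example.org",
    },
    "linkedin": {
        "education": "TU Delft",
        "employment": "Web developer",
        "current_city": "Mumbai",
    },
}

print(aggregate(profiles))
```

Keeping the preferences in a table rather than in code makes it easy to change which platform is trusted for which attribute.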
3.5 Mapping Connections between different User Models
Once we have the Social Web User Model ready for all or most of
the citizens living in the city, we can start analyzing connections
and mapping them. This can be done in two steps.
1. Finding relations between different dimensions of the
user model.
2. Finding relations between same dimensions of different
users’ model.
Let me explain both steps using an example. Assume that there are
10 male users living in Rotterdam, in the age group of 20-30 years
(Personal Info and Demographics), who are all aerospace engineers
(Knowledge) and are interested in sports and rock music
(Interests). Now we have a new male user A whose ‘Interests’
dimension is incomplete. If ‘A’ lives in Rotterdam, is an aerospace
engineer and is in the age group of 20-30 years, then there is a
high probability that he is also interested in sports and rock
music.
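The first step can be sketched as follows, assuming each user model is a plain dictionary. The field names, the `infer_interests` helper and the 80% threshold are assumptions made for this example, not part of the method described above.

```python
from collections import Counter

MATCH_FIELDS = ("gender", "city", "age_group", "profession")


def infer_interests(target, known_users, min_share=0.8):
    """Suggest interests held by at least `min_share` of matching users."""
    peers = [
        user for user in known_users
        if all(user[name] == target[name] for name in MATCH_FIELDS)
    ]
    if not peers:
        return set()
    # Interests are stored as sets, so each user counts once per interest.
    counts = Counter(interest for user in peers for interest in user["interests"])
    return {
        interest for interest, count in counts.items()
        if count / len(peers) >= min_share
    }


known_users = [
    {"gender": "male", "city": "Rotterdam", "age_group": "20-30",
     "profession": "aerospace engineer",
     "interests": {"sports", "rock music"}}
    for _ in range(10)
]
user_a = {"gender": "male", "city": "Rotterdam", "age_group": "20-30",
          "profession": "aerospace engineer"}

print(infer_interests(user_a, known_users))
```

A user from a different city or profession has no matching peers here, so nothing is inferred for them.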
For the second step, let us assume that there are three groups of
users: A, B and C. Group A is interested in politics, sports and
music; group B is interested in politics, sports and the stock
market; and group C is interested in the stock market and Justin
Bieber. Analyzing these dimensions, we find that politics and sports
are closely related to music and the stock market, while the stock
market is somewhat related to Justin Bieber. From this we can infer
that groups A and B are closely related, B and C are somewhat
related, and A and C are not related. We can conclude that politics
and sports are interests that lie far away from an interest in
Justin Bieber.
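One simple way to put numbers on "closely", "somewhat" and "not" related is the Jaccard index: shared interests divided by all interests of the two groups. The paper does not prescribe a particular measure, so the choice of Jaccard here is an assumption made for illustration.

```python
from itertools import combinations


def jaccard(first, second):
    """Shared interests divided by all interests of the two groups."""
    union = first | second
    return len(first & second) / len(union) if union else 0.0


groups = {
    "A": {"politics", "sports", "music"},
    "B": {"politics", "sports", "stock market"},
    "C": {"stock market", "Justin Bieber"},
}

relatedness = {
    (a, b): jaccard(groups[a], groups[b])
    for a, b in combinations(sorted(groups), 2)
}

for pair, score in relatedness.items():
    print(pair, score)
# ('A', 'B') 0.5
# ('A', 'C') 0.0
# ('B', 'C') 0.25
```

The scores reproduce the ordering in the example: A and B are closest, B and C are weakly related, and A and C share nothing.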
Finally, we can map these relationships using a force-directed
graph. This creates a Social Map. Such social maps can be generated
for each dimension of the Social Web User Model, so that each
dimension, and the way its attributes are related, can be studied
in detail. Governments can use these maps to take decisions in
different scenarios. They can easily consult and gather opinions on
a specific situation from the communities that will be most
affected and from other closely related communities.
3.6 Behavior Analysis
The kind of democracy offered by social media and the internet has
resulted in users exhibiting different behaviours such as sharing,
posting, liking, commenting, tweeting, following and advertising on
a daily basis. When these user behaviours on social media are
analyzed, they can be categorized into individual and collective
behaviours. Individual behaviour is exhibited by a single user,
whereas collective behaviour is observed when a group of users
behave together, e.g. users using the same hashtag on Twitter.
3.6.1 Individual Behavior Analysis
Individual behaviour can be one of the following:
1. User-User Behaviour: Observed between two
users, e.g. befriending or following
another user.
2. User-Entity Behaviour: Liking a post or
posting a tweet on Twitter.
3. User-Community Behaviour: Joining or leaving
groups on Facebook or LinkedIn.
Irrespective of the type of behaviour, we can use computational
methods to analyze behaviour and find interesting patterns. To
analyze individual behaviour, we can trace who the user follows on
Twitter over time and try to understand the underlying reasons for
these followings. A machine learning program can be implemented
using randomization tests or causality testing techniques of the
kind described by Zafarani et al. in Social Media Mining (see
References).
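As a first step towards such an analysis, raw events can be sorted into the three behaviour types listed above. The sketch below assumes a hypothetical event format (a dictionary with `user` and `action` keys) and a hand-written action table. It only covers the bookkeeping, not the randomization or causality tests.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from logged actions to the three behaviour types.
BEHAVIOUR_TYPES = {
    "befriend": "user-user",
    "follow": "user-user",
    "like": "user-entity",
    "tweet": "user-entity",
    "join_group": "user-community",
    "leave_group": "user-community",
}


def classify(event):
    """Return the behaviour type of one event, or 'unknown'."""
    return BEHAVIOUR_TYPES.get(event["action"], "unknown")


def behaviour_profile(events):
    """Count, per user, how often each behaviour type occurs."""
    profile = defaultdict(Counter)
    for event in events:
        profile[event["user"]][classify(event)] += 1
    return profile


events = [
    {"user": "alice", "action": "follow"},
    {"user": "alice", "action": "tweet"},
    {"user": "alice", "action": "like"},
    {"user": "bob", "action": "join_group"},
]

for user, counts in behaviour_profile(events).items():
    print(user, dict(counts))
```

Tracking these counts over successive time windows would give the over-time view of a user's followings that the text describes.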
3.6.2 Collective Behavior Analysis
Collective behaviour can be derived by independently analyzing the
individuals who exhibit it and then aggregating the results of the
individual behaviour analyses. You can read more about behavioural
analysis in [8].
This behavioural data is massive, expansive and indicative of user
preferences, interests and opinions. These ‘opinions’ are among the
most important things that a city government needs to know about
its citizens. Opinions on different issues can vary collectively
with the community of users. During a financial situation, opinions
from a group of economists can be of use, while during a political
situation, opinions from the politically interested community would
be more relevant than others. This can help the government to
manage the welfare and well-being of its citizens and collectively
increase the ‘Social Progress Index’ of the city.
3.7 Conclusion-Social Maps using Social Web User Model
If we can secure the above information from every citizen in the
city, we can create a user model for every citizen. This data can
then be used to map relationships between users. We can find
clusters of communities living within the city, their job profiles
and salaries, and eventually target the issues that really matter.
However, owing to the NSA revelations and the fear of secret
government surveillance programs, many people will be reluctant to
provide the required permissions. Users need to be assured that
they will not become targets of any such surveillance or censorship
and that the data is used for the sole purpose of managing a city.
To explore this issue, I ran a short survey to find out how willing
people are to provide their social data. The results are shown on
the next page (source:
http://www.pollican.com/result/7/What_information_are_you_willing_to_share_with_your_government).
The survey results show that people are usually not willing to give
Facebook data to governments. According to this survey, the only
Facebook data they are willing to share is their email. From
LinkedIn and Xing, people are readily willing to share their
skills, experience and educational details, and most of them are
ready to share their job profile as well. Twitter has the most
positive results, as people are willing to share their Twitter
streams as well as their follow lists. The only downside of Twitter
is that not many people have Twitter accounts or tweet regularly.
Nevertheless, this leads us to the next step of this paper:
developing ‘Social Maps’ using Twitter data.
4. SOCIAL MAPS USING TWITTER DATA
In this section we use Twitter data to generate social maps of a
particular city and propose an algorithm for doing so. As an
example, we use Munich as the target city. As this is intended for
a city government, we assume that we already have the personal
information and demographics of every citizen, as well as their
Twitter handle. Twitter streams for Munich can be analyzed using
the coordinates 48.1333° N, 11.5667° E. This approach is in line
with [10].
4.1 Algorithm for gathering Data
Before starting this algorithm, create a database table with the
following fields:
Field                  Type
uid                    int
Twitter Handle         string
Visited                bool
Relationships          int array
Cluster or Community   string
The Twitter Handle column holds the handles of the users, and uid
is a unique user id. Visited is set to true once a user has been
visited, so that we do not run into an infinite recursive loop.
Relationships holds an array of the uids of the users whom this
user follows and who are within our dataset. Cluster or Community
is the group to which the user belongs, e.g. music, geek, politics,
sports etc.
Algorithm:
1. Select a seed user from the collection of handles.
2. Create a FIFO list that stores the handles followed
by the seed user that belong to our dataset. Determine
the corresponding uids of these handles and store them
as an array in the ‘Relationships’ column. Mark the
current seed as visited=true.
3. Go through the Relationships list of the seed user. If a
uid in that list has not been visited, go to step 2,
recursively using the handle of the unvisited uid as the
new seed.
4. Run through the table. If a uid has not been visited, set
its handle as the seed and go to step 2; otherwise stop.
Once this algorithm has completed its run, every uid in the
database will have a ‘Relationships’ array listing the uids it is
connected to.
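The steps above can be sketched in Python as follows. This is a hedged illustration rather than a definitive implementation: it replaces the recursion with an explicit FIFO queue, marks a user as visited when it is queued, and takes a hypothetical `fetch_following` callable in place of a real Twitter API call.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class UserRow:
    """One row of the table described above."""
    uid: int
    handle: str
    visited: bool = False
    relationships: list = field(default_factory=list)
    community: str = ""


def crawl(rows, fetch_following):
    """Fill `relationships` for every row; `fetch_following(handle)` is a
    hypothetical callable returning the handles that `handle` follows."""
    by_handle = {row.handle: row for row in rows}
    for start in rows:                      # step 4: pick any unvisited seed
        if start.visited:
            continue
        start.visited = True
        queue = deque([start])              # step 1: the seed user
        while queue:
            current = queue.popleft()
            followed = [
                by_handle[handle]
                for handle in fetch_following(current.handle)
                if handle in by_handle      # keep only users in our dataset
            ]
            current.relationships = [row.uid for row in followed]  # step 2
            for row in followed:            # step 3: visit unvisited users
                if not row.visited:
                    row.visited = True
                    queue.append(row)
    return rows


# Stand-in for the Twitter API: who follows whom in a toy dataset.
following = {
    "alice": ["bob", "someone_outside_munich"],
    "bob": ["alice", "carol"],
    "carol": [],
    "dave": ["alice"],
}
rows = [UserRow(1, "alice"), UserRow(2, "bob"),
        UserRow(3, "carol"), UserRow(4, "dave")]
crawl(rows, lambda handle: following.get(handle, []))

for row in rows:
    print(row.uid, row.handle, row.relationships)
```

Using a queue instead of recursion avoids hitting the recursion limit on long follow chains, while the `visited` flag still guarantees that each user is processed once.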
4.2 Laying out the Network Graph
As we are primarily interested in homophily and clustering, we
use a graph layout that can express communities of relationships,
namely a force-directed graph layout algorithm [11]. With this
approach, relationships act like springs and each user node repels
nearby nodes. The resulting graph has the following properties:
1. People with many relationships between them will be
arranged into tight clusters.
2. People with the fewest relationships between them will
appear at opposite edges of the graph.
3. People who have many relationships at both ends of the
graph will appear in the middle.
4. Clusters with few or no relationships between them will
appear very far apart on the graph.
The layout is based on the idea that among 10 people there can be a
total of 45 relationships, as given by the formula n*(n-1)/2. If
every person is related to every other person, the force-directed
graph will be a perfectly symmetric ball. Similarly, if these 10
people are split into 2 groups of 5 people each, the two groups
hate each other, and each member of a group has a relation with
every other member of the same group, then the final force-directed
graph will be 2 separate balls with no connection between them.
If we visualize the data from a city in this way, we will be able
to measure the separateness of communities.
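The two-group example can be reproduced with a very small force-directed simulation. The sketch below is a simplified layout written for illustration (repulsion between all pairs, attraction along relationships, and a step size that cools to zero). It is not the specific algorithm of [11], and all constants are assumptions.

```python
import math
from itertools import combinations


def force_layout(nodes, edges, iterations=200, k=1.0):
    """Return {node: (x, y)} after a simple spring/repulsion simulation."""
    # Deterministic start: nodes evenly spaced on a unit circle.
    pos = {
        node: [math.cos(2 * math.pi * i / len(nodes)),
               math.sin(2 * math.pi * i / len(nodes))]
        for i, node in enumerate(nodes)
    }
    for step in range(iterations):
        disp = {node: [0.0, 0.0] for node in nodes}
        # Every pair of nodes repels (force k^2 / distance).
        for a, b in combinations(nodes, 2):
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            push = k * k / dist
            disp[a][0] += dx / dist * push
            disp[a][1] += dy / dist * push
            disp[b][0] -= dx / dist * push
            disp[b][1] -= dy / dist * push
        # Every relationship pulls its two ends together (distance^2 / k).
        for a, b in edges:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            disp[a][0] -= dx / dist * pull
            disp[a][1] -= dy / dist * pull
            disp[b][0] += dx / dist * pull
            disp[b][1] += dy / dist * pull
        # Move each node, capped by a step size that cools towards zero.
        max_step = 0.1 * (1 - step / iterations)
        for node in nodes:
            dx, dy = disp[node]
            length = math.hypot(dx, dy) or 1e-9
            move = min(length, max_step)
            pos[node][0] += dx / length * move
            pos[node][1] += dy / length * move
    return {node: tuple(p) for node, p in pos.items()}


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def centroid(points):
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))


def mean_pair_distance(points):
    pairs = list(combinations(points, 2))
    return sum(distance(p, q) for p, q in pairs) / len(pairs)


# Two groups of 5: everyone is related within a group, nobody across groups.
group_a = ["a%d" % i for i in range(5)]
group_b = ["b%d" % i for i in range(5)]
edges = list(combinations(group_a, 2)) + list(combinations(group_b, 2))
layout = force_layout(group_a + group_b, edges)

points_a = [layout[n] for n in group_a]
points_b = [layout[n] for n in group_b]
print("relationships:", len(edges))
print("gap between groups:", distance(centroid(points_a), centroid(points_b)))
print("spread within group A:", mean_pair_distance(points_a))
```

Comparing the gap between the group centroids with the spread inside each group gives a simple numeric proxy for the "separateness" of two communities.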
4.3 Detecting Communities and Adding color
Communities can be detected from the number of shared
relationships or interests within a given subgroup. We can use the
Louvain community detection algorithm [12], which iteratively
determines communities of interest within a larger network and
assigns community membership to each user accordingly.
Finally, we can assign each community an arbitrary color. There may
be some user nodes that are affiliated with multiple communities.
These users can be assigned to the community with which they share
the most relationships. Another approach is to give them a blend of
the colors of all the communities they are affiliated with. This
can also reveal the boundaries between different communities. For
example, a person belonging to a group primarily concerned with
politics (blue) and a group primarily concerned with music (yellow)
may be represented by green.
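The Louvain method searches for a partition of the network with high modularity, a score that compares the relationships inside each community with what would be expected at random. The sketch below only computes that score for a given partition of an unweighted, undirected graph. It is not an implementation of the Louvain algorithm, and the function and variable names are assumptions.

```python
from itertools import combinations


def modularity(edges, community_of):
    """Modularity of a partition of an unweighted, undirected graph.

    `edges` is a list of (node, node) pairs without duplicates and
    `community_of` maps every node to its community label.
    """
    m = len(edges)
    internal = {}  # number of edges with both ends inside a community
    degree = {}    # sum of node degrees per community
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree[ca] = degree.get(ca, 0) + 1
        degree[cb] = degree.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )


# Two groups of 5 people: all related within a group, none across groups.
politics = ["p%d" % i for i in range(5)]
music = ["m%d" % i for i in range(5)]
edges = list(combinations(politics, 2)) + list(combinations(music, 2))

split = {n: "politics" for n in politics}
split.update({n: "music" for n in music})
lumped = {n: "everyone" for n in politics + music}

print("two communities:", modularity(edges, split))
print("one community:", modularity(edges, lumped))
```

Separating the two groups scores higher than lumping everyone together, which is why a modularity-based method would report them as two communities.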
Finally, we can plot these users by location on a map of Munich.
Since we already have the geolocation of each user, we can simply
plot the users on the map in their community colors. The
force-directed graph and the geographical social map are shown in
the following figures. Each dot in both maps represents a person,
and the color of the dot represents the community. The geographical
map is for representational purposes only.
4.4 Determining Community Interests
Each user node can first be sized according to the number of
relationships it shares with other nearby nodes: the more
relationships, the larger the node.
We can then determine the overall interests of a community by
manually inspecting each node, starting with the largest nodes in
the community. Typically, people organize themselves into groups
such as sports, music, media, movies, politics, finance, news,
arts, literature, engineering, cultures etc.
Finally, we can start to monitor the traffic of each community
using:
1. Hashtags
2. Shared links
3. Languages used
4. Operating systems in use (desktop, mobile, Android,
iOS, Windows etc.)
5. Geographic coordinates
6. Age
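As an illustration of the first signal in this list, hashtags can be counted per community once each handle has been assigned a community. The tweet format (handle, text) and the sample data below are assumptions made for this sketch.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def hashtags_by_community(tweets, community_of):
    """Count hashtags per community; `tweets` is a list of (handle, text)."""
    counts = defaultdict(Counter)
    for handle, text in tweets:
        community = community_of.get(handle)
        if community is None:
            continue  # skip users that are not part of the dataset
        counts[community].update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


community_of = {"alice": "sports", "bob": "sports", "carol": "music"}
tweets = [
    ("alice", "Great match tonight #FCBayern #football"),
    ("bob", "#fcbayern wins again"),
    ("carol", "New album out now #rock"),
    ("stranger", "#spam"),
]

traffic = hashtags_by_community(tweets, community_of)
for community, counter in traffic.items():
    print(community, counter.most_common(3))
```

The other signals in the list (shared links, languages, coordinates) could be aggregated per community in the same way.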
5. CONCLUSION
Interactive and informative Social Maps can be generated using
both the Social Web User Model and Twitter data. These Social Maps
can effectively represent the clusters of communities living in
different areas of a geographical region. These maps can be a boon
for managing various activities of the city.
6. REFERENCES
[1] Heckmann, D., Schwarzkopf, E., Mori, J., Dengler, D.,
Kröner, A.: The user model and context ontology GUMO
revisited for future Web 2.0 extensions. In: Proceedings
of the Int. Workshop on Contexts and Ontologies:
Representation and Reasoning. CEUR Workshop
Proceedings, vol. 298. CEUR-WS.org (2007).
[2] Till Plumbaum, Songxuan Wu, Ernesto William, Sahin
Albayrak: User Modeling for the Social Semantic Web.
[3] Dey, A.K.: Understanding and using context. Personal
and Ubiquitous Computing 5, 4–7 (2001)
[4] Said, A., Berkovsky, S., De Luca, E.W.: Putting things
in context: Challenge on context-aware movie
recommendation. In: Proceedings of the Workshop on
Context-Aware Movie Recommendation. pp. 2–6.
CAMRa ’10, ACM, New York, NY, USA (2010)
[5] Bernardo Magnini and Carlo Strapparava: Using
WordNet to Improve User Modelling in a Web
Document Recommender System. In: ITC-irst, Istituto
per la Ricerca Scientifica e Tecnologica, I-38050 Trento,
ITALY.
http://multiwordnet.fbk.eu/paper/WordnetWumNAACL.pdf
[6] Zafarani, R., Abbasi, MA., Liu, H., Social Media
Mining: An Introduction, Cambridge University Press,
2014
[7] http://www.socialprogressimperative.org/system/resources/W1siZiIsIjIwMTQvMDUvMjYvMTYvMzcvMDAvMjUzL1NvY2lhbF9Qcm9ncmVzc19JbmRleF8yMDE0X0V4ZWN1dGl2ZV9TdW1tYXJ5LnBkZiJdXQ/Social%20Progress%20Index%202014%20Executive%20Summary.pdf
[8] http://www.socialprogressimperative.org/data/spi#data_table/countries/spi/
[9] http://www.ted.com/talks/michael_green_what_the_social_progress_index_can_reveal_about_your_country/transcript?language=en#t-70266
[10] http://peoplemaps.org
[11] http://en.wikipedia.org/wiki/Force-directed_graph_drawing
[12] http://perso.uclouvain.be/vincent.blondel/research/louvain.html

SocialMapsForACity

  • 1. Social Maps for a City Taha Kachwala 4408225 TU Delft taha.kachwala21@gmail.com ABSTRACT There has been a large scale migration towards urban cities in many countries around the globe. Traditional methods of diversifying the citizens living in the city based on race, religion, age, nationality cannot work anymore. For effectively managing a city, the governments require to know what kind of communities of people live in their city, and in which areas. Governments can involve certain communities of people while taking specific decisions. This can be done if people living in a city can be segregated effectively on the basis of their interests, background, knowledge etc. This paper suggests two methods based on social media that can be used to effectively segregate people. Both the methods will eventually create Social Maps, that can visualize clusters of communities as well as how these communities are related to each other. These maps can help city as well as national governments to collectively improve the “Social Progress Index” of the nation. 1. INTRODUCTION In real world, every person can maintain roughly 150 real-world relationships. This number is called the “Dunbar Number”. Some people may have more and some may have less. However, in an online world, people may have many more relationships, perhaps a few thousand. The offline real-world relationships that people have will roughly be an overlapping subset of relationships they have online. The online relationships are more flexible as they can operate globally and at all times of the day. If we have this online relationship between people in a specific geographical area, then we can roughly estimate their real world relationships as well. The principle of ‘Homophily’, is a powerful tendency for people to organize themselves into groups of people who are similar to themselves, it doesn’t matter whether it is online or offline. 
So if we accept the notion that people do, in fact, have relationships that both shape and are shaped by their interactions, then it follows that there may be some ways to measure these relationships with some level of fidelity. Social network can help to offer some information regarding these relationships though with some biases. The relationship and interaction of people in these social networks is extensively used today for recommendation systems, to find people you may know, security reasons etc then why not for managing a city effectively? This paper suggests a methodology by which user generated data on twitter, Facebook, LinkedIn, Xing, and Soundcloud can be used to map users into certain communities based on their interests. Then the relationships between different communities are found and how much they are connected with each other e.g. ‘Tech geek’ community will be closely related to ‘Web Developer’ community. These communities can further be used to obtain relevant information from the data generated by these users. People of the city can be diversified based on their social construct unlike based on ‘Race’, ‘Religion’, ‘ethnicity’ etc which has proven to be a poor proxy to represent diversity. City governments/municipalities can consult specific or related groups or communities of people to give valuable inputs on certain decisions. By categorizing city users into different communities like politics, tech geeks, radio and newspapers, sports, travel, religious ideologies, web developers and coders, bloggers, activists, age groups, etc using user modelling on their social media platforms. Then start creating relations between these groups of people. If we achieve providing this data, the governments can learn more about the social construct of the people living in a city, what they like to do and what they can do about it. This will contribute to the development of diversity. 
This diversity can be used in a way to tackle some intractable problems of the society in a new way. This can be used to tackle many of the urban challenges regarding environment, transportation, buildings etc. 2. RELATED WORK 2.1 Social Progress Index GDP is usually used as a measure of development of a nation. GDP has defined and shaped our lives for the last 80 years. GDP was a concept that was introduced by Simon Kuznets in a report that he delivered called “national Income, 1929-1932’. But, in that first report, Kuznets himself delivered a warning which said ‘welfare of a nation can, therefore, scarcely be inferred from a measurement of national income’. It clearly states that GDP is a tool to help us measure economic performance, but it’s not a measure of our well-being. Social Progress Index (SPI) is a new tool which helps measure the social progress of people living in a city or country. It provides a rich framework for measuring the multiple dimensions of social progress, benchmarking success, and catalyzing greater human well-being. Social Progress Index is defined as The capacity of a society to meet the basic human needs of its citizens, establish the building blocks that allow citizens and communities to enhance and sustain the quality of their lives, and create the conditions for all individuals to reach their full potential.
  • 2. Figure 1 (Source-[7]) below gives a detail about all the attributes that define ‘Social Progress Index’. You can obtain more information about Social Progress Index from [8] and watch a TED talk about it on [9]. To summarize, Table 1 gives a list of top 20 countries measured with SPI and their corresponding GDP’s. Table 1 RA NK COUNTRY SPI GDP RA NK COUNTRY SPI GDP 1 New Zealand 88.24 25,857 11 Austria 85.11 36,200 2 Switzerland 88.19 39,293 12 Germany 84.61 34,819 3 Iceland 88.07 33,880 13 United Kingdom 84.56 32,671 4 Netherlands 87.37 36,438 14 Japan 84.21 31,425 5 Norway 87.12 47,547 15 Ireland 84.05 36,723 6 Sweden 87.08 34,945 16 United States 82.77 45,336 7 Canada 86.95 35,936 17 Belgium 82.63 32,639 8 Finland 86.91 31,610 18 Slovenia 81.65 24,483 9 Denmark 86.55 32,363 19 Estonia 81.28 18,927 10 Austrailia 86.10 35,669 20 France 81.11 29,819 As we can see, United States stands 16th according to SPI, though its GDP is the highest. On the contrary, New Zealand has a GDP which is far lower than US, but its ranked #1 according to SPI. This means that people in New Zealand are much happier than those living in US. One of the main goals of this paper is to help governments increase their SPI by giving them a tool to make their citizens happier, instead of bragging about the growth in GDP. User information has been diversified across a lot of platforms. Now with the help of Social Web and Web 2.0, we can try to merge this user information from different platforms, into one. For this paper, we will try to collaborate. We can use 2-different types of models for such cross-system collaborative approaches. ● A centralized approach with standardized models that can aggregate the distributed user information over different platforms. ● A decentralized approach where dedicated software components transfer user information from one application’s representation into another. In this paper, we will rely on the former model of centralized approach. 
Within this centralized approach, two main submodels exist. The first submodel relies on use of standardized user models which involved applications must agree on. This involves using generalized ontologies like General User Modeling Ontology (GUMO) [1] or Friend-of-a-Friend (FOAF). For this paper we will rely on the second submodel to build meta-models that allow defining how application-dependent user data corresponds to user data from another application. The advantage with this application is that the application need not be using the same generic user model as in the first case. The ontology also allows defining relationships between the data and can be aggregated. So it is possible to get a set of user-interests from Facebook and merge it with music related interests from Soundcloud. Though the music related interest will be a subset of user-interest but will be more detailed and specific [2]. An assumption regarding this research paper is based on the fact that the system generated will be used by city governments and municipalities. Therefore, the governments can obtain required set of permissions from their citizens in applications like facebook, twitter, LinkedIn, Xing and Soundcloud. These sets of permissions are such that it does not violate the user’s privacy. 3. SOCIAL MAPS USING SWUM For building a successful Social Web User Model (SWUM) based on various platforms, we first need to analyze what kind of data we can capture from different social platforms without invading the privacy of the citizens. Due to the extensive use of OAuth protocol, many successful web platforms are ready to provide their own authorization and basic profile information for external applications. Facebook, twitter, LinkedIn etc are also ready to provide more data about their users through their API’s to external applications. Table 2 lists the relevant information that we can obtain from these various social platforms. 
Table 2 Platform Required User Permissions facebook id,name,gender,locale,user friends(only the friends living in the same city), email, user_actions(books,fitness,music,news), user_activities, user_interests, user_location, user_education_history
  • 3. Twitter Read tweets from timeline, who you follow LinkedIn Basic Profile Fields, language fields, skills fields, certification fields, Education fields, Position fields Xing Basic Profile Fields, professional_experience, active_email SoundCloud music interests The above websites are chosen because they provide relevant and useful information that can collectively be used to recreate a perfect user model for every citizen. Table 3 provides the semantics that we can get from each of the platforms. To be able to create a social web user model, we need to analyze which type of information and which user model dimensions should be a part of the model and which attributes in these dimensions should be supported. Table 3 Platform Semantics obtained facebook 1. Generic information about the user 2. User’s connections within the city(This only provides the list of users who access the same platform) 3. Daily activities of the user 4. Users interests 5. Behavioral Analysis Twitter Twitter can help us gain ample of information regarding the user. 1. Current activities of the user 2. Political viewpoints 3. Logical viewpoints 4. Behavioural analysis LinkedIn Basic Profile Fields, language fields, skills fields, certification fields, Education fields, Position fields Xing Basic Profile Fields, professional_experience, active_email SoundCloud music interests 3.1 User Model Dimensions After collecting the above mentioned information about our users, we need to model the dimensions of this information so that we can create our user model. The following is taxonomy of dimensions that we need to use. 1. Personal Characteristics and Demographics- Basic information like age, gender, name, address, location, contact information. We can collect this data from facebook, LinkedIn and Xing. 2. Interests- The type of hobbies and interests a user has e.g. news, politics, gaming, online shopping etc. Facebook and twitter can collectively give us an accurate set of users’ interests. 3. 
Mental and Physical well-being - Describes individual characteristics like physical limitations, health or mental states like stress, cognitive load. This can be derived from information received from facebook page likes, the people the user follows on twitter, job profile from LinkedIn and Xing, and music interests from soundcloud. 4. Knowledge- This describes how socially active a user is in certain fields, their educational status, skills etc. This type of information can be derived from the tweets a user makes, educational universities, their qualifications, people they follow on twitter. This type of information is dynamic and needs to be analyzed after certain periods because the knowledge level always changes from time to time. Knowledge regarding certain topics might increase or decrease overtime. This information can be derived from Social business applications like LinkedIn and Xing. 5. Individual Behaviour- This is certainly one of the most important characteristic that can define a user. This dimension has a direct impact on the previously specified dimensions and can also be used to infer information about the previous dimensions. Deriving User Behaviour is a complicated process as it is usually an implicit feature and is not available on a user’s social profile. We will discuss analysis of this dimension in more detail in section 3.6. 6. Context: In computer science this term generally refers to “any information that can be used to characterize the situation of an entity” [3]. In the area of user modelling, this term focuses on the user’s environment (location and time, devices the user uses). According to research, ‘context’ is a very important area as far as user modelling is concerned [4], but it has a very limited application for this research. 
This means that for an effective generation of user model for our application, it is important to cover the dimensions of Personal characteristic and Demographics, Interests, Mental and Physical well-being, Knowledge and Individual Behaviour [6].
  • 4. 3.2 User Model Attributes We have selected the required dimensions, so now we need to define the attributes of these dimensions that our user model supports. Table 4 shows an example of attributes in Personal Characteristic dimension. Table 4 3.3 WordNet Similar information about the same user is stored over many applications e.g. Facebook, twitter, LinkedIn all store basic information about the same user, but with different names. Facebook uses the term ‘username’ while twitter uses ‘handle’ to identify unique users. This problem of attribute name heterogeneity complicates a possible aggregation using Meta- Model strategy. To solve this problem, WordNet is used. WordNet defines word sense relations between words. You can dive into more details about WordNet from [5]. To summarize, if a word represents a user attribute, the relatedness between different attributes can be acquired through WordNet. As our previous assumption was that the user himself gives permissions on our application for his profiles in different applications, aggregation of personal characteristics is not a problem. But for aggregation of interests, and mental and physical well-being, a little help from WordNet would not hurt. 3.4 Use Case: Profile Aggregation The figure above shows an example of how a user profile from Facebook and LinkedIn can be aggregated together to form our Social Web User Model. For deriving our first dimension of Personal characteristics, we need to merge the data which we get from facebook and LinkedIn. We know that name, email, contact info, educational qualifications, current employment can be derived from both facebook as well as LinkedIn. But our dimension would be more concrete if we use professional data from LinkedIn, because on facebook many people are studying at ‘Hogwarts school of Wizardry’ and working at ‘Mah Lyf, Mah Rulez’. 
Even demographic information from LinkedIn sometimes should be preferred over facebook, but I find it more relevant to extract it from facebook because people update basic information on facebook more frequently than LinkedIn (As once they found a good job, they do not update their location, email etc). Demographics ● Gender: string ○ Male: bool ○ Female: bool ● Birthdate: Date ● Language: string ● Education: string ○ High- school: bool ○ Bachelor’s : bool ○ Master’s: bool ○ Phd: bool ● Employment: string ○ employed: bool Contact Information ● Name: string ● Mobile number: int ● e-mail: string ● Places lived: list of locations ● Current City: Location Location ● Country: string ● State: string ● City: string ● address: string
  • 5. For our second dimension regarding ‘Interests’, it is clear that facebook can provide more concrete details. We can obtain more information regarding user preferences from what the user likes, his interests that are already extracted by facebook. Similarly for ‘Knowledge’, we can extract data from LinkedIn regarding work experience, past jobs, skills etc. 3.5 Mapping Connections between different User Models Once we have the Social web User Model ready for all or most of the citizens living in the city, we can start analyzing connection and map them. This can be done in 2 steps. 1. Finding relations between different dimensions of the user model. 2. Finding relations between same dimensions of different users’ model. Let me explain both the steps using an example. Let’s assume that there are 10 male users, living in Rotterdam, in an age group of 20-30 yrs (Personal Info and demographics), they are all aerospace engineers (Knowledge), and are interested into sports and rock music (Interests). Now we have a new male user A whose ‘Interests’ dimension is incomplete. ‘A’ lives in Rotterdam, is an aerospace engineer and is in the age group of 20-30 yrs, then there is a high probability that he might be interested in sports and rock music. For the 2nd step, let us assume that there are 3 groups of users A,B and C. Group A is interested into politics, sports and music, group B is interested in politics, sports and stock market and Group C is interested into stock market and Justin Bieber. From analyzing these dimensions, we can find that politics and sports are closely related to music and stock markets, while stock markets is somewhat related to Justin Bieber. By this we can infer that Group A and B are closely related, B and C are somewhat related and A and C are not related. This concludes that politics and sports are interests that are far away from having an interest in Justin Bieber. Finally we can map these relationships using a force directed graphs. 
This will create a Social Map. These social maps can be generated for each and every dimension of the Social Web User Model. These dimensions can be studied in detail, and how different attributes of these dimensions are related. These maps can be used to take decisions by the government on different scenarios. Governments can easily consult and gather opinions for specific situations from the communities which will be most affected and the other closely related communities. 3.6 Behavior Analysis The kind of democracy that is offered by social media and the internet, has resulted in users exhibiting different behaviours like sharing, posting, liking, commenting, tweeting, following and, advertising on a daily basis. By analyzing these user behaviours over social media, they can be categorized into individual and collective behaviours. Individual behaviour is exhibited by a single user, whereas collective behaviour is observed when a group of users behave together for e.g. users using the same hashtag on twitter. 3.6.1 Individual Behavior Analysis Individual Behaviour can be considered one of the following 1. User-User Behaviour: Observed between two users. For e.g. befriending or following another user 2. User-Entity Behaviour: Liking a post or posting a tweet on twitter. 3. User-Community Behaviour: Joining/Leaving groups on facebook or LinkedIn Irrespective of the type of behaviour, we can use computational methodology to analyze behaviour and find interesting patterns. To analyze individual behaviour, we can trace who the user follows on twitter overtime and try to understand the underlying reasons for such followings. A machine learning program can be implemented using randomization tests or causality testing techniques [7]. 3.6.2 Collective Behavior Analysis Collective behaviour analysis can be easily derived from analyzing individuals that exhibit a collective behaviour independently. 
It can be achieved by aggregating the result of individual behaviour analysis. You can read more about behavioural analysis from [8]. This behavioural data is massive, expansive and, indicative of user preferences, interests and opinions. These ‘opinions’ are something which is one of the most important aspect that a city government needs to know about their citizens. These opinions can vary collectively based on the communities of users on different issues. During a certain financial situation, opinions from group of economists can be of use, while during a political situation, opinions from the politically interested community would be more relevant over others. This can help the government in managing the welfare and well-being of their citizens and collectively increase the ‘Social Progress Index’ of the city. 3.7 Conclusion-Social Maps using Social Web User Model If we can secure the above information, from every citizen in the city, we can conveniently create a user model for every citizen of the city. This data can then be effectively used to map relationships between users. We can find clusters of communities living within the city, their job profiles, salaries received, and eventually target issues that really matter. However due to the NSA-revelations and the fear of secret government surveillance programs, many people will be reluctant to provide the required permissions. However the users need to be assured that they won't be targets to any such surveillance programs or censorship and it is for the sole purpose of maintaining a city. Regarding this issue, I did a short research survey to find out how much people are willing to provide their social data. The results are shown on the next page (source- http://www.pollican.com/result/7/What_information_are_you_wil ling_to_share_with_your_government). From these survey results, I found out that people are usually not willing to give away facebook data to governments. 
According to this survey, the only data they are willing to share is email. From LinkedIn and Xing, people are readily willing to share their skills, experience, and educational details. Most of them are ready to give away their Job Profile as well. Twitter has the most positive results as people are willingly ready to share their twitter streams as well as follow list. The only down side of twitter is that not many people have twitter accounts or tweet regularly. But
  • 6. however, this leads us to the next step of this paper of developing ‘Social Maps’ using twitter data. 4. SOCIAL MAPS USING TWITTER DATA In this section we will use twitter data to generate social maps of a particular city and propose an algorithm for doing it. For example purpose, we will be using Munich as a target city for analyzing. As this is for a city government, we will assume that we already have Personal Information and demographics, and every citizen’s twitter handle. Twitter streams for Munich can be analyzed using coordinates 48.1333° N, 11.5667° E. This approach is in connectivity with [10]. 4.1 Algorithm for gathering Data Before starting this algorithm, create a database table having the following fields uid (int) Twitter Handle (string) Visited (bool) Relationships (int Array) Cluster or Community (string) The twitter handle will have a set of handles of users and uid is user id which will be unique. Visited will be set to true if that user has been visited so that we do not run into an infinite recursive loop. Relationships will have an array of uid’s who the user follows and is within our dataset. Cluster or community will be the group in which the user belongs, e.g. music, geek, politics, sports etc. Algorithm: 1. Select a seed user from the collection of handles. 2. Create a FIFO list that will store the handles followed by the seed user which belongs in our dataset. Then determine the corresponding uid’s of these handles and store the array in ‘Relationships’ column. Mark the current seed as visited=true. 3. Start analyzing the Relationship list of the seed user. If a Uid in that list is not visited, then go to step 2 and run a recursive loop using the Handle of the unvisited uid as new seed. 4. Run through the table, if a uid is not visited, then set the handle as seed and go to step 2, else abort. 
Once this algorithm has completed its run, we will have a ‘Relationships’ array for every uid in the database, connecting it to one or more other uids.

4.2 Laying out the Network Graph
As we are primarily interested in homophily and clustering, we will use a graph layout that can express communities of relationships. We will use a force-directed graph layout algorithm [11]. With this approach, relationships act like springs and each user node repels nearby nodes. The resulting graph will have the following properties:
1. People with many relationships between them will be arranged into tight clusters.
2. People with the fewest relationships between them will appear at opposite edges of the graph.
3. People who have many relationships at both ends of the graph will appear in the middle.
4. Clusters with few or no relationships between them will appear very far apart on the graph.

The layout builds on the following observation. Among 10 people there can be at most 45 relationships, as given by the formula n(n-1)/2. If all of them exist, so that every person is related to every other person, the force-directed graph will be a perfectly symmetric ball. Similarly, if these 10 people are split into 2 groups of 5 people each, and the two groups hate each other, so that no relationship crosses between them, but every member of a group has a relationship with every other member of the same group, then the final force-directed graph will be 2 separate balls with no connection between them. If we visualize the data from a city in this way, we will be able to measure the separateness of its communities.
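The arithmetic in this example can be checked with a short Python sketch. It does not perform the force-directed layout itself; it only counts relationships with n(n-1)/2 and finds the groups that have no relationship between them, which are the groups the layout would draw as separate balls. The function names and the toy data of 10 people are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def max_relationships(n: int) -> int:
    """Number of distinct pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2


def clique(members: Iterable[int]) -> List[Edge]:
    """Relationships when every member is related to every other member."""
    return list(combinations(sorted(members), 2))


def connected_groups(nodes: Iterable[int],
                     edges: Iterable[Edge]) -> List[Set[int]]:
    """Split the nodes into groups with no relationship between groups."""
    neighbours: Dict[int, Set[int]] = {node: set() for node in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    groups: List[Set[int]] = []
    seen: Set[int] = set()
    for start in neighbours:
        if start in seen:
            continue
        group: Set[int] = set()
        stack = [start]
        while stack:                       # depth-first walk of one group
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


people = range(10)
everyone = clique(people)                          # one fully connected group
split = clique(range(5)) + clique(range(5, 10))    # two groups, no cross links

print(max_relationships(10))                                   # 45
print(len(everyone), len(connected_groups(people, everyone)))  # 45 1
print(len(split), len(connected_groups(people, split)))        # 20 2
```

In the split case only 20 of the 45 possible relationships exist, and the two groups share none of them, which is why a force-directed layout has nothing holding the two balls together.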
4.3 Detecting Communities and Adding color
Communities can be detected from the number of shared relationships or interests within a given subgroup. We can use the Louvain community detection algorithm [12], which iteratively determines communities of interest within a larger network and assigns community membership to each user accordingly. Finally, we can assign each community an arbitrary color. There may be some user nodes that are affiliated with multiple communities. These users can be assigned to the community with which they share the most relationships. Another approach is to give them a blend of the colors of all the communities they are affiliated with. This can also produce a visible boundary between different communities. For example, a person belonging to a group primarily concerned with politics (blue) and a group primarily concerned with music (yellow) may be represented by green. Finally, we can plot these users on the map of Munich. Since we already have the geolocation of each user, we can simply plot each user in their community color at their location on the map. The representation of the force-directed graph and the geographical social map is shown in the following figures. Each dot in both of these maps represents a person, and the color of the dot represents the community. The geographical map is just for representational purposes.

4.4 Determining Community Interests
Each user node can first be given a size that depends on the number of relationships it shares with other nearby nodes: the more relationships, the larger the node. We can then determine the overall interests of a community by manually inspecting its nodes, starting with the largest nodes in the community. Typically, people will organize themselves into groups like sports, music, media, movies, politics, finance, news, arts, literature, engineering, cultures etc. Finally, we can start to monitor traffic for each community using:
1. Hashtags
2. Shared links
3. Languages used
4. Operating systems in use (desktop, mobile, Android, iOS, Windows etc.)
5. Geographic coordinates
6. Age

5. CONCLUSION
Interactive and informative Social Maps can be generated using both the social web User Model and Twitter data. These Social Maps can effectively represent the clusters of communities living in different areas of a geographical region. These maps can be a boon for managing various activities for the city.

6. REFERENCES
[1] Heckmann, D., Schwarzkopf, E., Mori, J., Dengler, D., Kröner, A.: The user model and context ontology GUMO revisited for future Web 2.0 extensions. In: Proceedings of the Int. Workshop on Contexts and Ontologies: Representation and Reasoning. CEUR Workshop Proceedings, vol. 298. CEUR-WS.org (2007)
[2] Plumbaum, T., Wu, S., De Luca, E.W., Albayrak, S.: User Modeling for the Social Semantic Web.
[3] Dey, A.K.: Understanding and using context. Personal and Ubiquitous Computing 5, 4–7 (2001)
[4] Said, A., Berkovsky, S., De Luca, E.W.: Putting things in context: Challenge on context-aware movie recommendation. In: Proceedings of the Workshop on Context-Aware Movie Recommendation. pp. 2–6. CAMRa ’10, ACM, New York, NY, USA (2010)
[5] Magnini, B., Strapparava, C.: Using WordNet to Improve User Modelling in a Web Document Recommender System. ITC-irst, Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Trento, Italy. http://multiwordnet.fbk.eu/paper/WordnetWumNAACL.pdf
[6] Zafarani, R., Abbasi, M.A., Liu, H.: Social Media Mining: An Introduction. Cambridge University Press (2014)
[7] http://www.socialprogressimperative.org/system/resources/W1siZiIsIjIwMTQvMDUvMjYvMTYvMzcvMDAvMjUzL1NvY2lhbF9Qcm9ncmVzc19JbmRleF8yMDE0X0V4ZWN1dGl2ZV9TdW1tYXJ5LnBkZiJdXQ/Social%20Progress%20Index%202014%20Executive%20Summary.pdf
[8] http://www.socialprogressimperative.org/data/spi#data_table/countries/spi/
[9] http://www.ted.com/talks/michael_green_what_the_social_progress_index_can_reveal_about_your_country/transcript?language=en#t-70266
[10] http://peoplemaps.org
[11] http://en.wikipedia.org/wiki/Force-directed_graph_drawing
[12] http://perso.uclouvain.be/vincent.blondel/research/louvain.html