“Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diversity Analytics”
Elian CARSENAT, NamSor, 2016-01-28
Founder Bio
Elian CARSENAT, a computer scientist trained at ENSIIE/INRIA, started
his career at JP Morgan in Paris in 1997. He later worked as a
consultant and managed business & IT projects in London, Paris,
Moscow and Shanghai.
In 2012, Elian created NamSor, a piece of sociolinguistics software that
mines 'Big Data' to better understand international flows of
money, ideas and people. NamSor helps answer the perennial
question all countries ask about their diasporas: who are they,
where are they and what are they doing?
NamSor has been used to attract Foreign Direct Investment (FDI), to
build international collaboration within scientific communities, and to
attract and facilitate diaspora investment in start-ups, among other use cases.
http://fr.linkedin.com/in/eliancarsenat/en
NamSor sorts Names
• Names are meaningful: we use sociolinguistics to extract their semantics and deliver actionable intelligence.
• Names reflect cultural identity.
• NamSor data mining software recognizes the linguistic or cultural origin of names in any alphabet / language, at a fine-grained level and with high accuracy.
Gender Gap in Financing
Gender Gap in Science
Diasporas in Science
(in collaboration with French INSERM)
Thomson Reuters Web of Science (6 countries, 250k scientists, 50k papers)
“Analysts uncovered amazing patterns in the way scientists’ names correlate with whom they publish, and who they cite in their papers - not just in case of a particular country, but globally. Tania Vichnevskaia of the French National Institute for Health (INSERM) presented the paper ‘Applying onomastics to scientometrics’ at IREG International symposium 2015 organised by University of Maribor and Shanghai Jiao Tong University. The paper was prepared jointly with NamSor, a private start-up company specialized in mapping international Diasporas.”
Source: WoS; Data Mining: INSERM with NamSor
Scholar names in some Canadian Universities
Chinese, Indian, Iranian, Moroccan, Italian names
Canadian Science Policy Conference - CSPC2015
USE CASE – BOSTON CITY GEODEMOGRAPHICS
US Census vs NamSor geo-demographics
• In July 2015, the US Government announced new rules that will require all cities and towns receiving federal housing funds to assess patterns of segregation.
• The NY Times has published interactive maps of Boston geo-demographics, which we can compare with the information inferred by NamSor.
US Census Race Map of Boston
http://www.nytimes.com/interactive/2015/07/08/us/census-race-map.html
Using Voters List
• US Census: 1 pixel = 40 inhabitants
• Voters List: 1 pixel = 1 voter
Source: Boston Voters List
Visualization : ESRI
Data Mining: NamSor+RapidMiner
Breaking down ‘White’ and ‘Asian’ into
Portuguese, Spanish, Italian, Indian, Pakistani, Chinese, ...
Source: Boston Voters List
Visualization : ESRI
Data Mining: NamSor+RapidMiner
Who LIVES in New York?
Who OWNS in Brooklyn, NY?
Inferring origin in NYC ACRIS (Real Estate OpenData)
> Brooklyn zip codes
> NamSor origins
Interesting ‘Little’ spots
• ZIP 11209: Irish
• ZIP 11219: Jewish
• ZIP 11233: African American
• ZIP 11228: Italian
• ZIP 11208: Hispanic
• ZIP 11214: Chinese
• ZIP 11235: Ukrainian/Russian
• ZIP 11416: Indian
• ZIP 11222: Polish
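A common way to surface such ‘Little’ spots, sketched here as an assumption rather than as the method behind the slide, is a location quotient: an origin's share within a ZIP divided by its share across all records. The records below are invented toy rows (not ACRIS data), and `location_quotients` and `most_overrepresented` are hypothetical names.

```python
from collections import Counter, defaultdict

# Hypothetical (zip_code, inferred_origin) pairs, one per property record.
# These rows are invented for illustration and are NOT ACRIS data.
records = [
    ("11228", "Italian"), ("11228", "Italian"), ("11228", "Chinese"),
    ("11214", "Chinese"), ("11214", "Chinese"), ("11214", "Italian"),
    ("11222", "Polish"), ("11222", "Polish"), ("11222", "Italian"),
]


def location_quotients(rows):
    """Return {zip: {origin: share in that ZIP / share over all rows}}."""
    overall = Counter(origin for _, origin in rows)
    total = len(rows)
    by_zip = defaultdict(Counter)
    for zip_code, origin in rows:
        by_zip[zip_code][origin] += 1
    result = {}
    for zip_code, origin_counts in by_zip.items():
        zip_total = sum(origin_counts.values())
        result[zip_code] = {
            origin: (n / zip_total) / (overall[origin] / total)
            for origin, n in origin_counts.items()
        }
    return result


def most_overrepresented(rows):
    """Return, per ZIP, the origin with the highest location quotient."""
    quotients = location_quotients(rows)
    return {z: max(q, key=q.get) for z, q in quotients.items()}


for zip_code, origin in most_overrepresented(records).items():
    print(zip_code, origin)
```

A quotient above 1 means the origin is more frequent in that ZIP than in the data set as a whole; with real data, a minimum record count per ZIP would be needed before reading anything into the ratio.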
USE CASE – ELECTIONS
A Decision Tree from FLORIDA Voters List (open data)
//TODO : based on FLORIDA
Segmenting ‘Asian’ voters would improve the model
Using NamSor Origin to infer : Indian, Vietnamese, Korean, Chinese, ...
Tree
ethno = (Chin: DEM {DEM=3311, REP=2636, IDP=48, INT=199, LPF=9, GRE=5, CPF=2, REF=2, AIP=0, PSL=0}
ethno = (Indi: DEM {DEM=12509, REP=4565, IDP=95, INT=432, LPF=32, GRE=10, CPF=0, REF=1, AIP=3, PSL=1}
ethno = (Indo: DEM {DEM=984, REP=718, IDP=9, INT=43, LPF=4, GRE=1, CPF=1, REF=0, AIP=0, PSL=0}
ethno = (Japa: DEM {DEM=488, REP=403, IDP=9, INT=34, LPF=2, GRE=1, CPF=1, REF=0, AIP=0, PSL=0}
ethno = (Kore: REP {DEM=1148, REP=1174, IDP=11, INT=75, LPF=3, GRE=0, CPF=0, REF=0, AIP=0, PSL=0}
ethno = (Mong: DEM {DEM=24, REP=22, IDP=0, INT=0, LPF=0, GRE=1, CPF=0, REF=0, AIP=0, PSL=0}
ethno = (Paki: DEM {DEM=4411, REP=843, IDP=25, INT=110, LPF=9, GRE=6, CPF=0, REF=0, AIP=0, PSL=0}
ethno = (Viet: REP {DEM=3798, REP=5780, IDP=65, INT=272, LPF=10, GRE=5, CPF=3, REF=3, AIP=2, PSL=0}
Voters with Pakistani and Vietnamese names lean in opposite directions: mostly DEM for the former, mostly REP for the latter.
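Each leaf label in the tree is the party with the largest count in that group. The sketch below recomputes those labels, together with the majority's share, from the counts shown above. It is a minimal illustration, not the model behind the slide; the names `PARTIES`, `COUNTS` and `leaf_summary` are hypothetical, and the origin labels are kept truncated exactly as they appear in the tree output.

```python
# Party codes in the order they appear in the tree output.
PARTIES = ("DEM", "REP", "IDP", "INT", "LPF", "GRE", "CPF", "REF", "AIP", "PSL")

# Counts per (truncated) origin label, copied from the tree output.
COUNTS = {
    "Chin": (3311, 2636, 48, 199, 9, 5, 2, 2, 0, 0),
    "Indi": (12509, 4565, 95, 432, 32, 10, 0, 1, 3, 1),
    "Indo": (984, 718, 9, 43, 4, 1, 1, 0, 0, 0),
    "Japa": (488, 403, 9, 34, 2, 1, 1, 0, 0, 0),
    "Kore": (1148, 1174, 11, 75, 3, 0, 0, 0, 0, 0),
    "Mong": (24, 22, 0, 0, 0, 1, 0, 0, 0, 0),
    "Paki": (4411, 843, 25, 110, 9, 6, 0, 0, 0, 0),
    "Viet": (3798, 5780, 65, 272, 10, 5, 3, 3, 2, 0),
}


def leaf_summary(label):
    """Return (majority party, its share of the group) for one origin label."""
    counts = COUNTS[label]
    total = sum(counts)
    top = max(range(len(PARTIES)), key=lambda i: counts[i])
    return PARTIES[top], counts[top] / total


for label in COUNTS:
    party, share = leaf_summary(label)
    print(f"{label}: {party} {share:.1%}")
```

On these counts the Pakistani-name group is about 82% DEM while the Vietnamese-name group is about 58% REP, which is why a single ‘Asian’ category hides most of the signal.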
USE CASE – TRAVEL INTELLIGENCE
“Incredible India” – 1.2 BN People
Indian onomastics by State/Union Territory
Names in LATIN, BENGALI, DEVANAGARI, GUJARATI, GURMUKHI, KANNADA, MALAYALAM, ORIYA, TAMIL, TELUGU, ARABIC
ASSAM: Karbi Anglong, within district
Inter-caste marriages?
Input: FirstName, LastName, "parent is" (husband or father), FirstParent, LastParent. Output: clusterId, clusterParentId.
(The sample names are in a non-Latin script and were garbled in extraction; only the cluster IDs and the relation are shown.)
clusterId     clusterParentId   parent is
L25354:253    L64958:2797       husband
L47490:1593   L64958:2797       father
L28582:1209   L47490:1593       husband
L23643:669    L35593:510        father
L23643:669    L35593:510        father
L47490:1593   L35593:510        father
L23643:669    L35593:510        husband
L35593:510    L47490:1593       father
L23643:669    L47490:1593       father
Count of records where the parent is the husband (pivot table; row and column labels are cluster IDs):
Row Labels    L47490:1593   L116370:3612  L54332:2031   L184096:2297  L35593:510    L168871:1819  L135664:4438  L51271:837
L23643:669    6931          84            5099          15            2069          28            791           1924
L151415:3559  18            212           11            6446          19            1217          55            6
L28582:1209   5132          68            3565          10            1494          17            592           1323
L116370:3612  66            10283         38            72            40            321           137           29
L9839:442     2491          60            1851          9             774           11            321           660
L168871:1819  7             263           6             361           8             2730          24            4
L23642:141    1198          8             822           2             375           4             156           332
L135664:4438  20            154           5             22            19            44            2212          3
L87032:1210   11            315           13            51            14            141           37            9
L87031:697    4             136           4             12            3             137           4             5
L63724:1422   17            83            10            34            34            28            96            6
L98994:891    31            161           46            21            19            59            21            5
Rows with blank cells (values in their original order; column positions are not recoverable from the extraction):
L25354:253: 1181 12 932 375 7 100 323
L90333:3644: 3 204 2 31 190 5
L184096:2297: 13 1735 3 84 11 1
L14495:131: 614 10 432 167 4 68 163
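One way to read the cross-tabulation above against the "Inter-caste marriages?" question is to measure how often both spouses fall in the same name cluster. The sketch below is an illustration, not NamSor's method. It assumes the rows are one spouse's cluster and the columns the husband's, uses only the three rows that list all eight cells and whose cluster also appears as a column, and the name `same_cluster_share` is hypothetical. Whether a name cluster corresponds to a caste or community is a question the slide itself leaves open.

```python
# Column labels of the cross-tabulation, in their original order.
COLUMNS = [
    "L47490:1593", "L116370:3612", "L54332:2031", "L184096:2297",
    "L35593:510", "L168871:1819", "L135664:4438", "L51271:837",
]

# Complete rows whose own cluster also appears among the columns.
ROWS = {
    "L116370:3612": [66, 10283, 38, 72, 40, 321, 137, 29],
    "L168871:1819": [7, 263, 6, 361, 8, 2730, 24, 4],
    "L135664:4438": [20, 154, 5, 22, 19, 44, 2212, 3],
}


def same_cluster_share(cluster):
    """Share of a row's listed records that fall in the same-cluster column."""
    counts = ROWS[cluster]
    same = counts[COLUMNS.index(cluster)]
    return same / sum(counts)


for cluster in ROWS:
    print(f"{cluster}: {same_cluster_share(cluster):.1%}")
```

The denominators cover only the eight columns shown on the slide, so these shares are upper bounds on the same-cluster rate over all clusters.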
ASSAM: Karbi Anglong district, names clustered
[Chart legend: L116370:3612, L23643:669, L151415:3559, L47490:1593, L28582:1209, L54332:2031, L184096:2297, L168871:1819, L9839:442, L135664:4438, L87032:1210, L90333:3644, L35593:510, L51271:837, L63724:1422, L154797:1168, L64959:1796, L23642:141, L87031:697, L6536:295, L98994:891, L25354:253, L64958:2797, L30570:2614, L90334:1189, L95839:287, L100510:366, L121390:783, Other]
Source: Voters List; Data Mining: NamSor
Applications to an Airline’s customer intelligence
A global airline: ‘For 93% of our customers, when NamSor recognizes an Indian name, the client has travelled to India in the past.’
Finer-grained segmentation using names brings insights into diaspora travel patterns (visiting family and friends in the home country) and into these travellers' specific needs.
Using NamSor API
(1) Get an API Key
(2) Get the NamSor RapidMiner Extension
Thank you!
Elian CARSENAT,
elian.carsenat@namsor.com
Phone : +33 6 52 77 99 07
http://www.namsor.com/
July 2013, Lithuanian Embassy in Paris
