Kernix
digital & data
9th Meetup RecSysFr
Recommendation systems at
Kernix
Stanislas Morbieu
PhD candidate in Machine Learning
Cristian Perez
Data Scientist
A few words about us...
Kernix
Kernix guides and accelerates companies' data science projects, from PoC to IT integration,
especially risk prediction in the banking and insurance sector, for Natixis, Euler
Hermes or Allianz.
2 founders
50 employees
600 completed projects
€3.8 million in turnover
10 published books
17 years of experience
Data Lab
Digital Factory
Data Lab
Risk prediction:
∙ Allianz UK
∙ Euler Hermes
∙ Natixis
Marketing Automation:
∙ Radium One
∙ Performics
Predictive maintenance:
∙ SNCF
∙ Enedis
Fraud detection:
∙ Colas
Data Viz:
∙ SolarImpulse
∙ COP21
Ads recommendation
Difficulties
● Poorly written texts (typos, grammar mistakes, abbreviations)
● High volume of continuously incoming ads
● High renewal rate
● Cold start
● Location and price considerations
Strategies
● Recommendation based on semantic categories and user behavior
● Scalable semantic content-based recommendation
● Ongoing research project
Recommendation based on semantic categories and user behavior
“Dolls” category
“Motorbikes” category
Scalable semantic content-based recommendation
Recommendation Kernix
Process steps:
● Formatting (lemmatization, tokenization, ...) of the description and the title
● Computing a semantic vector for a classified ad
● Computing distances to the most probable neighbors
● Saving relationships in the graph database
● On demand, searching for the classified ads most similar to a given one
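The process steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Kernix implementation: the hand-written `WORD_VECTORS` table stands in for a trained word-embedding model, the exhaustive neighbor search stands in for an approximate nearest-neighbor index, and a plain dictionary stands in for the graph database. All names and sample ads are hypothetical.

```python
import math
import re

# Hypothetical toy embeddings standing in for a trained word-embedding model.
WORD_VECTORS = {
    "doll": [0.9, 0.1, 0.0],
    "toy": [0.8, 0.2, 0.1],
    "barbie": [0.85, 0.15, 0.05],
    "motorbike": [0.0, 0.9, 0.3],
    "helmet": [0.1, 0.8, 0.4],
    "engine": [0.05, 0.85, 0.35],
}


def tokenize(text):
    """Step 1: lowercase, split into words, and strip a plural "s" when the
    singular is a known word (a crude stand-in for lemmatization)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t[:-1] if t.endswith("s") and t[:-1] in WORD_VECTORS else t
            for t in tokens]


def ad_vector(title, description):
    """Step 2: semantic vector of an ad, here the mean of its known word vectors."""
    vectors = [WORD_VECTORS[t] for t in tokenize(title + " " + description)
               if t in WORD_VECTORS]
    if not vectors:
        return None  # no known word: the ad cannot be placed
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_distance(a, b):
    """Step 3: distance between two ad vectors (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def build_similarity_graph(ads, k=2):
    """Steps 3 and 4: keep the k nearest neighbors of each ad.
    The returned dict (ad id -> [(neighbor id, distance), ...]) plays the
    role of the graph database in this sketch."""
    vectors = {ad_id: ad_vector(*fields) for ad_id, fields in ads.items()}
    vectors = {ad_id: v for ad_id, v in vectors.items() if v is not None}
    graph = {}
    for ad_id, vec in vectors.items():
        neighbors = sorted(
            ((other, cosine_distance(vec, v))
             for other, v in vectors.items() if other != ad_id),
            key=lambda pair: pair[1],
        )
        graph[ad_id] = neighbors[:k]
    return graph


def most_similar(graph, ad_id, n=1):
    """Step 5: on demand, read the stored neighbors of a given ad."""
    return [neighbor for neighbor, _ in graph.get(ad_id, [])[:n]]


# Hypothetical classified ads: id -> (title, description).
ADS = {
    "a1": ("Barbie doll", "Toy dolls in good condition"),
    "a2": ("Motorbike helmet", "Helmet for motorbike, barely used"),
    "a3": ("Toys", "Doll and other toys"),
    "a4": ("Motorbike engine", "Engine for sale"),
}
GRAPH = build_similarity_graph(ADS, k=2)
```

Precomputing and storing the neighbor relationships keeps the on-demand lookup cheap, which matters when ads arrive continuously. The exhaustive comparison used here grows quadratically with the number of ads, so a large catalogue would need an approximate index instead.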
Technical Stack
Pipeline
Monitoring
Conclusions
● Each project is unique:
○ Different requirements
○ Several strategies
● Tradeoffs and specific aspects to consider:
○ Text length (short or long text)
○ Whether to take user behavior into account
○ Scalability (high renewal rate)
○ Location and price considerations
○ Types of content (pictures, text, songs…)
● Other recommendation engines:
○ On-demand video platform
○ Workshop recommendation
○ Content recommendation
○ Scientific publication recommendation for a pharmaceutical company
Agence Kernix
6 rue Lalande 75014 Paris
+33 (0)1 53 98 73 40
info@kernix.com
www.kernix.com
Thank you for your attention
Stanislas Morbieu
PhD in Machine Learning
smorbieu@kernix.com
Cristian Perez
Data scientist
cperez@kernix.com

Building a recommender system with Annoy and Word2Vec, by Cristian PEREZ, Kernix, and Stanislas MORBIEU, PhD candidate in Computer Science at Paris Descartes University

PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 

Building a recommender system with Annoy and Word2Vec by Cristian PEREZ, Kernix, and Stanislas MORBIEU, PhD candidate in Computer Science at Paris Descartes University

  • 1. Kernix digital & data. 9th Meetup RecSysFr: Recommendation systems at Kernix. Stanislas Morbieu, PhD candidate in Machine Learning; Cristian Perez, Data Scientist
  • 2. A few words about us...
  • 3. Kernix Data Lab. Kernix guides and accelerates companies from PoC to IT integration in data science projects, especially in risk prediction in the bank/insurance sector, for Natixis, Euler Hermes or Allianz. 2 founders, 50 employees, 600 completed projects, €3.8 million in turnover, 10 published books, 17 years of experience. Data Lab, Digital Factory
  • 4. Data Lab. Risk prediction: Allianz UK, Euler Hermes, Natixis. Marketing automation: Radium One, Performics. Predictive maintenance: SNCF, Enedis. Fraud detection: Colas. Data viz: SolarImpulse, Cop 21
  • 6. Difficulties ● Texts are not well written (typos, grammar mistakes, abbreviations) ● High volume and continuously incoming ● High renewal rate ● Cold start ● Location and price considerations
  • 7. Strategies ● Recommendation based on semantic categories and user behavior ● Scalable semantic content based recommendation ● Ongoing research project
  • 8. Recommendation based on semantic categories and user behavior
  • 9-13. Recommendation based on semantic categories and user behavior: “Dolls” category, “motorbikes” category
  • 14. Scalable semantic content based recommendation
  • 15. Recommendation Kernix. Process steps: ● Formatting (lemmatization, tokenization, ...) of the description and the title ● Computing a semantic vector for a classified ad ● Computing distances to the most probable neighbors ● Saving relationships in the graph database ● On demand, searching for the classified ads most similar to a given one
  • 20. Conclusions ● Each project is unique: ○ Different requirements ○ Several strategies ● Tradeoffs and specific aspects to consider: ○ Text length (short or long text) ○ Taking user behavior into account ○ Scalability (high renewal rate) ○ Location and price considerations ○ Types of content (pictures, text, songs…) ● Other recommendation engines: ○ On demand video platform ○ Workshop recommendation ○ Content recommendation ○ Scientific publications recommendation for a pharmaceutical company
  • 21. Agence Kernix, 6 rue Lalande, 75014 Paris, +33 (0)1 53 98 73 40, info@kernix.com, www.kernix.com. Thank you for your attention. Stanislas Morbieu, PhD in Machine Learning, smorbieu@kernix.com. Cristian Perez, Data Scientist, cperez@kernix.com
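
The process steps listed for slide 15 can be illustrated with a small sketch. It is not the pipeline presented in the talk. The talk title mentions Word2Vec and Annoy, but to stay self-contained this sketch swaps in stand-ins that use only the Python standard library: a hypothetical `TOY_VECTORS` table replaces a trained Word2Vec model, crude plural stripping replaces lemmatization, an exhaustive cosine comparison replaces the approximate neighbor search, and a plain dictionary replaces the graph database. All names and sample ads below are assumptions made for illustration.

```python
import math
import re

# Hypothetical toy word vectors standing in for a trained Word2Vec model.
TOY_VECTORS = {
    "doll": [1.0, 0.0, 0.0],
    "toy": [0.9, 0.1, 0.0],
    "motorbike": [0.0, 1.0, 0.0],
    "helmet": [0.1, 0.9, 0.0],
    "vintage": [0.0, 0.0, 1.0],
}


def tokenize(text):
    """Step 1: formatting. Lowercase, split on non-letters, strip a plural 's'
    when the singular is a known word (a crude stand-in for lemmatization)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t[:-1] if t.endswith("s") and t[:-1] in TOY_VECTORS else t
            for t in tokens]


def ad_vector(title, description):
    """Step 2: semantic vector of an ad, here the mean of its known word vectors."""
    known = [TOY_VECTORS[t] for t in tokenize(title + " " + description)
             if t in TOY_VECTORS]
    if not known:
        return None  # no known word: the ad cannot be placed in the space
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_neighbor_graph(ads, k=2):
    """Steps 3 and 4: compare each ad with all others (exhaustive search, not
    an approximate index) and store its k nearest neighbors in a dict that
    stands in for the graph database."""
    vectors = {ad_id: ad_vector(title, desc) for ad_id, (title, desc) in ads.items()}
    vectors = {ad_id: vec for ad_id, vec in vectors.items() if vec is not None}
    graph = {}
    for ad_id, vec in vectors.items():
        scored = [(cosine(vec, other_vec), other_id)
                  for other_id, other_vec in vectors.items() if other_id != ad_id]
        scored.sort(reverse=True)
        graph[ad_id] = [other_id for _, other_id in scored[:k]]
    return graph


def most_similar(graph, ad_id):
    """Step 5: on demand, look up the stored neighbors of a given ad."""
    return graph.get(ad_id, [])


# Made-up sample ads: id -> (title, description).
ADS = {
    "a1": ("Vintage dolls", "Two dolls, good condition"),
    "a2": ("Toy doll", "Small toy for kids"),
    "a3": ("Motorbike helmet", "Helmet, barely used"),
    "a4": ("Motorbikes", "Vintage motorbike for sale"),
}
GRAPH = build_neighbor_graph(ADS, k=1)
```

Precomputing the neighbor lists keeps the on-demand lookup cheap, which matters when ads arrive continuously and are renewed often. The exhaustive comparison used here grows quadratically with the number of ads, which is why an approximate nearest-neighbor index is the usual choice at scale.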