2. http://Learning-Layers-eu
Many thanks to
Emanuel Lacic
elacic@know-center.at
Graz University of Technology
Austria
2
Denis Parra
dparra@ing.puc.cl
Pontificia Universidad
Catolica
Chile
Christoph Trattner
ctrattner@know-center.at
Know-Center Graz
Austria
3. http://Learning-Layers-eu
What will this talk be about?
⢠(Real-time) product recommendations for
online marketplaces
⢠Scalability of recommender systems
⢠Utilizing social network data for the
recommendations of products to people
3
4. http://Learning-Layers-eu
How did this work start?
⢠Joint project with the Austrian start-up Blanc-Noir
⢠Personalized product recommender for online
marketplaces based on
â Actions in the marketplaces (e.g., Ebay, Amazon)
â Product information
â Social network data (e.g., Facebook, G+)
â Filter criteria
⢠Provided at (near) real-time!
⌠especially if there is a lot of data
⌠together with many data updates
4
6. http://Learning-Layers-eu
Whatâs available out there?
⢠Frameworks/approaches for scalable
recommendations
â Distributed data processing
⢠Apache Hadoop / Mahout (map/reduce paradigm)
â Relational databases
⢠MySQL, PostgreSQL (e.g., RecDB project)
â Collaborative Filtering improvements
⢠Matrix factorization
⢠Lack of a framework / approach that combines all
things we need
6
8. http://Learning-Layers-eu
Why Solr?
⢠âHigh-performance, full-featured text search engine libraryâ
⌠but more precise âŚ
⢠âHigh-performance, fully-featured token matching and scoring
libraryâ [Grainger, 2012]
⌠which provides âŚ.
â full-text searches (content-based)
â powerful queries (e.g., MoreLikeThis or Facets)
â (near) real-time data updates (no pre/re-calculations)
â easy schema updates (social data integration)
⢠Established open-source software (Apache license) with big
community
8
11. http://Learning-Layers-eu
Whatâs about the marketplace
and social data features?
11
0
0.05
0.1
0.15
0.2
Purchases Categories Title Description Interests Groups Likes Comments Interactions
nDCG
Data Features
0
0.1
0.2
0.3
0.4
0.5
Purchases Categories Title Description Interests Groups Likes Comments Interactions
UserCoverage
Data Features
⢠Both types of data are important for the recommender quality and
user coverage
12. http://Learning-Layers-eu
Whatâs about the hybrids?
12
0
0.01
0.02
0.03
0.04
0.05
MP CCFm CFs ALL
nDCG
Recommendation Algorithms
0
0.2
0.4
0.6
0.8
1
MP CCFm CFs ALL
UserCoverage
Data Features
⢠The hybrid approach provides a good trade-off of recommender
quality and user coverage
14. http://Learning-Layers-eu
What we have shown!
⢠Apache Solr is more than a search engine!
⢠Actually it is a great framework to implement a scalable
recommender engine for online marketplaces
⢠Near real-time recommendations through build-in query-functions
⢠Near real-time data updates
⢠Easy integration of social data
+ a high-performance full-text search engine for free!
⢠Evaluation on dataset gathered from SecondLife
⢠Different marketplace and social data features are important
⢠Hybrid approaches produce more robust recommendations
⢠It scales!
14
15. http://Learning-Layers-eu
What do we want to do in the
future?
15
⢠Online study together with BlancNoir with ârealâ data
⢠Impact of geo-spatial data
⢠Impact of temporal data (see WebScience track)
⢠Comparative study with other backend solutions (e.g.,
ElasticSearch)
16. http://Learning-Layers-eu
Thank you for your attention!
Code and framework:
https://github.com/learning-layers/SocRec
Questions?
Dominik Kowald
dkowald@know-center.at
Know-Center
Graz University of Technology (Austria)
16
21. http://Learning-Layers-eu
Recommendation Algorithms
implemented in the Engine
⢠MostPopular (MP)
â Recommends for any user the most purchased items
⢠Collaborative Filtering (CF)
â Find similar users (k nearest neighbors) and recommend novel items of
those users [Schafer et al., 2007]
â In Solr: select queries and facet counts
⢠Content-Based (C)
â Analyze item meta-data to find similar items [Pazzani et al., 2007]
â In Solr: MoreLikeThis function
⢠Hybrid (CCF)
â Combine different algorithms to overcome their individual limitations
[Burke et al., 2002]
â Each algorithm can be weighted / tuned according to its performance
21