Exploring Deep Space:
Learning Personalized Ranking in a Semantic Space
Jeroen Vuurens - Martha Larson - Arjen de Vries
1
https://arxiv.org/pdf/1608.00276v2
Star Wars IV
Terminator 2
The Matrix
f(x)
Semantic spaces
2
image: https://www.tensorflow.org/versions/r0.10/tutorials/word2vec/index.html
Consistent encoding of relations [2,3]
[2] T. Mikolov, and J. Dean. Distributed representations of words and phrases and their
compositionality. Advances in neural information processing systems, 2013.
[3] T. Mikolov, W.-T. Yih, and G. Zweig. Linguistic Regularities in Continuous Space Word
Representations. In Proceedings of HLT-NAACL, 2013.
[Figure labels: man - woman; country - capital]
Semantic spaces
3
Consistent encoding of relations [2,3]
Semantic spaces
4
Consistent encoding of relations
Similar for items? e.g. genre, suspense,
strong language, suitability for children
Semantic Spaces
5
SciFi
Action
Drama
Romance
Thriller
Fantasy
War
Crime
Semantic Spaces
6
Suspense
Realism
90’s
Spielberg
Harrison Ford
Ranking items
7
Star Wars IV #5
Terminator 2 #4
The Matrix #5
Star Wars V, VI
Men in Black
Jurassic Park
Back to the Future
Raiders of the Lost Ark
f(x)
Ranking items
8
f(x)
Hyperplane
+
-
Ranking items
9
Scary
+
-
Ranking items
10
Scary
likes
dislikes
indifferent
f(x)
f(x)
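A minimal sketch of ranking by signed distance to a hyperplane, as illustrated on the "Ranking items" slides. The 2-d coordinates, the item names m1 to m3 and the three weight vectors are made up for illustration; axis 0 is assumed to encode "scary". The hyperplane passes through the origin here, because an offset would shift every score by the same amount and leave the ranking unchanged.

```python
import math

def score(w, x):
    """Signed distance of item vector x to the hyperplane with normal w (through the origin)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return sum(wi * xi for wi, xi in zip(w, x)) / norm

def rank(w, items):
    """Item names sorted from highest to lowest signed distance."""
    return sorted(items, key=lambda name: score(w, items[name]), reverse=True)

# Hypothetical item vectors: axis 0 = "scary", axis 1 = some other concept.
items = {"m1": [2.0, 0.5], "m2": [0.0, 1.0], "m3": [-2.0, 0.0]}

likes_scary = [1.0, 0.0]      # hyperplane orthogonal to the scary direction
dislikes_scary = [-1.0, 0.0]  # same hyperplane, opposite orientation
indifferent = [0.0, 1.0]      # hyperplane parallel to the scary direction

print(rank(likes_scary, items))     # ['m1', 'm2', 'm3']
print(rank(dislikes_scary, items))  # ['m3', 'm2', 'm1']
print(rank(indifferent, items))     # ['m2', 'm1', 'm3']
```

For the indifferent user, moving an item along the scary axis does not change its score, which is the property the slide with the parallel hyperplane is meant to show.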
Implementation: learning vectors
11
[4] Q. V. Le and T. Mikolov. Distributed representations of sentences and documents.
In Proceedings of ICML, 2014.
ParagraphVector-DBOW [4],
movieId
userId_rating
Hierarchical Softmax
Implementation: learning vectors
12
ParagraphVector-DBOW[4]
Hierarchical Softmax
Star Wars
(user 3, rating 4) => user3_high
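A minimal sketch of the preprocessing behind the `user3_high` example, assuming ratings arrive as (user, movie, rating) triples. The function name `to_training_pairs` and the sample ratings are hypothetical; the high/low rule (rating at or above the user's own average) follows the speaker notes. The resulting (movie, token) pairs are what a PV-DBOW learner would consume, with the movie as the document and the token as a word; that training step is not shown.

```python
from collections import defaultdict

def to_training_pairs(ratings):
    """ratings: list of (user_id, movie_id, rating). Returns (movie_id, token) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # user -> [sum of ratings, count]
    for user, _, rating in ratings:
        totals[user][0] += rating
        totals[user][1] += 1
    mean = {user: s / n for user, (s, n) in totals.items()}

    pairs = []
    for user, movie, rating in ratings:
        label = "high" if rating >= mean[user] else "low"
        pairs.append((movie, f"user{user}_{label}"))
    return pairs

# Hypothetical sample data.
ratings = [
    (3, "Star Wars", 4),
    (3, "The Matrix", 2),
    (7, "Star Wars", 5),
    (7, "The Matrix", 5),
]
pairs = to_training_pairs(ratings)
print(pairs[0])  # ('Star Wars', 'user3_high')
```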
Implementation: learning vectors
13
PV-DBOW [4], content-based
movieId
wordInImdbReview
Hierarchical Softmax
Implementation: ranking items
14
[Figure: ranking network with labels: lower rated movie vector, higher rated movie vector, hyperplane coefficients, distance to hyperplane]
Implementation: ranking items
15
[Figure: ranking network with labels: lower rated movie vector, higher rated movie vector, hyperplane coefficients, update]
Implementation: ranking items
16
[Figure: ranking network with labels: lower rated movie vector, higher rated movie vector, hyperplane coefficients]
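A minimal sketch of the pairwise update described in the speaker notes for these slides: the lower rated vector goes in a, the higher rated vector in b, and only the hyperplane coefficients w are updated by adding g times b and subtracting g times a, scaled by a learning rate. The exact form of g is not preserved in this transcript; the sigmoid of the score difference used below is an assumption that matches the described behaviour (near 0 when the pair is ranked correctly, near 1 when reversed). The function names, learning rate and epoch count are illustrative only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def update(w, a, b, lr=0.1):
    """One pairwise step. a: lower rated item vector, b: higher rated item vector."""
    # Assumed form of g: close to 0 if b already outranks a, close to 1 if reversed.
    g = sigmoid(dot(w, a) - dot(w, b))
    new_w = [wi + lr * g * (bi - ai) for wi, ai, bi in zip(w, a, b)]
    return new_w, g

def fit(pairs, dim, epochs=50, lr=0.1):
    """Learn hyperplane coefficients from (lower, higher) vector pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for a, b in pairs:
            w, _ = update(w, a, b, lr)
    return w

# Hypothetical 2-d item vectors.
lower, higher = [1.0, 0.0], [0.0, 1.0]
w = fit([(lower, higher)], dim=2)
print(dot(w, higher) > dot(w, lower))  # True
```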
Evaluation
MovieLens 1M (6k users, 4k movies)
• Simulated online evaluation
• 96% train, 2% validate, 2% test
• Recall@10 for ratings >= 4
• compare against BPRMF, WRMF,
UserKNN
17
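A minimal sketch of the Recall@10 measure named on this slide, assuming relevance means a test rating of at least 4. How the per-user values are aggregated in the paper is not stated in this transcript, so only the single-user computation is shown; the function name and the sample data are hypothetical.

```python
def recall_at_k(ranked, relevant, k=10):
    """Fraction of the relevant items that appear in the top-k of the ranked list."""
    if not relevant:
        return 0.0
    top = set(ranked[:k])
    return sum(1 for item in relevant if item in top) / len(relevant)

# Hypothetical test ratings for one user and a hypothetical system ranking.
test_ratings = {"m1": 5, "m2": 3, "m3": 4, "m4": 4}
relevant = {movie for movie, rating in test_ratings.items() if rating >= 4}
ranked = ["m1", "m9", "m3", "m2", "m4"]

print(recall_at_k(ranked, relevant, k=10))  # 1.0
```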
Evaluation
System        Recall@10   sig. over*
Popularity    0.053
BPRMF¹        0.079       4
UserKNN²      0.087       4
WRMF³         0.089       4
DS-CF-500     0.144       1,2,3,4,5
DS-CF-1k      0.151       1,2,3,4,5
DS-CB-10k⁴
DS-VSM⁵
18
DS-CF:
• item vectors learned
from user-ratings
• marginally reduces
dimensionality
• sig. more effective
than other models
* all p < 0.001
Evaluation
System        Recall@10   sig. over*
Popularity    0.053
BPRMF¹        0.079       4
UserKNN²      0.087       4
WRMF³         0.089       4
DS-CF-500     0.144       1,2,3,4,5
DS-CF-1k      0.151       1,2,3,4,5
DS-CB-10k⁴    0.075
DS-VSM⁵
19
DS-CB:
• item vectors learned
from IMDB user
reviews
• requires high
dimensionality
• potentially useful for
novel items?
* all p < 0.001
DS-VSM:
• user-ratings used as
item vector
• ranking the items
according to a
hyperplane that
optimally ranks user’s
past ratings
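A minimal sketch of the DS-VSM item representation, in which every user is one dimension and an item's vector holds the ratings it received. Filling unrated positions with 0 is an assumption made here for illustration, as are the function name and the sample data.

```python
def vsm_item_vectors(ratings, users):
    """ratings: list of (user, item, rating). Returns item -> vector with one slot per user."""
    index = {user: i for i, user in enumerate(users)}
    vectors = {}
    for user, item, rating in ratings:
        vec = vectors.setdefault(item, [0.0] * len(users))  # assumption: unrated = 0
        vec[index[user]] = float(rating)
    return vectors

# Hypothetical sample data.
users = ["u1", "u2", "u3"]
ratings = [
    ("u1", "Star Wars IV", 5),
    ("u2", "Star Wars IV", 4),
    ("u3", "The Matrix", 5),
    ("u1", "The Matrix", 3),
]
vectors = vsm_item_vectors(ratings, users)
print(vectors["Star Wars IV"])  # [5.0, 4.0, 0.0]
```

These vectors can then be ranked with a learned hyperplane in the same way as the learned semantic vectors.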
Evaluation
System        Recall@10   sig. over*
Popularity    0.053
BPRMF¹        0.079       4
UserKNN²      0.087       4
WRMF³         0.089       4
DS-CF-500     0.144       1,2,3,4,5
DS-CF-1k      0.151       1,2,3,4,5
DS-CB-10k⁴    0.075
DS-VSM⁵       0.119       1,2,3,4
20
* all p < 0.001
Analysis of parameters
21
[Figure panels: Number of most recently rated items; Dimensionality of the semantic space]
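A minimal sketch of restricting each user's history to the N most recently rated items before fitting a hyperplane. It assumes every rating carries a timestamp; the tuple layout, the function name and the sample data are hypothetical.

```python
from collections import defaultdict

def most_recent(ratings, n):
    """ratings: iterable of (user, item, rating, timestamp). Keep each user's n newest ratings."""
    by_user = defaultdict(list)
    for user, item, rating, timestamp in ratings:
        by_user[user].append((timestamp, item, rating))

    recent = {}
    for user, rows in by_user.items():
        rows.sort(key=lambda row: row[0], reverse=True)  # newest first
        recent[user] = [(item, rating) for _, item, rating in rows[:n]]
    return recent

# Hypothetical sample data.
ratings = [
    ("u1", "a", 5, 10),
    ("u1", "b", 3, 30),
    ("u1", "c", 4, 20),
    ("u2", "a", 2, 5),
]
print(most_recent(ratings, 2))
# {'u1': [('b', 3), ('c', 4)], 'u2': [('a', 2)]}
```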
Conclusion
• Semantic item vectors encode
substitutability
• Rank items according to hyperplane,
tuned to a user’s most recent N ratings.
22
Conclusion
• Semantic item vectors encode
substitutability
• Rank items according to hyperplane,
tuned to a user’s most recent N ratings.
• Semantic space generalizes over the
similarities between items
23
Conclusion
• Semantic item vectors encode
substitutability
• Rank items according to hyperplane,
tuned to a user’s most recent N ratings.
• Semantic space generalizes preferences
• Proposed pairwise L2R architecture
allows the use of high-dimensional latent
vectors.
24
Questions?
[1] W. Lowe. Towards a theory of semantic space. In
Proceedings of CogSci, 2001.
[2] T. Mikolov, and J. Dean. Distributed representations of
words and phrases and their compositionality. Advances in
neural information processing systems, 2013.
[3] T. Mikolov, W.-T. Yih, and G. Zweig. Linguistic Regularities
in Continuous Space Word Representations. In Proceedings of
HLT-NAACL, 2013.
paper: https://arxiv.org/abs/1608.00276
Compositionality of semantic spaces
26

RSDL2016

Editor's notes

  1. In this work, we looked at the potential benefit of representing items in a high-dimensional semantic space. In such a space, the proximity between items reflects their substitutability, and we show how that can be used as the basis for recommending items to users.
  2. Studies of word embeddings have shown that when embeddings for words are learned from the contexts they appear in, not only are substitutes likely to be positioned in close proximity, but semantic similarities between words often end up being encoded in a consistent way, as can be seen for word pairs that differ in gender.
  3. And for tasks such as analogical reasoning, it has been shown that the composition of elementary semantic relations can be used to find missing words, which demonstrates the compositionality of semantic spaces.
  4. We expect that when a similar learning process is used to learn semantic vectors for items, concepts that are useful for describing the differences between groups of items are also consistently encoded. For example, in the movie domain some useful encoded concepts may bear similarities to movie genres, or to whether a movie is considered exciting, is scary, or contains strong language.
  5. Now, let us look at an example. In this normalized 2-dimensional semantic space, we positioned the 20 most popular movies in MovieLens according to their substitutability, which we inferred from their user ratings. Within this distribution, some useful concepts such as movie genres can be identified,
  6. but also, very coarsely, we see that less suspenseful movies are positioned near the bottom and more suspenseful movies near the top, and that, for instance, only movies from the 90’s appear on the left-hand side. However, the low dimensionality in this example causes friction: some items cannot be positioned ideally. For recommendations, many concepts are potentially useful to describe the interest of groups of users in groups of movies. Consider for instance that some users may strongly favor a specific actor. Raising the dimensionality of the space would allow more semantic patterns to be encoded effectively and independently.
  7. Suppose that we have learned a semantic space that encodes the substitutability between movies through useful concepts. How can we use this representation to recommend items to a specific user? We propose to learn a function that optimally ranks the items in the collection based on the user’s past ratings, thereby finding a mixture of semantic concepts that describes the user’s interest and thus also ranks the unrated items by their expected preference.
  8. In this work, we use hyperplanes as a function to rank the items by their signed distance to the hyperplane.
  9. Why hyperplanes? Let’s say for instance there is some vector that encodes how scary a movie is. Within the user population, we will probably find users that like scary movies and users that dislike scary movies, but perhaps more importantly, we can also find users that are indifferent to whether a movie contains scary elements or not.
  10. Users that like or dislike scary movies can then have a hyperplane orthogonal to the direction in which scariness is encoded in the space, while a user that is indifferent to scary elements can have a hyperplane parallel to the scary vector. Most importantly, therefore, in this semantic space it does not matter if two movies that are equally preferred by the targeted user are separated by concepts that the user is indifferent to.
  11. So then we get to the implementation part. For learning item vectors, we use the distributed bag of words variant of the Paragraph Vector that was proposed by Le and Mikolov.
  12. In this architecture the input layer is a so-called 1-hot vector that contains a node for every item in the collection. When learning an item-user-rating triple, only the input node that corresponds to the item is set to 1 and all others are set to 0. The corresponding column in the lower weight matrix contains the item embedding and is then copied to the hidden layer. For the output layer, we preprocessed the data by first converting each rating to High if it is greater than or equal to the user’s average rating, and to Low otherwise. Then we converted each user-rating into a single compound word that consists of the user_id and whether the rating was labelled high or low. This vocabulary for the output layer is turned into a Huffman tree to learn a hierarchical softmax. We learn the ratings one at a time in random order.
  13. Interestingly, we can also use the exact same architecture to learn the substitutability between items based on text descriptions. For this experiment we used all IMDB user reviews for the movies in Movielens, and then we learn embeddings for a movie by predicting the words that appear in its reviews.
  14. Then for the second step, once we have learned the semantic space: to recommend movies to a specific user, we learn optimal hyperplane coefficients using a custom architecture that resembles pairwise learning to rank. We iteratively learn over pairs of movies that have received different ratings, and unrated items are considered to have a rating of 0. We insert the vector of the LOWER rated movie in A and the vector of the HIGHER rated movie in B. The weight matrix contains the hyperplane coefficients, therefore the hidden nodes will obtain the signed distances to the hyperplane.
  15. In this architecture, g obtains a value close to zero when the items are ranked correctly and a value of 1 when they are ranked in reverse order. Therefore g directly gives the gradient used for updating. During learning we only update the weight matrix, by simply adding the gradient times b and subtracting the gradient times a, scaled by a learning rate.
  16. The paper discusses three parameters that are used to control learning. The most interesting one, theta R, controls the number of most recently rated items by the user that are used. The rationale here is that if a user’s preference changes over time, the best recommendations are more like the most recently preferred items than items that the user preferred long ago. As we will see, limiting a user’s past preferences has a large impact on the effectiveness.
  17. For the evaluation, we used MovieLens 1M. Since our system uses the N most recently rated items by the user, the evaluation requires an online setting. We split the data so that the test part covers 2% of the ratings and only contains the most recent ratings by users. We only measured effectiveness over ratings of at least 4 out of a maximum of 5, in other words the movies that the user really likes. We compare Recall@10 against the MyMediaLite implementations of BPRMF, WRMF and UserKNN.
  18. Then we move on to the results. On top is the popularity baseline: what happens when we recommend the 10 most popular items the user has not seen. The next three are the existing collaborative filtering baselines we compared against. Our first variant is the deep space collaborative filtering variant, which uses item vectors that are learned from the user-item ratings, and this significantly and greatly outperforms the other models.
  19. Next is our deep-space content-based variant, which uses semantic vectors learned from the words in IMDB user reviews. No collaborative filtering data is used. This variant performs better than the popularity baseline but worse than the existing collaborative filtering baselines. It is still interesting to see that we can estimate substitutability between movies from reviews. This could potentially help to improve the recommendation of new or rarely rated items, provided that we have sufficient text to learn substitutability.
  20. To evaluate the effectiveness of the ranking architecture without using a semantic space, we included a variant in which we represented every movie by a vector over its user-ratings, every user being a dimension, and then generated recommendations by learning a hyperplane as described. This significantly outperformed the existing collaborative filtering baselines we compared against. This shows that the improvement of our collaborative filtering variant is not just the result of learning semantic vectors, but also of the way we learn a hyperplane to rank the items.
  21. We analyzed how changing the hyperparameters changes the effectiveness, and these are the two most interesting parameters. On the left-hand side is the dimensionality of the semantic space. We see that the proposed approach underperforms below 300 dimensions and maxes out at about 1000 dimensions on the MovieLens collection. On the right-hand side is how many of the most recently rated items by the targeted user are used to find an optimal hyperplane. When using only the 5 most recently rated items, the recommendations are far more effective than when using more history.
  22. In summary: In this work we represented items in a semantic space in which the proximity between items reflects their substitutability. Using such a space, we recommend items by optimally ranking a user’s past preferences according to a hyperplane, which in the process also ranks the unrated items according to their expected preference. Our experiments show a significant improvement over existing baselines on Movielens.
  23. So the remaining question is: why does this work so well? The first step of learning semantic representations provides a generalization over item-similarities that is useful to generalize beyond specific items when recommending. Hence the improvement that we observe for our collaborative filtering variant over the VSM variant that uses vectors over user-ratings. But ongoing experiments indicate that it does not matter much how you learn a semantic space. For instance, when using the high-dimensional latent item vectors learned by BPRMF in the proposed hyperplane ranking method, the results are almost as good as with Paragraph2Vec.
  24. Therefore, the real improvement seems to be the pairwise learning to rank architecture, which makes it possible to handle higher dimensionality than existing algorithms. This increase in dimensionality can be used for more extensive encoding of concepts that are potentially useful for recommending items to a user.
  25. And if two items, such as king and woman in this example, differ by multiple concepts, the vector between them approximates the composition over these concepts.
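A minimal sketch of the composition described in note 25, using a toy 2-d space in which axis 0 is assumed to encode "royal" and axis 1 "male". The coordinates are made up so that the relations hold exactly; in a learned space the equality would only be approximate.

```python
def sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]

# Hypothetical toy vectors: [royal, male].
vec = {
    "king": [1.0, 1.0],
    "queen": [1.0, 0.0],
    "man": [0.0, 1.0],
    "woman": [0.0, 0.0],
}

royalty = sub(vec["queen"], vec["woman"])  # [1.0, 0.0]
gender = sub(vec["man"], vec["woman"])     # [0.0, 1.0]

# king and woman differ by both concepts; the vector between them
# equals the composition of the two elementary relations.
king_minus_woman = sub(vec["king"], vec["woman"])
composed = add(royalty, gender)
print(king_minus_woman == composed)  # True
```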