Tensor Decomposition with Missing Indices
Yuto Yamaguchi and Kohei Hayashi
17/08/22 IJCAI2017@Melbourne

Tensor data
Tensor data = multi-dimensional data
e.g., Twitter data, (user, hashtag, location): value
(userA, #movie, Melbourne): 1
(userB, #tennis, Sydney): 2
(userC, #dinner, Canberra): 1
(userB, #beer, Brisbane): 1
(userA, #dinner, Melbourne): 2
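
The sample list above can be held as a sparse, coordinate-format tensor. A minimal sketch in plain Python; the helper names `build_index_maps` and `to_coordinates` are hypothetical and do not come from the slides or the authors' repository.

```python
# One entry per observed sample: ((user, hashtag, location), value).
samples = [
    (("userA", "#movie", "Melbourne"), 1),
    (("userB", "#tennis", "Sydney"), 2),
    (("userC", "#dinner", "Canberra"), 1),
    (("userB", "#beer", "Brisbane"), 1),
    (("userA", "#dinner", "Melbourne"), 2),
]


def build_index_maps(samples):
    """Map each label to an integer id, with one map per mode."""
    n_modes = len(samples[0][0])
    maps = [{} for _ in range(n_modes)]
    for labels, _value in samples:
        for mode, label in enumerate(labels):
            # Ids are assigned in order of first appearance.
            maps[mode].setdefault(label, len(maps[mode]))
    return maps


def to_coordinates(samples, maps):
    """Return {(i, j, k): value} using the integer ids."""
    return {
        tuple(maps[mode][label] for mode, label in enumerate(labels)): value
        for labels, value in samples
    }


maps = build_index_maps(samples)
tensor = to_coordinates(samples, maps)
shape = tuple(len(m) for m in maps)  # (users, hashtags, locations)
```

Only the five observed cells are stored; every other cell of the 3rd-order tensor is left implicit.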

Tensor decomposition
e.g., CP decomposition [Carroll and Chang, 1970]
[Figure: X = a sum of rank-one components built from the columns (U:,1, V:,1, W:,1), (U:,2, V:,2, W:,2), …]
$\hat{X}_{ijk} = \sum_{r} U_{ir} V_{jr} W_{kr}$
Applications
•  Recommendations, noise reduction, data compression, …
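
The CP formula above can be checked on toy numbers. A minimal sketch in plain Python; the factor values are made up for illustration, and `cp_element` and `cp_reconstruct` are hypothetical helper names.

```python
def cp_element(U, V, W, i, j, k):
    """X_hat[i][j][k] = sum over r of U[i][r] * V[j][r] * W[k][r]."""
    return sum(u * v * w for u, v, w in zip(U[i], V[j], W[k]))


def cp_reconstruct(U, V, W):
    """Build the full reconstructed tensor as nested lists."""
    return [
        [[cp_element(U, V, W, i, j, k) for k in range(len(W))]
         for j in range(len(V))]
        for i in range(len(U))
    ]


# Toy rank-2 factors: rows are indices of a mode, columns are components r.
U = [[1.0, 0.0], [0.0, 2.0]]
V = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]
W = [[1.0, 0.5], [0.0, 1.0]]

X_hat = cp_reconstruct(U, V, W)  # shape 2 x 3 x 2
```

Each column r of U, V and W contributes one rank-one term, which is what the figure's sum of components depicts.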

[Our problem] What if indices are missing?
Each line is an index tuple followed by its value:
(userA, #movie, Melbourne): 1
(userB, #tennis, Sydney): 2
(userC, #dinner, Canberra): 1
(userB, #beer, -----------): 1
(userA, ----------, Melbourne): 2
Conventional tensor decomposition algorithms do not apply to these “incomplete samples”.
Values are not missing.
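
One way to hold such incomplete samples in code is to mark a missing index with a sentinel. A minimal sketch, assuming `None` as the sentinel; `MISSING`, `is_complete` and `missing_counts` are hypothetical names.

```python
MISSING = None  # assumed sentinel for a missing index; values are always present

samples = [
    (("userA", "#movie", "Melbourne"), 1),
    (("userB", "#tennis", "Sydney"), 2),
    (("userC", "#dinner", "Canberra"), 1),
    (("userB", "#beer", MISSING), 1),
    (("userA", MISSING, "Melbourne"), 2),
]


def is_complete(sample):
    """True when every index of the sample is observed."""
    indices, _value = sample
    return all(ix is not MISSING for ix in indices)


def missing_counts(samples, n_modes=3):
    """Count the missing indices in each mode."""
    counts = [0] * n_modes
    for indices, _value in samples:
        for mode, ix in enumerate(indices):
            if ix is MISSING:
                counts[mode] += 1
    return counts


complete = [s for s in samples if is_complete(s)]
incomplete = [s for s in samples if not is_complete(s)]
```

A conventional decomposition can only consume `complete`; the two `incomplete` samples still carry a value and two known indices each.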

PROPOSED MODEL

Basic idea
(userA, #movie, Melbourne)
(userB, #tennis, Sydney)
(userC, #dinner, Canberra)
(userB, #beer, Brisbane)
(userA, #dinner, Melbourne)
[Figure: a cycle labeled infer, construct, decompose (e.g., CPD)]
Solve tensor decomposition and missing indices inference repeatedly

Proposed model (1/2)
Handle indices as unobserved variables
[Figure, 3rd-order case: graphical model relating observed (can be missing) indices, true (unobserved) indices, tensor elements, and decomposition parameters]
Observed (can be missing) indices: $\hat{i}_n \in \{1, 2, \ldots, I, \phi\}$, where $\phi$ marks a missing index

Proposed model (2/2)
1. Generate decomposition parameters depending on the decomposition model
e.g., CPD: $\Theta = \{U, V, W\}$, $U_{ir} \sim \mathcal{N}(0, 1/\lambda)$ for all i and r
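
Step 1 can be sketched as drawing each factor entry from a zero-mean Gaussian. This assumes that 1/λ is the variance and that V and W share the same prior as U, which the slide shows only for U; `draw_factor` is a hypothetical helper.

```python
import random


def draw_factor(n_rows, rank, lam, rng):
    """Draw an n_rows x rank factor matrix with i.i.d. N(0, 1/lam) entries."""
    std = (1.0 / lam) ** 0.5  # random.gauss expects a standard deviation
    return [[rng.gauss(0.0, std) for _ in range(rank)] for _ in range(n_rows)]


rng = random.Random(0)  # seeded only so the sketch is repeatable
I, J, K, R, lam = 3, 4, 4, 2, 2.0  # toy sizes, not taken from the slides

theta = {
    "U": draw_factor(I, R, lam, rng),
    "V": draw_factor(J, R, lam, rng),
    "W": draw_factor(K, R, lam, rng),
}
```

A larger λ shrinks the prior variance, so the drawn factors sit closer to zero.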

Proposed model (2/2)
2. Generate N indices $(i_n, j_n, k_n)$
$i_n \sim$ Delta if not missing, Uniform if missing
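
Step 2 can be read as follows: an observed index is kept as is (a point mass), and a missing one is drawn uniformly over the mode. A minimal sketch under that reading; `MISSING`, `draw_true_index` and `draw_true_indices` are hypothetical names, and indices here are 0-based.

```python
import random

MISSING = None  # assumed sentinel for a missing observed index


def draw_true_index(observed, size, rng):
    """Delta on the observed index if present, uniform over range(size) if missing."""
    if observed is not MISSING:
        return observed
    return rng.randrange(size)


def draw_true_indices(observed_triple, sizes, rng):
    """Draw (i_n, j_n, k_n) mode by mode."""
    return tuple(
        draw_true_index(obs, size, rng)
        for obs, size in zip(observed_triple, sizes)
    )


rng = random.Random(0)
sizes = (3, 4, 4)  # toy mode sizes (I, J, K)
drawn = draw_true_indices((1, MISSING, 2), sizes, rng)
```

Only the second mode is random in this call; the first and third indices are returned unchanged.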

Proposed model (2/2)
3. Generate N tensor elements depending on decomposition model
e.g., CPD: $\hat{X}_{i_n j_n k_n} = \sum_{r} U_{i_n r} V_{j_n r} W_{k_n r}$

Proposed model is a natural extension of the conventional tensor decomposition
[Equation lost in extraction; surviving labels: "where", "MLE Θ of the proposed model"]

Parameter inference
Variational MAP-EM algorithm
•  E-step
– Missing indices are inferred using learnt tensor decomposition
•  M-step
– Tensor decomposition is learnt using inferred indices
See the paper for details if interested
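
The slides leave the update equations to the paper, so the following is only a sketch of how such an alternation could look, not the authors' algorithm. It assumes Gaussian observation noise with variance `sigma2` and a uniform prior over the missing index, neither of which is stated on the slides. `e_step_first_mode` computes a distribution q over a missing first-mode index, and `expected_sq_error` is the q-weighted error that an M-step would then reduce with respect to U, V and W.

```python
import math


def cp_element(U, V, W, i, j, k):
    """CP prediction for cell (i, j, k)."""
    return sum(u * v * w for u, v, w in zip(U[i], V[j], W[k]))


def e_step_first_mode(x, j, k, U, V, W, sigma2=1.0):
    """q(i) proportional to exp(-(x - X_hat[i][j][k])^2 / (2 * sigma2))."""
    log_w = [
        -(x - cp_element(U, V, W, i, j, k)) ** 2 / (2.0 * sigma2)
        for i in range(len(U))
    ]
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def expected_sq_error(x, j, k, q, U, V, W):
    """Squared error averaged over the candidate indices, weighted by q."""
    return sum(
        q_i * (x - cp_element(U, V, W, i, j, k)) ** 2
        for i, q_i in enumerate(q)
    )


# Toy factors (made-up numbers) and one sample whose first index is missing.
U = [[1.0, 0.0], [0.0, 2.0]]
V = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]
W = [[1.0, 0.5], [0.0, 1.0]]
q = e_step_first_mode(2.0, 1, 0, U, V, W)
loss = expected_sq_error(2.0, 1, 0, q, U, V, W)
```

With these toy factors the first candidate index predicts the observed value exactly, so q puts most of its mass there.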

Time Complexity (Mth-order tensor)
Proposed algorithm for CPD: [formula lost in extraction]
Conventional CPD: [formula lost in extraction]
$N$: # of samples
$N_m^-$: # of missing indices for mth mode
$R$: # of latent dimensions
$I_m$: # of dimensions for mth mode
Annotation on the proposed formula: "Only additional term"

EXPERIMENTS

Compared algorithms
[MAP-EM]: Proposed algo. with q inferred
[Uniform]: Proposed algo. with q fixed as uniform
[Prior]: Proposed algo. with q fixed as data histogram
[Minimal]: CPD with only complete samples
[Complete]: CPD with only complete modes
[CMTF]: Coupled matrix tensor factorization [Acar+, 2011]
q: approx. distribution on variational inference
Group labels on the slide: Proposed, Baselines
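
The two fixed choices of q can be sketched as follows. This assumes that "data histogram" means the empirical frequency of each index among the observed entries of that mode; the slides do not spell this out, and `uniform_q` and `histogram_q` are hypothetical names.

```python
from collections import Counter


def uniform_q(size):
    """[Uniform]: the same probability for every candidate index."""
    return [1.0 / size] * size


def histogram_q(observed, size):
    """[Prior], as assumed here: frequency of each index among observed entries."""
    counts = Counter(ix for ix in observed if ix is not None)
    total = sum(counts.values())
    if total == 0:
        return uniform_q(size)  # nothing observed, fall back to uniform
    return [counts.get(i, 0) / total for i in range(size)]


# Location ids for five samples; None marks a missing index.
observed_locations = [0, 1, 2, None, 0]
q_uniform = uniform_q(4)
q_prior = histogram_q(observed_locations, 4)
```

[MAP-EM] differs from both in that q is re-estimated from the current decomposition instead of being held fixed.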

Results
[Figure: result plots on synthetic data generated by our model and on Twitter data (user, hashtag, location); panels for sample size large (n=10) and sample size small (n=1); two panels are marked "lower is better" and one "higher is better"]
Proposed model (red) works well if
•  the number of samples is large, or
•  missing ratio is not very large

Summary	
•  [New problem]
–  Defined a new tensor decomposition problem where
the indices are partially missing
•  [Model]
–  Proposed a probabilistic generative model to handle
missing indices
•  [Algorithm]
–  Developed a parameter inference algorithm	
Github: yamaguchiyuto/missing_tensor_decomposition
