Matrix Factorization
Otakus vs. No. of Figures

       Char. 1   Char. 2   Char. 3   Char. 4
  A       5         3         0         1
  B       4         3         0         1
  C       1         1         0         5
  D       1         0         4         4
  E       0         1         5         4

There are some common factors behind otakus and characters.
http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-and-implementation-in-python/
Otakus vs. No. of Figures

[Figure: otakus A, B, C and the characters each carry hidden attributes; an otaku buys figures of the characters whose attributes match his own.]

The factors are latent: not directly observable (no one cares).
No. of otakus = M,  No. of characters = N,  No. of latent factors = K

Give each otaku a K-dimensional latent vector ($r^A, r^B, \dots, r^E$) and each character a latent vector ($r^1, r^2, r^3, r^4$):

           r^1   r^2   r^3   r^4
  r^A  A    5     3     0     1
  r^B  B    4     3     0     1
  r^C  C    1     1     0     5
  r^D  D    1     0     4     4
  r^E  E    0     1     5     4      (Matrix X, M x N)

Each entry $n_{ij}$ of Matrix X should be explained by the inner product of the matching vectors:

$$r^A \cdot r^1 \approx 5, \qquad r^B \cdot r^1 \approx 4, \qquad r^C \cdot r^1 \approx 1, \qquad \dots$$

Collecting all the vectors,

$$X_{M \times N} \approx
\underbrace{\begin{bmatrix} (r^A)^\top \\ (r^B)^\top \\ \vdots \end{bmatrix}}_{M \times K}
\underbrace{\begin{bmatrix} r^1 & r^2 & \cdots \end{bmatrix}}_{K \times N}$$

Minimizing the reconstruction error of this product is what singular value decomposition (SVD) gives us (with the singular values absorbed into one of the two factors).
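A minimal NumPy sketch of this step (my own code, not from the slides): the complete table is factorized into a rank-K product with SVD, with the singular values absorbed into the otaku-side factor. The choice K = 2 is an assumption for illustration.

```python
import numpy as np

X = np.array([[5, 3, 0, 1],
              [4, 3, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 4, 4],
              [0, 1, 5, 4]], dtype=float)   # M = 5 otakus, N = 4 characters

K = 2                                        # number of latent factors (assumed)
U, s, Vt = np.linalg.svd(X, full_matrices=False)

R_otaku = U[:, :K] * s[:K]                   # M x K: rows are r^A ... r^E
R_char  = Vt[:K, :]                          # K x N: columns are r^1 ... r^4

X_hat = R_otaku @ R_char                     # best rank-K approximation of X
print(np.round(X_hat, 1))
```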
In practice the table can have missing entries (?):

           r^1   r^2   r^3   r^4
  r^A  A    5     3     ?     1
  r^B  B    4     3     ?     1
  r^C  C    1     1     ?     5
  r^D  D    1     ?     4     4
  r^E  E    ?     1     5     4

We still want $r^A \cdot r^1 \approx 5$, $r^B \cdot r^1 \approx 4$, $r^C \cdot r^1 \approx 1$, ..., but SVD can no longer be applied directly. Instead, minimize

$$L = \sum_{(i,j)} \left( r^i \cdot r^j - n_{ij} \right)^2,$$

considering only the defined values $n_{ij}$, and find the $r^i$ and $r^j$ by gradient descent.
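A minimal sketch of this procedure (my own implementation; the initialization, learning rate, and number of steps are illustrative assumptions): gradient descent on the squared error of the observed entries only.

```python
import numpy as np

# Ratings table with np.nan marking the missing ("?") entries.
X = np.array([[5, 3, np.nan, 1],
              [4, 3, np.nan, 1],
              [1, 1, np.nan, 5],
              [1, np.nan, 4, 4],
              [np.nan, 1, 5, 4]])
M, N = X.shape
K = 2                                   # number of latent factors
observed = ~np.isnan(X)                 # mask of defined values

rng = np.random.default_rng(0)
R_otaku = rng.normal(scale=0.1, size=(M, K))   # r^A ... r^E
R_char  = rng.normal(scale=0.1, size=(K, N))   # r^1 ... r^4

lr = 0.01                               # learning rate
for step in range(5000):
    pred = R_otaku @ R_char
    # Error only on the observed entries; missing ones contribute nothing.
    err = np.where(observed, pred - np.nan_to_num(X), 0.0)
    grad_otaku = err @ R_char.T         # dL/dR_otaku (up to a constant factor)
    grad_char  = R_otaku.T @ err        # dL/dR_char
    R_otaku -= lr * grad_otaku
    R_char  -= lr * grad_char

print(np.round(R_otaku @ R_char, 1))    # reconstruction; missing entries are now filled in
```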
Assume the dimensions of r are all 2 (there are two latent factors). For the table above, gradient descent gives, for example:

  r^A = (0.2, 2.1)        r^1 = (0.0, 2.2)
  r^B = (0.2, 1.8)        r^2 = (0.1, 1.5)
  r^C = (1.3, 0.7)        r^3 = (1.9, -0.3)
  r^D = (1.9, 0.2)
  r^E = (2.2, 0.0)

The missing entries are then filled in with the corresponding inner products: $n_{A3} \approx -0.4$, $n_{B3} \approx -0.3$, $n_{C3} \approx 2.2$, $n_{D2} \approx 0.6$, $n_{E1} \approx 0.1$.
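A quick check of this slide (my own code): predicting the missing entries as inner products of the vectors shown above. The results differ slightly from the values on the slide, presumably because the displayed vectors are rounded.

```python
import numpy as np

r_otaku = {'A': np.array([0.2, 2.1]), 'B': np.array([0.2, 1.8]),
           'C': np.array([1.3, 0.7]), 'D': np.array([1.9, 0.2]),
           'E': np.array([2.2, 0.0])}
r_char = {1: np.array([0.0, 2.2]), 2: np.array([0.1, 1.5]),
          3: np.array([1.9, -0.3])}

# The five missing (otaku, character) entries of the table.
missing = [('A', 3), ('B', 3), ('C', 3), ('D', 2), ('E', 1)]
for i, j in missing:
    print(f"predicted n_{i}{j} = {r_otaku[i] @ r_char[j]:.2f}")
```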
More about Matrix Factorization

• Considering the individual characteristics
• Ref: Matrix Factorization Techniques for Recommender Systems

$$L = \sum_{(i,j)} \left( r^i \cdot r^j + b_i + b_j - n_{ij} \right)^2$$

Find $r^i$, $r^j$, $b_i$, $b_j$ by gradient descent, minimizing L (regularization can be added).

  $b_A$: how much otaku A likes to buy figures in general
  $b_1$: how popular character 1 is

So $r^A \cdot r^1 \approx 5$ becomes $r^A \cdot r^1 + b_A + b_1 \approx 5$.
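A minimal sketch of the biased model (my own implementation; the learning rate, regularization strength, initialization, and step count are illustrative assumptions): gradient descent on the observed entries with per-otaku and per-character bias terms and L2 regularization.

```python
import numpy as np

X = np.array([[5, 3, np.nan, 1],
              [4, 3, np.nan, 1],
              [1, 1, np.nan, 5],
              [1, np.nan, 4, 4],
              [np.nan, 1, 5, 4]])
M, N = X.shape
K = 2
observed = ~np.isnan(X)

rng = np.random.default_rng(0)
R_otaku = rng.normal(scale=0.1, size=(M, K))
R_char  = rng.normal(scale=0.1, size=(K, N))
b_otaku = np.zeros(M)          # b_i: how much otaku i likes to buy figures in general
b_char  = np.zeros(N)          # b_j: how popular character j is

lr, lam = 0.01, 0.01           # learning rate and L2 regularization strength
for step in range(5000):
    pred = R_otaku @ R_char + b_otaku[:, None] + b_char[None, :]
    err = np.where(observed, pred - np.nan_to_num(X), 0.0)
    grad_R_otaku = err @ R_char.T + lam * R_otaku
    grad_R_char  = R_otaku.T @ err + lam * R_char
    grad_b_otaku = err.sum(axis=1)
    grad_b_char  = err.sum(axis=0)
    R_otaku -= lr * grad_R_otaku
    R_char  -= lr * grad_R_char
    b_otaku -= lr * grad_b_otaku
    b_char  -= lr * grad_b_char

print(np.round(R_otaku @ R_char + b_otaku[:, None] + b_char[None, :], 1))
```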
Matrix Factorization for Topic Analysis

• Latent semantic analysis (LSA)
• Probabilistic latent semantic analysis (PLSA)
  Thomas Hofmann, Probabilistic Latent Semantic Indexing, SIGIR, 1999
• Latent Dirichlet allocation (LDA)
  David M. Blei, Andrew Y. Ng, Michael I. Jordan, Latent Dirichlet Allocation, Journal of Machine Learning Research, 2003

            Doc 1   Doc 2   Doc 3   Doc 4
  term 1      5       3       0       1
  term 2      4       0       0       1
  term 3      1       1       0       5
  term 4      1       0       0       4
  term 5      0       1       5       4

Number in the table: term frequency (weighted by inverse document frequency).

The latent factors are now topics. The analogy to the previous example: character → document, otaku → word.
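A minimal LSA sketch (my own code; K = 2 topics is an assumption): the term-document matrix is factorized with a truncated SVD, so the K latent dimensions play the role of topics.

```python
import numpy as np

# Rows = words, columns = documents; the numbers are the term frequencies from
# the table above (in practice they would be weighted by inverse document frequency).
X = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)

K = 2                                  # number of topics (assumed)
U, s, Vt = np.linalg.svd(X, full_matrices=False)

word_topics = U[:, :K] * s[:K]         # each word expressed over the K latent topics
doc_topics  = Vt[:K, :].T              # each document expressed over the K latent topics

print(np.round(word_topics, 2))
print(np.round(doc_topics, 2))
```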

Editor's Notes

• #2: We can do dimension reduction on otakus and characters individually.