2. Papers in Session 16: Content Analysis 2 – Topics
p527.pdf
A Time-Based Collective Factorization for Topic Discovery
and Monitoring in News
Carmen Vaca (Politecnico di Milano & Escuela Superior Politecnica del Litoral), Amin Mantrach
(Yahoo! Labs), Alejandro Jaimes (Yahoo! Labs), Marco Saerens (Université de Louvain)
P539.pdf
The Dual-Sparse Topic Model: Mining Focused Topics and
Focused Terms in Short Text
Tianyi Lin (The Chinese University of Hong Kong), Wentao Tian (The Chinese University of Hong
Kong), Qiaozhu Mei (University of Michigan), Hong Cheng (The Chinese University of Hong Kong)
P551.pdf
Acquisition of Open-Domain Classes via Intersective
Semantics
Marius Paşca (Google Inc.)
13. Evaluation
• Three datasets: DBLP, 20 Newsgroups, and Twitter
Session 16: Content Analysis 2 - Topics. Presenter: Shirakawa (Osaka University)
The Dual-Sparse Topic Model: Mining Focused Topics and Focused Terms in Short Text
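Datasets (from the paper)
Tweets aggregated per user
Evaluation results (from the paper)
The proposed method (DsparseTM) remains stable even when the 20 Newsgroups documents are shortened.
The proposed method (DsparseTM) performs well overall.
On Twitter, each tweet apparently covers roughly one topic, so the mixture of unigrams [Blei, JMLR03] performs best, but the proposed method still does quite well.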