11. IGOR
• Model overview from [Weston 2014]
  – I: Input
  – G: Generalization, i.e. updating the memory (storing into it)
  – O: Output
  – R: Response
• According to http://deeplearning.hatenablog.com/entry/memory_networks, "incidentally, the names I, G, O, R come from Igor, Dr. Frankenstein's assistant", but I could not find a source for that claim.
• In practice, G just appends sentence vectors to the memory storage (a minimal code sketch follows below).
• Unless this part is modeled seriously, the road to AGI probably will not open.
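To make the division of labor concrete, here is a minimal, hedged sketch of the four components in Python. Everything in it (the class name, the placeholder encode function, dot-product scoring) is my own illustrative assumption, not code from [Weston 2014]; it only shows that G in the basic model amounts to appending an encoded sentence to a list.

```python
import numpy as np

class MemoryNetworkSketch:
    """Illustrative I/G/O/R skeleton in the spirit of [Weston 2014].

    The component names follow the paper; the vectorizer and the
    scoring function are placeholder assumptions.
    """

    def __init__(self, encode):
        self.encode = encode   # I: Input, maps a sentence to a vector
        self.memory = []       # slots that G writes into

    def G(self, sentence):
        # Generalization: in the basic model this is just "append the
        # encoded sentence to memory", with no compression or forgetting,
        # which is exactly what the slide above points out.
        self.memory.append(self.encode(sentence))

    def O(self, question, k=1):
        # Output: rank memory slots against the encoded question.
        # Dot product stands in for the learned scoring function.
        q = self.encode(question)
        scores = [float(np.dot(q, m)) for m in self.memory]
        return sorted(range(len(self.memory)), key=lambda i: -scores[i])[:k]

    def R(self, question, supporting_ids):
        # Response: the full model maps the question plus the retrieved
        # memories to an answer word; this sketch just returns the slots.
        return [self.memory[i] for i in supporting_ids]
```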
12. Dynamic Memory Networks [Kumar 2016]
• "Most tasks in natural language processing can be cast into question answering (QA) problems over language input."
  – the first sentence of the Abstract
  – i.e. most NLP tasks can be reduced to question answering
• So Memory Networks can solve all kinds of problems!
  – question answering, text classification (), part-of-speech tagging, etc. (see the casting sketch below)
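To show what "casting a task as QA" means in practice, here is a small sketch: every example becomes an (input text, question, answer) triple. The question wordings below are my own illustrations, not taken from [Kumar 2016].

```python
# Each task instance becomes an (input, question, answer) triple.
# The question phrasings are illustrative assumptions, not from the paper.
examples = [
    # question answering: already in QA form
    ("Mary moved to the bathroom. John went to the hallway.",
     "Where is Mary?", "bathroom"),
    # text classification, e.g. sentiment
    ("The movie was a complete waste of time.",
     "What is the sentiment?", "negative"),
    # part-of-speech tagging: one question per token
    ("Time flies like an arrow.",
     "What is the part of speech of 'flies'?", "verb"),
]

for text, question, answer in examples:
    print(f"input: {text}\nQ: {question}\nA: {answer}\n")
```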
19. bAbI dataset
https://research.fb.com/downloads/babi/
• An extremely easy question-answering dataset created for Memory Networks (baby!)
  – tiny vocabulary (10 to 40 words), every answer is a single word
  – qa9 is the only task that contains negation
[qa1_single-supporting-fact]
1 Mary moved to the bathroom.
2 John went to the hallway.
3 Where is Mary? bathroom 1
4 Daniel went back to the hallway.
5 Sandra moved to the garden.
6 Where is Daniel? hallway 4
Lines 3 and 6 each hold a question, its answer, and the id of the supporting fact to consult; the other lines are the knowledge (facts).
* End-to-End Memory Networks do not use the supporting-fact ids. (A parsing sketch follows the table below.)
qa knowledge vocabulary answer
qa1 2000 19 6
qa2 4338 33 6
qa3 14796 34 6
qa4 2000 14 6
qa5 5038 39 7
qa6 2066 33 2
qa7 2638 39 4
qa8 2634 34 14
qa9 2000 22 2
qa10 2000 21 3
qa11 2000 26 6
qa12 2000 20 6
qa13 2000 26 6
qa14 2372 25 6
qa15 2000 17 4
qa16 9000 17 4
qa17 250 16 2
qa18 1230 16 2
qa19 5000 19 12
qa20 1000 35 7
knowledge = total number of fact sentences in the task
vocabulary = vocabulary size over facts and questions
answer = number of distinct answer types
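The raw files are plain text: fact lines are "<id> <sentence>", question lines are "<id> <question>\t<answer>\t<supporting fact ids>", and the id resets to 1 at the start of each story, as in the qa1 excerpt above. A minimal parser sketch under those assumptions (the function name is mine):

```python
def parse_babi(path):
    """Parse one bAbI task file into stories.

    Each story is (facts, questions), where facts is a list of
    (line_id, sentence) and questions is a list of
    (question, answer, supporting_ids).  Assumes the layout shown in
    the qa1 excerpt: ids restart at 1 for a new story, and the
    question/answer/supporting-ids fields are tab-separated.
    """
    stories, facts, questions = [], [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            idx, text = line.split(" ", 1)
            if int(idx) == 1 and (facts or questions):
                stories.append((facts, questions))   # a new story begins
                facts, questions = [], []
            if "\t" in text:
                question, answer, support = text.split("\t")
                questions.append((question.strip(), answer,
                                  [int(s) for s in support.split()]))
            else:
                facts.append((int(idx), text))
    if facts or questions:
        stories.append((facts, questions))
    return stories
```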
20. [qa9_simple-negation]
1 Mary is no longer in the bedroom.
2 Daniel moved to the hallway.
3 Is Mary in the bedroom? no 1
4 Sandra moved to the bedroom.
5 Sandra is in the bathroom.
6 Is Daniel in the bathroom? no 2
[qa20_agents-motivations]
1 Sumit is tired.
2 Where will sumit go? bedroom 1
3 Sumit went back to the bedroom.
4 Why did sumit go to the bedroom? tired 1
5 Sumit grabbed the pajamas there.
6 Why did sumit get the pajamas? tired 1
Are you psychic?!
21. [qa3_three-supporting-facts]
1 Mary moved to the bathroom.
2 Sandra journeyed to the bedroom.
3 Mary got the football there.
4 John went back to the bedroom.
5 Mary journeyed to the office.
6 John journeyed to the office.
7 John took the milk.
8 Daniel went back to the kitchen.
9 John moved to the bedroom.
10 Daniel went back to the hallway.
11 Daniel took the apple.
12 John left the milk there.
13 John travelled to the kitchen.
14 Sandra went back to the bathroom.
15 Daniel journeyed to the bathroom.
16 John journeyed to the bathroom.
17 Mary journeyed to the bathroom.
18 Sandra went back to the garden.
19 Sandra went to the office.
20 Daniel went to the garden.
21 Sandra went back to the hallway.
22 Daniel journeyed to the office.
23 Mary dropped the football.
24 John moved to the bedroom.
25 Where was the football before the bathroom? office 23 17 5
Mary carried the football from the office to the bathroom, which is why the football was in the office before the bathroom.
23. Experiments (qa1-20)
• 50 hidden units, 100 epochs
• For each task, run 5 times with different random initial values and keep the best result
  – The implementation mentioned earlier reports accuracy; converted to error rate to match the paper (see the sketch below)
• validation data = test data (a lazy shortcut)
  – PE = Position Encoding
  – LS = Linear Start
  – RN = Random Noise (regularization of the Temporal Encoding)
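A tiny sketch of the reporting rule described above, assuming each run yields an accuracy in [0, 1]: keep the best of the 5 runs for a task and report it as an error rate in percent, as in the paper's tables. The numbers in the example are made up.

```python
def best_error_rate(run_accuracies):
    """Error rate (%) of the best run for one task."""
    return 100.0 * (1.0 - max(run_accuracies))

# five hypothetical runs of one task with different initial values
print(round(best_error_rate([0.980, 0.991, 0.987, 0.995, 0.979]), 2))  # -> 0.5
```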
25. References
• Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory Networks." arXiv preprint arXiv:1410.3916 (2014).
• Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-End Memory Networks." Advances in Neural Information Processing Systems. 2015.
• Kumar, Ankit, et al. "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing." International Conference on Machine Learning. 2016.
• [赤本] Tsuboi, Unno, and Suzuki. 深層学習による自然言語処理 (Deep Learning for Natural Language Processing), Machine Learning Professional Series, 2017. (in Japanese)