Enliple korquad challenge
Sanghyun Cho
碁殊危覦 語企 (KorQuAD Challenge)
1.
Enliple覦 語企
KorQuAD 1.0 覈 覦 覿磯蟲 蠍一貉危郁概螻 譟一(delosycho@gmail.com)
2.
Contents
Background to participation, experimental method, experimental results, conclusion
3.
谿瑚 覦郁化 KorQuAD 2.0
伎襯 り 一危 覦磯襯 覦蟆蟆 螻 蟆襷 谿瑚蟆 3 譟郁襷 殊 覺るる
4.
企
覈 れ螻 螳 - 覈 旧 讌ъ 譴觜蠍郁 - 螻給 蟆暑 覈語 伎(12 Layer, 256 Hidden Size) 伎 Tensorflow襷 ろ 伎蠍一 讌ъ 螳朱 覈語 螻 れ 危 朱誤磯ゼ る慨蠍一 螳 觜 蠏碁蠍一 焔レ 企 蟆 覲企る 企骸 ろ 覓伎 螳 螻覩殊 蟆 3
5.
蟷
The released Large model needs too much GPU memory to train directly. A Large model trained on KorQuAD 1.0 is available on GitHub (Kor_pretrain_LM), so the Large model was set as the teacher model and Knowledge Distillation was used to try to raise the performance of the model entered in the challenge.
6.
Changes
Changed parts:
- distillation loss
- Large model soft labels
Parts kept as released:
- KorQuAD data processing and training (random seed and other default code settings unchanged)
7.
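Experimental method
The soft labels for distillation were produced by running the training data through the Large model and saving the outputs with NumPy. Tokenizing used the tokenizer built on the Large model's vocab, and the tokens were converted to token ids with both the Large and the Small model's vocab, with each result saved. The saved data consisted of the following:
- Start/End Soft Label
- Start/End Hard Label
- Segments, Mask
- Token ids (Large Model Vocab), Token ids (Small Model Vocab)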
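The saved fields listed on this slide can be sketched as a single NumPy archive. This is only an illustration: the array names, shapes, and the use of `np.savez` are assumptions, and random arrays stand in for the Large model's outputs (the slide does not say whether the soft labels were stored as logits or as probabilities).

```python
import io
import numpy as np

SEQ_LEN = 8       # toy sequence length (assumption)
NUM_EXAMPLES = 2  # toy number of training examples (assumption)

rng = np.random.default_rng(0)

# Stand-ins for the teacher (Large model) outputs; a real run would take
# these from the fine-tuned Large model instead of a random generator.
features = {
    "start_soft": rng.normal(size=(NUM_EXAMPLES, SEQ_LEN)).astype(np.float32),
    "end_soft": rng.normal(size=(NUM_EXAMPLES, SEQ_LEN)).astype(np.float32),
    "start_hard": np.array([1, 3], dtype=np.int64),
    "end_hard": np.array([2, 5], dtype=np.int64),
    "segments": np.zeros((NUM_EXAMPLES, SEQ_LEN), dtype=np.int64),
    "mask": np.ones((NUM_EXAMPLES, SEQ_LEN), dtype=np.int64),
    "token_ids_large": rng.integers(0, 100, size=(NUM_EXAMPLES, SEQ_LEN)),
    "token_ids_small": rng.integers(0, 100, size=(NUM_EXAMPLES, SEQ_LEN)),
}

# Write everything into one .npz archive (held in memory here) and read it back.
buffer = io.BytesIO()
np.savez(buffer, **features)
buffer.seek(0)
loaded = np.load(buffer)
restored = {name: loaded[name] for name in loaded.files}
```

Keeping both sets of token ids next to the teacher outputs means the student can be trained without running the Large model again.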
8.
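Experimental method: problems in the tokenizing process
The Large model and the Small model differ in how their vocab is composed, so running Bert_Tokenizer with each vocab gives different results. Because of this, when tokens produced by a tokenizer with a different vocab are passed to convert_tokens_to_ids, tokens missing from the other vocab are turned into [UNK] (about 2 to 3 per paragraph). This was treated as noise and the experiments went ahead; the results suggest that this noise did affect performance.
Sentence: 蟾豌 蟾 覿襴貅 讌朱 螳も
Small Model Tokenizer: ['蟾', '##豌', '##', '蟾', '##', '##', '覿襴', '##', '##貅', '讌', '##朱', '螳', '##']
Large Model Tokenizer: ['蟾豌', '##', '##', '蟾', '##', '覿襴', '##', '##貅', '讌', '##朱', '螳', '##']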
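The [UNK] effect described on this slide can be reproduced with two toy vocabularies. The helper below is a hypothetical stand-in for a tokenizer's convert_tokens_to_ids, not the library function itself, and the English word pieces are invented for illustration.

```python
def tokens_to_ids_sketch(tokens, vocab, unk_token="[UNK]"):
    """Map each token to its id, falling back to the [UNK] id when missing."""
    unk_id = vocab[unk_token]
    return [vocab.get(token, unk_id) for token in tokens]

# Two toy vocabularies that split the same word differently (assumption).
large_vocab = {"[UNK]": 0, "know": 1, "##ledge": 2,
               "dist": 3, "##ill": 4, "##ation": 5}
small_vocab = {"[UNK]": 0, "know": 1, "##led": 2, "##ge": 3,
               "dist": 4, "##ill": 5, "##ation": 6}

# Tokens as produced by a tokenizer that uses the Large vocab.
tokens_from_large = ["know", "##ledge", "dist", "##ill", "##ation"]

large_ids = tokens_to_ids_sketch(tokens_from_large, large_vocab)
small_ids = tokens_to_ids_sketch(tokens_from_large, small_vocab)

# "##ledge" does not exist in the Small vocab, so it becomes [UNK].
unk_count = small_ids.count(small_vocab["[UNK]"])
print(large_ids, small_ids, unk_count)
```

Here one token out of five is lost when the ids are looked up in the Small vocab.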
9.
Experimental method: model modification
For the model, no large changes were made; the modifications tried were concatenating the outputs of earlier layers and adding an extra Transformer layer.
[Figure: model variants. Stacks of Transformer layers feeding the Start/End outputs, in the original form and in modified forms with a Concat of hidden states and an extra Transformer layer.]
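The concat variant can be sketched as follows, assuming it means joining the hidden states of the last two Transformer layers along the feature axis before a single linear layer produces the Start and End logits. The function name, shapes, and weights are illustrative and not taken from the slides.

```python
import numpy as np

def span_logits_from_concat(hidden_states, weight, bias):
    """Concatenate the last two layers' hidden states and project to 2 logits.

    hidden_states: list of [seq_len, hidden] arrays, one per Transformer layer.
    weight: [2 * hidden, 2] projection matrix; bias: [2].
    Returns (start_logits, end_logits), each of shape [seq_len].
    """
    concat = np.concatenate(hidden_states[-2:], axis=-1)  # [seq_len, 2 * hidden]
    logits = concat @ weight + bias                        # [seq_len, 2]
    return logits[:, 0], logits[:, 1]

SEQ_LEN, HIDDEN, NUM_LAYERS = 4, 3, 5  # toy sizes (assumption)

# Layer i outputs a constant array filled with i, so the result is easy to check.
hidden_states = [np.full((SEQ_LEN, HIDDEN), float(i)) for i in range(NUM_LAYERS)]
weight = np.ones((2 * HIDDEN, 2))
bias = np.zeros(2)

start_logits, end_logits = span_logits_from_concat(hidden_states, weight, bias)
```

The projection layer doubles in input width, which is one source of the extra parameters that come with this variant.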
10.
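Experimental method: loss
With the distillation loss defined, the hyperparameters of the loss are T (Temperature) and a (Alpha). The higher the temperature, the softer the distribution the model is trained on. Alpha is the weight that decides how much is learned from the soft-label loss versus the hard-label loss.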
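A minimal sketch of a loss with these two hyperparameters follows. The formula itself did not survive in the slide text, so the weighting used here (alpha on the soft-label term, 1 - alpha on the hard-label term) is an assumption based on the alpha values quoted for soft-label and hard-label training; all function names are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)

def distillation_loss(student_logits, teacher_logits, hard_index,
                      temperature, alpha):
    """alpha * soft-label loss + (1 - alpha) * hard-label loss (assumed form)."""
    soft_targets = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_loss = cross_entropy(soft_targets, student_soft)
    # The hard-label term uses the unsoftened student distribution.
    hard_loss = -math.log(softmax(student_logits)[hard_index])
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

student = [2.0, 1.0, 0.1]
teacher = [3.0, 0.5, 0.2]
soft_only = distillation_loss(student, teacher, 0, temperature=3.5, alpha=1.0)
hard_only = distillation_loss(student, teacher, 0, temperature=3.5, alpha=0.0)
mixed = distillation_loss(student, teacher, 0, temperature=3.5, alpha=0.5)
```

For a span-prediction model the same loss would be applied separately to the start and end positions. Some formulations also scale the soft term by T squared; that factor is left out here.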
11.
郁規 螳 覲 ろ
Distillation was divided into three methods for the experiments.
[Figure: Distillation methods (1), (2), (3), each ending in Evaluation. The pipelines combine Soft Label Training and Hard Label Training stages, with a set to 1.0, 0, or 0.5.]
For method 2, the hard-label training stage was run again with the learning rate lowered to 1e-5.
12.
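Experimental results: results by distillation method

Method  T    F1
1       2.5  88.68
2       2.5  88.71
3       2.5  88.31
1       3.5  89.75
2       3.5  89.89
3       3.5  88.95
1       5.0  89.12
2       5.0  89.45
3       5.0  88.93
(F1 in %, dev set)

Setting T to 3.5 and training with soft labels first, then with hard labels (method 2), gave the best performance. The likely reason is that the hard-label training corrects the noise introduced by [UNK] tokens, which arise from the vocab difference during soft-label training.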
13.
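Experimental results: results by model modification

Model variant    F1
Original         88.21
concat           87.98
layer+           88.25
concat + layer+  88.32

Concatenating the two layers before the final layer actually lowered performance compared with the baseline. Adding a Transformer layer raised performance only marginally. Using concat together with an added layer did raise performance, but the gain is not notable given the extra parameters.
Concatenating the layers before the final layer is written as concat, and adding an extra layer is written as layer+.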
14.
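Experimental results: results by model modification with distillation

Model variant    Method  F1
Original         1       88.75
concat           1       88.49
layer+           1       88.97
concat + layer+  1       89.32
Original         2       89.50
concat           2       89.27
layer+           2       89.56
concat + layer+  2       90.00
(F1 in %, dev set, T=3.5)

Combining distillation with the additional layers gave the largest gain over the baseline (90.00 with method 2), at the cost of more parameters.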
15.
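Conclusion
The Large model was set as the teacher model and Knowledge Distillation experiments were run.
With no change to the model, F1 improved by about 1.3 points (88.21 to 89.50).
Concatenating the two layers before the final Transformer layer and adding an extra layer improved F1 by about 1.8 points (88.21 to 90.00), the highest performance.
When the vocab of the two models differs, distillation method 2 appears to reduce the errors caused by the resulting noise.
T appears to control the model's bias/variance trade-off (in these experiments, T = 3.5 gave the best F1).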
16.
蟆磯 覦 伎
るジ ろ vocab 螳 Mecab-base 覈瑚骸 Mecab-small 覈語 distillation 1覯 distillation 覦覯 螳 焔レ 覲伎. 蟆 覈語 覲蟇磯 レ 蠏碁襦 蟆 豢螳 feature (螳豌企, 奄) 豢螳襦 distillation 螻手 貉語. 襯 蟆 蟆 伎 豐覦蟆 譴觜 feature襯 豢螳 覈語 牛企慨讌 覈詩 蟆 襷 磯, 觜襦 讌襷 危 覈語 牛 れ 焔レ 螳企慨 螻 . 譬 襯 牛 覿 覩碁螻 PyTorch襯 觜襯願 螻給 企骸 伎 譬朱, 企糾 螳 KorQuAD 豢 讌 企骸 伎 譬 蟆渚伎. 3
17.
References
Sun, Siqi, et al. "Patient knowledge distillation for BERT model compression." arXiv preprint arXiv:1908.09355 (2019).
Turc, Iulia, et al. "Well-read students learn better: On the importance of pre-training compact models." arXiv preprint arXiv:1908.08962 (2019).
https://light-tree.tistory.com/196
https://jamiekang.github.io/2017/05/21/distilling-the-knowledge-in-a-neural-network/
https://blog.lunit.io/2018/03/22/distilling-the-knowledge-in-a-neural-network-nips-2014-workshop/
18.
Thank you