12. 12
Evaluation metric
• word-level Jaccard score
  - ex1) pred/GT = I have a pen. / I have a pen. → 1.0
  - ex2) pred/GT = have / I have a pen. → 1/4 = 0.25
$\mathrm{Jaccard} = \dfrac{|\mathrm{pred} \cap \mathrm{GT}|}{|\mathrm{pred} \cup \mathrm{GT}|}$
https://www.kaggle.com/c/tweet-sentiment-extraction/overview/evaluation
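The two examples on this slide can be checked with a few lines of Python. This is a minimal sketch of a word-level Jaccard score over whitespace-separated tokens. The lowercasing and the handling of two empty strings are assumptions made here, not details stated on the slide.

```python
def jaccard(pred: str, gt: str) -> float:
    """Word-level Jaccard score between a prediction and the ground truth."""
    # Assumption: words are whitespace-separated and compared case-insensitively.
    a = set(pred.lower().split())
    b = set(gt.lower().split())
    if not a and not b:
        # Assumption: two empty strings count as a perfect match.
        return 1.0
    inter = a & b
    # |A ∪ B| = |A| + |B| - |A ∩ B|
    return len(inter) / (len(a) + len(b) - len(inter))


print(jaccard("I have a pen.", "I have a pen."))  # ex1
print(jaccard("have", "I have a pen."))           # ex2
```

In ex2 the intersection is {have} and the union is {i, have, a, pen.}, which gives 1/4.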
22. 22
Basic pipeline overview
• This is essentially a standard NLP training pipeline
  - The labels and the head depend on how the problem is formulated
Input example:
  sentiment: positive
  text: haha that's way cool! Good morning
  selected_text: haha that's way cool!
Pipeline (diagram):
  1. add sentiment & BPE tokenize: text → text' = <s> positive </s> </s> haha that ' s way ...
  2. make label: selected_text → training label
  3. text' → embedding → RoBERTa → Head → pred
  4. calc loss (pred vs. training label) & optimize weight
23. 23
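Basic pipeline overview
• How to formulate this problem
  - start/end approach (the mainstream one)
  - segmentation approach
Note: there are various other approaches as well, e.g. solving the task with the attention of a model trained to predict sentiment
(https://www.kaggle.com/cdeotte/unsupervised-text-selection)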
start/end approach:
  selected_text: haha that's way cool!
  text'       : <s> positive </s> </s> haha that ' s way cool ! Good morning
  start label : 0   0        0    0    1    0    0 0 0   0    0 0    0
  end label   : 0   0        0    0    0    0    0 0 0   0    1 0    0
segmentation approach:
  selected_text: haha that's way cool!
  text'       : <s> positive </s> </s> haha that ' s way cool ! Good morning
  label       : 0   0        0    0    1    1    1 1 1   1    1 0    0
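The label construction for both approaches can be sketched as below. This is an illustration under stated assumptions, not the author's code: `tokenize_with_offsets` is a hypothetical regex tokenizer standing in for the real BPE tokenizer, and `make_labels` is a hypothetical helper name. It happens to split the slide's example into the same 13 tokens.

```python
import re


def tokenize_with_offsets(text):
    """Toy stand-in for BPE (assumption): words and punctuation with char offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def make_labels(sentiment, text, selected_text):
    """Return tokens plus start, end and segmentation labels for one example."""
    toks = tokenize_with_offsets(text)
    # text' = <s> sentiment </s> </s> tokens..., as on the slide.
    prefix = ["<s>", sentiment, "</s>", "</s>"]
    tokens = prefix + [t for t, _, _ in toks]

    sel_start = text.find(selected_text)
    if sel_start < 0:
        raise ValueError("selected_text must be a substring of text")
    sel_end = sel_start + len(selected_text)

    # Indices (in text') of tokens that overlap the selected character span.
    inside = [i + len(prefix) for i, (_, s, e) in enumerate(toks)
              if s < sel_end and e > sel_start]
    if not inside:
        raise ValueError("selected_text does not cover any token")

    start_label = [0] * len(tokens)
    end_label = [0] * len(tokens)
    seg_label = [0] * len(tokens)
    start_label[inside[0]] = 1           # one-hot: first selected token
    end_label[inside[-1]] = 1            # one-hot: last selected token
    for i in range(inside[0], inside[-1] + 1):
        seg_label[i] = 1                 # 1 for every token in the span
    return tokens, start_label, end_label, seg_label


tokens, start, end, seg = make_labels(
    "positive", "haha that's way cool! Good morning", "haha that's way cool!")
print(tokens)
print(start)
print(end)
print(seg)
```

With a real BPE tokenizer the token boundaries would differ, but the overlap test on character offsets stays the same.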
24. 24
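Multi-task learning
• Prediction uses the start/end approach, but during training the segmentation approach is trained at the same time
  - Tuning the ratio between the individual losses well improves the score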
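Diagram:
  RoBERTa → start head        vs. start label        : CE loss
  RoBERTa → end head          vs. end label          : CE loss
  RoBERTa → segmentation head vs. segmentation label : Lovasz-hinge loss
  (only the start and end heads are used for prediction)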
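A rough sketch of how the three losses might be combined for a single example is shown below, in pure Python for readability. The weight `seg_weight` and its default value are hypothetical, since the slide only says that the ratio between the losses was tuned. The Lovasz hinge here follows the usual binary formulation (hinge errors sorted in descending order and weighted by the increments of the Jaccard loss); the slide does not show the actual implementation.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy over token positions for one example."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def lovasz_hinge(logits, labels):
    """Binary Lovasz hinge for one example; labels are 0/1 per token."""
    # Hinge error per token: 1 - logit * sign, with sign = +1 for label 1, -1 for label 0.
    errors = [1.0 - x * (2 * y - 1) for x, y in zip(logits, labels)]
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    total_pos = sum(labels)
    loss, prev_jaccard, cum_pos, cum_neg = 0.0, 0.0, 0, 0
    for i in order:
        cum_pos += labels[i]
        cum_neg += 1 - labels[i]
        intersection = total_pos - cum_pos
        union = total_pos + cum_neg
        jaccard = 1.0 - intersection / union
        # Weight each positive error by the increase in Jaccard loss at this rank.
        loss += max(errors[i], 0.0) * (jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return loss


def multitask_loss(start_logits, end_logits, seg_logits,
                   start_idx, end_idx, seg_labels, seg_weight=0.5):
    """CE on the start/end heads plus a weighted Lovasz hinge on the segmentation head."""
    # seg_weight is a hypothetical hyperparameter; the tuned ratio is not given on the slide.
    return (cross_entropy(start_logits, start_idx)
            + cross_entropy(end_logits, end_idx)
            + seg_weight * lovasz_hinge(seg_logits, seg_labels))


seg_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
seg_logits = [5.0 if y else -5.0 for y in seg_labels]
print(multitask_loss([0.0] * 13, [0.0] * 13, seg_logits, 4, 10, seg_labels))
```

At inference time only the start and end logits would be used; the segmentation head acts as an auxiliary training signal.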