21. Main Components of Chatbot
Entity
???? ??(Entity)? ????! (Named Entity Recognition)
A named entity is a collection of rigidly designated chunks of text that refer to exactly one
or multiple identical, real or abstract concept instances.
Person, Location, Organization, Date, Time, ...
Ex)
??/Date ?? ? ???
?? ??/Location?? ???/Organization ???? ???.
??/Date ???/Location ?? ? ???!
??(Entity)??
22. Main Components of Chatbot
Entity
???? ?? ??? ??? NER ???? ???? ???? ?.
??? NER ??? ???? ?? ?? ???? ????? ??? ?????
???´´! ??? ??? ??? ??? ??. ?? ?? ??? ??? ??
?? ????.
=> ???? ????? ???? ????!
# Pseudocode
Load NER_dict
for token in sent:
    if token in NER_dict:
        Check
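The dictionary lookup in the pseudocode above can be sketched in Python roughly as follows. The contents of `NER_DICT` and the function name `tag_with_dict` are assumptions made for illustration; the slides do not say how the dictionary is stored or loaded.

```python
# Hypothetical dictionary mapping surface forms to entity types.
# A real system would load this from a file or a database.
NER_DICT = {
    "Seoul": "Location",
    "Gangnam": "Location",
    "tomorrow": "Date",
    "Kaist": "Organization",
}


def tag_with_dict(tokens, ner_dict=NER_DICT):
    """Return (token, label) pairs; tokens not in the dictionary get 'O'."""
    return [(token, ner_dict.get(token, "O")) for token in tokens]


sent = "Is it raining in Seoul tomorrow".split()
tagged = tag_with_dict(sent)
print(tagged)
```

A token that is absent from the dictionary always comes out as O, so coverage depends entirely on how complete the dictionary is.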
23. Main Components of Chatbot
Entity
?? ???? ??? ??? ???!!!
???? NER ??? ??? ???? ??..
Sequence Labeling Model (HMM, CRF, RNN, ...) for POS, NER Tagging
[Figure: Hidden Markov Model. Three latent states are linked by transition probabilities (T); each latent state emits one element of the observation sequence with an emission probability (E).]
https://github.com/krikit/annie
* ANNIE [2016 ?? ?? ?? ??? C ??? ?? ???]
Hidden Markov Model[Kaist ??? ???]
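To make the roles of the transition (T) and emission (E) probabilities concrete, here is a minimal Viterbi decoding sketch for an HMM tagger. The two-state toy model, its probabilities, the example words, and the `1e-6` fallback for unseen words are all assumptions chosen for illustration; none of them come from the slides or from ANNIE.

```python
def viterbi(observations, states, start_p, trans_p, emit_p, unseen=1e-6):
    """Return the most likely latent-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s].get(first, unseen), None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(obs, unseen), p)
                for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


# Toy model (assumed numbers): two latent states, three observed words.
states = ["O", "LOC"]
start_p = {"O": 0.8, "LOC": 0.2}
trans_p = {"O": {"O": 0.8, "LOC": 0.2}, "LOC": {"O": 0.6, "LOC": 0.4}}
emit_p = {
    "O": {"weather": 0.4, "in": 0.5, "Seoul": 0.1},
    "LOC": {"weather": 0.05, "in": 0.05, "Seoul": 0.9},
}

path = viterbi(["weather", "in", "Seoul"], states, start_p, trans_p, emit_p)
print(path)
```

Real taggers usually work with log-probabilities to avoid underflow on long sentences; plain products are used here only to keep the sketch short.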
24. Main Components of Chatbot
Entity
BIO tagging for NER Model
???? ??(????) ? ?, ???? ???? ??? ?? ??? ??.
? ? ??? ??? ??? Entity? ???? ??? ?? ????.
Ex> <?? ??>/Person, <??? ??? ???>/Location
??? ??? ???? ?? ???? BIO ??? ????.
B: beginning of an entity, I: inside an entity (the part that continues after B), O: Other
Ex>
??/B-PER ??/I-PER, ???/B-LOC ???/I-LOC ???/I-LOC
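The BIO scheme above can be sketched as a small conversion from entity spans to per-token tags. The function name `spans_to_bio`, the span format (start, end, label) with an exclusive end, and the English example tokens are assumptions for illustration; they simply mirror the shape of the example, a two-token person followed by a three-token location.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans (start, end, label), end exclusive, to BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = "B-" + label          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + label          # remaining tokens of the entity
    return tags


tokens = ["Barack", "Obama", "visited", "New", "York", "City"]
spans = [(0, 2, "PER"), (3, 6, "LOC")]
tags = spans_to_bio(tokens, spans)
print(list(zip(tokens, tags)))
```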
25. Main Components of Chatbot
Entity
Conditional Random Field for NER
http://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf
26. Main Components of Chatbot
Entity
Conditional Random Field for NER
import pycrfsuite  # pip install python-crfsuite

def word2features(sent, i):
    # 1> ??? ??? -2~2? ???
    # 2> ??? ??? -2~2? POS ??
    # 3> ??? ???
    # 4> ???? ??
    # 5> ?? ??
    # 6> ??, ??? ??
    # 7> ???? ?? ? ??
    # ...
    return features
CRF? ??? ??? ?? ????(???? ???)
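A possible shape for such a feature function is sketched below. It covers only the two features that are recognizable in the list above, the words and the POS tags in a -2 to 2 window; the suffix and digit features, the feature-name strings, and the (word, pos) input format are assumptions added for illustration. Each token is turned into a list of strings, which is one of the feature formats python-crfsuite accepts.

```python
def word2features(sent, i, window=2):
    """Features for token i of sent, where sent is a list of (word, pos) pairs."""
    features = ["bias"]
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(sent):
            word, pos = sent[j]
            features.append(f"word[{offset}]={word}")
            features.append(f"pos[{offset}]={pos}")
        elif j < 0:
            features.append(f"BOS[{offset}]")   # position before the sentence start
        else:
            features.append(f"EOS[{offset}]")   # position after the sentence end

    word = sent[i][0]
    # Illustrative extras, not taken from the slide.
    features.append(f"suffix2={word[-2:]}")
    features.append(f"is_digit={word.isdigit()}")
    return features


def sent2features(sent):
    """Feature lists for every token in the sentence."""
    return [word2features(sent, i) for i in range(len(sent))]


sent = [("I", "PRP"), ("visited", "VBD"), ("Seoul", "NNP")]
feats = word2features(sent, 2)
print(feats)
```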
27. Main Components of Chatbot
Entity
Conditional Random Field for NER
import pycrfsuite # pip install python-crfsuite
# ???? ?? ??
tagger = pycrfsuite.Tagger()
tagger.open('myner.crfsuite')  # trained model
predicted = tagger.tag(sent2features(example_sent))
# ?? ?? ?? ?? ?
# B-DATE B-LOC O O O
CRF? ??? ??? ?? ????(???? ???)
cf. ??? ??? Classification? ???
Precision/Recall/F1-Score? ?? ?? ???.
???? ?? ?? F1-Score? 80 ??? ??
?... 20???? ??? ????? ?? ??..?!
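As a rough illustration of how Precision, Recall, and F1-Score can be computed for NER, the sketch below scores predictions at the entity level: a predicted entity counts as correct only if its boundaries and its type both match the gold annotation exactly. This exact-match convention, the helper names, and the example tag sequences are assumptions; the slides do not say which scoring convention was used.

```python
def bio_to_spans(tags):
    """Extract (start, end, label) spans, end exclusive, from BIO tags."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and tag[2:] == label
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level scores with exact boundary and type matching."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


gold = ["B-DATE", "B-LOC", "I-LOC", "O", "O"]
pred = ["B-DATE", "B-LOC", "O", "O", "O"]
p, r, f1 = precision_recall_f1(gold, pred)
print(p, r, f1)
```

In this example the date is found correctly, but the location is cut one token short, so only one of the two predicted entities counts as correct.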
?? state-of-the-art? ??? ???
Bidirectional LSTM + CRF
https://github.com/rockingdingo/deepnlp