Context-Aware Derivation Prediction
Ekaterina Vylomova, Ryan Cotterell, Tim Baldwin, Trevor Cohn
SIGMORPHON SHARED TASK ON INFLECTION
run + PRESENT PARTICIPLE → running
ran + PAST + PRESENT PARTICIPLE → running
GRAMMAR TAGS
• Well-studied
• Regular
• High Accuracy
Ekaterina Vylomova, evylomova@gmail.com
WHAT ABOUT DERIVATION?
STEM      AGENT      PATIENT   RESULT      ABILITY
bring     bringer    -         -           -
stand     -          standee   -           -
simulate  simulator  -         simulation  simulatable
employ    employer   employee  employment  employable
Derivational slots?
Empty cells
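The paradigm table above can be sketched as a small data structure in which a missing slot is an empty cell. This is only an illustration: the slot assignment is read off the suffixes (for example, -ee as PATIENT), and the names SLOTS, PARADIGMS and empty_cells are hypothetical, not taken from the authors' code.

```python
from typing import Dict, List

# Slot names as they appear in the slide's table header.
SLOTS = ("AGENT", "PATIENT", "RESULT", "ABILITY")

# Hypothetical encoding: each stem maps only the slots that are attested.
PARADIGMS: Dict[str, Dict[str, str]] = {
    "bring": {"AGENT": "bringer"},
    "stand": {"PATIENT": "standee"},
    "simulate": {
        "AGENT": "simulator",
        "RESULT": "simulation",
        "ABILITY": "simulatable",
    },
    "employ": {
        "AGENT": "employer",
        "PATIENT": "employee",
        "RESULT": "employment",
        "ABILITY": "employable",
    },
}


def empty_cells(stem: str) -> List[str]:
    """Return the slots that have no attested form for this stem."""
    return [slot for slot in SLOTS if slot not in PARADIGMS[stem]]
```

Calling empty_cells("simulate") lists the one slot the table leaves blank for that stem, which is the kind of gap that makes derivation less paradigm-like than inflection.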
DISTRIBUTIONAL SEMANTICS
J. R. FIRTH
Z. HARRIS
L. WITTGENSTEIN
MOTIVATION
How well can we predict derivations directly from the context?
. . . the ergometer 's inability to properly SIMULATE the larger rowers drag on a boat . . .
. . . this SIMULATE → SIMULATION package is based on Simula 's object oriented features and its coroutine concept . . .
. . . Bay pilots trained for the visit on a SIMULATE → SIMULATOR at the California Maritime Academy . . .
BASELINE: 3-gram language model with modified Kneser-Ney (KN) smoothing
Candidate sentence                                                          log p
This SIMULATE package is based on Simula 's object oriented features ...    -47.9
This SIMULATES package is based on Simula 's object oriented features ...   -50.0
This SIMULATED package is based on Simula 's object oriented features ...   -49.0
This SIMULATING package is based on Simula 's object oriented features ...  -49.5
This SIMULATION package is based on Simula 's object oriented features ...  -46.1
This SIMULATOR package is based on Simula 's object oriented features ...   -48.9
This SIMULATORS package is based on Simula 's object oriented features ...  -50.7
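The baseline's ranking step can be sketched as follows: substitute each candidate form into the slot, score the whole sentence with a trigram language model, and keep the highest log-probability. This is a minimal sketch under stated assumptions, not the authors' setup: it uses add-k smoothing instead of modified Kneser-Ney to stay short, and the toy corpus, the "___" slot marker and all function names are hypothetical.

```python
import math
from collections import Counter


def train_trigram_counts(sentences):
    """Collect trigram and history (bigram) counts plus the vocabulary."""
    tri, hist, vocab = Counter(), Counter(), set()
    for sent in sentences:
        toks = ["<s>", "<s>"] + sent.lower().split() + ["</s>"]
        vocab.update(toks)
        for i in range(2, len(toks)):
            tri[tuple(toks[i - 2:i + 1])] += 1
            hist[tuple(toks[i - 2:i])] += 1
    return tri, hist, vocab


def log_prob(sentence, tri, hist, vocab, k=0.1):
    """Sentence log-probability under an add-k smoothed trigram model.

    NOTE: add-k smoothing is a stand-in for modified Kneser-Ney.
    """
    toks = ["<s>", "<s>"] + sentence.lower().split() + ["</s>"]
    v = len(vocab) + 1  # one extra type reserved for unseen words
    total = 0.0
    for i in range(2, len(toks)):
        num = tri[tuple(toks[i - 2:i + 1])] + k
        den = hist[tuple(toks[i - 2:i])] + k * v
        total += math.log(num / den)
    return total


def best_candidate(template, candidates, tri, hist, vocab):
    """Fill the '___' slot with each candidate and return the best one."""
    scores = {
        cand: log_prob(template.replace("___", cand), tri, hist, vocab)
        for cand in candidates
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus, only large enough to make the example run.
corpus = [
    "this simulation package is free",
    "the simulation package is based on simula",
    "pilots trained on a simulator",
    "we simulate the drag on a boat",
]
tri, hist, vocab = train_trigram_counts(corpus)
best, scores = best_candidate(
    "this ___ package is based on simula",
    ["simulate", "simulation", "simulator"],
    tri, hist, vocab,
)
```

On this toy corpus the noun form wins because its surrounding trigrams were observed, which mirrors how the slide's candidate list is ranked by log p.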
DATASET
• English verb nominalizations only
• CELEX: <accusation, accuse+ation>
  24 suffix classes / 1,456 base lemmas / 3,079 unique lemma pairs
• Contexts: 107,041 contextual instances from English Wikipedia
• Pre-trained word embeddings: word2vec trained on Google News
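A lemma pair such as <accusation, accuse+ation> can be held in a small record, as sketched below. The entry syntax is copied from the slide rather than from the actual CELEX file format, and DerivationPair and parse_pair are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivationPair:
    derived: str  # derived lemma, e.g. "accusation"
    base: str     # base lemma, e.g. "accuse"
    suffix: str   # suffix class, e.g. "ation"


def parse_pair(entry: str) -> DerivationPair:
    """Parse an entry written as '<derived, base+suffix>' (slide notation)."""
    derived, analysis = (part.strip() for part in entry.strip("<> ").split(","))
    base, suffix = analysis.split("+")
    return DerivationPair(derived=derived, base=base, suffix=suffix)


pair = parse_pair("<accusation, accuse+ation>")
```

Grouping such records by their suffix field would give the suffix classes counted on the slide.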
ARCHITECTURE OF THE MODEL
[Figures: model architecture diagrams, slides 16-23]
Accuracy For Predicted Lemmas (inc. bases)
[Figure: models compared: BASELINE, Base Only, CTX Only, Base+CTX, Base+CTX+POS, Base+CTX+POS+SD]
ERROR ANALYSIS: ambiguity and vagueness
ERROR ANALYSIS
Correct (Context of ...)     Predicted
Extra Variety of Forms
  student                    studint, studion, studyant, student
Especially in Split Lexicon Setting
  trailer                    trailer, trailation, trailment
Lack of Forms
  government and governance  government
Bias Towards Productive Suffixes
  stoppage                   stoption
NONCE WORDS
transcribe     laptify        fape        crimmle   beteive
transcription  laptification  fape        crimmle   beterve
transcription  laptification  fapery      crimmler  betention
transcription  laptification  fapication  crimmler  beteption
transcription  laptification  fapionment  crimmler  betention
transcription  laptification  fapist      crimmler  betention
transcription  laptification  fapist      crimmler  beteption
transcript     laptification  fapery      crimmler  betention
transcript     laptification  fapist      crimmler  beteption
CONCLUSION
There is regularity in derivational processes!
Code and data available at https://github.com/ivri/dmorph
Thank you for your { A T T E N D } !


Context-Aware Derivation Prediction // EACL 2017
