4. Semantic analysis
Representation of documents
Axes of a space
- Can be defined in Euclidean space
- Hard to interpret
Probabilistic topics
- Probability distributions defined over words
- Interpretable
5. 1. Axes of a space
- LSA (Latent Semantic Analysis)
2. Probabilistic topics
- LDA (Latent Dirichlet Allocation)
3. Bayesian nonparametrics
- HDP (Hierarchical Dirichlet Process)
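To make the "axes of a space" idea concrete, here is a minimal sketch of LSA as a truncated SVD of a term-document matrix. The count matrix, the helper names `lsa_doc_vectors` and `cosine`, and the choice of k = 2 are all assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

# Hypothetical term-document count matrix (rows: terms, columns: documents).
# Documents 0-1 use only food terms, documents 2-3 only animal terms.
TERMS = ["broccoli", "banana", "cat", "dog"]
X = np.array([
    [1, 2, 0, 0],  # broccoli
    [1, 1, 0, 0],  # banana
    [0, 0, 1, 1],  # cat
    [0, 0, 1, 3],  # dog
], dtype=float)


def lsa_doc_vectors(X, k):
    """Return one k-dimensional vector per document from a rank-k truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Scale the top-k right singular vectors by their singular values.
    return (s[:k, None] * Vt[:k]).T  # shape: (n_docs, k)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


docs = lsa_doc_vectors(X, k=2)
print("doc 0 vs doc 1:", cosine(docs[0], docs[1]))  # two food documents
print("doc 0 vs doc 2:", cosine(docs[0], docs[2]))  # food vs animal document
```

The resulting axes are just directions in Euclidean space: their signs are arbitrary and their coordinates can be negative, which is one reason they are hard to interpret compared with a probability distribution over words.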
23. Topic models
- Topic A: 30% broccoli, 15% bananas, 10% breakfast, 10% munching, …
- Topic B: 20% cats, 20% cute, 15% dogs, 15% hamster, …
Doc 1 : I like to eat broccoli and bananas.
Doc 2 : I ate a banana and tomato smoothie for breakfast.
Doc 3 : Dogs and cats are cute.
Doc 4 : My sister adopted a cat yesterday.
Doc 5 : Look at this cute hamster munching on a piece of broccoli.
Example:
- Doc 1 and 2 : 100% topic A
- Doc 3 and 4 : 100% topic B
- Doc 5 : 60% topic A, 40% topic B
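The mixture in the example can be turned into word probabilities: under a topic model, P(word | doc) is the sum over topics of P(topic | doc) × P(word | topic). The sketch below only plugs in the numbers from the slide. Because the rest of each topic is elided ("…"), words not listed for a topic are treated as having probability 0, which is an assumption made here for illustration; the names `TOPICS`, `DOC_TOPICS`, and `word_prob` are likewise hypothetical.

```python
# Word probabilities per topic, copied from the slide. The elided remainder
# of each topic is unknown, so unlisted words are treated as probability 0
# (an assumption for illustration only).
TOPICS = {
    "A": {"broccoli": 0.30, "bananas": 0.15, "breakfast": 0.10, "munching": 0.10},
    "B": {"cats": 0.20, "cute": 0.20, "dogs": 0.15, "hamster": 0.15},
}

# Topic proportions per document, copied from the example.
DOC_TOPICS = {
    "Doc 1": {"A": 1.0, "B": 0.0},
    "Doc 2": {"A": 1.0, "B": 0.0},
    "Doc 3": {"A": 0.0, "B": 1.0},
    "Doc 4": {"A": 0.0, "B": 1.0},
    "Doc 5": {"A": 0.6, "B": 0.4},
}


def word_prob(doc, word):
    """P(word | doc) = sum over topics of P(topic | doc) * P(word | topic)."""
    return sum(
        weight * TOPICS[topic].get(word, 0.0)
        for topic, weight in DOC_TOPICS[doc].items()
    )


if __name__ == "__main__":
    for word in ("broccoli", "cute", "hamster", "munching"):
        print(f"P({word} | Doc 5) = {word_prob('Doc 5', word):.2f}")
```

For Doc 5, "broccoli" gets its probability entirely from topic A (0.6 × 0.30) and "cute" entirely from topic B (0.4 × 0.20), which is what it means for the document to mix both topics.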