Beijing
30 July 15
Folktales Plot graphs Similarity Experiments
Measuring the Structural and
Conceptual Similarity
of Folktales using Plot Graphs
Victoria Anugrah Lestari & Ruli Manurung
Faculty of Computer Science
Universitas Indonesia
victoria.anugrah@ui.ac.id, maruli@cs.ui.ac.id
Beijing, China
30 July 2015
Folktales
A folktale is a characteristically anonymous, timeless,
and placeless tale circulated orally among a people.
http://onceuponatime.wikia.com/wiki/Rumpelstiltskin_(Fairytale)
http://indonesianfolklore.blogspot.com/2007/10/lutung-kasarung-folklore-from-west-java.html
http://indonesianfolklore.blogspot.com/2007/10/keong-emas-golden-snail-prince-raden.html
Humanities work on folktales
• Vladimir Propp (1928): Morphology of the (Russian) folktale → story grammars
• Aarne-Thompson-Uther (ATU) index (1910, 1961, 2004): story motifs, hierarchy of folktale types
Computational work on folktales
• Vaz Lobo & de Matos (2010): latent semantic mapping + clustering of 453 fairy tales from Gutenberg.
• Nguyen et al. (2012): classification based on genre (e.g. legend, fairytale, joke, puzzle, urban legend) using lexical, POS, NE, and metadata features.
• Nguyen et al. (2013): ranking based on story types (ATU, Brunvand) using IR, lexical features, and SVO triplets.
• Karsdorp & van den Bosch (2013): topic modelling (L-LDA) for multiple labelling of ATU motifs (defined by types).
Folktales as narratives
• Narratives: focus on sequence of related events → structure
• Models of narrative: Turner (1994), Mateas & Stern (2003), Pérez y Pérez & Sharples (2004), etc.
• However, Fisseni & Löwe (2012): people tend to focus on motifs & content, less on structure.
Plot graphs (McIntyre & Lapata, 2010)
Goals of this work
• Construct representations that capture structural & conceptual properties.
• Define a similarity metric and use it to organize folktales.
• Compare to bag-of-words (BoW) based methods with respect to the ATU index.
Representing folktales as plot graphs
Note that the core structure is linear.
Node types: action nodes, child nodes, entity nodes.
Edge types: action edges, action-child edges, entity edges.
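Example
(Plot graph figure. Action nodes, in order: live, sleep, come, play.)
live: lion (subj), forest (in)
sleep: it (subj), tree (under)
come: mouse (subj)
play: subj edge, on edge ("lion" and "it" shown as one linked entity)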
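A minimal sketch of how such a plot graph could be held in memory. The class and field names are hypothetical, and the children attached to "play" are an assumption read off the example labels, not something the slides state.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ActionNode:
    verb: str
    # (edge label, child word) pairs, e.g. ("subj", "lion")
    children: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class PlotGraph:
    # Action nodes in story order; consecutive nodes are joined by
    # action edges, so the core structure is a linear chain.
    actions: List[ActionNode] = field(default_factory=list)
    # Entity nodes: each set groups mentions treated as coreferent.
    entities: List[Set[str]] = field(default_factory=list)

    def action_edges(self) -> List[Tuple[str, str]]:
        verbs = [a.verb for a in self.actions]
        return list(zip(verbs, verbs[1:]))


lion_story = PlotGraph(
    actions=[
        ActionNode("live", [("subj", "lion"), ("in", "forest")]),
        ActionNode("sleep", [("subj", "it"), ("under", "tree")]),
        ActionNode("come", [("subj", "mouse")]),
        # Assumed attachment: subj -> mouse, on -> lion.
        ActionNode("play", [("subj", "mouse"), ("on", "lion")]),
    ],
    entities=[{"lion", "it"}],
)
```

Keeping the action nodes in a list makes the linear core explicit: the action edges are simply the consecutive pairs.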
Automatic construction
Stanford CoreNLP SemanticGraph (a.k.a. dependency parse)
From SemanticGraph to plot graph
Some observation-based heuristics for selecting relations:
• Action nodes: governors of nsubj (nominal subject), expl (expletive "there"), and aux (auxiliary) relations.
• Child nodes: add the child if relation(parent, child) is not one of conj, comp, adv, aux, cop, dep, expl, mark.
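The two heuristics can be sketched over a plain list of dependency triples. This is an illustration only: the (relation, governor, dependent) input format, the function names, the substring matching for "comp" and "adv", and the restriction to direct dependents of action nodes are all assumptions, and no CoreNLP API is called.

```python
ACTION_RELATIONS = {"nsubj", "expl", "aux"}
EXCLUDED_CHILD_TAGS = ("conj", "comp", "adv", "aux", "cop", "dep", "expl", "mark")


def is_excluded(relation):
    # Assumption: "comp" and "adv" stand for relation families such as
    # ccomp/xcomp and advcl/advmod, so tags are matched as substrings.
    return any(tag in relation for tag in EXCLUDED_CHILD_TAGS)


def select_nodes(dependencies):
    """Map each action word to its (relation, child) pairs.

    `dependencies` is a list of (relation, governor, dependent) triples.
    Words are plain strings here, so repeated words are not distinguished.
    """
    actions = []
    for relation, governor, _ in dependencies:
        if relation in ACTION_RELATIONS and governor not in actions:
            actions.append(governor)

    children = {action: [] for action in actions}
    for relation, governor, dependent in dependencies:
        if governor in children and not is_excluded(relation):
            children[governor].append((relation, dependent))
    return children


deps = [
    ("det", "lion", "A"),
    ("nsubj", "lives", "lion"),
    ("prep_in", "lives", "forest"),
    ("nsubj", "sleeping", "it"),
    ("aux", "sleeping", "is"),
]
plot_nodes = select_nodes(deps)
```

In this toy input, "lives" and "sleeping" become action nodes, the auxiliary "is" is filtered out, and the determiner is skipped because its governor is not an action node.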
Construction example
CoreNLP CorefChain (length > 1)
Final result
Measuring plot graph similarity
Story 1: A lion lives in the forest. One day it sleeps under a tree. Then a mouse plays on the lion and disturbs its sleep.
Story 2: A lion eats meat. A lion lives in the jungle. One day it rests under a tree.
Alignment of event sequence
Needleman-Wunsch
Conceptual similarity: Wu-Palmer
Measures the path distance between two words based on the WordNet taxonomy.

Word pair        Similarity
sleep, live      0.25
disturb, rest    0.33
live, eat        0.29
prince, king     0.94
jungle, forest   0.31
palace, house    0.91
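The Wu-Palmer score is 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest common ancestor of the two concepts. A minimal sketch follows; the tiny taxonomy is hypothetical and only stands in for WordNet, so its scores will not match the table above.

```python
# Hypothetical toy taxonomy (child -> parent); it stands in for WordNet.
PARENT = {
    "entity": None,
    "person": "entity",
    "ruler": "person",
    "king": "ruler",
    "prince": "ruler",
    "place": "entity",
    "forest": "place",
}


def path_to_root(word):
    """Return the chain word, parent, ..., root."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def depth(word):
    # The root has depth 1.
    return len(path_to_root(word))


def wu_palmer(a, b):
    ancestors_of_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the
    # least common subsumer (deepest common ancestor).
    lcs = next(node for node in path_to_root(b) if node in ancestors_of_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

Siblings deep in the taxonomy score high (here "king" and "prince" share "ruler"), while words that only meet at the root score low.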
Example mapping
Wu-Palmer similarity (rows: story 1 actions; columns: story 2 actions)

           eat     live    rest
live       0.29    1       0.33
sleep      0.22    0.25    0.43
play       0.29    0.33    0.43
disturb    0.29    0.33    0.33

Alignment scoring & traceback matrix

                   eat     live    rest
           0       -1      -2      -3
live       -1      0.29    0       -1
sleep      -2      -0.71   0.54    1
play       -3      -1.71   -0.38   0.96
disturb    -4      -2.71   -1.38   -0.04
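A sketch of Needleman-Wunsch global alignment over the two action sequences, using the rounded Wu-Palmer values from the table as match scores. The gap penalty of -1 is read off the first row and column of the scoring matrix; the function name and tie-breaking order are assumptions. Because it works from rounded values, individual cells need not match the slide's matrix exactly.

```python
import math

# Rounded similarities from the table: (story 1 action, story 2 action).
SIM = {
    ("live", "eat"): 0.29, ("live", "live"): 1.0, ("live", "rest"): 0.33,
    ("sleep", "eat"): 0.22, ("sleep", "live"): 0.25, ("sleep", "rest"): 0.43,
    ("play", "eat"): 0.29, ("play", "live"): 0.33, ("play", "rest"): 0.43,
    ("disturb", "eat"): 0.29, ("disturb", "live"): 0.33, ("disturb", "rest"): 0.33,
}


def needleman_wunsch(seq1, seq2, sim, gap=-1.0):
    n, m = len(seq1), len(seq2)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sim(seq1[i - 1], seq2[j - 1]),  # match
                score[i - 1][j] + gap,  # gap in seq2
                score[i][j - 1] + gap,  # gap in seq1
            )

    # Traceback from the bottom-right cell, preferring a match on ties.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and math.isclose(
            score[i][j],
            score[i - 1][j - 1] + sim(seq1[i - 1], seq2[j - 1]),
            abs_tol=1e-9,
        ):
            pairs.append((seq1[i - 1], seq2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and math.isclose(score[i][j], score[i - 1][j] + gap, abs_tol=1e-9):
            pairs.append((seq1[i - 1], None))
            i -= 1
        else:
            pairs.append((None, seq2[j - 1]))
            j -= 1
    pairs.reverse()
    return score, pairs


score, pairs = needleman_wunsch(
    ["live", "sleep", "play", "disturb"],
    ["eat", "live", "rest"],
    lambda a, b: SIM[(a, b)],
)
```

The returned pairs use None to mark a gap, so the alignment length n is simply len(pairs).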
Folktale similarity measurement
p1 & p2 = the two plot graphs being compared
α = weighting for action node similarity
β = weighting for child node similarity
(a1i, a2i) = pair of action nodes from alignment of p1 and p2
g = gap penalty
(c1i, c2i) = pair of child nodes from alignment of p1 and p2
n = alignment length of p1 and p2
Initial experiment
• Determining values for α, β, and g.
• For each story, 5 paraphrases were manually created: word replacement, sentence structure change, insertion/deletion of phrases & sentences.
• Measure similarity between paraphrases & across stories. Maximize the difference.

No.  Title                                  #Words
1    A friend in need is a friend indeed    133
2    Honesty is the best policy             129
3    A town mouse and a country mouse       260
4    How to tell a true princess            382
5    The butterfly lovers                   572
6    Rumpelstiltskin                        1106

http://www.english-for-students.com/Simple-Short-Stories.html
Similarity scores using various parameters

                             α = 0.7, β = 0.3       α = 0.5, β = 0.5       α = 0.3, β = 0.7
g =                          -1     -0.5   0        -1     -0.5   0        -1     -0.5   0
Between paraphrases   Avg    0.83   0.80   0.74     0.83   0.80   0.73     0.83   0.79   0.71
                      Min    0.69   0.61   0.53     0.69   0.60   0.49     0.68   0.58   0.45
Across stories        Avg    0.37   0.30   0.15     0.41   0.32   0.12     0.45   0.33   0.09
                      Max    0.55   0.45   0.25     0.55   0.43   0.20     0.55   0.42   0.16
BP min - AS max              0.14   0.16   0.28     0.14   0.17   0.29     0.13   0.16   0.29
Diff. between avgs           0.46   0.50   0.59     0.42   0.48   0.61     0.38   0.46   0.62
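The two margin rows can be recomputed from the four score rows. The sketch below does that for every (α, β, g) setting and reports which setting gives the largest difference between averages; the dictionary layout and names are assumptions, and the slides do not say which criterion decided the final choice.

```python
# Values transcribed from the table; keys are (alpha, beta, g).
# Each tuple: (paraphrase avg, paraphrase min, across avg, across max).
SCORES = {
    (0.7, 0.3, -1.0): (0.83, 0.69, 0.37, 0.55),
    (0.7, 0.3, -0.5): (0.80, 0.61, 0.30, 0.45),
    (0.7, 0.3, 0.0): (0.74, 0.53, 0.15, 0.25),
    (0.5, 0.5, -1.0): (0.83, 0.69, 0.41, 0.55),
    (0.5, 0.5, -0.5): (0.80, 0.60, 0.32, 0.43),
    (0.5, 0.5, 0.0): (0.73, 0.49, 0.12, 0.20),
    (0.3, 0.7, -1.0): (0.83, 0.68, 0.45, 0.55),
    (0.3, 0.7, -0.5): (0.79, 0.58, 0.33, 0.42),
    (0.3, 0.7, 0.0): (0.71, 0.45, 0.09, 0.16),
}


def margins(setting):
    """Return (BP min - AS max, difference between averages)."""
    bp_avg, bp_min, as_avg, as_max = SCORES[setting]
    return round(bp_min - as_max, 2), round(bp_avg - as_avg, 2)


# Setting with the largest gap between the two averages.
best_by_avg_diff = max(SCORES, key=lambda s: margins(s)[1])
```

Both margins grow as the gap penalty moves from -1 towards 0, which is visible in every column group of the table.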
Main experiment: BoW comparison
24 fairy tales from the Fairy Books of Andrew Lang, grouped into 5 clusters under ATU (fairy tales):
• Supernatural Adversaries: Bluebeard; Hansel and Gretel; Jack and the Beanstalk; Rapunzel; The Twelve Dancing Princesses.
• Supernatural or Enchanted Relatives: Beauty and the Beast; Brother and Sister; East of the Sun, West of the Moon; Snow White and Rose Red; The Bushy Bride; The Six Swans; The Sleeping Beauty.
• Supernatural Helpers: Cinderella; Donkey Skin; Puss in Boots; Rumpelstiltskin; The Goose Girl; The Story of Sigurd.
• Magic Objects: Aladdin and the Wonderful Lamp; Fortunatus and His Purse; The Golden Goose; The Magic Ring.
• Other Stories of the Supernatural: Little Thumb; The Princess and the Pea.
Measure similarity within clusters & across clusters.
http://www.gutenberg.org/ebooks/30580
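A sketch of the within-cluster versus across-cluster comparison for one story. The slides do not specify how the bag-of-words similarity is computed; cosine similarity over raw word counts is assumed here, and the function names and toy texts are hypothetical.

```python
import math
from collections import Counter


def bow_cosine(text1, text2):
    """Cosine similarity between raw word-count vectors (assumed BoW measure)."""
    a, b = Counter(text1.lower().split()), Counter(text2.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def within_across(story, texts, cluster_of, sim):
    """Mean similarity of `story` to same-cluster and other-cluster stories."""
    within, across = [], []
    for other in texts:
        if other == story:
            continue
        same_cluster = cluster_of[other] == cluster_of[story]
        (within if same_cluster else across).append(sim(texts[story], texts[other]))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(within), mean(across)


# Toy data, not taken from the experiment.
texts = {"a": "the king rules", "b": "the king sleeps", "c": "a mouse plays"}
cluster_of = {"a": "kings", "b": "kings", "c": "mice"}
within, across = within_across("a", texts, cluster_of, bow_cosine)
```

Any pairwise measure, including a plot graph similarity, can be passed as sim, which keeps the two methods comparable under the same protocol.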
Story type / Story                     Plot graph         Bag of words       Combination
                                       Within   Across    Within   Across    Within   Across
Supernatural adversaries
  Bluebeard                            0.1000   0.1037    0.8629   0.8618    0.4814   0.4586
  Hansel and Gretel                    0.1075   0.1157    0.8492   0.8630    0.4783   0.4894
  Jack and the Beanstalk               0.1050   0.1110    0.9050   0.8891    0.5050   0.5001
  Rapunzel                             0.1000   0.1047    0.8790   0.8575    0.4895   0.4571
  The Twelve Dancing Princesses        0.1125   0.1073    0.8808   0.8631    0.4966   0.4610
Supernatural or enchanted relatives
  Beauty and the Beast                 0.0767   0.0705    0.8803   0.8605    0.4785   0.4397
  Brother and Sister                   0.1233   0.1135    0.8881   0.8722    0.5057   0.4654
  East of the Sun, West of the Moon    0.1117   0.1012    0.8914   0.8571    0.5015   0.4525
  Snow White and Rose Red              0.1200   0.1165    0.8650   0.8566    0.4925   0.4865
  The Bushy Bride                      0.1200   0.1182    0.8862   0.8739    0.5031   0.4960
  The Six Swans                        0.0925   0.1100    0.9006   0.8662    0.5020   0.4881
  The Sleeping Beauty                  0.1125   0.1194    0.8990   0.8918    0.5087   0.5056
Supernatural helpers
  Cinderella                           0.1180   0.1144    0.8150   0.8306    0.4665   0.4725
  Donkey Skin                          0.1040   0.1122    0.8873   0.9025    0.4956   0.5074
  Puss in Boots                        0.1175   0.1095    0.8170   0.8486    0.4672   0.4551
  Rumpelstiltskin                      0.0750   0.0858    0.8467   0.8569    0.4609   0.4478
  The Goose Girl                       0.1240   0.1178    0.8617   0.8624    0.4928   0.4643
  The Story of Sigurd                  0.1080   0.1178    0.8516   0.8670    0.4800   0.4664
Magic objects
  Aladdin and the Wonderful Lamp       0.0975   0.0910    0.8958   0.8664    0.4946   0.4559
  Fortunatus and His Purse             0.1133   0.1185    0.8945   0.8306    0.5039   0.4519
  The Golden Goose                     0.1033   0.1155    0.9006   0.8529    0.5012   0.4611
  The Magic Ring                       0.1033   0.1040    0.9120   0.8960    0.5077   0.4762
Other stories
  Little Thumb                         0.0300   0.1214    0.7444   0.8562    0.3872   0.4675
  The Princess and the Pea             0.0300   0.0405    0.7444   0.7844    0.3872   0.3945
# Similarity within > across           10 (41.67%)        15 (62.50%)        19 (79.16%)
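The summary row can be recomputed from the table by counting, per method, the stories whose within-cluster score exceeds the across-cluster score. A short sketch with the values transcribed in table order (variable names are hypothetical):

```python
# One row per story, in table order:
# (plot graph within, across, bag of words within, across, combination within, across)
ROWS = [
    (0.1000, 0.1037, 0.8629, 0.8618, 0.4814, 0.4586),
    (0.1075, 0.1157, 0.8492, 0.8630, 0.4783, 0.4894),
    (0.1050, 0.1110, 0.9050, 0.8891, 0.5050, 0.5001),
    (0.1000, 0.1047, 0.8790, 0.8575, 0.4895, 0.4571),
    (0.1125, 0.1073, 0.8808, 0.8631, 0.4966, 0.4610),
    (0.0767, 0.0705, 0.8803, 0.8605, 0.4785, 0.4397),
    (0.1233, 0.1135, 0.8881, 0.8722, 0.5057, 0.4654),
    (0.1117, 0.1012, 0.8914, 0.8571, 0.5015, 0.4525),
    (0.1200, 0.1165, 0.8650, 0.8566, 0.4925, 0.4865),
    (0.1200, 0.1182, 0.8862, 0.8739, 0.5031, 0.4960),
    (0.0925, 0.1100, 0.9006, 0.8662, 0.5020, 0.4881),
    (0.1125, 0.1194, 0.8990, 0.8918, 0.5087, 0.5056),
    (0.1180, 0.1144, 0.8150, 0.8306, 0.4665, 0.4725),
    (0.1040, 0.1122, 0.8873, 0.9025, 0.4956, 0.5074),
    (0.1175, 0.1095, 0.8170, 0.8486, 0.4672, 0.4551),
    (0.0750, 0.0858, 0.8467, 0.8569, 0.4609, 0.4478),
    (0.1240, 0.1178, 0.8617, 0.8624, 0.4928, 0.4643),
    (0.1080, 0.1178, 0.8516, 0.8670, 0.4800, 0.4664),
    (0.0975, 0.0910, 0.8958, 0.8664, 0.4946, 0.4559),
    (0.1133, 0.1185, 0.8945, 0.8306, 0.5039, 0.4519),
    (0.1033, 0.1155, 0.9006, 0.8529, 0.5012, 0.4611),
    (0.1033, 0.1040, 0.9120, 0.8960, 0.5077, 0.4762),
    (0.0300, 0.1214, 0.7444, 0.8562, 0.3872, 0.4675),
    (0.0300, 0.0405, 0.7444, 0.7844, 0.3872, 0.3945),
]

METHODS = ["plot graph", "bag of words", "combination"]

# For method k, within is column 2k and across is column 2k + 1.
wins = {
    name: sum(1 for row in ROWS if row[2 * k] > row[2 * k + 1])
    for k, name in enumerate(METHODS)
}
```

Dividing each count by len(ROWS) gives the proportion of stories for which the method ranks a story closer to its own ATU cluster than to the others.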
Analysis & Discussion
• Errors in automatic construction (dependency parses aren't really semantic graphs), e.g. "along came a mouse" vs. "a mouse came", and coreference errors.
• Consistent with Fisseni & Löwe's (2012) findings: focus more on content & motifs?
• Combination of plot graph + BoW yields the best results.
THANK YOU

LaTeCH 2015: Measuring the Structural and Conceptual Similarity of Folktales using Plot Graphs (Lestari & Manurung)

  • 1. Beijing 30 July 15 Folktales Plot graphs Similarity Experiments Measuring the Structural and Conceptual Similarity of Folktales using Plot Graphs Victoria Anugrah Lestari & Ruli Manurung Faculty of Computer Science Universitas Indonesia victoria.anugrah@ui.ac.id, maruli@cs.ui.ac.id Beijing, China 30 July 2015
  • 2. Beijing 30 July 15 Folktales Plot graphs Similarity Experiments Folktales Folktales are a characteristically anonymous, timeless, and placeless tale circulated orally among a people. http://onceuponatime.wikia.com/wiki/Rumpelstiltskin_(Fairytale) http://indonesianfolklore.blogspot.com/2007/10/lutung-kasarung-folklore-from-west-java.html http://indonesianfolklore.blogspot.com/2007/10/keong-emas-golden-snail-prince-raden.html 2/24
  • 3. Beijing 30 July 15 Folktales Plot graphs Similarity Experiments Humanities work on folktales Vladimir Propp (1928): Morphology of the (Russian) folktale story grammars Aarne-Thompson-Uther (ATU) index (1910, 1961, 2004): story motifs, hierarchy of folktale types 3/24
  • 4. Beijing 30 July 15 Folktales Plot graphs Similarity Experiments Computational work on folktales Vaz Lobo & de Matos (2010): latent semantic mapping + clustering 453 fairy tales from Gutenberg. Nguyen et al. (2012): classification based on genre, e.g. legend, fairytale, jokes, puzzle, urban legend, etc. using lexical, POS, NE, metadata. Nguyen et al. (2013): Ranking based on story types (ATU, Brunvand) using IR, lexical, SVO triplets. Karsdorp & van den Bosch (2013): Topic modelling (L-LDA) for multiple labelling of ATU motifs (defined by types). 4/24
  • 5. Beijing 30 July 15 Folktales Plot graphs Similarity Experiments Folktales as narratives Narratives: Focus on sequence of related events structure Models of narrative: Turner (1994), Mateas & Stern (2003), P辿rez y P辿rez & Sharples (2004), etc. 5/24
  • 6. Beijing 30 July 15 Folktales Plot graphs Similarity Experiments Folktales as narratives Narratives: Focus on sequence of related events structure Models of narrative: Turner (1994), Mateas & Stern (2003), P辿rez y P辿rez & Sharples (2004), etc. However: Fisseni & L旦we (2012): People tend to focus on motifs & content, less on structure. 5/24
  • 7. Plot graphs (McIntyre & Lapata, 2010)
  • 8. Goals of this work
    - Construct representations that capture both structural and conceptual properties of folktales.
    - Define a similarity metric and use it to organize folktales.
    - Compare with bag-of-words (BoW) methods with respect to the ATU classification.
  • 9-13. Representing folktales as plot graphs (figure). Node types: action nodes, child nodes, entity nodes. Edge types: action edges, action-child edges, entity edges. Note that the core structure is linear.
  • 14-18. Example (figure): a chain of action nodes live → sleep → come → play, each with child nodes attached by labelled edges: live (subj: lion; in: forest), sleep (subj: it; under: tree), come (subj: mouse), play (subj: it; on: lion).
  • 19. Automatic construction: Stanford CoreNLP SemanticGraph (a.k.a. dependency parse)
  • 20. From SemanticGraph to plot graph. Some observation-based heuristics on selecting relations:
    - Take the governors of nsubj (nominal subject), expl (expletive "there"), and aux (auxiliary) relations.
    - Add a child if relation(parent, child) is not one of conj, comp, adv, aux, cop, dep, expl, mark.
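The selection heuristics above can be sketched in a few lines. This is only an illustration under stated assumptions: the input is a hand-written list of `(relation, governor, dependent)` triples rather than real CoreNLP output, governors of the listed relations are assumed to become action nodes, the excluded names are matched as substrings (so `comp` covers `ccomp` and `xcomp`), and `build_plot_graph` is a hypothetical helper name.

```python
# Relations whose governor is taken as an action node (listed on the slide).
ACTION_RELATIONS = {"nsubj", "expl", "aux"}
# Relation names that block a child from being attached (listed on the slide).
# Substring matching is an assumption made for this sketch.
EXCLUDED_CHILD_RELATIONS = ("conj", "comp", "adv", "aux", "cop", "dep", "expl", "mark")


def is_excluded(relation):
    """Return True if the relation should not produce a child node."""
    return any(name in relation for name in EXCLUDED_CHILD_RELATIONS)


def build_plot_graph(triples):
    """Return (ordered action nodes, children per action) from dependency triples."""
    actions = []
    for relation, governor, _dependent in triples:
        if relation in ACTION_RELATIONS and governor not in actions:
            actions.append(governor)
    children = {action: [] for action in actions}
    for relation, governor, dependent in triples:
        if governor in children and not is_excluded(relation):
            children[governor].append((relation, dependent))
    return actions, children


# Hand-written triples for "A lion lives in the forest. Then it sleeps under a tree."
TRIPLES = [
    ("det", "lion", "A"),
    ("nsubj", "lives", "lion"),
    ("prep_in", "lives", "forest"),
    ("advmod", "sleeps", "Then"),
    ("nsubj", "sleeps", "it"),
    ("prep_under", "sleeps", "tree"),
]

actions, children = build_plot_graph(TRIPLES)
print(actions)
print(children)
```

Words are used directly as node identifiers here, so a verb that occurs twice in a story would collapse into one node; a fuller version would key nodes by sentence and token index.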
  • 21-24. Construction example (figure): CoreNLP CorefChain (length > 1)
  • 25. Final result
  • 26-27. Measuring plot graph similarity
    - Story 1: A lion lives in the forest. One day it sleeps under a tree. Then a mouse plays on the lion and disturbs its sleep.
    - Story 2: A lion eats meat. A lion lives in the jungle. One day it rests under a tree.
  • 28. Alignment of event sequence: Needleman-Wunsch
  • 29. Conceptual similarity: Wu-Palmer. Measures the path distance between 2 words based on the WordNet taxonomy.

    | Word pair      | Similarity |
    |----------------|------------|
    | sleep, live    | 0.25       |
    | disturb, rest  | 0.33       |
    | live, eat      | 0.29       |
    | prince, king   | 0.94       |
    | jungle, forest | 0.31       |
    | palace, house  | 0.91       |
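As a reminder of what the measure computes, the standard Wu-Palmer definition is 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest common ancestor of the two concepts. The sketch below applies it to a tiny hand-made taxonomy, which is an assumption for illustration only; it is not WordNet, so its scores will not match the values on the slide.

```python
# Toy taxonomy (child -> parent); purely illustrative, not WordNet.
PARENT = {
    "entity": None,
    "person": "entity",
    "ruler": "person",
    "king": "ruler",
    "prince": "ruler",
    "place": "entity",
    "forest": "place",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root (inclusive)."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # First shared node on b's path to the root is the deepest common ancestor.
    lcs = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wu_palmer("king", "prince"))
print(wu_palmer("king", "forest"))
```

Concepts that share a deep ancestor (king and prince under ruler) score higher than concepts that only meet at the root (king and forest).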
  • 30. Example mapping

    Wu-Palmer similarity:

    |         | eat  | live | rest |
    |---------|------|------|------|
    | live    | 0.29 | 1    | 0.33 |
    | sleep   | 0.22 | 0.25 | 0.43 |
    | play    | 0.29 | 0.33 | 0.43 |
    | disturb | 0.29 | 0.33 | 0.33 |

    Alignment scoring & traceback matrix:

    |         |    | eat   | live  | rest  |
    |---------|----|-------|-------|-------|
    |         | 0  | -1    | -2    | -3    |
    | live    | -1 | 0.29  | 0     | -1    |
    | sleep   | -2 | -0.71 | 0.54  | 1     |
    | play    | -3 | -1.71 | -0.38 | 0.96  |
    | disturb | -4 | -2.71 | -1.38 | -0.04 |
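The scoring and traceback step can be sketched as a plain Needleman-Wunsch global alignment in which the match score is the word similarity. The similarity values below are copied from the Wu-Palmer table on the slide; the gap penalty of -1 is an assumption inferred from the first row and column of the scoring matrix, and the function name and tie-breaking order are choices made for this sketch. Individual cells may therefore differ from those shown on the slide.

```python
# Wu-Palmer values from the slide: SIM[word in story 1][word in story 2].
SIM = {
    "live":    {"eat": 0.29, "live": 1.0,  "rest": 0.33},
    "sleep":   {"eat": 0.22, "live": 0.25, "rest": 0.43},
    "play":    {"eat": 0.29, "live": 0.33, "rest": 0.43},
    "disturb": {"eat": 0.29, "live": 0.33, "rest": 0.33},
}


def needleman_wunsch(seq1, seq2, sim, gap=-1.0):
    """Globally align two action sequences; returns (score matrix, aligned pairs)."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    score = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sim[seq1[i - 1]][seq2[j - 1]],  # align the pair
                score[i - 1][j] + gap,                                # gap in seq2
                score[i][j - 1] + gap,                                # gap in seq1
            )
    # Traceback from the bottom-right cell; None marks a gap.
    aligned, i, j = [], len(seq1), len(seq2)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and abs(
            score[i][j] - (score[i - 1][j - 1] + sim[seq1[i - 1]][seq2[j - 1]])
        ) < 1e-9:
            aligned.append((seq1[i - 1], seq2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and abs(score[i][j] - (score[i - 1][j] + gap)) < 1e-9:
            aligned.append((seq1[i - 1], None))
            i -= 1
        else:
            aligned.append((None, seq2[j - 1]))
            j -= 1
    aligned.reverse()
    return score, aligned


score, aligned = needleman_wunsch(
    ["live", "sleep", "play", "disturb"], ["eat", "live", "rest"], SIM
)
for row in score:
    print([round(value, 2) for value in row])
print(aligned)
```

Because the alignment is global, every action in both stories ends up either paired with an action from the other story or paired with a gap, which is what makes the alignment length usable as a normaliser later.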
  • 31. Folktale similarity measurement
    - p1 & p2 = the two plot graphs being compared
    - α = weighting for action node similarity
    - β = weighting for child node similarity
    - (a1i, a2i) = pair of action nodes from the alignment of p1 and p2
    - (c1i, c2i) = pair of child nodes from the alignment of p1 and p2
    - g = gap penalty
    - n = alignment length of p1 and p2
  • 32. Initial experiment: determining values for α, β, and g
    - For each story, 5 paraphrases were created manually: word replacement, sentence structure change, insertion/deletion of phrases & sentences.
    - Measure similarity between paraphrases & across stories. Maximize the difference.

    | No. | Title                               | #Words |
    |-----|-------------------------------------|--------|
    | 1   | A friend in need is a friend indeed | 133    |
    | 2   | Honesty is the best policy          | 129    |
    | 3   | A town mouse and a country mouse    | 260    |
    | 4   | How to tell a true princess         | 382    |
    | 5   | The butterfly lovers                | 572    |
    | 6   | Rumpelstiltskin                     | 1106   |

    http://www.english-for-students.com/Simple-Short-Stories.html
  • 33. Similarity scores using various parameters

    |                          | α = 0.7, β = 0.3 |          |       | α = 0.5, β = 0.5 |          |       | α = 0.3, β = 0.7 |          |       |
    |--------------------------|--------|----------|-------|--------|----------|-------|--------|----------|-------|
    | g =                      | -1     | -0.5     | 0     | -1     | -0.5     | 0     | -1     | -0.5     | 0     |
    | Between paraphrases, avg | 0.83   | 0.80     | 0.74  | 0.83   | 0.80     | 0.73  | 0.83   | 0.79     | 0.71  |
    | Between paraphrases, min | 0.69   | 0.61     | 0.53  | 0.69   | 0.60     | 0.49  | 0.68   | 0.58     | 0.45  |
    | Across stories, avg      | 0.37   | 0.30     | 0.15  | 0.41   | 0.32     | 0.12  | 0.45   | 0.33     | 0.09  |
    | Across stories, max      | 0.55   | 0.45     | 0.25  | 0.55   | 0.43     | 0.20  | 0.55   | 0.42     | 0.16  |
    | BP min - AS max          | 0.14   | 0.16     | 0.28  | 0.14   | 0.17     | 0.29  | 0.13   | 0.16     | 0.29  |
    | Diff. between avgs       | 0.46   | 0.50     | 0.59  | 0.42   | 0.48     | 0.61  | 0.38   | 0.46     | 0.62  |
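The two summary rows of the table are simple differences, and the "maximize difference" criterion amounts to a small grid search. The sketch below recomputes both rows from the figures on the slide and reports the setting with the largest gap between averages; the slide does not say which criterion or setting was finally chosen, so picking by the difference between averages is an assumption of this sketch.

```python
# (alpha, beta, g) -> (paraphrase avg, paraphrase min, across avg, across max),
# copied from the table on the slide.
RESULTS = {
    (0.7, 0.3, -1.0): (0.83, 0.69, 0.37, 0.55),
    (0.7, 0.3, -0.5): (0.80, 0.61, 0.30, 0.45),
    (0.7, 0.3, 0.0):  (0.74, 0.53, 0.15, 0.25),
    (0.5, 0.5, -1.0): (0.83, 0.69, 0.41, 0.55),
    (0.5, 0.5, -0.5): (0.80, 0.60, 0.32, 0.43),
    (0.5, 0.5, 0.0):  (0.73, 0.49, 0.12, 0.20),
    (0.3, 0.7, -1.0): (0.83, 0.68, 0.45, 0.55),
    (0.3, 0.7, -0.5): (0.79, 0.58, 0.33, 0.42),
    (0.3, 0.7, 0.0):  (0.71, 0.45, 0.09, 0.16),
}


def separation(stats):
    """Return (paraphrase min - across max, paraphrase avg - across avg)."""
    bp_avg, bp_min, as_avg, as_max = stats
    return round(bp_min - as_max, 2), round(bp_avg - as_avg, 2)


margins = {params: separation(stats) for params, stats in RESULTS.items()}
# Assumed selection rule: largest difference between the two averages.
best = max(margins, key=lambda params: margins[params][1])

for params, (worst_case, avg_diff) in margins.items():
    print(params, worst_case, avg_diff)
print("largest difference between averages:", best)
```

On the worst-case margin (paraphrase minimum minus across-story maximum) two settings tie at 0.29, which is why the sketch ranks by the difference between averages instead.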
  • 34. Main experiment: BoW comparison. 24 fairy tales from the Fairy Books of Andrew Lang, grouped into 5 clusters under ATU (fairy tales):
    - Supernatural Adversaries: Bluebeard; Hansel and Gretel; Jack and the Beanstalk; Rapunzel; The Twelve Dancing Princesses.
    - Supernatural or Enchanted Relatives: Beauty and the Beast; Brother and Sister; East of the Sun, West of the Moon; Snow White and Rose Red; The Bushy Bride; The Six Swans; The Sleeping Beauty.
    - Supernatural Helpers: Cinderella; Donkey Skin; Puss in Boots; Rumpelstiltskin; The Goose Girl; The Story of Sigurd.
    - Magic Objects: Aladdin and the Wonderful Lamp; Fortunatus and His Purse; The Golden Goose; The Magic Ring.
    - Other Stories of the Supernatural: Little Thumb; The Princess and the Pea.
    Measure similarity within clusters & across clusters. http://www.gutenberg.org/ebooks/30580
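The slide does not say how the bag-of-words baseline scores a pair of stories. A common choice, assumed here purely for illustration, is cosine similarity over raw word counts; the tokenizer and function names below are likewise hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(bag1, bag2):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * bag2[word] for word, count in bag1.items())
    norm1 = math.sqrt(sum(count * count for count in bag1.values()))
    norm2 = math.sqrt(sum(count * count for count in bag2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


story1 = "A lion lives in the forest. One day it sleeps under a tree."
story2 = "A lion lives in the jungle. One day it rests under a tree."
print(cosine_similarity(bag_of_words(story1), bag_of_words(story2)))
```

A measure like this ignores word order entirely, which is the contrast with the plot graph representation that the experiment sets out to test.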
  • 35. Results: average similarity of each story to stories within its own cluster vs. across clusters

    | Story type                          | Story                             | Plot graph, within | Plot graph, across | Bag of words, within | Bag of words, across | Combination, within | Combination, across |
    |-------------------------------------|-----------------------------------|--------|--------|--------|--------|--------|--------|
    | Supernatural adversaries            | Bluebeard                         | 0.1000 | 0.1037 | 0.8629 | 0.8618 | 0.4814 | 0.4586 |
    |                                     | Hansel and Gretel                 | 0.1075 | 0.1157 | 0.8492 | 0.8630 | 0.4783 | 0.4894 |
    |                                     | Jack and the Beanstalk            | 0.1050 | 0.1110 | 0.9050 | 0.8891 | 0.5050 | 0.5001 |
    |                                     | Rapunzel                          | 0.1000 | 0.1047 | 0.8790 | 0.8575 | 0.4895 | 0.4571 |
    |                                     | The Twelve Dancing Princesses     | 0.1125 | 0.1073 | 0.8808 | 0.8631 | 0.4966 | 0.4610 |
    | Supernatural or enchanted relatives | Beauty and the Beast              | 0.0767 | 0.0705 | 0.8803 | 0.8605 | 0.4785 | 0.4397 |
    |                                     | Brother and Sister                | 0.1233 | 0.1135 | 0.8881 | 0.8722 | 0.5057 | 0.4654 |
    |                                     | East of the Sun, West of the Moon | 0.1117 | 0.1012 | 0.8914 | 0.8571 | 0.5015 | 0.4525 |
    |                                     | Snow White and Rose Red           | 0.1200 | 0.1165 | 0.8650 | 0.8566 | 0.4925 | 0.4865 |
    |                                     | The Bushy Bride                   | 0.1200 | 0.1182 | 0.8862 | 0.8739 | 0.5031 | 0.4960 |
    |                                     | The Six Swans                     | 0.0925 | 0.1100 | 0.9006 | 0.8662 | 0.5020 | 0.4881 |
    |                                     | The Sleeping Beauty               | 0.1125 | 0.1194 | 0.8990 | 0.8918 | 0.5087 | 0.5056 |
    | Supernatural helpers                | Cinderella                        | 0.1180 | 0.1144 | 0.8150 | 0.8306 | 0.4665 | 0.4725 |
    |                                     | Donkey Skin                       | 0.1040 | 0.1122 | 0.8873 | 0.9025 | 0.4956 | 0.5074 |
    |                                     | Puss in Boots                     | 0.1175 | 0.1095 | 0.8170 | 0.8486 | 0.4672 | 0.4551 |
    |                                     | Rumpelstiltskin                   | 0.0750 | 0.0858 | 0.8467 | 0.8569 | 0.4609 | 0.4478 |
    |                                     | The Goose Girl                    | 0.1240 | 0.1178 | 0.8617 | 0.8624 | 0.4928 | 0.4643 |
    |                                     | The Story of Sigurd               | 0.1080 | 0.1178 | 0.8516 | 0.8670 | 0.4800 | 0.4664 |
    | Magic objects                       | Aladdin and the Wonderful Lamp    | 0.0975 | 0.0910 | 0.8958 | 0.8664 | 0.4946 | 0.4559 |
    |                                     | Fortunatus and His Purse          | 0.1133 | 0.1185 | 0.8945 | 0.8306 | 0.5039 | 0.4519 |
    |                                     | The Golden Goose                  | 0.1033 | 0.1155 | 0.9006 | 0.8529 | 0.5012 | 0.4611 |
    |                                     | The Magic Ring                    | 0.1033 | 0.1040 | 0.9120 | 0.8960 | 0.5077 | 0.4762 |
    | Other stories                       | Little Thumb                      | 0.0300 | 0.1214 | 0.7444 | 0.8562 | 0.3872 | 0.4675 |
    |                                     | The Princess and the Pea          | 0.0300 | 0.0405 | 0.7444 | 0.7844 | 0.3872 | 0.3945 |

    Number of stories with similarity within > across: plot graph 10 (41.67%), bag of words 15 (62.50%), combination 19 (79.16%).
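The summary counts can be recomputed directly from the table: for each method, count the stories whose within-cluster score exceeds their across-cluster score. The figures below are copied from the slide; the variable and function names are just for this sketch.

```python
# (story, plot graph within/across, bag of words within/across, combination within/across)
ROWS = [
    ("Bluebeard", 0.1000, 0.1037, 0.8629, 0.8618, 0.4814, 0.4586),
    ("Hansel and Gretel", 0.1075, 0.1157, 0.8492, 0.8630, 0.4783, 0.4894),
    ("Jack and the Beanstalk", 0.1050, 0.1110, 0.9050, 0.8891, 0.5050, 0.5001),
    ("Rapunzel", 0.1000, 0.1047, 0.8790, 0.8575, 0.4895, 0.4571),
    ("The Twelve Dancing Princesses", 0.1125, 0.1073, 0.8808, 0.8631, 0.4966, 0.4610),
    ("Beauty and the Beast", 0.0767, 0.0705, 0.8803, 0.8605, 0.4785, 0.4397),
    ("Brother and Sister", 0.1233, 0.1135, 0.8881, 0.8722, 0.5057, 0.4654),
    ("East of the Sun, West of the Moon", 0.1117, 0.1012, 0.8914, 0.8571, 0.5015, 0.4525),
    ("Snow White and Rose Red", 0.1200, 0.1165, 0.8650, 0.8566, 0.4925, 0.4865),
    ("The Bushy Bride", 0.1200, 0.1182, 0.8862, 0.8739, 0.5031, 0.4960),
    ("The Six Swans", 0.0925, 0.1100, 0.9006, 0.8662, 0.5020, 0.4881),
    ("The Sleeping Beauty", 0.1125, 0.1194, 0.8990, 0.8918, 0.5087, 0.5056),
    ("Cinderella", 0.1180, 0.1144, 0.8150, 0.8306, 0.4665, 0.4725),
    ("Donkey Skin", 0.1040, 0.1122, 0.8873, 0.9025, 0.4956, 0.5074),
    ("Puss in Boots", 0.1175, 0.1095, 0.8170, 0.8486, 0.4672, 0.4551),
    ("Rumpelstiltskin", 0.0750, 0.0858, 0.8467, 0.8569, 0.4609, 0.4478),
    ("The Goose Girl", 0.1240, 0.1178, 0.8617, 0.8624, 0.4928, 0.4643),
    ("The Story of Sigurd", 0.1080, 0.1178, 0.8516, 0.8670, 0.4800, 0.4664),
    ("Aladdin and the Wonderful Lamp", 0.0975, 0.0910, 0.8958, 0.8664, 0.4946, 0.4559),
    ("Fortunatus and His Purse", 0.1133, 0.1185, 0.8945, 0.8306, 0.5039, 0.4519),
    ("The Golden Goose", 0.1033, 0.1155, 0.9006, 0.8529, 0.5012, 0.4611),
    ("The Magic Ring", 0.1033, 0.1040, 0.9120, 0.8960, 0.5077, 0.4762),
    ("Little Thumb", 0.0300, 0.1214, 0.7444, 0.8562, 0.3872, 0.4675),
    ("The Princess and the Pea", 0.0300, 0.0405, 0.7444, 0.7844, 0.3872, 0.3945),
]


def count_within_greater(rows, within_col):
    """Count rows whose 'within' score beats the 'across' score in the next column."""
    return sum(1 for row in rows if row[within_col] > row[within_col + 1])


counts = {
    "plot graph": count_within_greater(ROWS, 1),
    "bag of words": count_within_greater(ROWS, 3),
    "combination": count_within_greater(ROWS, 5),
}
for method, count in counts.items():
    print(f"{method}: {count} of {len(ROWS)} ({100 * count / len(ROWS):.2f}%)")
```

The counts agree with the totals reported on the slide (10, 15, and 19 of 24).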
  • 36. Analysis & Discussion
    - Errors in automatic construction (dependency parses aren't really semantic graphs), e.g. "along came a mouse" vs. "a mouse came"; coreference errors.
    - Consistent with the findings of Fisseni & Löwe (2012): do people focus more on content & motifs?
    - The combination of plot graph + BoW yields the best results.
  • 37. Thank you