Encoding Generalized Quantifiers in
Dependency-based Compositional Semantics
Yubing Dong – University of Southern California
Ran Tian – Tohoku University
Yusuke Miyao – National Institute of Informatics, Japan

Background
Generalized Quantifiers (GQ)

Most students like noodles.
[Figure: the sentence annotated as "Most" = generalized quantifier, "students" = property-denoting noun phrase, "like noodles" = predicate]

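$\mathrm{Most}(\mathbf{Student})(\mathbf{LikeNoodles}) \in \{0, 1\}$
Denotations:
• $\mathbf{Student} \subseteq W$
• $\mathbf{LikeNoodles} \subseteq W$
• $\mathrm{Most}$: a binary relation over $2^W$ (the subsets of $W$)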

The relation imposed by a GQ is usually based on the notion $|\cdot|$ of set cardinality:
$\mathrm{Most}(\mathbf{Student})(\mathbf{LikeNoodles})$ iff $\dfrac{|\mathbf{Student} \cap \mathbf{LikeNoodles}|}{|\mathbf{Student}|} > 80\%$

Many
ALotOf
Few
AFew
AtMost[n]
AtLeast[n]
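
The cardinality-based reading above can be sketched in a few lines of Python. This is only an illustration: the toy world is hypothetical, the 80% threshold for "most" is the one used on the slide, and the definitions of `at_least` and `at_most` are the usual textbook ones rather than anything stated in the slides.

```python
from typing import Callable

GQ = Callable[[frozenset, frozenset], bool]

def most(noun: frozenset, pred: frozenset) -> bool:
    """|noun ∩ pred| / |noun| > 80%, written with integers to avoid float error."""
    return 5 * len(noun & pred) > 4 * len(noun)

def at_least(n: int) -> GQ:
    # Assumed reading: at least n members of the noun set have the predicate property.
    return lambda noun, pred: len(noun & pred) >= n

def at_most(n: int) -> GQ:
    # Assumed reading: at most n members of the noun set have the predicate property.
    return lambda noun, pred: len(noun & pred) <= n

# Hypothetical toy world: six students, five of whom like noodles.
student = frozenset({"s1", "s2", "s3", "s4", "s5", "s6"})
like_noodles = frozenset({"s1", "s2", "s3", "s4", "s5", "teacher"})

print(most(student, like_noodles))         # True: 5/6 > 80%
print(at_least(5)(student, like_noodles))  # True
print(at_most(4)(student, like_noodles))   # False
```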

Background
Recognizing Textual Entailment (RTE)

Recognizing Textual Entailment (RTE)
Example:
• $T_1$: Mary loves every dog.
• $T_2$: Tom has a dog.
• $H$: Tom has an animal that Mary loves.
• $T_1, T_2 \Rightarrow H$, i.e. $T_1$ and $T_2$ together entail $H$
Definition: “$T$ entails $H$” ($T \Rightarrow H$) if, typically, a human reading $T$ would infer that $H$ is most likely true
• Relatively loose compared to logical entailment

GQ in RTE
At most 5 students like noodles.
⇒ At most 5 Japanese students like udon noodles.

GQ in RTE
At least 5 students like noodles.
⇐ At least 5 Japanese students like udon noodles.

GQ in RTE
Most Japanese students like udon noodles.

GQ in RTE
The FraCaS Corpus:
• Built in the mid-1990s
• A set of hand-crafted entailment problems covering a wide range of semantic phenomena
Section 1 - Generalized Quantifiers:
• 74 problems:
• 44 have a single premise sentence
• 30 have multiple premise sentences

GQ in RTE
Accuracies of previous systems on Section 1 of the FraCaS corpus

System     Variant              Accuracy
                                Single   Multi    Overall
NatLog     MacCartney07         84.1%    N/A      N/A
NatLog     MacCartney08         97.7%    N/A      N/A
CCG-Dist   Parser Syntax        70.5%    50.0%    62.2%
CCG-Dist   Gold Syntax          88.6%    80.0%    85.1%

GQ in RTE
System     Variant              Accuracy
                                Single   Multi    Overall
NatLog     MacCartney07         84.1%    N/A      N/A
NatLog     MacCartney08         97.7%    N/A      N/A
CCG-Dist   Parser Syntax        70.5%    50.0%    62.2%
CCG-Dist   Gold Syntax          88.6%    80.0%    85.1%
TIFMO      Baseline             79.5%    86.7%    82.4%
TIFMO      Selection            90.9%    93.3%    91.9%
TIFMO      Relation             88.6%    93.3%    90.5%
TIFMO      Selection+Relation   93.2%    96.7%    94.6%

But I’m getting ahead of myself…

Properties of GQs
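Problem with encoding the “perfect semantics”:
$\mathrm{Most}(\mathbf{Student})(\mathbf{LikeNoodles})$ iff $\dfrac{|\mathbf{Student} \cap \mathbf{LikeNoodles}|}{|\mathbf{Student}|} > 80\%$
Challenge: set cardinalities are difficult to encode perfectly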

Properties of GQs
Compromise: only encode major GQ properties
• Interaction with universal and existential quantifications
• Conservativity
• Monotonicity

Properties of GQs
Interaction with universal and existential quantifications
Case 1:
$A \subseteq B \Rightarrow F(A)(B) \Rightarrow A \cap B \neq \emptyset$
Example: “most”
All students like noodles.
There are students who like noodles.

Properties of GQs
Case 2:
$A \subseteq B \Rightarrow F(A)(B) \Rightarrow A \cap B \neq \emptyset$
Example: “a lot of”
A lot of students like noodles.

Properties of GQs
Case 3:
$A \subseteq B \Rightarrow F(A)(B) \Rightarrow A \cap B \neq \emptyset$
Example: “at most n”

Properties of GQs
Conservativity
The “domain restricting” role of the noun argument
• Eliminates objects that do not have the noun property
• Only need to consider which of the remaining objects have the predicate property
$F(A)(B) \iff F(A)(A \cap B)$
Example:
• “Few apples are toxic.” ⟺ “Few apples are toxic apples.”
• We don’t care about toxic things that are not apples, e.g. toxic oranges
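
Conservativity can be checked by brute force over a small universe. The sketch below is illustrative only: the 20% threshold for `few` is an assumption, and `only` is a standard textbook example of a non-conservative operator that does not appear in the slides.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a small finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def is_conservative(gq, universe) -> bool:
    """Check F(A)(B) <=> F(A)(A ∩ B) for every pair of subsets A, B."""
    all_sets = subsets(universe)
    return all(gq(a, b) == gq(a, a & b) for a in all_sets for b in all_sets)

def few(noun, pred):
    # Assumed reading of "few": fewer than 20% of the noun set.
    return 5 * len(noun & pred) < len(noun)

def only(noun, pred):
    # "Only A are B": every B is an A. Depends on members of B outside A.
    return pred <= noun

universe = range(5)
print(is_conservative(few, universe))   # True
print(is_conservative(only, universe))  # False
```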

Properties of GQs
Monotonicity
A GQ $F(\cdot)(\cdot)$ is upward entailing in the noun argument if:
$F(A')(B) \Rightarrow F(A)(B) \quad \forall A' \subseteq A$
Similarly, a GQ can also be
• downward entailing in the noun argument, and
• upward/downward entailing in the predicate argument

Properties of GQs
Monotonicity
Example: “at most $n$” is downward entailing in each argument
At most 5 Japanese students like udon noodles.

Properties of GQs
Monotonicity
Example: “at least $n$” is upward entailing in each argument
At least 5 students like noodles.
⇐ At least 5 Japanese students like udon noodles.

Properties of GQs
Monotonicity
Example: “most” is neither upward nor downward entailing in
the noun argument
Most Japanese students like noodles.

Properties of GQs
Monotonicity
Example: but “most” is upward entailing in the predicate argument
Most students like udon noodles.
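
The monotonicity claims on these slides can be verified exhaustively on a small universe. The following is a minimal sketch under the same assumptions as the cardinality definitions earlier in the deck (80% threshold for "most", textbook readings of "at least n" and "at most n"); the function names are illustrative. A universe of six elements is used because a smaller one cannot exhibit a ratio strictly between 80% and 100%.

```python
from itertools import combinations

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def most(noun, pred):
    return 5 * len(noun & pred) > 4 * len(noun)

def at_least(n):
    return lambda noun, pred: len(noun & pred) >= n

def at_most(n):
    return lambda noun, pred: len(noun & pred) <= n

def monotonicity(gq, universe):
    """Return (noun, predicate) directions: 'up', 'down', 'both' or 'none'."""
    sets = subsets(universe)
    pairs = [(small, big) for small in sets for big in sets if small <= big]

    def direction(apply):
        # apply(x, other) evaluates the GQ with x in the argument under test.
        up = all(apply(big, o) for small, big in pairs for o in sets
                 if apply(small, o))
        down = all(apply(small, o) for small, big in pairs for o in sets
                   if apply(big, o))
        return {(True, True): "both", (True, False): "up",
                (False, True): "down", (False, False): "none"}[(up, down)]

    noun = direction(lambda x, other: gq(x, other))
    pred = direction(lambda x, other: gq(other, x))
    return noun, pred

print(monotonicity(at_most(2), range(6)))   # ('down', 'down')
print(monotonicity(at_least(2), range(6)))  # ('up', 'up')
print(monotonicity(most, range(6)))         # ('none', 'up')
```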

Background
Dependency-based Compositional Semantics (DCS) for RTE
• Proposed by Tian et al. (2014)

DCS for RTE
DCS tree for “All students like udon noodles”

DCS for RTE
Abstract Denotations:
$\mathbf{noodle} \subseteq W$
$\mathbf{udon} \subseteq W$
$\mathbf{student} \subseteq W$
$\mathbf{like} \subseteq W \times W$

DCS for RTE
$D_1 = \mathbf{noodle} \cap \mathbf{udon}$
“udon noodles”

DCS for RTE
$D_2 = \mathbf{like} \cap (W_{SBJ} \times (D_1)_{OBJ})$
“like udon noodles”

DCS for RTE
$D_3 = \pi_{SBJ}(D_2)$
“subjects who like udon noodles”
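
To make the three abstract denotations concrete, they can be evaluated in a small model. This is only an illustration: the world below is hypothetical, and the inference engine described in the talk reasons over these terms symbolically rather than enumerating a model.

```python
# Hypothetical toy world; every name below is an illustrative assumption.
W = {"alice", "bob", "kake_udon", "ramen"}
noodle = {"kake_udon", "ramen"}
udon = {"kake_udon"}
student = {"alice", "bob"}
like = {("alice", "kake_udon"), ("bob", "kake_udon"), ("bob", "ramen")}  # (SBJ, OBJ)

# D1 = noodle ∩ udon: "udon noodles"
d1 = noodle & udon

# D2 = like ∩ (W_SBJ × (D1)_OBJ): "like udon noodles"
d2 = like & {(sbj, obj) for sbj in W for obj in d1}

# D3 = π_SBJ(D2): "subjects who like udon noodles"
d3 = {sbj for sbj, _obj in d2}

print(sorted(d1))  # ['kake_udon']
print(sorted(d2))  # [('alice', 'kake_udon'), ('bob', 'kake_udon')]
print(sorted(d3))  # ['alice', 'bob']
```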

DCS for RTE
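$q^{r}_{\subseteq}(R, C) \equiv \{x \mid \emptyset \neq R \cap (\{x\} \times W_r) \subseteq \{x\} \times C_r\}$
If $R$ and $C$ have the same dimension,
• $q^{r}_{\subseteq}(R, C) = *$ (0-dimension point set) when $C \subseteq R$,
• $q^{r}_{\subseteq}(R, C) = \emptyset$ otherwise
$D_4 = q^{SBJ}_{\subseteq}(D_3, \mathbf{student})$
wide reading of “⊆”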

DCS for RTE
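$q^{r}_{\subseteq}(R, C) \equiv \{x \mid \emptyset \neq R \cap (\{x\} \times W_r) \subseteq \{x\} \times C_r\}$
If $R$ and $C$ have the same dimension,
• $q^{r}_{\subseteq}(R, C) = *$ (0-dimension point set) when $C \subseteq R$,
• $q^{r}_{\subseteq}(R, C) = \emptyset$ otherwise
$D_5 = q^{SBJ}_{\subseteq}(D_2, \mathbf{student})$
narrow reading of “⊆”
(“the set of udon noodles that all students like”)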

DCS for RTE
$D_5 = q^{SBJ}_{\subseteq}(D_2, \mathbf{student})$
Prove the statement
• $D_4 \neq \emptyset$ (wide reading) or
• $D_5 \neq \emptyset$ (narrow reading)
using forward chaining
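
The two readings can be illustrated by evaluating the division operator in a toy model. The sketch follows the two glosses given on the slides (the same-dimension rule, and "the set of udon noodles that all students like"); the helper names and the data are hypothetical, and the actual system proves non-emptiness symbolically by forward chaining instead of computing sets.

```python
def divide_same_dim(r: set, c: set) -> set:
    """Same-dimension case from the slides: {*} when C ⊆ R, otherwise ∅."""
    return {"*"} if c <= r else set()

def divide_subject(r: set, c: set) -> set:
    """Division over SBJ for a binary R of (SBJ, OBJ) pairs:
    the objects y such that (x, y) is in R for every x in C."""
    objects = {obj for _sbj, obj in r}
    return {obj for obj in objects if all((sbj, obj) in r for sbj in c)}

student = {"alice", "bob"}
# D2: "like udon noodles" as (SBJ, OBJ) pairs in a hypothetical model.
d2 = {("alice", "kake_udon"), ("bob", "kake_udon"), ("carol", "zaru_udon")}
d3 = {sbj for sbj, _obj in d2}      # D3 = π_SBJ(D2)

d4 = divide_same_dim(d3, student)   # wide reading
d5 = divide_subject(d2, student)    # narrow reading
print(d4)                           # {'*'}
print(d5)                           # {'kake_udon'}
print(bool(d4) or bool(d5))         # True: the statement holds in this model
```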

DCS for RTE
Basic operators / functions:
• $\times$ - Cartesian product of sets
• $\cap$ - Set intersection
• $\pi_r$ - Projection onto the domain of semantic role $r$
• $l_r$ - Relabeling
• $q^{r}_{\subseteq}$ - Division
Basic types of statements:
• Non-emptiness: $A \neq \emptyset$
• Subsumption: $A \subseteq B$

Background
DCS for RTE: the selection operator
• Also introduced in Tian et al. (2014)

DCS for RTE: the selection operator
• Introduced as an extension to represent the generalized
selection operation in relational algebra
• Marked on a DCS tree node
• Wraps the abstract denotation $D$ to form a new abstract denotation $s_f(D)$
• The properties of $s_f(D)$ can be user-defined
Example:
the set of highest mountains: $s_{highest}(\mathbf{mountain})$
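
As a rough illustration of a user-defined selection, the sketch below wraps an arbitrary rule as a function from a denotation $D$ to $s_f(D)$. The helper `make_selection` and the height data are hypothetical and are not part of the system described in the talk, which attaches axioms to selections rather than computing them over a model.

```python
def make_selection(select):
    """Wrap a user-defined rule as a selection s_f mapping D to s_f(D)."""
    def s_f(denotation: frozenset) -> frozenset:
        chosen = frozenset(select(denotation))
        if not chosen <= denotation:
            raise ValueError("a selection may only keep members of D")
        return chosen
    return s_f

# Approximate heights in metres, used only for illustration.
height = {"everest": 8849, "k2": 8611, "fuji": 3776}
mountain = frozenset(height)

s_highest = make_selection(
    lambda d: {m for m in d if height[m] == max(height[x] for x in d)}
)
print(sorted(s_highest(mountain)))  # ['everest']
```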

Encoding Generalized Quantifiers
as selections

Encoding GQs as Selections
We encode a GQ $F$ using selection $s_F$ as:
$F(A)(B) \equiv s_F(A) \subseteq B$
Basic requirement:
• $F$ should be upward-entailing in the predicate argument $B$
• A major limitation

• Entailment from universal quantification now written as:
$A \subseteq B \Rightarrow s_F(A) \subseteq B$
• Conservativity as:
$s_F(A) \subseteq A \cap B \iff s_F(A) \subseteq B$
• Both hold if we add axiom:
$s_F(A) \subseteq A$

• Entailment to existential quantification now written as:
$s_F(A) \subseteq B \Rightarrow A \cap B \neq \emptyset$
• Holds if we add axiom:
$s_F(A) \cap A \neq \emptyset$

• Monotonicity in the noun argument $A$ (e.g. upward) now written as:
$A \subseteq A' \land s_F(A) \subseteq B \Rightarrow s_F(A') \subseteq B$
• Holds if we add axiom:
$A \subseteq A' \Rightarrow s_F(A) \supseteq s_F(A')$
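
The reasoning behind "holds if we add axiom" on the last three slides can be spelled out step by step. The labels (S1) to (S3) are introduced here only for reference and are not used in the slides.

```latex
% Axioms taken from the slides, labelled here for reference:
% (S1) s_F(A) \subseteq A
% (S2) s_F(A) \cap A \neq \emptyset
% (S3) A \subseteq A' \Rightarrow s_F(A) \supseteq s_F(A')
\begin{align*}
\text{Universal:}\quad
  & A \subseteq B \;\Rightarrow\; s_F(A) \subseteq A \subseteq B
  && \text{by (S1)} \\
\text{Conservativity:}\quad
  & s_F(A) \subseteq B \;\Rightarrow\; s_F(A) \subseteq A \cap B
  && \text{by (S1)} \\
  & s_F(A) \subseteq A \cap B \;\Rightarrow\; s_F(A) \subseteq B
  && \text{since } A \cap B \subseteq B \\
\text{Existential:}\quad
  & s_F(A) \subseteq B \;\Rightarrow\;
    \emptyset \neq s_F(A) \cap A \subseteq A \cap B
  && \text{by (S2)} \\
\text{Monotonicity:}\quad
  & A \subseteq A' \wedge s_F(A) \subseteq B \;\Rightarrow\;
    s_F(A') \subseteq s_F(A) \subseteq B
  && \text{by (S3)}
\end{align*}
```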

DCS tree for “At least 5 students like udon noodles.”
where the GQ “at least 5” is encoded as selection $s_{\mathrm{AtLeast}\,5}$
Example: at least $n$
• Satisfied: upward-entailing in predicate argument
• Entails existential quantification:
$\forall A:\ s_{\mathrm{AtLeast}\,5}(A) \cap A \neq \emptyset$
• Upward-entailing in noun argument:
$\forall A, A' \text{ s.t. } A \subseteq A':\ s_{\mathrm{AtLeast}\,5}(A) \supseteq s_{\mathrm{AtLeast}\,5}(A')$

Example:
“At least 5 Japanese students like udon noodles.”
⇒ “At least 5 students like noodles.”
$D_3' = \pi_{SBJ}(\mathbf{like} \cap (W_{SBJ} \times \mathbf{noodle}_{OBJ}))$
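
The entailment in this example can be traced with a tiny forward chainer over subsumption facts. This is a minimal sketch, not the TIFMO engine: terms are plain strings, `s(...)` stands for $s_{\mathrm{AtLeast}\,5}$, and the facts $\mathbf{jp\_student} \subseteq \mathbf{student}$ and $D_3 \subseteq D_3'$ are assumed to be already available (in the real system they would come from linguistic knowledge and from axioms on $\cap$ and $\pi$).

```python
def forward_chain(facts, terms):
    """Saturate facts (a, b), read as a ⊆ b, under transitivity and the
    selection axiom A ⊆ A' ⇒ s(A') ⊆ s(A), restricted to known terms."""
    facts = set(facts)
    while True:
        new = set()
        for a, b in facts:
            # Transitivity: a ⊆ b and b ⊆ c give a ⊆ c.
            for b2, c in facts:
                if b == b2:
                    new.add((a, c))
            # Selection axiom, only for selection terms that occur in the problem.
            sa, sb = f"s({a})", f"s({b})"
            if sa in terms and sb in terms:
                new.add((sb, sa))
        if new <= facts:
            return facts
        facts |= new

terms = {"jp_student", "student", "s(jp_student)", "s(student)", "D3", "D3'"}
premises = {
    ("jp_student", "student"),  # assumed knowledge: Japanese students are students
    ("D3", "D3'"),              # assumed: likers of udon noodles are likers of noodles
    ("s(jp_student)", "D3"),    # T: at least 5 Japanese students like udon noodles
}
closure = forward_chain(premises, terms)

# H: at least 5 students like noodles, i.e. s(student) ⊆ D3'
print(("s(student)", "D3'") in closure)  # True
```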

Encoding Generalized Quantifiers
as relations

Encoding GQs as Relations
Intro to Relations
• Review: a GQ can be seen as a binary relation over $2^W$
• Therefore, we introduce a new extension: relation
• A new type of statement
• A relation $r_F(A, B)$ can represent an arbitrary custom relation between abstract denotations $A$ and $B$

Intro to Relations
Relation $r_F(A, B)$
• The inference engine keeps track of which term pairs are labeled with which relations
• Do $A$ and $B$ have relation $r_F$?
• Which terms have relation $r_F$ to $A$?
• Supports custom axioms for a relation
• What entails $r_F(A, B)$?
• What does $r_F(A, B)$ entail?

We intuitively encode a GQ $F$ using relation $r_F$ as:
$F(A)(B) \equiv r_F(A, B)$
Statement:
$r_{\mathrm{AtMost}\,5}(\mathbf{student}, D_3)$

• Entailment from universal quantification:
$A \subseteq B \Rightarrow r_F(A, B)$
• Entailment to existential quantification:
$r_F(A, B) \Rightarrow A \cap B \neq \emptyset$
• Monotonicity (e.g. downward in both arguments):
$r_F(A, B) \land A \supseteq A' \land B \supseteq B' \Rightarrow r_F(A', B')$
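
One way to picture the bookkeeping described for relations is a small store that records labeled term pairs and applies a custom axiom. The class below is a hypothetical sketch, not the engine's actual data structure; it supports the two queries listed on the earlier slide and a single pass of the downward monotonicity axiom, using only directly stated subsumption facts.

```python
from collections import defaultdict

class RelationStore:
    """Tracks which term pairs carry which relation label (illustrative names)."""

    def __init__(self):
        self.pairs = defaultdict(set)  # relation name -> {(A, B)}
        self.subset = set()            # known facts (X, Y), meaning X ⊆ Y

    def add_subset(self, small, big):
        self.subset.add((small, big))

    def claim(self, rel, a, b):
        self.pairs[rel].add((a, b))

    def holds(self, rel, a, b):
        """Do A and B have relation rel?"""
        return (a, b) in self.pairs[rel]

    def related_to(self, rel, a):
        """Which terms B satisfy rel(A, B)?"""
        return {b for x, b in self.pairs[rel] if x == a}

    def _below(self, term):
        # Terms directly known to be subsumed by `term`, plus `term` itself.
        return {term} | {s for s, b in self.subset if b == term}

    def apply_downward_axiom(self, rel):
        """r(A, B) ∧ A ⊇ A' ∧ B ⊇ B' ⇒ r(A', B'), one pass over direct facts."""
        derived = {(a2, b2) for a, b in self.pairs[rel]
                   for a2 in self._below(a) for b2 in self._below(b)}
        self.pairs[rel] |= derived

store = RelationStore()
store.add_subset("jp_student", "student")
store.add_subset("D3_udon", "D3_noodle")
store.claim("AtMost5", "student", "D3_noodle")  # at most 5 students like noodles
store.apply_downward_axiom("AtMost5")
print(store.holds("AtMost5", "jp_student", "D3_udon"))  # True
```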

• Conservativity:
$r_F(A, B) \Rightarrow r_F(A, A \cap B)$
• How about the other direction?
$r_F(A, A \cap B) \Rightarrow r_F(A, B)$

Challenge:
• The inference engine is based on forward chaining:
• Always tries to deduce all possible implications from the given premises
• Efficient
• Opens the possibility of adapting DCS for entailment
generation

Challenge:
• The inference engine is based on forward chaining
• Therefore it’s infeasible to enumerate all forms $X = A \cap B$ when $r_F(A, X)$ is claimed
• Number of possibilities explodes exponentially
• e.g. $X = X \cap C\ \ \forall C$, $X = (A \cap B) \cap C = A \cap (B \cap C)$

Implementation: limit the search using the conditions $X \subseteq A \land X \subseteq B$
If $r_F(A, X)$ and $X \subseteq A$:
• For each $B \supseteq X$:
• Check if $X = A \cap B$
We emphasize this detail because formal semantics researchers are often not aware of these difficulties.
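
The restricted search above can be sketched as follows. Everything here is an assumption made for illustration: terms are strings, known subsumptions are stored as pairs, and `intersection_of` is a hypothetical lookup table recording which term is known to equal $A \cap B$. The point is only that candidates $B$ are drawn from the known supersets of $X$ instead of being enumerated freely.

```python
def reverse_conservativity(rel_pairs, subset, intersection_of):
    """From r(A, X) with X ⊆ A, derive r(A, B) for each known B ⊇ X
    such that X is known to equal A ∩ B.

    rel_pairs:       set of (A, X) pairs already labeled with the relation
    subset:          set of (small, big) pairs, each meaning small ⊆ big
    intersection_of: dict mapping frozenset({A, B}) to the term equal to A ∩ B
    """
    derived = set()
    for a, x in rel_pairs:
        if (x, a) not in subset:
            continue                           # the rule requires X ⊆ A
        supersets = {big for small, big in subset if small == x}
        for b in supersets:                    # only B ⊇ X are candidates
            if intersection_of.get(frozenset({a, b})) == x:
                derived.add((a, b))
    return derived

# "Few apples are toxic apples."  =>  "Few apples are toxic."
subset = {("toxic_apple", "apple"), ("toxic_apple", "toxic"), ("toxic_apple", "food")}
intersection_of = {frozenset({"apple", "toxic"}): "toxic_apple"}
few_pairs = {("apple", "toxic_apple")}

derived = reverse_conservativity(few_pairs, subset, intersection_of)
print(derived)  # {('apple', 'toxic')}
```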

Limitations
Limitation:
Relations in DCS trees are always interpreted as having the widest scope, so this encoding cannot handle multiple relations in one sentence.

Limitations
Example:
$P$: At most 10 commissioners spend a lot of time at home.
We want to state
$r_{\mathrm{AtMost}\,10}(\mathbf{commissioners}, D)$
where
$D$ = “people who spend a lot of time at home”
But this is impossible if “a lot of” is also encoded as a relation

Limitations
Example:
$D$ = “people who spend a lot of time at home”
Workaround:
Since “a lot of” is upward-entailing in the predicate argument, we can encode it using selection $s_{ALotOf}$, while still encoding “at most 10” using $r_{\mathrm{AtMost}\,10}$

Limitations
Example:
$D = q^{OBJ}_{\subseteq}(D', s_{ALotOf}(\mathbf{time}))$
where
$D' = \mathbf{spend} \cap (W_{SBJ} \times W_{OBJ} \times \mathbf{home}_{MOD})$
(“spend at home”)

Evaluation
Set-up
The FraCaS Corpus:
• Built in the mid-1990s
• A set of hand-crafted entailment problems covering a wide range of semantic phenomena
Section 1 - Generalized Quantifiers:
• 74 problems:
• 44 have a single premise sentence
• 30 have multiple premise sentences

Evaluation
Set-up
Settings:
• Baseline: simply drop GQs; same tree structure as in the settings below
• Selection: implement all GQs as selections, even those that are downward-entailing in the predicate argument
• Relation: implement all GQs as relations
• Selection+Relation: use relations to encode GQs that are downward-entailing in the predicate argument, and encode the rest with selections

Evaluation
System     Variant              Accuracy
                                Single   Multi    Overall
NatLog     MacCartney07         84.1%    N/A      N/A
NatLog     MacCartney08         97.7%    N/A      N/A
CCG-Dist   Parser Syntax        70.5%    50.0%    62.2%
CCG-Dist   Gold Syntax          88.6%    80.0%    85.1%
TIFMO      Baseline             79.5%    86.7%    82.4%
TIFMO      Selection            90.9%    93.3%    91.9%
TIFMO      Relation             88.6%    93.3%    90.5%
TIFMO      Selection+Relation   93.2%    96.7%    94.6%
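
As a quick sanity check on the table, the overall accuracy should be the average of the single-premise and multi-premise accuracies weighted by the problem counts from the set-up (44 and 30). The sketch below assumes exactly that relationship and recomputes the overall column from the reported percentages; small differences are expected because the inputs are already rounded.

```python
SINGLE, MULTI = 44, 30  # problem counts in FraCaS Section 1, from the set-up slide

reported = {            # system: (single %, multi %, overall %), copied from the table
    "CCG-Dist, Parser Syntax":   (70.5, 50.0, 62.2),
    "CCG-Dist, Gold Syntax":     (88.6, 80.0, 85.1),
    "TIFMO, Baseline":           (79.5, 86.7, 82.4),
    "TIFMO, Selection":          (90.9, 93.3, 91.9),
    "TIFMO, Relation":           (88.6, 93.3, 90.5),
    "TIFMO, Selection+Relation": (93.2, 96.7, 94.6),
}

def weighted_overall(single_pct: float, multi_pct: float) -> float:
    """Average of the two accuracies, weighted by the number of problems."""
    return (SINGLE * single_pct + MULTI * multi_pct) / (SINGLE + MULTI)

for name, (single, multi, overall) in reported.items():
    estimate = weighted_overall(single, multi)
    print(f"{name}: {estimate:.1f}% (reported {overall}%)")
```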

Conclusion
• Generalized Quantifiers are important (for RTE)
• We explored ways of encoding GQs in DCS for RTE
• via selection extension
• via relation extension (newly proposed)
• Significant improvement in performance, but not perfect
• which suggests moving towards more powerful logical systems
