@cite{beaver-2004}: The Optimization of Discourse Anaphora #
David I. Beaver (2004), "The Optimization of Discourse Anaphora," Linguistics and Philosophy 27(1): 3-56. © 2004 Kluwer. DOI 10.1023/B:LING.0000010796.76522.7a.
Beaver reformulates @cite{grosz-joshi-weinstein-1995} Centering (and the @cite{brennan-friedman-pollard-1987} algorithmic version) as an Optimality Theory system over six ranked constraints (his §3.2 p. 14):
AGREE > DISJOINT > PRO-TOP > FAM-DEF > COHERE > ALIGN
His main theorem (his §3.4 (20), proven Appendix A): given a sentence in which the only definite expressions are proper nouns and pronouns, if either COT or BFP uniquely predicts an interpretation involving fully anaphoric resolution, then both do, and they resolve anaphors identically.
@cite{poesio-stevenson-eugenio-hitzeman-2004} §3.1 fn 12 endorse this OT reformulation as the right framework for "unpacking" Centering's constraints into ranked-OT terms.
Design: deep reuse of Centering primitives #
Per the audit's mathlib-style recommendation, three of Beaver's six constraints are LITERAL RESTATEMENTS of existing Centering primitives (Beaver acknowledges this in his §3.2-3.3 prose):
| Beaver constraint | Beaver's source | Our definition |
|---|---|---|
| PRO-TOP | "essentially the effect of Centering's Rule 1" (his p. 15-16) | via Rule1Gordon from Centering/Rule1.lean (NOT Rule1GJW95) |
| COHERE | "the conditions used in BFP to specify transition types" (his p. 17) | via cb (topic stability test) |
| ALIGN | same | via cb + cp (topic in preferred-center position) |
| AGREE | binding theory (his p. 14) | NEW (substrate-gap-flagged: Realization lacks number/gender features) |
| DISJOINT | Principle B (his p. 15) | NEW (substrate-gap-flagged: Utterance lacks predicate-argument structure) |
| FAM-DEF | familiarity (Heim 1982; his p. 16) | NEW (substrate-gap-flagged: Realization lacks definiteness) |
The 3 reused constraints make Theorem (20)'s equivalence with BFP structural on those clauses (it follows from the constraints being literal restatements). The 3 novel constraints fire as user-supplied boolean flags on the candidate type — substrate gaps that future commits can fill (number/gender features; predicate-argument structure; definiteness marking).
This is the PMF-vs-Measure mathlib pattern applied: Centering and OT are sibling substrates with their own vocabularies; the bridge lives in this paper-anchored study file rather than as a substrate-level identity. Where Beaver explicitly equates his constraints with Centering primitives, we define them via those primitives (deep reuse). Where Beaver introduces independent content (AGREE/DISJOINT/ FAM-DEF), we keep them independent.
Scope #
This file mechanizes:
- The 6 COT constraints with the design above.
- Beaver's tableau examples (5b), (12c) — predicting BFP-equivalent resolutions.
- Theorem (20) witnesses on those examples (the COT-optimal candidate matches the BFP-survivor).
- Beaver §4.1's PRO-TOP demotion: a re-ranked constraint hierarchy (PRO-TOP below FAM-DEF) gives Beaver (2c) "Mary likes tennis. She plays Jim. He used to play doubles with Mary." the bound reading where BFP overpredicts. The PSDH §3.1 fn 12 critique mechanized.
Out of scope (Beaver's §5 extensions):
- Production / bidirectional grammar (his §5.1-5.3)
- Multi-sentence text optimization (his §5.4 with examples (24)/(25))
- Accented pronoun interpretation (his §6)
- Beaver's full Appendix A proof of Theorem (20) — we mechanize per-example witnesses, not the general schematic proof.
A COT candidate: a fully-resolved cur utterance plus precomputed
boolean flags for the three constraints whose substrates linglib
doesn't yet model (AGREE/DISJOINT/FAM-DEF).
Beaver's GEN function (his §3.1 p. 13) generates all candidate
pairs of (form, meaning); the constraints filter. For our
formalization, candidates are alternative Utterance E R values
for the same surface form (different pronoun resolutions), with
the 3 substrate-gap flags supplied by the example author.
The 3 flags correspond to constraints whose semantics requires
information the existing Realization E R type doesn't carry:
agreementOK: pronoun-antecedent number/gender matchargDisjointOK: co-arguments of a predicate refer to distinct entitiesfamDefOK: every definite NP has an antecedent (familiarity)
- utt : Discourse.Centering.Utterance E R
The resolved utterance — pronouns bound to specific entities.
- agreementOK : Bool
AGREE: pronoun-antecedent agreement OK? (Substrate gap: Realization E R lacks number/gender features.
Linglib/Features/{Number,Gender}.leandefine the feature enums; an[HasPhi E]typeclass lift would let us derive this from the entity type. Queued.) - argDisjointOK : Bool
DISJOINT: co-arguments of any predicate are distinct entities? (Substrate gap: Utterance E R lacks predicate-argument structure.
Linglib/Theories/Semantics/ArgumentStructure/Defs.leanprovidesThematicFramewhich could carry this info; integration queued.) - famDefOK : Bool
FAM-DEF: every definite NP is familiar (has antecedent in prev)? (Substrate gap: Realization E R lacks definiteness marking.
Linglib/Features/Definiteness.leandefines the enum; an[HasDefiniteness E]typeclass + a[HasGivenness E]typeclass (the latter not yet defined; deferred to a follow-up PR per the post-Krifka substrate redesign) would letfamDefOKbe derived fromFeatures.GivennessStatus. Queued.)
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Beaver2004.instReprCandidate = { reprPrec := Beaver2004.instReprCandidate.repr }
AGREE (Beaver §3.2 p. 14): "Anaphoric expressions agree with their antecedents in number and gender."
Substrate gap. Implementation tests the candidate's precomputed
agreementOK flag rather than computing agreement from features
on Realization.
Equations
- Beaver2004.agree = Core.Constraint.OT.mkMark "AGREE" fun (c : Beaver2004.Candidate E R) => ¬c.agreementOK = true
Instances For
DISJOINT (Beaver §3.2 p. 14, "mirroring Principle B from binding theory"): "Co-arguments of a predicate are disjoint."
Substrate gap. Tests the candidate's precomputed argDisjointOK
flag.
Equations
- Beaver2004.disjoint = Core.Constraint.OT.mkMark "DISJOINT" fun (c : Beaver2004.Candidate E R) => ¬c.argDisjointOK = true
Instances For
PRO-TOP (@cite{beaver-2004} §3.2, "essentially the effect of Centering's Rule 1"): "The topic is pronominalized."
Reused from Rule1Gordon (the unconditional Rule 1 variant
from Centering/Rule1.lean, after @cite{gordon-grosz-gilliom-1993}).
Beaver §3.2 explicitly REMOVES the if-clause: in OT, PRO-TOP
has no antecedent — the topic should be pronominalized, full
stop. Defeasibility is encoded by PRO-TOP's lower ranking
position (3rd of 6), not by a propositional if-clause.
Important: an EARLIER version of this file used Rule1GJW95
(the conditional Rule 1: "if any pronoun, then CB
pronominalized"). That was the WRONG reuse — Beaver §3.2 ("if
there are no pronouns, then all candidate interpretations will
be equally bad as far as PRO-TOP is concerned") explicitly
states all interpretations VIOLATE PRO-TOP when no pronoun
realizes the topic. With the conditional Rule1GJW95, all
pronoun-free interpretations would satisfy PRO-TOP — the
opposite of Beaver's intent. Rule1Gordon (which fires
unconditionally when CB exists and isn't pronominalized) is
the correct restatement.
Equations
- Beaver2004.proTop prev = Core.Constraint.OT.mkMark "PRO-TOP" fun (c : Beaver2004.Candidate E R) => ¬Discourse.Centering.Rule1Gordon prev c.utt
Instances For
FAM-DEF (Beaver §3.2 p. 15, "Heim 1982"): "Each definite NP is familiar."
Substrate gap. Tests famDefOK flag.
Equations
- Beaver2004.famDef = Core.Constraint.OT.mkMark "FAM-DEF" fun (c : Beaver2004.Candidate E R) => ¬c.famDefOK = true
Instances For
COHERE (@cite{beaver-2004} §3.2, p. 15 statement; commentary p. 17): "The topic of the current sentence is the topic of the previous one."
Reused from Centering's cb: the "topic" of the current
utterance is cb prev cand.utt; the prior topic (supplied as
Option E). Per Beaver p. 17 ("satisfied only if the topic is
defined and unchanged") plus p. 16 ("undefinedness of the topic
will result in a violation"), COHERE fires when:
cb prev c.utt = none(current topic undefined), OR- both topics defined but different. Only when both are defined and equal is COHERE satisfied.
Equations
- One or more equations did not get rendered due to their size.
Instances For
ALIGN (@cite{beaver-2004} §3.2, p. 15 statement; commentary p. 18): "The topic is in subject position."
Reused from Centering's cb and cp: ALIGN is satisfied
iff the topic (cb prev c.utt) coincides with the preferred
center (c.utt.cp) — i.e., the highest-ranked Cf is the topic.
This IS BFP's "smooth shift / continue" test.
Per Beaver p. 18: "ALIGN literally requires the topic to be subject, but for canonical English sentences this is equivalent to saying that the topic is the preferred center of the current sentence." Per Beaver p. 16 (general policy on undefined topic), undefined topic counts as a violation — same shape as COHERE.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The COT constraint hierarchy (Beaver §3.2 p. 14):
AGREE > DISJOINT > PRO-TOP > FAM-DEF > COHERE > ALIGN.
Equations
- Beaver2004.cotRanking prev priorTopic = [Beaver2004.agree, Beaver2004.disjoint, Beaver2004.proTop prev, Beaver2004.famDef, Beaver2004.cohere prev priorTopic, Beaver2004.align prev]
Instances For
These two theorems are the substrate-level structural backing for
the cross-framework finding worked out in
Phenomena/Reference/Studies/PoesioEtAl2004.lean §5.1: Beaver's
COT cannot distinguish two candidates whose cb agrees, and ALIGN
additionally consults cp. Per-example facts in §5.1 (the
decide-checked Nat comparisons on u227/u229) follow as
corollaries: the cb-and-cp values for cand_u229 reduce, and the
structural theorems say the constraint evaluations are determined
by those reductions.
Mathlib analogue: a function `f` "factors through `g`" when
`g x = g y → f x = f y`. Here `g = cb prev` (or `g = cb prev × cp`)
and `f = (cohere prev priorTopic).eval` (or `(align prev).eval`).
The theorems are the formal statement of `Beaver.cohere ∘ Candidate
= ψ ∘ cb prev` for some `ψ`, surfacing the cb-only dependence
Beaver's substrate makes invisible.
COHERE factors through cb. For fixed prev and priorTopic,
Beaver's COHERE constraint cannot distinguish two candidates whose
cb prev c.utt values agree. Pure unfold + rw proof: COHERE's
predicate references cb prev c.utt only; mkMark turns the
predicate into an if-then-else; equal cb's give equal evaluations.
The cross-framework consequence (worked out in
PoesioEtAl2004.lean §5.1): on PSDH (10), cb u227 u229 returns
a single member of the cbAll tie set; COHERE's verdict on any
candidate c with cb u227 c.utt = cb u227 u229 is therefore
fully determined by cb u227 u229 and priorTopic — Beaver's
COT cannot recover the discarded tie member from candidate-side
information.
ALIGN factors through cb AND cp. ALIGN's predicate
references both cb prev c.utt and c.utt.cp; equal cb's AND
equal cp's give equal evaluations.
The cross-framework consequence: ALIGN inherits the cb-blindness
structurally (via h_cb), but additionally requires cp-equality
(via h_cp). Two candidates that agree on cb but disagree on cp
can produce different ALIGN verdicts; this is the formal opening
by which a positional totalizer (Strube-Hahn linear order) could
differ from Beaver's lex-min — Strube-Hahn changes which
realization wins the cf-sort tiebreaker, hence which is cp,
hence ALIGN's verdict on candidates that share cb.
Beaver §3.3 (12) (paper p. 23):
> (12) a. Jane is happy.
> b. She was congratulated by Freda,
> c. and Mary gave her a present.
BFP classifies (12c) as RETAIN (Jane maintained as topic,
realized in object position not subject). COT predicts the same
resolution: "her" → Jane (l=i in Beaver's notation). The competing
resolution "her" → Freda (l=j) is disfavored by COHERE.
Equations
Instances For
(12a) "Jane is happy."
Equations
- Beaver2004.D12.a = { realizations := [{ entity := "Jane", role := Discourse.Centering.GrammaticalRole.subject, isPronoun := false }] }
Instances For
(12b) "She was congratulated by Freda." She=Jane (resolved).
Equations
- One or more equations did not get rendered due to their size.
Instances For
(12c) Candidate where "her"=Jane (l=i): the BFP-survivor / COT prediction. The resolved utterance: subject Mary (full name), object her=Jane (pronoun), other "present" (full).
Equations
- One or more equations did not get rendered due to their size.
Instances For
(12c) Candidate where "her"=Freda (l=j): non-survivor under both BFP and COT.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The two candidates wrapped with substrate-gap flags. AGREE, DISJOINT, FAM-DEF all OK in both (Mary, Freda, present, Jane all distinct; "her" is pronoun matching either Jane or Freda in number/gender; no definite descriptions).
Equations
- Beaver2004.D12.cand_l_eq_i = { utt := Beaver2004.D12.c_l_eq_i, agreementOK := true, argDisjointOK := true, famDefOK := true }
Instances For
Equations
- Beaver2004.D12.cand_l_eq_j = { utt := Beaver2004.D12.c_l_eq_j, agreementOK := true, argDisjointOK := true, famDefOK := true }
Instances For
The prior topic, computed from (12a) → (12b): Jane.
Instances For
The prior topic for (12) is some "Jane".
Beaver tableau (13) line 1 (winner, l=i): only ALIGN violated. COHERE not violated because the topic (cb b c_l_eq_i = Jane) matches the prior topic (cb a b = Jane).
Beaver tableau (13) line 2 (loser, l=j): COHERE and ALIGN both violated (topic shifts from Jane to Freda; topic not in subject position).
Beaver Theorem (20) witness on example (12): the COT-optimal candidate (l=i) matches BFP's prediction (RETAIN with Jane as topic). The Beaver-side optimization picks l=i because it has fewer COHERE violations (the higher-ranked constraint among the two that distinguish the candidates).
Beaver §2.4 (2) and §4.1 p. 26 (paper pp. 10, 26):
> (2) a. Mary likes tennis.
> b. She plays Jim quite often.
> c. He used to play doubles with Mary.
BFP predicts: in (2c), since the topic (Mary) is not pronominalized
despite "He" being a pronoun, **the bound reading where "Mary" in
(2c) refers back to (2a)'s Mary is filtered out** (Rule 1 violation).
The resulting forced reading involves "two different people called
Mary" (Beaver p. 11) — empirically wrong; the bound reading is
available though awkward.
Beaver §4.1 p. 26: demoting PRO-TOP below FAM-DEF (re-ranking
`AGREE > DISJOINT > FAM-DEF > PRO-TOP > COHERE > ALIGN`)
correctly predicts the bound reading. Under the demoted ranking,
the FAM-DEF constraint (penalizing introduction of a "second
Mary") outranks PRO-TOP (penalizing the unpronominalized topic),
so the candidate where Mary in (2c) IS the prior Mary wins.
This is the @cite{poesio-stevenson-eugenio-hitzeman-2004} §3.1 fn
12 critique mechanized: "in BFP Rule 1 is effectively used as a
hard constraint, a problem fixed by Beaver's own optimality-
theoretic reformulation of the algorithm."
Equations
Instances For
(2a) "Mary likes tennis."
Equations
- One or more equations did not get rendered due to their size.
Instances For
(2b) "She plays Jim quite often." She=Mary (resolved).
Equations
- One or more equations did not get rendered due to their size.
Instances For
(2c) Bound reading (Mary in (2c) = Mary in (2a)): "He=Jim used to play doubles with Mary=Mary_a." Subject = Jim (pronoun "He"), object = Mary (full name, but ANAPHORIC to prior Mary — so famDefOK = true, FAM-DEF satisfied).
Equations
- One or more equations did not get rendered due to their size.
Instances For
(2c) Two-Marys reading (BFP's predicted resolution): "He=Jim
used to play doubles with Mary'≠Mary_a." Same surface; the
object "Mary" introduces a NEW entity ≠ the prior Mary. We
encode the new entity as a distinct string "Mary_new". Under
this resolution, FAM-DEF FIRES (Mary_new is a definite that has
no familiar antecedent).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Bound-reading candidate. AGREE/DISJOINT OK; famDefOK=true (Mary is anaphoric to the (2a) Mary).
Equations
- Beaver2004.D2.cand_bound = { utt := Beaver2004.D2.c_bound, agreementOK := true, argDisjointOK := true, famDefOK := true }
Instances For
Two-Marys candidate. AGREE/DISJOINT OK; famDefOK=FALSE (the second Mary is a brand-new definite without antecedent — FAM-DEF fires).
Equations
- Beaver2004.D2.cand_two_marys = { utt := Beaver2004.D2.c_two_marys, agreementOK := true, argDisjointOK := true, famDefOK := false }
Instances For
The prior topic for (2): Mary, the topic of (2b) given (2a).
Instances For
Bound-reading candidate violates PRO-TOP (the topic Mary is not pronominalized — "Mary" is a full name, not "her"). Under Beaver's CANONICAL ranking with PRO-TOP > FAM-DEF, this filters out the bound reading.
Two-Marys candidate violates FAM-DEF (the second "Mary" is a definite NP without familiar antecedent).
Under canonical COT ranking (PRO-TOP > FAM-DEF), the two-Marys reading wins (matches BFP). PRO-TOP violation is more costly.
Beaver §4.1 demoted ranking (his p. 26): AGREE > DISJOINT > FAM-DEF > PRO-TOP > COHERE > ALIGN. PRO-TOP demoted below
FAM-DEF.
Equations
- Beaver2004.cotRankingDemoted prev priorTopic = [Beaver2004.agree, Beaver2004.disjoint, Beaver2004.famDef, Beaver2004.proTop prev, Beaver2004.cohere prev priorTopic, Beaver2004.align prev]
Instances For
Beaver §4.1 prediction: under the demoted ranking, the bound reading wins because FAM-DEF (now ranked higher than PRO-TOP) eliminates the two-Marys candidate. The bound reading's PRO-TOP violation is now LESS costly than the two-Marys candidate's FAM-DEF violation. The PSDH §3.1 fn 12 critique mechanized: Beaver's reformulation FIXES BFP's Rule-1-as-hard-constraint overprediction by allowing Rule 1 to be defeated by higher-ranked constraints.
Beaver's headline theorem (his §3.4 (20), p. 26):
> Given a sentence in which the only definite expressions are
> proper nouns and pronouns, if either COT (with the constraints
> and constraint ranking above) or BFP uniquely predicts an
> interpretation involving fully anaphoric interpretation of all
> definites, then both do, and in this case they resolve anaphors
> identically.
With our **deep-reuse design** (PRO-TOP via Rule1GJW95, COHERE via
cb, ALIGN via cb+cp), this theorem is partly STRUCTURAL:
- The COT-side `proTop` constraint IS BFP's Rule 1 by definition.
- The COT-side `cohere` constraint IS BFP's "Cb stable" test.
- The COT-side `align` constraint IS BFP's "Cb = Cp" test.
- AGREE / DISJOINT / FAM-DEF are the BFP-Filter conditions
(Beaver §2.2 p. 8: BFP's Filter 1 = Rule 1 = our PRO-TOP;
Filter 3 = argument coreference = our DISJOINT). FAM-DEF is
Beaver's addition (BFP doesn't have it, since BFP doesn't
handle definite descriptions).
The structural correspondence makes the COT-vs-BFP equivalence
mechanically apparent for the 3 reused constraints (PRO-TOP / COHERE
/ ALIGN are LITERALLY defined via `Rule1Gordon` / `cb` / `cb`+`cp`,
so any "iff" theorem connecting them to the substrate is a
one-line `unfold` — i.e., the canonical naCanBridge anti-pattern
CLAUDE.md forbids). We previously had `proTop_iff_rule1Gordon_fails`,
`cohere_iff_cb_diverges`, `align_iff_cb_neq_cp` theorems here;
they were correctly flagged by audit as encoding-conclusions-as-
definitions. The 3 novel constraints don't change BFP's predictions
on Beaver's worked examples (verified by `decide` per-example).
Per audit's mathlib-discipline finding (the file defined 6
constraints + per-example violation theorems but never invoked
the ViolationProfile lex-ordering machinery), compute the full
ViolationProfile 6 for Beaver (12)'s two candidates under the
canonical COT ranking and verify the lex-ordering picks l=i.
A full `Tableau` construction would also work (and is what
mathlib-style would prefer), but currently `Utterance E R`
lacks `DecidableEq` in the substrate (deriving only `Repr`),
so `Finset (Candidate E R)` synthesis is blocked. The
`ViolationProfile` comparison gives the same kernel-checked OT
witness without requiring `Finset` machinery. Adding
`DecidableEq` to the `Realization` / `Utterance` structures'
deriving block (substrate change) would unblock the full
`Tableau.optimal = {cand_l_eq_i}` form; queued for follow-up.
Beaver tableau (13) line 1: ViolationProfile of cand_l_eq_i
under the canonical COT ranking.
Equations
Instances For
OT lex-min witness on Beaver (12): under the canonical COT
ranking, cand_l_eq_i's violation profile is strictly less than
cand_l_eq_j's in ViolationProfile's lex order — i.e., the
OT optimization mechanism (lex-min on Nat-vector profiles)
picks cand_l_eq_i as the unique optimal candidate. This is
the kernel-checked OT-mechanism witness, exercising the
Core.Constraint.OT substrate's mkProfile /
ViolationProfile.LinearOrder infrastructure.
Per audit's cross-framework finding, three sibling Centering formalizations exist in linglib that Beaver should engage:
- **`Sidner1983.lean`**: two-focus account. Beaver §4.3 explicitly
proposes SALIENT EMPATHY (his p. 30) which is structurally
Sidner's actor focus. Out-of-scope for this commit (Beaver §4.3
cross-linguistic constraints not formalized), but flagged.
- **`Centering/Rule2.lean::Rule2Strube1998Cheap`** (cheap iff
`cb prev cur = priorCp`). Beaver's ALIGN at the *previous*
utterance position structurally encodes the same predicate —
see hidden-agreement theorem below.
- **`KehlerRohde2013.lean`**: Bayesian decomposition of pronoun
resolution. Chronology: Beaver < KR2013, so the bridge belongs
in `KehlerRohde2013.lean`, not here. Future-work.
Hidden agreement: ALIGN ≈ Strube 1998 cheap (structurally).
Strube's "cheap" predicate (Rule2Strube1998Preferred) tests
cb prev cur = priorCp ∧ cb is some — the previous-Cp predicts
the current-CB. ALIGN at the current utterance tests the
analogous within-utterance condition: cb prev cur = current-cp.
Their structural identity: ALIGN says "topic = current Cp"; cheap says "topic = previous Cp." Same shape, different time index. Not literally equal — the "previous Cp" version is Strube's specific contribution. The shapes match; the witness below shows ALIGN is satisfied iff the (constructed) within- utterance cheap-condition holds.