Centering Theory: Rule 1 (Pronominalization Constraint) and Variants
@cite{grosz-joshi-weinstein-1995} @cite{gordon-grosz-gilliom-1993} @cite{poesio-stevenson-eugenio-hitzeman-2004}
@cite{poesio-stevenson-eugenio-hitzeman-2004} §2.3.2 (pp. 314-315) distinguish three versions of Rule 1 from the Centering literature, each making a different empirical claim about when pronominalization is required:
- Rule 1 (GJW 95): if any CF in Cf(U_{i-1}) is pronominalized in U_i, then Cb(U_i) is pronominalized too. The standard centering formulation (@cite{grosz-joshi-weinstein-1995} §3). Conditional on some pronominalization occurring at all.
- Rule 1 (GJW 83): if Cb(U_i) = Cb(U_{i-1}), a pronoun should be used. The earlier Grosz-Joshi-Weinstein 1983 formulation, conditioning on CB stability across utterances rather than on the presence of any pronominalization.
- Rule 1 (Gordon): the CB should always be pronominalized. @cite{gordon-grosz-gilliom-1993}'s reading-time experiments motivate this stronger claim: they observe a "repeated-name penalty" (RNP) when an entity that should be CB is realized as a proper name instead of a pronoun. Unconditional: it applies whenever a CB exists, regardless of whether anything else in U_i is pronominalized.
PSDH's empirical evaluation (Table 8, p. 333) finds:
- Rule 1 (GJW 95): 96.7% of utterances satisfy it under the vanilla instantiation
- Rule 1 (GJW 83): 81% satisfy it
- Rule 1 (Gordon): only 44.5% satisfy it — far too strong as an unconditional claim
Following mathlib's "variants as separate Props with implications"
pattern (cf. Convex / StrictConvex / MidpointConvex), each
variant is its own predicate with a Decidable instance, plus
implication theorems where they hold. We deliberately do not use a
Rule1Variant enum with a parameterized def, because that would
obscure the distinct conceptual content of each version.
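The contrast between the two designs can be sketched as follows. This is a minimal illustration, not the file's actual code: the primitives `cb`, `pron`, and `anyCfPron`, the namespace, and the short predicate names are all hypothetical stand-ins, and Decidable instances are omitted.

```lean
namespace Rule1PatternSketch

-- Hypothetical primitives, taken as parameters so the sketch is self-contained.
variable {U E : Type}
variable (cb : U → U → Option E)      -- Cb of `cur` relative to `prev`
variable (pron : U → E → Prop)        -- `cur` realizes the entity as a pronoun
variable (anyCfPron : U → U → Prop)   -- some element of Cf(prev) is a pronoun in `cur`

/-- Adopted pattern: one predicate per variant, each with its own signature. -/
def Gordon (prev cur : U) : Prop :=
  match cb prev cur with
  | none => True
  | some c => pron cur c

def GJW95 (prev cur : U) : Prop :=
  match cb prev cur with
  | none => True
  | some c => anyCfPron prev cur → pron cur c

/-- Only this variant needs the previous Cb as an extra argument. -/
def GJW83 (prevCb : Option E) (prev cur : U) : Prop :=
  match cb prev cur with
  | none => True
  | some c => prevCb = some c → pron cur c

/-- Avoided alternative: an enum plus one parameterized definition. -/
inductive Variant
  | gjw95
  | gjw83
  | gordon

def rule1 (v : Variant) (prevCb : Option E) (prev cur : U) : Prop :=
  match v with
  | .gjw95  => GJW95 cb pron anyCfPron prev cur
  | .gjw83  => GJW83 cb pron prevCb prev cur
  | .gordon => Gordon cb pron prev cur

end Rule1PatternSketch
```

In this sketch the enum version forces every caller to supply `prevCb`, even for the two variants that ignore it, and relations between variants have to be stated about `rule1 .gordon` and `rule1 .gjw95` rather than about named predicates.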
This file extracts the Rule 1 content from the older Rules.lean
(which was a single file mixing Rule 1 and Rule 2). Rule 2 variants
live in the sibling Rule2.lean.
@cite{grosz-joshi-weinstein-1995} Rule 1 (the standard /
"GJW 95" version per @cite{poesio-stevenson-eugenio-hitzeman-2004}
§2.3.2): if any element of Cf(U_{i-1}) is realized by a pronoun
in U_i, then Cb(U_i) is realized by a pronoun also.
Vacuously satisfied when no Cb exists. The constraint is conditional — it says nothing about whether the Cb must be pronominalized when no other entity is. Rule 1 only constrains what happens when pronominalization is used at all.
PSDH find this version is robust: ≤8% of utterances violate it across all parameter instantiations they test (Table 8, p. 333).
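To make the conditional reading concrete, here is a toy Bool-valued checker. It is a sketch under stated assumptions: entities are `Nat` ids, the Cb is supplied by the caller instead of being computed, and `Utt`, `rule1GJW95`, and the sample utterances are hypothetical names that do not appear in the file.

```lean
/-- Toy utterance; entities are `Nat` ids (an assumption of this sketch). -/
structure Utt where
  cf       : List Nat   -- forward-looking centers Cf(U)
  pronouns : List Nat   -- entities realized by a pronoun in U

/-- Bool-valued sketch of Rule 1 (GJW 95); `cb?` is Cb(cur), given by the caller. -/
def rule1GJW95 (prev cur : Utt) (cb? : Option Nat) : Bool :=
  match cb? with
  | none   => true   -- no Cb: vacuously satisfied
  | some c =>
    -- Is any element of Cf(prev) realized by a pronoun in `cur`?
    let somePron := prev.cf.any (fun e => cur.pronouns.contains e)
    (!somePron) || cur.pronouns.contains c

def prevU : Utt := { cf := [1, 2], pronouns := [] }

-- Entity 2 is a pronoun but the Cb (entity 1) is not: violation.
#eval rule1GJW95 prevU { cf := [1, 2], pronouns := [2] } (some 1)     -- false
-- Both are pronouns: satisfied.
#eval rule1GJW95 prevU { cf := [1, 2], pronouns := [1, 2] } (some 1)  -- true
-- No pronouns at all: the antecedent fails, so the rule is satisfied.
#eval rule1GJW95 prevU { cf := [1, 2], pronouns := [] } (some 1)      -- true
-- No Cb: satisfied.
#eval rule1GJW95 prevU { cf := [1, 2], pronouns := [2] } none         -- true
```

The third case is the one that separates this version from the Gordon version: a Cb realized as a full noun phrase is acceptable here as long as nothing from Cf(U_{i-1}) is pronominalized.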
@cite{grosz-joshi-weinstein-1995} report this earlier formulation, the "GJW 83" version in the terminology of @cite{poesio-stevenson-eugenio-hitzeman-2004} §2.3.2 (p. 314): "If the CB of the current utterance is the same as the CB of the previous utterance, a pronoun should be used."
Conditional on CB stability across the utterance pair. The
prevCb parameter is the CB of prev (the "previous" CB, which
requires a further-back utterance to compute via the standard
cb prevPrev prev); we take it as an explicit Option E
parameter rather than recomputing it, mirroring how
classifyTransitionExtended takes prevCb explicitly.
PSDH Table 8 p. 333: 81% of vanilla-instantiation utterances satisfy this version.
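Since the actual equation is not rendered on this page, the following is only a plausible shape for the predicate, written to show how the explicit `prevCb` argument is used. The namespace, `Rule1GJW83Sketch`, `atTriple`, and the function parameters `cb` and `pron` are hypothetical.

```lean
namespace GJW83Sketch

variable {U E : Type}

/-- Sketch of Rule 1 (GJW 83): if the current Cb equals the previous Cb,
    it must be pronominalized. `prevCb` is passed in, not recomputed. -/
def Rule1GJW83Sketch (cb : U → U → Option E) (pron : U → E → Prop)
    (prev cur : U) (prevCb : Option E) : Prop :=
  match cb prev cur with
  | none => True
  | some c => prevCb = some c → pron cur c

/-- A caller with three consecutive utterances can supply `cb prevPrev prev`. -/
def atTriple (cb : U → U → Option E) (pron : U → E → Prop)
    (prevPrev prev cur : U) : Prop :=
  Rule1GJW83Sketch cb pron prev cur (cb prevPrev prev)

/-- With no previous Cb, the stability condition can never hold,
    so the sketch predicate is vacuously satisfied. -/
example (cb : U → U → Option E) (pron : U → E → Prop) (prev cur : U) :
    Rule1GJW83Sketch cb pron prev cur none := by
  unfold Rule1GJW83Sketch
  cases cb prev cur with
  | none => exact True.intro
  | some c => intro h; cases h   -- `none = some c` is impossible

end GJW83Sketch
```

Taking `prevCb` as an argument keeps the predicate a function of an utterance pair, at the cost of trusting the caller to pass the right value.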
@cite{gordon-grosz-gilliom-1993} formulation per
@cite{poesio-stevenson-eugenio-hitzeman-2004} §2.3.2 (p. 315):
"The CB should be pronominalized." Unconditional on the
presence of pronominalization in U_i.
Motivated by the repeated-name penalty (RNP): @cite{gordon-grosz-gilliom-1993}'s reading-time experiments found increased reading times when proper names were used to realize entities that should have been the CB. Gordon et al. take this as evidence that the CB should be a pronoun, not just may be one when other entities are.
PSDH find this version is much too strong: only 44.5% of vanilla-instantiation utterances satisfy it (Table 8 p. 333) — by far the most-violated of the three Rule 1 variants.
Equations
- Discourse.Centering.Rule1Gordon prev cur = match Discourse.Centering.cb prev cur with | none => True | some curCb => Discourse.Centering.pronominalizes cur curCb
Gordon ⇒ GJW 95 (unconditional). If the CB is always pronominalized (Gordon), then certainly whenever some CF is pronominalized, the CB is too (GJW 95). The implication is one-directional: GJW 95 has cases not covered by Gordon (specifically, the "no pronouns at all" case where GJW 95 is vacuously satisfied but Gordon is violated).
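The implication and its failure in the other direction can be checked on a small abstract model. This is a sketch, not the file's proof: `Model`, its fields, the GJW 95 definition, and `noPronouns` are hypothetical, and only the Gordon definition mirrors the rendered equation.

```lean
namespace Rule1ImplicationSketch

/-- Hypothetical bundle of the primitives the rules depend on. -/
structure Model (U E : Type) where
  cb : U → U → Option E                -- Cb of `cur` relative to `prev`
  pronominalizes : U → E → Prop        -- `cur` realizes the entity as a pronoun
  anyCfPronominalized : U → U → Prop   -- some element of Cf(prev) is a pronoun in `cur`

variable {U E : Type}

def Rule1Gordon (m : Model U E) (prev cur : U) : Prop :=
  match m.cb prev cur with
  | none => True
  | some c => m.pronominalizes cur c

def Rule1GJW95 (m : Model U E) (prev cur : U) : Prop :=
  match m.cb prev cur with
  | none => True
  | some c => m.anyCfPronominalized prev cur → m.pronominalizes cur c

/-- Gordon ⇒ GJW 95: the antecedent of GJW 95 is simply ignored. -/
theorem gordon_imp_gjw95 (m : Model U E) (prev cur : U) :
    Rule1Gordon m prev cur → Rule1GJW95 m prev cur := by
  unfold Rule1Gordon Rule1GJW95
  cases m.cb prev cur with
  | none => intro _; exact True.intro
  | some c => intro h _; exact h

/-- Counter-model for the converse: a Cb exists but nothing is pronominalized. -/
def noPronouns : Model Unit Unit where
  cb _ _ := some ()
  pronominalizes _ _ := False
  anyCfPronominalized _ _ := False

-- GJW 95 holds vacuously, because its antecedent is false.
example : Rule1GJW95 noPronouns () () := by
  intro h
  exact False.elim h

-- Gordon fails, because the Cb is not pronominalized.
example : ¬ Rule1Gordon noPronouns () () := by
  intro h
  exact h

end Rule1ImplicationSketch
```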
Corollary: the empirical Gordon-violation rate is an upper bound on the GJW 95 violation rate — consistent with PSDH Table 8 (44.5% Gordon-satisfying ≤ 96.7% GJW 95-satisfying, equivalently 55.5% Gordon-violating ≥ 3.3% GJW 95-violating).
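The arithmetic behind the corollary can be checked mechanically. The sketch below encodes the Table 8 percentages quoted above in tenths of a percent as natural numbers; the encoding is a choice made for this illustration.

```lean
-- Gordon: 44.5% satisfying, hence 55.5% violating.
example : 1000 - 445 = 555 := by decide
-- GJW 95: 96.7% satisfying, hence 3.3% violating.
example : 1000 - 967 = 33 := by decide
-- Satisfaction and violation rates are ordered as the implication requires.
example : 445 ≤ 967 ∧ 33 ≤ 555 := by decide
```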