
Linglib.Phenomena.Reference.Studies.KehlerRohde2013

@cite{kehler-rohde-2013}

@cite{hobbs-1979} @cite{kehler-2002}

A Probabilistic Reconciliation of Coherence-Driven and Centering-Driven Theories of Pronoun Interpretation. Theoretical Linguistics 39(1-2), 1–37.

Core Argument

Two theories make seemingly irreconcilable claims about pronoun interpretation. On the coherence-driven account (@cite{hobbs-1979}), pronoun interpretation is a by-product of coherence establishment, and grammatical form is irrelevant. On the Centering account (Grosz, Joshi & Weinstein 1995), it is driven by information structure and grammatical roles, and world knowledge is irrelevant.

The reconciliation is a Bayesian decomposition (eq. 13):

P(referent | pronoun) ∝ P(pronoun | referent) × P(referent)

The two terms have different conditioning:

P(referent): coherence-driven next-mention bias, computed via eq. (9): P(referent) = Σ_CR P(CR) × P(referent | CR)
P(pronoun | referent): production/form bias, driven by topichood (centering's contribution)
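
Written out with its normalizing constant, the decomposition can be read as follows. This is only Bayes' rule over the set of candidate referents combined with eq. (9); the symbols r, r' and CR are shorthand introduced here and are not notation taken from the paper.

```latex
P(r \mid \text{pronoun})
  = \frac{P(\text{pronoun} \mid r)\, P(r)}
         {\sum_{r'} P(\text{pronoun} \mid r')\, P(r')},
\qquad
P(r) = \sum_{\mathrm{CR}} P(\mathrm{CR})\, P(r \mid \mathrm{CR})
```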

Five experiments with transfer-of-possession verbs and IC verbs confirm that these two components are empirically dissociable.

Key Findings

#	Finding	Section
1	Imperfective → more Source interpretations than perfective	§3
2	Coherence relations strongly condition next-mention bias	§4
3	Shifting P(CR) via instructions shifts interpretation	§5
4	P(referent | CR) stable across conditions	§6
5	Pronoun prompt shifts CR distribution bidirectionally	§7
6	Voice affects next-mention but not pronominalization per position	§8
7	Passive subject → more pronominalization than active subject	§8
8	Bayesian predictions match actual interpretation biases	§8
9	Contiguity class splits: Occasion → Goal, Elaboration → Source	§9

Independence Hypothesis

P(pronoun | referent) is conditioned by topichood/subjecthood, while P(referent) is conditioned by coherence relations. These two components are independent: coherence-driven semantic biases affect next-mention but NOT pronominalization rate.

inductive KehlerRohde2013.PromptType :

Prompt type in passage completion experiments.

pronoun : PromptType
noPronoun : PromptType

Instances For

@[implicit_reducible]

instance KehlerRohde2013.instDecidableEqPromptType :

DecidableEq PromptType

Equations

KehlerRohde2013.instDecidableEqPromptType x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

@[implicit_reducible]

instance KehlerRohde2013.instReprPromptType :

Repr PromptType

Equations

KehlerRohde2013.instReprPromptType = { reprPrec := KehlerRohde2013.instReprPromptType.repr }

def KehlerRohde2013.instReprPromptType.repr :

PromptType → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.
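
As a rough guide to where the two instances above come from, here is a minimal sketch that assumes the source declaration uses a `deriving` clause. The actual source is not shown on this page, so the declaration below is a reconstruction outside the `KehlerRohde2013` namespace, not the file's text.

```lean
-- Assumed shape of the declaration; constructor names are taken from this page.
inductive PromptType where
  | pronoun
  | noPronoun
  deriving DecidableEq, Repr

-- The derived DecidableEq instance lets equality be decided by evaluation.
#eval decide (PromptType.pronoun = PromptType.noPronoun)  -- false
#eval decide (PromptType.pronoun = PromptType.pronoun)    -- true
```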

Instances For

inductive KehlerRohde2013.InstructionCond :

Instruction condition (transfer-of-possession exps).

whatNext : InstructionCond
why : InstructionCond

Instances For

@[implicit_reducible]

instance KehlerRohde2013.instDecidableEqInstructionCond :

DecidableEq InstructionCond

Equations

KehlerRohde2013.instDecidableEqInstructionCond x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

@[implicit_reducible]

instance KehlerRohde2013.instReprInstructionCond :

Repr InstructionCond

Equations

KehlerRohde2013.instReprInstructionCond = { reprPrec := KehlerRohde2013.instReprInstructionCond.repr }

def KehlerRohde2013.instReprInstructionCond.repr :

InstructionCond → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

structure KehlerRohde2013.NextMentionModel :

Eq. (9): coherence-marginalized next-mention bias.

P(referent) = Σ_CR P(CR) × P(referent | CR)

The prior probability of a referent being mentioned next is a mixture of CR-specific biases weighted by the prior over coherence relations. This is the coherence-driven "top-down" component.

pCR : Core.Discourse.Coherence.CoherenceRelation → ℕ
P(CR): prior probability of coherence relation (%)
pSourceGivenCR : Core.Discourse.Coherence.CoherenceRelation → ℕ
P(referent = Source | CR): Source bias given CR (%)

Instances For

inductive KehlerRohde2013.TopichoodLevel :

Topichood level, determined by grammatical construction.

Passive subjects signal stronger topichood than active subjects: using a marked construction to place an entity in subject position is a stronger indicator that the speaker treats it as the sentence topic (@cite{davison-1984}). This is the centering-driven "bottom-up" component of the model.

The P(pronoun | referent) term in eq. (13) tracks this level, not grammatical role per se.

strong : TopichoodLevel
default_ : TopichoodLevel
low : TopichoodLevel

Instances For

@[implicit_reducible]

instance KehlerRohde2013.instDecidableEqTopichoodLevel :

DecidableEq TopichoodLevel

Equations

KehlerRohde2013.instDecidableEqTopichoodLevel x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def KehlerRohde2013.instReprTopichoodLevel.repr :

TopichoodLevel → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

@[implicit_reducible]

instance KehlerRohde2013.instReprTopichoodLevel :

Repr TopichoodLevel

Equations

KehlerRohde2013.instReprTopichoodLevel = { reprPrec := KehlerRohde2013.instReprTopichoodLevel.repr }

def KehlerRohde2013.topichood (voice : UD.Voice) (isSubject : Bool) :

TopichoodLevel

Compute topichood from voice and surface position.

Equations

KehlerRohde2013.topichood voice false = KehlerRohde2013.TopichoodLevel.low
KehlerRohde2013.topichood UD.Voice.Pass true = KehlerRohde2013.TopichoodLevel.strong
KehlerRohde2013.topichood voice true = KehlerRohde2013.TopichoodLevel.default_
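
The three equations above correspond to a pattern match along the following lines. This is a sketch under stated assumptions: `Voice` is a hypothetical two-constructor stand-in for `UD.Voice` (only `Act` and `Pass` appear on this page), and the match order is inferred from the rendered equations.

```lean
-- Hypothetical stand-in for UD.Voice.
inductive Voice where
  | Act
  | Pass
  deriving DecidableEq, Repr

inductive TopichoodLevel where
  | strong
  | default_
  | low
  deriving DecidableEq, Repr

-- Non-subjects are always low; among subjects, only the passive is strong.
def topichood (voice : Voice) (isSubject : Bool) : TopichoodLevel :=
  match voice, isSubject with
  | _, false => TopichoodLevel.low
  | Voice.Pass, true => TopichoodLevel.strong
  | _, true => TopichoodLevel.default_

example : topichood Voice.Pass true = TopichoodLevel.strong := rfl
example : topichood Voice.Act true = TopichoodLevel.default_ := rfl
example : topichood Voice.Pass false = TopichoodLevel.low := rfl
```

Because the first case ignores voice, a passive by-phrase referent and an active object end up at the same level.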

Instances For

def KehlerRohde2013.sourceInterp_perfective :

ℕ

Table 1: Source interpretation rate by aspect. Imperfective focuses on ongoing event (Source still central); perfective focuses on end state (Goal = endpoint of transfer).

Equations

KehlerRohde2013.sourceInterp_perfective = 57

Instances For

def KehlerRohde2013.sourceInterp_imperfective :

ℕ

Equations

KehlerRohde2013.sourceInterp_imperfective = 80

Instances For

theorem KehlerRohde2013.imperfective_more_source :

sourceInterp_imperfective > sourceInterp_perfective

Imperfective yields more Source interpretations than perfective.

structure KehlerRohde2013.CRDatum :

Coherence relation frequency and bias data from Table 2 (perfective condition, transfer-of-possession verbs). "Violated Expectation" in the paper = CoherenceRelation.contrast.

cr : Core.Discourse.Coherence.CoherenceRelation
freqPct : ℕ
sourceGivenCR : ℕ

Instances For

@[implicit_reducible]

instance KehlerRohde2013.instReprCRDatum :

Repr CRDatum

Equations

KehlerRohde2013.instReprCRDatum = { reprPrec := KehlerRohde2013.instReprCRDatum.repr }

def KehlerRohde2013.instReprCRDatum.repr :

CRDatum → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

def KehlerRohde2013.cr_occasion :

CRDatum

Equations

KehlerRohde2013.cr_occasion = { cr := Core.Discourse.Coherence.CoherenceRelation.occasion, freqPct := 38, sourceGivenCR := 18 }

Instances For

def KehlerRohde2013.cr_elaboration :

CRDatum

Equations

KehlerRohde2013.cr_elaboration = { cr := Core.Discourse.Coherence.CoherenceRelation.elaboration, freqPct := 28, sourceGivenCR := 98 }

Instances For

def KehlerRohde2013.cr_explanation :

CRDatum

Equations

KehlerRohde2013.cr_explanation = { cr := Core.Discourse.Coherence.CoherenceRelation.explanation, freqPct := 18, sourceGivenCR := 80 }

Instances For

def KehlerRohde2013.cr_violatedExp :

CRDatum

Equations

KehlerRohde2013.cr_violatedExp = { cr := Core.Discourse.Coherence.CoherenceRelation.contrast, freqPct := 8, sourceGivenCR := 76 }

Instances For

def KehlerRohde2013.cr_result :

CRDatum

Equations

KehlerRohde2013.cr_result = { cr := Core.Discourse.Coherence.CoherenceRelation.result, freqPct := 6, sourceGivenCR := 8 }

Instances For

theorem KehlerRohde2013.goal_biased_crs :

cr_occasion.sourceGivenCR < 50 ∧ cr_result.sourceGivenCR < 50

Occasion and Result are Goal-biased (Source < 50%).

theorem KehlerRohde2013.source_biased_crs :

cr_elaboration.sourceGivenCR > 50 ∧ cr_explanation.sourceGivenCR > 50 ∧ cr_violatedExp.sourceGivenCR > 50

Elaboration, Explanation, and Violated Expectation are Source-biased.

theorem KehlerRohde2013.biases_masked_by_mixture :

cr_occasion.sourceGivenCR < 50 ∧ cr_elaboration.sourceGivenCR > 50 ∧ cr_occasion.freqPct > cr_elaboration.freqPct

The overall ~57/43 Source/Goal split masks strong CR-conditioned biases. Occasion is most common (.38) and Goal-biased (.18 Source); Elaboration is second (.28) and strongly Source-biased (.98).
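
To see how the per-relation biases combine into the overall split, the eq. (9) mixture can be evaluated directly on the five Table 2 rows defined above. This is a sketch for illustration: the helper names `table2`, `mixture`, and `coverage` are introduced here and are not part of the library.

```lean
-- Each row is (P(CR) in %, P(Source | CR) in %), copied from the definitions above.
def table2 : List (Nat × Nat) :=
  [(38, 18),  -- Occasion
   (28, 98),  -- Elaboration
   (18, 80),  -- Explanation
   (8, 76),   -- Violated Expectation
   (6, 8)]    -- Result

-- Sum of P(CR) × P(Source | CR), in units of 1/10000.
def mixture (rows : List (Nat × Nat)) : Nat :=
  rows.foldl (fun acc row => acc + row.1 * row.2) 0

-- Total probability mass covered by the listed relations, in %.
def coverage (rows : List (Nat × Nat)) : Nat :=
  rows.foldl (fun acc row => acc + row.1) 0

#eval mixture table2   -- 5524
#eval coverage table2  -- 98
```

The five rows cover 98% of continuations, so renormalizing (5524 / 98) gives roughly 56% Source, close to the ~57% overall figure. The renormalization step is an assumption of this sketch, not something stated in the docstring.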

def KehlerRohde2013.perfective_model :

NextMentionModel

Instantiate the perfective-condition next-mention model with Table 2 data. Downstream study files can reference these CR biases.

Equations

One or more equations did not get rendered due to their size.

Instances For

def KehlerRohde2013.whatNext_occasion_pct :

ℕ

Table 3: "What happened next?" → Occasion-dominated; "Why?" → Explanation-dominated. Instructions shift P(CR) without changing the stimuli.

Equations

KehlerRohde2013.whatNext_occasion_pct = 71

Instances For

def KehlerRohde2013.whatNext_explanation_pct :

ℕ

Equations

KehlerRohde2013.whatNext_explanation_pct = 1

Instances For

def KehlerRohde2013.why_occasion_pct :

ℕ

Equations

KehlerRohde2013.why_occasion_pct = 1

Instances For

def KehlerRohde2013.why_explanation_pct :

ℕ

Equations

KehlerRohde2013.why_explanation_pct = 91

Instances For

theorem KehlerRohde2013.instructions_shift_pCR :

whatNext_occasion_pct > why_occasion_pct ∧ why_explanation_pct > whatNext_explanation_pct

def KehlerRohde2013.whatNext_sourcePct :

ℕ

Table 5: Source interpretation by instruction condition (perfective). Shifting P(CR) shifts P(referent), as predicted by eq. (9).

Equations

KehlerRohde2013.whatNext_sourcePct = 34

Instances For

def KehlerRohde2013.why_sourcePct :

ℕ

Equations

KehlerRohde2013.why_sourcePct = 82

Instances For

theorem KehlerRohde2013.instructions_shift_interpretation :

why_sourcePct > whatNext_sourcePct

theorem KehlerRohde2013.instruction_effect_magnitude :

why_sourcePct - whatNext_sourcePct > 40

The instruction effect is 48 pp on identical stimuli. No morphosyntactic heuristic can account for this.

structure KehlerRohde2013.StabilityDatum :

Table 4: P(Source | CR) is stable across the original experiment and the instruction manipulation, supporting the structural claim that CR-conditioned biases are properties of the coherence relation itself, not the experimental context.

cr : Core.Discourse.Coherence.CoherenceRelation
originalPct : ℕ
instructionPct : ℕ

Instances For

@[implicit_reducible]

instance KehlerRohde2013.instReprStabilityDatum :

Repr StabilityDatum

Equations

KehlerRohde2013.instReprStabilityDatum = { reprPrec := KehlerRohde2013.instReprStabilityDatum.repr }

def KehlerRohde2013.instReprStabilityDatum.repr :

StabilityDatum → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

def KehlerRohde2013.stab_elaboration :

StabilityDatum

Equations

KehlerRohde2013.stab_elaboration = { cr := Core.Discourse.Coherence.CoherenceRelation.elaboration, originalPct := 98, instructionPct := 100 }

Instances For

def KehlerRohde2013.stab_explanation :

StabilityDatum

Equations

KehlerRohde2013.stab_explanation = { cr := Core.Discourse.Coherence.CoherenceRelation.explanation, originalPct := 80, instructionPct := 82 }

Instances For

def KehlerRohde2013.stab_violatedExp :

StabilityDatum

Equations

KehlerRohde2013.stab_violatedExp = { cr := Core.Discourse.Coherence.CoherenceRelation.contrast, originalPct := 76, instructionPct := 74 }

Instances For

def KehlerRohde2013.stab_occasion :

StabilityDatum

Equations

KehlerRohde2013.stab_occasion = { cr := Core.Discourse.Coherence.CoherenceRelation.occasion, originalPct := 18, instructionPct := 27 }

Instances For

def KehlerRohde2013.stab_result :

StabilityDatum

Equations

KehlerRohde2013.stab_result = { cr := Core.Discourse.Coherence.CoherenceRelation.result, originalPct := 8, instructionPct := 9 }

Instances For

theorem KehlerRohde2013.bias_direction_stable :

(stab_elaboration.originalPct > 50 ∧ stab_elaboration.instructionPct > 50) ∧ (stab_explanation.originalPct > 50 ∧ stab_explanation.instructionPct > 50) ∧ (stab_occasion.originalPct < 50 ∧ stab_occasion.instructionPct < 50) ∧ stab_result.originalPct < 50 ∧ stab_result.instructionPct < 50

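Bias direction (above/below 50%) is preserved across conditions for the four CRs checked in this statement (Elaboration, Explanation, Occasion, Result). Violated Expectation (76% vs 74%) shows the same pattern but is not part of the conjunction. P(CR) can shift independently of P(ref|CR).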
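
The direction check can also be run over all five Table 4 rows at once, including Violated Expectation. The sketch below uses a simplified `StabilityDatum` without the `cr` field, and `sameSide` and `table4` are names introduced here for illustration.

```lean
-- Simplified record: the coherence-relation field is omitted in this sketch.
structure StabilityDatum where
  originalPct : Nat
  instructionPct : Nat

-- True when both percentages fall on the same side of 50%.
def sameSide (d : StabilityDatum) : Bool :=
  decide (d.originalPct > 50) == decide (d.instructionPct > 50)

-- Elaboration, Explanation, Violated Expectation, Occasion, Result.
def table4 : List StabilityDatum :=
  [⟨98, 100⟩, ⟨80, 82⟩, ⟨76, 74⟩, ⟨18, 27⟩, ⟨8, 9⟩]

#eval table4.all sameSide  -- true
```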

structure KehlerRohde2013.PromptCRDatum :

Table 6: CR distribution by prompt type. The mere presence of an ambiguous pronoun shifts coherence expectations toward Source-biased relations. This bidirectionality — coreference affects coherence, not just vice versa — is predicted by Bayes (eq. 12) but not by Hobbs (pronouns are inert free variables) or Centering (does not model coherence).

prompt : PromptType
cr : Core.Discourse.Coherence.CoherenceRelation
freqPct : ℕ

Instances For

@[implicit_reducible]

instance KehlerRohde2013.instReprPromptCRDatum :

Repr PromptCRDatum

Equations

KehlerRohde2013.instReprPromptCRDatum = { reprPrec := KehlerRohde2013.instReprPromptCRDatum.repr }

def KehlerRohde2013.instReprPromptCRDatum.repr :

PromptCRDatum → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

def KehlerRohde2013.np_elaboration :

PromptCRDatum

Equations

KehlerRohde2013.np_elaboration = { prompt := KehlerRohde2013.PromptType.noPronoun, cr := Core.Discourse.Coherence.CoherenceRelation.elaboration, freqPct := 6 }

Instances For

def KehlerRohde2013.np_explanation :

PromptCRDatum

Equations

KehlerRohde2013.np_explanation = { prompt := KehlerRohde2013.PromptType.noPronoun, cr := Core.Discourse.Coherence.CoherenceRelation.explanation, freqPct := 20 }

Instances For

def KehlerRohde2013.np_occasion :

PromptCRDatum

Equations

KehlerRohde2013.np_occasion = { prompt := KehlerRohde2013.PromptType.noPronoun, cr := Core.Discourse.Coherence.CoherenceRelation.occasion, freqPct := 36 }

Instances For

def KehlerRohde2013.np_result :

PromptCRDatum

Equations

KehlerRohde2013.np_result = { prompt := KehlerRohde2013.PromptType.noPronoun, cr := Core.Discourse.Coherence.CoherenceRelation.result, freqPct := 13 }

Instances For

def KehlerRohde2013.np_violatedExp :

PromptCRDatum

Equations

KehlerRohde2013.np_violatedExp = { prompt := KehlerRohde2013.PromptType.noPronoun, cr := Core.Discourse.Coherence.CoherenceRelation.contrast, freqPct := 18 }

Instances For

def KehlerRohde2013.pp_elaboration :

PromptCRDatum

Equations

KehlerRohde2013.pp_elaboration = { prompt := KehlerRohde2013.PromptType.pronoun, cr := Core.Discourse.Coherence.CoherenceRelation.elaboration, freqPct := 20 }

Instances For

def KehlerRohde2013.pp_explanation :

PromptCRDatum

Equations

KehlerRohde2013.pp_explanation = { prompt := KehlerRohde2013.PromptType.pronoun, cr := Core.Discourse.Coherence.CoherenceRelation.explanation, freqPct := 28 }

Instances For

def KehlerRohde2013.pp_occasion :

PromptCRDatum

Equations

KehlerRohde2013.pp_occasion = { prompt := KehlerRohde2013.PromptType.pronoun, cr := Core.Discourse.Coherence.CoherenceRelation.occasion, freqPct := 28 }

Instances For

def KehlerRohde2013.pp_result :

PromptCRDatum

Equations

KehlerRohde2013.pp_result = { prompt := KehlerRohde2013.PromptType.pronoun, cr := Core.Discourse.Coherence.CoherenceRelation.result, freqPct := 5 }

Instances For

def KehlerRohde2013.pp_violatedExp :

PromptCRDatum

Equations

KehlerRohde2013.pp_violatedExp = { prompt := KehlerRohde2013.PromptType.pronoun, cr := Core.Discourse.Coherence.CoherenceRelation.contrast, freqPct := 14 }

Instances For

theorem KehlerRohde2013.pronoun_boosts_source_CRs :

pp_elaboration.freqPct > np_elaboration.freqPct ∧ pp_explanation.freqPct > np_explanation.freqPct

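Pronoun prompt increases the Source-biased CRs Elaboration and Explanation. Violated Expectation, also Source-biased in Table 2, is not covered by this statement: it drops from 18% to 14%.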

theorem KehlerRohde2013.pronoun_reduces_goal_CRs :

pp_occasion.freqPct < np_occasion.freqPct ∧ pp_result.freqPct < np_result.freqPct

Pronoun prompt decreases Goal-biased CRs.
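
The two theorems above each check two relations. A sketch that computes the signed shift for every Table 6 row makes the full pattern visible; `table6` and `shift` are illustrative names introduced here, and the string labels are informal.

```lean
-- (label, no-pronoun %, pronoun %) copied from the np_* and pp_* definitions above.
def table6 : List (String × Nat × Nat) :=
  [("elaboration", 6, 20),
   ("explanation", 20, 28),
   ("occasion", 36, 28),
   ("result", 13, 5),
   ("violatedExp", 18, 14)]

-- Signed change in percentage points, computed in Int so decreases are not truncated.
def shift (row : String × Nat × Nat) : String × Int :=
  (row.1, (row.2.2 : Int) - (row.2.1 : Int))

#eval table6.map shift
```

By the arithmetic, Elaboration gains 14 points and Explanation 8, while Occasion and Result each lose 8 and Violated Expectation loses 4.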

def KehlerRohde2013.nm_active_pron :

ℕ

Equations

KehlerRohde2013.nm_active_pron = 77

Instances For

def KehlerRohde2013.nm_active_noPron :

ℕ

Equations

KehlerRohde2013.nm_active_noPron = 59

Instances For

def KehlerRohde2013.nm_passive_pron :

ℕ

Equations

KehlerRohde2013.nm_passive_pron = 42

Instances For

def KehlerRohde2013.nm_passive_noPron :

ℕ

Equations

KehlerRohde2013.nm_passive_noPron = 76

Instances For

theorem KehlerRohde2013.voice_affects_nextMention :

nm_active_pron > nm_passive_pron

Voice affects next-mention in pronoun condition: active (.77) vs passive (.42). Passivization moves the causally-implicated referent out of subject position — same proposition, different bias.

theorem KehlerRohde2013.noPronoun_pattern_reverses :

nm_passive_noPron > nm_active_noPron

In the no-pronoun condition the pattern reverses: passive (.76) > active (.59). By-phrases are optional in English, so their inclusion signals the referent will be re-mentioned.

def KehlerRohde2013.expl_active_pron :

ℕ

Equations

KehlerRohde2013.expl_active_pron = 75

Instances For

def KehlerRohde2013.expl_active_noPron :

ℕ

Equations

KehlerRohde2013.expl_active_noPron = 60

Instances For

def KehlerRohde2013.expl_passive_pron :

ℕ

Equations

KehlerRohde2013.expl_passive_pron = 52

Instances For

def KehlerRohde2013.expl_passive_noPron :

ℕ

Equations

KehlerRohde2013.expl_passive_noPron = 72

Instances For

theorem KehlerRohde2013.voice_affects_coherence :

expl_active_pron > expl_passive_pron

Voice affects coherence in pronoun condition: active produces more Explanations than passive. Since propositions are identical, this is mediated by the shift in pronominal reference — demonstrating bidirectional coherence–coreference dependency.

def KehlerRohde2013.pron_active_subj :

ℕ

Equations

KehlerRohde2013.pron_active_subj = 62

Instances For

def KehlerRohde2013.pron_active_nonSubj :

ℕ

Equations

KehlerRohde2013.pron_active_nonSubj = 24

Instances For

def KehlerRohde2013.pron_passive_subj :

ℕ

Equations

KehlerRohde2013.pron_passive_subj = 87

Instances For

def KehlerRohde2013.pron_passive_nonSubj :

ℕ

Equations

KehlerRohde2013.pron_passive_nonSubj = 23

Instances For

theorem KehlerRohde2013.passive_subj_more_pronominalized :

pron_passive_subj > pron_active_subj

Central topichood prediction: passive subjects are pronominalized more than active subjects (87% vs 62%).

This is NOT explicable by grammatical role alone — both are subjects. It reflects the stronger topichood signal of the passive: using a marked syntactic form to place an entity in subject position is a stronger indicator of topic status. This is the key evidence that P(pronoun | referent) tracks TOPICHOOD, not subjecthood.

theorem KehlerRohde2013.nonSubj_pron_invariant :

pron_active_nonSubj - pron_passive_nonSubj ≤ 1

Non-subject pronominalization is invariant across voice (24% vs 23%). At the same topichood level (low), the voice manipulation — which changes coherence expectations dramatically — has no effect on pronominalization rate. This is the Independence Hypothesis in action: P(pronoun | referent) does not depend on coherence-driven factors.
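
One caveat when reading statements of this form: subtraction on natural numbers in Lean truncates at zero, so `a - b ≤ 1` also holds whenever `a < b`. The sketch below shows the behaviour and a symmetric alternative; `absDiff` is a helper introduced here, not part of the library.

```lean
#eval (24 - 23 : Nat)  -- 1
#eval (23 - 24 : Nat)  -- 0, because Nat subtraction truncates at zero

-- Symmetric difference that does not depend on argument order.
def absDiff (a b : Nat) : Nat :=
  if a ≥ b then a - b else b - a

#eval absDiff 24 23  -- 1
#eval absDiff 23 24  -- 1
```

With the values used here (24 and 23) the theorem's bound is informative, since the larger number is on the left.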

theorem KehlerRohde2013.subject_advantage_both_voices :

pron_active_subj > pron_active_nonSubj ∧ pron_passive_subj > pron_passive_nonSubj

Subjects are pronominalized more than non-subjects in both voices. This subject advantage is the centering-derived component.

theorem KehlerRohde2013.topichood_monotone :

pron_passive_subj > pron_active_subj ∧ pron_active_subj > pron_active_nonSubj

Topichood monotonically predicts pronominalization: strong (passive subject, 87%) > default (active subject, 62%) > low (non-subject, ~24%).

def KehlerRohde2013.predicted_active_subj :

ℕ

Equations

KehlerRohde2013.predicted_active_subj = 81

Instances For

def KehlerRohde2013.actual_active_subj :

ℕ

Equations

KehlerRohde2013.actual_active_subj = 74

Instances For

def KehlerRohde2013.predicted_passive_subj :

ℕ

Equations

KehlerRohde2013.predicted_passive_subj = 59

Instances For

def KehlerRohde2013.actual_passive_subj :

ℕ

Equations

KehlerRohde2013.actual_passive_subj = 60

Instances For

theorem KehlerRohde2013.bayesian_directionally_correct :

predicted_active_subj > predicted_passive_subj ∧ actual_active_subj > actual_passive_subj

Bayesian predictions are directionally correct: active > passive in both predicted and actual biases.

theorem KehlerRohde2013.passive_prediction_accurate :

actual_passive_subj - predicted_passive_subj ≤ 1

The passive prediction is highly accurate (59% vs 60%).

def KehlerRohde2013.NextMentionModel.sourceBasisPts (m : NextMentionModel) :

ℕ

Compute the coherence-marginalized Source bias from a NextMentionModel. This IS equation (9): P(Source) = Σ_CR P(CR) × P(Source | CR). Result is in basis points (×10000); divide by 100 for percentage.

Equations

One or more equations did not get rendered due to their size.
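
Since the equations are not rendered, here is a minimal sketch of how eq. (9) can be computed over a finite set of relations. Everything in it is a reconstruction: `CR` is a hypothetical six-constructor stand-in for `Core.Discourse.Coherence.CoherenceRelation`, `Nat` replaces ℕ to avoid a Mathlib dependency, and the probability tables are read off the worked arithmetic in the `eq9_why_exceeds_whatNext` docstring (with the sixth term taken to be the "Other" category).

```lean
-- Hypothetical stand-in for the library's coherence-relation type.
inductive CR where
  | occasion
  | explanation
  | elaboration
  | contrast
  | result
  | other
  deriving DecidableEq, Repr

def CR.all : List CR :=
  [CR.occasion, CR.explanation, CR.elaboration, CR.contrast, CR.result, CR.other]

structure NextMentionModel where
  pCR : CR → Nat             -- P(CR), in %
  pSourceGivenCR : CR → Nat  -- P(Source | CR), in %

-- Eq. (9): sum over relations of P(CR) × P(Source | CR), in basis points.
def NextMentionModel.sourceBasisPts (m : NextMentionModel) : Nat :=
  CR.all.foldl (fun acc cr => acc + m.pCR cr * m.pSourceGivenCR cr) 0

-- Shared CR-conditioned biases (instruction-condition values).
def sharedBias : CR → Nat
  | CR.occasion => 27
  | CR.explanation => 82
  | CR.elaboration => 100
  | CR.contrast => 74
  | CR.result => 9
  | CR.other => 50

def whyPCR : CR → Nat
  | CR.occasion => 1
  | CR.explanation => 91
  | CR.elaboration => 8
  | CR.contrast => 1
  | CR.result => 0
  | CR.other => 0

def whatNextPCR : CR → Nat
  | CR.occasion => 71
  | CR.explanation => 1
  | CR.elaboration => 5
  | CR.contrast => 8
  | CR.result => 5
  | CR.other => 10

def whyModel : NextMentionModel := { pCR := whyPCR, pSourceGivenCR := sharedBias }
def whatNextModel : NextMentionModel := { pCR := whatNextPCR, pSourceGivenCR := sharedBias }

#eval whyModel.sourceBasisPts       -- 8363
#eval whatNextModel.sourceBasisPts  -- 3636
```

Both models share `sharedBias`, which mirrors the invariant that the instruction manipulation changes P(CR) only.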

Instances For

def KehlerRohde2013.whatNext_model :

NextMentionModel

Equations

One or more equations did not get rendered due to their size.

Instances For

def KehlerRohde2013.why_model :

NextMentionModel

Equations

One or more equations did not get rendered due to their size.

Instances For

theorem KehlerRohde2013.instruction_models_share_bias :

whatNext_model.pSourceGivenCR = why_model.pSourceGivenCR

Structural invariant: the two instruction models share the same CR-conditioned biases. The instruction manipulation changes P(CR) while holding P(ref|CR) constant. This is the structural content of Table 4.

theorem KehlerRohde2013.eq9_why_exceeds_whatNext :

why_model.sourceBasisPts > whatNext_model.sourceBasisPts

Eq. (9) derivation: the "Why?" mixture exceeds the "What next?" mixture. This is DERIVED from the model, not read off Table 5. The proof computes:

Why: 1×27 + 91×82 + 8×100 + 1×74 + 0×9 + 0×50 = 8363
What next: 71×27 + 1×82 + 5×100 + 8×74 + 5×9 + 10×50 = 3636

and verifies 8363 > 3636. The direction follows from Explanation (Source-biased at 82%) dominating the Why mixture at 91%.

theorem KehlerRohde2013.eq9_mixtures_approximate_table5 :

why_model.sourceBasisPts / 100 > 80 ∧ whatNext_model.sourceBasisPts / 100 < 40

The computed mixtures are consistent with Table 5: Why → ~84% Source, What-next → ~36% Source (vs observed 82% and 34%). The small discrepancy is from integer rounding and the "Other" CR category.

def KehlerRohde2013.bayesianPrediction (pSubj pPronSubj pPronNonSubj : ℕ) :

ℕ

Compute P(Subject | pronoun) via Bayes' rule (eq. 13). Takes P(Subject next-mentioned) from no-pronoun data and P(pronoun | position) from pronominalization rates. Result is a percentage (0–100).

Equations

One or more equations did not get rendered due to their size.
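
The equations are not rendered here, so the following is a sketch of one way to implement the described computation. It assumes percentages as natural numbers and rounding down via integer division; the library's actual rounding behaviour is not shown on this page.

```lean
-- P(Subject | pronoun) in %, from P(Subject) and the two pronominalization rates.
-- Assumption: integer division rounds down. In Lean, division by zero returns 0.
def bayesianPrediction (pSubj pPronSubj pPronNonSubj : Nat) : Nat :=
  let num := pPronSubj * pSubj
  let den := num + pPronNonSubj * (100 - pSubj)
  num * 100 / den

-- Active voice: P(Subject) = 59, rates 62 and 24.
#eval bayesianPrediction 59 62 24          -- 78
-- Passive voice: P(Subject) = 100 - 76 = 24, rates 87 and 23.
#eval bayesianPrediction (100 - 76) 87 23  -- 54
```

Under these assumptions the active prediction exceeds the passive one even though the passive subject has the higher pronominalization rate, because the prior term differs so much between the two voices.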

Instances For

theorem KehlerRohde2013.eq13_active_prediction :

bayesianPrediction nm_active_noPron pron_active_subj pron_active_nonSubj > 50

Eq. (13) derivation: active voice. From:

P(Subject) = 59% (Table 7, no-pronoun, causal ref = subject)
P(pronoun | Subject) = 62% (Table 9)
P(pronoun | NonSubject) = 24% (Table 9)

Bayes' rule yields: 62×59 / (62×59 + 24×41) = 3658/4642 ≈ 78.8% (78% when rounded down to a whole percentage). The paper reports 81% (from unrounded data); the direction matches.

theorem KehlerRohde2013.eq13_passive_prediction :

bayesianPrediction (100 - nm_passive_noPron) pron_passive_subj pron_passive_nonSubj > 50

Eq. (13) derivation: passive voice. From:

P(Subject) = 100 - 76 = 24% (Table 7: 76% mention causal ref, who is the NON-subject in passive)
P(pronoun | Subject) = 87% (Table 9)
P(pronoun | NonSubject) = 23% (Table 9)

Bayes' rule yields: 87×24 / (87×24 + 23×76) = 2088/3836 ≈ 54%.

theorem KehlerRohde2013.eq13_active_exceeds_passive :

bayesianPrediction nm_active_noPron pron_active_subj pron_active_nonSubj > bayesianPrediction (100 - nm_passive_noPron) pron_passive_subj pron_passive_nonSubj

Central Bayesian prediction: Bayes' rule correctly derives that active > passive for P(Subject | pronoun), even though passive subjects are more likely to be pronominalized (87% vs 62%). The prior P(Subject) is much lower in passive (24% vs 59%), and this dominates. Production bias alone would predict passive > active; the Bayesian model correctly reverses this.

theorem KehlerRohde2013.goal_biased_crs_are_endpoint_focused :

cr_occasion.cr.toClass = Core.Discourse.Coherence.CoherenceClass.contiguity ∧ cr_result.cr.toClass = Core.Discourse.Coherence.CoherenceClass.causeEffect ∧ cr_occasion.sourceGivenCR < 50 ∧ cr_result.sourceGivenCR < 50

The two Goal-biased CRs (Occasion, Result) both focus on what happens AFTER the prior event. For transfer verbs, the endpoint is the Goal.

theorem KehlerRohde2013.explanation_source_and_backward :

cr_explanation.cr.selectsCause = true ∧ cr_explanation.sourceGivenCR > 50

Explanation is Source-biased and selects for causes (backward causal). For transfer verbs, the Source/initiator is the cause. For IC verbs, the stimulus is the cause — this is the bridge to IC bias studies.

theorem KehlerRohde2013.contiguity_class_splits :

cr_occasion.cr.toClass = cr_elaboration.cr.toClass ∧ cr_occasion.sourceGivenCR < 50 ∧ cr_elaboration.sourceGivenCR > 50

Key insight: the contiguity class does NOT uniformly predict bias. Occasion (18% Source) and Elaboration (98% Source) are both contiguity relations but have opposite biases. Occasion focuses on the END STATE (Goal); Elaboration redescribes the SAME EVENT (Source/initiator). The bias is determined by the specific relation, not the class.

Centering's CB and KR2013's topichood are not the same signal.

K&R 2013 IS the Bayesian-Centering reconciliation paper. To make
that explicit at the type level, this section grounds the file's
`topichood`/`bayesianPrediction` apparatus in the
`Theories/Discourse/Centering/` substrate (`cb`, `cp`, `Rule1Gordon`),
showing precisely *what Centering does and does not capture* from
KR2013's empirical landscape.

The key dissociation (KR2013 §8, Table 9): under the standard
grammatical-role Cf ranking (`SUBJECT > OBJECT > OTHER`,
@cite{kameyama-1986}), the CB is **invariant under voice
manipulation** — both `(Amanda, SUBJ) (Brittany, OBJ)` and
`(Amanda, SUBJ) (Brittany, OTHER-by-phrase)` make Amanda the
most-preferred Cf. But KR2013's `topichood` distinguishes the
two cases: passive subject is `.strong`, active subject is
`.default_`. This is the formal content of "P(pronoun | referent)
tracks topichood, not subjecthood" (§8 p. 25).

The cross-paper claim established here: **Centering's CB selection is a
necessary input to KR2013's production model but is not sufficient
to predict pronominalization rate**. The voice-induced gradient
that KR2013 measure (87% vs 62%) lives in the topichood signal,
not in the CB signal.

**Empirical complement** (post-2013 follow-up): the substrate-level
dissociation theorem `cb_topichood_dissociation_under_voice` below
is the structural reason why
`RosaArnold2017.independence_violated_bridges_to_KR`
(`Phenomena/Reference/Studies/RosaArnold2017.lean`) finds K&R's
Independence Hypothesis empirically *violatable*: because
`cb` cannot detect the voice manipulation, any pronominalization
asymmetry between active and passive subjects must be carried by a
signal external to Centering's `cb`/`cp` — Rosa & Arnold's
experiment provides one such asymmetry as a corpus measurement,
while §12 here exhibits the substrate-level mechanism.

def KehlerRohde2013.amanda :

ℕ

Two referents in our toy KR2013 example: Amanda (subject across voice manipulations) and Brittany (object/by-phrase).

Equations

KehlerRohde2013.amanda = 1

Instances For

def KehlerRohde2013.brittany :

ℕ

Equations

KehlerRohde2013.brittany = 2

Instances For

def KehlerRohde2013.prev_AmandaActive :

Discourse.Centering.Utterance ℕ Discourse.Centering.GrammaticalRole

Prior utterance "Amanda V'd Brittany": Amanda is SUBJ, Brittany is OBJ. Forward-looking centers under Kameyama's role ranking are [Amanda, Brittany] with Amanda as Cp.

Equations

One or more equations did not get rendered due to their size.

Instances For

def KehlerRohde2013.cur_active :

Discourse.Centering.Utterance ℕ Discourse.Centering.GrammaticalRole

Active continuation "She V'd her" — Amanda still SUBJ, both pronouns.

Equations

One or more equations did not get rendered due to their size.

Instances For

def KehlerRohde2013.cur_passive :

Discourse.Centering.Utterance ℕ Discourse.Centering.GrammaticalRole

Passive continuation "Amanda was V'd by Brittany" — Amanda promoted to SUBJ via the marked passive construction; Brittany now in by-phrase (OTHER). The proposition is identical; only the construction differs.

Equations

One or more equations did not get rendered due to their size.

Instances For

theorem KehlerRohde2013.cp_prev_is_amanda :

prev_AmandaActive.cp = some amanda

Cp of the prior utterance is Amanda (SUBJ outranks OBJ).

theorem KehlerRohde2013.cb_invariant_under_voice :

Discourse.Centering.cb prev_AmandaActive cur_active = Discourse.Centering.cb prev_AmandaActive cur_passive

CB is invariant under voice manipulation. Both the active and passive continuations have CB = Amanda, because Amanda is in prev.cf and is realized in both. The grammatical-role ranker cannot see voice — both subjects rank equally as .subject.

theorem KehlerRohde2013.cb_is_amanda_in_both_voices :

Discourse.Centering.cb prev_AmandaActive cur_active = some amanda ∧ Discourse.Centering.cb prev_AmandaActive cur_passive = some amanda

Both voice variants have CB = Amanda specifically.

theorem KehlerRohde2013.topichood_distinguishes_voice :

topichood UD.Voice.Pass true ≠ topichood UD.Voice.Act true

KR2013's topichood IS voice-sensitive. The same subject-position Amanda gets .strong topichood under passive marking but only .default_ under active. This is the gradient that drives the 87% vs 62% pronominalization rate difference (Table 9).

theorem KehlerRohde2013.cb_topichood_dissociation_under_voice :

Discourse.Centering.cb prev_AmandaActive cur_active = Discourse.Centering.cb prev_AmandaActive cur_passive ∧ topichood UD.Voice.Pass true ≠ topichood UD.Voice.Act true

The dissociation theorem: Centering's CB and KR2013's topichood diverge on the voice manipulation. CB is the same in both cases (Amanda); topichood differs (.strong vs .default_). The 25-pp pronominalization gap KR2013 measure (Table 9) lives in the topichood signal, not the CB signal — exactly KR2013 §8's "P(pronoun | referent) tracks topichood, not subjecthood."
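
To illustrate why a role-ranked CB cannot see voice, here is a self-contained sketch of the standard CB computation: the highest-ranked member of the previous utterance's Cf that the current utterance realizes. It does not use the `Discourse.Centering` definitions, which are not shown on this page; `Role`, `Mention`, `cf`, and `cb` are simplified stand-ins introduced here.

```lean
inductive Role where
  | subject
  | object
  | other
  deriving DecidableEq, Repr

structure Mention where
  entity : Nat
  role : Role

-- Cf under the ranking SUBJECT > OBJECT > OTHER: collect entities role by role.
def cf (u : List Mention) : List Nat :=
  let pick (r : Role) := (u.filter (fun m => m.role == r)).map (fun m => m.entity)
  pick Role.subject ++ pick Role.object ++ pick Role.other

-- CB: first element of the previous Cf that is realized in the current utterance.
def cb (prev cur : List Mention) : Option Nat :=
  (cf prev).find? (fun e => cur.any (fun m => m.entity == e))

-- Entity 1 plays the role of Amanda, entity 2 of Brittany.
def prev : List Mention := [⟨1, Role.subject⟩, ⟨2, Role.object⟩]
def curActive : List Mention := [⟨1, Role.subject⟩, ⟨2, Role.object⟩]
def curPassive : List Mention := [⟨1, Role.subject⟩, ⟨2, Role.other⟩]

#eval cb prev curActive   -- some 1
#eval cb prev curPassive  -- some 1
```

Nothing in `Mention` records the construction that put entity 1 in subject position, so the two continuations are indistinguishable to `cb`. A voice-sensitive signal has to come from elsewhere.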

theorem KehlerRohde2013.rule1_gordon_satisfied_both_voices :

Discourse.Centering.Rule1Gordon prev_AmandaActive cur_active ∧ Discourse.Centering.Rule1Gordon prev_AmandaActive cur_passive

Rule 1 (Gordon) is satisfied in both voice variants because both Amanda-realizations are pronominal. The substrate-level Rule 1 constraint is voice-insensitive too — it only fires on whether the CB is a pronoun, not on what construction realized it.

KR2013's contribution is precisely to expose the gradient that Rule 1's Bool-valued check averages over: among the utterances that satisfy Rule 1 (Gordon), passive-subject ones use a pronoun 87% of the time while active-subject ones do so only 62% of the time (Table 9). Rule 1 captures the qualitative pattern; the Bayesian likelihood P(pronoun | referent) captures the voice-conditioned production rate.

theorem KehlerRohde2013.topichood_rates_monotone_in_table9 :

pron_passive_subj > pron_active_subj ∧ pron_active_subj > pron_active_nonSubj

Centering as the qualitative skeleton of KR2013's likelihood. Where Centering's Rule1Gordon says "the CB should be pronominalized" (Bool), KR2013's likelihood P(pronoun | referent) says "the CB is pronominalized at a rate proportional to its topichood" (gradient). The Centering substrate provides the "which referent is the topic" part; KR2013 provide the "how strongly" part.

Numerically: the 87% / 62% / ~24% pronominalization rates from Table 9 monotonically track the .strong / .default_ / .low levels assigned by topichood.