[HF19]: complex probability expressions via RSA on `PMF` #

Compositional threshold semantics plus an RSA model over an urn scenario (N = 10 balls), formalised on mathlib's PMF in the same key as Studies/LassiterGoodman2017PMF.lean. Cognition 186 (2019) 50–71.

The architectural novelty is a Hellinger-distance speaker utility (Eq. 16) in place of KL divergence. The literal listener P_LL(s|m) ∝ ⟦m⟧(s) · P_prior(s) puts zero mass outside a meaning's extension, so KL(P_rat.bel ‖ P_LL) = ∞ whenever the speaker's belief has support outside it — a strictly probabilistic speaker could never say "certainly RED" at 9/10. Hellinger distance is bounded by 1, so the speaker can assert it with bounded disutility.

Main definitions #

obsKernel, speakerBelief — belief formation (Eq. 12) as PMF.posterior of the hypergeometric observation kernel, with ℚ-valued mirrors hypergeoQ / speakerBeliefQ for finite-arithmetic checks.
SimpleExpr.meaning, complexMeaning — threshold semantics for simple (Eq. 13-14) and nested (Eq. 22-23) probability expressions.
literalListener, pragmaticListener — the RSA literal listener (uniform on a meaning extension, Eq. 15) and the pragmatic listener as a joint PMF.posterior (Eq. 18).

Main results #

hellinger_admits_what_KL_excludes — on the witness pair (pure 9, pure 10), KL = ∞ but HD ≤ 1: the formal content of the Hellinger-vs-KL divergence, via klDiv_eq_top_when_zero_support and hellingerDistSq_le_one.
statePosterior_apply, accessPosterior_apply, statePosterior_lt_iff — marginalisations (Eq. 19-20) of the joint posterior, direct corollaries of Core/Probability/JointPosterior.
certainly_subset_probably, probably_subset_possibly — nesting of the inferred meaning extensions.

The cross-paper contrast with [GS13] makes the speaker-utility divergence (KL vs Hellinger) visible at theorem level. The paper's posterior-predictive evaluation (JAGS fits, the model–data correlations of Tables 6-10) is statistical, not structural, and is not formalised here.

Domain types #

source

@[reducible, inline]

abbrev HerbstrittFranke2019.UrnState :

Type

Urn state: number of red balls in the urn (0..10).

Equations

HerbstrittFranke2019.UrnState = Fin 11

Instances For

source

def HerbstrittFranke2019.proportion (s : UrnState) :

ℚ

The proportion of red balls: s/10. First-order probability of drawing a red ball, and the measure function for simple expressions.

Equations

HerbstrittFranke2019.proportion s = ↑↑s / 10

Instances For

source

@[reducible, inline]

abbrev HerbstrittFranke2019.Obs :

Type

Observation count (zero-padded for access < 10): obs > access have probability 0 in the kernel.

Equations

HerbstrittFranke2019.Obs = Fin 11

Instances For

Belief formation via the hypergeometric kernel #

The observation kernel is the hypergeometric distribution from Core/Probability/Hypergeometric.lean, embedded into the padded Obs := Fin 11 via PMF.map. The speaker's posterior belief is then just PMF.posterior of this kernel against the uniform world prior.

source

noncomputable def HerbstrittFranke2019.obsKernel (access : ℕ) (h_a : access ≤ 10) (s : UrnState) :

PMF Obs

Observation kernel: P(obs | access, state). The hypergeometric PMF on Fin (access + 1) embedded into Fin 11 (zero on obs > access). Uses PMF.hypergeometric from Core/Probability.

Equations

HerbstrittFranke2019.obsKernel access h_a s = PMF.map (fun (o : Fin (access + 1)) => ⟨↑o, ⋯⟩) (PMF.hypergeometric 10 (↑s) access h_a ⋯)

Instances For

source

noncomputable def HerbstrittFranke2019.speakerBelief (access : ℕ) (h_a : access ≤ 10) (obs : Obs) (h_marg : PMF.marginal (obsKernel access h_a) (PMF.uniformOfFintype UrnState) obs ≠ 0) :

PMF UrnState

Speaker's posterior belief P_rat.bel(·|o, a) (Eq. 12). With uniform world prior, this is PMF.posterior of the hypergeometric kernel.

Equations

HerbstrittFranke2019.speakerBelief access h_a obs h_marg = PMF.posterior (HerbstrittFranke2019.obsKernel access h_a) (PMF.uniformOfFintype HerbstrittFranke2019.UrnState) obs h_marg

Instances For

`ℚ`-valued speaker belief for finite-arithmetic verification #

Mirrors the existing file's speakerBeliefQ for norm_num checks. Bridge to the PMF version: both reduce to the same hypergeometric ratio.

source

def HerbstrittFranke2019.hypergeoQ (access obs : ℕ) (s : UrnState) :

ℚ

Hypergeometric weight at (N=10, K=s, n=access, k=obs) as ℚ.

Equations

HerbstrittFranke2019.hypergeoQ access obs s = if obs ≤ access then ↑((↑s).choose obs * (10 - ↑s).choose (access - obs)) / ↑(Nat.choose 10 access) else 0

Instances For

source

def HerbstrittFranke2019.speakerBeliefQ (access obs : ℕ) (s : UrnState) :

ℚ

ℚ-valued speaker belief — Bayes' rule over the hypergeometric kernel with uniform world prior.

Equations

One or more equations did not get rendered due to their size.

Instances For

Simple expression semantics #

source

def HerbstrittFranke2019.θ_possibly :

ℚ

Semantic threshold for "possibly" (posterior mean from Table 6).

Equations

HerbstrittFranke2019.θ_possibly = 247 / 1000

Instances For

source

def HerbstrittFranke2019.θ_probably :

ℚ

Semantic threshold for "probably" (posterior mean from Table 6).

Equations

HerbstrittFranke2019.θ_probably = 549 / 1000

Instances For

source

def HerbstrittFranke2019.θ_certainly :

ℚ

Semantic threshold for "certainly" (posterior mean from Table 6).

Equations

HerbstrittFranke2019.θ_certainly = 949 / 1000

Instances For

source

theorem HerbstrittFranke2019.inferred_threshold_ordering :

θ_possibly < θ_probably ∧ θ_probably < θ_certainly

The inferred threshold ordering matches the theoretical prediction: certainly > probably > possibly.

source

inductive HerbstrittFranke2019.SimpleExpr :

Type

The five simple expressions from Experiments 2 and 3.

certainlyNot : SimpleExpr
probablyNot : SimpleExpr
possibly : SimpleExpr
probably : SimpleExpr
certainly : SimpleExpr

Instances For

source

@[implicit_reducible]

instance HerbstrittFranke2019.instDecidableEqSimpleExpr :

DecidableEq SimpleExpr

Equations

HerbstrittFranke2019.instDecidableEqSimpleExpr x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

source

@[implicit_reducible]

instance HerbstrittFranke2019.instReprSimpleExpr :

Repr SimpleExpr

Equations

HerbstrittFranke2019.instReprSimpleExpr = { reprPrec := HerbstrittFranke2019.instReprSimpleExpr.repr }

source

def HerbstrittFranke2019.instReprSimpleExpr.repr :

SimpleExpr → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

source

@[implicit_reducible]

instance HerbstrittFranke2019.instInhabitedSimpleExpr :

Inhabited SimpleExpr

Equations

HerbstrittFranke2019.instInhabitedSimpleExpr = { default := HerbstrittFranke2019.instInhabitedSimpleExpr.default }

source

@[implicit_reducible]

instance HerbstrittFranke2019.instFintypeSimpleExpr :

Fintype SimpleExpr

Equations

One or more equations did not get rendered due to their size.

source

def HerbstrittFranke2019.SimpleExpr.meaning :

SimpleExpr → UrnState → Bool

Simple expression meaning using inferred thresholds. Eq. 13: ⟦X(RED)⟧ = {s | s/10 > θ_X} (positive) Eq. 14: ⟦X not(RED)⟧ = {s | s/10 < 1 − θ_X} (negated)

Equations

HerbstrittFranke2019.SimpleExpr.certainly.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s > HerbstrittFranke2019.θ_certainly)
HerbstrittFranke2019.SimpleExpr.probably.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s > HerbstrittFranke2019.θ_probably)
HerbstrittFranke2019.SimpleExpr.possibly.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s > HerbstrittFranke2019.θ_possibly)
HerbstrittFranke2019.SimpleExpr.probablyNot.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s < 1 - HerbstrittFranke2019.θ_probably)
HerbstrittFranke2019.SimpleExpr.certainlyNot.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s < 1 - HerbstrittFranke2019.θ_certainly)

Instances For

source

theorem HerbstrittFranke2019.certainly_subset_probably (s : UrnState) :

SimpleExpr.certainly.meaning s = true → SimpleExpr.probably.meaning s = true

Simple expression extensions are nested: certainly ⊂ probably ⊂ possibly.

source

theorem HerbstrittFranke2019.probably_subset_possibly (s : UrnState) :

SimpleExpr.probably.meaning s = true → SimpleExpr.possibly.meaning s = true

Compositional complex-expression semantics #

source

def HerbstrittFranke2019.posteriorProb (access obs : ℕ) (φ : UrnState → Bool) :

ℚ

Posterior probability that an urn-state proposition φ holds, given the speaker's observation. ℚ-valued for decidable verification.

posteriorProb(access, obs, φ) = Σ_{s ∈ ⟦φ⟧} P_rat.bel(s | obs, access)

Equations

HerbstrittFranke2019.posteriorProb access obs φ = {x : HerbstrittFranke2019.UrnState | φ x = true}.sum (HerbstrittFranke2019.speakerBeliefQ access obs)

Instances For

source

inductive HerbstrittFranke2019.InnerExpr :

Type

The three inner expressions from Experiment 3 (Eq. 22). Footnote 18: θ_probably from the simple model maps to θ_likely of the inner expressions in the complex model.

likely : InnerExpr
possible : InnerExpr
unlikely : InnerExpr

Instances For

source

@[implicit_reducible]

instance HerbstrittFranke2019.instDecidableEqInnerExpr :

DecidableEq InnerExpr

Equations

HerbstrittFranke2019.instDecidableEqInnerExpr x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

source

def HerbstrittFranke2019.instReprInnerExpr.repr :

InnerExpr → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

source

@[implicit_reducible]

instance HerbstrittFranke2019.instReprInnerExpr :

Repr InnerExpr

Equations

HerbstrittFranke2019.instReprInnerExpr = { reprPrec := HerbstrittFranke2019.instReprInnerExpr.repr }

source

def HerbstrittFranke2019.InnerExpr.meaning :

InnerExpr → UrnState → Bool

Inner expression meaning (Eq. 22).

Equations

HerbstrittFranke2019.InnerExpr.likely.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s > HerbstrittFranke2019.θ_probably)
HerbstrittFranke2019.InnerExpr.possible.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s > HerbstrittFranke2019.θ_possibly)
HerbstrittFranke2019.InnerExpr.unlikely.meaning = fun (s : HerbstrittFranke2019.UrnState) => decide (HerbstrittFranke2019.proportion s < 1 - HerbstrittFranke2019.θ_probably)

Instances For

source

inductive HerbstrittFranke2019.OuterMod :

Type

The four outer modifiers from Experiment 3.

is_ : OuterMod
isCertainly : OuterMod
isProbably : OuterMod
mightBe : OuterMod

Instances For

source

@[implicit_reducible]

instance HerbstrittFranke2019.instDecidableEqOuterMod :

DecidableEq OuterMod

Equations

HerbstrittFranke2019.instDecidableEqOuterMod x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

source

@[implicit_reducible]

instance HerbstrittFranke2019.instReprOuterMod :

Repr OuterMod

Equations

HerbstrittFranke2019.instReprOuterMod = { reprPrec := HerbstrittFranke2019.instReprOuterMod.repr }

source

def HerbstrittFranke2019.instReprOuterMod.repr :

OuterMod → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

source

def HerbstrittFranke2019.OuterMod.threshold :

OuterMod → ℚ

Outer modifier threshold (Table 9, complex model).

Equations

HerbstrittFranke2019.OuterMod.mightBe.threshold = 332 / 1000
HerbstrittFranke2019.OuterMod.is_.threshold = HerbstrittFranke2019.θ_probably
HerbstrittFranke2019.OuterMod.isProbably.threshold = 690 / 1000
HerbstrittFranke2019.OuterMod.isCertainly.threshold = 982 / 1000

Instances For

source

def HerbstrittFranke2019.complexMeaning (outer : OuterMod) (inner : InnerExpr) (access obs : ℕ) :

Bool

Complex expression Y(X(RED)) (Eq. 23): ⟦Y(X(RED))⟧(⟨o, a⟩) ⟺ Σ_{s∈⟦X(RED)⟧} P_rat.bel(s|o, a) > θ_Y

Equations

HerbstrittFranke2019.complexMeaning outer inner access obs = decide (HerbstrittFranke2019.posteriorProb access obs inner.meaning > outer.threshold)

Instances For

source

theorem HerbstrittFranke2019.strict_threshold_equiv_ge :

(∀ (s : UrnState), proportion s > θ_certainly ↔ proportion s ≥ θ_certainly) ∧ (∀ (s : UrnState), proportion s > θ_probably ↔ proportion s ≥ θ_probably) ∧ ∀ (s : UrnState), proportion s > θ_possibly ↔ proportion s ≥ θ_possibly

For HF's specific thresholds, strict > and non-strict ≥ give the same extension on UrnState — no proportion s/10 (s ∈ {0,...,10}) exactly equals any threshold. Justifies using > (paper Eq. 13) even when the theory-layer nestedThreshold uses ≥ (Fagin & Halpern).

RSA model on PMF #

The literal listener P_LL(s|m) ∝ ⟦m⟧(s) · P_prior(s) is the uniform distribution on the meaning extension (Eq. 15, with uniform prior). The speaker normalises softmax-Hellinger scores (Eq. 16-17). The pragmatic listener is PMF.posterior over the joint (state × obs × access) (Eq. 18).

source

noncomputable def HerbstrittFranke2019.literalListener (φ : UrnState → Bool) (h_nonempty : {x : UrnState | φ x = true}.Nonempty) :

PMF UrnState

Literal listener (Eq. 15) for a meaning with non-empty extension: uniform on the extension.

Equations

HerbstrittFranke2019.literalListener φ h_nonempty = PMF.uniformOfFinset {x : HerbstrittFranke2019.UrnState | φ x = true} h_nonempty

Instances For

source

theorem HerbstrittFranke2019.simple_meaning_nonempty (u : SimpleExpr) :

{x : UrnState | u.meaning x = true}.Nonempty

All five simple expressions have non-empty extensions under HF's inferred thresholds — every meaning admits a literal listener.

Joint posterior marginalisations #

The pragmatic listener is P_PL(s, o, a | m); its marginals P(s, o | m) (over access) and P(s, a | m) (over obs) are direct corollaries of Core/Probability/JointPosterior.lean's marginalisation theorems applied to the joint posterior.

The structural form here works for any speaker kernel S and joint prior over (state × access) — paper-specific instantiation is the softmax-Hellinger speaker, which lives in the Hellinger-vs-KL section below.

source

noncomputable def HerbstrittFranke2019.pragmaticListener {Access : Type u_1} (S : UrnState × Access → PMF SimpleExpr) (joint : PMF (UrnState × Access)) (m : SimpleExpr) (h_marg : PMF.marginal S joint m ≠ 0) :

PMF (UrnState × Access)

HF Eq. 18: pragmatic listener as joint posterior. P_PL(s, a | m) ∝ S(m | s, a) · P(s, a).

Just PMF.posterior at α := UrnState × Access. The marginalisation theorems below are direct instantiations of posterior_fst_apply / posterior_snd_apply.

Equations

HerbstrittFranke2019.pragmaticListener S joint m h_marg = PMF.posterior S joint m h_marg

Instances For

source

theorem HerbstrittFranke2019.statePosterior_apply {Access : Type u_1} [Fintype Access] (S : UrnState × Access → PMF SimpleExpr) (joint : PMF (UrnState × Access)) (m : SimpleExpr) (h_marg : PMF.marginal S joint m ≠ 0) (s : UrnState) :

(pragmaticListener S joint m h_marg).fst s = (∑ a : Access, joint (s, a) * (S (s, a)) m) / PMF.marginal S joint m

HF Eq. 19/20 (marginal over Access): per-state marginal posterior. P(s | m) = ∑_a P_PL(s, a | m) = (∑_a P(s, a) · S(m | s, a)) / Z

source

theorem HerbstrittFranke2019.accessPosterior_apply {Access : Type u_1} [Fintype Access] [DecidableEq Access] (S : UrnState × Access → PMF SimpleExpr) (joint : PMF (UrnState × Access)) (m : SimpleExpr) (h_marg : PMF.marginal S joint m ≠ 0) (a : Access) :

(pragmaticListener S joint m h_marg).snd a = (∑ s : UrnState, joint (s, a) * (S (s, a)) m) / PMF.marginal S joint m

HF marginal over UrnState: per-access marginal posterior. Companion of statePosterior_apply for the Access component.

source

theorem HerbstrittFranke2019.statePosterior_lt_iff {Access : Type u_1} [Fintype Access] (S : UrnState × Access → PMF SimpleExpr) (joint : PMF (UrnState × Access)) (m : SimpleExpr) (h_marg : PMF.marginal S joint m ≠ 0) (s₁ s₂ : UrnState) :

(pragmaticListener S joint m h_marg).fst s₁ < (pragmaticListener S joint m h_marg).fst s₂ ↔ ∑ a : Access, joint (s₁, a) * (S (s₁, a)) m < ∑ a : Access, joint (s₂, a) * (S (s₂, a)) m

Comparison decomposition for the per-state marginal posterior. L1 favours state s₂ over s₁ after m iff the conditional joint sums favour it. Direct corollary of posterior_fst_lt_iff.

Hellinger vs KL: the architectural divergence point #

HF's central methodological choice: KL divergence makes any message whose extension fails to cover the speaker's belief support have infinite disutility, so the speaker can never consider it. Hellinger distance is uniformly bounded by 1, so all messages have bounded disutility.

The two theorems below witness this directly:

klDiv_eq_top_when_zero_support: KL = ∞ whenever some support point of P has zero mass under Q.
hellingerDistSq_le_one: H² ≤ 1 always, so HD ≤ 1.

Together they make the architectural divergence visible at theorem level: there exist (P, Q) pairs where KL-utility excludes Q as a message choice but Hellinger-utility admits it.

source

theorem HerbstrittFranke2019.klDiv_eq_top_when_zero_support {α : Type u_1} [Fintype α] [MeasurableSpace α] [MeasurableSingletonClass α] (P Q : PMF α) (s₀ : α) (hP : P s₀ ≠ 0) (hQ : Q s₀ = 0) :

P.klDiv Q = ⊤

KL divergence is ∞ when Q has zero mass on P's support. Direct from mathlib klDiv_of_not_ac: KL = ∞ when P.toMeasure is not absolutely continuous w.r.t. Q.toMeasure. Absolute continuity fails at s₀ because Q s₀ = 0 but P s₀ ≠ 0.

source

theorem HerbstrittFranke2019.hellingerDistSq_le_one {α : Type u_1} [Fintype α] (P Q : PMF α) :

P.hellingerDistSq Q ≤ 1

Hellinger squared distance is ≤ 1 on PMFs: H²(P, Q) = 1 - BC(P, Q) where BC ∈ [0, 1] (Cauchy-Schwarz on √P, √Q, both with sum-to-1). Bounded above by 1 because BC ≥ 0.

The lower bound 0 ≤ BC is immediate (each term √(P·Q) ≥ 0).

source

theorem HerbstrittFranke2019.hellingerDist_le_one {α : Type u_1} [Fintype α] (P Q : PMF α) :

P.hellingerDist Q ≤ 1

Hellinger distance is ≤ 1 on PMFs: by H = √H² and H² ≤ 1.

source

theorem HerbstrittFranke2019.hellinger_admits_what_KL_excludes :

(PMF.pure ⟨9, ⋯⟩).klDiv (PMF.pure ⟨10, ⋯⟩) = ⊤ ∧ (PMF.pure ⟨9, ⋯⟩).hellingerDist (PMF.pure ⟨10, ⋯⟩) ≤ 1

The architectural divergence (concrete): there exist PMFs P, Q on UrnState such that KL(P ‖ Q) = ∞ but HD(P, Q) ≤ 1.

Witness: P = pure 9 (speaker's degenerate belief at 9/10 red), Q = pure 10 (literal listener for certainly — δ on s=10). Under KL: speaker can never say "certainly". Under Hellinger: speaker considers "certainly" with bounded disutility. This is HF's key architectural claim.

Architectural contrast with [GS13] #

This model and G&S 2013 share the hypergeometric observation kernel (both use PMF.hypergeometric with N=10 and N=3 respectively) and the RSA architecture (literal listener, softmax speaker, posterior pragmatic listener). The only architectural difference is the speaker utility:

Component	G&S 2013	HF 2019
State space	`Fin 4` (objects)	`Fin 11` (red balls)
Observation	hypergeometric	hypergeometric
Utility	`-KL(bel ‖ L0)`	`-HD(bel, L0)`
Admits "true enough"?	NO (KL = ∞ on zero-support)	YES (HD ≤ 1)

hellinger_admits_what_KL_excludes makes this difference visible: the same speaker belief pure 9 paired with the literal listener for certainly (pure 10) yields KL = ∞ but HD ≤ 1. Under G&S 2013's KL utility this speaker would be excluded from saying "certainly RED"; under HF's Hellinger utility she can. The empirical evidence (HF's production data, Section 5) shows speakers DO say "certainly" at 9/10.

The Bretagnolle–Huber inequality (PMF.two_hellingerDistSq_le_klDiv in Core/Probability/Entropy.lean) gives the formal direction 2 · H² ≤ KL on the finite side: where KL is finite, H² is too, and bounded. Hellinger is the strictly weaker (more permissive) divergence — exactly what HF wants for the speaker utility.

Modal concord and `might be possible` #

The paper finds that the compositional model underpredicts how often speakers choose "might be possible": although its truth condition is weak (posterior probability of "possible" exceeds θ_might = 0.332), the pragmatic speaker prefers logically stronger messages and so assigns the doubly-hedged form a low choice rate — yet participants select it far more often than predicted. A modal concord reading, collapsing the two modals to a single "possible" [Zei07], would close the gap. Qualitative; not formalised here.

Documentation

Linglib.Studies.HerbstrittFranke2019

[HF19]: complex probability expressions via RSA on `PMF` #

Main definitions #

Main results #

Domain types #

Belief formation via the hypergeometric kernel #

`ℚ`-valued speaker belief for finite-arithmetic verification #

Simple expression semantics #

Compositional complex-expression semantics #

RSA model on PMF #

Joint posterior marginalisations #

Hellinger vs KL: the architectural divergence point #

Architectural contrast with [GS13] #

[HF19]: complex probability expressions via RSA on PMF #

Main definitions #

Main results #

Domain types #

Belief formation via the hypergeometric kernel #

ℚ-valued speaker belief for finite-arithmetic verification #

Simple expression semantics #

Compositional complex-expression semantics #

RSA model on PMF #

Joint posterior marginalisations #

Hellinger vs KL: the architectural divergence point #

Architectural contrast with [GS13] #

Modal concord and might be possible #

[HF19]: complex probability expressions via RSA on `PMF` #

`ℚ`-valued speaker belief for finite-arithmetic verification #

Modal concord and `might be possible` #