@cite{dong-etal-2026} on mathlib PMF (binary identification) #
A PMF-shaped formalisation of the paper's four findings on binary identification. The task is intentionally degenerate: at any target, exactly one utterance applies (the matching pair).
All four findings are therefore vacuous-zero: the wrong guess has L0 = 0, so its S1/L1 mass is 0, and "correct > wrong" holds trivially.
The stakes parameter k (animal: 1, medical: 10) is irrelevant to the qualitative predictions because the L0 gate zeros out incorrect responses. For the simplest PMF formulation we set α = 1 and use a unit cost factor (no stakes-dependence); both bundled-API configs (animalCfg, medicalCfg) in the paper collapse to the same PMF model after the qualitative-only migration.
Binary identification task (simplified from the paper's Mixed-Stakes 20 Questions, which uses 100 animals / 15 diseases).
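The declaration itself is not rendered on this page; as a hedged reconstruction (the constructor names .t₁/.t₂ are taken from the findings below, and the `deriving` clause from the instance equations on this page), the target type is presumably a two-element inductive:

```lean
-- Hypothetical sketch of the two-element target type. Constructor
-- names are assumed from `.t₁`/`.t₂` in the findings; `deriving`
-- matches the DecidableEq/Repr instance equations shown here.
inductive Target where
  | t₁
  | t₂
  deriving DecidableEq, Repr
```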
Equations
- DongEtAl2026.PMF.instDecidableEqTarget x y = if h : x.ctorIdx = y.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- DongEtAl2026.PMF.instReprTarget = { reprPrec := DongEtAl2026.PMF.instReprTarget.repr }
Boolean match: does guess match target?
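A minimal sketch of this Boolean meaning function, assuming it is plain equality on Target (its rendered equation is missing from this page):

```lean
-- Hypothetical sketch: the guess is true of the target iff they
-- coincide. `==` uses the derived DecidableEq instance on Target.
def targetMatches (guess target : Target) : Bool :=
  guess == target
```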
§1. L0 — uniform on extension via RSA.L0OfBoolMeaning #
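Going by the section title, L0 is presumably obtained by instantiating RSA.L0OfBoolMeaning with targetMatches; a hedged sketch (the exact signature and argument order of L0OfBoolMeaning are assumptions):

```lean
-- Hypothetical sketch: the literal listener is uniform on the worlds
-- where the utterance is true. In the binary task the extension of u
-- is the singleton {u}, so L0 u is the point mass at u.
noncomputable def L0 (u : Target) : PMF Target :=
  RSA.L0OfBoolMeaning (fun u w => targetMatches u w) u  -- signature assumed
```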
§2. S1 — speaker as PMF.normalize of L0 #
Pragmatic speaker: each world has at least one matching utterance, so no fallback is needed (every world is non-degenerate in the binary task).
Equations
- DongEtAl2026.PMF.S1 w = PMF.normalize (fun (u : DongEtAl2026.PMF.Target) => (DongEtAl2026.PMF.L0 u) w) ⋯ ⋯
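Unfolding the normalization at a fixed world, the speaker's choice probability is (standard RSA with α = 1, matching the equation above):

$$
S_1(u \mid w) \;=\; \frac{L_0(w \mid u)}{\sum_{u'} L_0(w \mid u')},
\qquad\text{so}\quad
S_1(t_2 \mid t_1) = \frac{0}{1 + 0} = 0,
\quad
S_1(t_1 \mid t_1) = 1.
$$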
§3. L1 — Bayesian inversion against the uniform world prior #
Equations
- DongEtAl2026.PMF.worldPrior = PMF.uniformOfFintype DongEtAl2026.PMF.Target
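Bayesian inversion against the uniform world prior then gives, consistent with the definitions above:

$$
L_1(w \mid u) \;=\; \frac{P(w)\, S_1(u \mid w)}{\sum_{w'} P(w')\, S_1(u \mid w')},
$$

and since $S_1(u \mid w) = 0$ whenever $w \neq u$, the pragmatic listener is again a point mass at the matching target: $L_1(t_2 \mid t_1) = 0$ and $L_1(t_1 \mid t_1) = 1$.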
§4. Findings — all vacuous-zero #
S1 prefers the correct guess: at world .t₁, the utterance .t₂ has L0 = 0 (targetMatches .t₂ .t₁ = false), so S1 .t₁ .t₂ = 0.
Same-base comparison (same world .t₁, different utterances): use normalize_lt_iff_lt to reduce the claim to a comparison of the underlying L0 scores.
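The reduction described above might be sketched as follows (only normalize_lt_iff_lt, S1, and L0 come from this page; the exact lemma statement and namespaces are assumptions, and the proof is left as a stub):

```lean
-- Hypothetical proof sketch: compare two utterances at the same world
-- .t₁ by reducing the normalized S1 masses to the underlying L0 scores.
example : S1 .t₁ .t₂ < S1 .t₁ .t₁ := by
  rw [S1]                    -- unfold S1 to PMF.normalize on both sides
  rw [normalize_lt_iff_lt]   -- exact statement/namespace assumed
  -- remaining goal (schematically): L0 .t₂ .t₁ < L0 .t₁ .t₁, i.e. 0 < 1
  sorry
```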