[GS13] on mathlib PMF #
PMF formalization of GS2013 using PMF.hypergeometric as the observation
kernel primitive. The model is parameterized over a : Access throughout;
each downstream operator (speakerBelief, qualityOk, s1Score, S1g,
marginalSpeaker, L1) takes (a, k) where k : Fin (a.val + 1) is the
observed count.
PMF stack (one definition each, parameterized) #
worldPrior : PMF WorldState— uniform on 4 statesobsKernel a w : PMF (Fin (a.val + 1))—PMF.hypergeometric 3 w.toNat a.valspeakerBelief a k : PMF WorldState—PMF.posterior (obsKernel a) worldPrior kS1g m α a k h : PMF U— softmax-of-expected-log over the speaker's beliefmarginalSpeaker m α a w hCov : PMF U—(obsKernel a w).bind (S1g a)L1 m α a u hMarg : PMF WorldState—PMF.posterior (marginalSpeaker a) worldPrior u
Silence-extended findings #
All 11 paper findings are stated against the silence-extended utterance
space RSA.WithSilence U — the cover hypothesis cover_silent is universal
because silence is qOk at every observation. This dissolves the
(access, word) cells that GS2013's "sensible situations" restriction
excludes, which would otherwise block formalization without silence.
Model substrate: world states, utterance enums, meanings, access #
World states: how many of 3 objects have the property.
- s0 : WorldState
- s1 : WorldState
- s2 : WorldState
- s3 : WorldState
Instances For
Equations
- GoodmanStuhlmuller2013.instDecidableEqWorldState x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
Equations
- One or more equations did not get rendered due to their size.
Equations
- GoodmanStuhlmuller2013.instDecidableEqQUtt x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- GoodmanStuhlmuller2013.instReprQUtt = { reprPrec := GoodmanStuhlmuller2013.instReprQUtt.repr }
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Equations
- GoodmanStuhlmuller2013.qMeaning GoodmanStuhlmuller2013.QUtt.none_ x✝ = (x✝.toNat == 0)
- GoodmanStuhlmuller2013.qMeaning GoodmanStuhlmuller2013.QUtt.some_ x✝ = decide (x✝.toNat ≥ 1)
- GoodmanStuhlmuller2013.qMeaning GoodmanStuhlmuller2013.QUtt.all x✝ = (x✝.toNat == 3)
Instances For
Equations
- GoodmanStuhlmuller2013.instDecidableEqNumUtt x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Equations
- GoodmanStuhlmuller2013.lbMeaning GoodmanStuhlmuller2013.NumUtt.one x✝ = decide (x✝.toNat ≥ 1)
- GoodmanStuhlmuller2013.lbMeaning GoodmanStuhlmuller2013.NumUtt.two x✝ = decide (x✝.toNat ≥ 2)
- GoodmanStuhlmuller2013.lbMeaning GoodmanStuhlmuller2013.NumUtt.three x✝ = decide (x✝.toNat ≥ 3)
Instances For
Speaker access = number of objects (out of 3) the speaker observes:
a Fin 4 indexed at 0..3, so a.val is the access value and
Fin (a.val + 1) reduces in type position when a is concrete. The paper
restricts to {1, 2, 3}; access 0 (speaker observes nothing) is well-defined
but unused.
Equations
- GoodmanStuhlmuller2013.Access = Fin 4
Instances For
Access value 1 (speaker observes 1 of 3 objects).
Equations
Instances For
Access value 2 (speaker observes 2 of 3 objects).
Equations
Instances For
Access value 3 — full access (speaker observes all 3 objects).
Equations
Instances For
World prior — uniform on WorldState #
Equations
- GoodmanStuhlmuller2013.worldPrior = PMF.uniformOfFintype GoodmanStuhlmuller2013.WorldState
Instances For
The uniform world prior assigns ENNReal.ofReal (1/4) to every world.
Hypergeometric observation kernel #
obsKernel a w : PMF (Fin (a.val + 1)) is the hypergeometric distribution
over count outcomes when the speaker observes a.val of the 3 objects, of
which w.toNat have the property.
Equations
- GoodmanStuhlmuller2013.obsKernel a w = PMF.hypergeometric 3 w.toNat ↑a ⋯ ⋯
Instances For
Closed-form observation kernel value: C(K, k) · C(N-K, n-k) / C(N, n).
The kernel is non-zero iff the count is hypergeometric-feasible.
obsKernel value in ENNReal.ofReal form — the shape the value table
below reduces to by norm_num.
Observation compatibility and witness worlds #
A world s is compatible with observing k.val successes out of a.val
draws iff the hypergeometric numerator at (K=s.toNat, k=k.val) is non-zero.
Equations
- GoodmanStuhlmuller2013.obsCompatible a k s = (decide (↑k ≤ s.toNat) && decide (↑a - ↑k ≤ 3 - s.toNat))
Instances For
Speaker belief — PMF.posterior of obsKernel #
Speaker's posterior over worlds given a count observation.
Equations
Instances For
Quality filter, literal probabilities, softmax speaker #
An utterance u is quality-OK at observation (a, k) iff u is true at
every world compatible with (a, k).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Uniform-on-extension literal probability.
Equations
- GoodmanStuhlmuller2013.lexReal m u s = if m u s = true then (↑(RSA.extensionOf m u).card)⁻¹ else 0
Instances For
toReal projection of speakerBelief.
Equations
- GoodmanStuhlmuller2013.beliefReal a k s = ((GoodmanStuhlmuller2013.speakerBelief a k) s).toReal
Instances For
Speaker score: softmaxBelief over the literal probabilities, filtered
by qualityOk.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Speaker conditional on observation.
Equations
- GoodmanStuhlmuller2013.S1g m α a k h0 = PMF.normalize (fun (x : U) => GoodmanStuhlmuller2013.s1Score m α a k x) h0 ⋯
Instances For
Marginal speaker and pragmatic listener #
Since marginalSpeaker uses PMF.bind (not bindOnSupport), S1g must be
defined at every k, not just kernel-supported ones. The cover hypothesis
hCov therefore quantifies over all k : Fin (a.val + 1). With
WithSilence, this is automatic via cover_silent.
Equations
- GoodmanStuhlmuller2013.marginalSpeaker m α a w hCov = (GoodmanStuhlmuller2013.obsKernel a w).bind fun (k : Fin (↑a + 1)) => GoodmanStuhlmuller2013.S1g m α a k ⋯
Instances For
Pragmatic listener: Bayesian inversion of the marginal speaker.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Silence is universally qOk: liftMeaning m none = true at every world,
so the cover hypothesis is universally satisfiable.
Observation-kernel value table #
Closed-form values for obsKernel a w k, derived from obsKernel_eq_ofReal
by evaluating the Nat.choose arithmetic. Only the cells the findings below
rewrite with are stated.
marginalSpeaker collapse #
At full access the kernel concentrates on a single k, so the bind collapses
to one S1g evaluation; at partial access it expands to the 2- or 3-term
kernel-weighted sum.
Extension cardinalities for the silence-extended models #
Generic s1Score evaluation #
When qOk u passes, liftMeaning-lifted utterances have a uniform lex
value on the belief support (because: qOk ⇒ m u s = true at all
compatible s ⊇ belief support ⇒ lex u s = 1/(card extension)).
softmaxBelief_uniform_on_support then collapses
s1Score = ENNReal.ofReal (1/c).
s1Score value table #
Per-(meaning, access, count, utterance) closed forms. Positive cells go
through s1Score_uniform_apply; qOk failures are zero by
softmaxBelief_eq_zero_of_not_qOk.
Partition functions #
Findings #
The 11 paper findings, stated against WithSilence + liftMeaning so the
cover hypothesis is automatically satisfied via cover_silent.
These cells are computational evaluations of the model at specific (meaning, access, world-pair, utterance) tuples, in the paper's expository regime (α = 1, uniform prior), not the fitted regime (higher α, non-uniform prior). Where the fitted model predicts only a marginal implicature (access 2), directions can differ from the plotted predictions.
They are NOT corollaries of a single information-theoretic cancellation
theorem. The structural theorem that IS provable
(RSA.mutualInfoStateObs_bind_noise_le in
Linglib/Pragmatics/RSA/Cancellation.lean) only gives obs-level
MI cancellation: MI(state; obs_n) ≤ MI(state; obs_i). The per-(world-pair)
orderings these cells encode are utterance-level claims that depend on the
specific lex shape, not just on kernel informativity — see Cancellation.lean
§3 for why utterance-level MI cancellation isn't a clean DPI corollary.
Anyone wanting to prove a structural property about the model should use the
obs-level cancellation theorem there.
Finding 1: at full access, some favors s2 > s3 (scalar implicature).
Finding 4: at full access, two favors s2 > s3 (upper-bounded reading).
.a1 minimal-access findings #
Each shows marginalSpeaker (smaller-state) ≤ marginalSpeaker (larger-state),
so ¬ L1 (smaller) > L1 (larger). At (.a1, k=0) the target utterance has
S1g = 0 (qOk fails); at (.a1, k=1) it has S1g = 4/7. So the comparison
reduces to obsKernel(smaller)(k=1) ≤ obsKernel(larger)(k=1).
Finding 3: at partial access, some does NOT favor s2 > s3 (equality).
Finding 5: at partial access, two does NOT favor s2 > s3 (weakened).
Finding 10 (HEADLINE): at partial access, one favors s1 > s3.