Documentation

Linglib.Phenomena.ScalarImplicatures.Studies.GoodmanStuhlmuller2013PMF

@cite{goodman-stuhlmuller-2013} on mathlib PMF #

@cite{goodman-stuhlmuller-2013}

PMF reformulation of GS2013 using PMF.hypergeometric as the observation kernel primitive. The model is parameterized over a : Access throughout — there is no per-.a1/.a2/.a3 substrate; each downstream operator (speakerBelief, qualityOk, s1Score, S1g, marginalSpeaker, L1) takes (a, k) where k : Fin (a.toNat + 1) is the observed count.

PMF stack (one definition each, parameterized) #

Silence-extended findings #

All 11 paper findings are stated against the silence-extended utterance space WithSilence U — the cover hypothesis cover_silent is universal because silence is qOk at every observation. This dissolves the (access, word) ∉ {sensible} defects that block formalization without silence (paper p. 180 "sensible situations").

§1. Substrate: world states, utterance enums, meaning functions, access #

World states: how many of 3 objects have the property.

Instances For
    Equations
    • One or more equations did not get rendered due to their size.
    Instances For
      @[implicit_reducible]
      Equations
      • One or more equations did not get rendered due to their size.

      Quantifier alternative set.

      Instances For
        @[implicit_reducible]
        Equations
        Equations
        • One or more equations did not get rendered due to their size.
        Instances For
          @[implicit_reducible]
          Equations
          • One or more equations did not get rendered due to their size.

          Numeral alternative set (lower-bound semantics).

          Instances For
            @[implicit_reducible]
            Equations
            Equations
            • One or more equations did not get rendered due to their size.
            Instances For
              @[implicit_reducible]
              Equations
              • One or more equations did not get rendered due to their size.
              @[reducible, inline]

              Speaker access = number of objects (out of 3) the speaker observes. A Fin 4 indexed at 0..3. The paper restricts to {1, 2, 3}; access=0 (speaker observes nothing) is well-defined but unused.

              Carrying this as Fin 4 instead of a custom Access enum eliminates the Access.toNat indirection — a.val is just the access value and Fin (a.val + 1) reduces in type position when a is concrete.

              Equations
              Instances For
                @[reducible, inline]

                Access value 1 (speaker observes 1 of 3 objects).

                Equations
                Instances For
                  @[reducible, inline]

                  Access value 2 (speaker observes 2 of 3 objects).

                  Equations
                  Instances For
                    @[reducible, inline]

                    Access value 3 — full access (speaker observes all 3 objects).

                    Equations
                    Instances For
                      @[reducible, inline]

                      Compatibility shim: a.toNat = a.val since Access = Fin 4.

                      Equations
                      Instances For

                        §2. World prior — uniform on WorldState #

                        The uniform world prior assigns ENNReal.ofReal (1/4) to every world.

                        §3. Hypergeometric observation kernel #

                        obsKernel a w : PMF (Fin (a.toNat + 1)) is the hypergeometric distribution over count outcomes when the speaker observes a.toNat of the 3 objects, of which w.toNat have the property. Built directly from PMF.hypergeometric.

                        theorem Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.obsKernel_apply (a : Access) (w : WorldState) (k : Fin (a.toNat + 1)) :
                        (obsKernel a w) k = (w.toNat.choose k * (3 - w.toNat).choose (a.toNat - k)) / (Nat.choose 3 a.toNat)

                        Closed-form observation kernel value: C(K, k) · C(N-K, n-k) / C(N, n).

                        theorem Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.obsKernel_apply_ne_zero_iff (a : Access) (w : WorldState) (k : Fin (a.toNat + 1)) :
                        (obsKernel a w) k 0 k w.toNat a.toNat - k 3 - w.toNat

                        The kernel is non-zero iff the count is hypergeometric-feasible.

                        §4. Speaker belief — PMF.posterior of obsKernel #

                        Speaker's posterior over worlds given a count observation.

                        Equations
                        • One or more equations did not get rendered due to their size.
                        Instances For

                          §5. obsCompatible — combinatorial feasibility #

                          A world s is compatible with observing k.val successes out of a.toNat draws iff the hypergeometric numerator at (K=s.toNat, k=k.val) is non-zero.

                          Equations
                          Instances For

                            §6. qualityOk — utterance quality filter #

                            def Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.qualityOk {U : Type u_1} (m : UWorldStateBool) (a : Access) (k : Fin (a.toNat + 1)) (u : U) :
                            Bool

                            An utterance u is quality-OK at observation (a, k) iff u is true at every world compatible with (a, k).

                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For

                              §7. lexReal — uniform-on-extension literal probability #

                              noncomputable def Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.lexReal {U : Type u_1} [Fintype U] (meaning : UWorldStateBool) (u : U) (s : WorldState) :
                              Equations
                              Instances For

                                §8. beliefReal — toReal projection of speakerBelief #

                                §9. s1Score — softmaxBelief instance #

                                @[reducible, inline]
                                noncomputable abbrev Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.s1Score {U : Type u_1} [Fintype U] (meaning : UWorldStateBool) (α : ) (a : Access) (k : Fin (a.toNat + 1)) (u : U) :
                                ENNReal
                                Equations
                                • One or more equations did not get rendered due to their size.
                                Instances For

                                  §10. S1g — speaker conditional on observation #

                                  noncomputable def Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.S1g {U : Type u_1} [Fintype U] (meaning : UWorldStateBool) (α : ) (a : Access) (k : Fin (a.toNat + 1)) (h0 : ∑' (u : U), s1Score meaning α a k u 0) :
                                  PMF U
                                  Equations
                                  Instances For

                                    §11. marginalSpeaker — PMF.bind over the obs kernel #

                                    Since we use PMF.bind (not bindOnSupport), we need S1g defined at every k, not just kernel-supported ones. The cover hypothesis hCov therefore quantifies over all k : Fin (a.toNat + 1). With WithSilence, this is automatic via cover_silent (silence is qOk everywhere).

                                    noncomputable def Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.marginalSpeaker {U : Type u_1} [Fintype U] (meaning : UWorldStateBool) (α : ) (a : Access) (w : WorldState) (hCov : ∀ (k : Fin (a.toNat + 1)), ∃ (u : U), qualityOk meaning a k u = true) :
                                    PMF U
                                    Equations
                                    • One or more equations did not get rendered due to their size.
                                    Instances For

                                      §12. L1 — Bayesian inversion of the marginal speaker #

                                      noncomputable def Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.L1 {U : Type u_1} [Fintype U] (meaning : UWorldStateBool) (α : ) (a : Access) (hCov : ∀ (k : Fin (a.toNat + 1)), ∃ (u : U), qualityOk meaning a k u = true) (u : U) (hMarg : PMF.marginal (fun (w : WorldState) => marginalSpeaker meaning α a w hCov) worldPrior u 0) :
                                      Equations
                                      • One or more equations did not get rendered due to their size.
                                      Instances For

                                        §13. cover_silent — silence is universally qOk #

                                        liftMeaning m none = true at every world, so silence passes qualityOk at every observation. The cover hypothesis is universally satisfiable.

                                        theorem Phenomena.ScalarImplicatures.GoodmanStuhlmuller2013.PMF.cover_silent {U : Type u_1} (m : UWorldStateBool) (a : Access) (k : Fin (a.toNat + 1)) :
                                        ∃ (u : RSA.WithSilence U), qualityOk (RSA.liftMeaning m) a k u = true

                                        §14. obsKernel value lemmas (per (a, w, k)) #

                                        Closed-form values for obsKernel a w k at each combination — derived from obsKernel_apply by unfolding the Nat.choose arithmetic. These are the numerical foundations for the marginalSpeaker / findings layer below.

                                        §15. marginalSpeaker collapse — .a3 (concentrated kernel) #

                                        At full access, obsKernel .a3 w puts all mass on a single k = w.toNat, so marginalSpeaker = (obsKernel .a3 w).bind (S1g .a3 _) collapses to a single S1g evaluation at the diagonal k. Polymorphic over w — replaces what was previously 4 per-world specializations.

                                        §17. Extension cardinalities for silence-extended models #

                                        Per-(meaning, utterance) extension cardinalities, locally tagged for pmf_eval_simps so the universal s1Score_liftMeaning_apply_eq_ite (§17b) reduces to concrete ofReal((card)⁻¹) values.

                                        §17a. Generic s1Score lemmas (work for any (a, k)) #

                                        When qOk u passes, liftMeaning-lifted utterances have a uniform lex value on the belief support (because: qOkm u s = true at all compatible s ⊇ belief support ⇒ lex u s = 1/(card extension)).

                                        softmaxBelief_uniform_on_support then collapses s1Score = ENNReal.ofReal (1/c).

                                        §17b. Universal pmf_eval-friendly if-form for s1Score (liftMeaning _) #

                                        The universal s1Score_uniform_apply is hypothesis-laden (h_qok, h_card, h_pos) so simp can't use it. s1Score_liftMeaning_apply_eq_ite below is parameterized over the meaning function m and has no free preconditions — the qOk branch is encoded as a decidable if, and extension nonemptiness is proved universally via the witnessWorld helper.

                                        This collapses what previously required two paper-tagged lemmas (one per meaning) into one polymorphic lemma. The local attribute [pmf_eval_simps] keeps the tag file-scoped so it does not pollute the substrate set with a private paper lemma (audit hygiene rule).

                                        pmf_eval smoke tests for the if-form variants #

                                        §19a. s1Score evaluations for .a1 #

                                        At (.a1, k=0): compatible worlds = {s0, s1, s2}. Only silence is qOk; all other quantifiers / numerals are false at s0 ⇒ qOk fails ⇒ score 0.

                                        At (.a1, k=1): compatible worlds = {s1, s2, s3}. Silence and some_/one are qOk; none_, all, two, three fail.

                                        §19b. Sum-of-scores at .a1 #

                                        §19c. S1g per-(a1, k, target-utt) #

                                        Computed via S1g = normalize (s1Score), with the score and the partition function from §19a/b.

                                        §19d. marginalSpeaker collapse — .a1 #

                                        For each (.a1, w), expand bind = ∑ k : Fin 2, obsKernel * S1g.

                                        §19e. s1Score evaluations for .a2 #

                                        §19f. Sum-of-scores at .a2 #

                                        §19g. S1g per-(a2, k, target-utt) #

                                        §19h. marginalSpeaker collapse — .a2 (3-element bind) #

                                        §20. Findings (silence-extended) #

                                        The 11 paper findings restated against WithSilence + liftMeaning so the cover hypothesis is automatically satisfied via cover_silent.

                                        Status: illustrations, not corollaries of a single structural theorem #

                                        The 11 cells below are computational evaluations of the model at specific (meaning, access, world-pair, utterance) tuples. They demonstrate that the model produces the predictions plotted in Fig. 2 (A, B).

                                        They are NOT corollaries of a single information-theoretic cancellation theorem. The structural theorem that IS provable (RSA.mutualInfoStateObs_bind_noise_le in Linglib/Theories/Pragmatics/RSA/Cancellation.lean) only gives obs-level MI cancellation: MI(state; obs_n) ≤ MI(state; obs_i). The per-(world-pair) orderings these cells encode are utterance-level claims that depend on the specific lex shape, not just on kernel informativity — see Cancellation.lean §3 for why utterance-level MI cancellation isn't a clean DPI corollary.

                                        Demoted to examples rather than named theorems: they're computational illustrations of model behavior, not load-bearing claims. Anyone wanting to prove a structural property about the model should use the obs-level cancellation theorem in Cancellation.lean.

                                        .a1 minimal-access findings #

                                        Each shows marginalSpeaker (smaller-state) ≤ marginalSpeaker (larger-state), so ¬ L1 (smaller) > L1 (larger). Proof: at (.a1, k=0) the target utterance has S1g = 0 (qOk fails); at (.a1, k=1) it has S1g = 4/7. So the comparison reduces to obsKernel(smaller)(k=1) ≤ obsKernel(larger)(k=1).

                                        .a2 partial-access findings #