Documentation

Linglib.Studies.GoodmanStuhlmuller2013

[GS13] on mathlib PMF #

[GS13]

PMF formalization of GS2013 using PMF.hypergeometric as the observation kernel primitive. The model is parameterized over a : Access throughout; each downstream operator (speakerBelief, qualityOk, s1Score, S1g, marginalSpeaker, L1) takes (a, k) where k : Fin (a.val + 1) is the observed count.

PMF stack (one definition each, parameterized) #

Silence-extended findings #

All 11 paper findings are stated against the silence-extended utterance space RSA.WithSilence U — the cover hypothesis cover_silent is universal because silence is qOk at every observation. This dissolves the (access, word) cells that GS2013's "sensible situations" restriction excludes, which would otherwise block formalization without silence.

Model substrate: world states, utterance enums, meanings, access #

World states: how many of 3 objects have the property.

Instances For
    @[implicit_reducible]
    Equations
    Equations
    • One or more equations did not get rendered due to their size.
    Instances For
      @[implicit_reducible]
      Equations
      • One or more equations did not get rendered due to their size.

      Quantifier alternative set.

      Instances For
        @[implicit_reducible]
        Equations
        Equations
        • One or more equations did not get rendered due to their size.
        Instances For
          @[implicit_reducible]
          Equations
          • One or more equations did not get rendered due to their size.

          Numeral alternative set (lower-bound semantics).

          Instances For
            @[implicit_reducible]
            Equations
            Equations
            • One or more equations did not get rendered due to their size.
            Instances For
              @[implicit_reducible]
              Equations
              • One or more equations did not get rendered due to their size.
              @[reducible, inline]

              Speaker access = number of objects (out of 3) the speaker observes: a Fin 4 indexed at 0..3, so a.val is the access value and Fin (a.val + 1) reduces in type position when a is concrete. The paper restricts to {1, 2, 3}; access 0 (speaker observes nothing) is well-defined but unused.

              Equations
              Instances For
                @[reducible, inline]

                Access value 1 (speaker observes 1 of 3 objects).

                Equations
                Instances For
                  @[reducible, inline]

                  Access value 2 (speaker observes 2 of 3 objects).

                  Equations
                  Instances For
                    @[reducible, inline]

                    Access value 3 — full access (speaker observes all 3 objects).

                    Equations
                    Instances For

                      World prior — uniform on WorldState #

                      theorem GoodmanStuhlmuller2013.worldPrior_apply (s : WorldState) :
                      worldPrior s = ENNReal.ofReal (1 / 4)

                      The uniform world prior assigns ENNReal.ofReal (1/4) to every world.

                      Hypergeometric observation kernel #

                      obsKernel a w : PMF (Fin (a.val + 1)) is the hypergeometric distribution over count outcomes when the speaker observes a.val of the 3 objects, of which w.toNat have the property.

                      noncomputable def GoodmanStuhlmuller2013.obsKernel (a : Access) (w : WorldState) :
                      PMF (Fin (a + 1))
                      Equations
                      Instances For
                        theorem GoodmanStuhlmuller2013.obsKernel_apply (a : Access) (w : WorldState) (k : Fin (a + 1)) :
                        (obsKernel a w) k = (w.toNat.choose k * (3 - w.toNat).choose (a - k)) / (Nat.choose 3 a)

                        Closed-form observation kernel value: C(K, k) · C(N-K, n-k) / C(N, n).

                        theorem GoodmanStuhlmuller2013.obsKernel_apply_ne_zero_iff (a : Access) (w : WorldState) (k : Fin (a + 1)) :
                        (obsKernel a w) k 0 k w.toNat a - k 3 - w.toNat

                        The kernel is non-zero iff the count is hypergeometric-feasible.

                        theorem GoodmanStuhlmuller2013.obsKernel_eq_ofReal (a : Access) (w : WorldState) (k : Fin (a + 1)) :
                        (obsKernel a w) k = ENNReal.ofReal ((w.toNat.choose k) * ((3 - w.toNat).choose (a - k)) / (Nat.choose 3 a))

                        obsKernel value in ENNReal.ofReal form — the shape the value table below reduces to by norm_num.

                        Observation compatibility and witness worlds #

                        def GoodmanStuhlmuller2013.obsCompatible (a : Access) (k : Fin (a + 1)) (s : WorldState) :
                        Bool

                        A world s is compatible with observing k.val successes out of a.val draws iff the hypergeometric numerator at (K=s.toNat, k=k.val) is non-zero.

                        Equations
                        Instances For
                          theorem GoodmanStuhlmuller2013.obsCompatible_iff {a : Access} {k : Fin (a + 1)} {s : WorldState} :
                          obsCompatible a k s = true k s.toNat a - k 3 - s.toNat

                          Speaker belief — PMF.posterior of obsKernel #

                          noncomputable def GoodmanStuhlmuller2013.speakerBelief (a : Access) (k : Fin (a + 1)) :

                          Speaker's posterior over worlds given a count observation.

                          Equations
                          Instances For

                            Quality filter, literal probabilities, softmax speaker #

                            def GoodmanStuhlmuller2013.qualityOk {U : Type u_1} (m : UWorldStateBool) (a : Access) (k : Fin (a + 1)) (u : U) :
                            Bool

                            An utterance u is quality-OK at observation (a, k) iff u is true at every world compatible with (a, k).

                            Equations
                            • One or more equations did not get rendered due to their size.
                            Instances For
                              noncomputable def GoodmanStuhlmuller2013.lexReal {U : Type u_1} [Fintype U] (m : UWorldStateBool) (u : U) (s : WorldState) :

                              Uniform-on-extension literal probability.

                              Equations
                              Instances For
                                noncomputable def GoodmanStuhlmuller2013.beliefReal (a : Access) (k : Fin (a + 1)) (s : WorldState) :

                                toReal projection of speakerBelief.

                                Equations
                                Instances For
                                  @[reducible, inline]
                                  noncomputable abbrev GoodmanStuhlmuller2013.s1Score {U : Type u_1} [Fintype U] (m : UWorldStateBool) (α : ) (a : Access) (k : Fin (a + 1)) (u : U) :
                                  ENNReal

                                  Speaker score: softmaxBelief over the literal probabilities, filtered by qualityOk.

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  Instances For
                                    noncomputable def GoodmanStuhlmuller2013.S1g {U : Type u_1} [Fintype U] (m : UWorldStateBool) (α : ) (a : Access) (k : Fin (a + 1)) (h0 : ∑' (u : U), s1Score m α a k u 0) :
                                    PMF U

                                    Speaker conditional on observation.

                                    Equations
                                    Instances For

                                      Marginal speaker and pragmatic listener #

                                      Since marginalSpeaker uses PMF.bind (not bindOnSupport), S1g must be defined at every k, not just kernel-supported ones. The cover hypothesis hCov therefore quantifies over all k : Fin (a.val + 1). With WithSilence, this is automatic via cover_silent.

                                      noncomputable def GoodmanStuhlmuller2013.marginalSpeaker {U : Type u_1} [Fintype U] (m : UWorldStateBool) (α : ) (a : Access) (w : WorldState) (hCov : ∀ (k : Fin (a + 1)), ∃ (u : U), qualityOk m a k u = true) :
                                      PMF U
                                      Equations
                                      Instances For
                                        noncomputable def GoodmanStuhlmuller2013.L1 {U : Type u_1} [Fintype U] (m : UWorldStateBool) (α : ) (a : Access) (hCov : ∀ (k : Fin (a + 1)), ∃ (u : U), qualityOk m a k u = true) (u : U) (hMarg : PMF.marginal (fun (w : WorldState) => marginalSpeaker m α a w hCov) worldPrior u 0) :

                                        Pragmatic listener: Bayesian inversion of the marginal speaker.

                                        Equations
                                        • One or more equations did not get rendered due to their size.
                                        Instances For
                                          theorem GoodmanStuhlmuller2013.cover_silent {U : Type u_1} (m : UWorldStateBool) (a : Access) (k : Fin (a + 1)) :
                                          ∃ (u : RSA.WithSilence U), qualityOk (RSA.liftMeaning m) a k u = true

                                          Silence is universally qOk: liftMeaning m none = true at every world, so the cover hypothesis is universally satisfiable.

                                          Observation-kernel value table #

                                          Closed-form values for obsKernel a w k, derived from obsKernel_eq_ofReal by evaluating the Nat.choose arithmetic. Only the cells the findings below rewrite with are stated.

                                          marginalSpeaker collapse #

                                          At full access the kernel concentrates on a single k, so the bind collapses to one S1g evaluation; at partial access it expands to the 2- or 3-term kernel-weighted sum.

                                          Extension cardinalities for the silence-extended models #

                                          Generic s1Score evaluation #

                                          When qOk u passes, liftMeaning-lifted utterances have a uniform lex value on the belief support (because: qOkm u s = true at all compatible s ⊇ belief support ⇒ lex u s = 1/(card extension)). softmaxBelief_uniform_on_support then collapses s1Score = ENNReal.ofReal (1/c).

                                          s1Score value table #

                                          Per-(meaning, access, count, utterance) closed forms. Positive cells go through s1Score_uniform_apply; qOk failures are zero by softmaxBelief_eq_zero_of_not_qOk.

                                          Partition functions #

                                          S1g value table #

                                          Findings #

                                          The 11 paper findings, stated against WithSilence + liftMeaning so the cover hypothesis is automatically satisfied via cover_silent.

                                          These cells are computational evaluations of the model at specific (meaning, access, world-pair, utterance) tuples, in the paper's expository regime (α = 1, uniform prior), not the fitted regime (higher α, non-uniform prior). Where the fitted model predicts only a marginal implicature (access 2), directions can differ from the plotted predictions.

                                          They are NOT corollaries of a single information-theoretic cancellation theorem. The structural theorem that IS provable (RSA.mutualInfoStateObs_bind_noise_le in Linglib/Pragmatics/RSA/Cancellation.lean) only gives obs-level MI cancellation: MI(state; obs_n) ≤ MI(state; obs_i). The per-(world-pair) orderings these cells encode are utterance-level claims that depend on the specific lex shape, not just on kernel informativity — see Cancellation.lean §3 for why utterance-level MI cancellation isn't a clean DPI corollary. Anyone wanting to prove a structural property about the model should use the obs-level cancellation theorem there.

                                          Finding 1: at full access, some favors s2 > s3 (scalar implicature).

                                          Finding 4: at full access, two favors s2 > s3 (upper-bounded reading).

                                          .a1 minimal-access findings #

                                          Each shows marginalSpeaker (smaller-state) ≤ marginalSpeaker (larger-state), so ¬ L1 (smaller) > L1 (larger). At (.a1, k=0) the target utterance has S1g = 0 (qOk fails); at (.a1, k=1) it has S1g = 4/7. So the comparison reduces to obsKernel(smaller)(k=1) ≤ obsKernel(larger)(k=1).

                                          Finding 2: at minimal access, some does NOT favor s2 > s3.

                                          .a2 partial-access findings #

                                          Finding 3: at partial access, some does NOT favor s2 > s3 (equality).

                                          Finding 5: at partial access, two does NOT favor s2 > s3 (weakened).

                                          Finding 10 (HEADLINE): at partial access, one favors s1 > s3.

                                          Finding 11: at partial access, one does NOT favor s1 > s2.