Documentation

Linglib.Semantics.Dynamic.Effects.Probability

Probability Effect: Probabilistic Dynamic Semantics #

[LG17] [GW25a] [GW25b]

The probabilistic effect models RSA-style soft assertion, threshold uncertainty, and Bayesian update in dynamic semantics.

This file consolidates two former modules:

Effect family #

FamilyMFalsifierSource
ICDRTEntity.starDiscourse/Intensional.lean
CharlowSetDiscourse/Effects/Nondeterminism.lean
BayesianPMFzero-massthis file
FCSpartial-DomDom partialityDynamic/FileChange/Basic.lean

[LG17]

We define P α abstractly as a structure with pure and bind operations satisfying the monad laws. This allows us to reason about probabilistic programs without committing to a specific representation (PMF, measure, etc.).

In Grove & White notation:

The probability monad is just a monad #

A "probability monad" provides pure, >>=, and the three monad laws — i.e. it is exactly mathlib's Monad / LawfulMonad, so we use those directly rather than re-declaring the typeclass. map/seq and their functor laws are mathlib's Functor.map / id_map / comp_map. The probability-specific structure (conditioning, choice) is added on top by CondProbMonad / ChoiceProbMonad below, which extend [Monad P].

PMF is the canonical instance (Monad PMF / LawfulMonad PMF in mathlib).

In Grove & White's parameterized state monad, the state type can change during computation. This models how discourse updates can modify the structure of the context (e.g., pushing questions onto the QUD stack).

P^σ_σ' α = σ → P(α × σ')

The parameters σ and σ' are the input and output state types.

Parameterized probabilistic state monad.

Maps input state σ to a distribution over (value, output state) pairs. The output state type σ' can differ from σ, allowing type-changing updates.

This is Grove & White's P^σ_σ' α.

Equations
Instances For
    def Semantics.Dynamic.Probabilistic.PState.pure {P : TypeType} [Monad P] {σ α : Type} (v : α) :
    PState P σ σ α

    Return for the parameterized state monad.

    Returns the value paired with the unchanged state. Grove & White: ⌜v⌝^σ = λs. ⌜(v, s)⌝

    Equations
    Instances For
      def Semantics.Dynamic.Probabilistic.PState.bind {P : TypeType} [Monad P] {σ σ' σ'' α β : Type} (m : PState P σ σ' α) (k : αPState P σ' σ'' β) :
      PState P σ σ'' β

      Bind for the parameterized state monad.

      Sequences stateful-probabilistic computations, threading state through. Grove & White: do { x ← m; k x } = λs. (x, s') ← m(s); k(x)(s')

      Equations
      • m.bind k s = do let xm s match x with | (x, s') => k x s'
      Instances For
        def Semantics.Dynamic.Probabilistic.PState.view {P : TypeType} [Monad P] {σ α : Type} (proj : σα) :
        PState P σ σ α

        View (get) a component of the state.

        Returns the value of applying proj to the current state, without modification.

        Equations
        Instances For
          def Semantics.Dynamic.Probabilistic.PState.set {P : TypeType} [Monad P] {σ σ' : Type} (upd : σσ') :
          PState P σ σ' Unit

          Set (put) a component of the state.

          Returns the new state created by upd, with a trivial value.

          Equations
          Instances For
            def Semantics.Dynamic.Probabilistic.PState.modify {P : TypeType} [Monad P] {σ : Type} (f : σσ) :
            PState P σ σ Unit

            Modify the state in place.

            Equations
            Instances For
              theorem Semantics.Dynamic.Probabilistic.PState.pure_bind {P : TypeType} [Monad P] {σ σ' α β : Type} [LawfulMonad P] (v : α) (k : αPState P σ σ' β) :
              (pure v).bind k = k v

              Left identity for PState: pure v >>= k = k v

              theorem Semantics.Dynamic.Probabilistic.PState.bind_pure {P : TypeType} [Monad P] {σ σ' α : Type} [LawfulMonad P] (m : PState P σ σ' α) :
              m.bind pure = m

              Right identity for PState: m >>= pure = m

              theorem Semantics.Dynamic.Probabilistic.PState.bind_assoc {P : TypeType} [Monad P] {σ σ' σ'' σ''' α β γ : Type} [LawfulMonad P] (m : PState P σ σ' α) (n : αPState P σ' σ'' β) (o : βPState P σ'' σ''' γ) :
              (m.bind n).bind o = m.bind fun (x : α) => (n x).bind o

              Associativity for PState.

              Grove & White's observe operation conditions a distribution on a boolean.

              observe : Bool → P Unit
              observe true continues; observe false blocks
              

              This is the mechanism for assertion: update the CommonGround by observing the asserted proposition is true.

              Extended probability monad with conditioning.

              Adds observe for conditioning on boolean observations.

              • fail {α : Type} : P α

                The zero distribution (blocks all continuations)

              • observe : BoolP Unit

                Condition on a boolean: continue if true, block if false

              • observe_true : observe true = pure ()

                Observing true is a no-op

              • observe_false_bind {α : Type} (k : UnitP α) : observe false >>= k = fail

                Observing false blocks all continuations

              • fail_bind {α β : Type} (k : αP β) : fail >>= k = fail

                fail is a left zero for bind

              Instances

                Observe false blocks: any continuation after observe false gives fail

                theorem Semantics.Dynamic.Probabilistic.CondProbMonad.observe_true_pure {P : TypeType} [Monad P] [CondProbMonad P] [LawfulMonad P] :
                (do observe true pure ()) = observe true

                Observe filters: observe true then return is identity

                RSA's S1 isn't just conditioning - it's choosing an utterance weighted by utility. This requires a choose operation in addition to observe.

                choose : (α → ℚ) → P α -- sample from weighted distribution
                

                The relationship:

                Probability monad with choice (for speaker models).

                Adds choose for sampling from weighted distributions. This is what S1's softmax requires.

                • fail {α : Type} : P α
                • observe : BoolP Unit
                • observe_true : observe true = pure ()
                • observe_false_bind {α : Type} (k : UnitP α) : observe false >>= k = fail
                • fail_bind {α β : Type} (k : αP β) : fail >>= k = fail
                • choose {α : Type} [Fintype α] : (α)P α

                  Sample from a weighted distribution over a finite type

                • choose_uniform {α : Type} [Fintype α] [Nonempty α] : (choose fun (x : α) => 1) = choose fun (x : α) => 1

                  choose with uniform weights is like a uniform prior

                • choose_observe {α : Type} [Fintype α] (w : α) (p : αBool) : (do let achoose w CondProbMonad.observe (p a) pure a) = choose fun (a : α) => w a * if p a = true then 1 else 0

                  choose then observe is like weighted observe

                Instances
                  def Semantics.Dynamic.Probabilistic.ChoiceProbMonad.softmaxChoice {P : TypeType} [Monad P] [ChoiceProbMonad P] {α : Type} [Fintype α] (utility : α) (temperature : ) :
                  P α

                  Softmax choice: choose with exp-scaled weights.

                  This is S1's decision rule: P(u) ∝ exp(α · utility(u))

                  Note: We use ℚ so can't compute exp directly. In practice, implementations use Float or work with log-probabilities. This definition is for the interface.

                  Equations
                  Instances For

                    Threshold semantics + threshold uncertainty produces graded truth values. This is a special case of the PDS framework.

                    For a gradable adjective like "tall":

                    With uncertainty over θ:

                    This probability is the graded truth value.

                    def Semantics.Dynamic.Probabilistic.thresholdSem {ε : Type} (measure : ε) (threshold : ) (x : ε) :
                    Bool

                    Threshold semantics: entity satisfies predicate if measure exceeds threshold.

                    Equations
                    Instances For
                      def Semantics.Dynamic.Probabilistic.gradedFromThreshold {ε Θ : Type} [Fintype Θ] [DecidableEq Θ] (measure : ε) (thresholds prior : Θ) (x : ε) :

                      Graded truth from threshold uncertainty.

                      Given a prior over thresholds, the graded truth is the probability that the entity's measure exceeds the threshold.

                      Equations
                      Instances For
                        theorem Semantics.Dynamic.Probabilistic.graded_eq_bool_of_point_mass {ε Θ : Type} [Fintype Θ] [DecidableEq Θ] (measure : ε) (thresholds : Θ) (θ₀ : Θ) (x : ε) :
                        have pointMass := fun (θ : Θ) => if θ = θ₀ then 1 else 0; gradedFromThreshold measure thresholds pointMass x = if measure x > thresholds θ₀ then 1 else 0

                        For a point-mass prior (no uncertainty), graded truth reduces to Boolean.

                        This shows that graded semantics reduces to Boolean semantics when there's no parameter uncertainty.

                        Connection to RSA #

                        Grove & White's framework connects to RSA as follows:

                        1. Literal meaning φ: a function ι → Bool (proposition)
                        2. Common ground: a distribution P ι over indices
                        3. Assertion: observe(φ(i)) for i ← cg
                        4. Probability computation: Pr[φ] = E_i[1_{φ(i)}]

                        RSA's graded φ emerges from:

                        This is exactly Lassiter & Goodman's "threshold + uncertainty = graded".

                        Bayesian state — fibered shape #

                        BayesianState W E := W → PMF (Assignment E) — for each world, a distribution over assignments. The lookup iLookup s v w marginalizes out variables other than v by mapping (· v) over s w.

                        SEAM (Seam 2): Joint-state Bayesian deferred #

                        [GW25b]'s parameterized dynamic semantics (PDS) uses a joint shape: Pσ σ σ' α := σ → P(α × σ'), formalized as PState (§2 above). The natural Bayesian-as-HasJointState instance would refactor BayesianState to PMF (W × Assignment E) and declare HasJointState PMF. The joint field is then the identity (no marginalization at all); deriving the fibered iLookup requires PMF.cond machinery with a mass-zero-handling story (return Option or WithBot, since conditioning on a measure-zero world is undefined).

                        The fibered shape preserved here is the honest projection; the joint upgrade is tracked separately to avoid churning every downstream file that consumes per-world PMF (Assignment E) priors.

                        Bridges to Set (Charlow) #

                        The bridge maps to the Charlow family use mathlib's PMF.support (non-zero-mass elements) and PMF.uniformOfFinset (uniform distribution on a nonempty finite set). These ship as natural transformations between the per-world Set and PMF fibers.

                        Bayesian dynamic state: per-world probability distribution over assignments. The natural M = PMF analog of Charlow's per-world set of alternatives.

                        Equations
                        Instances For
                          @[implicit_reducible]

                          Bayesian state as the probabilistic (M = PMF) instance of the unified lookup interface. The lookup at (v, w) is the marginal distribution of variable v's value: take s w (the joint distribution over assignments at w) and map (· v).

                          SEAM (Falsifier, Seam 1): Bayesian commits to zero-mass as the #

                          no-referent case. There is no value-level falsifier — "no referent" shows up as (s w).map (· v) v = 0, not as a distinguished Entity.star or empty alternative-set. Bridge to Charlow via supportFiber (probability → possibility) is lossy whenever the PMF has nontrivial spread.

                          Equations
                          • One or more equations did not get rendered due to their size.
                          noncomputable def Semantics.Dynamic.Probabilistic.supportFiber {W E : Type} (s : BayesianState W E) (w : W) :

                          Bayesian ↠ Charlow (per world): the support of the per-world PMF gives the set of possible-value alternatives. Mathlib supplies PMF.support : PMF α → Set α directly; this is the natural transformation PMF ⟶ Set applied at each world.

                          Equations
                          Instances For
                            noncomputable def Semantics.Dynamic.Probabilistic.uniformFiber {W E : Type} [DecidableEq (Core.Assignment E)] (s : WFinset (Core.Assignment E)) (h : ∀ (w : W), (s w).Nonempty) :

                            Charlow ↪ Bayesian (per world, when supported): the uniform distribution on a nonempty finite set of alternatives. The lift requires a nonemptiness witness; outside it, no probability assignment is well-defined (Charlow's "nothing rules anything out" is incompatible with the PMF normalization axiom).

                            This is the natural transformation Set ⟶ PMF whose existence is gated by finite nonempty support — matching the Set/PMF asymmetry visible in mathlib (PMF.uniformOfFinset requires Finset.Nonempty).

                            Equations
                            Instances For