Probability Effect: Probabilistic Dynamic Semantics #
The probabilistic effect models RSA-style soft assertion, threshold uncertainty, and Bayesian update in dynamic semantics.
This file consolidates two former modules:
- Probability-monad infrastructure (
PState,CondProbMonad,ChoiceProbMonad, threshold semantics) built on mathlib'sMonad/LawfulMonad— wasDynamic/Probability/Basic.lean - The
HasFiberedLookup PMFinstance and Bayesian-Charlow bridge — wasDynamic/Probability/Lookup.lean
Effect family #
| Family | M | Falsifier | Source |
|---|---|---|---|
| ICDRT | Entity | .star | Discourse/Intensional.lean |
| Charlow | Set | ∅ | Discourse/Effects/Nondeterminism.lean |
| Bayesian | PMF | zero-mass | this file |
| FCS | partial-Dom | Dom partiality | Dynamic/FileChange/Basic.lean |
We define P α abstractly as a structure with pure and bind operations
satisfying the monad laws. This allows us to reason about probabilistic
programs without committing to a specific representation (PMF, measure, etc.).
In Grove & White notation:
The probability monad is just a monad #
A "probability monad" provides pure, >>=, and the three monad laws — i.e.
it is exactly mathlib's Monad / LawfulMonad, so we use those directly
rather than re-declaring the typeclass. map/seq and their functor laws are
mathlib's Functor.map / id_map / comp_map. The probability-specific
structure (conditioning, choice) is added on top by CondProbMonad /
ChoiceProbMonad below, which extend [Monad P].
PMF is the canonical instance (Monad PMF / LawfulMonad PMF in mathlib).
In Grove & White's parameterized state monad, the state type can change during computation. This models how discourse updates can modify the structure of the context (e.g., pushing questions onto the QUD stack).
P^σ_σ' α = σ → P(α × σ')
The parameters σ and σ' are the input and output state types.
Parameterized probabilistic state monad.
Maps input state σ to a distribution over (value, output state) pairs. The output state type σ' can differ from σ, allowing type-changing updates.
This is Grove & White's P^σ_σ' α.
Equations
- Semantics.Dynamic.Probabilistic.PState P σ σ' α = (σ → P (α × σ'))
Instances For
Return for the parameterized state monad.
Returns the value paired with the unchanged state.
Grove & White: ⌜v⌝^σ = λs. ⌜(v, s)⌝
Equations
- Semantics.Dynamic.Probabilistic.PState.pure v s = pure (v, s)
Instances For
Bind for the parameterized state monad.
Sequences stateful-probabilistic computations, threading state through.
Grove & White: do { x ← m; k x } = λs. (x, s') ← m(s); k(x)(s')
Equations
- m.bind k s = do let x ← m s match x with | (x, s') => k x s'
Instances For
View (get) a component of the state.
Returns the value of applying proj to the current state, without modification.
Equations
- Semantics.Dynamic.Probabilistic.PState.view proj s = pure (proj s, s)
Instances For
Set (put) a component of the state.
Returns the new state created by upd, with a trivial value.
Equations
- Semantics.Dynamic.Probabilistic.PState.set upd s = pure ((), upd s)
Instances For
Modify the state in place.
Equations
- Semantics.Dynamic.Probabilistic.PState.modify f s = pure ((), f s)
Instances For
Grove & White's observe operation conditions a distribution on a boolean.
observe : Bool → P Unit
observe true continues; observe false blocks
This is the mechanism for assertion: update the CommonGround by observing the asserted proposition is true.
Extended probability monad with conditioning.
Adds observe for conditioning on boolean observations.
- fail {α : Type} : P α
The zero distribution (blocks all continuations)
- observe : Bool → P Unit
Condition on a boolean: continue if true, block if false
- observe_true : observe true = pure ()
Observing true is a no-op
Observing false blocks all continuations
fail is a left zero for bind
Instances
Observe false blocks: any continuation after observe false gives fail
Observe filters: observe true then return is identity
RSA's S1 isn't just conditioning - it's choosing an utterance weighted by
utility. This requires a choose operation in addition to observe.
choose : (α → ℚ) → P α -- sample from weighted distribution
The relationship:
Probability monad with choice (for speaker models).
Adds choose for sampling from weighted distributions.
This is what S1's softmax requires.
- observe : Bool → P Unit
- observe_true : observe true = pure ()
Sample from a weighted distribution over a finite type
- choose_uniform {α : Type} [Fintype α] [Nonempty α] : (choose fun (x : α) => 1) = choose fun (x : α) => 1
choose with uniform weights is like a uniform prior
- choose_observe {α : Type} [Fintype α] (w : α → ℚ) (p : α → Bool) : (do let a ← choose w CondProbMonad.observe (p a) pure a) = choose fun (a : α) => w a * if p a = true then 1 else 0
choose then observe is like weighted observe
Instances
Softmax choice: choose with exp-scaled weights.
This is S1's decision rule: P(u) ∝ exp(α · utility(u))
Note: We use ℚ so can't compute exp directly. In practice, implementations use Float or work with log-probabilities. This definition is for the interface.
Equations
- Semantics.Dynamic.Probabilistic.ChoiceProbMonad.softmaxChoice utility temperature = Semantics.Dynamic.Probabilistic.ChoiceProbMonad.choose utility
Instances For
Threshold semantics + threshold uncertainty produces graded truth values. This is a special case of the PDS framework.
For a gradable adjective like "tall":
measure : Entity → ℝgives heightsthreshold : θis the standard of comparison⟦tall⟧_θ(x) = measure(x) > θis Boolean
With uncertainty over θ:
⟦tall⟧(x) = E_θ[⟦tall⟧_θ(x)] = P(measure(x) > θ)
This probability is the graded truth value.
Threshold semantics: entity satisfies predicate if measure exceeds threshold.
Equations
- Semantics.Dynamic.Probabilistic.thresholdSem measure threshold x = decide (measure x > threshold)
Instances For
Graded truth from threshold uncertainty.
Given a prior over thresholds, the graded truth is the probability that the entity's measure exceeds the threshold.
Equations
- Semantics.Dynamic.Probabilistic.gradedFromThreshold measure thresholds prior x = ∑ θ : Θ, prior θ * if measure x > thresholds θ then 1 else 0
Instances For
For a point-mass prior (no uncertainty), graded truth reduces to Boolean.
This shows that graded semantics reduces to Boolean semantics when there's no parameter uncertainty.
Connection to RSA #
Grove & White's framework connects to RSA as follows:
- Literal meaning φ: a function
ι → Bool(proposition) - Common ground: a distribution
P ιover indices - Assertion:
observe(φ(i))fori ← cg - Probability computation:
Pr[φ] = E_i[1_{φ(i)}]
RSA's graded φ emerges from:
- Boolean φ_θ indexed by parameters θ
- A prior distribution over θ
- Marginalization: φ(x) = E_θ[φ_θ(x)]
This is exactly Lassiter & Goodman's "threshold + uncertainty = graded".
Bayesian state — fibered shape #
BayesianState W E := W → PMF (Assignment E) — for each world, a
distribution over assignments. The lookup iLookup s v w marginalizes
out variables other than v by mapping (· v) over s w.
SEAM (Seam 2): Joint-state Bayesian deferred #
[GW25b]'s parameterized dynamic semantics (PDS) uses
a joint shape: Pσ σ σ' α := σ → P(α × σ'), formalized as PState (§2 above).
The natural Bayesian-as-HasJointState instance would refactor BayesianState
to PMF (W × Assignment E) and declare HasJointState PMF. The joint field
is then the identity (no marginalization at all); deriving the fibered iLookup
requires PMF.cond machinery with a mass-zero-handling story (return Option
or WithBot, since conditioning on a measure-zero world is undefined).
The fibered shape preserved here is the honest projection; the joint upgrade
is tracked separately to avoid churning every downstream file that consumes
per-world PMF (Assignment E) priors.
Bridges to Set (Charlow) #
The bridge maps to the Charlow family use mathlib's PMF.support (non-zero-mass
elements) and PMF.uniformOfFinset (uniform distribution on a nonempty finite
set). These ship as natural transformations between the per-world Set and PMF
fibers.
Bayesian dynamic state: per-world probability distribution over
assignments. The natural M = PMF analog of Charlow's per-world set
of alternatives.
Equations
- Semantics.Dynamic.Probabilistic.BayesianState W E = (W → PMF (Core.Assignment E))
Instances For
Bayesian state as the probabilistic (M = PMF) instance of the
unified lookup interface. The lookup at (v, w) is the marginal
distribution of variable v's value: take s w (the joint distribution
over assignments at w) and map (· v).
SEAM (Falsifier, Seam 1): Bayesian commits to zero-mass as the #
no-referent case. There is no value-level falsifier — "no referent"
shows up as (s w).map (· v) v = 0, not as a distinguished Entity.star
or empty alternative-set. Bridge to Charlow via supportFiber
(probability → possibility) is lossy whenever the PMF has nontrivial
spread.
Equations
- One or more equations did not get rendered due to their size.
Bayesian ↠ Charlow (per world): the support of the per-world
PMF gives the set of possible-value alternatives. Mathlib supplies
PMF.support : PMF α → Set α directly; this is the natural
transformation PMF ⟶ Set applied at each world.
Equations
- Semantics.Dynamic.Probabilistic.supportFiber s w = (s w).support
Instances For
Charlow ↪ Bayesian (per world, when supported): the uniform distribution on a nonempty finite set of alternatives. The lift requires a nonemptiness witness; outside it, no probability assignment is well-defined (Charlow's "nothing rules anything out" is incompatible with the PMF normalization axiom).
This is the natural transformation Set ⟶ PMF whose existence is
gated by finite nonempty support — matching the Set/PMF asymmetry
visible in mathlib (PMF.uniformOfFinset requires Finset.Nonempty).
Equations
- Semantics.Dynamic.Probabilistic.uniformFiber s h w = PMF.uniformOfFinset (s w) ⋯