Probability Effect: Probabilistic Dynamic Semantics #
@cite{lassiter-goodman-2017} @cite{grove-white-2025} @cite{grove-white-2025b}
The probabilistic effect models RSA-style soft assertion, threshold uncertainty, and Bayesian update in dynamic semantics.
This file consolidates two former modules:
- Abstract probability-monad infrastructure (ProbMonad, PState, CondProbMonad, ChoiceProbMonad, threshold semantics) — was Dynamic/Probability/Basic.lean
- The HasFiberedLookup PMF instance and the Bayesian-Charlow bridge — was Dynamic/Probability/Lookup.lean
Effect family #
| Family | M | Falsifier | Source |
|---|---|---|---|
| ICDRT | Entity | .star | Discourse/Intensional.lean |
| Charlow | Set | ∅ | Discourse/Effects/Nondeterminism.lean |
| Bayesian | PMF | zero-mass | this file |
| FCS | partial-Dom | Dom partiality | Dynamic/FileChange/Basic.lean |
@cite{lassiter-goodman-2017}
We define P α abstractly as a structure with pure and bind operations
satisfying the monad laws. This allows us to reason about probabilistic
programs without committing to a specific representation (PMF, measure, etc.).
We follow Grove & White's notation, writing P for the probability monad.
Abstract probability monad interface.
A probability monad provides:
- pure: lift a value to a trivial distribution
- bind: sequence probabilistic computations
- Monad laws as equalities
This is the semantic interface; implementations may use PMFs, measures, etc.
Note: We use Type instead of Type* to avoid universe issues. For semantic work, this is typically sufficient.
- pure {α : Type} : α → P α
Trivial distribution concentrated at a value
- bind {α β : Type} : P α → (α → P β) → P β
Sequence: sample from m, then continue with k
Left identity: pure v >>= k = k v
Right identity: m >>= pure = m
- bind_assoc {α β γ : Type} (m : P α) (n : α → P β) (o : β → P γ) : bind (bind m n) o = bind m fun (x : α) => bind (n x) o
Associativity: (m >>= n) >>= o = m >>= (λ x. n x >>= o)
Instances
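The interface just listed can be sketched as a self-contained Lean structure. The law names pure_bind and bind_pure and the point-mass instance below are illustrative stand-ins, not this file's actual declarations (only bind_assoc is named in the source):

```lean
-- Sketch of the abstract probability-monad interface documented above.
structure ProbMonadSketch (P : Type → Type) where
  pure : {α : Type} → α → P α
  bind : {α β : Type} → P α → (α → P β) → P β
  pure_bind : ∀ {α β : Type} (v : α) (k : α → P β), bind (pure v) k = k v
  bind_pure : ∀ {α : Type} (m : P α), bind m pure = m
  bind_assoc : ∀ {α β γ : Type} (m : P α) (n : α → P β) (o : β → P γ),
    bind (bind m n) o = bind m fun x => bind (n x) o

-- Degenerate instance: point-mass ("Dirac") distributions, i.e. the
-- identity functor; every computation is deterministic, so all three
-- laws hold by rfl.
def diracSketch : ProbMonadSketch (fun α => α) :=
  { pure := fun v => v
    bind := fun m k => k m
    pure_bind := fun _ _ => rfl
    bind_pure := fun _ => rfl
    bind_assoc := fun _ _ _ => rfl }
```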
Map a function over a distribution
Equations
Instances For
Sequence two distributions, ignoring the first result
Equations
- Semantics.Dynamic.Probabilistic.ProbMonad.seq m n = Semantics.Dynamic.Probabilistic.ProbMonad.bind m fun (x : α) => n
Instances For
In Grove & White's parameterized state monad, the state type can change during computation. This models how discourse updates can modify the structure of the context (e.g., pushing questions onto the QUD stack).
P^σ_σ' α = σ → P(α × σ')
The parameters σ and σ' are the input and output state types.
Parameterized probabilistic state monad.
Maps input state σ to a distribution over (value, output state) pairs. The output state type σ' can differ from σ, allowing type-changing updates.
This is Grove & White's P^σ_σ' α.
Equations
- Semantics.Dynamic.Probabilistic.PState P σ σ' α = (σ → P (α × σ'))
Instances For
Bind for the parameterized state monad.
Sequences stateful-probabilistic computations, threading state through.
Grove & White: do { x ← m; k x } = λs. (x, s') ← m(s); k(x)(s')
Equations
- m.bind k s = Semantics.Dynamic.Probabilistic.ProbMonad.bind (m s) fun (x : α × σ') => match x with | (x, s') => k x s'
Instances For
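A concrete (purely illustrative) instance of the σ → P (α × σ') shape, taking P = Id: a type-changing update that pushes a first question onto a previously stack-free context, as in the QUD example above. The names PStateSketch and pushQUD are hypothetical:

```lean
-- PState as documented: input state to a distribution over
-- (value, output state) pairs; here P = Id, so no randomness.
def PStateSketch (P : Type → Type) (σ σ' α : Type) : Type :=
  σ → P (α × σ')

-- Type-changing update: the output state type σ × List Q differs
-- from the input state type σ.
def pushQUD {Q σ : Type} (q : Q) : PStateSketch Id σ (σ × List Q) Unit :=
  fun s => ((), (s, [q]))
```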
Grove & White's observe operation conditions a distribution on a boolean.
observe : Bool → P Unit
observe true continues; observe false blocks
This is the mechanism for assertion: update the CG by observing the asserted proposition is true.
Extended probability monad with conditioning.
Adds observe for conditioning on boolean observations.
- fail {α : Type} : P α
The zero distribution (blocks all continuations)
- observe : Bool → P Unit
Condition on a boolean: continue if true, block if false
- observe_true : observe true = ProbMonad.pure ()
Observing true is a no-op
Observing false blocks all continuations
fail is a left zero for bind
Instances
Observe filters: observe true then return is identity
Observe false blocks: any continuation after observe false gives fail
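The assertion mechanism described above can be sketched by passing the three operations explicitly (the actual CondProbMonad structure is not reproduced here):

```lean
-- Assertion as conditioning: sample an index i from the common ground,
-- observe that φ holds at i, and return i. Indices where φ fails are
-- blocked, so the result is the common ground conditioned on φ.
def assertSketch {P : Type → Type}
    (bind : {α β : Type} → P α → (α → P β) → P β)
    (pure : {α : Type} → α → P α)
    (observe : Bool → P Unit)
    {ι : Type} (φ : ι → Bool) (cg : P ι) : P ι :=
  bind cg fun i => bind (observe (φ i)) fun _ => pure i
```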
RSA's S1 is not just conditioning: it chooses an utterance weighted by utility. This requires a choose operation in addition to observe.
choose : (α → ℚ) → P α -- sample from weighted distribution
Probability monad with choice (for speaker models).
Adds choose for sampling from weighted distributions.
This is what S1's softmax requires.
- observe : Bool → P Unit
- observe_true : observe true = ProbMonad.pure ()
Sample from a weighted distribution over a finite type
- choose_uniform {α : Type} [Fintype α] [Nonempty α] : (choose fun (x : α) => 1) = choose fun (x : α) => 1
choose with uniform weights is like a uniform prior
- choose_observe {α : Type} [Fintype α] (w : α → ℚ) (p : α → Bool) : (ProbMonad.bind (choose w) fun (a : α) => ProbMonad.bind (CondProbMonad.observe (p a)) fun (x : Unit) => ProbMonad.pure a) = choose fun (a : α) => w a * if p a = true then 1 else 0
choose then observe is like weighted observe
Instances
Softmax choice: choose with exp-scaled weights.
This is S1's decision rule: P(u) ∝ exp(α · utility(u))
Note: We use ℚ, so we cannot compute exp directly. In practice, implementations use Float or work with log-probabilities; this definition only fixes the interface.
Equations
- Semantics.Dynamic.Probabilistic.ChoiceProbMonad.softmaxChoice utility temperature = Semantics.Dynamic.Probabilistic.ChoiceProbMonad.choose utility
Instances For
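For reference, the decision rule that softmaxChoice is meant to approximate is the standard RSA S1 softmax (Frank & Goodman style):

```latex
P_{S_1}(u \mid w) \;\propto\; \exp\bigl(\alpha \cdot U(u; w)\bigr),
\qquad
U(u; w) = \log P_{L_0}(w \mid u) - \mathrm{cost}(u)
```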
Threshold semantics + threshold uncertainty produces graded truth values. This is a special case of the PDS framework.
For a gradable adjective like "tall":
- measure : Entity → ℝ gives heights
- threshold θ is the standard of comparison
- ⟦tall⟧_θ(x) = measure(x) > θ is Boolean
With uncertainty over θ:
⟦tall⟧(x) = E_θ[⟦tall⟧_θ(x)] = P(measure(x) > θ)
This probability is the graded truth value.
Threshold semantics: entity satisfies predicate if measure exceeds threshold.
Equations
- Semantics.Dynamic.Probabilistic.thresholdSem measure threshold x = decide (measure x > threshold)
Instances For
Graded truth from threshold uncertainty.
Given a prior over thresholds, the graded truth is the probability that the entity's measure exceeds the threshold.
Equations
- Semantics.Dynamic.Probabilistic.gradedFromThreshold measure thresholds prior x = ∑ θ : Θ, prior θ * if measure x > thresholds θ then 1 else 0
Instances For
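A worked instance of this sum, with illustrative numbers and assuming mathlib's norm_num: a prior with mass 1/2 on each of two thresholds, 170 and 180, and an entity of measure 175, which exceeds only the lower threshold.

```lean
import Mathlib

-- gradedFromThreshold unfolds to this sum: 175 > 170 contributes its
-- prior mass, 175 > 180 contributes nothing, so graded truth = 1/2.
example :
    (1/2 : ℚ) * (if (175 : ℚ) > 170 then 1 else 0)
      + (1/2 : ℚ) * (if (175 : ℚ) > 180 then 1 else 0) = 1/2 := by
  norm_num
```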
For a point-mass prior (no uncertainty), graded truth reduces to Boolean.
This shows that graded semantics reduces to Boolean semantics when there's no parameter uncertainty.
Connection to RSA #
Grove & White's framework connects to RSA as follows:
- Literal meaning φ: a function ι → Bool (a proposition)
- Common ground: a distribution P ι over indices
- Assertion: observe (φ(i)) for i ← cg
- Probability computation: Pr[φ] = E_i[1_{φ(i)}]
RSA's graded φ emerges from:
- Boolean φ_θ indexed by parameters θ
- A prior distribution over θ
- Marginalization: φ(x) = E_θ[φ_θ(x)]
This is exactly Lassiter & Goodman's "threshold + uncertainty = graded".
Bayesian state — fibered shape #
BayesianState W E := W → PMF (Assignment E) — for each world, a
distribution over assignments. The lookup iLookup s v w marginalizes
out variables other than v by mapping (· v) over s w.
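The marginalization just described can be sketched directly, assuming mathlib's PMF.map and simplifying Assignment E to a plain function type V → E:

```lean
import Mathlib

-- Per-world fibered lookup: project the joint distribution over
-- assignments at world w down to the marginal distribution of v.
def lookupSketch {W V E : Type} (s : W → PMF (V → E)) (v : V) (w : W) :
    PMF E :=
  (s w).map (fun g => g v)
```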
SEAM (Seam 2): Joint-state Bayesian deferred #
@cite{grove-white-2025b}'s parameterized dynamic semantics (PDS) uses
a joint shape: P^σ_σ' α := σ → P(α × σ'), formalized as PState (§2 above).
The natural Bayesian-as-HasJointState instance would refactor BayesianState
to PMF (W × Assignment E) and declare HasJointState PMF. The joint field
is then the identity (no marginalization at all); deriving the fibered iLookup
requires PMF.cond machinery with a mass-zero-handling story (return Option
or WithBot, since conditioning on a measure-zero world is undefined).
The fibered shape preserved here is the honest projection; the joint upgrade
is tracked separately to avoid churning every downstream file that consumes
per-world PMF (Assignment E) priors.
Bridges to Set (Charlow) #
The bridges to the Charlow family use mathlib's PMF.support (non-zero-mass
elements) and PMF.uniformOfFinset (uniform distribution on a nonempty finite
set). These ship as natural transformations between the per-world Set and PMF
fibers.
Bayesian dynamic state: per-world probability distribution over
assignments. The natural M = PMF analog of Charlow's per-world set
of alternatives.
Equations
- Semantics.Dynamic.Probabilistic.BayesianState W E = (W → PMF (Core.Assignment E))
Instances For
Bayesian state as the probabilistic (M = PMF) instance of the
unified lookup interface. The lookup at (v, w) is the marginal
distribution of variable v's value: take s w (the joint distribution
over assignments at w) and map (· v).
SEAM (Falsifier, Seam 1): Bayesian commits to zero-mass as the no-referent case #
There is no value-level falsifier — "no referent" shows up as (s w).map (· v) v = 0, not as a distinguished Entity.star or empty alternative-set. The bridge to Charlow via supportFiber (probability → possibility) is lossy whenever the PMF has nontrivial spread.
Equations
- One or more equations did not get rendered due to their size.
Bayesian ↠ Charlow (per world): the support of the per-world
PMF gives the set of possible-value alternatives. Mathlib supplies
PMF.support : PMF α → Set α directly; this is the natural
transformation PMF ⟶ Set applied at each world.
Equations
- Semantics.Dynamic.Probabilistic.supportFiber s w = (s w).support
Instances For
Charlow ↪ Bayesian (per world, when supported): the uniform distribution on a nonempty finite set of alternatives. The lift requires a nonemptiness witness; outside it, no probability assignment is well-defined (Charlow's "nothing rules anything out" is incompatible with the PMF normalization axiom).
This is the natural transformation Set ⟶ PMF whose existence is
gated by finite nonempty support — matching the Set/PMF asymmetry
visible in mathlib (PMF.uniformOfFinset requires Finset.Nonempty).
Equations
- Semantics.Dynamic.Probabilistic.uniformFiber s h w = PMF.uniformOfFinset (s w) ⋯
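A sanity check on the pair of bridges, assuming mathlib's PMF.support_uniformOfFinset: lifting a fiber with uniformFiber and then taking its support recovers the original alternative set, so the Set ↪ PMF ↠ Set round trip is the identity on finite nonempty fibers.

```lean
import Mathlib

-- Charlow ↪ Bayesian ↠ Charlow at a single fiber: the support of the
-- uniform distribution on a nonempty finite set is that set.
example {α : Type} (s : Finset α) (h : s.Nonempty) :
    (PMF.uniformOfFinset s h).support = ↑s :=
  PMF.support_uniformOfFinset
```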