Documentation

Linglib.Semantics.Dynamic.Effects.Probability

Probability Effect: Probabilistic Dynamic Semantics #

[LG17] [GW25a] [GW25b]

The probabilistic effect models RSA-style soft assertion, threshold uncertainty, and Bayesian update in dynamic semantics.

This file consolidates two former modules:

Probability-monad infrastructure (PState, CondProbMonad, ChoiceProbMonad, threshold semantics) built on mathlib's Monad/LawfulMonad — was Dynamic/Probability/Basic.lean
The HasFiberedLookup PMF instance and Bayesian-Charlow bridge — was Dynamic/Probability/Lookup.lean

Effect family #

Family	`M`	Falsifier	Source
ICDRT	`Entity`	`.star`	`Discourse/Intensional.lean`
Charlow	`Set`	`∅`	`Discourse/Effects/Nondeterminism.lean`
Bayesian	`PMF`	zero-mass	this file
FCS	partial-`Dom`	`Dom` partiality	`Dynamic/FileChange/Basic.lean`

We define P α abstractly as a structure with pure and bind operations satisfying the monad laws. This allows us to reason about probabilistic programs without committing to a specific representation (PMF, measure, etc.).

In Grove & White notation:

⌜v⌝ is pure v (trivial distribution at v)
x ← m; k is bind m (λ x => k)

The probability monad is just a monad #

A "probability monad" provides pure, >>=, and the three monad laws — i.e. it is exactly mathlib's Monad / LawfulMonad, so we use those directly rather than re-declaring the typeclass. map/seq and their functor laws are mathlib's Functor.map / id_map / comp_map. The probability-specific structure (conditioning, choice) is added on top by CondProbMonad / ChoiceProbMonad below, which extend [Monad P].

PMF is the canonical instance (Monad PMF / LawfulMonad PMF in mathlib).

In Grove & White's parameterized state monad, the state type can change during computation. This models how discourse updates can modify the structure of the context (e.g., pushing questions onto the QUD stack).

P^σ_σ' α = σ → P(α × σ')

The parameters σ and σ' are the input and output state types.

def Semantics.Dynamic.Probabilistic.PState (P : Type → Type) (σ σ' α : Type) :

Parameterized probabilistic state monad.

Maps input state σ to a distribution over (value, output state) pairs. The output state type σ' can differ from σ, allowing type-changing updates.

This is Grove & White's P^σ_σ' α.

Equations

Semantics.Dynamic.Probabilistic.PState P σ σ' α = (σ → P (α × σ'))

Instances For

def Semantics.Dynamic.Probabilistic.PState.pure {P : Type → Type} [Monad P] {σ α : Type} (v : α) :

PState P σ σ α

Return for the parameterized state monad.

Returns the value paired with the unchanged state. Grove & White: ⌜v⌝^σ = λs. ⌜(v, s)⌝

Equations

Semantics.Dynamic.Probabilistic.PState.pure v s = pure (v, s)

Instances For

def Semantics.Dynamic.Probabilistic.PState.bind {P : Type → Type} [Monad P] {σ σ' σ'' α β : Type} (m : PState P σ σ' α) (k : α → PState P σ' σ'' β) :

PState P σ σ'' β

Bind for the parameterized state monad.

Sequences stateful-probabilistic computations, threading state through. Grove & White: do { x ← m; k x } = λs. (x, s') ← m(s); k(x)(s')

Equations

m.bind k s = do let x ← m s match x with | (x, s') => k x s'

Instances For

def Semantics.Dynamic.Probabilistic.PState.view {P : Type → Type} [Monad P] {σ α : Type} (proj : σ → α) :

PState P σ σ α

View (get) a component of the state.

Returns the value of applying proj to the current state, without modification.

Equations

Semantics.Dynamic.Probabilistic.PState.view proj s = pure (proj s, s)

Instances For

def Semantics.Dynamic.Probabilistic.PState.set {P : Type → Type} [Monad P] {σ σ' : Type} (upd : σ → σ') :

PState P σ σ' Unit

Set (put) a component of the state.

Returns the new state created by upd, with a trivial value.

Equations

Semantics.Dynamic.Probabilistic.PState.set upd s = pure ((), upd s)

Instances For

def Semantics.Dynamic.Probabilistic.PState.modify {P : Type → Type} [Monad P] {σ : Type} (f : σ → σ) :

PState P σ σ Unit

Modify the state in place.

Equations

Semantics.Dynamic.Probabilistic.PState.modify f s = pure ((), f s)

Instances For

theorem Semantics.Dynamic.Probabilistic.PState.pure_bind {P : Type → Type} [Monad P] {σ σ' α β : Type} [LawfulMonad P] (v : α) (k : α → PState P σ σ' β) :

(pure v).bind k = k v

Left identity for PState: pure v >>= k = k v

theorem Semantics.Dynamic.Probabilistic.PState.bind_pure {P : Type → Type} [Monad P] {σ σ' α : Type} [LawfulMonad P] (m : PState P σ σ' α) :

m.bind pure = m

Right identity for PState: m >>= pure = m

theorem Semantics.Dynamic.Probabilistic.PState.bind_assoc {P : Type → Type} [Monad P] {σ σ' σ'' σ''' α β γ : Type} [LawfulMonad P] (m : PState P σ σ' α) (n : α → PState P σ' σ'' β) (o : β → PState P σ'' σ''' γ) :

(m.bind n).bind o = m.bind fun (x : α) => (n x).bind o

Associativity for PState.

Grove & White's observe operation conditions a distribution on a boolean.

observe : Bool → P Unit
observe true continues; observe false blocks

This is the mechanism for assertion: update the CommonGround by observing the asserted proposition is true.

class Semantics.Dynamic.Probabilistic.CondProbMonad (P : Type → Type) [Monad P] :

Extended probability monad with conditioning.

Adds observe for conditioning on boolean observations.

fail {α : Type} : P α
The zero distribution (blocks all continuations)
observe : Bool → P Unit
Condition on a boolean: continue if true, block if false
observe_true : observe true = pure ()
Observing true is a no-op
observe_false_bind {α : Type} (k : Unit → P α) : observe false >>= k = fail
Observing false blocks all continuations
fail_bind {α β : Type} (k : α → P β) : fail >>= k = fail
fail is a left zero for bind

Instances

theorem Semantics.Dynamic.Probabilistic.CondProbMonad.observe_false_pure {P : Type → Type} [Monad P] [CondProbMonad P] :

(do observe false pure ()) = fail

Observe false blocks: any continuation after observe false gives fail

theorem Semantics.Dynamic.Probabilistic.CondProbMonad.observe_true_pure {P : Type → Type} [Monad P] [CondProbMonad P] [LawfulMonad P] :

(do observe true pure ()) = observe true

Observe filters: observe true then return is identity

RSA's S1 isn't just conditioning - it's choosing an utterance weighted by utility. This requires a choose operation in addition to observe.

choose : (α → ℚ) → P α -- sample from weighted distribution

The relationship:

observe b filters: keeps current path if b, blocks otherwise
choose w samples: draws from distribution proportional to weights w

class Semantics.Dynamic.Probabilistic.ChoiceProbMonad (P : Type → Type) [Monad P] extends Semantics.Dynamic.Probabilistic.CondProbMonad P :

Probability monad with choice (for speaker models).

Adds choose for sampling from weighted distributions. This is what S1's softmax requires.

fail {α : Type} : P α
observe : Bool → P Unit
observe_true : observe true = pure ()
observe_false_bind {α : Type} (k : Unit → P α) : observe false >>= k = fail
fail_bind {α β : Type} (k : α → P β) : fail >>= k = fail
choose {α : Type} [Fintype α] : (α → ℚ) → P α
Sample from a weighted distribution over a finite type
choose_uniform {α : Type} [Fintype α] [Nonempty α] : (choose fun (x : α) => 1) = choose fun (x : α) => 1
choose with uniform weights is like a uniform prior
choose_observe {α : Type} [Fintype α] (w : α → ℚ) (p : α → Bool) : (do let a ← choose w CondProbMonad.observe (p a) pure a) = choose fun (a : α) => w a * if p a = true then 1 else 0
choose then observe is like weighted observe

Instances

def Semantics.Dynamic.Probabilistic.ChoiceProbMonad.softmaxChoice {P : Type → Type} [Monad P] [ChoiceProbMonad P] {α : Type} [Fintype α] (utility : α → ℚ) (temperature : ℚ) :

P α

Softmax choice: choose with exp-scaled weights.

This is S1's decision rule: P(u) ∝ exp(α · utility(u))

Note: We use ℚ so can't compute exp directly. In practice, implementations use Float or work with log-probabilities. This definition is for the interface.

Equations

Semantics.Dynamic.Probabilistic.ChoiceProbMonad.softmaxChoice utility temperature = Semantics.Dynamic.Probabilistic.ChoiceProbMonad.choose utility

Instances For

Threshold semantics + threshold uncertainty produces graded truth values. This is a special case of the PDS framework.

For a gradable adjective like "tall":

measure : Entity → ℝ gives heights
threshold : θ is the standard of comparison
⟦tall⟧_θ(x) = measure(x) > θ is Boolean

With uncertainty over θ:

⟦tall⟧(x) = E_θ[⟦tall⟧_θ(x)] = P(measure(x) > θ)

This probability is the graded truth value.

def Semantics.Dynamic.Probabilistic.thresholdSem {ε : Type} (measure : ε → ℚ) (threshold : ℚ) (x : ε) :

Bool

Threshold semantics: entity satisfies predicate if measure exceeds threshold.

Equations

Semantics.Dynamic.Probabilistic.thresholdSem measure threshold x = decide (measure x > threshold)

Instances For

def Semantics.Dynamic.Probabilistic.gradedFromThreshold {ε Θ : Type} [Fintype Θ] [DecidableEq Θ] (measure : ε → ℚ) (thresholds prior : Θ → ℚ) (x : ε) :

ℚ

Graded truth from threshold uncertainty.

Given a prior over thresholds, the graded truth is the probability that the entity's measure exceeds the threshold.

Equations

Semantics.Dynamic.Probabilistic.gradedFromThreshold measure thresholds prior x = ∑ θ : Θ, prior θ * if measure x > thresholds θ then 1 else 0

Instances For

theorem Semantics.Dynamic.Probabilistic.graded_eq_bool_of_point_mass {ε Θ : Type} [Fintype Θ] [DecidableEq Θ] (measure : ε → ℚ) (thresholds : Θ → ℚ) (θ₀ : Θ) (x : ε) :

have pointMass := fun (θ : Θ) => if θ = θ₀ then 1 else 0; gradedFromThreshold measure thresholds pointMass x = if measure x > thresholds θ₀ then 1 else 0

For a point-mass prior (no uncertainty), graded truth reduces to Boolean.

This shows that graded semantics reduces to Boolean semantics when there's no parameter uncertainty.

Connection to RSA #

Grove & White's framework connects to RSA as follows:

Literal meaning φ: a function ι → Bool (proposition)
Common ground: a distribution P ι over indices
Assertion: observe(φ(i)) for i ← cg
Probability computation: Pr[φ] = E_i[1_{φ(i)}]

RSA's graded φ emerges from:

Boolean φ_θ indexed by parameters θ
A prior distribution over θ
Marginalization: φ(x) = E_θ[φ_θ(x)]

This is exactly Lassiter & Goodman's "threshold + uncertainty = graded".

Bayesian state — fibered shape #

BayesianState W E := W → PMF (Assignment E) — for each world, a distribution over assignments. The lookup iLookup s v w marginalizes out variables other than v by mapping (· v) over s w.

SEAM (Seam 2): Joint-state Bayesian deferred #

[GW25b]'s parameterized dynamic semantics (PDS) uses a joint shape: Pσ σ σ' α := σ → P(α × σ'), formalized as PState (§2 above). The natural Bayesian-as-HasJointState instance would refactor BayesianState to PMF (W × Assignment E) and declare HasJointState PMF. The joint field is then the identity (no marginalization at all); deriving the fibered iLookup requires PMF.cond machinery with a mass-zero-handling story (return Option or WithBot, since conditioning on a measure-zero world is undefined).

The fibered shape preserved here is the honest projection; the joint upgrade is tracked separately to avoid churning every downstream file that consumes per-world PMF (Assignment E) priors.

Bridges to `Set` (Charlow) #

The bridge maps to the Charlow family use mathlib's PMF.support (non-zero-mass elements) and PMF.uniformOfFinset (uniform distribution on a nonempty finite set). These ship as natural transformations between the per-world Set and PMF fibers.

def Semantics.Dynamic.Probabilistic.BayesianState (W E : Type) :

Bayesian dynamic state: per-world probability distribution over assignments. The natural M = PMF analog of Charlow's per-world set of alternatives.

Equations

Semantics.Dynamic.Probabilistic.BayesianState W E = (W → PMF (Core.Assignment E))

Instances For

@[implicit_reducible]

noncomputable instance Semantics.Dynamic.Probabilistic.instBayesianHasFiberedLookup (W E : Type) :

Context.HasFiberedLookup PMF (BayesianState W E) ℕ W E

Bayesian state as the probabilistic (M = PMF) instance of the unified lookup interface. The lookup at (v, w) is the marginal distribution of variable v's value: take s w (the joint distribution over assignments at w) and map (· v).

SEAM (Falsifier, Seam 1): Bayesian commits to zero-mass as the #

no-referent case. There is no value-level falsifier — "no referent" shows up as (s w).map (· v) v = 0, not as a distinguished Entity.star or empty alternative-set. Bridge to Charlow via supportFiber (probability → possibility) is lossy whenever the PMF has nontrivial spread.

Equations

One or more equations did not get rendered due to their size.

noncomputable def Semantics.Dynamic.Probabilistic.supportFiber {W E : Type} (s : BayesianState W E) (w : W) :

Set (Core.Assignment E)

Bayesian ↠ Charlow (per world): the support of the per-world PMF gives the set of possible-value alternatives. Mathlib supplies PMF.support : PMF α → Set α directly; this is the natural transformation PMF ⟶ Set applied at each world.

Equations

Semantics.Dynamic.Probabilistic.supportFiber s w = (s w).support

Instances For

noncomputable def Semantics.Dynamic.Probabilistic.uniformFiber {W E : Type} [DecidableEq (Core.Assignment E)] (s : W → Finset (Core.Assignment E)) (h : ∀ (w : W), (s w).Nonempty) :

BayesianState W E

Charlow ↪ Bayesian (per world, when supported): the uniform distribution on a nonempty finite set of alternatives. The lift requires a nonemptiness witness; outside it, no probability assignment is well-defined (Charlow's "nothing rules anything out" is incompatible with the PMF normalization axiom).

This is the natural transformation Set ⟶ PMF whose existence is gated by finite nonempty support — matching the Set/PMF asymmetry visible in mathlib (PMF.uniformOfFinset requires Finset.Nonempty).

Equations

Semantics.Dynamic.Probabilistic.uniformFiber s h w = PMF.uniformOfFinset (s w) ⋯

Instances For