RSA — Unbundled Operators #
@cite{frank-goodman-2012} @cite{degen-2023}
Mathlib-shaped, struct-free formulation of RSA's three core operators, sitting
alongside the bundled RSAConfig API (Defs.lean / Basic.lean). Each
operator takes its ingredients (meaning function, cost factor, rationality, prior)
as explicit arguments — no RSAConfig projection chains, no nonneg-axiom fields.
The mathlib precedent is bayesRisk (Mathlib/Probability/Decision/Risk/Defs.lean):
def bayesRisk (ℓ : Θ → 𝓨 → ℝ≥0∞) (P : Kernel Θ 𝓧) (π : Measure Θ) : ℝ≥0∞
There is no BayesianDecisionConfig struct. Ingredients are function arguments;
hypotheses are stated where needed; lemmas universally quantify over the parts
that vary.
Operators #
- L0OfMeaning — literal listener, normalising a non-negative meaning function meaning : U → W → ℝ≥0∞ over worlds for each utterance.
- S1Belief — pragmatic speaker, belief-based form S1(u | w) ∝ L0(w | u)^α · costFactor(u).
- L1 — pragmatic listener, defined as PMF.posterior of the speaker kernel against the world prior. The grounding theorem L1_eq_posterior is rfl.
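Concretely, the first two operators can be restated as standalone definitions. This is a sketch, not the file's source: the binder names and the exact h0/hTop statements are assumed here (read off the Equations blocks below), and L1 is omitted because it simply forwards to PMF.posterior.

```lean
import Mathlib

open scoped ENNReal

variable {U W : Type*}

-- Sketch of the unbundled operators; hypothesis statements are assumed,
-- following PMF.normalize's API (marginal non-zero and finite).
noncomputable def L0OfMeaningSketch (meaning : U → W → ℝ≥0∞) (u : U)
    (h0 : (∑' w, meaning u w) ≠ 0) (hTop : (∑' w, meaning u w) ≠ ⊤) : PMF W :=
  PMF.normalize (meaning u) h0 hTop

noncomputable def S1BeliefSketch (L0 : U → PMF W) (costFactor : U → ℝ≥0∞)
    (α : ℝ) (w : W)
    (h0 : (∑' u, (L0 u) w ^ α * costFactor u) ≠ 0)
    (hTop : (∑' u, (L0 u) w ^ α * costFactor u) ≠ ⊤) : PMF U :=
  PMF.normalize (fun u => (L0 u) w ^ α * costFactor u) h0 hTop
```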
Grounding #
L1 does not redefine Bayes' rule — it forwards to PMF.posterior. The
"L1 IS Bayesian inversion of S1 against the world prior" claim is therefore
true by construction (CLAUDE.md "import-don't-restipulate" discipline), not
by a bridge theorem proved after the fact. Theorems about PMF.posterior
(support characterisation, marginal-times-posterior identity) lift to L1
as one-liners.
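The grounding theorem's shape, sketched with abbreviated binders (h is PMF.posterior's side condition from Core/Probability/Posterior.lean, not restated here):

```lean
-- Shape only, not the literal source:
-- theorem L1_eq_posterior :
--     RSA.L1 speaker worldPrior u h = PMF.posterior speaker worldPrior u h :=
--   rfl
```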
Inequality decomposition (for consumer migrations) #
Each operator has parallel _lt_iff_score_lt and _le_iff_score_le lemmas
that strip off the normalisation factor — the workhorses for migrating
"L1/S1 prefers a₂ over a₁" claims through the structural shell:
- S1Belief_apply_lt_iff_score_lt / S1Belief_apply_le_iff_score_le — same-world utterance comparison reduces to (L0 u w)^α · cost u
- L1_lt_iff_score_lt / L1_le_iff_score_le — same-observation world comparison reduces to prior(w) · κ(w, u)
For the common case where the world prior is uniform, see
PMF.posterior_lt_iff_kernel_lt_of_uniform / _le_iff_kernel_le in
Core/Probability/Posterior.lean — those cancel both the marginal
denominator AND the uniform prior in one move.
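A hedged sketch of the shape of those cancellation lemmas (binders abbreviated; the actual statements live in Core/Probability/Posterior.lean and may differ in how the uniform prior is expressed):

```lean
-- Shape only, not the literal statements:
-- PMF.posterior_lt_iff_kernel_lt_of_uniform :
--   PMF.posterior κ uniformPrior u h w₁ < PMF.posterior κ uniformPrior u h w₂
--     ↔ (κ w₁) u < (κ w₂) u
-- and the same with ≤ for the _le_ version.
```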
Relationship to RSAConfig #
Phase 1 of the RSA → mathlib-PMF migration: this file is a pure addition.
RSAConfig and RSAConfig.L1 (in Basic.lean) remain in place; consumer
code is unchanged. A subsequent phase migrates one Phenomena study end-to-end
to demonstrate that rsa_predict reflection still applies to operator
applications.
L0: Literal Listener #
Literal listener built by normalising a meaning function over worlds.
For utterance u, L0OfMeaning meaning u h0 hTop is the PMF over worlds with
mass meaning u w / Σ_{w'} meaning u w'.
The two hypotheses are exactly PMF.normalize's API: the marginal must be
non-zero (so the utterance is true somewhere) and finite (automatic on
Fintype W if every meaning value is < ∞).
Equations
- RSA.L0OfMeaning meaning u h0 hTop = PMF.normalize (meaning u) h0 hTop
Instances For
L0 from a Boolean meaning (uniform on extension) #
Extension of a Boolean meaning at utterance u: the Finset of worlds
the meaning is true at. The [Fintype W]/[DecidableEq W] machinery is used
implicitly via Finset.univ.filter.
Equations
- RSA.extensionOf m u = {w : W | m u w = true}
Instances For
Literal listener for a Boolean meaning: uniform distribution on the
extension. Specialisation of L0OfMeaning to the case where each meaning
value is 0 or 1 and the extension is non-empty. The (extensionOf m u).Nonempty
hypothesis is PMF.uniformOfFinset's API.
Equations
- RSA.L0OfBoolMeaning m u h = PMF.uniformOfFinset (RSA.extensionOf m u) h
Instances For
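A toy Boolean lexicon makes the uniform-on-extension behaviour concrete. The two-utterance scalar setup below is a hypothetical example, not part of this file:

```lean
import Mathlib

inductive Utt | someU | allU
  deriving DecidableEq, Fintype

inductive World | partialW | totalW
  deriving DecidableEq, Fintype

-- "some" is true in both worlds; "all" only in the total world.
def m : Utt → World → Bool
  | .someU, _         => true
  | .allU,  .totalW   => true
  | .allU,  .partialW => false
```

Here RSA.extensionOf m .someU is the two-element Finset of worlds, so given a nonemptiness witness h, RSA.L0OfBoolMeaning m .someU h assigns each world mass 1/2.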
S1: Pragmatic Speaker (belief-based) #
Belief-based pragmatic speaker (@cite{frank-goodman-2012}):
S1(u | w) ∝ L0(w | u)^α · costFactor(u), normalised over utterances.
- L0 : U → PMF W — the literal listener kernel (often built via L0OfMeaning, but any kernel will do).
- costFactor : U → ℝ≥0∞ — multiplicative cost weight. To recover the classical exp(-cost) form pass fun u => ENNReal.ofReal (Real.exp (-cost u)).
- α : ℝ — rationality / soft-max temperature.
Returns the speaker's distribution at world w.
Equations
- RSA.S1Belief L0 costFactor α w h0 hTop = PMF.normalize (fun (u : U) => (L0 u) w ^ α * costFactor u) h0 hTop
Instances For
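To make the costFactor convention concrete, here is a hedged sketch of the classical exp(-cost) speaker: cost is a hypothetical ℝ-valued cost function, and the h0/hTop discharges are taken as hypotheses rather than proved.

```lean
-- Sketch: the classical exp(-cost) speaker as an instance of S1Belief.
example {U W : Type*} (L0 : U → PMF W) (cost : U → ℝ) (α : ℝ) (w : W)
    (h0 : (∑' u, (L0 u) w ^ α * ENNReal.ofReal (Real.exp (-cost u))) ≠ 0)
    (hTop : (∑' u, (L0 u) w ^ α * ENNReal.ofReal (Real.exp (-cost u))) ≠ ⊤) :
    PMF U :=
  RSA.S1Belief L0 (fun u => ENNReal.ofReal (Real.exp (-cost u))) α w h0 hTop
```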
The S1Belief PMF assigns positive probability to u iff the L0 score at
u and the cost factor at u are both non-zero (rpow of a positive finite
base is positive). Convenience for discharging "speaker assigns probability
to this utterance" obligations downstream.
Inequality decomposition for S1Belief: at a fixed world, comparing
two utterances' speaker probabilities reduces to comparing their unnormalised
scores (L0 u w)^α · cost u. The partition function depends on w but not
on u, so it cancels.
Direct lift from PMF.normalize_lt_iff_lt. The workhorse decomposition
lemma for "speaker prefers u₂ over u₁ at world w" proofs.
The ≤ companion of S1Belief_apply_lt_iff_score_lt.
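Statement shape of the decomposition pair, sketched with abbreviated binders (assumed, not copied from the source):

```lean
-- S1Belief_apply_lt_iff_score_lt (shape only):
--   S1Belief L0 cost α w h0 hTop u₁ < S1Belief L0 cost α w h0 hTop u₂
--     ↔ (L0 u₁) w ^ α * cost u₁ < (L0 u₂) w ^ α * cost u₂
-- S1Belief_apply_le_iff_score_le is the same statement with ≤.
```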
S1: Pragmatic Speaker (softmax-of-expected-log form) #
@cite{goodman-stuhlmuller-2013} / @cite{kao-etal-2014-metaphor}-style speaker:
S1(u | obs) ∝ exp(α · Σ_s belief(s) · log lex(u, s)) when qOk u, else 0.
Real-valued internally; lifted to ℝ≥0∞ at the PMF.normalize boundary. The
quality predicate qOk is not derived from lex and belief because Lean's
Real.log 0 = 0 does not match WebPPL's log 0 = -∞: in WebPPL the score is
automatically zero on quality-violating utterances (via exp (-∞) = 0), but
in Lean an explicit filter is required. Consumers typically pass
qOk u := ∀ s ∈ supp belief, lex u s > 0 or a problem-specific predicate.
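That typical predicate can be packaged as a helper. qOkOf is a hypothetical name, with "support" read off as non-zero belief mass:

```lean
-- Sketch of the "typical" quality predicate described above:
-- u is quality-OK when lex u is positive everywhere the belief has mass.
def qOkOf {U W : Type*} (lex : U → W → ℝ) (belief : W → ℝ) (u : U) : Prop :=
  ∀ s, belief s ≠ 0 → 0 < lex u s
```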
The score is positive iff qOk u (regardless of lex/belief
internals — Real.exp is always positive). The tsum-positivity cover
collapses to ∃ u, qOk u (see softmaxBelief_tsum_ne_zero_of_witness),
which is the natural shape for a PMF.normalize discharge.
Equations
- RSA.softmaxBelief lex belief α qOk u = if qOk u then ENNReal.ofReal (Real.exp (α * ∑ s : W, belief s * Real.log (lex u s))) else 0
Instances For
The score is 0 exactly when the quality predicate fails. The
companion to softmaxBelief_ne_zero_of_qOk (positive direction); the two
characterise the support of softmaxBelief lex belief α qOk as
{u | qOk u}.
Cover discharge: a single quality-OK witness is enough to make the
fan-out sum non-zero — the standard PMF.normalize precondition shape.
Softmax collapse at concentrated belief (α = 1): when the belief is a
Kronecker delta on a single world s*, the expected log collapses to a
single log term, and exp ∘ log cancels at any positive value. The score
becomes simply lex u s* (lifted via ENNReal.ofReal).
The hypothesis qOk u → 0 < lex u s* is the canonical "quality-OK
utterances are true at the speaker-believed world" condition — it ensures
Real.log (lex u s*) is well-defined (non-junk-zero) so that exp ∘ log
gives back lex u s* rather than 1.
This is the bridge that lets transcendental softmax expressions reduce to
rational arithmetic in models with deterministic full-access observations
(e.g., @cite{goodman-stuhlmuller-2013} at access .a3).
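The collapse itself is elementary; a worked sketch of the two steps, assuming belief is the Kronecker delta at s* and 0 < lex u s*:

```lean
-- Step 1: the delta belief kills all but one summand.
--   ∑ s, belief s * Real.log (lex u s) = Real.log (lex u s*)
-- Step 2: exp ∘ log cancels at positive arguments (Real.exp_log).
--   Real.exp (1 * Real.log (lex u s*)) = lex u s*
-- Hence, when qOk u holds:
--   softmaxBelief lex belief 1 qOk u = ENNReal.ofReal (lex u s*)
```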
Softmax collapse at lex uniform on belief support (α = 1): when lex u
takes the same positive value v on every world in the belief support, the
expected log collapses to log v, and exp ∘ log cancels to give v. The
score becomes ENNReal.ofReal v.
Generalises softmaxBelief_concentrated_apply from a Kronecker-delta belief
to any belief whose support sits inside lex u's level set at v. Captures
the natural property of uniform-on-extension lex functions: when qOk passes,
the speaker's belief support is contained in the utterance's extension, where
lex is uniformly 1/|extension|.
The hypothesis ∑ s, belief s = 1 is the canonical "belief is a probability
distribution" assumption; combined with bilinearity of multiplication it lets
the constant log v factor out of the sum.
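A worked sketch of the generalised collapse, assuming lex u is constantly v on the belief support and 0 < v:

```lean
-- Step 1: lex u is constantly v on the support, so the constant factors out
--   and the normalised belief sums to 1.
--   ∑ s, belief s * Real.log (lex u s) = (∑ s, belief s) * Real.log v
--                                      = Real.log v
-- Step 2: exp ∘ log cancels at positive arguments (Real.exp_log).
--   Real.exp (1 * Real.log v) = v
-- Hence, when qOk u holds: softmaxBelief lex belief 1 qOk u = ENNReal.ofReal v
```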
L1: Pragmatic Listener #
Pragmatic listener: Bayesian inversion of the speaker kernel against the
world prior. Defined as PMF.posterior, so the "L1 = posterior" claim is
true by construction.
Mathlib calls this operator posterior (Mathlib/Probability/Kernel/Posterior.lean,
notation κ†μ). At the PMF level — without measure-theoretic typeclasses —
it is PMF.posterior from Core/Probability/Posterior.lean. This file
gives it the linguistically familiar name L1.
Equations
- RSA.L1 speaker worldPrior u h = PMF.posterior speaker worldPrior u h
Instances For
Grounding theorem: L1 IS PMF.posterior. True by construction (rfl),
not by a bridge proof. This is the point of the unbundled formulation — the
mathlib operator is the definition, not something we redefine and reconcile.
Support of L1: a world has positive posterior mass iff it had positive
prior mass and the speaker assigns it positive probability of the
observed utterance. Lifts directly from PMF.mem_support_posterior_iff.
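Statement shape of the support characterisation, with abbreviated binders (assumed, not copied from the source):

```lean
-- Shape only:
--   w ∈ (RSA.L1 speaker prior u h).support
--     ↔ w ∈ prior.support ∧ u ∈ (speaker w).support
```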
Bayes identity in product form: marginal · L1 = prior · speaker. Lifts
from PMF.marginal_mul_posterior_apply.
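Statement shape of the product identity (binders abbreviated; "marginal" here names the utterance marginal ∑' w, prior w · (speaker w) u, whatever its exact spelling in Posterior.lean):

```lean
-- Shape only:
--   marginal u * RSA.L1 speaker prior u h w = prior w * (speaker w) u
```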
Inequality decomposition for L1: posterior comparison at the
pragmatic-listener layer reduces to score comparison. Direct lift from
PMF.posterior_lt_iff_score_lt — the shared marginal denominator cancels.
This is the workhorse decomposition lemma for "world w₁ has higher
posterior probability than world w₂ after observing u" claims.
The ≤ companion of L1_lt_iff_score_lt.
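Statement shape of the pair, sketched with abbreviated binders (assumed, not copied from the source):

```lean
-- Shape only:
--   RSA.L1 speaker prior u h w₁ < RSA.L1 speaker prior u h w₂
--     ↔ prior w₁ * (speaker w₁) u < prior w₂ * (speaker w₂) u
-- L1_le_iff_score_le is the same statement with ≤.
```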