The RSA speaker as a Gibbs (exponentially-tilted) measure #
Layer 1 of the analytic theory of RSA (Measure/Kernel-native foundation). The
pragmatic speaker S1(· | w) is the exponential tilting of a base
utterance measure by the rationality-scaled utility
score w u = α · (log L0(w | u) − cost u):
S1(· | w) = base.tilted (score w)
i.e. the speaker is a Gibbs measure in mathlib's sense
(MeasureTheory.Measure.tilted). This grounds the RSA speaker in mathlib's
exponential-family machinery rather than a bespoke PMF.normalize ∘ exp
reimplementation, and is the object on which the variational (free-energy /
klDiv) characterization is the Gibbs / Donsker–Varadhan variational principle in
Core/Probability/GibbsVariational.lean, applied here via speaker_isGreatest.
Main statements #
speaker_count_singleton/speaker_countRestrict_singleton— closed form of the speaker mass at an utterance (softmax over the base / over the applicable setA);speaker_countRestrict_singleton_of_not_memgives0offA.speaker_count_lt_iff_score_lt/speaker_lt_iff_score_lt/speaker_countRestrict_lt_iff_score_lt— monotonicity in informativity: the speaker prefersboveraiff its utility is higher. The partition function cancels; reduces to strict monotonicity ofReal.exp.speakerAlpha— the rationality parameter α (S₁ ∝ L0^α, Boltzmann/softmax at inverse temperature α):speakerAlpha_count_lt_iff_score_lt(α > 0 keeps the order),speakerAlpha_count_zero_singleton(α = 0 is uniform), and the zero-temperature limitsspeakerAlpha_countRestrict_tendsto_one_of_isMax/_tendsto_zero_of_exists_gt(α → ∞ is the argmax / fully-rational speaker, over the applicable set).listener/listener_comp_speaker_marginal/listener_eq_withDensity/listener_region— the pragmatic listener as the Bayesian posteriorS1 † prior, stated measure-natively (Bayes consistency, density reweighting, region inference).speaker_isGreatest— the RSA speaker is the rational optimizer (Gibbs / Donsker–Varadhan variational principle).
The RSA pragmatic speaker as an exponentially-tilted (Gibbs) measure: the
base utterance measure tilted by the utility score. With score u = α · (log L0(w|u) − cost u) this is S1(· | w) ∝ L0(w|u)^α · exp(−α·cost).
Equations
- RSA.Gibbs.speaker base score = base.tilted score
Instances For
Closed form of the speaker mass at a single utterance, for the counting
base: ofReal (exp (score u) / Z) with partition Z = ∑ u', exp (score u').
Speaker monotonicity in informativity: at a fixed world, the speaker
assigns more mass to b than to a iff b has the higher utility. The
partition function cancels exactly; the comparison reduces to the strict
monotonicity of Real.exp. The RSA "speaker prefers the more informative
utterance" fact, as a theorem about the Gibbs measure.
Speaker monotonicity, general probability/finite base: for any finite
base measure that assigns equal nonzero mass to the two utterances, the speaker
prefers b over a iff b has the higher utility. Generalizes
speaker_count_lt_iff_score_lt off the counting base — the form the variational
principle needs, where the base is a probability measure (e.g.
ProbabilityTheory.uniformOn). The shared singleton mass and the partition
function Z = ∫ y, exp (score y) ∂base cancel, reducing to Real.exp
monotonicity.
An integrability hypothesis on exp ∘ score is required (and is automatic on a
Fintype): without it a non-integrable tilt collapses to the zero measure
(MeasureTheory.tilted_of_not_integrable), making both sides vanish and the
equivalence false. It mirrors the h_exp hypothesis of speaker_isGreatest.
Finite-support speaker (counting base restricted to the applicable set) #
The faithful RSA speaker normalizes over only the utterances that apply to the
referent ([FG12]'s set W in eq. 2), not the whole utterance type.
Tilting Measure.count restricted to a finite applicable set A realizes this:
the closed form at an applicable utterance is the softmax over A, and a
non-applicable utterance gets mass 0.
Finite-support speaker, closed form: tilting Measure.count restricted to
a finite set A gives, at any a ∈ A, the softmax
ofReal (exp (score a) / ∑ x ∈ A, exp (score x)) over A. With
score u = log (1 / |u|) this is [FG12]'s eq. (2).
A non-applicable utterance (a ∉ A) gets zero speaker mass: the finite-support
speaker is supported exactly on the applicable set A.
Finite-support speaker monotonicity: among applicable utterances, the
speaker prefers b over a iff b has the higher utility. The softmax normalizer
over A cancels. A specialization of speaker_lt_iff_score_lt to the restricted
counting base, with the equal-mass and nonzero side conditions discharged.
Rationality parameter α #
The full RSA speaker scales the utility by a rationality parameter α ≥ 0:
S₁(· | w) ∝ L0(w | ·)^α, i.e. tilt the base by α · score. α = 1 is the
canonical speaker; α → ∞ is the fully-rational argmax speaker (it concentrates
on the most informative utterances); α → 0 is the uniform (random) speaker.
The α-rational speaker: the base tilted by the utility scaled by rationality
α. With score = log L0 this is S₁(· | w) ∝ L0(w | ·)^α.
Equations
- RSA.Gibbs.speakerAlpha base α score = RSA.Gibbs.speaker base fun (u : U) => α * score u
Instances For
Closed form of the α-speaker over the counting base: the Boltzmann/softmax
distribution exp (α · score a) / ∑ u, exp (α · score u) at inverse temperature
α.
α-monotonicity (α > 0): any positively-rational speaker prefers the
higher-utility utterance. The rationality scaling preserves the order; the
partition cancels.
α = 0 is the uniform (indifferent) speaker: with zero rationality the
speaker ignores utility and spreads its mass uniformly, (card U)⁻¹ at every
utterance.
Closed form of the α-speaker over the applicable set A: the softmax at
inverse temperature α restricted to A. Generalizes speakerAlpha_count_singleton
(the A = univ case) to the finite-support speaker the studies consume.
A non-applicable utterance gets zero α-speaker mass.
α-monotonicity, finite support (α > 0): over the applicable set A, the
positively-rational speaker prefers the higher-utility utterance.
α = 0 is the uniform speaker over the applicable set: zero rationality
spreads mass uniformly across A, (A.card)⁻¹ at each applicable utterance.
α → ∞ concentrates on the unique best applicable utterance: if a strictly
maximizes the utility over the applicable set A, the fully-rational speaker
assigns it mass → 1. The softmax becomes the argmax (zero-temperature limit).
α → ∞ abandons a dominated utterance: if some applicable utterance strictly
beats a, the fully-rational speaker assigns a mass → 0. Dual to
speakerAlpha_countRestrict_tendsto_one_of_isMax.
The pragmatic listener as a Bayesian posterior (measure-native) #
The pragmatic listener L1 is the Bayesian posterior of the speaker kernel
S1 : Kernel World Utterance against the world prior — mathlib's
ProbabilityTheory.posterior, S1 † prior : Kernel Utterance World. This is
the listener stated measure-natively: it is a kernel/measure, characterized by
Bayes-consistency rather than by pointwise singleton values, so it is uniform
across discrete and continuous distribution classes (where singletons are null
and a pointwise reading is unavailable).
The RSA pragmatic listener: the Bayesian posterior S1 † prior of the
speaker kernel against the world prior.
Equations
- RSA.Gibbs.listener S1 prior = ProbabilityTheory.posterior S1 prior
Instances For
Listener–speaker consistency (measure-native): averaging the listener
over the speaker's marginal of the prior recovers the prior,
L1 ∘ₘ (S1 ∘ₘ prior) = prior. The Bayesian-coherence identity for RSA, as an
equality of measures — no singletons, any distribution class.
Bayes' theorem and region inference (measure-native) #
Bayes' theorem for the RSA listener: for almost every utterance u, the
listener is the prior reweighted by the per-world speaker likelihood-ratio,
L1 u = prior.withDensity (fun w ↦ (S1 w).rnDeriv (S1 ∘ₘ prior) u). The posterior
is the prior tilted by how diagnostic each world is for u. For countable worlds
the absolute-continuity side condition is automatic.
Region inference: the listener's posterior mass on a set of worlds A
given utterance u is the integral of the speaker likelihood-ratio over A.
This is the measure-native quantity behind RSA "the listener infers φ" claims
(projection, exhaustivity = posterior mass on the relevant region of worlds) —
stated at the level of sets, where it is meaningful for any distribution class,
rather than at points.
The RSA speaker is the rational optimizer #
speaker base score = base.tilted score is a Gibbs measure, so the
Gibbs / Donsker–Varadhan variational principle (InformationTheory.isGreatest_cgf,
proved generically in Core/Probability/GibbsVariational.lean) applies directly:
among all candidate distributions over utterances, the speaker maximizes the free
energy 𝔼_q[score] − KL(q ‖ base), with optimal value the cumulant generating
function cgf score base 1 (= log ∫ exp score ∂base, the free energy /
log-partition). This turns RSA's "rational speaker" from a stipulated softmax into a
theorem.
The RSA speaker is the rational optimizer: cgf score base 1 is the greatest
value of the free-energy functional base.freeEnergy score over candidate utterance
distributions, attained at speaker base score = base.tilted score. A direct
instance of the Gibbs / Donsker–Varadhan variational principle
InformationTheory.isGreatest_cgf.