Documentation

Linglib.Theories.Pragmatics.RSA.Monotonicity

RSA Monotonicity — Structural Order-Preserving Lemmas #

@cite{frank-goodman-2012} @cite{degen-2023}

Structural lemmas that reduce RSA inequality proofs to qualitative claims about the underlying scores, costs, and meanings — peeling off one layer of softmax/Bayes per rewrite. Companion to Basic.lean's L1_latent_lt_of_score_lt and L1_marginal_lt_of_score_sum_lt.

The intent is to shrink the footprint of the rsa_predict numerical-reflection tactic: when an inequality holds because of a structural fact (denominator cancellation, monotonicity of softmax in the score, monotonicity of the score in L0 or cost), prove it by chaining these lemmas instead of verifying the resulting ratio numerically.

Layering #

The full softmax stack is:

L1 u w ∝ worldPrior w · Σ_l latentPrior w l · S1 l w u
S1 l w u = s1Score(L0(l).policy, α, l, w, u) / Σ_{u'} s1Score(...)
L0 l u w = meaning(initial, l, u, w) / Σ_{w'} meaning(initial, l, u, w')

Each layer normalizes by a denominator that cancels when comparing two values at the same conditioning point:

For these comparisons, this file provides *_lt_of_score_lt lemmas that discharge the inequality from a score comparison alone, skipping the heavy softmax denominator computation that rsa_predict would otherwise verify numerically. All such lemmas are tagged @[gcongr], so user code can write by gcongr; rsa_predict instead of invoking the lemma by name.

Score-shape lemmas #

Once a comparison reduces to scores, the score function's shape determines the next reduction. For the two canonical s1Score patterns:

These shape lemmas live below as S1_belief_score_lt_iff etc.; they are parameterized by the scoring rule, so a paper using a custom s1Score adds its own shape lemma instead of invoking rsa_predict.

theorem RSA.RSAConfig.L1_lt_of_score_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (u : U) (w₁ w₂ : W) (h : cfg.L1agent.score u w₁ < cfg.L1agent.score u w₂) :
cfg.L1 u w₁ < cfg.L1 u w₂

L1 cross-world comparison via score comparison.

Both sides share the same u, hence the same L1agent.totalScore u, which cancels. Reduces a softmax-ratio comparison to a comparison of the unnormalized L1 scores.

Tagged @[gcongr] so gcongr can peel cfg.L1 u w₁ < cfg.L1 u w₂ to cfg.L1agent.score u w₁ < cfg.L1agent.score u w₂ automatically.

Specialization of RationalAction.policy_lt_of_score_lt to L1agent.

theorem RSA.RSAConfig.L1_eq_of_score_eq {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (u : U) (w₁ w₂ : W) (h : cfg.L1agent.score u w₁ = cfg.L1agent.score u w₂) :
cfg.L1 u w₁ = cfg.L1 u w₂

L1 cross-world equality via score equality.

theorem RSA.RSAConfig.L1_lt_iff_score_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (u : U) (w₁ w₂ : W) :
cfg.L1 u w₁ < cfg.L1 u w₂ cfg.L1agent.score u w₁ < cfg.L1agent.score u w₂

Iff form of L1_lt_of_score_lt: at the same u, L1 strict ordering coincides with L1agent.score strict ordering. Use this when rewriting a goal in score form (rather than reducing a hypothesis).

theorem RSA.RSAConfig.L1_lt_cross_of_cross_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (u₁ u₂ : U) (w : W) (h : cfg.L1agent.score u₁ w * cfg.L1agent.totalScore u₂ < cfg.L1agent.score u₂ w * cfg.L1agent.totalScore u₁) :
cfg.L1 u₁ w < cfg.L1 u₂ w

L1 cross-utterance comparison via cross-product. Different u means different denominators, so we use the cross-product trick (no positivity hypothesis needed — it's derived from the cross-product).

theorem RSA.RSAConfig.S1_lt_of_score_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (l : cfg.Latent) (w : W) (u₁ u₂ : U) (h : (cfg.S1agent l).score w u₁ < (cfg.S1agent l).score w u₂) :
cfg.S1 l w u₁ < cfg.S1 l w u₂

S1 same-state comparison via score comparison.

Both sides share (l, w), hence the same S1agent.totalScore w.

theorem RSA.RSAConfig.S1_eq_of_score_eq {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (l : cfg.Latent) (w : W) (u₁ u₂ : U) (h : (cfg.S1agent l).score w u₁ = (cfg.S1agent l).score w u₂) :
cfg.S1 l w u₁ = cfg.S1 l w u₂

S1 same-state equality via score equality.

theorem RSA.RSAConfig.S1_lt_iff_score_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (l : cfg.Latent) (w : W) (u₁ u₂ : U) :
cfg.S1 l w u₁ < cfg.S1 l w u₂ (cfg.S1agent l).score w u₁ < (cfg.S1agent l).score w u₂

Iff form of S1_lt_of_score_lt: at the same (l, w), S1 strict ordering coincides with S1agent.score strict ordering. Use this when rewriting a goal in score form (rather than reducing a hypothesis).

theorem RSA.RSAConfig.S1_lt_cross_of_cross_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (l : cfg.Latent) (w₁ w₂ : W) (u : U) (h : (cfg.S1agent l).score w₁ u * (cfg.S1agent l).totalScore w₂ < (cfg.S1agent l).score w₂ u * (cfg.S1agent l).totalScore w₁) :
cfg.S1 l w₁ u < cfg.S1 l w₂ u

S1 cross-world comparison via cross-product (different w → different denominator).

theorem RSA.RSAConfig.L0_lt_of_meaning_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (l : cfg.Latent) (u : U) (w₁ w₂ : W) (h : cfg.meaning cfg.initial l u w₁ < cfg.meaning cfg.initial l u w₂) :
cfg.L0 l u w₁ < cfg.L0 l u w₂

L0 cross-world comparison via meaning comparison.

Both sides share (l, u), hence the same L0agent.totalScore u. The score IS the meaning, so this reduces to a comparison of meaning values.

theorem RSA.RSAConfig.L0_lt_iff_meaning_lt {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (l : cfg.Latent) (u : U) (w₁ w₂ : W) :
cfg.L0 l u w₁ < cfg.L0 l u w₂ cfg.meaning cfg.initial l u w₁ < cfg.meaning cfg.initial l u w₂

Iff form of L0_lt_of_meaning_lt: at the same (l, u), L0 strict ordering coincides with meaning strict ordering (the L0 score IS the meaning).

theorem RSA.RSAConfig.L1agent_score_eq {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (u : U) (w : W) :
cfg.L1agent.score u w = cfg.worldPrior w * l : cfg.Latent, cfg.latentPrior w l * cfg.S1 l w u

L1 score in closed form: L1agent.score u w = worldPrior w · Σ_l latentPrior w l · S1 l w u.

This unfolds the RationalAction.score projection to the explicit sum, making the worldPrior and latentPrior factors visible for further structural reasoning.

theorem RSA.RSAConfig.L1agent_score_unique_eq {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) [Unique cfg.Latent] (u : U) (w : W) :
cfg.L1agent.score u w = cfg.worldPrior w * cfg.latentPrior w default * cfg.S1 default w u

L1 score when the latent type has a unique inhabitant: the latent sum collapses to a single term, giving a clean three-factor product.

Common case: most basic RSA models (Frank & Goodman 2012, Lassiter & Goodman 2017, etc.) use no latent variable, i.e. Latent = Unit, which carries a Unique instance automatically.

theorem RSA.RSAConfig.S1_belief_score_lt_iff {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (l : cfg.Latent) (w : W) (u₁ u₂ : U) (hShape : ∀ (l0 : UW) (α : ) (l : cfg.Latent) (w : W) (u : U), cfg.s1Score l0 α l w u = (l0 u w).rpow α) :
(cfg.S1agent l).score w u₁ < (cfg.S1agent l).score w u₂ (cfg.L0 l u₁ w).rpow cfg.α < (cfg.L0 l u₂ w).rpow cfg.α

Belief-based S1 score: monotonicity in L0(u, w).

For s1Score = fun l0 α _ w u => Real.rpow (l0 u w) α, the S1 score inherits strict monotonicity from rpow _ α (when α > 0).

Application:

-- Goal: (cfg.S1agent l).score w u₁ < (cfg.S1agent l).score w u₂
apply S1_belief_score_lt_iff.mpr
-- New goal: cfg.L0 l u₁ w < cfg.L0 l u₂ w

The hypothesis hShape pins down the scoring rule; if your model uses a different s1Score, write a parallel lemma for that shape.

theorem RSA.RSAConfig.S1_action_score_lt_iff {U : Type u_1} {W : Type u_2} [Fintype U] [Fintype W] (cfg : RSAConfig U W) (cost : U) (l : cfg.Latent) (w : W) (u₁ u₂ : U) (hShape : ∀ (l0 : UW) (α : ) (_l : cfg.Latent) (w : W) (u : U), cfg.s1Score l0 α _l w u = Real.exp (α * (Real.log (l0 u w) - cost u))) :
(cfg.S1agent l).score w u₁ < (cfg.S1agent l).score w u₂ cfg.α * (Real.log (cfg.L0 l u₁ w) - cost u₁) < cfg.α * (Real.log (cfg.L0 l u₂ w) - cost u₂)

Action-based S1 score: monotonicity in α · (log L0 - cost).

For s1Score = fun l0 α _ w u => Real.exp (α * (Real.log (l0 u w) - cost u)), the S1 score inherits strict monotonicity from exp and the linearity of α · (log _ - cost _).