Documentation

Linglib.Core.Probability.Scores

PMFs from rational score vectors #

RSA-style models compute unnormalized scores in ℚ≥0 — exact, computable, and with order decidable by kernel reduction — and state predictions about the normalized distributions. This file provides the constructor and its comparison calculus.

Fallback σ completes zero-mass score rows: a study declares one fallback per normalization site, and both faces read it — the ℚ≥0 values through scoresWith, the distribution through ofScores — so the two cannot diverge and the bridge ofScores_apply is total (a rfl). Comparison, threshold, product, and event lemmas then reduce every prediction to a ℚ≥0 inequality between scoresWith values, closed by decide +kernel.

normalizeOrUniform is the generic ℝ≥0∞ total normalizer (the total sibling of PMF.normalize); [UPSTREAM] candidate.

Main definitions #

PMF.Fallback — total fallback distribution (value + nonzero-mass witness) with smart constructors pure, uniform, uniformOn.
PMF.normalizeScores f — f x / ∑ f, the ℚ≥0 shadow of PMF.normalize.
PMF.scoresWith fb f — the normalized ℚ≥0 score function both faces read.
PMF.ofScores fb f — the induced PMF; ofScores_apply is rfl.
PMF.normalizeOrUniform — total ℝ≥0∞ normalization.

Main results #

ofScores_lt / ofScores_lt_cross / _le / _eq — predictions from scoresWith comparisons.
ofScores_lt_ratCast and companions — thresholds against ℚ≥0 literals.
prod_ofScores_apply, prod_ofScores_lt — chain-rule trajectory products.
eventMass, ofScores_event_eq, ofScores_event_lt — event marginals.

Total normalization #

noncomputable def PMF.normalizeOrUniform {σ : Type u_1} [Fintype σ] [Nonempty σ] (f : σ → ENNReal) :

PMF σ

Total normalization: normalize f when it has positive finite mass, else fall back to the uniform distribution. Unlike PMF.normalize, no positivity witness is needed at the definition site, so u ↦ normalizeOrUniform (score u) is a well-defined kernel even when some rows are dead. [UPSTREAM] candidate (total sibling of PMF.normalize).

Equations

PMF.normalizeOrUniform f = if h : ∑' (x : σ), f x ≠ 0 ∧ ∑' (x : σ), f x ≠ ⊤ then PMF.normalize f ⋯ ⋯ else PMF.uniformOfFintype σ

Instances For

theorem PMF.normalizeOrUniform_apply {σ : Type u_1} [Fintype σ] [Nonempty σ] {f : σ → ENNReal} (h0 : ∑' (x : σ), f x ≠ 0) (hT : ∑' (x : σ), f x ≠ ⊤) (x : σ) :

(normalizeOrUniform f) x = f x * (∑' (x' : σ), f x')⁻¹

theorem PMF.normalizeOrUniform_ofReal_lt_cross {σ : Type u_1} [Fintype σ] [Nonempty σ] {τ : Type u_3} [Fintype τ] [Nonempty τ] {f : σ → ℝ} {g : τ → ℝ} {a : σ} {b : τ} (hf : ∀ (x : σ), 0 ≤ f x) (hg : ∀ (y : τ), 0 ≤ g y) (hfs : 0 < ∑ x : σ, f x) (hgs : 0 < ∑ y : τ, g y) (h : f a / ∑ x : σ, f x < g b / ∑ y : τ, g y) :

(normalizeOrUniform fun (x : σ) => ENNReal.ofReal (f x)) a < (normalizeOrUniform fun (y : τ) => ENNReal.ofReal (g y)) b

Comparison of normalizeOrUniform values built from nonnegative real scores: reduces to the normalized-score inequality. Real-face analogue of ofScores_lt_cross.

Posterior dominance #

Comparing two posteriors at a cell needs no global algebra: pointwise dominance of the odds f a : f c by g a : g c (strict at one cell) transfers to the normalized values. This is Bayes-factor monotonicity.

theorem Finset.div_sum_lt_div_sum {ι : Type u_3} {α : Type u_4} [Fintype ι] [Semifield α] [LinearOrder α] [IsStrictOrderedRing α] {f g : ι → α} {a : ι} (hfs : 0 < ∑ i : ι, f i) (hgs : 0 < ∑ i : ι, g i) (hle : ∀ (i : ι), f a * g i ≤ g a * f i) (hlt : ∃ (i : ι), f a * g i < g a * f i) :

f a / ∑ i : ι, f i < g a / ∑ i : ι, g i

Pointwise odds dominance implies posterior dominance: if f a · g c ≤ g a · f c at every cell, strictly at one, then f a / ∑ f < g a / ∑ g.

Fallbacks #

structure PMF.Fallback (σ : Type u_3) [Fintype σ] :

Type u_3

A total fallback distribution completing a zero-mass score row: a ℚ≥0 weight vector with nonzero mass. A study declares one fallback per normalization site; scoresWith and ofScores both read it, so the ℚ≥0 and PMF faces agree by construction.

dist : σ → ℚ≥0
The fallback weights (normalized by scoresWith).
ne_zero : ∑ x : σ, self.dist x ≠ 0
The weights carry mass.

Instances For

def PMF.Fallback.pure {σ : Type u_1} [Fintype σ] [DecidableEq σ] (a : σ) :

Point-mass fallback (e.g. a designated null utterance).

Equations

PMF.Fallback.pure a = { dist := fun (x : σ) => if x = a then 1 else 0, ne_zero := ⋯ }

Instances For

def PMF.Fallback.uniform {σ : Type u_1} [Fintype σ] [Nonempty σ] :

Uniform fallback.

Equations

PMF.Fallback.uniform = { dist := fun (x : σ) => 1, ne_zero := ⋯ }

Instances For

def PMF.Fallback.uniformOn {σ : Type u_1} [Fintype σ] [DecidableEq σ] (S : Finset σ) (hS : S.Nonempty) :

Uniform fallback over a nonempty subset (e.g. words with viable continuations, scene members).

Equations

PMF.Fallback.uniformOn S hS = { dist := fun (x : σ) => if x ∈ S then 1 else 0, ne_zero := ⋯ }

Instances For

Scores #

def PMF.normalizeScores {σ : Type u_1} [Fintype σ] (f : σ → ℚ≥0) (x : σ) :

ℚ≥0

Normalize a score vector into a distribution function (÷0 = 0 gives the zero function on a dead vector) — the ℚ≥0 shadow of PMF.normalize. Studies define each agent of a model tower as one normalizeScores application over the agents below it.

Equations

PMF.normalizeScores f x = f x / ∑ y : σ, f y

Instances For

theorem PMF.normalizeScores_mul_left {σ : Type u_1} [Fintype σ] {c : ℚ≥0} (hc : c ≠ 0) (f : σ → ℚ≥0) :

(normalizeScores fun (x : σ) => c * f x) = normalizeScores f

Normalization is scale-invariant: a common nonzero factor cancels. The reason RSA chains may drop constant factors mid-tower.

def PMF.scoresWith {σ : Type u_1} [Fintype σ] (fb : Fallback σ) (f : σ → ℚ≥0) (x : σ) :

ℚ≥0

The normalized score function both faces read: f normalized when it has mass, else the fallback normalized.

Kernel hygiene: comparisons of scoresWith values reduce under decide +kernel provided the score chain's base tables are pattern matches or Bool-valued tables — a propositional if x = y then … else … over a derived DecidableEq anywhere in the chain blocks kernel reduction of order comparisons (equalities still reduce).

Equations

PMF.scoresWith fb f x = if 0 < ∑ y : σ, f y then PMF.normalizeScores f x else PMF.normalizeScores fb.dist x

Instances For

theorem PMF.scoresWith_sum_eq_one {σ : Type u_1} [Fintype σ] (fb : Fallback σ) (f : σ → ℚ≥0) :

∑ x : σ, scoresWith fb f x = 1

theorem PMF.scoresWith_of_pos {σ : Type u_1} [Fintype σ] (fb : Fallback σ) {f : σ → ℚ≥0} (h : 0 < ∑ y : σ, f y) (x : σ) :

scoresWith fb f x = f x / ∑ y : σ, f y

noncomputable def PMF.ofScores {σ : Type u_1} [Fintype σ] (fb : Fallback σ) (f : σ → ℚ≥0) :

PMF σ

The PMF induced by a score vector and a fallback: pointwise the coercion of scoresWith, so the ℚ≥0 face and the PMF face agree definitionally.

Equations

PMF.ofScores fb f = ⟨fun (x : σ) => ↑↑(PMF.scoresWith fb f x), ⋯⟩

Instances For

theorem PMF.ofScores_apply {σ : Type u_1} [Fintype σ] (fb : Fallback σ) (f : σ → ℚ≥0) (x : σ) :

(ofScores fb f) x = ↑↑(scoresWith fb f x)

The total bridge: ofScores values are exactly the coerced scoresWith values — no hypotheses, by definition.

Comparisons #

Each lemma reduces a PMF-level prediction to a ℚ≥0 inequality between scoresWith values, which decide +kernel closes.

theorem PMF.ofScores_lt_cross {σ : Type u_1} [Fintype σ] {τ : Type u_3} [Fintype τ] (fb₁ : Fallback σ) (fb₂ : Fallback τ) {f : σ → ℚ≥0} {g : τ → ℚ≥0} {a : σ} {b : τ} (h : scoresWith fb₁ f a < scoresWith fb₂ g b) :

(ofScores fb₁ f) a < (ofScores fb₂ g) b

theorem PMF.ofScores_lt {σ : Type u_1} [Fintype σ] (fb : Fallback σ) {f : σ → ℚ≥0} {a b : σ} (h : scoresWith fb f a < scoresWith fb f b) :

(ofScores fb f) a < (ofScores fb f) b

theorem PMF.ofScores_le_cross {σ : Type u_1} [Fintype σ] {τ : Type u_3} [Fintype τ] (fb₁ : Fallback σ) (fb₂ : Fallback τ) {f : σ → ℚ≥0} {g : τ → ℚ≥0} {a : σ} {b : τ} (h : scoresWith fb₁ f a ≤ scoresWith fb₂ g b) :

(ofScores fb₁ f) a ≤ (ofScores fb₂ g) b

theorem PMF.ofScores_eq_cross {σ : Type u_1} [Fintype σ] {τ : Type u_3} [Fintype τ] (fb₁ : Fallback σ) (fb₂ : Fallback τ) {f : σ → ℚ≥0} {g : τ → ℚ≥0} {a : σ} {b : τ} (h : scoresWith fb₁ f a = scoresWith fb₂ g b) :

(ofScores fb₁ f) a = (ofScores fb₂ g) b

Thresholds against rational literals #

theorem PMF.ofScores_lt_ratCast {σ : Type u_1} [Fintype σ] (fb : Fallback σ) {f : σ → ℚ≥0} {a : σ} {t : ℚ≥0} (h : scoresWith fb f a < t) :

(ofScores fb f) a < ↑↑t

theorem PMF.ratCast_lt_ofScores {σ : Type u_1} [Fintype σ] (fb : Fallback σ) {f : σ → ℚ≥0} {a : σ} {t : ℚ≥0} (h : t < scoresWith fb f a) :

↑↑t < (ofScores fb f) a

theorem PMF.ofScores_eq_ratCast {σ : Type u_1} [Fintype σ] (fb : Fallback σ) {f : σ → ℚ≥0} {a : σ} {t : ℚ≥0} (h : scoresWith fb f a = t) :

(ofScores fb f) a = ↑↑t

Chain-rule products #

theorem PMF.prod_ofScores_apply {σ : Type u_1} [Fintype σ] {ι : Type u_3} (s : Finset ι) (fb : ι → Fallback σ) (f : ι → σ → ℚ≥0) (a : ι → σ) :

∏ i ∈ s, (ofScores (fb i) (f i)) (a i) = ↑↑(∏ i ∈ s, scoresWith (fb i) (f i) (a i))

theorem PMF.prod_ofScores_lt {σ : Type u_1} [Fintype σ] {ι : Type u_3} (s : Finset ι) (fb₁ fb₂ : ι → Fallback σ) {f g : ι → σ → ℚ≥0} {a b : ι → σ} (h : ∏ i ∈ s, scoresWith (fb₁ i) (f i) (a i) < ∏ i ∈ s, scoresWith (fb₂ i) (g i) (b i)) :

∏ i ∈ s, (ofScores (fb₁ i) (f i)) (a i) < ∏ i ∈ s, (ofScores (fb₂ i) (g i)) (b i)

Event marginals #

def PMF.eventMass {σ : Type u_1} [Fintype σ] (g : σ → ℚ≥0) (P : σ → Bool) :

ℚ≥0

Mass of a decidable event under a ℚ≥0 weight vector.

Equations

PMF.eventMass g P = ∑ x : σ, if P x = true then g x else 0

Instances For

theorem PMF.ofScores_event_eq {σ : Type u_1} [Fintype σ] (fb : Fallback σ) (f : σ → ℚ≥0) (P : σ → Bool) :

(∑' (x : σ), if P x = true then (ofScores fb f) x else 0) = ↑↑(eventMass (scoresWith fb f) P)

theorem PMF.ofScores_event_lt {σ : Type u_1} [Fintype σ] (fb : Fallback σ) (f : σ → ℚ≥0) {P Q : σ → Bool} (h : eventMass (scoresWith fb f) P < eventMass (scoresWith fb f) Q) :

(∑' (x : σ), if P x = true then (ofScores fb f) x else 0) < ∑' (x : σ), if Q x = true then (ofScores fb f) x else 0

theorem PMF.ofScores_event_lt_cross {σ : Type u_1} [Fintype σ] {τ : Type u_3} [Fintype τ] (fb₁ : Fallback σ) (fb₂ : Fallback τ) (f : σ → ℚ≥0) (g : τ → ℚ≥0) {P : σ → Bool} {Q : τ → Bool} (h : eventMass (scoresWith fb₁ f) P < eventMass (scoresWith fb₂ g) Q) :

(∑' (x : σ), if P x = true then (ofScores fb₁ f) x else 0) < ∑' (y : τ), if Q y = true then (ofScores fb₂ g) y else 0

theorem PMF.ofScores_event_eq_ratCast {σ : Type u_1} [Fintype σ] (fb : Fallback σ) (f : σ → ℚ≥0) (P : σ → Bool) {q : ℚ≥0} (h : eventMass (scoresWith fb f) P = q) :

(∑' (x : σ), if P x = true then (ofScores fb f) x else 0) = ↑↑q

Exact-value form: an event mass that is a rational literal on the ℚ≥0 face is that literal on the PMF face.