Log-Sum-Exp: The Warped Semiring Bridging MaxEnt and Max-Plus #

@cite{cohen-smith-zaidan-2008} @cite{eisner-2002}

The log-sum-exp operator at temperature α > 0, lse α a b := (1/α) · log(exp(α·a) + exp(α·b)), is the additive operator of a commutative ("warped") semiring on ℝ whose multiplicative operator is +. Three things to know:

Algebra: it satisfies commutativity, associativity, and the semiring distributivity d + (a ⊕_α b) = (d + a) ⊕_α (d + b) for all α ≠ 0. Identity is -∞ (omitted here; see WithBot ℝ for the typeclass-level packaging).
Limit (in LogSumExp/Limit.lean): as α → ∞, lse α a b → max a b.
Bridge to softmax (in Decoder.lean extensions): the softmax probability admits the partition-function form softmax α s c = exp(α · (s c - lseFinset α s f)), which collapses the softmax → argmax limit to the lse → max limit.

Important caveat #

LSE is not idempotent at finite α — lse α a a = a + log 2 / α. So while it is a commutative semiring, it is not a Kleene algebra. Idempotency is exactly what is recovered in the α → ∞ limit, and is what makes max-plus suitable for shortest-path style computation. The "warp" terminology emphasises this deviation from idempotence: lse α only behaves like max up to a temperature-dependent slack.

source

noncomputable def Core.Constraint.lse (α a b : ℝ) :

ℝ

Log-sum-exp at temperature α: lse α a b := (1/α) · log(exp(α·a) + exp(α·b)).

Equations

Core.Constraint.lse α a b = 1 / α * Real.log (Real.exp (α * a) + Real.exp (α * b))

Instances For

source

theorem Core.Constraint.exp_alpha_lse (α : ℝ) (hα : α ≠ 0) (a b : ℝ) :

Real.exp (α * lse α a b) = Real.exp (α * a) + Real.exp (α * b)

Cancellation identity (α ≠ 0): exp(α · lse α a b) = exp(α·a) + exp(α·b).

This is the key building block — every algebraic identity for lse reduces to an identity inside the log, accessed by exponentiating both sides and using this lemma.

source

theorem Core.Constraint.lse_comm (α a b : ℝ) :

lse α a b = lse α b a

LSE is commutative.

source

theorem Core.Constraint.lse_assoc (α : ℝ) (hα : α ≠ 0) (a b c : ℝ) :

lse α (lse α a b) c = lse α a (lse α b c)

LSE is associative (for α ≠ 0).

Proof: both sides equal (1/α) · log(exp(α·a) + exp(α·b) + exp(α·c)) by the cancellation identity exp_alpha_lse.

source

theorem Core.Constraint.add_lse (α : ℝ) (hα : α ≠ 0) (d a b : ℝ) :

d + lse α a b = lse α (d + a) (d + b)

The semiring distributivity law: + distributes over lse α (for α ≠ 0).

Equivalently, multiplication by exp(α·d) distributes over the sum inside the log. This is the "multiplicative" axiom of the warped semiring (ℝ, lse α, +) — the analogue of d * (a + b) = d*a + d*b.

source

theorem Core.Constraint.lse_self (α : ℝ) (hα : α ≠ 0) (a : ℝ) :

lse α a a = a + Real.log 2 / α

The non-idempotency identity: lse α a a = a + log 2 / α for α ≠ 0.

This is the precise statement that LSE is NOT a tropical/max-plus operation at finite α — there is a log 2 / α "slack" that vanishes only in the α → ∞ limit.

source

noncomputable def Core.Constraint.lseFinset {C : Type u_1} (α : ℝ) (s : Finset C) (f : C → ℝ) :

ℝ

The n-ary log-sum-exp over a Finset: lseFinset α s f := (1/α) · log(∑ c ∈ s, exp(α · f c)).

Equals (1/α) · log Z where Z = ∑ exp(α · f c) is the MaxEnt partition function. The softmax probability of c ∈ s then has the clean form exp(α · (f c - lseFinset α s f)).

Equations

Core.Constraint.lseFinset α s f = 1 / α * Real.log (∑ c ∈ s, Real.exp (α * f c))

Instances For

source

theorem Core.Constraint.exp_alpha_lseFinset {C : Type u_1} (α : ℝ) (hα : α ≠ 0) {s : Finset C} (hne : s.Nonempty) (f : C → ℝ) :

Real.exp (α * lseFinset α s f) = ∑ c ∈ s, Real.exp (α * f c)

Cancellation: exp(α · lseFinset α s f) = ∑ c ∈ s, exp(α · f c) when α ≠ 0 and s is non-empty.

source

theorem Core.Constraint.lseFinset_singleton {C : Type u_1} (α : ℝ) (hα : α ≠ 0) (c : C) (f : C → ℝ) :

lseFinset α {c} f = f c

The singleton case: lseFinset α {c} f = f c.

Witnesses that the n-ary lse degenerates to the underlying score on a single-candidate set — there is nothing to aggregate.

Documentation

Linglib.Core.Constraint.Dequantization.LogSumExp.Basic

Log-Sum-Exp: The Warped Semiring Bridging MaxEnt and Max-Plus #

Important caveat #