Decoders — From Scored Candidates to Probability Distributions #
A Decoder is the choice component of a constraint-based grammar.
Given a finite set of candidates, each carrying a score, a decoder
returns a probability distribution over those candidates:
scored candidates --[Decoder]--> P : Cand → ℝ
Different constraint frameworks correspond to different decoders. This
table shows how the major OT/HG-family frameworks all instantiate the
same Decoder interface, varying only the score type and the choice rule:
| Framework | Score type | Decoder |
|---|---|---|
| OT | LexProfile Nat n | argmin (uniform over winners) |
| HG | ℝ (harmony) | argmax (uniform over winners) |
| MaxEnt | ℝ (harmony) | softmax (Gumbel-noise argmax) |
| Normal MaxEnt | ℝ (harmony) | probit (Gaussian-noise argmax) |
| NHG | ℝ (harmony) | weight-perturbed argmax |
| Stochastic OT | LexProfile Nat n | ranking-perturbed argmin |
The unifying principle is McFadden's Random Utility Model
(@cite{mcfadden-1974}): every stochastic decoder is "argmax after a
noise kernel is applied to the scores". The noise distribution determines
the choice rule (Gumbel ⇒ softmax, Gaussian ⇒ probit, point-mass ⇒
deterministic argmax). Noise-kernel-based decoders live in a sibling
file; here we just expose the Decoder interface and the three
foundational instances.
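The noise-kernel view can be checked numerically. The following is an illustrative Python sketch, independent of the Lean formalization (all names here are hypothetical): perturbing each score with i.i.d. Gumbel noise and taking the argmax reproduces, in the long run, exactly the softmax probabilities.

```python
import math
import random

def softmax(scores, alpha=1.0):
    """Closed-form MaxEnt probabilities at inverse temperature alpha."""
    m = max(alpha * s for s in scores)            # stabilize the exponentials
    exps = [math.exp(alpha * s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def noisy_argmax_freqs(scores, noise, trials=20000, seed=0):
    """Monte-Carlo win frequencies under 'argmax after noise':
    perturb every score with an i.i.d. draw, then take the argmax."""
    rng = random.Random(seed)
    counts = [0] * len(scores)
    for _ in range(trials):
        perturbed = [s + noise(rng) for s in scores]
        counts[perturbed.index(max(perturbed))] += 1
    return [c / trials for c in counts]

scores = [1.0, 0.0, -0.5]
gumbel = lambda rng: -math.log(-math.log(rng.random()))  # standard Gumbel draw
freqs = noisy_argmax_freqs(scores, gumbel)               # empirical choice rule
exact = softmax(scores)                                  # closed-form softmax
```

Swapping the Gumbel draw for `rng.gauss(0, 1)` gives the probit (Normal MaxEnt) choice rule instead; a point-mass draw (`lambda rng: 0.0`) recovers deterministic argmax.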
Zero-temperature bridge #
softmaxDecoder α interpolates between MaxEnt-style soft optimization
and OT-style hard optimization: as α → ∞, the softmax distribution
concentrates on the unique maximizer (when one exists), recovering
argmaxDecoder. This is the formal statement of the OT-as-MaxEnt-limit
correspondence and is captured by softmax_argmax_limit in
Core.Agent.RationalAction.
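The limit is easy to see numerically. A minimal sketch (plain Python, not the Lean development; the helper name is hypothetical): with a unique maximizer, the softmax mass on it increases toward 1 as α grows.

```python
import math

def softmax(scores, alpha):
    # Subtract the max before exponentiating so large alpha stays finite.
    m = max(alpha * s for s in scores)
    exps = [math.exp(alpha * s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

scores = [2.0, 1.5, -1.0]  # unique maximizer at index 0
p1, p10, p100 = (softmax(scores, a)[0] for a in (1.0, 10.0, 100.0))
```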
Semiring connection (deferred) #
For the foundational decoders defined in this file, the induced algebraic structure on scores is a semiring:
- argmaxDecoder over ℝ ↔ max-plus semiring
- argminDecoder over Tropical R ↔ min-plus (tropical) semiring
- softmaxDecoder over ℝ ↔ log-sum-exp ("warped") semiring
The zero-temperature limit softmax → argmax is precisely the semiring
homomorphism log-sum-exp → max in the limit α → ∞. A separate
Semiring.lean file will make these bridges precise.
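As a numeric illustration of that homomorphism (a Python sketch under the assumptions above, not the planned Semiring.lean): the log-sum-exp "addition" at inverse temperature α dominates max and converges to it as α → ∞, while the "multiplication" is ordinary + in both semirings.

```python
import math

def lse(a, b, alpha):
    """'Addition' in the log-sum-exp semiring at inverse temperature alpha:
    (1/alpha) * log(exp(alpha*a) + exp(alpha*b)), computed stably."""
    m = max(alpha * a, alpha * b)
    return (m + math.log(math.exp(alpha * a - m) + math.exp(alpha * b - m))) / alpha
```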
A decoder maps scored candidates to a probability distribution.
Decoder Cand Score packages a single function:
given a finite candidate set and a Score-valued evaluation, return
a real-valued probability for each candidate.
The structure does not enforce that outputs are non-negative or sum
to 1 — those are properties of well-behaved decoders, captured by
IsProbDecoder below. Keeping the structure plain makes it easy to
define non-normalized scoring functions (e.g., raw exp(score) rather
than full softmax) and prove their probabilistic properties separately.
- decode : Finset Cand → (Cand → Score) → Cand → ℝ
Decode a finite scored candidate set into a probability assignment.
A decoder is a probability decoder when its outputs are non-negative and sum to 1 over any non-empty candidate set.
All decoded probabilities are non-negative.
- sum_eq_one {cands : Finset Cand} : cands.Nonempty → ∀ (score : Cand → Score), ∑ c ∈ cands, d.decode cands score c = 1
Decoded probabilities sum to 1 over any non-empty candidate set.
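To make the plain-structure-plus-predicate design concrete, here is an illustrative Python sketch (names hypothetical, independent of the Lean code): a raw `exp(score)` decoder is a perfectly good `Decoder` but fails the `IsProbDecoder` conditions until it is normalized.

```python
import math

def raw_exp_decoder(cands, score):
    """A plain decoder that is NOT a probability decoder:
    returns unnormalized exp(score) weights."""
    return {c: math.exp(score(c)) for c in cands}

def normalize(decode):
    """Turn any positive-valued decoder into a probability decoder."""
    def decode_norm(cands, score):
        p = decode(cands, score)
        z = sum(p.values())
        return {c: v / z for c, v in p.items()}
    return decode_norm

def satisfies_is_prob_decoder(decode, cands, score, tol=1e-9):
    """Check the two IsProbDecoder conditions on one candidate set:
    non-negative outputs that sum to 1."""
    p = decode(cands, score)
    return all(v >= 0.0 for v in p.values()) and abs(sum(p.values()) - 1.0) <= tol
```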
The MaxEnt decoder: softmax over real-valued scores at temperature α.
decode cands score c = exp(α · score(c)) / Σ_{c' ∈ cands} exp(α · score(c'))
when c ∈ cands, and 0 otherwise.
Equivalent to Core.softmax from RationalAction.lean but generalised
from Fintype to a finite subset cands of a possibly-infinite type.
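The formula above, including the "0 outside cands" clause, can be sketched in Python (an illustration only; `softmax_decode` is a hypothetical name, not the Lean definition):

```python
import math

def softmax_decode(cands, score, alpha=1.0):
    """Softmax over the finite set cands: exp(alpha*score(c))/Z inside
    cands, and probability 0 for any candidate outside cands."""
    cands = set(cands)
    m = max(alpha * score(c) for c in cands)   # stabilize the exponentials
    z = sum(math.exp(alpha * score(c) - m) for c in cands)
    def prob(c):
        if c not in cands:
            return 0.0
        return math.exp(alpha * score(c) - m) / z
    return prob

p = softmax_decode({0, 1, 2}, float, alpha=1.0)
```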
The HG-style decoder: uniform distribution over the maximizers of
score on cands. Deterministic (Dirac on the unique winner) when
the maximum is achieved by exactly one candidate.
Used by deterministic HG (with Score = ℝ and score = harmonyScoreR)
and is the α → ∞ limit of softmaxDecoder (see softmax_argmax_limit).
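The uniform-over-maximizers rule, sketched in Python (hypothetical name, not the Lean definition): a tie among maximizers splits the mass evenly; a unique maximizer gets a Dirac.

```python
def argmax_decode(cands, score):
    """Uniform distribution over the maximizers of score on cands."""
    best = max(score(c) for c in cands)
    winners = [c for c in cands if score(c) == best]
    return {c: (1.0 / len(winners) if c in winners else 0.0) for c in cands}
```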
The OT-style decoder: uniform distribution over the minimizers of
score on cands.
For Optimality Theory, instantiate with Score = LexProfile Nat n,
so that the lexicographic minimum on the violation-count profile
selects the optimal candidate(s). With a strict total ranking and a
single optimum, this is a Dirac on the OT winner.
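An OT-flavoured sketch in Python (illustrative; names and the toy tableau are hypothetical): Python tuples compare lexicographically, so tuple-valued violation profiles stand in for `LexProfile`, and argmin over them picks the candidate whose profile is lexicographically least.

```python
def argmin_decode(cands, score):
    """Uniform distribution over the minimizers of score on cands."""
    best = min(score(c) for c in cands)
    winners = [c for c in cands if score(c) == best]
    return {c: (1.0 / len(winners) if c in winners else 0.0) for c in cands}

# Toy OT tableau: violation profiles under a ranking C1 >> C2.
# Tuples compare lexicographically, mirroring LexProfile's order.
profiles = {"cand1": (0, 2), "cand2": (1, 0), "cand3": (0, 1)}
p = argmin_decode(profiles, profiles.get)
```

Here `cand3`'s profile `(0, 1)` is lexicographically least, so the decoder is a Dirac on `cand3`, as in deterministic OT with a strict total ranking.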
The softmax decoder is a proper probability decoder: outputs are non-negative and sum to 1 over any non-empty candidate set.
The argmax decoder is a proper probability decoder. The non-trivial part
is that the set of winners is non-empty when cands is, so the uniform
distribution 1/winners.card makes sense.
The argmin decoder is a proper probability decoder. Symmetric to the argmax case.