
Linglib.Core.Constraint.Decoder

Decoders — From Scored Candidates to Probability Distributions #

A Decoder is the choice component of a constraint-based grammar. Given a finite set of candidates, each carrying a score, a decoder returns a probability distribution over those candidates:

  scored candidates  --[Decoder]-->  P : Cand → ℝ

Different constraint frameworks correspond to different decoders. This table shows how the major OT/HG-family frameworks all instantiate the same Decoder interface, varying only the score type and the choice rule:

  Framework      Score type          Decoder
  OT             LexProfile Nat n    argmin (uniform over winners)
  HG             ℝ (harmony)         argmax (uniform over winners)
  MaxEnt         ℝ (harmony)         softmax (Gumbel-noise argmax)
  Normal MaxEnt  ℝ (harmony)         probit (Gaussian-noise argmax)
  NHG            ℝ (harmony)         weight-perturbed argmax
  Stochastic OT  LexProfile Nat n    ranking-perturbed argmin

The unifying principle is McFadden's Random Utility Model (@cite{mcfadden-1974}): every stochastic decoder is "argmax after a noise kernel applied to the scores". The noise distribution determines the choice rule (Gumbel ⇒ softmax, Gaussian ⇒ probit, point-mass ⇒ deterministic argmax). Noise-kernel-based decoders live in a sibling file; here we just expose the Decoder interface and the three foundational instances.
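As an informal schema (notation ours, not from this file): a random-utility decoder draws i.i.d. noise ε per candidate, perturbs each score, and takes the argmax, so the decoded probability of a candidate is

  P(c | cands) = Pr_ε [ score(c) + ε_c ≥ score(c') + ε_{c'}  for all c' ∈ cands ]

Gumbel-distributed ε gives this probability the softmax closed form, Gaussian ε gives the probit rule, and point-mass ε degenerates to deterministic argmax.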

Zero-temperature bridge #

softmaxDecoder α interpolates between MaxEnt-style soft optimization and OT-style hard optimization: as α → ∞, the softmax distribution concentrates on the unique maximizer (when one exists), recovering argmaxDecoder. This is the formal statement of the OT-as-MaxEnt-limit correspondence and is captured by softmax_argmax_limit in Core.Agent.RationalAction.
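A statement-only sketch of the bridge (the precise form of softmax_argmax_limit may differ; `sorry` marks the omitted proof): when the score has a unique maximizer c on cands, the softmax probability of c tends to 1 as α → ∞.

```lean
open Core.Constraint in
example {Cand : Type*} (cands : Finset Cand) (score : Cand → ℝ) (c : Cand)
    (hc : c ∈ cands) (huniq : ∀ c' ∈ cands, c' ≠ c → score c' < score c) :
    Filter.Tendsto (fun α => (softmaxDecoder α).decode cands score c)
      Filter.atTop (nhds 1) := by
  sorry -- concentration of softmax on the unique maximizer
```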

Semiring connection (deferred) #

For deterministic (point-mass noise) decoders, the resulting algebraic structure on scores is a semiring, and the zero-temperature limit softmax → argmax is precisely the semiring homomorphism log-sum-exp → max in the limit α → ∞. A separate Semiring.lean file will make these bridges precise.
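The score-level analogue can be sketched directly (hypothetical name, not defined in this file): log-sum-exp "addition" at temperature α tends to max as α → ∞, mirroring softmax → argmax.

```lean
/-- Log-sum-exp at temperature `α`: the "addition" of the soft semiring.
    As `α → ∞` this tends to `max x y`. -/
noncomputable def lseAdd (α x y : ℝ) : ℝ :=
  (1 / α) * Real.log (Real.exp (α * x) + Real.exp (α * y))
```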

structure Core.Constraint.Decoder (Cand : Type u_1) (Score : Type u_2) :
Type (max u_1 u_2)

A decoder maps scored candidates to a probability distribution.

Decoder Cand Score packages a single function: given a finite candidate set and a Score-valued evaluation, return a real-valued probability for each candidate.

The structure does not enforce that outputs are non-negative or sum to 1 — those are properties of well-behaved decoders, captured by IsProbDecoder below. Keeping the structure plain makes it easy to define non-normalized scoring functions (e.g., raw exp(score) rather than full softmax) and prove their probabilistic properties separately.

  • decode : Finset Cand → (Cand → Score) → Cand → ℝ

    Decode a finite scored candidate set into a probability assignment.

Instances For
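For instance, a decoder returning raw unnormalised weights is a perfectly legal inhabitant of the structure (illustrative only; rawExpDecoder is not defined in this file):

```lean
open Core.Constraint in
/-- Unnormalised weights: a legal `Decoder` that is *not* a probability
    decoder, since its outputs need not sum to 1. -/
noncomputable def rawExpDecoder {Cand : Type*} : Decoder Cand ℝ where
  decode cands score c := if c ∈ cands then Real.exp (score c) else 0
```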
class Core.Constraint.Decoder.IsProb {Cand : Type u_1} {Score : Type u_2} (d : Decoder Cand Score) : Prop

A decoder is a probability decoder when its outputs are non-negative and sum to 1 over any non-empty candidate set.

  • nonneg (cands : Finset Cand) (score : Cand → Score) (c : Cand) : 0 ≤ d.decode cands score c

    All decoded probabilities are non-negative.

  • sum_eq_one {cands : Finset Cand} : cands.Nonempty → ∀ (score : Cand → Score), ∑ c ∈ cands, d.decode cands score c = 1

    Decoded probabilities sum to 1 over any non-empty candidate set.

Instances
noncomputable def Core.Constraint.softmaxDecoder {Cand : Type u_1} (α : ℝ) :
Decoder Cand ℝ

The MaxEnt decoder: softmax over real-valued scores at temperature α.

decode cands score c = exp(α · score(c)) / Σ_{c' ∈ cands} exp(α · score(c')) when c ∈ cands, and 0 otherwise.

Equivalent to Core.softmax from RationalAction.lean but generalised from Fintype to a finite subset cands of a possibly-infinite type.

Equations
• One or more equations did not get rendered due to their size.
Instances For
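A hypothetical usage (proof elided with `sorry`): with a constant score function, every candidate receives the same probability at any temperature, since exp(α · 0) / (3 · exp(α · 0)) = 1/3.

```lean
open Core.Constraint in
example (α : ℝ) (c : Fin 3) :
    (softmaxDecoder α).decode Finset.univ (fun _ => (0 : ℝ)) c = 1 / 3 := by
  sorry -- uniform scores ⇒ uniform distribution
```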
noncomputable def Core.Constraint.argmaxDecoder {Cand : Type u_1} {Score : Type u_2} [LinearOrder Score] :
Decoder Cand Score

The HG-style decoder: uniform distribution over the maximizers of score on cands. Deterministic (Dirac on the unique winner) when the maximum is achieved by exactly one candidate.

Used by deterministic HG (with Score = ℝ and score = harmonyScoreR) and is the α → ∞ limit of softmaxDecoder (see softmax_argmax_limit).

Equations
• One or more equations did not get rendered due to their size.
Instances For
noncomputable def Core.Constraint.argminDecoder {Cand : Type u_1} {Score : Type u_2} [LinearOrder Score] :
Decoder Cand Score

The OT-style decoder: uniform distribution over the minimizers of score on cands.

For Optimality Theory, instantiate with Score = LexProfile Nat n, so that the lexicographic minimum on the violation-count profile selects the optimal candidate(s). With a strict total ranking and a single optimum, this is a Dirac on the OT winner.

Equations
• One or more equations did not get rendered due to their size.
Instances For
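A toy instantiation (illustrative only; proof elided with `sorry`): with Score = ℕ standing in for violation counts, the candidate with fewest violations receives all the probability mass.

```lean
open Core.Constraint in
-- Two candidates with 2 and 0 violations: argmin puts probability 1
-- on candidate 1, the unique minimiser.
example : (argminDecoder (Cand := Fin 2) (Score := ℕ)).decode
    Finset.univ ![2, 0] 1 = 1 := by
  sorry
```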
instance Core.Constraint.softmaxDecoder_isProb {Cand : Type u_1} (α : ℝ) :
(softmaxDecoder α).IsProb

The softmax decoder is a proper probability decoder: outputs are non-negative and sum to 1 over any non-empty candidate set.

instance Core.Constraint.argmaxDecoder_isProb {Cand : Type u_1} {Score : Type u_2} [LinearOrder Score] :
argmaxDecoder.IsProb

The argmax decoder is a proper probability decoder. The non-trivial part is that the set of winners is non-empty when cands is, so the uniform distribution 1/winners.card makes sense.

instance Core.Constraint.argminDecoder_isProb {Cand : Type u_1} {Score : Type u_2} [LinearOrder Score] :
argminDecoder.IsProb

The argmin decoder is a proper probability decoder. Symmetric to the argmax case.