
Linglib.Theories.Processing.LanguageModel.Basic

Language Model as a Markov Kernel #

An autoregressive language model over a vocabulary Voc is a Markov kernel List Voc → PMF (Option Voc) from contexts to next-symbol distributions, with none denoting end-of-string.

This is the smallest cross-cutting primitive shared by everything in Theories/Processing/: surprisal-based theories, the IAS family (@cite{giulianelli-etal-2026}), and any downstream measure that wants a unified notion of "the LM's predictive distribution at a context".

PMF is mathlib's probability monad over a (countable) type (PMF α := { f : α → ℝ≥0∞ // HasSum f 1 }); using it here gives the language-model layer the canonical Markov-kernel typing without imposing [Fintype Voc] (vocabularies like String or token streams need not be finite).
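The typing described above can be sketched as follows. This is a minimal, hedged sketch assuming only Mathlib's `PMF`; the actual module may differ in namespacing, docstrings, and universe annotations:

```lean
import Mathlib.Probability.ProbabilityMassFunction.Basic

/-- An autoregressive language model: a Markov kernel from contexts to
    next-symbol distributions. A draw of `none` marks end-of-string. -/
structure LangModel (Voc : Type*) where
  next : List Voc → PMF (Option Voc)
```

Note that nothing here requires `[Fintype Voc]`: `PMF (Option Voc)` only demands that the probabilities sum to 1, which forces countable support but not a finite vocabulary.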

Main definitions #

An autoregressive language model over a vocabulary Voc, expressed as a Markov kernel from contexts to next-symbol distributions. A draw of none denotes end-of-string.

  • next : List Voc → PMF (Option Voc)

    Conditional distribution over Option Voc given a context.

Instances For
    def Theories.Processing.LanguageModel.LangModel.nextProb {Voc : Type u_1} (lm : LangModel Voc) (c : List Voc) (w : Voc) : ENNReal

    Conditional probability of the next symbol w given context c.

    Equations
    Instances For
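      The equations for `nextProb` are omitted above. A plausible sketch, assuming the `LangModel` structure described in the introduction, simply evaluates the kernel at `some w` (a `PMF` coerces to a function via `FunLike`):

      ```lean
      def LangModel.nextProb {Voc : Type*} (lm : LangModel Voc)
          (c : List Voc) (w : Voc) : ENNReal :=
        lm.next c (some w)
      ```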
      noncomputable def Theories.Processing.LanguageModel.LangModel.surprisal {Voc : Type u_1} (lm : LangModel Voc) (c : List Voc) (w : Voc) :

      Surprisal of the next symbol w given context c (in nats). This is the classical Shannon information content @cite{levy-2008}.

      Equations
      Instances For
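      The equations and return type for `surprisal` are omitted above. A hedged sketch returning a real number of nats, assuming the `nextProb` definition from this page and the junk-value behaviour of `ENNReal.toReal` at `∞`:

      ```lean
      import Mathlib.Analysis.SpecialFunctions.Log.Basic

      noncomputable def LangModel.surprisal {Voc : Type*} (lm : LangModel Voc)
          (c : List Voc) (w : Voc) : ℝ :=
        -Real.log (lm.nextProb c w).toReal
      ```

      Using `Real.log` (base e) matches the "in nats" remark in the docstring; a bits-based variant would divide by `Real.log 2`.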