Documentation

Linglib.Processing.Expectation.LanguageModel

Language Model as a Markov Kernel #

An autoregressive language model over a vocabulary Voc is a Markov kernel List Voc → PMF (Option Voc) from contexts to next-symbol distributions, with none denoting end-of-string.

This is the smallest cross-cutting primitive shared by everything in Processing/: surprisal-based theories, the IAS family ([GWCF26]), and any downstream measure that wants a unified notion of "the LM's predictive distribution at a context".

PMF is mathlib's probability monad over a (countable) type (PMF α := { f : α → ℝ≥0∞ // HasSum f 1 }); using it here gives the language-model layer the canonical Markov-kernel typing without imposing [Fintype Voc] (vocabularies like String or token streams need not be finite).

Main definitions #

structure Processing.LanguageModel.LangModel (Voc : Type u_1) :
Type u_1

An autoregressive language model over a vocabulary Voc, expressed as a Markov kernel from contexts to next-symbol distributions. A draw of none denotes end-of-string.

  • next : List VocPMF (Option Voc)

    Conditional distribution over Option Voc given a context.

Instances For
    def Processing.LanguageModel.LangModel.nextProb {Voc : Type u_1} (lm : LangModel Voc) (c : List Voc) (w : Voc) :
    ENNReal

    Conditional probability of the next symbol w given context c.

    Equations
    Instances For
      noncomputable def Processing.LanguageModel.LangModel.surprisal {Voc : Type u_1} (lm : LangModel Voc) (c : List Voc) (w : Voc) :

      Surprisal of the next symbol w given context c (in nats). This is the classical Shannon information content [Lev08].

      Equations
      Instances For