Lossy-Context Surprisal: Bridge to Classical Surprisal #

@cite{futrell-gibson-levy-2020}

MemoryProcess (in Basic.lean) is the abstract type underlying the lossy-context surprisal model of @cite{futrell-gibson-levy-2020} ("Lossy-Context Surprisal: An Information-Theoretic Model of Memory Effects in Sentence Processing", Cog Sci 44, e12814).

This file proves the paper's §3.5.1 reduction: when the memory encoder loses no information (a Dirac at some history-summarising function f), expected surprisal collapses to classical surprisal under the language model induced by predict ∘ f.

In the paper's framing (§3.5.1), surprisal theory becomes a definitional special case of lossy-context surprisal: the encoder M giving a lossless representation of context (a Dirac at the true context, or any deterministic summary) recovers Shannon's −log p(w | c).

Main definitions #

MemoryProcess.IsLosslessFor — the process exactly realises an LM
LangModel.virtualLM — the LM induced by composing predict with a history-summarising function

Main theorem #

expectedSurprisal_eq_surprisal_of_lossless — lossless mp of lm yields mp.expectedSurprisal = lm.surprisal

source

def Theories.Processing.NoisyChannel.MemoryProcess.IsLosslessFor {Voc : Type u_1} {Mem : Type u_2} (mp : MemoryProcess Voc Mem) (lm : LanguageModel.LangModel Voc) :

Prop

A memory process is lossless for a language model lm if some deterministic history-summary f makes the encoder a Dirac at f c and the predictor's distribution at f c equal to lm.next c.

(@cite{futrell-gibson-levy-2020} §3.5.1: this is the "perfect memory" regime in which lossy-context surprisal collapses to classical surprisal.)

Equations

mp.IsLosslessFor lm = ∃ (f : List Voc → Mem), mp.IsDirac f ∧ ∀ (c : List Voc), mp.predict (f c) = lm.next c

Instances For

source

theorem Theories.Processing.NoisyChannel.MemoryProcess.expectedSurprisal_eq_surprisal_of_lossless {Voc : Type u_1} {Mem : Type u_2} {mp : MemoryProcess Voc Mem} {lm : LanguageModel.LangModel Voc} (h : mp.IsLosslessFor lm) (c : List Voc) (w : Voc) :

mp.expectedSurprisal c w = lm.surprisal c w

Lossless reduction (§3.5.1). A memory process that is lossless for lm produces exactly the classical surprisal of lm. Lossy-context surprisal generalises surprisal — it does not replace it.

Reading: when no information is lost in encoding, the integral over memory states in expectedSurprisal (Eq. 3) degenerates to a single deterministic prediction, recovering Shannon's -log p(w | c).

source

def Theories.Processing.NoisyChannel.MemoryProcess.virtualLM {Voc : Type u_1} {Mem : Type u_2} (mp : MemoryProcess Voc Mem) (f : List Voc → Mem) :

LanguageModel.LangModel Voc

The "virtual" language model induced by a deterministic history-summary f and a memory predictor predict.

Equations

mp.virtualLM f = { next := fun (c : List Voc) => mp.predict (f c) }

Instances For

source

theorem Theories.Processing.NoisyChannel.MemoryProcess.isLosslessFor_virtualLM {Voc : Type u_1} {Mem : Type u_2} (mp : MemoryProcess Voc Mem) {f : List Voc → Mem} (h : mp.IsDirac f) :

mp.IsLosslessFor (mp.virtualLM f)

A memory process is lossless for its own virtual LM whenever the encoder is a Dirac at f. This is the "construction-side" complement of expectedSurprisal_eq_surprisal_of_lossless: any Dirac encoder does realise some LM, namely virtualLM.

source

theorem Theories.Processing.NoisyChannel.MemoryProcess.expectedSurprisal_eq_virtualLM_surprisal {Voc : Type u_1} {Mem : Type u_2} {mp : MemoryProcess Voc Mem} {f : List Voc → Mem} (h : mp.IsDirac f) (c : List Voc) (w : Voc) :

mp.expectedSurprisal c w = (mp.virtualLM f).surprisal c w

Lossless ⇒ classical surprisal of the virtual LM. Any Dirac memory process realises classical surprisal under its induced language model. This is the reduction in its purely structural form, without an external lm parameter.

Documentation

Linglib.Theories.Processing.NoisyChannel.LossyContext

Lossy-Context Surprisal: Bridge to Classical Surprisal #

Main definitions #

Main theorem #