Generalised Surprisal Configuration #
@cite{giulianelli-etal-2026}
Enum-level configuration for the generalised surprisal family. The
real-valued semantics of these enum tags lives in IAS.lean; this file
just enumerates the parameter axes.
A generalised surprisal model has four parameters:
- A warping function f mapping expected scores to processing measures
- A scoring function g measuring how well alternatives match the target
- A forecast horizon h: how many future symbols are considered
- A representational level: the abstraction at which alternatives are compared
Standard surprisal is the special case (negLog, indicator, 1, predictive). Incremental information value is the family (identity, distance, h, l).
Scope note #
Per linglib's processing-library scope (CLAUDE.md): this file formalizes the parameter space of a processing-theory family. It does not formalize psycholinguistic measurement instruments (N400, P600, RT, cloze, etc.) or empirical-fit tables — those are out of scope. Per-paper empirical findings about which (h, l) configuration best predicts which measure live in study-file docstring prose with citations, not as Lean theorems.
Main definitions #
SurprisalConfig: Complete generalised surprisal parameter tuplestandardSurprisal: The configuration corresponding to @cite{levy-2008}informationValue: The IAS configuration at a given (horizon, level)ias_recovers_surprisal: Standard surprisal is a special case of IAS
Equations
- Theories.Processing.PredictiveUncertainty.instDecidableEqWarpingFn x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
Scoring functions measuring prediction accuracy. g(a, w, c) evaluates alternative a against target w in context c.
- indicator : ScoringFn
𝟙{w ≤ a}: binary prefix match. With negLog → standard surprisal.
- distance : ScoringFn
d_r(a, w): representational distance. With identity → information value.
- similarity : ScoringFn
sim(r(a), r(w)): semantic similarity. @cite{meister-giulianelli-pimentel-2024}
Instances For
Equations
- Theories.Processing.PredictiveUncertainty.instDecidableEqScoringFn x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
Forecast horizon: how many future symbols each alternative spans. h = 1 is standard surprisal's implicit horizon (next word only).
Equations
Instances For
Representational level at which predictions are evaluated.
These tags name layers of abstraction — the kind of representational space in which alternatives are compared. They are not claims about specific layers of any particular neural network.
- lexical : RepLevel
Decontextualised lexical identity (token / embedding)
- shallowSyntactic : RepLevel
Shallow syntactic structure (linear order, POS)
- syntactic : RepLevel
Compositional syntactic structure
- semantic : RepLevel
Fully contextualised semantic content
- predictive : RepLevel
Predictive distribution over next symbols
Instances For
Equations
- Theories.Processing.PredictiveUncertainty.instDecidableEqRepLevel x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- One or more equations did not get rendered due to their size.
Instances For
A generalised surprisal model: the complete parameter set for a specific processing measure.
- warp : WarpingFn
- scoring : ScoringFn
- horizon : ForecastHorizon
- level : RepLevel
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Standard surprisal: −log P(next word). @cite{levy-2008} @cite{smith-levy-2013}
Equations
- One or more equations did not get rendered due to their size.
Instances For
Incremental information value at temporal-representational resolution (h, l). @cite{giulianelli-etal-2026}
Equations
- One or more equations did not get rendered due to their size.
Instances For
Standard surprisal is IAS at horizon 1 with predictive-level representation and negLog/indicator replacing identity/distance. Subsumption by construction.