Linglib.Theories.Processing.Lexical.Discriminative.Defs

Discriminative Lexicon Model — substrate #

@cite{baayen-2019} @cite{heitmeier-chuang-baayen-2026} @cite{chuang-bell-tseng-baayen-2026} @cite{lu-chuang-baayen-2026}

The Discriminative Lexicon Model (DLM) is a theory of lexical processing in which form ↔ meaning relations are mediated by error-driven networks rather than by stored lexical entries (@cite{baayen-2019}). This file lifts the DLM substrate out of the two paper-anchored Studies files that introduced it (Phenomena/Tone/Studies/ChuangEtAl2026.lean, Phenomena/Tone/Studies/LuChuangBaayen2026.lean) into a shared location. The substrate has graduated to Theories/ per CLAUDE.md's ≥ 2-consumer rule.

Architectural placement: Processing/Lexical, not Lexicon #

The DLM is housed under Theories/Processing/Lexical/ rather than under a hypothetical Theories/Lexicon/ for a substantive reason: the DLM denies that there is a separable lexicon to theorize about — its central architectural commitment is that the lexicon-as-stored-inventory does not exist; only the connection weights of comprehension and production networks do (@cite{baayen-2019}, abstract; see also the discussion in @cite{lu-chuang-baayen-2026} §5 on word-and-paradigm processing). Filing the DLM under "Lexicon" would commit linglib to the very architecture the model rejects. By contrast, the DLM is unambiguously a processing theory in the sense of Theories/Processing/: it specifies how form vectors map to meaning vectors and back, with the network weights as the operative computational object.

The Lexical/ parent groups DLM with future lexical-processing sibling theories (e.g. WEAVER++ lexical-access models, Dell-style spreading-activation, race / dual-route models, exemplar-based lexical access). All such theories share the architectural concern of how form ↔ meaning links are processed; they may disagree on whether items are stored.

This means Theories/Phonology/ItemSpecificity/ (frequency channels parameterised by a stored lexicon: @cite{zuraw-2000}, @cite{pater-2010}, @cite{coetzee-pater-2008}, @cite{moore-cantwell-2021}) and Theories/Morphology/UsageBased/Network.lean (Bybee 1985 dynamic network: @cite{bybee-1985}) intentionally do not sit alongside DLM here — they make positive lexicon-architecture commitments and belong with the linguistic level whose generalisations they primarily explain.

What is in this file #

Out of scope here #

structure Theories.Processing.Lexical.Discriminative.LinearDiscriminativeLexicon (R : Type u_1) (F : Type u_2) (M : Type u_3) [Semiring R] [AddCommMonoid F] [AddCommMonoid M] [Module R F] [Module R M] :
Type (max u_2 u_3)

The Linear Discriminative Lexicon Model (LDL) — a pair of linear maps between form and meaning vector spaces.

The "L" in LDL stands for Linear; we bake that into the type by using mathlib's LinearMap. Carrier-typed: applies to any [Module R F]/[Module R M] pair, not just Fin n → ℝ.

There are exactly two fields, both linear maps. There is no entries : List ... field, no frequency : ... → ℝ field. The "lexicon" of an LDL is its two maps; nothing else is stored. See the module docstring's "Architectural placement" note for the substantive reason this is housed under Processing/.

  • comprehension : F →ₗ[R] M
  • production : M →ₗ[R] F
Instances For
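To make the two-field shape concrete, here is a minimal sketch of an inhabitant, assuming `import Mathlib` plus the structure above; taking both carriers to be ℝ and both maps to be the identity is purely illustrative, not library code:

```lean
import Mathlib
-- (assumes the Linglib declarations above are also in scope)
open Theories.Processing.Lexical.Discriminative

-- A toy LDL with one-dimensional form and meaning spaces over ℝ:
-- the "lexicon" is literally nothing but these two linear maps.
example : LinearDiscriminativeLexicon ℝ ℝ ℝ :=
  { comprehension := LinearMap.id
    production := LinearMap.id }
```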
    @[reducible, inline]
    abbrev Theories.Processing.Lexical.Discriminative.FormVec (formDim : ℕ) :
    Type

    A formDim-dimensional form vector over ℝ, indexed by Fin formDim. Used by both consumer studies (Chuang 2026: formDim = 50; Lu 2026: formDim = 100) for pitch contours sampled at uniform points on a normalised time axis. The substrate is dimension-polymorphic; the specific dimension is paper-specific.

    Equations
    Instances For
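Because the dimension is just a parameter, a consumer study pins it with a one-line instantiation; the names below are hypothetical, for illustration only:

```lean
-- Hypothetical paper-specific instantiations (names are not library code):
abbrev ChuangFormVec : Type := FormVec 50   -- Chuang, Bell, Tseng & Baayen 2026
abbrev LuFormVec : Type := FormVec 100      -- Lu, Chuang & Baayen 2026
```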
      @[reducible, inline]
      abbrev Theories.Processing.Lexical.Discriminative.MeaningVec (meaningDim : ℕ) :
      Type

      A meaningDim-dimensional meaning vector over ℝ, indexed by Fin meaningDim. Used by both consumer studies for context-specific contextualised embeddings (768-dim CKIP GPT-2 in both papers).

      Equations
      Instances For
        theorem Theories.Processing.Lexical.Discriminative.linear_dlm_distinguishes_meanings {R : Type u_1} {F : Type u_2} {M : Type u_3} [Semiring R] [AddCommGroup F] [AddCommGroup M] [Module R F] [Module R M] (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M} (h : D.production (e₁ - e₂) ≠ 0) :
        D.production e₁ ≠ D.production e₂

        Structural meaning-discrimination lemma. For any LinearDiscriminativeLexicon, if the production map sends the difference of two meanings to a non-zero vector, then the meanings map to distinct forms.

        Derived from map_sub + sub_ne_zero — no DLM-specific reasoning; the linearity of the production map does all the work.
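Restated in a self-contained context, the derivation is two steps (binder names follow the signature above; assumes `import Mathlib` and the structure definition):

```lean
-- Sketch: linearity (map_sub) plus sub_ne_zero do all the work.
example {R F M : Type*} [Semiring R] [AddCommGroup F] [AddCommGroup M]
    [Module R F] [Module R M]
    (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M}
    (h : D.production (e₁ - e₂) ≠ 0) :
    D.production e₁ ≠ D.production e₂ := by
  rw [map_sub] at h        -- h : D.production e₁ - D.production e₂ ≠ 0
  exact sub_ne_zero.mp h   -- sub_ne_zero : a - b ≠ 0 ↔ a ≠ b
```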

        theorem Theories.Processing.Lexical.Discriminative.dlm_neutralizes_meanings_in_kernel {R : Type u_1} {F : Type u_2} {M : Type u_3} [Semiring R] [AddCommGroup F] [AddCommGroup M] [Module R F] [Module R M] (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M} (h : D.production (e₁ - e₂) = 0) :
        D.production e₁ = D.production e₂

        Structural meaning-neutralization lemma (sister of linear_dlm_distinguishes_meanings). For any LinearDiscriminativeLexicon, if the production map sends the difference of two meanings to zero, then the meanings map to identical forms.

        Architectural significance: the kernel of the production map is the neutralization locus — meaning-pairs the DLM cannot distinguish in form. When two empirically-distinct meanings have form identity, that identity follows from their meaning-difference landing in ker production, not from a stipulated rule. The Lu, Chuang & Baayen 2026 study applies this to Taiwan Mandarin T3 tone sandhi (Phenomena/Tone/Studies/LuChuangBaayen2026.lean): T3-T3 and T2-T3 word centroids in CE space differ only by a kernel-element of the DLM's trained production map, so their surface contours coincide — no phonological sandhi rule invoked.

        Derived from map_sub + sub_eq_zero, dual to the discrimination lemma.
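The dual derivation, sketched the same way and under the same assumptions:

```lean
-- Sketch: map_sub plus sub_eq_zero, mirroring the discrimination lemma.
example {R F M : Type*} [Semiring R] [AddCommGroup F] [AddCommGroup M]
    [Module R F] [Module R M]
    (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M}
    (h : D.production (e₁ - e₂) = 0) :
    D.production e₁ = D.production e₂ := by
  rw [map_sub] at h        -- h : D.production e₁ - D.production e₂ = 0
  exact sub_eq_zero.mp h   -- sub_eq_zero : a - b = 0 ↔ a = b
```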

        The "broadcast first coordinate" linear map: e ↦ (i ↦ e[0]). The simplest non-trivial LinearMap from MeaningVec d to FormVec n for d, n ≥ 1. Used as a witness for the architectural-capacity existence theorem and re-used by consumer studies (e.g. LuChuangBaayen2026.dlm_dialect_flexibility) to construct concrete non-degenerate DLMs without re-implementing it.

        Equations
        • One or more equations did not get rendered due to their size.
        Instances For
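Since the equations are elided above, here is a sketch of how such a map can be written, assuming FormVec n and MeaningVec d unfold to Fin n → ℝ and Fin d → ℝ; the primed name marks it as an illustration rather than the library's actual definition:

```lean
-- Sketch: every output coordinate reads the input's first coordinate.
-- Additivity and homogeneity hold pointwise by definitional unfolding,
-- so `rfl` discharges both proof obligations.
def broadcastFirstCoord' {n d : ℕ} (hd : 0 < d) :
    (Fin d → ℝ) →ₗ[ℝ] (Fin n → ℝ) where
  toFun e := fun _ => e ⟨0, hd⟩
  map_add' _ _ := rfl
  map_smul' _ _ := rfl
```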
          @[simp]
          theorem Theories.Processing.Lexical.Discriminative.broadcastFirstCoord_apply {n d : ℕ} (hd : 0 < d) (e : MeaningVec d) (i : Fin n) :
          (broadcastFirstCoord hd) e i = e ⟨0, hd⟩

          Evaluation of broadcastFirstCoord at any sample index returns the input's coordinate-0 value. @[simp] so consumer simp-only calls can drop the LinearMap.coe_mk, AddHom.coe_mk litany.
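A hypothetical consumer-side goal that the lemma then closes by simp alone (assuming the declarations above are in scope):

```lean
-- No LinearMap.coe_mk / AddHom.coe_mk unfolding needed at the call site.
example {n d : ℕ} (hd : 0 < d) (e : MeaningVec d) (i : Fin n) :
    (broadcastFirstCoord hd) e i = e ⟨0, hd⟩ := by
  simp
```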

          theorem Theories.Processing.Lexical.Discriminative.linear_dlm_admits_meaning_specific_contours {n d : ℕ} (hn : 0 < n) (hd : 0 < d) :
          ∃ (D : LinearDiscriminativeLexicon ℝ (FormVec n) (MeaningVec d)) (e₁ : MeaningVec d) (e₂ : MeaningVec d), e₁ ≠ e₂ ∧ D.production e₁ ≠ D.production e₂

          Architectural capacity theorem. For any positive form and meaning dimensions, the LinearDiscriminativeLexicon substrate admits a DLM and two distinct meanings whose production-map images differ. The substrate is non-vacuously capable of meaning discrimination.

          The empirical claim of the consumer papers (that the trained network in fact distinguishes most meaning pairs in the corpus) is paper-supplied; this theorem establishes only that the architecture has the structural capacity for such discrimination.
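A sketch of one possible witness, using the broadcast map as production, the zero map as comprehension, and the constant-0 / constant-1 meaning vectors; all of these choices are assumed here for illustration, not fixed by the theorem:

```lean
-- Production broadcasts the first coordinate, so the constant-0 and
-- constant-1 meanings surface as distinct (constant) contours.
example {n d : ℕ} (hn : 0 < n) (hd : 0 < d) :
    ∃ (D : LinearDiscriminativeLexicon ℝ (FormVec n) (MeaningVec d))
      (e₁ e₂ : MeaningVec d), e₁ ≠ e₂ ∧ D.production e₁ ≠ D.production e₂ := by
  refine ⟨⟨0, broadcastFirstCoord hd⟩, fun _ => 0, fun _ => 1, ?_, ?_⟩
  · intro h; simpa using congrFun h ⟨0, hd⟩   -- meanings differ at coordinate 0
  · intro h; simpa using congrFun h ⟨0, hn⟩   -- contours differ at sample 0
```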