Linglib.Theories.Processing.Lexical.Discriminative.Defs

Discriminative Lexicon Model — substrate #

@cite{baayen-2019} @cite{heitmeier-chuang-baayen-2026} @cite{chuang-bell-tseng-baayen-2026} @cite{lu-chuang-baayen-2026}

The Discriminative Lexicon Model (DLM) is a theory of lexical processing in which form ↔ meaning relations are mediated by error-driven networks rather than by stored lexical entries (@cite{baayen-2019}). This file lifts the DLM substrate out of the two paper-anchored Studies files that introduced it (Phenomena/Tone/Studies/ChuangEtAl2026.lean, Phenomena/Tone/Studies/LuChuangBaayen2026.lean) into a shared location. The substrate has graduated to Theories/ per CLAUDE.md's ≥ 2-consumer rule.

Architectural placement: Processing/Lexical, not Lexicon #

The DLM is housed under Theories/Processing/Lexical/ rather than under a hypothetical Theories/Lexicon/ for a substantive reason: the DLM denies that there is a separable lexicon to theorize about — its central architectural commitment is that the lexicon-as-stored-inventory does not exist; only the connection weights of comprehension and production networks do (@cite{baayen-2019}, abstract; see also the discussion in @cite{lu-chuang-baayen-2026} §5 on word-and-paradigm processing). Filing the DLM under "Lexicon" would commit linglib to the very architecture the model rejects. By contrast, the DLM is unambiguously a processing theory in the sense of Theories/Processing/: it specifies how form vectors map to meaning vectors and back, with the network weights as the operative computational object.

The Lexical/ parent groups DLM with future lexical-processing sibling theories (e.g. WEAVER++ lexical-access models, Dell-style spreading-activation, race / dual-route models, exemplar-based lexical access). All such theories share the architectural concern of how form ↔ meaning links are processed; they may disagree on whether items are stored.

This means Theories/Phonology/ItemSpecificity/ (frequency channels parameterised by a stored lexicon: @cite{zuraw-2000}, @cite{pater-2010}, @cite{coetzee-pater-2008}, @cite{moore-cantwell-2021}) and Theories/Morphology/UsageBased/Network.lean (Bybee 1985 dynamic network: @cite{bybee-1985}) intentionally do not sit alongside DLM here — they make positive lexicon-architecture commitments and belong with the linguistic level whose generalisations they primarily explain.

What is in this file #

Out of scope here #

structure Theories.Processing.Lexical.Discriminative.LinearDiscriminativeLexicon (R : Type u_1) (F : Type u_2) (M : Type u_3) [Semiring R] [AddCommMonoid F] [AddCommMonoid M] [Module R F] [Module R M] :
Type (max u_2 u_3)

The Linear Discriminative Lexicon Model (LDL) — a pair of linear maps between form and meaning vector spaces.

The "L" in LDL stands for Linear; we bake that into the type by using mathlib's LinearMap. Carrier-typed: applies to any [Module R F]/[Module R M] pair, not just Fin n → ℝ.

There are exactly two fields, both linear maps. There is no entries : List ... field, no frequency : ... → ℝ field. The "lexicon" of an LDL is its two maps; nothing else is stored. See the module docstring's "Architectural placement" note for the substantive reason this is housed under Processing/.

  • comprehension : F →ₗ[R] M
  • production : M →ₗ[R] F
Instances For
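To make the two-field shape concrete, here is a minimal sketch of an inhabitant, assuming `import Mathlib` plus the structure above; taking both carriers to be ℝ and both maps to be the identity is purely illustrative, not library code:

```lean
import Mathlib
-- (assumes the Linglib declarations above are also in scope)
open Theories.Processing.Lexical.Discriminative

-- A toy LDL with one-dimensional form and meaning spaces over ℝ:
-- the "lexicon" is literally nothing but these two linear maps.
example : LinearDiscriminativeLexicon ℝ ℝ ℝ :=
  { comprehension := LinearMap.id
    production := LinearMap.id }
```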
    @[reducible, inline]
    abbrev Theories.Processing.Lexical.Discriminative.FormVec (formDim : ℕ) :
    Type

    A formDim-dimensional form vector over ℝ, indexed by Fin formDim. Used by both consumer studies (Chuang 2026: formDim = 50; Lu 2026: formDim = 100) for pitch contours sampled at uniform points on a normalised time axis. The substrate is dimension-polymorphic; the specific dimension is paper-specific.

    Equations
    Instances For
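Because the dimension is just a parameter, a consumer study pins it with a one-line instantiation; the names below are hypothetical, for illustration only:

```lean
-- Hypothetical paper-specific instantiations (names are not library code):
abbrev ChuangFormVec : Type := FormVec 50   -- Chuang, Bell, Tseng & Baayen 2026
abbrev LuFormVec : Type := FormVec 100      -- Lu, Chuang & Baayen 2026
```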
      @[reducible, inline]
      abbrev Theories.Processing.Lexical.Discriminative.MeaningVec (meaningDim : ℕ) :
      Type

      A meaningDim-dimensional meaning vector over ℝ, indexed by Fin meaningDim. Used by both consumer studies for context-specific contextualised embeddings (768-dim CKIP GPT-2 in both papers).

      Equations
      Instances For
        theorem Theories.Processing.Lexical.Discriminative.linear_dlm_distinguishes_meanings {R : Type u_1} {F : Type u_2} {M : Type u_3} [Semiring R] [AddCommGroup F] [AddCommGroup M] [Module R F] [Module R M] (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M} (h : D.production (e₁ - e₂) ≠ 0) :
        D.production e₁ ≠ D.production e₂

        Structural meaning-discrimination lemma. For any LinearDiscriminativeLexicon, if the production map sends the difference of two meanings to a non-zero vector, then the meanings map to distinct forms.

        Derived from map_sub + sub_ne_zero — no DLM-specific reasoning; the linearity of the production map does all the work.
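Restated in a self-contained context, the derivation is two steps (binder names follow the signature above; assumes `import Mathlib` and the structure definition):

```lean
-- Sketch: linearity (map_sub) plus sub_ne_zero do all the work.
example {R F M : Type*} [Semiring R] [AddCommGroup F] [AddCommGroup M]
    [Module R F] [Module R M]
    (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M}
    (h : D.production (e₁ - e₂) ≠ 0) :
    D.production e₁ ≠ D.production e₂ := by
  rw [map_sub] at h        -- h : D.production e₁ - D.production e₂ ≠ 0
  exact sub_ne_zero.mp h   -- sub_ne_zero : a - b ≠ 0 ↔ a ≠ b
```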

        theorem Theories.Processing.Lexical.Discriminative.dlm_neutralizes_meanings_in_kernel {R : Type u_1} {F : Type u_2} {M : Type u_3} [Semiring R] [AddCommGroup F] [AddCommGroup M] [Module R F] [Module R M] (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M} (h : D.production (e₁ - e₂) = 0) :
        D.production e₁ = D.production e₂

        Structural meaning-neutralization lemma (sister of linear_dlm_distinguishes_meanings). For any LinearDiscriminativeLexicon, if the production map sends the difference of two meanings to zero, then the meanings map to identical forms.

        Architectural significance: the kernel of the production map is the neutralization locus — meaning-pairs the DLM cannot distinguish in form. When two empirically-distinct meanings have form identity, that identity follows from their meaning-difference landing in ker production, not from a stipulated rule. The Lu, Chuang & Baayen 2026 study applies this to Taiwan Mandarin T3 tone sandhi (Phenomena/Tone/Studies/LuChuangBaayen2026.lean): T3-T3 and T2-T3 word centroids in CE space differ only by a kernel-element of the DLM's trained production map, so their surface contours coincide — no phonological sandhi rule invoked.

        Derived from map_sub + sub_eq_zero, dual to the discrimination lemma.
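The dual derivation, sketched the same way and under the same assumptions:

```lean
-- Sketch: map_sub plus sub_eq_zero, mirroring the discrimination lemma.
example {R F M : Type*} [Semiring R] [AddCommGroup F] [AddCommGroup M]
    [Module R F] [Module R M]
    (D : LinearDiscriminativeLexicon R F M) {e₁ e₂ : M}
    (h : D.production (e₁ - e₂) = 0) :
    D.production e₁ = D.production e₂ := by
  rw [map_sub] at h        -- h : D.production e₁ - D.production e₂ = 0
  exact sub_eq_zero.mp h   -- sub_eq_zero : a - b = 0 ↔ a = b
```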

        The "broadcast first coordinate" linear map: e ↦ (i ↦ e[0]). The simplest non-trivial LinearMap from MeaningVec d to FormVec n for d, n ≥ 1. Used as a witness for the architectural-capacity existence theorem and re-used by consumer studies (e.g. LuChuangBaayen2026.dlm_dialect_flexibility) to construct concrete non-degenerate DLMs without re-implementing it.

        Equations
        • One or more equations did not get rendered due to their size.
        Instances For
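Since the equations are elided above, here is a sketch of how such a map can be written, assuming FormVec n and MeaningVec d unfold to Fin n → ℝ and Fin d → ℝ; the primed name marks it as an illustration rather than the library's actual definition:

```lean
-- Sketch: every output coordinate reads the input's first coordinate.
-- Additivity and homogeneity hold pointwise by definitional unfolding,
-- so `rfl` discharges both proof obligations.
def broadcastFirstCoord' {n d : ℕ} (hd : 0 < d) :
    (Fin d → ℝ) →ₗ[ℝ] (Fin n → ℝ) where
  toFun e := fun _ => e ⟨0, hd⟩
  map_add' _ _ := rfl
  map_smul' _ _ := rfl
```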
          @[simp]
          theorem Theories.Processing.Lexical.Discriminative.broadcastFirstCoord_apply {n d : ℕ} (hd : 0 < d) (e : MeaningVec d) (i : Fin n) :
          (broadcastFirstCoord hd) e i = e ⟨0, hd⟩

          Evaluation of broadcastFirstCoord at any sample index returns the input's coordinate-0 value. @[simp] so consumer simp-only calls can drop the LinearMap.coe_mk, AddHom.coe_mk litany.
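A hypothetical consumer-side goal that the lemma then closes by simp alone (assuming the declarations above are in scope):

```lean
-- No LinearMap.coe_mk / AddHom.coe_mk unfolding needed at the call site.
example {n d : ℕ} (hd : 0 < d) (e : MeaningVec d) (i : Fin n) :
    (broadcastFirstCoord hd) e i = e ⟨0, hd⟩ := by
  simp
```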

          theorem Theories.Processing.Lexical.Discriminative.linear_dlm_admits_meaning_specific_contours {n d : ℕ} (hn : 0 < n) (hd : 0 < d) :
          ∃ (D : LinearDiscriminativeLexicon ℝ (FormVec n) (MeaningVec d)) (e₁ : MeaningVec d) (e₂ : MeaningVec d), e₁ ≠ e₂ ∧ D.production e₁ ≠ D.production e₂

          Architectural capacity theorem. For any positive form and meaning dimensions, the LinearDiscriminativeLexicon substrate admits a DLM and two distinct meanings whose production-map images differ. The substrate is non-vacuously capable of meaning discrimination.

          The empirical claim of the consumer papers (that the trained network in fact distinguishes most meaning pairs in the corpus) is paper-supplied; this theorem establishes only that the architecture has the structural capacity for such discrimination.
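A sketch of one possible witness, using the broadcast map as production, the zero map as comprehension, and the constant-0 / constant-1 meaning vectors; all of these choices are assumed here for illustration, not fixed by the theorem:

```lean
-- Production broadcasts the first coordinate, so the constant-0 and
-- constant-1 meanings surface as distinct (constant) contours.
example {n d : ℕ} (hn : 0 < n) (hd : 0 < d) :
    ∃ (D : LinearDiscriminativeLexicon ℝ (FormVec n) (MeaningVec d))
      (e₁ e₂ : MeaningVec d), e₁ ≠ e₂ ∧ D.production e₁ ≠ D.production e₂ := by
  refine ⟨⟨0, broadcastFirstCoord hd⟩, fun _ => 0, fun _ => 1, ?_, ?_⟩
  · intro h; simpa using congrFun h ⟨0, hd⟩   -- meanings differ at coordinate 0
  · intro h; simpa using congrFun h ⟨0, hn⟩   -- contours differ at sample 0
```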