Documentation

Linglib.Phenomena.Morphology.Studies.AlbrightHayes2003

Albright & Hayes (2003): Rules vs. analogy in English past tenses #

@cite{albright-hayes-2003} @cite{berko-1958} @cite{albright-hayes-2002} @cite{mikheev-1997} @cite{pinker-prince-1988} @cite{bybee-moder-1983}

A computational/experimental study of how speakers form past tenses for novel English verbs (wug verbs). The paper's central architectural claim is that morphological knowledge is best modelled as multiple stochastic rules — each with a structural description, a scope, a hit count, and an adjusted-confidence score — and that this model fits human wug-test data better than either a purely analogical model or a single-default-rule dual-mechanism model.

Architectural commitments #

Three positions are at stake:

Empirical core: islands of reliability for both regulars and irregulars #

A&H's central empirical contribution is that wug ratings of regular past tenses also show context sensitivity, contrary to the single-default-rule prediction. The 4-way Core stimulus design crosses island-of-reliability (IOR) status for regulars × IOR for irregulars, and the published rating data show:

with no significant interaction. Both regulars and irregulars are sensitive to IOR membership.

What this file formalises #

This is the second consumer of Paradigms/WugTest.lean (the first is @cite{breiss-katsuda-kawahara-2026}). It supplies:

Out of scope #

Per CLAUDE.md "do not encode conclusions as definitions": we do not formalise the numerical correlation tables (r = 0.745 etc.) as Lean theorems with rfl proofs. The numbers are reported in prose and the paper-side citation. We formalise the qualitative prediction-shape contrasts that the empirical correlations support.

We also do not implement the @cite{mikheev-1997} lower-confidence-limit interval. The discriminator below depends only on rawConfidence and on the qualitative shape of the prediction (gradient on novel cells across IOR membership), not on the adjustment formula. We expose adjustedConfidence as a placeholder definition equal to rawConfidence so that downstream code can reference the API name; wiring this to a real Wilson interval (or the @cite{albright-hayes-2002} MGL implementation) is deferred.

Island-of-reliability category for a wug stem. The 4-way Core design crosses (IOR for regulars) × (IOR for irregulars); a stem is in exactly one cell, picked out by two booleans. Structural encoding via product avoids the 4-way enum + 2 helper accessors pattern: the cells of a 2×2 design are the boolean product. Table 3 of @cite{albright-hayes-2003}.

  • iorForRegular : Bool

    Whether the stem is in an island of reliability for the regular allomorph. Example for regular = true: bredge /brɛdʒ/.

  • iorForIrregular : Bool

    Whether the stem is in an island of reliability for some irregular pattern. Example for iorForIrregular = true only: spling /splɪŋ/ (close to spring/sling/sting).

Instances For
    Equations
    • One or more equations did not get rendered due to their size.
    Instances For
      Equations
      • One or more equations did not get rendered due to their size.
      Instances For

        Named cells of Table 3, retained as abbrevs so that #

        paper-side terminology survives in the witness definitions.

        @[reducible, inline]

        IOR for both regulars and irregulars: e.g. dize.

        Equations
        Instances For
          @[reducible, inline]

          IOR for regulars only: e.g. bredge.

          Equations
          Instances For
            @[reducible, inline]

            IOR for irregulars only: e.g. spling.

            Equations
            Instances For
              @[reducible, inline]

              IOR for neither: e.g. gude.

              Equations
              Instances For

                A wug stem with its IPA form and its IOR category. The IPA strings are taken verbatim from example (14) of @cite{albright-hayes-2003}.

                Instances For
                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For

                    Sample stems from each cell of Table 3 (example 14) #

                    A past-tense structural change (the "input → output" half of a rule). The three regular allomorphs and a residual category for vowel-changing irregulars and zero-derivation.

                    Instances For
                      @[implicit_reducible]
                      Equations
                      Equations
                      • One or more equations did not get rendered due to their size.
                      Instances For

                        A stochastic rule: a structural change applied in a structurally- defined context, together with its scope (number of forms in the lexicon meeting the structural description) and hits (number of those forms on which the change actually obtains).

                        The bundled invariant hitsscope is a real-data property of rules extracted by a minimal-generalization procedure: every form counted as a hit must be in the scope. This is the structural fact that makes rawConfidence ≤ 1.

                        The structural-description / context is kept abstract — A&H's rules are extracted from the lexicon by a minimal-generalization procedure, and the discriminator below does not depend on any specific encoding of the contexts.

                        Instances For
                          Equations
                          • One or more equations did not get rendered due to their size.
                          Instances For

                            Raw confidence: hits / scope. Defaults to 0 when scope = 0 to avoid division by zero; that case never arises for a rule extracted from real data. @cite{albright-hayes-2003}.

                            Equations
                            Instances For

                              @cite{mikheev-1997} lower-confidence-limit adjustment to raw confidence, used by @cite{albright-hayes-2003} to penalise rules supported by few forms; A&H §2.3.4 reports the best-fit lower- confidence-limit parameter α = 0.55.

                              TODO: this is a placeholder equal to rawConfidence. A faithful implementation would apply the Wilson-style interval used in the @cite{albright-hayes-2002} MGL code. The discriminator below depends on rawConfidence, not on this adjustment, so the placeholder is sound for the present proof obligations.

                              Equations
                              Instances For

                                A cell in the A&H wug-rating paradigm. Carries:

                                • the stem being rated;
                                • whether the stem is presented as a wug (novel) or a real verb (attested) — A&H's cross-paradigm comparison;
                                • the IOR-for-regular factor — the propositional phonological- context dimension that A&H's experiments turn on. The field is Prop rather than Bool because IOR-membership is a propositional property of the stimulus, not a designed numeric coordinate; mathlib quality requires Prop with [Decidable] for such fields rather than Bool standing in for a proposition.
                                Instances For
                                  @[implicit_reducible]

                                  The Paradigms/WugTest.lean HasAttestation instance: BKK and A&H both use the same wug paradigm contract. Lens laws by rfl on the structure projections.

                                  Equations
                                  • One or more equations did not get rendered due to their size.
                                  @[reducible, inline]

                                  A&H's discriminator runs on the categorical island membership dimension; the WugTest paradigm contract handles this through the HasFactor Cell Prop specialisation, parallel to HasFrequency = HasFactor Cell ℝ. The lens-law shape is shared.

                                  The Prop factor inherits its < from mathlib's complete-Boolean- algebra structure on Prop (p < q ↔ (p → q) ∧ ¬(q → p)), so NovelShowsFactorGradient (F := Prop) instantiates to "rate is strictly higher under any pair where the second IOR proposition strictly entails the first" — exactly A&H's prediction reading IOR as a propositional property.

                                  Equations
                                  Instances For
                                    @[implicit_reducible]
                                    Equations
                                    • One or more equations did not get rendered due to their size.
                                    @[reducible, inline]

                                    A rate observable shows the novel-regulars IOR gradient if, on novel cells, switching the IOR-for-regular factor from false to true strictly raises the rate. The shared paradigm-level predicate NovelShowsFactorGradient (F := Bool) already expresses exactly this: the only Bool pair satisfying f₁ < f₂ is false < true. This is the A&H multiple-stochastic-rule prediction: novel regulars receive higher ratings when the stem occupies an island where the regular allomorph works particularly well.

                                    Equations
                                    Instances For
                                      @[reducible, inline]

                                      A rate observable is invariant in IOR for novel regulars if, on novel cells, switching the IOR-for-regular factor leaves the rate unchanged. This is the single-default-rule dual-mechanism prediction: regular pasts are derived by one rule whose confidence does not vary with phonological context, so novel regular ratings cannot vary with IOR membership. Specialises NovelInvariantInFactor (F := Prop).

                                      Equations
                                      Instances For

                                        These definitions are concrete witnesses that the A&H prediction- shape NovelRegularsShowIORGradient is realised by a model. The model is a step function: ratings on IOR=true cells equal slope, ratings on IOR=false cells equal 0. The shape is intentionally minimal — the goal is to exhibit a model satisfying the gradient, not to fit the empirical numbers.

                                        Step-function regular-rating model: rating = slope when the IOR-for-regular proposition holds, 0 otherwise. A faithful proxy for the monotonic relationship between IOR-supported rule- confidence and novel-form ratings reported in @cite{albright-hayes-2003} for the regulars panel. Noncomputable because the IOR-for-regular field is Prop rather than Bool; Classical.propDecidable discharges the Decidable-of-if.

                                        Equations
                                        Instances For

                                          A&H's model satisfies NovelRegularsShowIORGradient for any positive rating slope. The Prop-valued IOR factor satisfies f₁ < f₂ iff f₁ → f₂ and ¬(f₂ → f₁); the only consistent case (modulo classical reasoning) is ¬f₁ ∧ f₂, on which the rate jumps from 0 to slope.

                                          A concrete AHWugCell witness — the wug stem bredge (regulars-only IOR) presented as a novel form. Used as the discriminator-corollary witness below.

                                          Equations
                                          • One or more equations did not get rendered due to their size.
                                          Instances For

                                            A&H rules out the single-default-rule dual-mechanism prediction (the @cite{pinker-prince-1988} family). Wired through Paradigms/WugTest.lean's novelGradient_inconsistent_with_invariance at F := Bool: the empirical fact that novel regulars show an IOR gradient is structurally incompatible with the single-rule prediction that novel regulars are invariant in phonological context. Any account in the latter family is ruled out by structural impossibility, not just empirical fit.

                                            NB: this discriminator only captures A&H's anti-dual-mechanism prong. A&H also argue against pure analogy via §4.3.2 ("Failure of the analogical model to locate islands of reliability"); the structured-vs-variegated similarity contrast that drives the anti-analogy prong is not formalised here. See @cite{bybee-moder-1983} for the analogical tradition A&H argue against.