Linglib.Studies.SmithMoskalEtAl2019

Smith, Moskal, Xu, Kang & Bobaljik (2019): Case and Number Suppletion in Pronouns

[SMX+19]

[SMX+19] extend [bobaljik-2012]'s structural-containment account of *ABA in adjectival degree suppletion (good–better–best / *good–better–goodest) to two further empirical domains: pronominal case suppletion (using [caha-2009]'s case hierarchy as the structural backbone) and pronominal number suppletion (using [harbour-2008] / [noyer-1992] for the number-feature geometry).

The cross-domain extension is not seamless: the paper identifies three points where the empirical generalizations require theoretical refinement of the Bobaljik 2012 framework:

  1. §3.6 — AAB attestation diverges across domains. AAB patterns (e.g., a paradigm where positive and comparative share a root and the superlative is suppletive) are systematically unattested in adjectival degree but are attested in pronominal case and pronominal number. This is the divergence formalized below.

  2. §3.7 — accessibility-domain locality replaces structural/linear adjacency. Adjacency is too strict once AAB is admitted: Tamil shows dative case suppletion across the plural morpheme, "neither linearly nor structurally adjacent to the root". The paper adopts [Mos15b]'s accessibility domain (AD) — "the first category-defining node above the root, and one node above that" — a trigger-relative bound on what may condition root suppletion, formalized below as DomainLocal.

  3. §4.3.1–§4.3.3 — number representation and markedness × suppletion. Cross-linguistic variation in pronominal number suppletion correlates with independent evidence for variation in the internal complexity / markedness of the number head — connecting suppletion theory to the feature recursion of [Har14a] (already substrate in this codebase, see Syntax/Minimalist/Phi/Recursion.lean).

This file formalizes (1) directly: the terminal-adjacent fragment of the realizational engine (Morphology/Containment/Vocabulary.lean, [Bob12]'s structural-adjacency locality) predicts AAB exclusion in every domain it applies to — realize_const_of_terminal_adjacent forces the two inner cells to share a root; the empirical data the paper reports falsifies that prediction in case and number. (2) is formalized in § 4: DomainLocal beside Adjacent in the engine's condition menu, the domain-relativized plateau, and the generability converse — the Wardaman-shaped AAB realization is generated by an AD-local vocabulary. (3) remains a substrate-addition TODO.

Scope of the formalization

The structural-adjacency fragment of the realizational engine — terminal items (no portmanteaux) with adjacent conditioning, the locality [Bob12] assumes for degree — forces CMPR-cell = SPRL-cell for any generable root pattern (Morphology.Containment.realize_const_of_terminal_adjacent). This is strictly stronger than contiguity: it excludes both *ABA and *AAB.

The two theorems below state that content at two granularities: the general "no generable pattern has CMPR ≠ SPRL", and the specific corollary "the AAB shape is not generable." The paper's §3.7 move is precisely to drop these hypotheses for case and number — replacing structural adjacency with domain-based locality — which is why the axioms are à la carte Props rather than baked into the rule type.
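The plateau can be illustrated with a small self-contained toy model. This is only a sketch under stated assumptions: `Rule`, `pick`, `realizeAt`, `adjacent` and the two vocabularies are hypothetical names introduced here, not the `Morphology.Containment` API. A rule is reduced to a conditioning threshold plus a root index, and adjacency is approximated as "no threshold above cell 1".

```lean
namespace AdjacencySketch

/-- Hypothetical toy rule: `root` may be inserted in any cell at or above `threshold`. -/
structure Rule where
  threshold : Nat
  root : Nat

/-- Elsewhere-style choice: among matching rules, the highest threshold wins. -/
def pick (cell : Nat) (acc : Option Rule) (r : Rule) : Option Rule :=
  if r.threshold ≤ cell then
    match acc with
    | some b => if b.threshold < r.threshold then some r else some b
    | none => some r
  else acc

/-- Root realized in `cell` (0 = POS, 1 = CMPR, 2 = SPRL), if any rule matches. -/
def realizeAt (v : List Rule) (cell : Nat) : Option Nat :=
  match v.foldl (pick cell) none with
  | some r => some r.root
  | none => none

/-- Toy stand-in for structural adjacency: no rule is conditioned above cell 1. -/
def adjacent (v : List Rule) : Bool :=
  v.all (fun r => decide (r.threshold ≤ 1))

/-- An ABB vocabulary: elsewhere root 0, comparative-conditioned root 1. -/
def abbVocab : List Rule := [⟨0, 0⟩, ⟨1, 1⟩]

example : adjacent abbVocab = true := by decide

-- The plateau: the CMPR and SPRL cells receive the same root.
example : realizeAt abbVocab 1 = realizeAt abbVocab 2 := by decide

/-- Separating SPRL from CMPR needs a rule conditioned at cell 2, which is not adjacent. -/
def aabVocab : List Rule := [⟨0, 0⟩, ⟨2, 1⟩]

example : adjacent aabVocab = false := by decide

end AdjacencySketch
```

The toy checks single vocabularies only; the general statement over all terminal-adjacent vocabularies is what realize_const_of_terminal_adjacent provides.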

Elsewhere insertion under [Bob12]'s structural adjacency (terminal items, adjacent contexts) cannot generate a pattern whose CMPR cell differs from its SPRL cell.

theorem SmithMoskalEtAl2019.dm_excludes_aab {v : List (Morphology.Containment.ExponenceRule 3 ℕ)} (hT : Morphology.Containment.Terminal v) (hAdj : Morphology.Containment.Adjacent v) {a b : ℕ} (hab : a ≠ b) :
Morphology.Containment.realize v ≠ ![some a, some a, some b]

Specific corollary: the AAB shape (POS = CMPR ≠ SPRL) cannot be generated under structural adjacency.

[SMX+19] §3.6 distinguishes two kinds of AAB pattern in pronominal case suppletion:

We encode two genuine AAB witnesses, Wardaman 3SG and Khinalugh 2SG.

Projecting onto the 3-cell ABS/ERG/DAT hierarchy, both patterns have the shape [0, 0, 1] (positive and middle cells share root-class 0, suppletive third cell takes root-class 1).
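The projection onto root classes can be sketched as a classifier over three indices. `Shape` and `classify` are hypothetical names chosen for this illustration; they are not the library's `degreeShape`, whose definition is not shown on this page.

```lean
namespace ShapeSketch

/-- Hypothetical 3-cell shape labels (not the library's `degreeShape`). -/
inductive Shape where
  | AAA | AAB | ABB | ABA | ABC
  deriving DecidableEq, Repr

/-- Classify a triple of root-class indices by which cells share a root. -/
def classify (c0 c1 c2 : Nat) : Shape :=
  if c0 = c1 then
    if c1 = c2 then Shape.AAA else Shape.AAB
  else if c1 = c2 then Shape.ABB
  else if c0 = c2 then Shape.ABA
  else Shape.ABC

-- Wardaman 3SG (narnaj / narnaj-(j)i / gunga) projects to root classes 0, 0, 1.
example : classify 0 0 1 = Shape.AAB := by decide

-- An ABB paradigm, for contrast: the second and third cells share a root.
example : classify 0 1 1 = Shape.ABB := by decide

end ShapeSketch
```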

Wardaman 3SG: ABS=narnaj, ERG=narnaj-(j)i, DAT=gunga. [SMX+19] Table 25 (data from [Mer94]).

    Khinalugh 2SG: ABS=, ERG=va, DAT=oX(ɨr). [SMX+19] Table 24.

      Both genuine-AAB witnesses are contiguous in the substrate sense (no *ABA violation): cells at positions 0 and 2 do not share a root they don't also share with position 1.
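The contiguity condition stated above amounts to a one-line check on root-class triples. The name `contiguous` below is a hypothetical stand-in, assumed for illustration only, for whatever predicate the substrate actually uses.

```lean
namespace ContiguitySketch

/-- Hypothetical no-ABA check: the outer cells may share a root
    only if the middle cell has that root too. -/
def contiguous (c0 c1 c2 : Nat) : Bool :=
  c0 != c2 || c0 == c1

example : contiguous 0 0 1 = true := by decide   -- AAB
example : contiguous 0 1 1 = true := by decide   -- ABB
example : contiguous 0 1 0 = false := by decide  -- ABA

end ContiguitySketch
```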

      The defining AAB shape: cells 1 and 2 differ (suppletion in the third position but not the second). This is the structural feature that the terminal-adjacent engine excludes — realize_const_of_terminal_adjacent forces the second and third cells to coincide for any generable root pattern.

      §3.6 cross-domain divergence theorem. The structural-adjacency engine (realize_const_of_terminal_adjacent) predicts, for any generable root pattern, that the second and third cells coincide. Lifted to case (where the 3-cell projection is UNMARKED–DEPENDENT–OBLIQUE, e.g. ABS–ERG–DAT in ergative languages), this prediction would exclude AAB cells [A, A, B] where the second cell equals the first but the third cell differs.

      [SMX+19] §3.6 establishes that AAB is robustly attested in pronominal case suppletion (Table 9: 10 instances, including Wardaman 3SG and the Nakh-Daghestanian 2SG patterns). The existence of a contiguous AAB-shaped case pattern witnesses the falsification of the lifted DM derivation: no realize_const_of_terminal_adjacent-style plateau theorem can hold for case morphology.

      The paper's positive proposal (§3.7) is to weaken the locality predicate from structural adjacency (Bobaljik 2012) / linear adjacency (Embick 2010) to domain-based locality ([Mos15a]); see § 4 below.

      The falsification run through the engine: no terminal vocabulary under structural adjacency generates the Wardaman-shaped AAB realization. Since Wardaman attests it, structural adjacency is the hypothesis that must go — the paper's §3.7 conclusion.

      [SMX+19] §4 surveys pronominal number suppletion and finds the same AAB-attestation profile that §3.6 reports for case: "we find extremely clear-cut examples of ABB, ABC and AAB patterns, alongside AAA. We do not find any unambiguously robust evidence of ABA patterns." Table 32 quantifies: 3 attested AAB number paradigms, marked "?" in the paper's prediction column (vs 48 ABB, 19 ABC, numerous AAA, 1 dubious ABA from Yagua).

      §4 Table 46 lists the three concrete AAB number witnesses: Wambaya, Yagua and Dehu.

      We encode Yagua 2 — the cleanest morphological case: PL jiryéy transparently contains the SG root jiy plus a plural suffix -éy, while DL sááda is suppletive (no shared formative). This projects to [0, 0, 1] over the SG/PL/DL hierarchy: positions 0 (SG) and 1 (PL) share root-class 0 (the jiy root); position 2 (DL) takes root-class 1 (the sááda root).

      The number paradigms are 3-cell over SG/PL/DL; the cell-ordering reflects the containment structure SG–PL–DL or SG–DL–PL depending on the language (the paper notes both orderings are attested, motivating the §4.3.1–§4.3.3 reanalysis of number representation — the Number containment hypothesis (32) plus markedness-relativized visibility — which connects to Harbour's [Har14a] feature recursion). For Yagua, the SG–PL–DL ordering matches the table caption directly.

      Yagua 2nd person number paradigm: SG=jiy, PL=jiryéy, DL=sááda. [SMX+19] Table 46 (data from [PP90]). The PL is transparently jiy + -éy; the DL is suppletive. Projects to [0, 0, 1] over SG/PL/DL.

        §4 number-side analog of case_aab_attested_falsifies_dm. Same structural divergence: AAB is attested in pronominal number suppletion (3 instances per Table 32, with Wambaya / Yagua / Dehu listed in Table 46), falsifying the DM derivation lifted to number. The Yagua 2 witness is morphologically transparent — PL = SG + suffix; DL is suppletive — exactly the AAB shape that realize_const_of_terminal_adjacent would predict cannot arise.

        Number-side analog: the Yagua-shaped AAB realization is not generable under structural adjacency either.

        [SMX+19] §3.7: adjacency (structural or linear) is too strict a condition on root suppletion — Tamil shows dative suppletion across the plural morpheme, so "adjacency cannot be a universal restrictor on allomorphy" — and "locality in morphology, like in syntax, may need to appeal to interveners and thus, perhaps domains, rather than (structural or linear) adjacency". The paper adopts (as "one approach which may draw the right cut … at least to a first approximation") [Mos15b]'s accessibility domain (AD): the heads merged when the root's cycle is fixed — "the first category-defining node above the root, and one node above that". The AD is trigger-relative: a bound on which heads may condition root suppletion, not a partition of paradigm cells. Lexical nouns and adjectives have a category node, so case/superlative heads fall outside the AD; pronouns lack one ("there is no domain created low in the structure that contains just the pronominal base"), so all case heads are visible.

        DomainLocal d records the AD as the highest accessible head d, beside Adjacent/Grounded in the engine's à-la-carte condition menu. The paper's degree/pronoun split, as theorems:

        def SmithMoskalEtAl2019.DomainLocal {n : ℕ} {F : Type u_1} (d : Fin n) (v : List (Morphology.Containment.ExponenceRule n F)) : Prop

        Accessibility-domain locality ([Mos15b], as adopted in [SMX+19] §3.7): every conditioning head lies within the root's accessibility domain [0, d]. Trigger-relative — a bound on rules' conditioning contexts, not a partition of cells; exponed spans are not restricted, since the AD governs what may condition insertion, not what is inserted.

          @[implicit_reducible]
          instance SmithMoskalEtAl2019.instDecidableDomainLocal {n : ℕ} {F : Type u_1} (d : Fin n) (v : List (Morphology.Containment.ExponenceRule n F)) :
          Decidable (DomainLocal d v)

          Under AD locality, terminal rules have thresholds inside the domain.

          The domain-relativized plateau: with terminal rules and AD-local conditioning, realization is constant above the domain — root suppletion cannot distinguish grades the domain cannot see. At d = CMPR this rederives [Bob12]'s no-AAB plateau for adjectives with no adjacency stipulation, which is the paper's reconstruction of the degree facts.

          Terminal adjacency is AD locality at any domain containing the first head: the [Bob12] regime is the smallest nontrivial accessibility domain.
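How the domain bound relates to adjacency can be sketched in a threshold-only toy encoding. `Rule`, `domainLocal`, `degreeVocab` and `pronounVocab` are hypothetical names assumed for this illustration; the real `DomainLocal` is a `Prop` over `ExponenceRule n F`, with `d : Fin n` rather than a bare `Nat`.

```lean
namespace DomainSketch

/-- Hypothetical toy rule: a conditioning threshold plus a root index. -/
structure Rule where
  threshold : Nat
  root : Nat

/-- Toy AD locality: every conditioning threshold lies in the domain `[0, d]`. -/
def domainLocal (d : Nat) (v : List Rule) : Bool :=
  v.all (fun r => decide (r.threshold ≤ d))

/-- Degree-style vocabulary: conditioning stops at the first head (cell 1). -/
def degreeVocab : List Rule := [⟨0, 0⟩, ⟨1, 1⟩]

/-- Pronoun-style vocabulary: one rule is conditioned by the outermost head (cell 2). -/
def pronounVocab : List Rule := [⟨0, 0⟩, ⟨2, 1⟩]

-- With d = 1 the toy predicate coincides with the adjacency regime.
example : domainLocal 1 degreeVocab = true := by decide
example : domainLocal 1 pronounVocab = false := by decide

-- Widening the domain to d = 2 admits the pronoun vocabulary.
example : domainLocal 2 pronounVocab = true := by decide

end DomainSketch
```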

          Wardaman 3SG as a vocabulary, after the paper's rules (20b)/(20f) (stated there for featural containment; transposed here to the 3-cell structural hierarchy of §3.1): elsewhere narnaj, oblique-conditioned gunga.

            The generability converse: the AD-local pronoun vocabulary generates exactly the attested AAB shape (degreeShape is the 3-cell shape classifier, named for its degree origin).
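A toy version of the generability claim, using a threshold encoding assumed here for illustration: `Rule`, `pick`, `realizeAt` and `wardaman` are hypothetical names, and the thresholds 0 and 2 are this sketch's transposition of "elsewhere" and "oblique-conditioned", not the library's definitions.

```lean
namespace WardamanSketch

/-- Hypothetical toy rule: `root` may be inserted in any cell at or above `threshold`. -/
structure Rule where
  threshold : Nat
  root : Nat

/-- Elsewhere-style choice: among matching rules, the highest threshold wins. -/
def pick (cell : Nat) (acc : Option Rule) (r : Rule) : Option Rule :=
  if r.threshold ≤ cell then
    match acc with
    | some b => if b.threshold < r.threshold then some r else some b
    | none => some r
  else acc

/-- Root realized in `cell` (0 = ABS, 1 = ERG, 2 = DAT), if any rule matches. -/
def realizeAt (v : List Rule) (cell : Nat) : Option Nat :=
  match v.foldl (pick cell) none with
  | some r => some r.root
  | none => none

/-- Toy Wardaman 3SG vocabulary: elsewhere narnaj (root 0),
    oblique-conditioned gunga (root 1). -/
def wardaman : List Rule := [⟨0, 0⟩, ⟨2, 1⟩]

-- ABS, ERG and DAT realize as root classes 0, 0, 1: the attested AAB shape.
example : [0, 1, 2].map (realizeAt wardaman) = [some 0, some 0, some 1] := by decide

end WardamanSketch
```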

            The vocabulary is antihomophonous and AD-local at the trivial (pronoun) domain — no category node, all case heads visible.

            Formalization-surfaced: the attested pronominal AAB vocabulary violates [Bob12]'s markedness condition (202) (Grounded) just as it violates adjacency — under the threshold encoding, the paper's data refutes both degree-side AAB-blockers as category-general principles. The degree/pronoun split must come from domain structure, not from the markedness condition alone.

            ABA remains excluded even at the trivial domain: containment + Elsewhere (with antihomophony) suffices — the paper's closing point of §3.7.

            The cell-level projection of the AD computation (Morphology/DM/DomainLocality.lean: DomainPartition, SameDomain, IsContiguousWithin), instantiated for the case and number domains the paper discusses:

            The 3-cell ergative case paradigm that [SMX+19] analyses: position 0 is ABS, position 1 is ERG, position 2 is DAT.

              Case partition: derived from Case.IsOblique via caseAtPos — the Unmarked-Dependent vs Oblique boundary as a consequence of the order substrate, not a stipulated threshold.

                Number partition: SG + PL non-dual (false), DL dual (true); boundary per [Har14a]'s feature-recursion split, stated as a cell table pending a number-feature derivation.

                  Under the case partition, ERG and DAT lie in different domains — the structural feature that admits Wardaman-type AAB.

                  Under the number partition, PL and DL lie in different domains.
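The two partitions can be sketched as Boolean cell tables. All names below (`CaseCell`, `NumCell`, `isOblique`, `isDual`, `sameCaseDomain`, `sameNumDomain`) are hypothetical and assumed for illustration; they are not the `DomainPartition` / `SameDomain` API, and the case table is written out by hand here rather than derived from `Case.IsOblique`.

```lean
namespace PartitionSketch

/-- Hypothetical cell labels for the 3-cell ergative case paradigm. -/
inductive CaseCell where
  | abs | erg | dat
  deriving DecidableEq, Repr

/-- Hypothetical cell labels for the 3-cell number paradigm. -/
inductive NumCell where
  | sg | pl | dl
  deriving DecidableEq, Repr

/-- Toy oblique test: only DAT counts as oblique. -/
def isOblique : CaseCell → Bool
  | CaseCell.dat => true
  | _ => false

/-- Toy dual test: only DL counts as dual. -/
def isDual : NumCell → Bool
  | NumCell.dl => true
  | _ => false

/-- Two case cells share a domain when they agree on the oblique test. -/
def sameCaseDomain (a b : CaseCell) : Bool := isOblique a == isOblique b

/-- Two number cells share a domain when they agree on the dual test. -/
def sameNumDomain (a b : NumCell) : Bool := isDual a == isDual b

-- ABS and ERG share a domain; ERG and DAT do not.
example : sameCaseDomain CaseCell.abs CaseCell.erg = true := by decide
example : sameCaseDomain CaseCell.erg CaseCell.dat = false := by decide

-- SG and PL share a domain; PL and DL do not.
example : sameNumDomain NumCell.sg NumCell.pl = true := by decide
example : sameNumDomain NumCell.pl NumCell.dl = false := by decide

end PartitionSketch
```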

                  The case AAB witness (Wardaman 3SG) is contiguous under the case partition — trivially, since AAB is contiguous even in the universal sense; the substantive generability claim is wardaman_aab_generated above.

                  What's deferred

                  The §4.3.3 markedness-relativized number containment ("if [α AUG] is the marked value … [α AUG] can serve as a context for suppletion"), which would derive numberDomainPartition from a number-feature substrate; and the graduation of DomainLocal to Morphology/Containment/ once a second study consumes it.