
Linglib.Studies.Dekier2021

Dekier (2021): Morphosyntax of specific and non-specific indefinite markers

[Dek21]

Glossa: a journal of general linguistics 6(1), 1–33.

This paper proposes a nanosyntactic analysis of indefinite markers, arguing that the non-specific, specific unknown, and specific known functions correspond to three layers of a universal syntactic hierarchy (the indefinite fseq):

F₁P (non-specific) ⊂ F₂P (specific unknown) ⊂ F₃P (specific known)

Using data from 45 languages, [Dek21] shows:

  1. Four attested syncretism patterns: AAA (English), ABB (Yakut), AAB (Latin), ABC (Russian). The *ABA pattern is unattested.

  2. The *ABA generalization ([Bob12]) holds for indefinites: the Superset and Elsewhere Principles of Nanosyntax guarantee that a single lexical entry cannot spell out two non-contiguous phrasal nodes while a different entry spells out the node between them.

  3. Paradigm gaps are monotonic: gaps always start from the TOP of the hierarchy (SK first, then SU). No language has a gap for NS while filling SU or SK.

  4. Prefix vs suffix: spellout-driven movement produces suffixes (unary foot), while subderivation produces prefixes (binary foot). Russian -nibud' and -to are suffixes; koe- is a prefix.
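
As a rough illustration of how the four attested patterns and the excluded ABA pattern can be told apart, the following Lean sketch classifies a triple of exponents by which cells share a form. The names `Pattern`, `classify`, and `isAttested` are hypothetical; they are not linglib's `Indefinite.classifyTriple`, whose actual definition may differ.

```lean
namespace ClassifySketch

/-- The five logically possible three-cell patterns (hypothetical type). -/
inductive Pattern where
  | AAA | ABB | AAB | ABC | ABA
  deriving Repr, DecidableEq

/-- Classify an (NS, SU, SK) triple by which cells share a form. -/
def classify (ns su sk : String) : Pattern :=
  if ns = su ∧ su = sk then .AAA
  else if su = sk then .ABB
  else if ns = su then .AAB
  else if ns = sk then .ABA
  else .ABC

/-- Every pattern except ABA counts as attested. -/
def Pattern.isAttested : Pattern → Bool
  | .ABA => false
  | _ => true

#eval classify "some-" "some-" "some-"     -- English: the AAA constructor
#eval classify "-eme" "-ere" "-ere"        -- Yakut: ABB
#eval classify "ali-" "ali-" "-dam"        -- Latin: AAB
#eval classify "-nibud'" "-to" "koe-"      -- Russian: ABC
#eval (classify "x" "y" "x").isAttested    -- false: a made-up ABA triple

end ClassifySketch
```

The classifier only inspects which cells are identical, so two transliterations of the same marker give the same pattern as long as distinctness is preserved.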

Connection to linglib

This is the paper critiqued by [Bub26]. While Dekier argues that nanosyntax explains the indefinite typology via structural containment, Bubnov argues that the semantic account of [DA25] (based on team-semantic variation and constancy) provides a better explanation — one that also predicts which type is unattested (type vi) and accounts for bidirectional diachronic change.

Both papers analyze the same cross-linguistic data. This file formalizes Dekier's POSITIVE nanosyntactic analysis; Bubnov2026.lean formalizes the critique.

Dekier's hierarchy:

    F₃P  ⇒  specific known marker
   / \
  F₃  F₂P  ⇒  specific unknown marker
     / \
    F₂  F₁P  ⇒  non-specific marker
        |
        F₁

Features are ordered on a universal fseq. Each layer is characterized
by its rank (depth). A lexical entry at rank r stores F₁...Fᵣ and
matches any target of rank ≤ r via the Superset Principle —
`ExponenceRule.Matches` over the three-grade hierarchy `Fin 3`. 
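
The matching condition can be sketched in a few lines of Lean. This is a minimal stand-in under the assumption that an entry is just an exponent plus the highest grade it stores; `Rule` and `matchesTarget` are hypothetical names and do not reproduce the actual `ExponenceRule.Matches` definition.

```lean
namespace SupersetSketch

/-- A context-free entry: an exponent plus the highest grade it stores
    (0 = F₁P, 1 = F₂P, 2 = F₃P). Hypothetical stand-in for `ExponenceRule`. -/
structure Rule where
  exponent : String
  spans : Fin 3

/-- Superset Principle: the entry matches any target of rank ≤ its own. -/
def Rule.matchesTarget (r : Rule) (target : Fin 3) : Bool :=
  decide (target.val ≤ r.spans.val)

/-- An entry storing F₁ and F₂ (rank 1), as assumed for Russian -to. -/
def suMarker : Rule := { exponent := "-to", spans := 1 }

#eval suMarker.matchesTarget 0   -- true: F₁P is contained in F₂P
#eval suMarker.matchesTarget 1   -- true: exact match
#eval suMarker.matchesTarget 2   -- false: F₃ is not stored

end SupersetSketch
```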

Fseq grades for indefinite features.


        A cross-linguistic indefinite paradigm entry. none in a cell indicates a paradigm gap.

        • language : String
        • nsForm : Option String
        • suForm : Option String
        • skForm : Option String
            def Dekier2021.instDecidableEqParadigmEntry.decEq (x✝ x✝¹ : ParadigmEntry) :
            Decidable (x✝ = x✝¹)
                Equations
                • Dekier2021.pEnglish = { language := "English", nsForm := some "some-", suForm := some "some-", skForm := some "some-" }
                Instances For
                  Equations
                  • Dekier2021.pPolish = { language := "Polish", nsForm := some "-ś", suForm := some "-ś", skForm := some "-ś" }
                  Instances For
                    Equations
                    • Dekier2021.pJapanese = { language := "Japanese", nsForm := some "-ka", suForm := some "-ka", skForm := some "-ka" }
                    Instances For
                      Equations
                      • Dekier2021.pKorean = { language := "Korean", nsForm := some "-nka", suForm := some "-nka", skForm := some "-nka" }
                      Instances For
                        Equations
                        • Dekier2021.pLezgian = { language := "Lezgian", nsForm := some "-jat'ani", suForm := some "-jat'ani", skForm := some "-jat'ani" }
                        Instances For
                          Equations
                          • Dekier2021.pRomanian = { language := "Romanian", nsForm := some "-va", suForm := some "-va", skForm := some "-va" }
                          Instances For
                            Equations
                            • Dekier2021.pBulgarian = { language := "Bulgarian", nsForm := some "nja-", suForm := some "nja-", skForm := some "nja-" }
                            Instances For
                              Equations
                              • Dekier2021.pSerboCro = { language := "Serbo-Croatian", nsForm := some "ne-", suForm := some "ne-", skForm := some "ne-" }
                              Instances For
                                Equations
                                • Dekier2021.pCzech = { language := "Czech", nsForm := some "ně-", suForm := some "ně-", skForm := some "ně-" }
                                Instances For
                                  Equations
                                  • Dekier2021.pSlovak = { language := "Slovak", nsForm := some "nie-", suForm := some "nie-", skForm := some "nie-" }
                                  Instances For
                                    Equations
                                    • Dekier2021.pHungarian = { language := "Hungarian", nsForm := some "vala-", suForm := some "vala-", skForm := some "vala-" }
                                    Instances For
                                      Equations
                                      • Dekier2021.pHebrew = { language := "Hebrew", nsForm := some "-šehu", suForm := some "-šehu", skForm := some "-šehu" }
                                      Instances For
                                        Equations
                                        • Dekier2021.pTurkish = { language := "Turkish", nsForm := some "bir-", suForm := some "bir-", skForm := some "bir-" }
                                        Instances For
                                          Equations
                                          • Dekier2021.pLatvian = { language := "Latvian", nsForm := some "kaut-", suForm := some "kaut-", skForm := some "kaut-" }
                                          Instances For
                                            Equations
                                            • Dekier2021.pYakut = { language := "Yakut", nsForm := some "-eme", suForm := some "-ere", skForm := some "-ere" }
                                            Instances For
                                              Equations
                                              • Dekier2021.pGeorgian = { language := "Georgian", nsForm := some "-me", suForm := some "-γac", skForm := some "-γac" }
                                              Instances For
                                                Equations
                                                • Dekier2021.pOssetic = { language := "Ossetic", nsForm := some "is-", suForm := some "-dær", skForm := some "-dær" }
                                                Instances For
                                                  Equations
                                                  • Dekier2021.pLatin = { language := "Latin", nsForm := some "ali-", suForm := some "ali-", skForm := some "-dam" }
                                                  Instances For
                                                    Equations
                                                    • Dekier2021.pRussian = { language := "Russian", nsForm := some "-nibud'", suForm := some "-to", skForm := some "koe-" }
                                                    Instances For
                                                      Equations
                                                      • Dekier2021.pLithuanian = { language := "Lithuanian", nsForm := some "-nors", suForm := some "kaž-", skForm := some "kai-" }
                                                      Instances For
                                                        Equations
                                                        • Dekier2021.pKannada = { language := "Kannada", nsForm := some "-aadaruu", suForm := some "-oo", skForm := none }
                                                        Instances For
                                                          Equations
                                                          • Dekier2021.pQuechua = { language := "Quechua", nsForm := some "-pis", suForm := some "-chi", skForm := none }
                                                          Instances For
                                                            Equations
                                                            • Dekier2021.pChinese = { language := "Mandarin Chinese", nsForm := some "wh-pron", suForm := none, skForm := none }
                                                            Instances For

                                                                    The full paradigms from Table 7 (20 languages with complete data).

                                                                    Equations
                                                                    • One or more equations did not get rendered due to their size.
                                                                    Instances For

                                                                      Each syncretism pattern corresponds to a particular lexicon configuration. The spellout algorithm (Superset + Minimize Junk, Morphology.Containment.spellout) derives the surface pattern from the lexicon. Entries are context-free ExponenceRules: an exponent paired with the largest grade its stored constituent spans.
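
To make the derivation concrete, here is a small Lean sketch of a spellout function in the spirit of the description above: Superset filters the lexicon, and the closest match wins. It is an assumption-laden simplification, not `Morphology.Containment.spellout` itself; `Rule`, `spellout`, and the use of `Nat` ranks are illustrative choices, and the lexicons copy the exponents and spans listed on this page.

```lean
namespace SpelloutSketch

/-- Hypothetical context-free entry: exponent plus largest grade spanned. -/
structure Rule where
  exponent : String
  spans : Nat   -- 0 = F₁P, 1 = F₂P, 2 = F₃P

/-- Superset keeps the rules whose span covers the target; the candidate
    with the smallest span (least unused structure) is then inserted. -/
def spellout (lex : List Rule) (target : Nat) : Option String :=
  let candidates := lex.filter (fun r => decide (target ≤ r.spans))
  let pick (best : Option Rule) (r : Rule) : Option Rule :=
    match best with
    | none => some r
    | some b => if r.spans < b.spans then some r else some b
  (candidates.foldl pick none).map (·.exponent)

def yakutLex : List Rule := [⟨"-eme", 0⟩, ⟨"-ere", 2⟩]
def latinLex : List Rule := [⟨"ali-", 1⟩, ⟨"-dam", 2⟩]
def kannadaLex : List Rule := [⟨"-aadaruu", 0⟩, ⟨"-oo", 1⟩]

#eval [0, 1, 2].map (spellout yakutLex)    -- -eme, -ere, -ere (ABB)
#eval [0, 1, 2].map (spellout latinLex)    -- ali-, ali-, -dam (AAB)
#eval [0, 1, 2].map (spellout kannadaLex)  -- -aadaruu, -oo, none (gap at SK)

end SpelloutSketch
```

Because a winning entry must cover its target and be the smallest such entry, it can only win on an unbroken run of ranks, which is the intuition behind the contiguity result.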

                                                                      English AAA: a single entry at rank 2 covers all three layers. some- ⇔ F₃P.

                                                                      Equations
                                                                      Instances For

                                                                        Yakut ABB: -eme at rank 0 (F₁P), -ere at rank 2 (F₃P). Elsewhere gives -eme for NS; -ere covers SU and SK.

                                                                        Equations
                                                                        • Dekier2021.yakutLex = [{ exponent := "-eme", spans := 0, context := none }, { exponent := "-ere", spans := 2, context := none }]
                                                                        Instances For

                                                                          Latin AAB: ali- at rank 1 (F₂P), -dam at rank 2 (F₃P). Both entries match NS and SU via Superset, and ali- wins there via Elsewhere (closer match); only -dam is large enough to match SK.

                                                                          Note: the nanosyntactic derivation is complex: ali- is a prefix (subderivation), while -dam is a suffix (constituent extraction).

                                                                          Equations
                                                                          • Dekier2021.latinLex = [{ exponent := "ali-", spans := 1, context := none }, { exponent := "-dam", spans := 2, context := none }]
                                                                          Instances For

                                                                            Russian ABC: three entries, one per rank. Each layer gets its own exponent. -nibud' ⇔ F₁P (suffix), -to ⇔ F₂P (suffix), koe- ⇔ F₃P (prefix).

                                                                            Equations
                                                                            • Dekier2021.russianLex = [{ exponent := "-nibud'", spans := 0, context := none }, { exponent := "-to", spans := 1, context := none }, { exponent := "koe-", spans := 2, context := none }]
                                                                            Instances For

                                                                              Lithuanian ABC: three entries at ranks 0, 1, 2. [Dek21] Table 7.

                                                                              Equations
                                                                              • Dekier2021.lithuanianLex = [{ exponent := "-nors", spans := 0, context := none }, { exponent := "kaž-", spans := 1, context := none }, { exponent := "kai-", spans := 2, context := none }]
                                                                              Instances For

                                                                                Patterns are COMPUTED from spellout results, not stipulated. We derive the syncretism check from each Fragment file's canonical .form data rather than restating the form strings here. Note: Russian's Fragment form "-нибудь (-nibud')" differs from Dekier's Table 7 transliteration "-nibud'"; classifyTriple only inspects distinctness, so both classifications coincide as ABC.

                                                                                Lithuanian forms have no Fragment file yet (no Lithuanian.Indefinites), so the strings stay inline here.

                                                                                The Elsewhere Principle ([Dek21]): "If several lexical items match a syntactic node, insert the entry with the fewest features unspecified for that node."

                                                                                Combined with the Superset Principle this derives *ABA in general:
                                                                                any antihomophonous context-free lexicon yields a contiguous
                                                                                pattern (`Morphology.Containment.isContiguous_spellout`). Each
                                                                                sample lexicon instantiates the theorem — *ABA is derived, not
                                                                                inspected case by case. 
                                                                                

                                                                                English: contiguous spellout (no ABA configuration possible).

                                                                                The ABA pattern itself is unattested cross-linguistically. This aligns with the *ABA generalization of [Bob12].

                                                                                [Dek21] Table 8: paradigm gaps follow a monotonic pattern. Gaps always start from the TOP of the hierarchy (SK first, then SU). No language has a gap for NS while having a form for SU or SK.

                                                                                This follows from the Superset Principle: any entry at rank r
                                                                                matches ALL targets of rank ≤ r. So if ANY entry exists in the
                                                                                lexicon, NS (rank 0) is always filled. 
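
The reasoning step here is just transitivity of ≤ on ranks. A minimal Lean rendering, assuming ranks are modelled as natural numbers and "matches" means `target ≤ spans` (the theorem names are hypothetical and not taken from linglib):

```lean
/-- If an entry matches a higher target, it also matches every lower one. -/
theorem matches_lower {spans lo hi : Nat}
    (hlohi : lo ≤ hi) (hmatch : hi ≤ spans) : lo ≤ spans :=
  Nat.le_trans hlohi hmatch

/-- Special case: any entry whatsoever matches NS (rank 0). -/
theorem matches_ns (spans : Nat) : 0 ≤ spans :=
  Nat.zero_le spans
```

Read contrapositively, `matches_lower` says that an unfilled lower cell forces every higher cell to be unfilled as well, which is the monotonic gap pattern.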
                                                                                

                                                                                Paradigm gap lexicons: the gap position corresponds to the ABSENCE of high-rank entries.

                                                                                Equations
                                                                                • Dekier2021.kannadaLex = [{ exponent := "-aadaruu", spans := 0, context := none }, { exponent := "-oo", spans := 1, context := none }]
                                                                                Instances For
                                                                                  Equations
                                                                                  Instances For

                                                                                    Consequence: if NS (rank 0) has no form, nothing does.

                                                                                    Consequence: if SU (rank 1) has no form, SK doesn't either.

                                                                                    [Dek21]: the nanosyntactic derivation predicts a structural difference between prefixes and suffixes:

                                                                                    - **Suffix**: formed via spellout-driven movement (roll-up).
                                                                                      The stem moves above the indefinite layer, leaving a remnant
                                                                                      constituent with a unary foot. Result: stem + marker.
                                                                                    
                                                                                    - **Prefix**: formed via subderivation. The indefinite layers
                                                                                      are built in a parallel derivation and integrated as a complex
                                                                                      left branch. Result: marker + stem.
                                                                                    
                                                                                    In Russian:
                                                                                    - *-nibud'* (F₁P, rank 0): suffix; stem rolls up past F₁
                                                                                    - *-to* (F₂P, rank 1): suffix; stem rolls up past F₂
                                                                                    - *koe-* (F₃P, rank 2): prefix; subderived [F₁, F₂, F₃]
                                                                                    
                                                                                    Prediction: in a language with both prefixes and suffixes,
                                                                                    the morphological boundary (prefix/suffix break) correlates with
                                                                                    the derivational mechanism switch (spellout movement → subderivation).
                                                                                    

                                                                                    Russian indefinite markers with their morphological types. [Dek21].


                                                                                          In Russian, suffixes occupy the lower ranks and the prefix occupies the highest rank. This matches the spellout-movement (low) vs subderivation (high) prediction.
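
The rank claim for Russian can be checked mechanically. The sketch below is illustrative only: `AffixType`, `Marker`, and `suffixesBelowPrefixes` are hypothetical names, and the unrendered linglib definitions on this page may be organized differently.

```lean
namespace AffixSketch

/-- Morphological type of a marker (`pfx` = prefix, `sfx` = suffix). -/
inductive AffixType where
  | pfx | sfx

/-- A marker with its rank on the hierarchy and its affix type. -/
structure Marker where
  form : String
  rank : Nat
  affix : AffixType

def russianMarkers : List Marker :=
  [⟨"-nibud'", 0, .sfx⟩, ⟨"-to", 1, .sfx⟩, ⟨"koe-", 2, .pfx⟩]

/-- Every suffix sits at a strictly lower rank than every prefix. -/
def suffixesBelowPrefixes (ms : List Marker) : Bool :=
  ms.all fun s => ms.all fun p =>
    match s.affix, p.affix with
    | .sfx, .pfx => decide (s.rank < p.rank)
    | _, _ => true

#eval suffixesBelowPrefixes russianMarkers   -- true

end AffixSketch
```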

                                                                                          [Dek21]: the ordering NS < SU < SK is preferred over SK < SU < NS based on functional complexity:

                                                                                          - NS markers only introduce an indefinite entity (simplest)
                                                                                          - SU markers add specificity of the referent
                                                                                          - SK markers add speaker knowledge of the referent's identity
                                                                                          
                                                                                          Each higher layer adds a functional property, matching the
                                                                                          nanosyntactic assumption that higher layers on the fseq encode
                                                                                          more complex functional content.
                                                                                          
                                                                                          Both orderings are compatible with the syncretism data. The
                                                                                          functional complexity argument selects NS < SU < SK. 
                                                                                          

                                                                                          The hierarchy respects the rank ordering.

                                                                                          Connect the nanosyntactic spellout results to the typed indefinite entries in the fragment files. The fragment entries use the [Has97] function-coverage substrate; the D&A typology is a projection living in Semantics/Quantification/DeganoAloni2025.lean. Dekier's syntactic hierarchy is the candidate counterpart on the morphological side; the bridge here pairs each Fragment entry's function coverage with the lexicon's spellout result.

                                                                                          The ParadigmEntry records (Tables 7 & 8) and the nanosyntactic lexicons are two independent representations of the same data. These theorems verify they agree.

                                                                                          [Dek21] analyzed 45 languages total: Basque, Bulgarian, Catalan, Czech, Dutch, English, Filipino, Finnish, French, Georgian, German, Greek, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Irish, Italian, Japanese, Kannada, Kazakh, Korean, Latin, Latvian, Lithuanian, Lezgian, Maltese, Mandarin Chinese, Nanay, Ossetic, Persian, Polish, Portuguese, Quechua, Romanian, Russian, Serbian/Croatian, (Colombian) Spanish, Swahili, Swedish, Turkish, and Yakut.

                                                                                          Of these, 20 have complete paradigms with explicit forms in
                                                                                          Table 7 (formalized above). 6 have paradigm gaps (Table 8).
                                                                                          The remaining 19 are discussed in the appendices or show patterns
                                                                                          consistent with the four attested types. 
                                                                                          
                                                                                          theorem Dekier2021.sample_no_aba :
                                                                                          (fullParadigms.all fun (p : ParadigmEntry) => match p.nsForm, p.suForm, p.skForm with | some ns, some su, some sk => decide (Indefinite.classifyTriple ns su sk).IsAttested | x, x_1, x_2 => true) = true

                                                                                          No language in the sample violates *ABA.

                                                                                          theorem Dekier2021.gaps_at_top :
                                                                                          (gapParadigms.all fun (p : ParadigmEntry) => decide ((p.skForm.isSome = true → p.suForm.isSome = true) ∧ (p.suForm.isSome = true → p.nsForm.isSome = true))) = true

                                                                                          All paradigm-gap languages have gaps at the TOP of the hierarchy, consistent with the monotonicity prediction.
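
For readers who want to try the monotonicity check on individual records, here is a self-contained Lean sketch. The structure repeats the four fields documented above; `gapsAtTop` and `pHypothetical` are made-up names for illustration, and the counterexample entry is invented, not data from the paper.

```lean
namespace GapSketch

structure ParadigmEntry where
  language : String
  nsForm : Option String
  suForm : Option String
  skForm : Option String

/-- A filled SK cell requires a filled SU cell, which in turn requires a
    filled NS cell (each implication is encoded as `!a || b`). -/
def gapsAtTop (p : ParadigmEntry) : Bool :=
  ((!p.skForm.isSome) || p.suForm.isSome) &&
  ((!p.suForm.isSome) || p.nsForm.isSome)

def pKannada : ParadigmEntry :=
  { language := "Kannada", nsForm := some "-aadaruu",
    suForm := some "-oo", skForm := none }

/-- Invented counterexample: an NS gap with SU filled. -/
def pHypothetical : ParadigmEntry :=
  { language := "(hypothetical)", nsForm := none,
    suForm := some "-x", skForm := none }

#eval gapsAtTop pKannada        -- true: the only gap is at SK
#eval gapsAtTop pHypothetical   -- false: NS is empty while SU is filled

end GapSketch
```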