Dekier (2021): Morphosyntax of specific and non-specific indefinite markers
@cite{dekier-2021}
Glossa: a journal of general linguistics 6(1), 1–33.
This paper proposes a nanosyntactic analysis of indefinite markers, arguing that the non-specific, specific unknown, and specific known functions correspond to three layers of a universal syntactic hierarchy (the indefinite fseq):
F₁P (non-specific) ⊂ F₂P (specific unknown) ⊂ F₃P (specific known)
Using data from 45 languages, @cite{dekier-2021} shows:
Four attested syncretism patterns: AAA (English), ABB (Yakut), AAB (Latin), ABC (Russian). The *ABA pattern is unattested.
The *ABA generalization (@cite{bobaljik-2012}) holds for indefinites: the Superset and Elsewhere Principles of Nanosyntax guarantee that a single lexical entry cannot match two non-contiguous phrasal nodes.
Paradigm gaps are monotonic: gaps always start from the TOP of the hierarchy (SK first, then SU). No language has a gap for NS while filling SU or SK.
Prefix vs suffix: spellout-driven movement produces suffixes (unary foot), subderivation produces prefixes (binary foot). Russian -nibud' and -to are suffixes; koe- is a prefix.
Connection to linglib
This is the paper critiqued by @cite{bubnov-2026}. While Dekier
argues that nanosyntax explains the indefinite typology via structural
containment, Bubnov argues that the semantic account of
@cite{degano-aloni-2025} (based on team-semantic variation and
constancy) provides a better explanation — one that also predicts
which type is unattested (type vi) and accounts for bidirectional
diachronic change.
Both papers analyze the same cross-linguistic data. This file
formalizes Dekier's POSITIVE nanosyntactic analysis; Bubnov2026.lean
formalizes the critique.
Dekier's hierarchy:
        F₃P           ⇒ specific known marker
       /   \
     F₃    F₂P        ⇒ specific unknown marker
          /   \
        F₂    F₁P     ⇒ non-specific marker
               |
               F₁
Features are ordered on a universal fseq. Each layer is characterized
by its rank (depth), counted from 0: rank 0 is F₁P, rank 1 is F₂P, and
rank 2 is F₃P. A lexical entry at rank r stores F₁...Fᵣ₊₁ and
matches any target of rank ≤ r via the Superset Principle.
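The matching condition can be sketched in Lean as follows. This is a minimal illustration and not the definition from `Nanosyntax.Core`: the names `LexEntrySketch` and `covers` are hypothetical, and only the `rank` and `exponent` fields mirror the lexicon literals shown on this page.

```lean
/-- Hypothetical stand-in for a nanosyntactic lexical entry.
    Rank 0 = F₁P, rank 1 = F₂P, rank 2 = F₃P. -/
structure LexEntrySketch where
  rank : Nat
  exponent : String

/-- Superset Principle (sketch): an entry matches a target node
    when the target's rank does not exceed the entry's rank. -/
def LexEntrySketch.covers (e : LexEntrySketch) (target : Nat) : Bool :=
  decide (target ≤ e.rank)

def someEntry : LexEntrySketch := { rank := 2, exponent := "some-" }
def emeEntry : LexEntrySketch := { rank := 0, exponent := "-eme" }

#eval someEntry.covers 0  -- true: a rank-2 entry matches F₁P
#eval someEntry.covers 2  -- true: it also matches F₃P itself
#eval emeEntry.covers 1   -- false: a rank-0 entry cannot match F₂P
```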
Fseq ranks for indefinite features.
A cross-linguistic indefinite paradigm entry.
none in a cell indicates a paradigm gap.
- language : String
- nsForm : Option String
- suForm : Option String
- skForm : Option String
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pEnglish = { language := "English", nsForm := some "some-", suForm := some "some-", skForm := some "some-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pPolish = { language := "Polish", nsForm := some "-ś", suForm := some "-ś", skForm := some "-ś" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pJapanese = { language := "Japanese", nsForm := some "-ka", suForm := some "-ka", skForm := some "-ka" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pKorean = { language := "Korean", nsForm := some "-nka", suForm := some "-nka", skForm := some "-nka" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pLezgian = { language := "Lezgian", nsForm := some "-jat'ani", suForm := some "-jat'ani", skForm := some "-jat'ani" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pRomanian = { language := "Romanian", nsForm := some "-va", suForm := some "-va", skForm := some "-va" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pBulgarian = { language := "Bulgarian", nsForm := some "nja-", suForm := some "nja-", skForm := some "nja-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pSerboCro = { language := "Serbo-Croatian", nsForm := some "ne-", suForm := some "ne-", skForm := some "ne-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pCzech = { language := "Czech", nsForm := some "ně-", suForm := some "ně-", skForm := some "ně-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pSlovak = { language := "Slovak", nsForm := some "nie-", suForm := some "nie-", skForm := some "nie-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pHungarian = { language := "Hungarian", nsForm := some "vala-", suForm := some "vala-", skForm := some "vala-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pHebrew = { language := "Hebrew", nsForm := some "-šehu", suForm := some "-šehu", skForm := some "-šehu" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pTurkish = { language := "Turkish", nsForm := some "bir-", suForm := some "bir-", skForm := some "bir-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pLatvian = { language := "Latvian", nsForm := some "kaut-", suForm := some "kaut-", skForm := some "kaut-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pYakut = { language := "Yakut", nsForm := some "-eme", suForm := some "-ere", skForm := some "-ere" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pGeorgian = { language := "Georgian", nsForm := some "-me", suForm := some "-γac", skForm := some "-γac" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pOssetic = { language := "Ossetic", nsForm := some "is-", suForm := some "-dær", skForm := some "-dær" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pLatin = { language := "Latin", nsForm := some "ali-", suForm := some "ali-", skForm := some "-dam" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pRussian = { language := "Russian", nsForm := some "-nibud'", suForm := some "-to", skForm := some "koe-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pLithuanian = { language := "Lithuanian", nsForm := some "-nors", suForm := some "kaž-", skForm := some "kai-" }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pKannada = { language := "Kannada", nsForm := some "-aadaruu", suForm := some "-oo", skForm := none }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pQuechua = { language := "Quechua", nsForm := some "-pis", suForm := some "-chi", skForm := none }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pChinese = { language := "Mandarin Chinese", nsForm := some "wh-pron", suForm := none, skForm := none }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pSwahili = { language := "Swahili", nsForm := none, suForm := none, skForm := none }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pIrish = { language := "Irish", nsForm := none, suForm := none, skForm := none }
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.pFilipino = { language := "Filipino", nsForm := none, suForm := none, skForm := none }
Instances For
The full paradigms from Table 7 (20 languages with complete data).
The paradigm-gap languages from Table 8 (6 languages).
Each syncretism pattern corresponds to a particular lexicon configuration. The spellout algorithm (Superset + Elsewhere Principles) derives the surface pattern from the lexicon.
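A minimal sketch of such a spellout function, assuming at most the two principles named above: filter the lexicon by Superset, then pick the smallest matching entry by Elsewhere. The names `LexEntrySketch`, `pickSmaller`, and `spelloutSketch` are hypothetical and do not reproduce the definitions in `Nanosyntax.Core`.

```lean
structure LexEntrySketch where
  rank : Nat
  exponent : String

/-- Elsewhere Principle (sketch): of two matching entries keep the smaller one. -/
def pickSmaller (best : Option LexEntrySketch) (e : LexEntrySketch) :
    Option LexEntrySketch :=
  match best with
  | none => some e
  | some b => if e.rank < b.rank then some e else some b

/-- Spell out a target rank: filter by Superset, then choose by Elsewhere. -/
def spelloutSketch (lex : List LexEntrySketch) (target : Nat) : Option String :=
  let candidates := lex.filter (fun e => decide (target ≤ e.rank))
  (candidates.foldl pickSmaller none).map (fun e => e.exponent)

def yakut : List LexEntrySketch :=
  [{ rank := 0, exponent := "-eme" }, { rank := 2, exponent := "-ere" }]

def latin : List LexEntrySketch :=
  [{ rank := 1, exponent := "ali-" }, { rank := 2, exponent := "-dam" }]

-- Targets 0, 1, 2 stand for NS, SU, SK.
-- Yakut: NS gets "-eme", SU and SK get "-ere" (ABB).
#eval ([0, 1, 2] : List Nat).map (spelloutSketch yakut)
-- Latin: NS and SU get "ali-", SK gets "-dam" (AAB).
#eval ([0, 1, 2] : List Nat).map (spelloutSketch latin)
```

Returning `Option String` lets the same function report a paradigm gap as `none` when no entry is large enough for the target.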
English AAA: a single entry at rank 2 covers all three layers. some- ⇔ F₃P.
Equations
- Phenomena.Indefinites.Studies.Dekier2021.englishLex = [{ rank := 2, exponent := "some-" }]
Instances For
Yakut ABB: -eme at rank 0 (F₁P), -ere at rank 2 (F₃P). Elsewhere gives -eme for NS, -ere covers SU and SK.
Equations
- Phenomena.Indefinites.Studies.Dekier2021.yakutLex = [{ rank := 0, exponent := "-eme" }, { rank := 2, exponent := "-ere" }]
Instances For
Latin AAB: ali- at rank 1 (F₂P), -dam at rank 2 (F₃P). Both entries match NS and SU via Superset, and ali- wins there via Elsewhere (closer match); only -dam is large enough to match SK.
Note that the nanosyntactic derivation is complex: ali- is a prefix (subderivation), -dam is a suffix (constituent extraction).
Equations
- Phenomena.Indefinites.Studies.Dekier2021.latinLex = [{ rank := 1, exponent := "ali-" }, { rank := 2, exponent := "-dam" }]
Instances For
Russian ABC: three entries, one per rank. Each layer gets its own exponent. -nibud' ⇔ F₁P (suffix), -to ⇔ F₂P (suffix), koe- ⇔ F₃P (prefix).
Lithuanian ABC: three entries at ranks 0, 1, 2. @cite{dekier-2021} Table 7.
Patterns are COMPUTED from spellout results, not stipulated. We
derive the syncretism check from each Fragment file's canonical
.form data rather than restating the form strings here. Note:
Russian's Fragment form "-нибудь (-nibud')" differs from Dekier's
Table 7 transliteration "-nibud'"; classifyTriple only inspects
distinctness, so both classifications coincide as ABC.
Lithuanian forms have no Fragment file yet (no Fragments.Lithuanian.Indefinites),
so the strings stay inline here.
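To illustrate why a distinctness-only classifier is insensitive to transliteration, here is a hedged sketch. `PatternSketch` and `classifyTripleSketch` are hypothetical stand-ins; the actual signature of `classifyTriple` is not shown on this page and may differ.

```lean
inductive PatternSketch where
  | AAA | ABB | AAB | ABC | ABA
  deriving Repr, DecidableEq

/-- Classify an (NS, SU, SK) triple by which forms coincide.
    Only equality between cells is inspected, never the strings themselves. -/
def classifyTripleSketch (ns su sk : String) : PatternSketch :=
  if ns == su then
    if su == sk then PatternSketch.AAA else PatternSketch.AAB
  else if su == sk then PatternSketch.ABB
  else if ns == sk then PatternSketch.ABA
  else PatternSketch.ABC

-- Both spellings of the Russian NS marker yield ABC,
-- because each is distinct from "-to" and "koe-".
#eval classifyTripleSketch "-nibud'" "-to" "koe-"
#eval classifyTripleSketch "-нибудь (-nibud')" "-to" "koe-"
-- English: all three cells coincide, so the result is AAA.
#eval classifyTripleSketch "some-" "some-" "some-"
```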
The Elsewhere Principle (@cite{dekier-2021}): "If several lexical items match a syntactic node, insert the entry with the fewest features unspecified for that node."
Combined with the Superset Principle: if entry β (rank rβ) beats
entry α (rank rα > rβ) for case Y, then β also beats α for ALL
cases X < Y. So α cannot resurface below β on the fseq.
This is `smaller_entry_dominates_below` from `Nanosyntax.Core`.
No lexicon analyzed by Dekier produces *ABA.
The ABA pattern itself is unattested cross-linguistically. This aligns with the *ABA generalization of @cite{bobaljik-2012}.
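The reasoning can be checked by brute force on a small scale. The sketch below assumes a three-layer fseq and lexicons with at most one entry per rank, and it tracks only the rank of the winning entry; all names (`keepMin`, `winnerRank`, `isABA`, `allRankSets`) are hypothetical and are not the theorems from `Nanosyntax.Core`.

```lean
/-- Keep the smaller of the current best rank and a new candidate. -/
def keepMin (best : Option Nat) (r : Nat) : Option Nat :=
  match best with
  | none => some r
  | some b => some (min b r)

/-- Rank of the winning entry for a target: the smallest stored rank ≥ target. -/
def winnerRank (ranks : List Nat) (target : Nat) : Option Nat :=
  (ranks.filter (fun r => decide (target ≤ r))).foldl keepMin none

/-- ABA would mean NS and SK share a winner that SU does not share. -/
def isABA (ranks : List Nat) : Bool :=
  winnerRank ranks 0 == winnerRank ranks 2 && winnerRank ranks 0 != winnerRank ranks 1

/-- All eight lexicons with at most one entry per rank. -/
def allRankSets : List (List Nat) :=
  [[], [0], [1], [2], [0, 1], [0, 2], [1, 2], [0, 1, 2]]

#eval allRankSets.any isABA  -- false: no rank set yields ABA
```

The check succeeds because the winning rank can only stay the same or grow as the target rank grows, so NS and SK cannot share a winner that SU lacks.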
@cite{dekier-2021} Table 8: paradigm gaps follow a monotonic pattern. Gaps always start from the TOP of the hierarchy (SK first, then SU). No language has a gap for NS while having a form for SU or SK.
This follows from the Superset Principle: any entry at rank r
matches ALL targets of rank ≤ r. So if ANY entry exists in the
lexicon, NS (rank 0) is always filled.
Paradigm gap lexicons: the gap position corresponds to the ABSENCE of high-rank entries.
Equations
- Phenomena.Indefinites.Studies.Dekier2021.kannadaLex = [{ rank := 0, exponent := "-aadaruu" }, { rank := 1, exponent := "-oo" }]
Instances For
Equations
- Phenomena.Indefinites.Studies.Dekier2021.chineseLex = [{ rank := 0, exponent := "wh-pron" }]
Instances For
Consequence: if NS (rank 0) has no form, nothing does.
Consequence: if SU (rank 1) has no form, SK doesn't either.
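Both consequences can be stated as a Boolean check on a paradigm record. The sketch below mirrors the `ParadigmEntry` fields listed on this page under the hypothetical name `ParadigmEntrySketch`; `gapsMonotone` and the `unattested` record are illustrative assumptions, not part of the formalization.

```lean
structure ParadigmEntrySketch where
  language : String
  nsForm : Option String
  suForm : Option String
  skForm : Option String

/-- Gaps are monotonic: no NS form implies no SU form,
    and no SU form implies no SK form. -/
def ParadigmEntrySketch.gapsMonotone (p : ParadigmEntrySketch) : Bool :=
  (p.nsForm.isSome || p.suForm.isNone) && (p.suForm.isSome || p.skForm.isNone)

def kannada : ParadigmEntrySketch :=
  { language := "Kannada", nsForm := some "-aadaruu", suForm := some "-oo", skForm := none }

def mandarin : ParadigmEntrySketch :=
  { language := "Mandarin Chinese", nsForm := some "wh-pron", suForm := none, skForm := none }

def swahili : ParadigmEntrySketch :=
  { language := "Swahili", nsForm := none, suForm := none, skForm := none }

/-- Hypothetical paradigm of the kind the analysis rules out: an NS gap with SU filled. -/
def unattested : ParadigmEntrySketch :=
  { language := "(hypothetical)", nsForm := none, suForm := some "x", skForm := none }

#eval [kannada, mandarin, swahili].all (fun p => p.gapsMonotone)  -- true
#eval unattested.gapsMonotone                                      -- false
```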
@cite{dekier-2021}: the nanosyntactic derivation predicts a structural difference between prefixes and suffixes:
- **Suffix**: formed via spellout-driven movement (roll-up).
The stem moves above the indefinite layer, leaving a remnant
constituent with a unary foot. Result: stem + marker.
- **Prefix**: formed via subderivation. The indefinite layers
are built in a parallel derivation and integrated as a complex
left branch. Result: marker + stem.
In Russian:
- *-nibud'* (F₁P, rank 0): suffix, stem rolls up past F₁
- *-to* (F₂P, rank 1): suffix, stem rolls up past F₂
- *koe-* (F₃P, rank 2): prefix, subderived [F₁, F₂, F₃]
Prediction: in a language with both prefixes and suffixes,
the morphological boundary (prefix/suffix break) correlates with
the derivational mechanism switch (spellout movement → subderivation).
Russian indefinite markers with their morphological types. @cite{dekier-2021}.
- form : String
- rank : ℕ
- morphType : Morphology.Nanosyntax.MorphType
Instances For
In Russian, suffixes occupy the lower ranks and the prefix occupies the highest rank. This matches the spellout-movement (low) vs subderivation (high) prediction.
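The rank ordering can be checked mechanically. In this sketch `MorphTypeSketch` is a hypothetical stand-in for `Morphology.Nanosyntax.MorphType`, whose actual constructors are not shown on this page; the record fields follow the `form`, `rank`, `morphType` listing above.

```lean
inductive MorphTypeSketch where
  | pfx | sfx
  deriving Repr, DecidableEq

structure MarkerSketch where
  form : String
  rank : Nat
  morphType : MorphTypeSketch

def russianMarkers : List MarkerSketch :=
  [ { form := "-nibud'", rank := 0, morphType := MorphTypeSketch.sfx },
    { form := "-to", rank := 1, morphType := MorphTypeSketch.sfx },
    { form := "koe-", rank := 2, morphType := MorphTypeSketch.pfx } ]

/-- Every suffix sits at a strictly lower rank than every prefix. -/
def suffixesBelowPrefixes (ms : List MarkerSketch) : Bool :=
  ms.all (fun s => ms.all (fun p =>
    if s.morphType == MorphTypeSketch.sfx && p.morphType == MorphTypeSketch.pfx
    then decide (s.rank < p.rank)
    else true))

#eval suffixesBelowPrefixes russianMarkers  -- true
```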
@cite{dekier-2021}: the ordering NS < SU < SK is preferred over SK < SU < NS based on functional complexity:
- NS markers only introduce an indefinite entity (simplest)
- SU markers add specificity of the referent
- SK markers add speaker knowledge of the referent's identity
Each higher layer adds a functional property, matching the
nanosyntactic assumption that higher layers on the fseq encode
more complex functional content.
Both orderings are compatible with the syncretism data. The
functional complexity argument selects NS < SU < SK.
Connect the nanosyntactic spellout results to the typed indefinite
entries in the fragment files. The fragment entries use the
@cite{haspelmath-1997} function-coverage substrate; the D&A typology
is a projection living in Theories/Semantics/Quantification/DeganoAloni2025.lean.
Dekier's syntactic hierarchy is the candidate counterpart on the
morphological side; the bridge here pairs each Fragment entry's
function coverage with the lexicon's spellout result.
English some- fills all three functions — consistent with a single nanosyntactic entry at rank 2 (F₃P).
Russian paradigm: three fragments match three spellout results.
Yakut paradigm: two fragments match two spellout results.
Latin paradigm: two fragments match two spellout results.
Kannada: the SK gap in the nanosyntactic model aligns with the absence of a SK-covering form in the fragment data.
The ParadigmEntry records (Tables 7 & 8) and the nanosyntactic
lexicons are two independent representations of the same data.
These theorems verify they agree.
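An agreement check of this kind can be sketched as a Boolean comparison between a paradigm record and the spellout of a lexicon at ranks 0, 1, and 2. Everything named below (`LexEntrySketch`, `ParadigmEntrySketch`, `spelloutSketch`, `agrees`) is a hypothetical stand-in; the actual theorems in this file are not rendered on this page.

```lean
structure LexEntrySketch where
  rank : Nat
  exponent : String

structure ParadigmEntrySketch where
  language : String
  nsForm : Option String
  suForm : Option String
  skForm : Option String

def pickSmaller (best : Option LexEntrySketch) (e : LexEntrySketch) :
    Option LexEntrySketch :=
  match best with
  | none => some e
  | some b => if e.rank < b.rank then some e else some b

/-- Superset filter followed by Elsewhere choice; `none` marks a gap. -/
def spelloutSketch (lex : List LexEntrySketch) (target : Nat) : Option String :=
  let candidates := lex.filter (fun e => decide (target ≤ e.rank))
  (candidates.foldl pickSmaller none).map (fun e => e.exponent)

/-- The paradigm record and the lexicon agree cell by cell. -/
def agrees (p : ParadigmEntrySketch) (lex : List LexEntrySketch) : Bool :=
  spelloutSketch lex 0 == p.nsForm &&
  spelloutSketch lex 1 == p.suForm &&
  spelloutSketch lex 2 == p.skForm

def pYakut : ParadigmEntrySketch :=
  { language := "Yakut", nsForm := some "-eme", suForm := some "-ere", skForm := some "-ere" }

def yakutLex : List LexEntrySketch :=
  [{ rank := 0, exponent := "-eme" }, { rank := 2, exponent := "-ere" }]

def pKannada : ParadigmEntrySketch :=
  { language := "Kannada", nsForm := some "-aadaruu", suForm := some "-oo", skForm := none }

def kannadaLex : List LexEntrySketch :=
  [{ rank := 0, exponent := "-aadaruu" }, { rank := 1, exponent := "-oo" }]

#eval agrees pYakut yakutLex      -- true
#eval agrees pKannada kannadaLex  -- true: the SK gap shows up as `none` on both sides
```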
@cite{dekier-2021} analyzed 45 languages total: Basque, Bulgarian, Catalan, Czech, Dutch, English, Filipino, Finnish, French, Georgian, German, Greek, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Irish, Italian, Japanese, Kannada, Kazakh, Korean, Latin, Latvian, Lithuanian, Lezgian, Maltese, Mandarin Chinese, Nanay, Ossetic, Persian, Polish, Portuguese, Quechua, Romanian, Russian, Serbian/Croatian, (Colombian) Spanish, Swahili, Swedish, Turkish, and Yakut.
Of these, 20 have complete paradigms with explicit forms in
Table 7 (formalized above), and 6 have paradigm gaps (Table 8).
The remaining 19 are discussed in the appendices or show patterns
consistent with the four attested types.
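As a sanity check on the breakdown, the three groups add up to the stated sample size. This one-line Lean example only verifies the arithmetic; it makes no claim about which languages fall in which group.

```lean
-- 20 complete paradigms + 6 gap paradigms + 19 remaining languages = 45
example : 20 + 6 + 19 = 45 := rfl
```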
No language in the sample violates *ABA.
All paradigm-gap languages have gaps at the TOP of the hierarchy, consistent with the monotonicity prediction.