Little, Moroney & Royer (2022) #
@cite{little-moroney-royer-2022}
Classifiers can be for numerals or nouns: Two strategies for numeral modification. Glossa 7(1). 1–35.
Core Claim #
Numeral classifiers form a heterogeneous class. Two families of theories — classifier-for-numeral (CLF-for-NUM) and classifier-for-noun (CLF-for-N) — are both correct, but for different languages:
- Ch'ol (Mayan): CLF-for-NUM — the classifier is a measure function required by the numeral.
- Shan (Kra-Dai): CLF-for-N — the classifier atomizes the noun denotation so the numeral can count.
Four Predictions (Table 8) #
The two strategies make divergent predictions about classifier distribution:
- (num) Variation in whether a numeral requires a CLF → CLF-for-NUM
- (noun) Variation in whether a noun requires a CLF → CLF-for-N
- (noun) CLF found beyond numerals (quantifiers, demonstratives) → CLF-for-N
- (num) CLF appears in counting (no noun present) → CLF-for-NUM
Ch'ol shows predictions 1 and 4; Shan shows predictions 2 and 3.
Semantic Equivalence #
Despite different derivational strategies, both languages derive the same meaning for "two dogs": {ab, ac, bc} — the set of pluralities of two dogs.
Architectural Note #
CLF-for-NUM is formalized using Mereology.QMOD — the measure function
Finset.card produces a quantized predicate (clfForNum_qua).
CLF-for-N is formalized directly as atom-pair selection: ∃ d₁ d₂, d₁ ≠ d₂ ∧ s = {d₁, d₂}.
The ClassifierStrategy enum in Typology captures the
typological parameter.
Note: Mereology.atomize cannot be applied to Finset Dog directly because
Finset has ∅ as a bottom element — Mereology.Atom (no proper part)
is only satisfied by ∅, so atomize(DOGS) would be empty. The CLF-for-N
semantics is instead formalized at the element level: singletons {d} are
the atoms, and clfForNounSem selects 2-element unions of distinct atoms.
The extensional equivalence (derivations_extensionally_equal) bridges the
two via Finset.card_eq_two.
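The element-level strategy described in this note can be sketched as follows. This is a minimal sketch: `Dog` is a stand-in three-element domain, and the definitions may differ in detail from the file's.

```lean
import Mathlib

-- Stand-in three-element domain (the file's actual Dog type may differ).
inductive Dog | a | b | c
  deriving DecidableEq

-- Singletons {d} play the role of atoms; the classifier-plus-numeral
-- denotation selects unions of exactly two distinct atoms.
def clfForNounSem (s : Finset Dog) : Prop :=
  ∃ d₁ d₂ : Dog, d₁ ≠ d₂ ∧ s = {d₁, d₂}

-- The bridge to the cardinality-based CLF-for-NUM side is Finset.card_eq_two.
example (s : Finset Dog) : clfForNounSem s ↔ s.card = 2 :=
  Finset.card_eq_two.symm
```

Because the bottom element ∅ blocks `Mereology.atomize` here, all individuation work happens at the level of elements and their singletons, as the note describes.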
Ch'ol noun categorization system: numeral classifier, CLF-for-NUM. @cite{bale-coon-2014} @cite{bale-et-al-2019}
Key properties:
- Classifiers are bound to the numeral (suffixes)
- Only Mayan-based numerals (1–6) take classifiers; Spanish loans do not
- Classifiers appear in counting contexts (no noun)
- Plural marking -ob co-occurs with classifiers (ex. 30)
- Classifiers are ungrammatical with quantifiers, demonstratives, modifiers (ex. 19)
Shan noun categorization system: numeral classifier, CLF-for-N. @cite{moroney-2021}
Key properties:
- Classifiers are free morphemes derived from nominal elements
- All numerals uniformly require classifiers (no idiosyncrasies)
- Classifiers appear with quantifiers, demonstratives, relative clauses (ex. 42)
- Classifiers degraded/unacceptable in counting contexts (exs. 48–49)
- No plural–classifier co-occurrence
The four distributional predictions from the CLF-for-NUM vs CLF-for-N distinction (Table 2/7/8).
- numeralIdiosyncrasies : Bool
Prediction 1: Idiosyncrasies in whether a numeral requires a CLF. Expected for CLF-for-NUM (measure function may be built into numeral).
- nounIdiosyncrasies : Bool
Prediction 2: Idiosyncrasies in whether a noun requires a CLF. Expected for CLF-for-N (some nouns may already denote atoms).
- clfBeyondNumerals : Bool
Prediction 3: CLF found with the noun beyond numerals (quantifiers, demonstratives, relative clauses). Expected for CLF-for-N (CLF is for the noun, not the numeral).
- clfInCounting : Bool
Prediction 4: CLF appears in counting contexts (no noun present). Expected for CLF-for-NUM (CLF is required by the numeral itself).
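Read together, the four diagnostics form a Bool-valued record. A minimal sketch of the structure, with field names taken from the list above (the deriving clauses are an assumption):

```lean
-- Sketch of the Predictions record implied by the field list above.
structure Predictions where
  numeralIdiosyncrasies : Bool  -- Prediction 1: expected for CLF-for-NUM
  nounIdiosyncrasies    : Bool  -- Prediction 2: expected for CLF-for-N
  clfBeyondNumerals     : Bool  -- Prediction 3: expected for CLF-for-N
  clfInCounting         : Bool  -- Prediction 4: expected for CLF-for-NUM
  deriving DecidableEq, Repr
```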
Expected predictions for CLF-for-NUM languages.
Equations
- LittleMoroneyRoyer2022.clfForNumPredictions = { numeralIdiosyncrasies := true, nounIdiosyncrasies := false, clfBeyondNumerals := false, clfInCounting := true }
Expected predictions for CLF-for-N languages.
Equations
- LittleMoroneyRoyer2022.clfForNounPredictions = { numeralIdiosyncrasies := false, nounIdiosyncrasies := true, clfBeyondNumerals := true, clfInCounting := false }
Expected predictions for languages whose classifier system is the @cite{sudo-2016} blocking strategy: classifier semantics live with numerals, not nouns; the silent ∪-operator that lifts numerals to predicates is blocked by the lexical presence of classifiers.
LMR's diagnostic battery applied to Sudo's framework:
- numeralIdiosyncrasies = false: ∪ is uniformly blocked across all numerals; no per-numeral variation (contrast Ch'ol's Mayan-vs-Spanish split, which Sudo's framework does not predict).
- nounIdiosyncrasies = false: explanation lives in numerals, not nouns; uniform across the noun lexicon.
- clfBeyondNumerals = false: classifiers exist to lift numerals to predicate type; they appear with numerals, not beyond them (contrast LMR's CLF-for-N prediction).
- clfInCounting = true: the ∩-operator (Sudo eq. 24) maps the numeral+CL property back to type-n, so number-predicate uses like juu-ni-nin-da "the number is twelve people" (Sudo eq. 22a) are well-formed (contrast LMR's CLF-for-N prediction).
Equations
- LittleMoroneyRoyer2022.sudoBlockingPredictions = { numeralIdiosyncrasies := false, nounIdiosyncrasies := false, clfBeyondNumerals := false, clfInCounting := true }
Derive LMR's distributional predictions from any classifier strategy.
This is the paper's core claim for .forNumeral and .forNoun;
extended here to .sudoBlocking per @cite{sudo-2016}'s analysis.
Equations
- LittleMoroneyRoyer2022.predictionsOf Typology.ClassifierStrategy.forNumeral = LittleMoroneyRoyer2022.clfForNumPredictions
- LittleMoroneyRoyer2022.predictionsOf Typology.ClassifierStrategy.forNoun = LittleMoroneyRoyer2022.clfForNounPredictions
- LittleMoroneyRoyer2022.predictionsOf Typology.ClassifierStrategy.sudoBlocking = LittleMoroneyRoyer2022.sudoBlockingPredictions
§2b: LMR's per-language strategy assignments #
@cite{little-moroney-royer-2022} assigns Ch'ol the CLF-for-NUM strategy
and Shan the CLF-for-N strategy. These assignments are consistent with the
@cite{chierchia-1998} CLF-for-N assignment for Mandarin/Japanese (LMR
treat Sinitic and Japonic as CLF-for-N). Per-language assignments live
here (in this study file) rather than on NounCategorizationSystem.
LMR's strategy assignment for Ch'ol: classifier is a measure function required by the numeral.
LMR's strategy assignment for Shan: classifier atomizes the noun denotation.
Ch'ol predictions derived from LMR's CLF-for-NUM assignment.
Shan predictions derived from LMR's CLF-for-N assignment.
The CLF-for-NUM and CLF-for-N profiles are distinct — the two LMR strategies make genuinely different predictions on all four diagnostics.
Ch'ol predictions follow from LMR's strategy assignment via predictionsOf.
Shan predictions follow from LMR's strategy assignment via predictionsOf.
@cite{chierchia-1998} and @cite{sudo-2016} disagree on Japanese's
classifier strategy: Chierchia assigns .forNoun, Sudo assigns
.sudoBlocking. When run through LMR's diagnostic battery, the two
strategies make divergent empirical predictions:
The empirical wedge: under LMR's diagnostics, Chierchia's .forNoun
and Sudo's .sudoBlocking agree on numeralIdiosyncrasies = false
but diverge on the other three. The most decisive disagreement is
clfInCounting: Sudo predicts true (citing eq. 22a — juu-ni-nin-da
"the number is twelve people" is well-formed via the ∩-operator),
Chierchia predicts false. Japanese empirically exhibits the Sudo
pattern on this diagnostic.
Symmetric divergence on clfBeyondNumerals: Chierchia predicts true
(CLF appears with quantifiers, demonstratives, relative clauses
independent of numerals); Sudo predicts false (CLF exists for
numerals, not beyond them).
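The divergence can be checked mechanically. A minimal sketch encoding the two profiles as Bool 4-tuples, with values copied from clfForNounPredictions and sudoBlockingPredictions above:

```lean
-- Profiles ordered as (numeralIdiosyncrasies, nounIdiosyncrasies,
-- clfBeyondNumerals, clfInCounting); values from the definitions above.
def chierchiaProfile : Bool × Bool × Bool × Bool := (false, true, true, false)
def sudoProfile      : Bool × Bool × Bool × Bool := (false, false, false, true)

-- They agree on the first diagnostic and diverge on the remaining three.
example : chierchiaProfile.1 = sudoProfile.1 := rfl
example : chierchiaProfile ≠ sudoProfile := by decide
```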
Grammaticality judgments for Ch'ol classifier distribution (§3.1, §4). Each datum records whether a CLF appears in a given syntactic context.
- language : String
- context : String
- clfPresent : Bool
- grammatical : Bool
Ch'ol: CLF only with numerals and interrogative jay- 'how many'.
Shan: CLF with numerals, quantifiers, demonstratives, relative clauses.
@cite{little-moroney-royer-2022} §3.4 refine @cite{greenberg-1972}'s complementarity universal. The original says numeral classifiers and obligatory number marking are in complementary distribution. The refinement: this holds for CLF-for-N (where CLF and PL occupy the same functional projection) but not for CLF-for-NUM (where CLF is in a separate projection and can co-occur with PL).
- Ch'ol (CLF-for-NUM): cha'-tyikil wiñik-ob 'two-CLF men-PL' (ex. 30)
- Shan (CLF-for-N): *mǎa sǎam tǒ khǎw 'dog three CLF PL' (unattested)
Prediction 3 (CLF beyond numerals) is derived from the system's scopes. CLF-for-N classifiers serve the noun, so they appear wherever the noun needs individuation — not just with numerals.
Ch'ol constituency (51): numeral and classifier form a constituent:
[[cha'-kojty]_NumCLF [ts'i']_N]
The numeral cha' first combines with the classifier -kojty to form a measure phrase, which then applies to the noun ts'i' 'dog'.
Equations
- LittleMoroneyRoyer2022.cholTree = ((Core.Tree.Tree.leaf "cha'").bin (Core.Tree.Tree.leaf "kojty")).bin (Core.Tree.Tree.leaf "ts'i'")
Shan constituency (52): classifier and noun form a constituent:
[[sǒŋ]_Num [[tǒ]_CLF [mǎa]_N]]
The classifier tǒ first combines with the noun mǎa 'dog' to atomize it, then the numeral sǒŋ 'two' selects a 2-element sum.
Equations
- LittleMoroneyRoyer2022.shanTree = (Core.Tree.Tree.leaf "sǒŋ").bin ((Core.Tree.Tree.leaf "tǒ").bin (Core.Tree.Tree.leaf "mǎa"))
The two derivation trees have different constituency despite both being binary branching over three terminals. In Ch'ol, the left daughter of the root is complex (Num+CLF); in Shan, the right daughter is complex (CLF+N).
This structural difference is what generates the four distributional predictions: if Num+CLF is a constituent, the classifier is part of the numeral's semantics and appears wherever the numeral appears (counting, number reference). If CLF+N is a constituent, the classifier is part of the noun's semantics and appears wherever the noun needs individuation (quantifiers, demonstratives).
CLF-for-NUM derivation: Mereology.QMOD applied to the dog domain.
⟦two-CLF⟧ = λP. QMOD(P, μ#, 2) where μ# = Finset.card.
This uses Mereology.QMOD from Core/Mereology.lean:
QMOD(R, μ, n) = λx. R(x) ∧ μ(x) = n
Equations
- LittleMoroneyRoyer2022.clfForNumSem s = Mereology.QMOD (fun (x : Finset LittleMoroneyRoyer2022.Dog) => x.Nonempty) Finset.card 2 s
CLF-for-N derivation: atomize, then count. ⟦CLF⟧(⟦DOGS⟧) restricts to atoms (singletons), then ⟦TWO⟧ selects 2-element sums from the atomized set. The result: s is the union of exactly two distinct atoms.
Equations
- LittleMoroneyRoyer2022.clfForNounSem s = ∃ (d₁ : LittleMoroneyRoyer2022.Dog) (d₂ : LittleMoroneyRoyer2022.Dog), d₁ ≠ d₂ ∧ s = {d₁, d₂}
The two derivation strategies are extensionally equivalent: QMOD(DOGS, μ#, 2) = {s | ∃ d₁ d₂, d₁ ≠ d₂ ∧ s = {d₁, d₂}}. This is the paper's key semantic result (§5): despite different compositional paths, both strategies produce the same denotation for "two dogs."
The CLF-for-NUM path uses the measure constraint directly (QMOD);
the CLF-for-N path atomizes then forms 2-element sums.
Finset.card_eq_two provides the bridge: a finset has cardinality 2
iff it's a pair of distinct elements.
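The bridge lemma can be stated directly; this is the Mathlib statement of Finset.card_eq_two:

```lean
import Mathlib

-- A finset has cardinality 2 iff it is a pair of distinct elements.
example {α : Type} [DecidableEq α] (s : Finset α) :
    s.card = 2 ↔ ∃ a b, a ≠ b ∧ s = {a, b} :=
  Finset.card_eq_two
```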
The full dog predicate (Nonempty) is cumulative: the union of two
dog-pluralities is a dog-plurality. This is Mereology.CUM applied to
Finset Dog with ⊔ = ∪.
Cumulativity is what forces classifier languages to need a classifier: counting over a CUM predicate is undefined until it's quantized. CLF-for-NUM uses QMOD to quantize directly; CLF-for-N atomizes first.
CLF-for-NUM creates a quantized predicate via QMOD: no proper subset of a 2-element set also has 2 elements.
Proof: y ⊂ x implies |y| < |x| (Finset.card_lt_card),
but both have card 2 — contradiction. This mirrors the general
Mereology.extMeasure_qua pattern (QMOD by any extensive measure
produces QUA), instantiated directly for Finset.card.
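The arithmetic core of this argument is short enough to state on its own, as a sketch over an arbitrary element type:

```lean
import Mathlib

-- A proper subset has strictly smaller cardinality (Finset.card_lt_card),
-- so two finsets of cardinality 2 cannot stand in the proper-subset relation.
example {α : Type} {x y : Finset α} (hx : x.card = 2) (hy : y.card = 2)
    (h : y ⊂ x) : False := by
  have hlt : y.card < x.card := Finset.card_lt_card h
  omega
```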
CLF-for-N also creates a quantized predicate: no proper subset of a pair of distinct dogs is also a pair of distinct dogs. Both strategies convert CUM predicates to QUA predicates — this is the semantic function of classifiers regardless of strategy.
Proof: if y ⊂ x and both satisfy clfForNounSem, then by
derivations_extensionally_equal, both have card 2. But y ⊂ x
implies |y| < |x| — contradiction.
Both strategies quantize: the semantic function of classifiers is to turn CUM predicates into QUA predicates, enabling counting.
Concrete witness: {a, b} is a two-dog plurality.
Concrete witness: {a, c} is a two-dog plurality.
Concrete witness: {b, c} is a two-dog plurality.
Singletons are not two-dog pluralities: the measure constraint excludes them. This is why CLF-for-N atomization alone doesn't suffice — the numeral still needs to select the right cardinality.
The triple is not a two-dog plurality: QMOD excludes oversized sums.
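These witness and non-witness facts can be checked by computation. A minimal sketch over a stand-in three-element domain, where `twoDogs` abbreviates the shared `card = 2` condition that both derivations reduce to:

```lean
import Mathlib

-- Stand-ins for the file's Dog type and two-dog denotation.
inductive Dog | a | b | c
  deriving DecidableEq

abbrev twoDogs (s : Finset Dog) : Prop := s.card = 2

example : twoDogs {Dog.a, Dog.b} := by decide            -- witness
example : ¬ twoDogs {Dog.a} := by decide                 -- singletons excluded
example : ¬ twoDogs {Dog.a, Dog.b, Dog.c} := by decide   -- oversized sums excluded
```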
Extended system list including Ch'ol and Shan.
Ch'ol and Shan are both numeral classifier systems in Aikhenvald's
typology, but have different classifier strategies.
They agree on Aikhenvald's morphosyntactic classification but differ
on the semantic level — illustrating that ClassifierType is too
coarse to capture the CLF-for-NUM vs CLF-for-N distinction.
Sample-restricted: in the 7-language Aikhenvald sample plus Ch'ol and Shan, every classifier-type language lacks agreement.
@cite{chierchia-1998}'s NMP predicts CLF-for-N for [+arg, -pred] languages (Mandarin, Japanese). Shan is also CLF-for-N per @cite{little-moroney-royer-2022}, despite being Kra-Dai not Sino-Tibetan — the strategy is independent of the NMP parameter. Ch'ol is CLF-for-NUM, which @cite{chierchia-1998} does not predict (Ch'ol is not a [+arg, -pred] language in the NMP typology).
This connects the two classifier study files: Chierchia predicts the strategy for Mandarin/Japanese; Little et al. provide the diagnostic framework that confirms it and extends it to new languages.
Ch'ol's CLF-for-NUM strategy differs from the Chierchia-predicted CLF-for-N found in Mandarin and Japanese. This is the paper's main typological contribution: not all numeral classifier languages use the same semantic strategy.
The unified classifierDenot correctly dispatches based on strategy:
- CLF-for-N → clfForNoun (= atomize)
- CLF-for-NUM → clfForNum (= QMOD)
This confirms that the typological enum in Typology
is structurally connected to semantic content, not just a label.
The local clfForNumSem IS QMOD from Core.Mereology: both compute
R(x) ∧ μ(x) = n with μ = Finset.card and n = 2. The unified
clfForNum specializes QMOD to ℚ; the local definition uses ℕ
directly. Both reduce to QMOD.
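The identification can be made concrete. A sketch of QMOD from its quoted definition; the actual Core.Mereology version may differ in universe or type details:

```lean
import Mathlib

-- QMOD(R, μ, n) = λx. R(x) ∧ μ(x) = n, as quoted above.
def QMOD {α M : Type} (R : α → Prop) (μ : α → M) (n : M) : α → Prop :=
  fun x => R x ∧ μ x = n

-- Instantiating μ := Finset.card and n := 2 gives the clfForNumSem shape
-- definitionally.
example {α : Type} (s : Finset α) :
    QMOD (fun x : Finset α => x.Nonempty) Finset.card 2 s ↔ s.Nonempty ∧ s.card = 2 :=
  Iff.rfl
```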