Numeral typology: substrate types and WALS data
@cite{wals-2013} (Chs 53, 54, 55, 56, 131) @cite{aikhenvald-2000} @cite{greenberg-1978} @cite{stolz-veselinova-2013}
Type-level enums and a per-language profile struct for numeral systems across @cite{wals-2013} chapters 53 (Stolz & Veselinova), 54–56 (Gil), and 131 (Comrie): ordinal formation, distributive numerals, numeral classifiers, conjunction-quantifier identity, and numeral base. Also included are WALS distribution data, the principal cross-linguistic generalizations, and the Greenberg suppletion hierarchy for ordinal formation.
ClassifierStatus (WALS Ch 55) and fromWALS55A live in Typology/ClassifierSystem.lean and are re-imported here.
Schema
- OrdinalFormation (Ch 53): ordinal numeral formation
- DistributiveNumeral (Ch 54): distributive marking strategy
- ConjunctionQuantifier (Ch 56): identity of 'and' and 'all'
- Region: areal grouping (for cross-linguistic generalizations)
- PluralMarking: nominal plural status (Sanches-Slobin)
- NumeralBase (Ch 131): decimal/vigesimal/etc.
- NumeralProfile: per-language bundle
Per-language data lives in Fragments/{Lang}/Numerals.lean.
WALS Ch 53: how a language forms ordinal numerals from cardinals. The dominant pattern is "first" suppletive + higher ordinals regular.
- firstSuppletion : OrdinalFormation
"first" is suppletive, "second" onward regular (e.g., German erste 'first' vs. zweite 'second', built on zwei 'two').
- firstSecondSuppletion : OrdinalFormation
"first" and "second" suppletive, "third" onward regular (e.g., English first, second).
- allFromCardinals : OrdinalFormation
All ordinals derived regularly from cardinals.
- various : OrdinalFormation
Mixed strategies, no single dominant pattern.
- noOrdinals : OrdinalFormation
No productive ordinal formation reported.
WALS Ch 54: whether and how a language marks distributive numerals ("N each" / "N apiece"). Reduplication is the most widespread dedicated strategy.
- noDistributive : DistributiveNumeral
No dedicated distributive numeral form.
- markedByReduplication : DistributiveNumeral
Cardinal is reduplicated (e.g., Georgian sam-sami 'three each', Tagalog dalawa-dalawa).
- markedBySuffix : DistributiveNumeral
Suffix creates distributive (e.g., Turkish iki-şer 'two each').
- markedByPrefix : DistributiveNumeral
Prefix creates distributive.
- markedByOtherMeans : DistributiveNumeral
Other strategies (particles, circumfix, etc.).
WALS Ch 56 (Gil): relationship between 'and' and 'all/every'. Identity reflects a deep connection between conjunction (exhaustive pairing) and universal quantification (exhaustive predication).
- identity : ConjunctionQuantifier
Same morpheme for 'and' and 'all' (e.g., Mandarin dou, Tagalog lahat).
- differentiation : ConjunctionQuantifier
Different morphemes (e.g., English and/all, Japanese to/subete).
Areal region — used for areal generalizations about classifier distribution (e.g., Sanches-Slobin's classifier-belt observation).
- europe : Region
- eastAsia : Region
- southeastAsia : Region
- southAsia : Region
- centralAsia : Region
- westAsia : Region
- africa : Region
- northAmerica : Region
- mesoamerica : Region
- southAmerica : Region
- oceania : Region
Status of grammatical plural marking on common nouns: obligatory, optional, or absent. Used for the Sanches-Slobin generalization relating numeral classifiers and plural.
- obligatory : PluralMarking
Plural marking required (e.g., English, Spanish).
- optional : PluralMarking
Plural marking available but not required (e.g., Korean).
- none : PluralMarking
No grammatical plural on nouns (e.g., Mandarin, Japanese).
WALS Ch 131 (Comrie): the base of a language's numeral system. Most languages use a decimal (base-10) system.
- decimal : NumeralBase
Base 10 (e.g., English, Mandarin, Swahili).
- vigesimal : NumeralBase
Pure base 20 (e.g., Ainu, Chukchi).
- hybridVigesimalDecimal : NumeralBase
Mixed base-20 / base-10 (e.g., French, Basque, Georgian).
- otherBase : NumeralBase
Base 5, 6, or other (rare).
- bodyPartSystem : NumeralBase
Extended body-part counting system (e.g., Eipo).
- restricted : NumeralBase
Restricted numeral system (few numerals, no productive base).
A language's numeral typology profile: the four WALS dimensions of Chs 53–56 and an optional numeral base (Ch 131), plus areal region and plural-marking status.
- language : String
- iso : String
ISO 639-3 code.
- ordinal : OrdinalFormation
Ch 53: ordinal numeral formation.
- distributive : DistributiveNumeral
Ch 54: distributive numeral marking.
- classifier : ClassifierStatus
Ch 55: numeral classifier status.
- conjQuant : ConjunctionQuantifier
Ch 56: conjunction-quantifier relationship.
- region : Region
Areal region (for areal generalizations).
- pluralMarking : PluralMarking
Plural marking on common nouns (for Sanches-Slobin).
- numeralBase : Option NumeralBase
Ch 131: numeral base (optional; not all languages surveyed).
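The field list above can be read as an ordinary Lean structure. The following is a minimal sketch under stated assumptions rather than the module's own source: the constructors of ClassifierStatus are hypothetical stand-ins (the real type lives in Typology/ClassifierSystem.lean), and the sample value is a placeholder, not a WALS coding of any language.

```lean
namespace ProfileSketch

inductive OrdinalFormation where
  | firstSuppletion | firstSecondSuppletion | allFromCardinals | various | noOrdinals
  deriving DecidableEq, Repr

inductive DistributiveNumeral where
  | noDistributive | markedByReduplication | markedBySuffix | markedByPrefix
  | markedByOtherMeans
  deriving DecidableEq, Repr

-- Hypothetical stand-in: constructor names are assumed for illustration only.
inductive ClassifierStatus where
  | absent | optional | obligatory
  deriving DecidableEq, Repr

inductive ConjunctionQuantifier where
  | identity | differentiation
  deriving DecidableEq, Repr

inductive Region where
  | europe | eastAsia | southeastAsia | southAsia | centralAsia | westAsia
  | africa | northAmerica | mesoamerica | southAmerica | oceania
  deriving DecidableEq, Repr

inductive PluralMarking where
  | obligatory | optional | none
  deriving DecidableEq, Repr

inductive NumeralBase where
  | decimal | vigesimal | hybridVigesimalDecimal | otherBase | bodyPartSystem
  | restricted
  deriving DecidableEq, Repr

structure NumeralProfile where
  language      : String
  iso           : String                 -- ISO 639-3 code
  ordinal       : OrdinalFormation       -- Ch 53
  distributive  : DistributiveNumeral    -- Ch 54
  classifier    : ClassifierStatus       -- Ch 55
  conjQuant     : ConjunctionQuantifier  -- Ch 56
  region        : Region
  pluralMarking : PluralMarking
  numeralBase   : Option NumeralBase     -- Ch 131; not every language is surveyed
  deriving Repr

-- Placeholder profile: the values are illustrative, not real data.
def placeholder : NumeralProfile :=
  { language := "Placeholder", iso := "qaa",
    ordinal := OrdinalFormation.firstSuppletion,
    distributive := DistributiveNumeral.noDistributive,
    classifier := ClassifierStatus.absent,
    conjQuant := ConjunctionQuantifier.differentiation,
    region := Region.europe,
    pluralMarking := PluralMarking.obligatory,
    numeralBase := some NumeralBase.decimal }

example : placeholder.numeralBase = some NumeralBase.decimal := rfl

end ProfileSketch
```

Making numeralBase an Option lets a profile record that Ch 131 has no value for a language without forcing a sixth "unknown" constructor onto NumeralBase.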
Does a language have obligatory numeral classifiers?
Does a language have any numeral classifiers (obligatory or optional)?
Does a language have obligatory plural marking on common nouns?
Does a language form "first" by suppletion?
Does a language have a morphological distributive numeral form?
Is a language in the East/Southeast Asian region?
Equations
- p.isEastSoutheastAsian = (p.region == Typology.Region.eastAsia || p.region == Typology.Region.southeastAsia)
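As a rough illustration of how a Boolean predicate of this kind behaves, the sketch below restates the equation for isEastSoutheastAsian over a cut-down profile. The structure `RegionOnlyProfile` is a hypothetical stand-in that carries only the fields needed here; the real predicate is defined on NumeralProfile.

```lean
namespace PredicateSketch

inductive Region where
  | europe | eastAsia | southeastAsia | southAsia | centralAsia | westAsia
  | africa | northAmerica | mesoamerica | southAmerica | oceania
  deriving DecidableEq, Repr

-- Hypothetical cut-down profile: only the fields this predicate inspects.
structure RegionOnlyProfile where
  language : String
  region   : Region

-- Same shape as the equation above: a disjunction of two `==` tests.
def RegionOnlyProfile.isEastSoutheastAsian (p : RegionOnlyProfile) : Bool :=
  p.region == Region.eastAsia || p.region == Region.southeastAsia

def inBelt : RegionOnlyProfile :=
  { language := "Placeholder A", region := Region.southeastAsia }

def outsideBelt : RegionOnlyProfile :=
  { language := "Placeholder B", region := Region.europe }

example : inBelt.isEastSoutheastAsian = true := by decide
example : outsideBelt.isEastSoutheastAsian = false := by decide

end PredicateSketch
```

The `==` tests rely on the BEq instance that Lean derives from DecidableEq, so no separate equality function is needed for Region.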
Convert WALS 53A ordinal numeral values to the substrate enum.
WALS distinguishes eight subtypes; we collapse them into five categories. "First, two-th, three-th" and "First, two, three" both map to firstSuppletion (only "first" suppletive). "One, two, three" and "First/one-th, …" map to various.
Equations
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.none = Typology.OrdinalFormation.noOrdinals
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.oneTwoThree = Typology.OrdinalFormation.various
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.firstTwoThree = Typology.OrdinalFormation.firstSuppletion
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.oneThTwoThThreeTh = Typology.OrdinalFormation.allFromCardinals
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.firstOneThTwoThThreeTh = Typology.OrdinalFormation.various
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.firstTwoThThreeTh = Typology.OrdinalFormation.firstSuppletion
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.firstSecondThreeTh = Typology.OrdinalFormation.firstSecondSuppletion
- Typology.fromWALS53A Data.WALS.F53A.OrdinalNumerals.various = Typology.OrdinalFormation.various
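The eight equations above amount to a total function defined by pattern matching. A minimal sketch, assuming a local copy of the source enum (the type `OrdinalNumerals` below is a stand-in for Data.WALS.F53A.OrdinalNumerals, which is not reproduced here):

```lean
namespace WALS53ASketch

-- Stand-in for Data.WALS.F53A.OrdinalNumerals (eight WALS values).
inductive OrdinalNumerals where
  | none | oneTwoThree | firstTwoThree | oneThTwoThThreeTh
  | firstOneThTwoThThreeTh | firstTwoThThreeTh | firstSecondThreeTh | various
  deriving DecidableEq, Repr

inductive OrdinalFormation where
  | firstSuppletion | firstSecondSuppletion | allFromCardinals | various | noOrdinals
  deriving DecidableEq, Repr

-- Eight WALS values collapse into five substrate categories.
def fromWALS53A : OrdinalNumerals → OrdinalFormation
  | .none                   => .noOrdinals
  | .oneTwoThree            => .various
  | .firstTwoThree          => .firstSuppletion
  | .oneThTwoThThreeTh      => .allFromCardinals
  | .firstOneThTwoThThreeTh => .various
  | .firstTwoThThreeTh      => .firstSuppletion
  | .firstSecondThreeTh     => .firstSecondSuppletion
  | .various                => .various

-- The collapse is lossy: two distinct WALS values share one target.
example : fromWALS53A .firstTwoThree = fromWALS53A .firstTwoThThreeTh := rfl
example : fromWALS53A .oneTwoThree = OrdinalFormation.various := rfl

end WALS53ASketch
```

Because the match is exhaustive, adding a ninth WALS value to the source enum would make the definition fail to compile until a mapping is chosen for it.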
Convert WALS 54A distributive numeral values to the substrate enum.
Word-level and mixed strategies collapse into markedByOtherMeans.
Equations
- Typology.fromWALS54A Data.WALS.F54A.DistributiveNumerals.noDistributiveNumerals = Typology.DistributiveNumeral.noDistributive
- Typology.fromWALS54A Data.WALS.F54A.DistributiveNumerals.markedByReduplication = Typology.DistributiveNumeral.markedByReduplication
- Typology.fromWALS54A Data.WALS.F54A.DistributiveNumerals.markedByPrefix = Typology.DistributiveNumeral.markedByPrefix
- Typology.fromWALS54A Data.WALS.F54A.DistributiveNumerals.markedBySuffix = Typology.DistributiveNumeral.markedBySuffix
- Typology.fromWALS54A Data.WALS.F54A.DistributiveNumerals.markedByPrecedingWord = Typology.DistributiveNumeral.markedByOtherMeans
- Typology.fromWALS54A Data.WALS.F54A.DistributiveNumerals.markedByFollowingWord = Typology.DistributiveNumeral.markedByOtherMeans
- Typology.fromWALS54A Data.WALS.F54A.DistributiveNumerals.markedByMixedOrOtherStrategies = Typology.DistributiveNumeral.markedByOtherMeans
Convert WALS 131A numeral base values to the substrate enum (one-to-one).
Equations
- Typology.fromWALS131A Data.WALS.F131A.NumeralBases.decimal = Typology.NumeralBase.decimal
- Typology.fromWALS131A Data.WALS.F131A.NumeralBases.pureVigesimal = Typology.NumeralBase.vigesimal
- Typology.fromWALS131A Data.WALS.F131A.NumeralBases.hybridVigesimalDecimal = Typology.NumeralBase.hybridVigesimalDecimal
- Typology.fromWALS131A Data.WALS.F131A.NumeralBases.otherBase = Typology.NumeralBase.otherBase
- Typology.fromWALS131A Data.WALS.F131A.NumeralBases.extendedBodyPartSystem = Typology.NumeralBase.bodyPartSystem
- Typology.fromWALS131A Data.WALS.F131A.NumeralBases.restricted = Typology.NumeralBase.restricted
WALS Chapter 53 distribution: language counts per ordinal formation type. Total: 321 languages.
- firstSuppletion : Nat
- firstSecondSuppletion : Nat
- allFromCardinals : Nat
- various : Nat
- noOrdinals : Nat
Equations
- d.total = d.firstSuppletion + d.firstSecondSuppletion + d.allFromCardinals + d.various + d.noOrdinals
WALS Ch 53 counts (321 languages).
Equations
- Typology.ch53Distribution = { firstSuppletion := 99, firstSecondSuppletion := 45, allFromCardinals := 28, various := 83, noOrdinals := 66 }
WALS Chapter 54 distribution: language counts per distributive type. Total: 251 languages.
- noDistributive : Nat
- reduplication : Nat
- suffixCount : Nat
- prefixCount : Nat
- otherMeans : Nat
Equations
- d.total = d.noDistributive + d.reduplication + d.suffixCount + d.prefixCount + d.otherMeans
WALS Ch 54 counts (251 languages).
Equations
- Typology.ch54Distribution = { noDistributive := 63, reduplication := 85, suffixCount := 34, prefixCount := 19, otherMeans := 50 }
WALS Chapter 56 distribution: language counts per conjunction-quantifier type. Total: 220 languages.
- identity : Nat
- differentiation : Nat
Equations
- d.total = d.identity + d.differentiation
WALS Ch 56 counts (220 languages).
Equations
- Typology.ch56Distribution = { identity := 43, differentiation := 177 }
Suppletive "first" is the dominant ordinal formation strategy (WALS Ch 53). Languages with suppletive "first" (alone or with suppletive "second") outnumber languages where all ordinals derive regularly from cardinals.
"First" suppletion alone is the single most common ordinal pattern.
Languages with some form of ordinal formation (regular or suppletive) outnumber languages lacking ordinals entirely.
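The three Ch 53 claims above reduce to arithmetic on the reported counts, so they can be checked by `decide`. This is a sketch rather than the module's own theorems: the names `Ch53Counts` and `ch53` are hypothetical, and only the five counts (99, 45, 28, 83, 66) come from the data above.

```lean
namespace Ch53Sketch

-- Hypothetical stand-in for the Ch 53 distribution structure.
structure Ch53Counts where
  firstSuppletion       : Nat
  firstSecondSuppletion : Nat
  allFromCardinals      : Nat
  various               : Nat
  noOrdinals            : Nat

def Ch53Counts.total (d : Ch53Counts) : Nat :=
  d.firstSuppletion + d.firstSecondSuppletion + d.allFromCardinals
    + d.various + d.noOrdinals

def ch53 : Ch53Counts :=
  { firstSuppletion := 99, firstSecondSuppletion := 45,
    allFromCardinals := 28, various := 83, noOrdinals := 66 }

-- The counts sum to the stated sample size.
example : ch53.total = 321 := rfl

-- Suppletive "first" (alone or with "second") outnumbers fully regular systems.
example : ch53.firstSuppletion + ch53.firstSecondSuppletion
    > ch53.allFromCardinals := by decide

-- "First"-only suppletion is the largest single category.
example : ch53.firstSuppletion > ch53.firstSecondSuppletion
    ∧ ch53.firstSuppletion > ch53.allFromCardinals
    ∧ ch53.firstSuppletion > ch53.various
    ∧ ch53.firstSuppletion > ch53.noOrdinals := by decide

-- Languages with ordinals outnumber languages without.
example : ch53.total - ch53.noOrdinals > ch53.noOrdinals := by decide

end Ch53Sketch
```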
Languages with dedicated distributive numeral forms outnumber those without, but neither is a negligible minority.
Reduplication is the single most common distributive strategy, outnumbering any other individual morphological means.
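The two Ch 54 claims can be stated the same way. The sketch below assumes a hypothetical structure `Ch54Counts` with the field names listed above; the counts (63, 85, 34, 19, 50) are the ones reported for Ch 54.

```lean
namespace Ch54Sketch

-- Hypothetical stand-in for the Ch 54 distribution structure.
structure Ch54Counts where
  noDistributive : Nat
  reduplication  : Nat
  suffixCount    : Nat
  prefixCount    : Nat
  otherMeans     : Nat

def Ch54Counts.total (d : Ch54Counts) : Nat :=
  d.noDistributive + d.reduplication + d.suffixCount + d.prefixCount + d.otherMeans

def ch54 : Ch54Counts :=
  { noDistributive := 63, reduplication := 85, suffixCount := 34,
    prefixCount := 19, otherMeans := 50 }

-- The counts sum to the stated sample size.
example : ch54.total = 251 := rfl

-- Languages with a dedicated distributive form outnumber those without.
example : ch54.total - ch54.noDistributive > ch54.noDistributive := by decide

-- Reduplication is the largest single marking strategy.
example : ch54.reduplication > ch54.suffixCount
    ∧ ch54.reduplication > ch54.prefixCount
    ∧ ch54.reduplication > ch54.otherMeans := by decide

end Ch54Sketch
```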
Differentiation between 'and' and 'all' is the dominant pattern (WALS Ch 56).
Differentiation accounts for more than three-quarters of the sample.
Identity between 'and' and 'all' is a non-negligible minority pattern, attested in roughly a fifth of languages (43 out of 220).
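The proportion claims for Ch 56 can be phrased over natural numbers by cross-multiplying instead of dividing. A minimal sketch: `Ch56Counts` and `ch56` are hypothetical names, and the bounds used for "roughly a fifth" (strictly between a sixth and a quarter) are an assumption made here for illustration.

```lean
namespace Ch56Sketch

-- Hypothetical stand-in for the Ch 56 distribution structure.
structure Ch56Counts where
  identity        : Nat
  differentiation : Nat

def Ch56Counts.total (d : Ch56Counts) : Nat :=
  d.identity + d.differentiation

def ch56 : Ch56Counts := { identity := 43, differentiation := 177 }

-- The counts sum to the stated sample size.
example : ch56.total = 220 := rfl

-- Differentiation is the dominant pattern.
example : ch56.differentiation > ch56.identity := by decide

-- More than three-quarters: differentiation / total > 3/4,
-- stated as 4 * differentiation > 3 * total to stay in Nat.
example : 4 * ch56.differentiation > 3 * ch56.total := by decide

-- Roughly a fifth (assumed bounds): between a sixth and a quarter of the sample.
example : 6 * ch56.identity > ch56.total ∧ 4 * ch56.identity < ch56.total := by
  decide

end Ch56Sketch
```

Cross-multiplication avoids Nat division, which truncates and would make a statement such as `177 / 220 > 0` uninformative.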
@cite{greenberg-1978}'s implicational universal for ordinal suppletion: if a language has a suppletive ordinal for numeral N, then it has suppletive ordinals for all numerals less than N. Equivalently: suppletion cuts off at some point in the sequence 1st, 2nd, 3rd,… and all ordinals above the cutoff are regular.
The WALS data captures the coarsest version: suppletion is most likely for "first", less likely for "second", and rare beyond that.
- none : SuppletionCutoff
No suppletive ordinals (all regular from cardinals).
- first : SuppletionCutoff
Only "first" is suppletive.
- firstAndSecond : SuppletionCutoff
"first" and "second" are suppletive.
Numeric rank for the suppletion cutoff (higher = more suppletion).
Map ordinal formation type to suppletion cutoff. Languages with 'various' or 'no ordinals' patterns are excluded from the hierarchy.
Equations
- Typology.OrdinalFormation.allFromCardinals.suppletionCutoff = some Typology.SuppletionCutoff.none
- Typology.OrdinalFormation.firstSuppletion.suppletionCutoff = some Typology.SuppletionCutoff.first
- Typology.OrdinalFormation.firstSecondSuppletion.suppletionCutoff = some Typology.SuppletionCutoff.firstAndSecond
- Typology.OrdinalFormation.various.suppletionCutoff = none
- Typology.OrdinalFormation.noOrdinals.suppletionCutoff = none
The hierarchy is consistent: rank of each cutoff increases monotonically.
The WALS aggregate is consistent with the hierarchy: languages with "first"-only suppletion (99) outnumber those with "first" and "second" suppletion (45), which in turn outnumber those with no suppletion at all (28). The first inequality reflects the implicational scale: suppletion at higher numerals is rarer.
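The cutoff mapping, its rank, and the two closing statements can be put together in one short sketch. The rank values 0, 1, 2 are an assumption (the rendered page does not show the rank equations), and `cutoffCount` is a hypothetical helper that pairs each cutoff with its Ch 53 count.

```lean
namespace SuppletionSketch

inductive OrdinalFormation where
  | firstSuppletion | firstSecondSuppletion | allFromCardinals | various | noOrdinals
  deriving DecidableEq, Repr

inductive SuppletionCutoff where
  | none | first | firstAndSecond
  deriving DecidableEq, Repr

-- Assumed ranks: higher means suppletion reaches further up the sequence.
def SuppletionCutoff.rank : SuppletionCutoff → Nat
  | .none           => 0
  | .first          => 1
  | .firstAndSecond => 2

-- 'various' and 'no ordinals' fall outside the hierarchy, hence Option.
def OrdinalFormation.suppletionCutoff : OrdinalFormation → Option SuppletionCutoff
  | .allFromCardinals      => some SuppletionCutoff.none
  | .firstSuppletion       => some SuppletionCutoff.first
  | .firstSecondSuppletion => some SuppletionCutoff.firstAndSecond
  | .various               => Option.none
  | .noOrdinals            => Option.none

-- Ranks increase strictly along none < first < firstAndSecond.
example : SuppletionCutoff.rank .none < SuppletionCutoff.rank .first
    ∧ SuppletionCutoff.rank .first < SuppletionCutoff.rank .firstAndSecond := by
  decide

-- Hypothetical helper: Ch 53 language counts per cutoff.
def cutoffCount : SuppletionCutoff → Nat
  | .first          => 99
  | .firstAndSecond => 45
  | .none           => 28

-- Aggregate ordering of the counts: 99 > 45 > 28.
example : cutoffCount .first > cutoffCount .firstAndSecond
    ∧ cutoffCount .firstAndSecond > cutoffCount .none := by decide

end SuppletionSketch
```

Note that the count ordering is not the same as the rank ordering: the lowest-ranked cutoff (no suppletion) is also the least frequent, so the aggregate supports only the comparison between the two suppletive types.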