Linglib.Typology.ClassifierSystem

Typology.ClassifierSystem

@cite{aikhenvald-2000} @cite{chierchia-1998} @cite{dixon-1982} @cite{downing-1996} @cite{allan-1977} @cite{little-moroney-royer-2022} @cite{krifka-1995} @cite{bale-coon-2014} @cite{jenks-2011} @cite{nomoto-2013} @cite{sudo-2016}

Cross-linguistic typology of noun categorization devices, following @cite{aikhenvald-2000}, Classifiers: A Typology of Noun Categorization Devices.

Provenance

Merged from two sources in the cleanup that dissolved Core/Lexical/: Core/Lexical/NounCategorization.lean, which supplied the vocabulary types (ClassifierType, SemanticParameter, ShapeDimension, CategorizationScope, AssignmentPrinciple, SurfaceRealization, ClassifierEntry, ClassifierStrategy), and the existing Typology/ClassifierSystem.lean, which supplied the NounCategorizationSystem paradigm record and the WALS Ch 55 datapoints. The consolidation follows the Typology/PolarityMarking.lean precedent: vocabulary + entry schema

Sections

What does NOT live here

Theoretical commitments about which strategy mediates numeral-noun composition (forNoun per Chierchia, sudoBlocking per Sudo) live in the relevant Phenomena/Classifiers/Studies/ files, not in the descriptive record defined here. Cross-paper disagreement is proved as theorems, not embedded as metadata.

Out of scope

@cite{corbett-1991} on gender systems is the obvious adjacent reference for the nounClass constructor; not currently consumed by this file but should be cited if a per-language gender-system substrate gets formalised here. The nounClass cell here collapses what Corbett carefully separates (target gender, controller gender, gender vs. classifier).

The 9 focal classifier types on the noun-categorization continuum.

@cite{aikhenvald-2000} establishes these as "focal points" distinguished by morphosyntactic locus, scope, and grammatical function. Real systems are gradient — a language's system may sit between types.

UNVERIFIED: The 9-type taxonomy is Aikhenvald's specific carve-up; other typologists (Allan 1977, Craig 1986, Grinevald 2000, Senft 2000) differ. Corbett 1991 also draws finer distinctions within what the single nounClass cell covers here; see the file docstring section "Out of scope".

  • nounClass : ClassifierType

    Noun class / gender: closed agreement system, realized outside the noun on modifiers (head-modifier NP) or predicate (pred-arg agreement). Small inventory (2–20). Examples: Bantu, Indo-European gender.

  • nounClassifier : ClassifierType

    Noun classifier: independent of other NP elements, characterizes the noun itself. Free forms or affixes on the noun.

  • numeralClassifier : ClassifierType

    Numeral classifier: appears in numeral/quantifier NPs, required for enumeration. Free forms or affixes on the numeral.

  • relationalClassifier : ClassifierType

    Relational classifier: in possessive NPs, characterizes the possessive relation (how the noun can be possessed/handled).

  • possessedClassifier : ClassifierType

    Possessed classifier: in possessive NPs, characterizes the possessed noun in terms of its inherent properties.

  • possessorClassifier : ClassifierType

    Possessor classifier: in possessive NPs, characterizes the possessor. Very rare.

  • verbalClassifier : ClassifierType

    Verbal classifier: marks agreement on the verb with an S or O argument. Incorporated classifiers, affixes, or suppletive verb stems.

  • locativeClassifier : ClassifierType

    Locative classifier: in adpositional NPs, marks agreement with the head noun in locative expressions.

  • deicticClassifier : ClassifierType

    Deictic classifier: appears on deictics, articles, demonstratives. Marks spatial location and/or determination.
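The nine constructors listed above could be declared along the following lines. This is a minimal Lean 4 sketch, not the file's actual source: the constructor names come from this page, while the `deriving` clause and the `ClassifierTypeSketch` namespace are assumptions made for illustration.

```lean
namespace ClassifierTypeSketch

/-- The nine focal types, using the constructor names documented above. -/
inductive ClassifierType where
  | nounClass
  | nounClassifier
  | numeralClassifier
  | relationalClassifier
  | possessedClassifier
  | possessorClassifier
  | verbalClassifier
  | locativeClassifier
  | deicticClassifier
  deriving DecidableEq, Repr

-- With a derived `DecidableEq`, distinctness of constructors is closed by `decide`.
example : ClassifierType.nounClass ≠ ClassifierType.numeralClassifier := by
  decide

end ClassifierTypeSketch
```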


      Universal semantic parameters employed in noun categorization.

      @cite{aikhenvald-2000} identifies three large classes: animacy, physical properties, and function. These parameters are found across ALL types of noun categorization device, though different types show different preferences.


          Dimensionality sub-classification for shape-based classifiers.

          @cite{downing-1996} and @cite{allan-1977} show that shape-based classifiers decompose along a dimensionality axis:

          • 1D: long, slender, elongated (e.g., Japanese 本 hon, Mandarin 条 tiáo)
          • 2D: flat, thin, planar (e.g., Japanese 枚 mai, Mandarin 张 zhāng)
          • 3D: round, compact, globular (e.g., Japanese 個 ko)

              Morphosyntactic scope of a classifier type.


                  Principles governing noun-to-class/classifier assignment.


                      Surface realization of a classifier morpheme.


                          A classifier lexical entry with semantic typing.

                          Each classifier carries its form, a gloss, and the semantic parameters that motivate its selection — making it possible to verify Aikhenvald's claims about which parameters different classifier types encode.

                          • form : String

                            Surface form (e.g. "只", "匹", "本")

                          • gloss : String

                            Gloss (e.g. "small.animal", "flat.bound.object")

                          • semantics : List SemanticParameter

                            Semantic parameters motivating selection of this classifier

                          • isDefault : Bool

                            Is this the "general" or "default" classifier? (个 in Mandarin, つ in Japanese)

                          • isMensural : Bool

                            True for mensural classifiers (configuration/measure); false for sortal classifiers (inherent properties).

                          • shapeDimension : Option ShapeDimension

                            Shape dimensionality sub-classification (@cite{downing-1996}). Only meaningful when semantics includes .shape.
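As an illustration of the record described above, here is a hedged Lean 4 sketch. The field names and types follow this page. The reduced `SemanticParameter` and the `ShapeDimension` constructor names (`oneD`, `twoD`, `threeD`) are hypothetical stand-ins, since the real constructors are not shown here, and the glosses in the two sample entries are illustrative rather than taken from the library's fragment data.

```lean
namespace ClassifierEntrySketch

/-- Reduced stand-in; the real `SemanticParameter` likely has more constructors. -/
inductive SemanticParameter where
  | animacy | shape | function
  deriving DecidableEq, Repr

/-- Hypothetical constructor names for the 1D / 2D / 3D split. -/
inductive ShapeDimension where
  | oneD | twoD | threeD
  deriving DecidableEq, Repr

/-- Field names and types as documented above. -/
structure ClassifierEntry where
  form : String
  gloss : String
  semantics : List SemanticParameter
  isDefault : Bool
  isMensural : Bool
  shapeDimension : Option ShapeDimension
  deriving Repr

/-- Mandarin 张 zhāng: a sortal, shape-based (2D) classifier. -/
def zhang : ClassifierEntry :=
  { form := "张", gloss := "flat.thin.object", semantics := [.shape],
    isDefault := false, isMensural := false, shapeDimension := some .twoD }

/-- Mandarin 个: the general classifier, with no motivating parameter listed. -/
def ge : ClassifierEntry :=
  { form := "个", gloss := "general", semantics := [],
    isDefault := true, isMensural := false, shapeDimension := none }

end ClassifierEntrySketch
```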


                                Whether this classifier encodes a given semantic parameter.


                                  Collect all distinct semantic parameters attested across a classifier inventory. Used to derive preferredSemantics from fragment data rather than hand-listing.
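The two helpers described above (a membership test on one entry, and a duplicate-free collection over an inventory) might look like the sketch below. The names `encodes` and `collectSemantics`, the reduced `SemanticParameter`, and the cut-down `ClassifierEntry` are all assumptions, because the rendered page does not show the real declarations.

```lean
namespace SemanticsSketch

/-- Reduced stand-in for the real parameter type. -/
inductive SemanticParameter where
  | animacy | shape | function
  deriving DecidableEq, Repr

/-- Only the fields this sketch needs. -/
structure ClassifierEntry where
  form : String
  semantics : List SemanticParameter

/-- Hypothetical name: does entry `c` list parameter `p`? -/
def ClassifierEntry.encodes (c : ClassifierEntry) (p : SemanticParameter) : Bool :=
  c.semantics.contains p

/-- Hypothetical name: distinct parameters attested across an inventory,
    in order of first occurrence. -/
def collectSemantics (inv : List ClassifierEntry) : List SemanticParameter :=
  inv.foldl
    (fun acc c =>
      c.semantics.foldl
        (fun acc p => if acc.contains p then acc else acc ++ [p]) acc)
    []

-- 只 contributes animacy; 张 and 条 both contribute shape, which is counted once.
#eval (collectSemantics [⟨"只", [.animacy]⟩, ⟨"张", [.shape]⟩, ⟨"条", [.shape]⟩]).length  -- 2

end SemanticsSketch
```

Deriving the parameter list this way keeps a system-level summary in sync with the entries it summarizes, which is the motivation the docstring gives for not hand-listing it.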


                                    The semantic strategy a theoretical framework attributes to classifier constructions. Three competing positions are represented:

                                    • forNumeral (CLF-for-NUM): @cite{krifka-1995}, @cite{bale-coon-2014}, @cite{little-moroney-royer-2022}. The classifier is a measure function required by the numeral. The numeral takes the classifier as its first argument: ⟦TWO⟧ = λm⟨e,n⟩λPλx.[P(x) ∧ m(x) = 2]. Predicts: numeral idiosyncrasies in CLF requirement, CLF obligatory even without a noun (counting contexts), CLF + plural marking can co-occur.
                                    • forNoun (CLF-for-N): @cite{chierchia-1998}, @cite{jenks-2011}, @cite{nomoto-2013}, @cite{little-moroney-royer-2022}. The classifier atomizes the noun denotation so the numeral can count. ⟦CLF⟧ = λPλx.[P(x) ∧ ¬∃y[P(y) ∧ y < x]]. Predicts: noun idiosyncrasies in CLF requirement, CLF appears beyond numerals (with quantifiers, demonstratives, relative clauses), CLF + plural marking in complementary distribution.
                                    • sudoBlocking: @cite{sudo-2016}. Classifier semantics live with numerals, not nouns. Numerals are universally type-n singular terms; a phonologically silent ∪-operator type-shifts them to predicates in languages without classifiers, but is blocked (per @cite{chierchia-1998}'s Blocking Principle) in languages whose lexicon contains overt classifiers. Predicts: no numeral or noun idiosyncrasies, CLF appears with numerals not beyond them, CLF appears in counting contexts (via the ∩-operator).

                                    Strategy assignments to specific languages live in study files (Phenomena/Classifiers/Studies/{NMP,LittleMoroneyRoyer2022,Sudo2016}.lean), not in this file or in NounCategorizationSystem. Each paper owns its own per-language commitments; cross-paper agreement and disagreement are first-class theorems in the study files.
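The denotations quoted for forNumeral and forNoun can be transcribed into Lean 4 roughly as follows. This is a sketch under stated assumptions: type n is modelled as `Nat`, the proper-part relation `<` is supplied by an arbitrary `LT` instance, and the definition names `two` and `clf` are hypothetical. Only the three constructor names are taken from this page.

```lean
namespace StrategySketch

/-- The three positions, using the constructor names documented above. -/
inductive ClassifierStrategy where
  | forNumeral | forNoun | sudoBlocking
  deriving DecidableEq, Repr

/-- CLF-for-NUM: ⟦TWO⟧ = λm λP λx. P(x) ∧ m(x) = 2.
    The classifier is the measure function `m`, taken first by the numeral. -/
def two {E : Type} (m : E → Nat) (P : E → Prop) : E → Prop :=
  fun x => P x ∧ m x = 2

/-- CLF-for-N: ⟦CLF⟧ = λP λx. P(x) ∧ ¬∃y. P(y) ∧ y < x.
    The classifier keeps only the `<`-minimal (atomic) members of `P`. -/
def clf {E : Type} [LT E] (P : E → Prop) : E → Prop :=
  fun x => P x ∧ ¬ ∃ y, P y ∧ y < x

end StrategySketch
```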


                                        A noun categorization system in a language.

                                        Captures @cite{aikhenvald-2000}'s 7 definitional properties (A–G):

                                        • (A) morphosyntactic locus → scopes
                                        • (B) scope/domain → classifierType + scopes
                                        • (C) assignment principles → assignment
                                        • (D) surface realization → realizations
                                        • (E) agreement → hasAgreement
                                        • (F) markedness → hasUnmarkedDefault
                                        • (G) grammaticalization → isObligatory

                                        UNVERIFIED: A–G enumeration cited from memory.

                                        • family : String

                                          Language family (e.g., "Indo-European", "Sino-Tibetan", "Bantu").

                                        • classifierType : ClassifierType

                                          Aikhenvald classifier type.

                                        • scopes : List CategorizationScope

                                          Morphosyntactic scopes this system operates in (A, B).

                                        • assignment : AssignmentPrinciple

                                          How nouns are assigned to classes/classifiers (C).

                                        • realizations : List SurfaceRealization

                                          Morphological realization types used (D).

                                        • hasAgreement : Bool

                                          Does the system involve agreement? (E). Definitional for noun classes. Stored as Bool so the struct keeps decidable equality; the user-facing predicate is HasAgreement : Prop.

                                        • inventorySize : Nat

                                          Inventory size (number of classes or classifiers).

                                        • isObligatory : Bool

                                          Is realization obligatory or optional? (G). User-facing predicate: IsObligatory : Prop.

                                        • hasUnmarkedDefault : Bool

                                          Is there a formally/functionally unmarked default? (F). User-facing predicate: HasUnmarkedDefault : Prop.

                                        • preferredSemantics : List SemanticParameter

                                          Preferred semantic parameters (Aikhenvald §11.2).

                                        • hasObligatoryNumber : Bool

                                          Does the language have obligatory grammatical number marking? User-facing predicate: HasObligatoryNumber : Prop.

                                        • pluralClfCooccur : Bool

                                          Can classifiers and plural marking co-occur? Predicted by CLF-for-NUM (@cite{little-moroney-royer-2022}: CLF and PL are in different projections) but not by CLF-for-N (same projection, complementary distribution per @cite{borer-2005}). User-facing predicate: PluralClfCooccur : Prop.

                                        • source : String

                                          Citation backing the hand-coded values.


                                              Prop API for the boolean property fields

                                              The struct's hasAgreement, isObligatory, hasUnmarkedDefault, hasObligatoryNumber, and pluralClfCooccur fields are stored as Bool so that the struct itself has decidable equality. The user-facing predicates are the Prop versions defined here, each with a Decidable instance obtained from the underlying Bool. Theorem statements should prefer the Prop form (s.HasAgreement rather than s.hasAgreement = true); decide works for either, since the Bool projection reduces structurally for concrete fragment values.
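The Bool-field / Prop-predicate pattern described above can be sketched in Lean 4 as follows. `MiniSystem` is a hypothetical two-field stand-in for NounCategorizationSystem and `genderLike` is an invented fragment value. The use of `abbrev` is an assumption, chosen because it yields a reducible definition, which is what lets instance resolution see through the predicate to the Bool equality.

```lean
namespace PropApiSketch

/-- Reduced stand-in for `NounCategorizationSystem` (two of its Bool fields). -/
structure MiniSystem where
  hasAgreement : Bool
  isObligatory : Bool
  deriving DecidableEq, Repr

/-- Prop-valued view of the Bool field. Because `abbrev` is reducible,
    the `Decidable` instance for the Bool equality is found automatically. -/
abbrev MiniSystem.HasAgreement (s : MiniSystem) : Prop :=
  s.hasAgreement = true

/-- Hypothetical fragment value. -/
def genderLike : MiniSystem :=
  { hasAgreement := true, isObligatory := true }

-- Preferred Prop spelling and raw Bool spelling; both are closed by `decide`.
example : genderLike.HasAgreement := by decide
example : genderLike.hasAgreement = true := by decide

end PropApiSketch
```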

                                              @[reducible, inline]

                                              The system involves agreement (E).

                                                @[reducible, inline]

                                                Realization is obligatory (G).

                                                  @[reducible, inline]

                                                  The system has a formally/functionally unmarked default (F).

                                                    @[reducible, inline]

                                                    The language has obligatory grammatical number marking.

                                                      @[reducible, inline]

                                                      Classifiers and plural marking can co-occur.


                                                        @cite{dixon-1982}'s noun-class vs. classifier divide. Noun classes: small, closed, grammaticalized, agreement. Classifiers: large, open, lexical, no agreement.


                                                          All non-noun-class types are "classifier" types in the broad sense.


                                                            Grammatical categories that interact with classifier types (@cite{aikhenvald-2000}).


                                                                Whether a classifier type typically interacts with a grammatical category (@cite{aikhenvald-2000}).


                                                                  Whether a language uses numeral classifiers (@cite{wals-2013} Ch 55).

                                                                  • absent : ClassifierStatus

                                                                    No numeral classifiers (e.g., English, Spanish, Arabic).

                                                                  • optional : ClassifierStatus

                                                                    Classifiers available but not required (e.g., Turkish, Bengali).

                                                                  • obligatory : ClassifierStatus

                                                                    Classifiers required (e.g., Mandarin, Japanese, Thai).


                                                                      WALS Chapter 55 distribution: language counts per classifier status. Total: 400 languages.

                                                                      • absent : Nat
                                                                      • optional : Nat
                                                                      • obligatory : Nat

                                                                          Actual WALS Ch 55 counts (260 absent + 62 optional + 78 obligatory = 400).
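The stated total can be checked mechanically. In the Lean 4 sketch below, the constructor and field names (`absent`, `optional`, `obligatory`) and the three counts come from this page; the record name `ClassifierDistribution` and the value name `walsCh55` are hypothetical, since the rendered page does not show the real ones.

```lean
namespace WalsSketch

/-- Three-way status, using the constructor names documented above. -/
inductive ClassifierStatus where
  | absent | optional | obligatory
  deriving DecidableEq, Repr

/-- Hypothetical record name; one language count per status. -/
structure ClassifierDistribution where
  absent : Nat
  optional : Nat
  obligatory : Nat
  deriving Repr

/-- The counts quoted above. -/
def walsCh55 : ClassifierDistribution :=
  { absent := 260, optional := 62, obligatory := 78 }

-- 260 + 62 + 78 = 400, checked by definitional unfolding.
example : walsCh55.absent + walsCh55.optional + walsCh55.obligatory = 400 := rfl

end WalsSketch
```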
