Morphological Infrastructure
[Byb85] [Cha17] [Lin83] [ZP83]
Framework-agnostic types for morphological analysis and compositional morphological rules.
Typological Classification
- AttachmentSide: prefix, suffix, infix, circumfix
- SelectionDegree: how restrictive a morpheme's host selection is
- MorphStatus: free word / simple clitic / special clitic / affix
- ParadigmCell: one cell in a morphological paradigm (form + features)
[Byb85] Relevance Hierarchy
MorphCategory classifies morpheme functional categories ordered by
semantic relevance to the stem:
stem < derivation < valence < voice < aspect < tense < mood < negation < agreement
Compositional Rules
- MorphRule σ: a morphological rule carrying formal AND semantic effects
- Stem σ: a lexical stem with its inflectional paradigm
A MorphRule σ transforms a stem's surface form, morphosyntactic features,
and meaning of type σ simultaneously. Rules where the word-level semantic
contribution is delegated to a higher composition layer (e.g., tense rules
that delegate to Semantics/Tense/, agreement rules that contribute
no truth-conditional meaning) carry delegatedSemantics := true. The Bool
flag is not a claim that the morpheme is meaningless — Bybee 1985 Ch 1
§3 explicitly argues against the vacuity-of-inflection position. It tracks
where the meaning is computed (delegate to Theory layer vs. compute at
the morphological word level), not whether meaning exists.
Side on which a bound morpheme attaches to its host.
- prefix : AttachmentSide
- suffix : AttachmentSide
- infix : AttachmentSide
- circumfix : AttachmentSide
Instances For
Equations
- Morphology.instDecidableEqAttachmentSide x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Morphology.instReprAttachmentSide = { reprPrec := Morphology.instReprAttachmentSide.repr }
Typological position classification for formatives. [BN07] Table 2.
Superset of AttachmentSide: adds simulfixation (process morphology),
detached formatives (Wackernagel clitics, free auxiliaries), and
endoclisis (clitic insertion inside a word).
- praefixed : FormativePosition
- postfixed : FormativePosition
- infixed : FormativePosition
- circumfixed : FormativePosition
- simultaneous : FormativePosition
- detached : FormativePosition
- endoclitic : FormativePosition
Instances For
Equations
- Morphology.instDecidableEqFormativePosition x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Map AttachmentSide to the richer FormativePosition classification.
Equations
- Morphology.AttachmentSide.prefix.toFormativePosition = Morphology.FormativePosition.praefixed
- Morphology.AttachmentSide.suffix.toFormativePosition = Morphology.FormativePosition.postfixed
- Morphology.AttachmentSide.infix.toFormativePosition = Morphology.FormativePosition.infixed
- Morphology.AttachmentSide.circumfix.toFormativePosition = Morphology.FormativePosition.circumfixed
Instances For
How restrictive a morpheme is about what it can attach to.
[ZP83] criterion A: clitics exhibit low selection (attach to virtually any word), while affixes exhibit high selection (attach only to specific stems or categories).
- low : SelectionDegree
Attaches to words of virtually any category (prepositions, verbs, adjectives, adverbs). Characteristic of simple clitics.
- singleCategory : SelectionDegree
Attaches to words of a single major category (e.g., past tense -ed to verbs, plural -s to nouns). Characteristic of inflectional affixes.
- closedClass : SelectionDegree
Attaches only to a closed list of stems (e.g., -n't only to finite auxiliaries). Maximally selective.
Instances For
Equations
- Morphology.instDecidableEqSelectionDegree x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Morphology.instReprSelectionDegree = { reprPrec := Morphology.instReprSelectionDegree.repr }
Affixes are more selective than clitics.
Equations
Instances For
Equations
- Morphology.instDecidablePredSelectionDegreeIsHighSelection s = id inferInstance
Morphological status of a linguistic form.
Classifies forms by their degree of syntactic independence and mode of combination. The clitic–affix boundary is the central question of [ZP83]: the criteria A–F serve to locate a given morpheme on this scale.
- freeWord : MorphStatus
Syntactically independent word.
- simpleClitic : MorphStatus
Simple clitic: phonologically bound form that can attach to hosts of virtually any syntactic category. [BN07]: defined primarily by low selectivity (categorical freedom) + phonological dependence, not necessarily by being a reduced variant of a free word. Many simple clitics have no free-word counterpart (Latin -que). English contracted auxiliaries ('s, 've, 'd) are a subcase where a free variant exists.
- specialClitic : MorphStatus
Special clitic: either no corresponding free word exists, or the distribution differs from the free word. Romance pronominal clitics, Latin -que.
- inflAffix : MorphStatus
Inflectional affix: paradigmatic, category-preserving, highly selective, with possible gaps and idiosyncrasies. English -ed, -s, -est, -n't.
- derivAffix : MorphStatus
Derivational affix: potentially category-changing, often productive but may show lexical restrictions. English -ness, un-, -ize.
Instances For
Equations
- Morphology.instDecidableEqMorphStatus x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Morphology.instReprMorphStatus = { reprPrec := Morphology.instReprMorphStatus.repr }
Is this an affix (inflectional or derivational)?
Equations
- s.IsAffix = (s = Morphology.MorphStatus.inflAffix ∨ s = Morphology.MorphStatus.derivAffix)
Instances For
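A minimal, self-contained Lean sketch of the IsAffix predicate above. The inductive type is re-declared locally so the snippet stands alone; the Decidable instance and the two examples are illustrative assumptions, not the library's own declarations.

```lean
-- Local re-declaration of the five-way status scale described above.
inductive MorphStatus where
  | freeWord | simpleClitic | specialClitic | inflAffix | derivAffix
  deriving DecidableEq, Repr

-- Same shape as the rendered equation: a disjunction of two equalities.
def MorphStatus.IsAffix (s : MorphStatus) : Prop :=
  s = .inflAffix ∨ s = .derivAffix

-- Assumed instance: decidability follows from DecidableEq on the enum.
instance (s : MorphStatus) : Decidable s.IsAffix :=
  inferInstanceAs (Decidable (s = .inflAffix ∨ s = .derivAffix))

-- With the instance in place, concrete cases can be checked by `decide`.
example : MorphStatus.inflAffix.IsAffix := by decide
example : ¬ MorphStatus.simpleClitic.IsAffix := by decide
```

Stating the predicate as a Prop with a Decidable instance keeps it usable both in theorem statements and in computed checks.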
Is this a clitic (simple or special)?
Equations
Instances For
A single cell in a morphological paradigm: one form of a lexeme in a particular morphosyntactic context.
The type parameter F is the feature bundle type (e.g., UD.MorphFeatures
for a full UD specification, or a simpler domain-specific type).
- features : F
The morphosyntactic features selecting this cell.
- form : Option String
The surface form, or none for a paradigm gap.
- regular : Bool
Is this form predictable from the stem by regular rule?
Instances For
Equations
- Morphology.instReprParadigmCell = { reprPrec := Morphology.instReprParadigmCell.repr }
Equations
- Morphology.instBEqParadigmCell.beq { features := a, form := a_1, regular := a_2 } { features := b, form := b_1, regular := b_2 } = (a == b && (a_1 == b_1 && a_2 == b_2))
- Morphology.instBEqParadigmCell.beq x✝¹ x✝ = false
Instances For
Equations
Does this cell represent a paradigm gap?
Instances For
Does this cell show irregularity (suppletion or unpredictable allomorphy)?
Equations
- c.isIrregular = (c.regular = false ∧ c.form ≠ none)
Instances For
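The cell structure and the irregularity test can be sketched as follows. This is a standalone approximation: the feature type is a plain String rather than UD.MorphFeatures, and the name isGap, the Decidable instance, and the two example cells are assumptions made for illustration (the gap predicate's equation is not rendered above).

```lean
structure ParadigmCell (F : Type) where
  features : F
  form : Option String   -- none marks a paradigm gap
  regular : Bool

-- Hypothetical name: a gap is a cell with no surface form.
def ParadigmCell.isGap {F : Type} (c : ParadigmCell F) : Bool :=
  c.form.isNone

-- Mirrors the rendered equation: not regular, and not a gap.
def ParadigmCell.isIrregular {F : Type} (c : ParadigmCell F) : Prop :=
  c.regular = false ∧ c.form ≠ none

-- Assumed instance so the Prop can be evaluated.
instance {F : Type} (c : ParadigmCell F) : Decidable c.isIrregular :=
  inferInstanceAs (Decidable (c.regular = false ∧ c.form ≠ none))

-- English "go" has the suppletive past "went".
def goPast : ParadigmCell String :=
  { features := "Tense=Past", form := some "went", regular := false }

-- English "beware" is defective and has no past-tense form.
def bewarePast : ParadigmCell String :=
  { features := "Tense=Past", form := none, regular := false }

#eval decide goPast.isIrregular      -- true
#eval bewarePast.isGap               -- true
#eval decide bewarePast.isIrregular  -- false
```

The last line shows why the second conjunct matters: a gap is never counted as an irregular form, even when regular is false.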
Morpheme functional category.
Categories are ordered by semantic relevance to the verb stem: more relevant categories appear closer to the stem in suffixal morphology.
- stem : MorphCategory
- derivation : MorphCategory
- valence : MorphCategory
- voice : MorphCategory
- aspect : MorphCategory
- tense : MorphCategory
- mood : MorphCategory
- negation : MorphCategory
- agreement (controller : Agreement.Controller) : MorphCategory
Agreement morphology, parameterized by the grammatical role of the controlling NP (Agreement.Controller). The role distinction (subj vs obj vs poss vs ...) is what allows Anderson Ch 5 §5.2 split/doubled AVC typology to be Lean-checkable; Bybee 1985's personAgr / personAgrObj / genderAgr source distinctions also round-trip cleanly. See scratch/morphcategory_agreement_split_plan.md for the design rationale (0.230.578-0.230.584).
- nonfinite : MorphCategory
- number : MorphCategory
- degree : MorphCategory
Instances For
Equations
- Morphology.instReprMorphCategory = { reprPrec := Morphology.instReprMorphCategory.repr }
Predicate testing whether a MorphCategory is an agreement category,
independent of which Controller role parameterizes it. Used for
Bybee-style relevance-hierarchy code that doesn't care which role
triggers the agreement, only that agreement IS the category.
Equations
- (Morphology.MorphCategory.agreement controller).IsAgreement = true
- x✝.IsAgreement = false
Instances For
Peripherality: numerical embedding of Bybee's relevance hierarchy where higher = farther from stem = less semantically relevant.
In Bybee's text, "high relevance" means more semantically
integrated with the stem ([bybee-1985] Ch 2 §2.1 p. 13). The
substrate uses the opposite numerical direction: stem = 0 (most
relevant), agreement = 8 (least relevant), so that Nat ordering
mirrors stem-outward linear position in suffixing morphology
(Ch 2 §6 iconicity, p. 33). The field name peripherality makes
this directionality explicit and avoids the wrong-on-its-face
gloss "high relevance rank means low relevance."
Categories from Bybee 1985 Ch 2 §3 (verified against the book): valence, voice, aspect, tense, mood, agreement.
Linglib extensions (NOT in Bybee 1985; flag in any consumer that reads these ranks):
- derivation (rank 1): Bybee Ch 4 argues lex/deriv/infl is a continuum, not a discrete level on the relevance scale.
- number (rank 3): Bybee discusses verbal-number agreement at the low end (with person agreement). Noun number is treated separately (Ch 2 §6 cites Greenberg 1963 only, "stem < number < case" for nouns). Cross-comparison of noun-number rank with verb-aspect rank is an artifact of unifying both onto one scale.
- degree (rank 5): Bybee never discusses adjectival degree morphology. Comparative morphology is often derivational cross-linguistically (Stassen WALS).
- negation (rank 7): Bybee discusses negation as a kind of mood (Part II Ch 8 §5), not a separate level. Rank 7 is plausible per Miestamo 2005 cross-linguistic ordering data, but is a linglib extension.
- nonfinite (rank 9): not on Bybee's hierarchy at all (nonfinite morphology often changes syntactic category, outside the scope of inflectional categories proper).
Equations
- Morphology.MorphCategory.stem.peripherality = 0
- Morphology.MorphCategory.derivation.peripherality = 1
- Morphology.MorphCategory.valence.peripherality = 2
- Morphology.MorphCategory.number.peripherality = 3
- Morphology.MorphCategory.voice.peripherality = 3
- Morphology.MorphCategory.aspect.peripherality = 4
- Morphology.MorphCategory.degree.peripherality = 5
- Morphology.MorphCategory.tense.peripherality = 5
- Morphology.MorphCategory.mood.peripherality = 6
- Morphology.MorphCategory.negation.peripherality = 7
- (Morphology.MorphCategory.agreement controller).peripherality = 8
- Morphology.MorphCategory.nonfinite.peripherality = 9
Instances For
The relevance order
peripherality is a rank function — a numeric embedding. The object the
hierarchy is really about is the order it induces: which categories are
more stem-relevant than which. All relevance-hierarchy code — this file's
RespectsRelevanceHierarchy and the consumers in Studies/ — speaks in that
order via RelevanceLE / RelevanceLT; the specific ℕ values of
peripherality are an implementation detail (only their comparisons carry
meaning, as relevanceLE_iff_peripherality records).
a is at least as stem-relevant as b: the rank order induced by
peripherality. This is the unified relation relevance-hierarchy code uses.
Equations
- a.RelevanceLE b = (a.peripherality ≤ b.peripherality)
Instances For
a is strictly more stem-relevant than b.
Equations
- a.RelevanceLT b = (a.peripherality < b.peripherality)
Instances For
The relevance order is reflexive.
The relevance order is transitive.
The relevance order is total: any two categories are comparable.
Strict relevance order is the strict part of the order, as expected.
peripherality reflects the relevance order exactly: it is the canonical
rank realizing the order, so the order carries precisely the information the
rank does.
A morpheme ordering respects the relevance hierarchy when its categories are sorted stem-outward by the relevance order.
Equations
- Morphology.RespectsRelevanceHierarchy slots = List.Pairwise Morphology.MorphCategory.RelevanceLE slots
Instances For
Equations
- Morphology.instDecidablePredListMorphCategoryRespectsRelevanceHierarchy x✝ = id inferInstance
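The rank function, the induced order, and the hierarchy check can be put together in one standalone sketch. Several pieces are assumptions: Controller is a two-value stand-in for Agreement.Controller, the theorem names other than relevanceLE_iff_peripherality are invented (as is the exact statement of that one), and respectsRelevance is a hypothetical Bool-valued checker written by direct recursion rather than via List.Pairwise.

```lean
-- Hypothetical stand-in for Agreement.Controller.
inductive Controller where
  | subj | obj
  deriving DecidableEq, Repr

inductive MorphCategory where
  | stem | derivation | valence | voice | aspect | tense | mood | negation
  | agreement (controller : Controller)
  | nonfinite | number | degree
  deriving DecidableEq, Repr

namespace MorphCategory

-- Ranks copied from the equations above; note the ties at 3 and 5.
def peripherality : MorphCategory → Nat
  | .stem => 0
  | .derivation => 1
  | .valence => 2
  | .number => 3
  | .voice => 3
  | .aspect => 4
  | .degree => 5
  | .tense => 5
  | .mood => 6
  | .negation => 7
  | .agreement _ => 8
  | .nonfinite => 9

def RelevanceLE (a b : MorphCategory) : Prop :=
  a.peripherality ≤ b.peripherality

instance (a b : MorphCategory) : Decidable (a.RelevanceLE b) :=
  inferInstanceAs (Decidable (a.peripherality ≤ b.peripherality))

-- The order properties reduce to the corresponding facts about Nat.
theorem relevanceLE_refl (a : MorphCategory) : a.RelevanceLE a :=
  Nat.le_refl a.peripherality

theorem relevanceLE_trans {a b c : MorphCategory}
    (h₁ : a.RelevanceLE b) (h₂ : b.RelevanceLE c) : a.RelevanceLE c :=
  Nat.le_trans h₁ h₂

theorem relevanceLE_total (a b : MorphCategory) :
    a.RelevanceLE b ∨ b.RelevanceLE a :=
  Nat.le_total a.peripherality b.peripherality

theorem relevanceLE_iff_peripherality (a b : MorphCategory) :
    a.RelevanceLE b ↔ a.peripherality ≤ b.peripherality :=
  Iff.rfl

end MorphCategory

-- Hypothetical checker: every slot is at least as stem-relevant as
-- every slot that follows it (the pairwise condition, as a Bool).
def respectsRelevance : List MorphCategory → Bool
  | [] => true
  | c :: rest =>
    rest.all (fun d => decide (c.RelevanceLE d)) && respectsRelevance rest

#eval respectsRelevance [.stem, .aspect, .tense, .agreement .subj]  -- true
#eval respectsRelevance [.stem, .tense, .aspect]                    -- false
#eval respectsRelevance [.number, .voice]                           -- true
```

The last check passes because number and voice share rank 3: RelevanceLE is a total preorder rather than a linear order, so tied categories may appear in either sequence.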
A morphological rule: carries formal AND semantic effects.
The type parameter σ is the meaning type, so this works uniformly
across Bool/Frac/Float semantic backends.
Design principle: semEffect can be id for rules whose word-level
semantic contribution is delegated to a higher composition layer
(verb agreement -s carries no truth-conditional meaning at the word
level; tense rules delegate to the intensional layer), making it
explicit which inflections compute meaning at the word level and
which delegate.
- category : MorphCategory
Which morphological category this rule realizes
- value : String
The feature value this rule realizes
- formRule : String → String
How the surface form changes
- featureRule : UD.MorphFeatures → UD.MorphFeatures
How morphosyntactic features change
- valenceRule : Option ComplementType → Option ComplementType
How the lexical frame changes: id except for valency-changing morphology (reciprocal, passive, causative affixes; [DA00]), where the frame change is the rule's whole point. The frame is lexical, not UD morphology, so it is its own effect channel rather than a featureRule component.
- semEffect : σ → σ
Semantic effect (id when meaning is delegated to a higher layer)
- delegatedSemantics : Bool
Is the word-level semantic contribution delegated to a higher composition layer? (Set true for agreement, tense, etc., where Semantics/{Tense,Aspect,Modality,Agreement}/ handle the meaning. NOT a claim that the morpheme is meaningless; see file docstring.)
Instances For
A lexical stem: a root meaning plus its morphological paradigm.
- lemma_ : String
Base form (lemma)
- cat : UD.UPOS
Syntactic category
- baseFeatures : UD.MorphFeatures
Base morphosyntactic features
- baseFrame : Option ComplementType
Base lexical frame (complement selection) — lexical, beside the morphology.
- paradigm : List (MorphRule σ)
Available inflectional rules
Instances For
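Apply a morphological rule to generate an inflected form together with its features and meaning.
This function threads only the form, the features, and the meaning; it does not apply valenceRule.
The consumer that builds Words applies a rule's valenceRule to Stem.baseFrame.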
Equations
- s.inflect rule baseMeaning = (rule.formRule s.lemma_, rule.featureRule s.baseFeatures, rule.semEffect baseMeaning)
Instances For
Generate all forms in the paradigm (base + inflected).
Equations
- s.allForms baseMeaning = (s.lemma_, s.baseFeatures, baseMeaning) :: List.map (fun (x : Morphology.MorphRule σ) => s.inflect x baseMeaning) s.paradigm
Instances For
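A reduced sketch of how inflect and allForms fit together. To stay self-contained it drops the category, valenceRule, cat, and baseFrame fields, and uses Features (a list of strings) as a stand-in for UD.MorphFeatures; the rule pastEd and the stem walk are made-up examples, with Bool as the meaning type.

```lean
-- Assumed stand-in for UD.MorphFeatures.
abbrev Features := List String

structure MorphRule (σ : Type) where
  value : String
  formRule : String → String
  featureRule : Features → Features
  semEffect : σ → σ
  delegatedSemantics : Bool

structure Stem (σ : Type) where
  lemma_ : String
  baseFeatures : Features
  paradigm : List (MorphRule σ)

-- One rule application: form, features, and meaning are updated together.
def Stem.inflect {σ : Type} (s : Stem σ) (rule : MorphRule σ) (baseMeaning : σ) :
    String × Features × σ :=
  (rule.formRule s.lemma_, rule.featureRule s.baseFeatures, rule.semEffect baseMeaning)

-- The base form followed by one inflected form per paradigm rule.
def Stem.allForms {σ : Type} (s : Stem σ) (baseMeaning : σ) :
    List (String × Features × σ) :=
  (s.lemma_, s.baseFeatures, baseMeaning) ::
    s.paradigm.map (fun r => s.inflect r baseMeaning)

-- Illustrative past-tense rule: semEffect is id because the meaning is
-- delegated to a higher layer, which the flag records.
def pastEd : MorphRule Bool :=
  { value := "Past",
    formRule := fun base => base ++ "ed",
    featureRule := fun fs => "Tense=Past" :: fs,
    semEffect := id,
    delegatedSemantics := true }

def walk : Stem Bool :=
  { lemma_ := "walk", baseFeatures := ["VerbForm=Fin"], paradigm := [pastEd] }

#eval (walk.allForms true).map (fun x => x.1)  -- ["walk", "walked"]
```

Because semEffect is id here, the meaning component of the inflected triple equals the base meaning; only the form and the feature bundle change.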
Distribution of inflectional categories between two elements of a periphrastic construction (e.g., auxiliary and lexical verb in an AVC). [And06a] [Byb85]
In an aux-headed AVC, onLex is minimal (stem only or empty).
In a lex-headed AVC, onAux is empty.
In a split AVC, onAux and onLex host different category types.
In a doubled AVC, onAux and onLex overlap.
- onAux : List MorphCategory
- onLex : List MorphCategory
Instances For
Equations
- Morphology.instReprInflDistribution = { reprPrec := Morphology.instReprInflDistribution.repr }
Equations
- Morphology.instDecidableEqInflDistribution.decEq { onAux := a, onLex := a_1 } { onAux := b, onLex := b_1 } = if h : a = b then h ▸ if h : a_1 = b_1 then h ▸ isTrue ⋯ else isFalse ⋯ else isFalse ⋯
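Two of the configurations described above can be tested mechanically. The sketch below is only an illustration: Cat is a trimmed stand-in for MorphCategory, and the predicates isLexHeaded and isDoubled are hypothetical names that do not appear in the documented API.

```lean
-- Trimmed, hypothetical stand-in for MorphCategory.
inductive Cat where
  | stem | aspect | tense | mood | negation | agreement
  deriving DecidableEq, Repr

structure InflDistribution where
  onAux : List Cat
  onLex : List Cat
  deriving DecidableEq, Repr

-- Lex-headed: the auxiliary hosts no inflectional categories.
def InflDistribution.isLexHeaded (d : InflDistribution) : Bool :=
  d.onAux.isEmpty

-- Doubled: at least one category is marked on both elements.
def InflDistribution.isDoubled (d : InflDistribution) : Bool :=
  d.onAux.any (fun c => d.onLex.contains c)

def doubled : InflDistribution :=
  { onAux := [.tense, .agreement], onLex := [.stem, .agreement] }

def lexHeaded : InflDistribution :=
  { onAux := [], onLex := [.stem, .tense, .agreement] }

#eval doubled.isDoubled      -- true
#eval lexHeaded.isLexHeaded  -- true
#eval lexHeaded.isDoubled    -- false
```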