Extraction Morphology

@cite{elkins-torrence-brown-2026} @cite{erlewine-2018} @cite{erlewine-2016}

Theory-neutral types for cross-linguistic extraction morphology: how languages morphologically mark that a constituent has undergone A-bar-movement (wh-movement, relativization, focus fronting, etc.).

Languages vary dramatically in whether and how they track extraction:

English: no overt marking (gap strategy)
Austronesian (Tagalog, Malagasy): voice alternation marks which argument has been extracted
Mayan (Mam, K'iche'): dedicated morphemes on the verbal complex (Mam =(y)a', K'iche' wi)
Celtic (Irish): complementizer changes form (aL vs. aN)
Chamorro: agreement morphology tracks the extracted position

inductive Typology.ExtractionMarkingStrategy :

Type

How a language morphologically marks extraction (A-bar-movement).

This is a descriptive typology of the surface strategy; different syntactic theories will derive these differently.

unmarked : ExtractionMarkingStrategy
No overt morphology marks extraction. The extracted position is a silent gap. E.g., English "What did you buy __?". (Renamed from none to avoid shadowing Option.none.)
voiceAlternation : ExtractionMarkingStrategy
Voice alternation: verbal voice morphology changes to mark which argument has been extracted. E.g., Tagalog Actor/Patient/Locative voice; Toba Batak Actor/Object voice.
dedicatedMorpheme : ExtractionMarkingStrategy
A dedicated morpheme appears on the verbal complex when extraction occurs. E.g., Mam =(y)a' on Voice0/Dir0, K'iche' wi, Kaqchikel AF -Vn.
agreementTracking : ExtractionMarkingStrategy
Agreement morphology on the verb tracks the extracted position. E.g., Chamorro wh-agreement.
complementizerChange : ExtractionMarkingStrategy
The complementizer changes form depending on whether extraction has occurred through its clause. E.g., Irish aL (direct) vs. aN (indirect).
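
As a rough illustration of how the five constructors above could be declared and used, here is a minimal Lean sketch. It assumes the type is declared with `deriving DecidableEq, Repr` (consistent with the instance names listed on this page, but not confirmed by it). The `Sketch` namespace and the `isOvert` helper are hypothetical and not part of the documented API.

```lean
namespace Sketch

/-- Surface strategies for marking extraction, mirroring the five documented constructors. -/
inductive ExtractionMarkingStrategy where
  | unmarked
  | voiceAlternation
  | dedicatedMorpheme
  | agreementTracking
  | complementizerChange
  deriving DecidableEq, Repr

/-- Hypothetical helper: does the strategy involve any overt morphology? -/
def isOvert : ExtractionMarkingStrategy → Bool
  | .unmarked => false
  | _ => true

-- Decidable equality lets `decide` settle comparisons between constructors.
example : ExtractionMarkingStrategy.unmarked ≠ ExtractionMarkingStrategy.voiceAlternation := by
  decide

-- Only the gap strategy counts as non-overt under this helper.
example : isOvert ExtractionMarkingStrategy.complementizerChange = true := rfl
example : isOvert ExtractionMarkingStrategy.unmarked = false := rfl

end Sketch
```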

@[implicit_reducible]

instance Typology.instDecidableEqExtractionMarkingStrategy :

DecidableEq ExtractionMarkingStrategy

Equations

Typology.instDecidableEqExtractionMarkingStrategy x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def Typology.instReprExtractionMarkingStrategy.repr :

ExtractionMarkingStrategy → Nat → Std.Format

Equations

One or more equations did not get rendered due to their size.

@[implicit_reducible]

instance Typology.instReprExtractionMarkingStrategy :

Repr ExtractionMarkingStrategy

Equations

Typology.instReprExtractionMarkingStrategy = { reprPrec := Typology.instReprExtractionMarkingStrategy.repr }

The five cases above are descriptive surface-typology categories. Competing accounts of why a language shows a given surface pattern live in Phenomena/FillerGap/Studies/ files anchored on the specific paper; they are not enum cases here. Examples include Erlewine @cite{erlewine-2016} @cite{erlewine-2018} on Kaqchikel/Mayan AF as a Spec-to-Spec Anti-Locality repair, Erlewine 2018 on Toba Batak extraction as a structural pivot restriction, and rival analyses by Aldridge, Coon, Coon & Mateo Pedro & Preminger, Coon & Keine, Henderson, and others.

inductive Typology.ExtractionTarget :

Type

The grammatical position from which extraction occurs.

This intersects with the @cite{keenan-comrie-1977} Accessibility Hierarchy (see Core/Relativization/Hierarchy.lean), but is defined independently because extraction morphology may make finer distinctions than relativization.

subject : ExtractionTarget
Subject (ergative/nominative) extraction
directObject : ExtractionTarget
Direct object (accusative/absolutive) extraction
indirectObject : ExtractionTarget
Indirect object (dative/applicative) extraction
oblique : ExtractionTarget
Oblique (instrumental, locative, etc.) extraction
possessor : ExtractionTarget
Possessor extraction

@[implicit_reducible]

instance Typology.instDecidableEqExtractionTarget :

DecidableEq ExtractionTarget

Equations

Typology.instDecidableEqExtractionTarget x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

@[implicit_reducible]

instance Typology.instReprExtractionTarget :

Repr ExtractionTarget

Equations

Typology.instReprExtractionTarget = { reprPrec := Typology.instReprExtractionTarget.repr }

def Typology.instReprExtractionTarget.repr :

ExtractionTarget → Nat → Std.Format

Equations

One or more equations did not get rendered due to their size.

inductive Typology.ArgumentRole :

Type

The thematic category of an argument being extracted: agent (external argument), patient (internal argument), or oblique.

Coarser than ThetaRole (which distinguishes agent/experiencer/causer, patient/theme, goal/source/instrument). Used when the relevant distinction is which macro-role is extracted, not fine-grained thematic relations or structural positions.

Complements ExtractionTarget (structural position): ArgumentRole identifies what is extracted; ExtractionTarget identifies the position it was extracted from. The two coincide in simple active clauses (agent = subject, patient = object) but diverge under voice alternation (in Object Voice, the patient becomes the subject).

agent : ArgumentRole
patient : ArgumentRole
oblique : ArgumentRole

@[implicit_reducible]

instance Typology.instDecidableEqArgumentRole :

DecidableEq ArgumentRole

Equations

Typology.instDecidableEqArgumentRole x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def Typology.instReprArgumentRole.repr :

ArgumentRole → Nat → Std.Format

Equations

One or more equations did not get rendered due to their size.

@[implicit_reducible]

instance Typology.instReprArgumentRole :

Repr ArgumentRole

Equations

Typology.instReprArgumentRole = { reprPrec := Typology.instReprArgumentRole.repr }

def Typology.ArgumentRole.defaultPosition :

ArgumentRole → ExtractionTarget

Default structural position for a given argument role (active voice).
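
The equations for this function are not shown on this page. The sketch below is one plausible reading, not the actual definition. The agent and patient cases follow the ArgumentRole docstring (agent = subject, patient = object in simple active clauses); the oblique case is an assumption. The `Sketch` namespace is hypothetical, and the types are redeclared so the block stands alone.

```lean
namespace Sketch

inductive ExtractionTarget where
  | subject | directObject | indirectObject | oblique | possessor
  deriving DecidableEq, Repr

inductive ArgumentRole where
  | agent | patient | oblique
  deriving DecidableEq, Repr

/-- Assumed equations: agent ↦ subject and patient ↦ direct object follow the
    docstring; oblique ↦ oblique is a guess. -/
def ArgumentRole.defaultPosition : ArgumentRole → ExtractionTarget
  | .agent => .subject
  | .patient => .directObject
  | .oblique => .oblique

-- In the active voice, a patient is extracted from direct-object position.
example : ArgumentRole.patient.defaultPosition = ExtractionTarget.directObject := rfl
example : ArgumentRole.agent.defaultPosition = ExtractionTarget.subject := rfl

end Sketch
```

Under voice alternation the mapping would differ (for example, a patient surfacing as subject), which is why the role and the position are kept as separate types.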

inductive Typology.Extractee :

Type

What is being extracted: a DP argument (which has a thematic role and needs Case licensing) or a non-DP adjunct (which has no thematic role and is Case-exempt).

This distinction drives the DP/non-DP extraction asymmetry: in predicate-fronting languages like Toba Batak, only DP extraction is restricted to the pivot; adjuncts extract freely.

dpArg : ArgumentRole → Extractee
adjunct : Extractee
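
To make the DP/non-DP split concrete, the following sketch pattern-matches on the two constructors. The helpers `role?` and `pivotRestricted` are hypothetical names introduced for illustration only. `pivotRestricted` encodes the Toba Batak-style generalization from the docstring above (DP extraction restricted to the pivot, adjuncts free) and is not a claim about any specific analysis.

```lean
namespace Sketch

inductive ArgumentRole where
  | agent | patient | oblique
  deriving DecidableEq, Repr

inductive Extractee where
  | dpArg : ArgumentRole → Extractee
  | adjunct : Extractee
  deriving DecidableEq, Repr

/-- Hypothetical helper: the thematic role of the extractee, if it has one. -/
def Extractee.role? : Extractee → Option ArgumentRole
  | .dpArg r => some r
  | .adjunct => none

/-- Hypothetical helper: is extraction subject to a pivot restriction in a
    Toba Batak-style system? DP arguments are; adjuncts are not. -/
def Extractee.pivotRestricted : Extractee → Bool
  | .dpArg _ => true
  | .adjunct => false

example : (Extractee.dpArg ArgumentRole.patient).role? = some ArgumentRole.patient := rfl
example : Extractee.adjunct.role? = none := rfl
example : Extractee.adjunct.pivotRestricted = false := rfl

end Sketch
```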

def Typology.instDecidableEqExtractee.decEq (x✝ x✝¹ : Extractee) :

Decidable (x✝ = x✝¹)

Equations

Typology.instDecidableEqExtractee.decEq (Typology.Extractee.dpArg a) (Typology.Extractee.dpArg b) = if h : a = b then h ▸ isTrue ⋯ else isFalse ⋯
Typology.instDecidableEqExtractee.decEq (Typology.Extractee.dpArg a) Typology.Extractee.adjunct = isFalse ⋯
Typology.instDecidableEqExtractee.decEq Typology.Extractee.adjunct (Typology.Extractee.dpArg a) = isFalse ⋯
Typology.instDecidableEqExtractee.decEq Typology.Extractee.adjunct Typology.Extractee.adjunct = isTrue ⋯

@[implicit_reducible]

instance Typology.instDecidableEqExtractee :

DecidableEq Extractee

Equations

Typology.instDecidableEqExtractee = Typology.instDecidableEqExtractee.decEq

@[implicit_reducible]

instance Typology.instReprExtractee :

Repr Extractee

Equations

Typology.instReprExtractee = { reprPrec := Typology.instReprExtractee.repr }

def Typology.instReprExtractee.repr :

Extractee → Nat → Std.Format

Equations

One or more equations did not get rendered due to their size.
Typology.instReprExtractee.repr Typology.Extractee.adjunct prec✝ = Repr.addAppParen (Std.Format.nest (if prec✝ ≥ 1024 then 1 else 2) (Std.Format.text "Typology.Extractee.adjunct")).group prec✝

structure Typology.ExtractionProfile :

Type

A language's extraction morphology profile: what strategy it uses, which positions are marked, and whether the marking distinguishes between different extracted positions.

Follows the RelativizationProfile pattern from Typology/Relativization/Defs.lean.

language : String
Language name
strategy : ExtractionMarkingStrategy
Primary extraction-marking strategy
markedPositions : List ExtractionTarget
Which extraction targets trigger overt marking. Empty for languages with the unmarked strategy.
distinguishesPosition : Bool
Does the marking distinguish which position was extracted? E.g., Tagalog voice distinguishes subject/object/oblique; Mam =(y)a' marks only oblique.
notes : String
Notes
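
As a usage sketch, two profiles can be built from the field descriptions above. The field values are illustrative readings of the docstrings on this page (English as a gap language; Tagalog voice distinguishing subject/object/oblique), not entries taken from the actual library. The `Sketch` namespace and the names `english` and `tagalog` are hypothetical.

```lean
namespace Sketch

inductive ExtractionMarkingStrategy where
  | unmarked | voiceAlternation | dedicatedMorpheme
  | agreementTracking | complementizerChange
  deriving DecidableEq, Repr

inductive ExtractionTarget where
  | subject | directObject | indirectObject | oblique | possessor
  deriving DecidableEq, Repr

structure ExtractionProfile where
  language : String
  strategy : ExtractionMarkingStrategy
  markedPositions : List ExtractionTarget
  distinguishesPosition : Bool
  notes : String
  deriving Repr

/-- Illustrative profile: no overt marking, so no marked positions. -/
def english : ExtractionProfile :=
  { language := "English",
    strategy := ExtractionMarkingStrategy.unmarked,
    markedPositions := [],
    distinguishesPosition := false,
    notes := "Gap strategy" }

/-- Illustrative profile: voice morphology tracks the extracted argument. -/
def tagalog : ExtractionProfile :=
  { language := "Tagalog",
    strategy := ExtractionMarkingStrategy.voiceAlternation,
    markedPositions :=
      [ExtractionTarget.subject, ExtractionTarget.directObject, ExtractionTarget.oblique],
    distinguishesPosition := true,
    notes := "Voice distinguishes subject/object/oblique" }

example : english.markedPositions = [] := rfl
example : tagalog.distinguishesPosition = true := rfl

#eval tagalog.strategy

end Sketch
```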