VP Raising in a VOS Language @cite{cole-hermon-2008} #
@cite{cole-hermon-2008} argue that VOS word order in Toba Batak derives from VP-raising to Spec,TP (or more precisely, VoiceP-raising to Spec,FP in their full analysis), rather than from rightward subject shift or base-generation. Three lines of evidence converge:
- Word order: VOS and the positions of IOs and adverbials follow from VP/VoiceP raising + remnant movement. The subject is stranded below the fronted predicate.
- Extraction restrictions: Direct objects and passive agents cannot be Ā-extracted (frozen inside the raised VoiceP), while subjects, indirect objects, and adverbials can (they escape VoiceP before it raises). This freezing effect is the paper's central novel prediction.
- Binding asymmetries: In actives, the subject c-commands the direct object (can bind a reflexive DO) but the DO cannot c-command the subject (cannot bind a reflexive subject). In passives, reconstruction allows the passive agent to bind a reflexive passive subject, a pattern unexplained by a purely thematic hierarchy.
Simplification #
The derivation here follows the simplified tree (2) from p. 146 of the paper, where VP raises to Spec,TP. The paper's full analysis (§4, trees 50–65) uses VoiceP raising to Spec,FP with remnant movement (IO/Adv escape before VoiceP fronts), a Voice head for mang-/di- morphology, and a richer functional sequence. The simplified derivation suffices for the word-order and c-command predictions; the extraction and binding predictions are formalized separately using the paper's empirical generalizations.
EPP Parameter (formerly Core/EPP.lean) #
The Extended Projection Principle (EPP) requires Spec,TP to be filled. Cross-linguistically, languages differ in how this requirement is satisfied, yielding different basic word orders from the same underlying vP-internal structure. This is the parameter space @cite{cole-hermon-2008} exploit.
What satisfies the EPP (requirement to fill Spec,TP).
- subjectRaising : EPPStrategy
Subject DP raises to Spec,TP (English, French, etc.).
- vpRaising : EPPStrategy
VP/predicate phrase raises to Spec,TP (Toba Batak VOS, Malagasy VOS).
- expletive : EPPStrategy
Expletive inserted in Spec,TP (English there, French il).
- none : EPPStrategy
No EPP — verb-initial order persists (one analysis of Irish/Arabic VSO).
Instances For
Word-order parameter: EPP strategy and predicted basic order.
- language : String
- eppStrategy : EPPStrategy
- predictedOrder : String
Instances For
Equations
- Minimalist.english_wo = { language := "English", eppStrategy := Minimalist.EPPStrategy.subjectRaising, predictedOrder := "SVO" }
Instances For
Equations
- Minimalist.tobaBatak_wo = { language := "Toba Batak", eppStrategy := Minimalist.EPPStrategy.vpRaising, predictedOrder := "VOS" }
Instances For
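The parameter space above can be sketched as a small stand-alone Lean 4 file. This is a minimal illustration, not the library's source: the structure name `WordOrderParam` and the namespace `EPPSketch` are assumptions, since the rendered page does not show the original structure name.

```lean
namespace EPPSketch

/-- Stand-alone copy of the four EPP strategies listed above. -/
inductive EPPStrategy where
  | subjectRaising
  | vpRaising
  | expletive
  | none
  deriving Repr, DecidableEq

/-- Hypothetical name for the word-order record: language, strategy, predicted order. -/
structure WordOrderParam where
  language : String
  eppStrategy : EPPStrategy
  predictedOrder : String
  deriving Repr

def english_wo : WordOrderParam :=
  { language := "English", eppStrategy := .subjectRaising, predictedOrder := "SVO" }

def tobaBatak_wo : WordOrderParam :=
  { language := "Toba Batak", eppStrategy := .vpRaising, predictedOrder := "VOS" }

-- The two languages satisfy the EPP in different ways.
example : english_wo.eppStrategy ≠ tobaBatak_wo.eppStrategy := by decide

end EPPSketch
```

Here `predictedOrder` is stored as data rather than computed from the strategy, matching the record fields shown above.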
"mangatuk" — ACT-hit (active voice transitive verb).
Equations
- ColeHermon2008.v_mangatuk = Minimalist.mkLeafPhon Minimalist.Cat.V [] "mangatuk" 1
Instances For
"biangi" — dog-DEF (definite object DP).
Equations
- ColeHermon2008.n_biangi = Minimalist.mkLeafPhon Minimalist.Cat.N [] "biangi" 2
Instances For
"dakdanakan" — child-that (subject DP).
Equations
- ColeHermon2008.n_dakdanakan = Minimalist.mkLeafPhon Minimalist.Cat.N [] "dakdanakan" 3
Instances For
Little v (silent, selects VP).
Equations
Instances For
T (silent, selects vP).
Equations
Instances For
The VP constituent [VP V Obj] — the phrase that raises.
Instances For
Toba Batak VOS via VP-raising to Spec,TP.
Steps (bottom-up):
- EM-R Obj → [VP V Obj]
- EM-L v → [v' v VP]
- EM-L Subj → [vP Subj [v' v VP]]
- EM-L T → [TP T [vP Subj [v' v VP]]]
- IM VP → [TP VP [T' T [vP Subj [v' v tVP]]]]
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- ColeHermon2008.v_saw = Minimalist.mkLeafPhon Minimalist.Cat.V [] "saw" 11
Instances For
Equations
- ColeHermon2008.n_mary = Minimalist.mkLeafPhon Minimalist.Cat.N [] "Mary" 12
Instances For
Equations
- ColeHermon2008.n_john = Minimalist.mkLeafPhon Minimalist.Cat.N [] "John" 13
Instances For
English SVO via subject-raising to Spec,TP.
Same base as Toba Batak, but the subject (not VP) moves to Spec,TP.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Phase 1.0 substrate caveat: under MCB nonplanar SOs (FreeCommMagma carrier),
phonYield is noncomputable (it depends on the LCA-derived linearization,
which is a Phase 2 substrate item) and .shape is no longer available.
Theorems about which specific word order surfaces therefore require the
Phase 2 LCA + linearize work, and are sorry-stubbed until then.
VP-raising yields Verb-Object-Subject surface order.
Subject-raising yields Subject-Verb-Object surface order.
Both derivations have the same tree shape before the movement step
(stage 4). The only parametric difference is what moves in step 5.
Phase 1.0 sorry: .shape no longer typechecks.
Toba Batak moves the VP (one moved item).
English moves the subject DP (one moved item).
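The claim that the two derivations share a base and differ only in the final movement step can be illustrated with a toy step list. This is a sketch under assumptions: `Item`, `Step`, and `movedCount` are hypothetical names, and steps are recorded as labels only rather than as the library's syntactic objects.

```lean
namespace DerivationSketch

/-- Items merged or moved in the simplified derivation. -/
inductive Item where
  | Obj | v | Subj | T | VP
  deriving DecidableEq, Repr

/-- External Merge (left/right) or Internal Merge of an item. -/
inductive Step where
  | emL (x : Item)
  | emR (x : Item)
  | im (x : Item)
  deriving DecidableEq, Repr

def Step.isMove : Step → Bool
  | .im _ => true
  | _ => false

def tobaBatakSteps : List Step :=
  [.emR .Obj, .emL .v, .emL .Subj, .emL .T, .im .VP]

def englishSteps : List Step :=
  [.emR .Obj, .emL .v, .emL .Subj, .emL .T, .im .Subj]

def movedCount (d : List Step) : Nat := (d.filter Step.isMove).length

-- Stages 1-4 coincide; only the fifth step differs.
example : tobaBatakSteps.take 4 = englishSteps.take 4 := rfl
example : tobaBatakSteps.drop 4 = [Step.im Item.VP] := rfl
example : englishSteps.drop 4 = [Step.im Item.Subj] := rfl

-- Each derivation has exactly one moved item.
example : movedCount tobaBatakSteps = 1 := rfl
example : movedCount englishSteps = 1 := rfl

end DerivationSketch
```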
After VP-raising, the VP c-commands the subject in the derived tree.
The derived tree is [TP [VP V Obj] [T' T [vP Subj [v' v tVP]]]].
VP is the left daughter of TP; its sister T' dominates the subject.
@cite{cole-hermon-2008} use this c-command relation to explain:
- Freezing: the raised VP is a moved constituent in specifier position, making it an island for extraction. Elements inside VP (including the direct object) are frozen and cannot Ā-extract.
- Subject accessibility: the subject, outside the raised VP, is stranded and remains accessible for further extraction.
Note: this does NOT establish "backward binding" by the object into the subject. The paper explicitly shows that active DOs cannot bind a reflexive subject (Table 1, Type C: ill-formed for all speakers). VP c-commanding the subject is a phrasal c-command relation; it does not entail that the DO (properly contained within VP) individually c-commands the subject.
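The difference between phrasal c-command by VP and c-command by the DO can be checked on a toy binary tree. The sketch below is not the library's `cCommandsIn`: `SO`, `Lab`, and `cCommands` are assumed names, the tree is planar, and the trace of VP is treated as an opaque leaf.

```lean
namespace CCommandSketch

inductive Lab where
  | V | Obj | Subj | v | T | tVP
  deriving DecidableEq, Repr

inductive SO where
  | leaf : Lab → SO
  | node : SO → SO → SO
  deriving Repr

/-- Structural equality on trees. -/
def SO.beq : SO → SO → Bool
  | .leaf a, .leaf b => decide (a = b)
  | .node a b, .node c d => a.beq c && b.beq d
  | _, _ => false

/-- Does the tree contain (or consist of) the leaf `x`? -/
def SO.contains : SO → Lab → Bool
  | .leaf a, x => decide (a = x)
  | .node l r, x => l.contains x || r.contains x

/-- `x` c-commands leaf `b` in a tree iff some node has `x` as one daughter
    and the other daughter contains `b`. -/
def cCommands (x : SO) (b : Lab) : SO → Bool
  | .leaf _ => false
  | .node l r =>
      (l.beq x && r.contains b) || (r.beq x && l.contains b) ||
        cCommands x b l || cCommands x b r

def vp : SO := .node (.leaf .V) (.leaf .Obj)

/-- [TP [VP V Obj] [T' T [vP Subj [v' v tVP]]]] -/
def derived : SO :=
  .node vp (.node (.leaf .T) (.node (.leaf .Subj) (.node (.leaf .v) (.leaf .tVP))))

-- VP c-commands the subject: its sister T' contains Subj.
example : cCommands vp .Subj derived = true := by decide

-- The DO does not: its sister is V, which does not contain Subj.
example : cCommands (.leaf .Obj) .Subj derived = false := by decide

end CCommandSketch
```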
Toba Batak and English instantiate the same EPP parameter space: VP-raising → VOS, subject-raising → SVO.
Freezing under VP-raising #
@cite{cole-hermon-2008} §4: the VP-raising analysis predicts extraction restrictions via freezing. The raised VP/VoiceP is a moved constituent in specifier position, making it an island (following the Sentential Subject Constraint / Condition on Extraction Domain). The predictions:
- Direct object: frozen inside the raised VP → cannot Ā-extract
- Passive agent: frozen inside the raised VoiceP → cannot Ā-extract
- Subject (pivot): stranded outside the raised VP, in Spec,FP → can extract
- Indirect object: escapes VP via remnant movement before VP raises → can extract
- Adverbials: likewise escape before VP raises → can extract
These predictions match the Toba Batak extraction data formalized in
Fragments.TobaBatak.Basic and verified in
Phenomena.FillerGap.Studies.Erlewine2018.
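The five predictions listed above reduce to one Boolean: whether the element is still inside VoiceP when VoiceP raises. A minimal sketch, assuming hypothetical names `Arg`, `insideRaisedVoiceP`, and `canExtract` (none of these is taken from the library):

```lean
namespace FreezingSketch

inductive Arg where
  | directObject | passiveAgent | subject | indirectObject | adverbial
  deriving DecidableEq, Repr

/-- Is the element still inside VoiceP when VoiceP raises? (Per the list above.) -/
def insideRaisedVoiceP : Arg → Bool
  | .directObject | .passiveAgent => true
  | .subject | .indirectObject | .adverbial => false

/-- Freezing: only material outside the raised constituent can Ā-extract. -/
def canExtract (a : Arg) : Bool := !insideRaisedVoiceP a

example : canExtract .subject = true := rfl
example : canExtract .directObject = false := rfl

-- The full predicted profile, in the order of the list above.
example :
    [Arg.directObject, .passiveAgent, .subject, .indirectObject, .adverbial].map canExtract
      = [false, false, true, true, true] := rfl

end FreezingSketch
```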
The subject is NOT contained within the fronted VP. It is stranded outside the moved constituent and remains accessible for extraction.
This is the structural basis for the pivot-only extraction restriction: only the subject (= pivot) survives VP-raising in a position where Ā-extraction is possible.
Extraction prediction: the VP-raising analysis predicts exactly the extraction pattern found in Toba Batak.
For DP arguments in actor voice:
- Agent (= subject/pivot, outside VP): grammatical
- Patient (= DO, inside VP): ungrammatical
This matches Fragments.TobaBatak.avAgentExtraction (grammatical)
and Fragments.TobaBatak.avPatientExtraction (ungrammatical).
Binding data from Table 1 #
@cite{cole-hermon-2008} §3.4–§5 present binding data that bear on the choice between a c-command analysis and the Semantic Hierarchy Condition of Schachter (1984b) and Sugamoto (1984). The key data from Table 1:
| Antecedent | Reflexive | Acceptability |
|---|---|---|
| Active subject | Direct object | Type A (fully acceptable) |
| Passive agent | Passive subject | Type A (fully acceptable) |
| Passive subject | Passive agent | Type B (intermediate) |
| Active DO | Active subject | Type C (ill-formed) |
Type A follows from c-command in the base structure (pre-movement): in passives, reconstruction lets the passive subject be interpreted in its base VP-internal position, where the passive agent c-commands it. Type B follows from the derived structure, where the raised passive subject c-commands the agent. Type C follows from the absence of c-command: the DO does not c-command the subject at any derivational stage.
The VP-raising analysis correctly predicts all four types. The Semantic Hierarchy Condition alone fails to distinguish Types B and C (it predicts both should be ill-formed, since in both cases the patient antecedes the agent reflexive).
Binding acceptability from Table 1 of @cite{cole-hermon-2008}.
- fullyAcceptable : BindingAcceptability
Type A: fully acceptable for all speakers.
- intermediate : BindingAcceptability
Type B: intermediate — acceptable, but not the most usual way to express the sentence.
- illFormed : BindingAcceptability
Type C: ill-formed, not acceptable for any speakers.
Instances For
A binding datum: which NP is the antecedent, which is the reflexive, in which voice, and the acceptability judgment.
- antecedentRole : String
- reflexiveRole : String
- voice : String
- acceptability : BindingAcceptability
- description : String
Instances For
Active subject antecedes DO reflexive: Type A. Example: "Si-Bunga mang-ida [dirina sandiri]" (Bunga saw herself.)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Passive agent antecedes passive subject reflexive: Type A. Example: "Di-ida si-Torus dirina natoari" (Himself was seen by Torus yesterday.)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Passive subject antecedes passive agent reflexive: Type B. Example: "Di-ida [dirina sandiri] si-John" (John was seen by himself.)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Active DO antecedes active subject reflexive: Type C. Example: "*[Dirina sandiri] pa-ias-hon dakdanak-i" (*Himself cleaned the child.)
Equations
- One or more equations did not get rendered due to their size.
Instances For
All binding data from Table 1.
Equations
Instances For
Types A and B are acceptable; Type C is ill-formed.
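The acceptability claim can be checked against a stand-alone copy of the Table 1 data. This is a sketch, not the library's definitions: `isAcceptable` and `table1` are assumed names, the `description` field is omitted, and the string values for roles and voice are illustrative.

```lean
namespace BindingDataSketch

inductive BindingAcceptability where
  | fullyAcceptable | intermediate | illFormed
  deriving DecidableEq, Repr

/-- Types A and B count as acceptable; Type C does not. -/
def BindingAcceptability.isAcceptable : BindingAcceptability → Bool
  | .illFormed => false
  | _ => true

structure BindingDatum where
  antecedentRole : String
  reflexiveRole : String
  voice : String
  acceptability : BindingAcceptability
  deriving Repr

/-- The four rows of Table 1, in the order given above. -/
def table1 : List BindingDatum := [
  { antecedentRole := "active subject", reflexiveRole := "direct object",
    voice := "active", acceptability := .fullyAcceptable },
  { antecedentRole := "passive agent", reflexiveRole := "passive subject",
    voice := "passive", acceptability := .fullyAcceptable },
  { antecedentRole := "passive subject", reflexiveRole := "passive agent",
    voice := "passive", acceptability := .intermediate },
  { antecedentRole := "active DO", reflexiveRole := "active subject",
    voice := "active", acceptability := .illFormed }
]

example :
    table1.map (fun d => d.acceptability.isAcceptable) = [true, true, true, false] := rfl

end BindingDataSketch
```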
The direct object does not c-command the subject (Boolean check).
The DO's inability to bind the subject (Type C) follows from the
derivation: the DO is inside VP, and VP c-commands the subject
(proved in vp_ccommands_subject), but the DO itself does not
c-command the subject. C-command is not inherited by sub-constituents.
In the derived tree [TP [VP V DO] [T' T [vP Subj ...]]]:
- VP c-commands Subj ✓ (VP's sister T' dominates Subj)
- DO does NOT c-command Subj ✗ (DO's sister is V, which does not dominate Subj — V is inside VP, not sister to anything outside VP)
This asymmetry is why Types A and C differ: the subject (in Spec,vP) c-commands into VP (can bind a reflexive DO), but the DO (inside VP) cannot c-command out past VP (cannot bind a reflexive subject).
Connecting to Toba Batak extraction infrastructure #
The VP-raising analysis's extraction predictions are independently
formalized in Fragments.TobaBatak.Basic (empirical extraction data)
and Phenomena.FillerGap.Studies.Erlewine2018 (verification theorems).
This section bridges the derivational analysis to that data.
The EPP strategy for Toba Batak is VP-raising, which is the derivational mechanism that produces the freezing effect responsible for the extraction restriction.
The extraction profile marks only the subject position as extractable. This is exactly the position that VP-raising strands outside the fronted predicate: the pivot, which stays in Spec,vP in the simplified derivation while the VP occupies Spec,TP (the full analysis instead raises VoiceP to Spec,FP).
The voice system is two-way (AV/OV), determining which thematic role occupies the extractable pivot position.
decide confirmation of vp_ccommands_subject's structured proof.
The VOS Hypothesis #
@cite{cole-hermon-2008} §5: SVO order is common in Toba Batak (~1/3 of sentences). Two competing analyses:
- SVO Hypothesis: SVO sentences have underlying SVO and VoiceP never raises. Predicts different extraction restrictions for SVO vs VOS.
- VOS Hypothesis: ALL clauses go through a VOS stage; SVO results from the subject raising past the fronted VoiceP to a higher specifier. Predicts the SAME extraction restrictions for SVO and VOS.
The data confirm the VOS Hypothesis: extraction from SVO clauses shows the same freezing effects as VOS (examples 85–88). Direct objects cannot be wh-fronted regardless of surface word order.
The derivation extends tobaBatakVOS with one more step: subject raises
to Spec,FP (a higher functional projection), past the fronted VP.
This analysis connects to the claim in §6 that linear order within Merge
is irrelevant — only c-command matters. This is precisely the content of
the Linear Correspondence Axiom (LCA) formalized in
Theories.Syntax.Minimalist.Linearization.LCA.
F head (higher functional projection above TP).
Equations
Instances For
Toba Batak SVO via the VOS Hypothesis.
Steps 1–5 are identical to tobaBatakVOS (yielding VOS at stage 5).
Then:
6. EM-L F → [FP F [TP VP [T' T [vP Subj [v' v tVP]]]]]
7. IM Subj → [FP Subj [F' F [TP VP [T' T [vP tSubj [v' v tVP]]]]]]
The subject raises past the fronted VP, yielding S-V-O surface order.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The VOS Hypothesis derives SVO surface order.
SVO goes through VOS: at stage 5 (before subject-raising), the
intermediate tree has VOS order — the same as tobaBatakVOS.final.
The VOS Hypothesis predicts identical extraction restrictions for SVO: the DO is still inside the fronted VP, regardless of whether the subject subsequently raises past it.
SVO requires two movement steps (VP-raising + subject-raising).
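The two stages can be illustrated with a toy planar tree whose yield is read left to right. This is a deliberate simplification and an assumption of the sketch: the substrate described above uses nonplanar SOs with a noncomputable `phonYield`, so `Tree`, `W`, and `Tree.yield` here are hypothetical stand-ins, with silent heads and traces collapsed into one constructor.

```lean
namespace VOSHypothesisSketch

inductive W where
  | V | O | S
  deriving DecidableEq, Repr

inductive Tree where
  | word : W → Tree
  | silent : Tree            -- silent head (v, T, F) or trace
  | node : Tree → Tree → Tree

/-- Left-to-right yield, skipping silent material. -/
def Tree.yield : Tree → List W
  | .word w => [w]
  | .silent => []
  | .node l r => l.yield ++ r.yield

def vp : Tree := .node (.word .V) (.word .O)

/-- Stage 5: [TP VP [T' T [vP Subj [v' v tVP]]]] -/
def stage5 : Tree :=
  .node vp (.node .silent (.node (.word .S) (.node .silent .silent)))

/-- Stage 7: [FP Subj [F' F [TP VP [T' T [vP tSubj [v' v tVP]]]]]] -/
def stage7 : Tree :=
  .node (.word .S)
    (.node .silent
      (.node vp (.node .silent (.node .silent (.node .silent .silent)))))

example : stage5.yield = [W.V, W.O, W.S] := rfl
example : stage7.yield = [W.S, W.V, W.O] := rfl

end VOSHypothesisSketch
```

Both stages embed the same `vp` subtree, so the DO sits inside the fronted constituent in the SVO output as well as in the VOS one.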
English passives and the agent-as-adjunct analysis #
@cite{cole-hermon-2008} §7 extends the VP-raising analysis to English passives, predicting why English and Toba Batak differ on passive binding.
The key structural difference: in TB, the passive agent is an argument generated in Spec,vP (high position, c-commands patient in VP). In English, the passive agent is an adjunct (by-phrase, low position inside VP, does not c-command patient).
Consequence:
- TB passive: agent c-commands patient at the base stage → reconstruction allows agent to bind reflexive patient (Type A). Patient raised to surface subject also c-commands agent (Type B).
- English passive: agent never c-commands patient → agent cannot bind reflexive patient (example 96: "*himself was injured by the boy"). Patient raised to subject c-commands agent → patient can bind reflexive agent (example 95: "the boy was injured by himself").
We model the English passive with the agent as a low complement of V (representing the by-phrase adjunct) and the patient as specifier of VP (following @cite{larson-1988}), with no external argument in Spec,vP.
Equations
- ColeHermon2008.v_injured = Minimalist.mkLeafPhon Minimalist.Cat.V [] "was-injured" 21
Instances For
Equations
- ColeHermon2008.n_boy = Minimalist.mkLeafPhon Minimalist.Cat.N [] "the-boy" 22
Instances For
Equations
- ColeHermon2008.n_by_himself = Minimalist.mkLeafPhon Minimalist.Cat.N [] "by-himself" 23
Instances For
The passive VP: [VP patient [V' V agent-PP]].
Equations
Instances For
English passive derivation (trees 97–100 of the paper).
Steps:
- EM-R agent-PP → [V' V agent]
- EM-L patient → [VP patient [V' V agent]]
- EM-L v → [v' v VP] (no external argument: passive)
- EM-L T → [TP T [vP v VP]]
- IM patient → [TP patient [T' T [vP v [VP t [V' V agent]]]]]
Equations
- One or more equations did not get rendered due to their size.
Instances For
English passive yields patient-verb-agent surface order.
TB vs English passive binding #
The same theory (c-command based binding + optional reconstruction) with different structural parameters (agent-as-argument vs agent-as-adjunct) correctly predicts the cross-linguistic contrast:
| Pattern | Toba Batak | English |
|---|---|---|
| Agent antecedes patient refl. | ✓ (Type A) | ✗ (96) |
| Patient antecedes agent refl. | ✓ (Type B) | ✓ (95) |
Both predictions follow from c-command, evaluated either at the base stage (via reconstruction) or in the derived tree:
- TB agent (Spec,vP) c-commands patient (VP) at the base stage → available under reconstruction
- English agent (low PP) does not c-command patient at any stage
- Both: patient (raised to Spec,TP/FP) c-commands agent
The formalization verifies the c-command predictions computationally
using cCommandsIn over the derived trees.
In the English passive, the patient (raised to Spec,TP) c-commands the by-phrase agent. This is why "The boy was injured by himself" is grammatical: the patient can bind a reflexive in the agent position.
In the English passive, the by-phrase agent does NOT c-command the patient. This is why "*Himself was injured by the boy" is ungrammatical: the agent (low adjunct inside VP) cannot bind a reflexive in subject position.
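The two English passive checks can be reproduced on a toy tree with a leaf-to-leaf c-command test. A sketch under assumptions: `leafCCommands` is a hypothetical simplification of `cCommandsIn`, the by-phrase is a single leaf, and the patient's trace is treated as a distinct opaque leaf.

```lean
namespace PassiveSketch

inductive Lab where
  | patient | agent | V | v | T | trace
  deriving DecidableEq, Repr

inductive SO where
  | leaf : Lab → SO
  | node : SO → SO → SO

def SO.contains : SO → Lab → Bool
  | .leaf a, x => decide (a = x)
  | .node l r, x => l.contains x || r.contains x

def SO.isLeaf : SO → Lab → Bool
  | .leaf b, a => decide (a = b)
  | .node _ _, _ => false

/-- Leaf `a` c-commands leaf `b` iff some node has `a` as one daughter
    and the other daughter contains `b`. -/
def leafCCommands (a b : Lab) : SO → Bool
  | .leaf _ => false
  | .node l r =>
      (l.isLeaf a && r.contains b) || (r.isLeaf a && l.contains b) ||
        leafCCommands a b l || leafCCommands a b r

/-- [TP patient [T' T [vP v [VP t [V' V agent]]]]] -/
def englishPassive : SO :=
  .node (.leaf .patient)
    (.node (.leaf .T)
      (.node (.leaf .v)
        (.node (.leaf .trace)
          (.node (.leaf .V) (.leaf .agent)))))

-- "The boy was injured by himself": patient c-commands agent.
example : leafCCommands .patient .agent englishPassive = true := by decide

-- "*Himself was injured by the boy": agent's sister is V, so no c-command.
example : leafCCommands .agent .patient englishPassive = false := by decide

end PassiveSketch
```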
In the TB active base structure (pre-movement, stage 4), the subject c-commands the object. This is the structural basis for Type A binding (active subject → DO refl).
Binding is evaluated at the pre-movement stage: the base tree is
[TP T [vP Subj [v' v [VP V Obj]]]], where Subj's sister (v')
contains the object. After VP-raising, the object moves to a
different branch; reconstruction restores the base c-command.
Cross-linguistic contrast verified: same c-command theory, different structural parameters, different binding predictions.
The conjunction links four c-command checks across two languages:
- TB active (base): subject c-commands object (Type A binding)
- TB active (derived): object does not c-command subject (Type C)
- English passive (derived): patient c-commands agent (ex. 95)
- English passive (derived): agent does not c-command patient (ex. 96)
Toba Batak VP-raising as a RemnantFronting instance #
@cite{cole-hermon-2008}'s VP-raising analysis is one of the canonical
empirical motivations for the remnant-XP movement substrate
(Theories/Syntax/Minimalist/Movement/Remnant.lean). The full §4
analysis has IO/Adv evacuating before VoiceP raises — making the
fronted constituent a genuine remnant. The simplified §1 derivation
formalised above keeps the IO/Adv inside vP for tractability, but the
core VP-raising mechanism is the same.
Below we register the simplified Toba Batak derivation as a
Minimalist.Movement.RemnantFronting witness. The substrate is then
used — properRemnant is decidable on the witness — rather than
implicitly referenced in prose.
Toba Batak VP-raising as a remnant-fronting witness. In the
simplified §1 derivation, no head has evacuated vp before it
fronts (V stays inside VP, no IO/Adv extraction), so the
evacuatedHeads list is empty — vacuously a "remnant" in the
structural sense, though not a contentful remnant in the §4 sense
of evacuated IO/Adv. The full §4 instance would populate
evacuatedHeads with the IO and adverbials.
Equations
- tobaBatakVPRaising_remnant = { frontedXP := ColeHermon2008.vp, evacuatedHeads := [], landingSite := ColeHermon2008.t_head }
Instances For
The §1-simplified Toba Batak fronting trivially satisfies
properRemnant: an empty evacuation list vacuously satisfies
the universal "every evacuated head was originally inside the
fronted XP".
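The vacuity argument can be made concrete with a generic stand-in. This is only a sketch: the real `Minimalist.Movement.RemnantFronting` and `properRemnant` are not shown on this page, so the type parameter `α`, the Boolean `inside` relation, and the theorem name `properRemnant_nil` are all assumptions.

```lean
namespace RemnantSketch

/-- Hypothetical generic version of the record fields shown above. -/
structure RemnantFronting (α : Type) where
  frontedXP : α
  evacuatedHeads : List α
  landingSite : α

/-- Every evacuated head was originally inside the fronted XP. -/
def properRemnant {α : Type} (inside : α → α → Bool) (r : RemnantFronting α) : Bool :=
  r.evacuatedHeads.all (fun h => inside h r.frontedXP)

/-- With no evacuated heads the condition holds vacuously,
    whatever the containment relation is. -/
theorem properRemnant_nil {α : Type} (inside : α → α → Bool) (xp site : α) :
    properRemnant inside
      { frontedXP := xp, evacuatedHeads := [], landingSite := site } = true :=
  rfl

end RemnantSketch
```

A contentful §4-style witness would supply a non-empty `evacuatedHeads` list, at which point the `inside` relation starts to matter.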