Pietraszko 2026: In defense of the clause-internal phase (Ndebele) #
[pietraszko-2026]
VoiceP in Zimbabwean Ndebele is a clause-internal phase, diagnosed by operational opacity for A-movement and φ-agreement (not by successive-cyclic footprints). Mechanism: Voice optionally bears EPP; when it does, the in-vP subject vacates to the phase edge (Spec,VoiceP) and is visible to T's probes; when it does not, the subject is trapped in the phase complement and T defaults to class-15 ku- agreement.
Main results #
- §1-§2: Tree-derived PIC accessibility (
subject_inSitu_inaccessible,subject_atEdge_accessible) — the load-bearing structural derivations. - §3-§5: Bidirectional movement-agreement, Aux-V uniformity, phase-delimited movement.
- §6: Convergence with [ES25b] (Malayic
VoiceP-as-phase) —
voice_phase_attested_in_two_families. - §7: Four-cell
(flavor.defaultPhasal × phaseOverride)typology witnessed by Pietraszko + E&S + [Coo19] + [CMPP14]. - §8: Study-local horizon-based reduction (
pietraszkoNdebeleConfig) for side-by-side comparison with [Kei20]'s framework. - §3.1.1, §3.1.3, §3.2: Empirical theorems for raising-to-object, reduced-clause AspP test (with morphological exponent functions), and object-dislocation entails subject accessibility.
See also: substrate, sibling Studies, deferred work #
See module-internal Notes block (after the namespace) for: methodology
caveats (single-consultant data; novel allomorphy split); cross-framework
positions vs Keine 2020; alternative analyses ruled out (Carstens &
Mletshe, Halpert, Zeller); IS-syntax interface deliberately bracketed;
deferred work (multi-clause hyperraising; per-paper alternative models
as formalized siblings; bibliography backlog).
Notes #
Methodology caveat. All Ndebele data are from a single native
consultant ([pietraszko-2026], fn 1: a 60-year-old speaker who
grew up in Bulawayo). The class-1 /u/ (T) vs /e/ (Perf/Asp)
allomorphy split (encoded in class1Allomorph below) is novel to this
paper and not yet corroborated against published Ndebele grammars.
Alternative analyses ruled out empirically (Pietraszko §3, §4 — none yet formalized in linglib):
- Optional [EPP,φ] on T (Carstens & Mletshe 2015 for Xhosa) — falsified by §3.1.1 raising-to-object and §3.1.2 Aux-V uniformity.
- Expletive pro (Halpert 2015 + Buell 2005, 2007 for Zulu) — falsified by §3.2 topicalization across the expletive.
- Antifocus feature (Zeller 2008, 2015) — falsified by §4 hyperraising (ex 73-76); not formalized here.
Information-structural conditioning bracketed. Bantu VS order is
paradigmatically focus / discourse-newness conditioned (Buell 2005,
Zerbian 2006). Pietraszko 2026 treats the optionality as syntactic.
Linglib/Features/InformationStructure.lean exists but is not consumed
by this file.
Deferrals (next-session candidates):
- §4 hyperraising-to-VP (ex 73-76) — multi-clause derivations.
- Per-paper alternative models (CM, Halpert, Zeller) as formalized sibling accounts whose distinct predictions can be refuted by explicit theorem rather than by docstring claim.
- A
Fragments/Ndebele/graduation once a second Ndebele paper lands. - A cross-Bantu agreement typology hub under
Typology/. - Typeclass-based phasehood-attestation registry (per the audit's
"obvious next move"): would let convergence with E&S 2025 be
true-by-construction via
instancedeclarations. - Bibliography: ~25 missing entries with publisher-verified DOIs.
§1. Lexical sample (Ndebele Aux-V structure) #
Voice with EPP feature (Pietraszko 2026): triggers subject movement
to Spec,VoiceP. VoiceP is always phasal in Ndebele
(phaseOverride := some true); the variation is on the EPP feature.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Voice without EPP: subject trapped in vP, invisible to higher probes via PIC at the Voice phase boundary.
Equations
- Pietraszko2026.Sample.voiceWithoutEPP = { flavor := Minimalist.VoiceFlavor.agentive, hasD := true, phaseOverride := some true }
Instances For
Token id namespace (each lexical position in the derivation
gets a distinct id; needed because LIToken distinguishes
id-tagged copies).
Equations
Instances For
Equations
Instances For
Equations
Instances For
A T head bearing [φ, EPP].
Equations
Instances For
An Asp head also bearing [φ, EPP] (per [baker-2003], [baker-2008], [collins-2004], [carstens-2005]: in Bantu, φ-probes occur only on heads with EPP, so each inflectional head above Voice carries the same probe profile).
Equations
Instances For
A Voice head (the lexical leaf; Voice's phasehood and EPP are
metadata on the corresponding VoiceHead, not on the LIToken).
Equations
Instances For
A v / V head (lexical verb stem, here "phek-" 'cook').
Equations
- Pietraszko2026.Sample.vToken = { item := Minimalist.LexicalItem.simple Minimalist.Cat.v [], id := Pietraszko2026.Sample.idV }
Instances For
The subject DP's LIToken (class 1, uZondi).
Equations
Instances For
A trace LIToken at the base position after movement (distinct LIToken, indistinguishable to PIC since the interior membership check tests structural containment, not LI equality).
Equations
Instances For
The subject DP at its base position (Spec,vP). Class 1 (uZondi).
Equations
Instances For
The subject DP after movement to Spec,VoiceP (a fresh copy, since
each LIToken in copy theory is a distinct token).
Equations
Instances For
A trace at the base position after movement (distinct LIToken).
Equations
Instances For
The trees below are built planar-first via the DSL (SO.ofPlanar
/ SO.nodeP / SO.leafP) because the smart Merge SO.node is
noncomputable; planar construction keeps the decide PIC proofs
reducing.
The Voice projection with NO Spec,VoiceP — subject remains in vP.
Structure: voice (subject v). Voice is the phase; its complement
is the entire vP, which contains the subject.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Voice projection with subject at Spec,VoiceP (phase edge).
Structure: subject (voice (trace v)). The OUTER node is not the
phase; the inner voice (trace v) is the phase, and the subject
(sister to it) is at the edge.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Voice phase for voiceP_noSpec: phase head voiceToken, tree
voiceP_noSpec. Its interior (= the head's c-command domain, the
spelled-out vP) and edge are derived from the selection head
(MCB §1.14), not stipulated. The subject sits in the interior.
Equations
Instances For
The Voice phase for voiceP_withSpec: phase head voiceToken, whose maximal
projection is the inner VoiceP. The derived interior is the post-movement vP;
the moved subject at Spec,VoiceP is on the derived edge.
Equations
Instances For
The full Aux-V tree (T > Asp > Voice > vP) with subject in situ.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The full Aux-V tree with subject moved to Spec,VoiceP. T's probe will additionally attract the subject to Spec,TP — that movement is the derivational step we don't model here; the salient question is just whether T's probe finds the subject under PIC.
Equations
- One or more equations did not get rendered due to their size.
Instances For
§2. Phase-bounded accessibility (derived from PIC) #
The substantive content of Pietraszko 2026's analysis is that the
subject's accessibility to a higher probe is determined by whether
the subject sits inside the Voice-phase complement. We derive this
from the MCB-grounded Phase.Impenetrable (the interior = complement,
eq 1.14.3) rather than stipulate it.
The subject is accessible to a probe above the Voice phase iff it is NOT
frozen in the phase interior (= the spelled-out complement, PIC). This is the
negation of the MCB-derived Phase.Impenetrable — no stipulated complement.
Equations
- Pietraszko2026.IsSubjectAccessibleAcross voicePhase subj = ¬voicePhase.Impenetrable subj
Instances For
When subject is in situ (in vP, the Voice-phase complement), it is
NOT accessible across the phase. Derived from voicePhase_noSpec's
complement field via contains.
When subject is at Spec,VoiceP (the phase edge), it IS accessible
across the phase — the trace inside vP is a distinct LIToken
(different id), so contains correctly returns false on the
moved-subject copy, and accessibility holds.
PIC encoding choice: traces use distinct ids (idSubjectTrace = 7
vs idSubject = 5); a substrate move to true copy theory (where
a chain is one entity with deletion at PF) would require revisiting
this representation. The current encoding is sound for contains
PIC checking but is not faithful to chain-based copy theory.
§3. Prediction 1 — bidirectional movement-agreement covariation #
Pietraszko §2.3: subject movement to Spec,TP and φ-agreement on T covary
perfectly. Honest framing: in copy-theory PIC, "subject moved" and
"T agrees" are the same structural condition (subject is OUTSIDE the
Voice-phase complement). The bidirectional theorem in this file
therefore reduces to rfl modulo decide — its content is structural
(the single PIC condition has the two observable consequences),
not a coincidence between independent measurements. The load-bearing
derivation lives in §2 (subject_inSitu_inaccessible,
subject_atEdge_accessible).
A typeclass-based reformulation (per the prior mathlib audit's "obvious
next move") would let convergence with E&S 2025 register as instance
declarations and the bidirectional become true-by-construction.
Deferred.
Surface phenomenon — both routes coincide: T's probe finds an accessible subject ↔ the subject has vacated the Voice phase complement. Derived from PIC via the tree's Voice phase.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Subject's accessibility to a higher probe in a given Aux-V tree. The single observable; both surface phenomena (movement + agreement) are derived from this. Defaults to inaccessible for unrecognized tree shapes.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Prediction 1 (bidirectional): surface movement-and-agreement
coincide on the named trees because both reflect the same
IsSubjectAccessibleInTree structural condition.
§4. Prediction 2 — Aux-V uniformity #
In an Aux-V chain [T > Asp > Voice > vP], every functional head above
Voice carries the same φ + EPP probe (per the Bantu Baker generalization:
[baker-2003], [baker-2008], [collins-2004],
[carstens-2005]). Each probe independently checks accessibility
across the Voice phase. Because they share the same accessibility
condition, they uniformly succeed or uniformly fail.
Modeled here without per-head parameterization: every above-Voice probe
reads IsSubjectAccessibleInTree, so uniformity follows by construction.
A more articulated model with per-head probes would consult an
InflectionalHost enum and run independent applyAgree calls; both
would still derive uniformity from the shared accessibility condition.
Prediction 2 (Aux-V uniformity): every functional head above
Voice agrees iff the subject is accessible across the Voice phase.
The theorem is structural: the SAME predicate
IsSubjectAccessibleInTree answers for all such probes, so
uniformity is by definition. The empirical content is in the
concrete witness theorem below — all heads jointly agree or jointly
default on the named trees.
Concrete witness: T and Asp uniformly agree on the moved tree, both fail on the in-situ tree.
§5. Phase-delimited movement (§4 ex 78-79) #
Pietraszko 2026's generalization: A-movement is obligatory within a phase; optional across phases. The Voice-phase boundary is the locus of optionality — once the subject vacates the phase via Spec,VoiceP, movement to Spec,TP is forced.
Phase-internal A-movement is obligatory: a subject inside vP cannot be reached by T (PIC blocks it), so the only way for T to agree is for the subject to be at the phase edge.
Cross-phasal A-movement is optional: independent voice variants can give either visible (with EPP) or invisible (without EPP) — the optionality lives at the Voice phase boundary.
§6. Convergence with [ES25b] #
Both papers commit Voice to be phasal via phaseOverride := some true,
on disjoint empirical domains. The convergence is now machine-checked
rather than docstring-asserted.
Voice phasehood is attested in two unrelated families via the same
phaseOverride := some true carrier: Pietraszko 2026 (Ndebele,
Bantu) and Erlewine & Sommerlot 2025 (Malayic, Austronesian).
§7. Four-cell phase-override typology #
Three concrete clients now exhaust the (VoiceFlavor.defaultPhasal × phaseOverride : Option Bool) typology cells:
- agentive + default-phasal (no override): Collins/Chomsky baseline
—
Minimalist.voiceAgent - agentive + override-true: Pietraszko 2026, this file —
Sample.voiceWithEPP - agentive + override-false: [Coo19] (Chol intransitive), [CMPP14] (Mam Agent Focus)
- passive + override-true: [ES25b] (Malayic di-passive, bare passive)
The typology theorem witnesses each cell with a concrete VoiceHead.
Cell 1: agentive flavor + no override = Collins/Chomsky baseline
(Minimalist.voiceAgent). Inherits IsPhasal from defaultPhasal.
Cell 2: agentive flavor + phaseOverride := some true =
Pietraszko 2026's Voice-with-EPP (this file's Sample.voiceWithEPP).
Same observable phasehood as Cell 1 but with explicit override
documenting the per-paper commitment.
Cell 3: agentive flavor + phaseOverride := some false =
Chol intransitive variants (Coon2019.v_w); also Mam Agent Focus
in [CMPP14]. Override forces non-phasal
against the agentive flavor default.
Cell 4: passive flavor + phaseOverride := some true =
Malayic di- passive in [ES25b]. Override
forces phasal against the passive flavor default (which Collins-style
treats as non-phasal).
§8. Divergence from [Kei20]: study-local probe config #
Pietraszko 2026 §3.1.2 + §3.1.3 argue that probe-based locality ([Kei20]'s horizons) cannot derive Aux-V uniformity. Below is a study-local Keine-style probe configuration intended to model Pietraszko's data using the horizon framework — included for side-by-side comparison, not as the substrate's preferred encoding.
Per linglib layer discipline, this lives here (study-local) rather
than as a row in Syntax/Minimalist/Probe.lean's
LanguageProbeConfig table.
The horizon-based reduction of Pietraszko's account: Voice is the horizon for both φ and A-movement probes. Topicalization (Ā) and wh-licensing are unbounded (Pietraszko §3.2 ex 28: A-bar movement crosses VoiceP without difficulty even when subject is in situ).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The configuration's φ-probe correctly classifies as A-level (not Ā).
A and Ā probes differ in horizon: A is bounded by Voice, Ā is unbounded. This is what permits A′-movement out of VoiceP even when A-movement is blocked (Pietraszko §3.2 ex 55: object dislocation requires subject movement, A and A′ asymmetry resolved).
§9. Empirical predictions deferred to substrate extension #
The following empirical theorems are marked here as the natural next
formalization steps. Each requires substrate that is either present
but underused (raising-to-object: cross-clausal applyAgree) or
genuinely missing (hyperraising: multi-clause Derivation model;
reduced-clause AspP test: LIToken allomorph exponent function).
§3.1.1 raising-to-object #
Pietraszko §3.1.1 ex 25-27: Ngi-funa uZondi {a/*ku}-pheke "I want
Zondi to cook". The embedded subject uZondi raises to matrix object
position; the embedded T's class-1 a- (matching agreement) is forced,
default ku- is *. This falsifies Carstens & Mletshe's optional-T-EPP
analysis: if T optionally lacked [EPP,φ], default agreement should be
possible even while raising. The phasal-Voice account predicts
obligatoriness: raising requires the subject to vacate VoiceP, and
once vacated, T's probe necessarily finds it.
The conditional: subject accessibility (the precondition for both raising-to-object licensing and embedded T agreement) is the single structural condition; either both hold or neither does.
§3.1.3 reduced-clause AspP test #
Pietraszko §3.1.3 ex 34-42: in Ngi-khulume, ubaba e/*u-si-dla "I spoke
while father eating", the adjunct clause is reduced (no T layer).
Subject-agreement uses the /e/ allomorph (Asp) not /u/ (T),
confirming the agreement is on Asp. This argues T always has EPP — were
T's EPP optional, the adjunct could lack T entirely AND still show u-
on a hypothetical T head, but in fact only Asp's /e/ appears.
We formalize the morphological exponent function and the empirical contrast directly.
The inflectional host of an agreement morpheme above Voice. Used by
class1Allomorph and the AspP test.
- T : InflectionalHost
- Asp : InflectionalHost
Instances For
Equations
- Pietraszko2026.instDecidableEqInflectionalHost x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
The class-1 agreement allomorph by host. Pietraszko §3.1.3:
distinct exponents (u_ on T, e_ on Asp) — the morphological
diagnostic that the agreement target in reduced clauses is Asp,
not a hypothetical T head.
- u_ : Class1Exponent
- e_ : Class1Exponent
Instances For
Equations
- Pietraszko2026.instDecidableEqClass1Exponent x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Default class-15 agreement is uniformly ku-, regardless of host. A constant function — the host argument is empirically vacuous, which is itself the diagnostic content.
Equations
- Pietraszko2026.defaultAllomorph x✝ = "ku-"
Instances For
The reduced-clause AspP test: in adjunct contexts where the highest head is Asp (not T), the class-1 prefix is e- not u-. The contrast is morphological proof that the agreement target in such contexts is genuinely Asp, supporting Pietraszko's claim that T always has EPP and that the optionality lives lower in the spine.
The default ku- prefix is host-uniform: identical on T and Asp. This morphological identity is what makes Aux-V uniformity (§3.1.2) detectable as a single observable.
Combined morphological prediction: matching agreement differs by host; default agreement is host-uniform.
§3.2 object-dislocation entails subject movement (ex 55) #
Pietraszko §3.2: "in Ndebele, if the subject cannot move out of VoiceP, neither can the object." A/A′ convergence — when Voice lacks EPP, both A-probes (T's φ) and the object-dislocation-licensing operation are blocked. We formalize the entailment as a Bool implication on the tree state.
Object dislocation requires its own phase-edge licensing: the object
must escape the Voice phase complement to reach a dislocation
position above. Routed through voicePhaseFor so the entailment
theorem below is not a definitional alias of subject accessibility,
even though the two share the Voice-EPP precondition.
Equations
- Pietraszko2026.CanObjectDislocate tree = match Pietraszko2026.voicePhaseFor tree with | some _ph => Pietraszko2026.IsSubjectAccessibleInTree tree | none => False
Instances For
Equations
- One or more equations did not get rendered due to their size.
§3.2 ex 55 entailment: object dislocation is licensed only when the subject is also accessible across the Voice phase. The two operations share the Voice-EPP precondition; neither can apply when Voice lacks EPP.
Contrapositive: when subject is inaccessible (in situ), object dislocation is impossible.