Linglib.Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986

Bach, Brown & Marslen-Wilson (1986)

@cite{bach-brown-marslen-wilson-1986}

Crossed and Nested Dependencies in German and Dutch: A Psycholinguistic Study. Language and Cognitive Processes, 1(4), 249–262.

Core Finding

Dutch crossed verb-cluster dependencies (NP₁ NP₂ NP₃ V₁ V₂ V₃) are easier to process than German nested dependencies (NP₁ NP₂ NP₃ V₃ V₂ V₁) at two or more levels of embedding (Level 3 and above), in both comprehensibility ratings and comprehension accuracy. At one level of embedding (Level 2), German/Participle does not differ from Dutch, whereas German/Infinitive shows a significant baseline disadvantage across all levels. The study provides the first controlled experimental evidence for @cite{evers-1975}'s intuition that crossed dependencies are easier.

Incremental Integration Model

The paper argues qualitatively that crossed dependencies allow incremental top-down integration, whereas nested dependencies force bottom-up accumulation of floating propositions. We formalize this via totalIntegrationCost: the cumulative count of NPs awaiting matrix-connected integration during verb-cluster processing. This metric is our formalization, not the paper's; the authors argue informally about when partial interpretations become available.

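The integration model is derived from the VerbClusterBinding permutation σ, where σ(i) is the (0-indexed) position in the verb cluster of the verb bound to NP_i and NP_0 is the matrix subject. NP_i is matrix-integrated after k verbs iff max(σ(0), σ(i)) < k, that is, once both its own verb and the matrix verb have been heard. For the identity and reverse orders, every verb in the control chain between those two has then been heard as well. For identity (crossed), σ(0) = 0, so min(k, n) NPs are integrated after k verbs. For reverse (nested), σ(0) = n−1, so nothing integrates until all n verbs have been heard.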
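
The integration criterion can be tried out in a few lines of core Lean 4. This is only a sketch under stated assumptions: a binding is modelled as a bare function Nat → Nat, and the names integrated and integratedCount are hypothetical stand-ins, not Linglib's Features.VerbClusterBinding API.

```lean
-- Hypothetical stand-in: a binding is a plain function on indices.
-- Linglib's `Features.VerbClusterBinding` may be structured differently.

/-- NP `i` is matrix-integrated after `k` verbs iff `max (σ 0) (σ i) < k`. -/
def integrated (σ : Nat → Nat) (i k : Nat) : Bool :=
  decide (max (σ 0) (σ i) < k)

/-- Number of NPs (out of `n`) that are integrated after `k` verbs. -/
def integratedCount (n : Nat) (σ : Nat → Nat) (k : Nat) : Nat :=
  ((List.range n).filter (fun i => integrated σ i k)).length

-- n = 3, after k = 0, 1, 2, 3 verbs.
-- Identity (crossed): one more NP integrates with each verb.
#eval (List.range 4).map (integratedCount 3 (fun i => i))          -- [0, 1, 2, 3]
-- Reverse (nested): nothing integrates until the last verb.
#eval (List.range 4).map (integratedCount 3 (fun i => 3 - 1 - i))  -- [0, 0, 0, 3]
```

The two outputs correspond to the min(k, n) pattern for crossed order and the all-at-the-end pattern for nested order.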

The cost ratio nested/crossed is exactly 2 for all n ≥ 2 (both costs are 0 for n ≤ 1, so the ratio is undefined there), but the absolute difference n(n−1)/2 grows quadratically. This is consistent with the finding that the processing difference is undetectable at n=2 (gap = 1) but large at n=3 (gap = 3).
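
As a quick arithmetic check, the closed forms n(n−1)/2 and n(n−1) can be tabulated in core Lean 4. The function names below are hypothetical and exist only for this illustration.

```lean
-- Closed forms quoted in the text; the names are hypothetical.
def crossedCost (n : Nat) : Nat := n * (n - 1) / 2
def nestedCost (n : Nat) : Nat := n * (n - 1)

-- Tuples are (n, crossed, nested, gap). The nested cost is always twice
-- the crossed cost, while the gap equals the crossed cost and grows with n.
#eval ([2, 3, 4, 5] : List Nat).map fun n =>
  (n, crossedCost n, nestedCost n, nestedCost n - crossedCost n)
-- [(2, 1, 2, 1), (3, 3, 6, 3), (4, 6, 12, 6), (5, 10, 20, 10)]
```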

Dependency Length Invariance

Crossed and nested patterns have identical total NP-verb dependency length (n²). This means the Bach et al. finding cannot be explained by dependency length minimization alone — the advantage of crossed dependencies is about when information becomes available for matrix integration, not about dependency distance.
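
The invariance can be illustrated with a small core Lean 4 sketch. It assumes a simple string layout that is not confirmed by the source: NP i sits at position i and the verb in cluster slot j sits at position n + j. The names npVerbDistSketch and totalDistSketch are hypothetical and need not match Linglib's npVerbDist.

```lean
-- Assumed layout: NP i at position i, verb slot j at position n + j.
def npVerbDistSketch (n : Nat) (σ : Nat → Nat) (i : Nat) : Nat :=
  (n + σ i) - i

def totalDistSketch (n : Nat) (σ : Nat → Nat) : Nat :=
  ((List.range n).map (npVerbDistSketch n σ)).foldl (· + ·) 0

-- Per-NP distances differ between the two orders ...
#eval (List.range 3).map (npVerbDistSketch 3 (fun i => i))          -- [3, 3, 3]
#eval (List.range 3).map (npVerbDistSketch 3 (fun i => 3 - 1 - i))  -- [5, 3, 1]
-- ... but the totals agree (3² = 9).
#eval totalDistSketch 3 (fun i => i)          -- 9
#eval totalDistSketch 3 (fun i => 3 - 1 - i)  -- 9
```

Under this layout, crossed order gives n equal distances of n, and nested order gives the odd numbers 2n−1, …, 3, 1, which also sum to n².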

Formal–Processing Dissociation

Crossed dependencies require mildly context-sensitive power (@cite{shieber-1985}, @cite{bresnan-etal-1982}) while nested dependencies are context-free, yet crossed is psycholinguistically easier. This refutes models where parsing difficulty tracks the Chomsky hierarchy and provides evidence against push-down-store models of human parsing (@cite{evers-1975}).

def BachBrownMarslenWilson1986.totalIntegrationCost {n : ℕ} (σ : Features.VerbClusterBinding n) :

ℕ

Cumulative unintegrated NPs across verb positions 1..n.

Crossed: (n−1) + (n−2) + ··· + 0 = n(n−1)/2

Nested: n + n + ··· + n + 0 = n(n−1)

Equations

BachBrownMarslenWilson1986.totalIntegrationCost σ = ∑ k ∈ Finset.range n, σ.unintegratedCount (k + 1)
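
The equation above can be mirrored in core Lean 4 without Mathlib's Finset. The sketch assumes that unintegratedCount counts the NPs i with k ≤ max(σ(0), σ(i)), which is the complement of the integration criterion in the module docstring; the names unintegratedSketch and totalCostSketch are hypothetical.

```lean
-- Assumption: NP i is still unintegrated after k verbs iff k ≤ max (σ 0) (σ i).
def unintegratedSketch (n : Nat) (σ : Nat → Nat) (k : Nat) : Nat :=
  ((List.range n).filter (fun i => decide (k ≤ max (σ 0) (σ i)))).length

-- Sum over verb positions k + 1 for k in 0 .. n - 1, as in the equation.
def totalCostSketch (n : Nat) (σ : Nat → Nat) : Nat :=
  ((List.range n).map (fun k => unintegratedSketch n σ (k + 1))).foldl (· + ·) 0

-- Pairs are (crossed, nested) for n = 2, 3, 4.
#eval ([2, 3, 4] : List Nat).map fun n =>
  (totalCostSketch n (fun i => i), totalCostSketch n (fun i => n - 1 - i))
-- [(1, 2), (3, 6), (6, 12)]
```

These values agree with the level2_costs, level3_costs, and level4_costs statements on this page.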

theorem BachBrownMarslenWilson1986.level2_costs :

totalIntegrationCost (Features.VerbClusterBinding.identity 2) = 1 ∧ totalIntegrationCost (Features.VerbClusterBinding.reverse 2) = 2

Level 2 (n=2): minimal gap (1 vs 2).

theorem BachBrownMarslenWilson1986.level3_costs :

totalIntegrationCost (Features.VerbClusterBinding.identity 3) = 3 ∧ totalIntegrationCost (Features.VerbClusterBinding.reverse 3) = 6

Level 3 (n=3): gap widens (3 vs 6).

theorem BachBrownMarslenWilson1986.level4_costs :

totalIntegrationCost (Features.VerbClusterBinding.identity 4) = 6 ∧ totalIntegrationCost (Features.VerbClusterBinding.reverse 4) = 12

Level 4 (n=4): gap widens further (6 vs 12).

theorem BachBrownMarslenWilson1986.gap_grows :

totalIntegrationCost (Features.VerbClusterBinding.reverse 3) - totalIntegrationCost (Features.VerbClusterBinding.identity 3) > totalIntegrationCost (Features.VerbClusterBinding.reverse 2) - totalIntegrationCost (Features.VerbClusterBinding.identity 2)

The absolute cost gap grows with embedding depth.

theorem BachBrownMarslenWilson1986.crossed_lt_nested (n : ℕ) (h : n ≥ 2) :

totalIntegrationCost (Features.VerbClusterBinding.identity n) < totalIntegrationCost (Features.VerbClusterBinding.reverse n)

Crossed is strictly cheaper for n ≥ 2.

Proof by element-wise comparison via Finset.sum_lt_sum: at each verb position k ∈ {1,…,n}, unintegratedCount (identity n) ≤ unintegratedCount (reverse n), with strict inequality at k = 1 (the first verb heard).
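
The element-wise comparison can be seen on a concrete case. The following core Lean 4 sketch lists the unintegrated counts at each verb position for n = 4; pendingAt is a hypothetical helper that assumes the same criterion as the module docstring, not Linglib's unintegratedCount itself.

```lean
-- Assumption: NP i is pending after k verbs iff k ≤ max (σ 0) (σ i).
def pendingAt (n : Nat) (σ : Nat → Nat) (k : Nat) : Nat :=
  ((List.range n).filter (fun i => decide (k ≤ max (σ 0) (σ i)))).length

-- Verb positions k = 1, 2, 3, 4 for n = 4.
#eval (List.range 4).map (fun k => pendingAt 4 (fun i => i) (k + 1))
-- [3, 2, 1, 0]   identity (crossed)
#eval (List.range 4).map (fun k => pendingAt 4 (fun i => 4 - 1 - i) (k + 1))
-- [4, 4, 4, 0]   reverse (nested)
```

Each crossed entry is at most the nested entry at the same position, and the first position is strictly smaller (3 < 4), which is the shape of hypothesis that a strict sum inequality needs.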

def BachBrownMarslenWilson1986.totalNPVerbDist {n : ℕ} (σ : Features.VerbClusterBinding n) :

ℕ

Total NP-verb dependency length across all n pairs.

Equations

BachBrownMarslenWilson1986.totalNPVerbDist σ = ∑ i ∈ Finset.range n, if hi : i < n then Features.VerbClusterBinding.npVerbDist n σ ⟨i, hi⟩ else 0

theorem BachBrownMarslenWilson1986.dep_length_equal (n : ℕ) :

totalNPVerbDist (Features.VerbClusterBinding.identity n) = totalNPVerbDist (Features.VerbClusterBinding.reverse n)

General case: both patterns yield total distance n².

theorem BachBrownMarslenWilson1986.formal_processing_dissociation :

totalIntegrationCost (Features.VerbClusterBinding.identity 3) < totalIntegrationCost (Features.VerbClusterBinding.reverse 3)

Crossed dependencies are formally harder (mildly context-sensitive) but psycholinguistically easier — formal complexity ≠ processing complexity.

Two independent arguments against PDA parsing:

1. Dutch is comprehensible at Level 2 despite requiring mildly context-sensitive (MCS) power (a PDA cannot handle crossed dependencies at any depth).
2. Dutch is easier than German at Level 3+ (a PDA predicts that nested should be easier or equal).

theorem BachBrownMarslenWilson1986.cost_differs_despite_equal_dep_length :

totalIntegrationCost (Features.VerbClusterBinding.identity 3) < totalIntegrationCost (Features.VerbClusterBinding.reverse 3) ∧ totalNPVerbDist (Features.VerbClusterBinding.identity 3) = totalNPVerbDist (Features.VerbClusterBinding.reverse 3)

Integration cost difference is NOT explained by dependency length.

inductive BachBrownMarslenWilson1986.LangGroup :

Language group. German was tested with two verb-form versions (infinitive and past participle) due to normative disagreement among informants.

dutch : LangGroup
germanInf : LangGroup
germanPart : LangGroup

@[implicit_reducible]

instance BachBrownMarslenWilson1986.instDecidableEqLangGroup :

DecidableEq LangGroup

Equations

BachBrownMarslenWilson1986.instDecidableEqLangGroup x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

@[implicit_reducible]

instance BachBrownMarslenWilson1986.instReprLangGroup :

Repr LangGroup

Equations

BachBrownMarslenWilson1986.instReprLangGroup = { reprPrec := BachBrownMarslenWilson1986.instReprLangGroup.repr }

def BachBrownMarslenWilson1986.instReprLangGroup.repr :

LangGroup → ℕ → Std.Format

def BachBrownMarslenWilson1986.testRating :

LangGroup → Fin 4 → ℕ

Test sentence comprehensibility ratings × 100 (Table 1). Original scale: 1 = easy, 9 = hard. Levels 1–4 indexed 0–3.

def BachBrownMarslenWilson1986.paraRating :

LangGroup → Fin 3 → ℕ

Paraphrase sentence ratings × 100 (Table 1, Levels 2–4 indexed 0–2). Paraphrases express the same propositions using right-branching structure, controlling for propositional complexity.

def BachBrownMarslenWilson1986.testComprehension :

LangGroup → Fin 2 → ℕ

Comprehension accuracy × 100 for Test sentences (Table 3). Questions tested whether each subject NP was correctly associated with its predicate verb phrase. Levels 2–3 indexed 0–1.

def BachBrownMarslenWilson1986.comprehensionByNP :

LangGroup → Fin 3 → ℕ

Comprehension accuracy × 100 by NP position at Level 3, Test (Table 4). NP1 = matrix subject (highest clause), NP3 = most deeply embedded.

def BachBrownMarslenWilson1986.errorDiffByNP :

LangGroup → Fin 3 → ℕ

Test−Paraphrase error rate difference × 100 by NP at Level 3 (Table 5). Higher = more syntactic disruption (Test harder relative to Paraphrase).

theorem BachBrownMarslenWilson1986.level2_german_part_similar :

testRating LangGroup.germanPart 1 - testRating LangGroup.dutch 1 ≤ 30

At Level 2, German/Participle does not differ from Dutch (spread = 29 on the ×100 scale, i.e., 0.29 rating points). German/Infinitive is slightly worse throughout (spread = 43). The paper reports a significant overall Ger/Inf disadvantage but no difference for Ger/Part vs Dutch at Level 2.
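
One point worth keeping in mind when reading rating comparisons stated over ℕ: subtraction on natural numbers in Lean truncates at zero, so a bound of the form a - b ≤ 30 also holds whenever a < b. The sketch below shows this in core Lean 4. The two values are made-up placeholders chosen only to trigger truncation; they are not ratings from the paper or from Linglib.

```lean
-- Placeholder values only; these are NOT data from the paper.
def placeholderA : Nat := 250
def placeholderB : Nat := 300

-- Natural-number subtraction truncates at zero.
#eval placeholderA - placeholderB                  -- 0
-- Casting to Int keeps the sign of the difference.
#eval (placeholderA : Int) - (placeholderB : Int)  -- -50
-- So a bound like `a - b ≤ 30` is satisfied trivially when a < b.
#eval decide (placeholderA - placeholderB ≤ 30)    -- true
```

Given the reported spread of 29, the theorem above is presumably not in the truncating case, but the same caveat applies to any difference-of-differences stated over ℕ.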

theorem BachBrownMarslenWilson1986.level3_dutch_easier_rating :

testRating LangGroup.dutch 2 < testRating LangGroup.germanInf 2 ∧ testRating LangGroup.dutch 2 < testRating LangGroup.germanPart 2

At Level 3, Dutch participants rate Test sentences as easier than both German groups do.

theorem BachBrownMarslenWilson1986.level3_dutch_better_comprehension :

testComprehension LangGroup.dutch 1 > testComprehension LangGroup.germanInf 1 ∧ testComprehension LangGroup.dutch 1 > testComprehension LangGroup.germanPart 1

At Level 3, Dutch comprehension accuracy exceeds both German groups.

theorem BachBrownMarslenWilson1986.syntactic_effect_grows_faster_for_german :

have dutch_l2 := testRating LangGroup.dutch 1 - paraRating LangGroup.dutch 0; have dutch_l3 := testRating LangGroup.dutch 2 - paraRating LangGroup.dutch 1; have gerInf_l2 := testRating LangGroup.germanInf 1 - paraRating LangGroup.germanInf 0; have gerInf_l3 := testRating LangGroup.germanInf 2 - paraRating LangGroup.germanInf 1; have gerPart_l2 := testRating LangGroup.germanPart 1 - paraRating LangGroup.germanPart 0; have gerPart_l3 := testRating LangGroup.germanPart 2 - paraRating LangGroup.germanPart 1; gerInf_l3 - dutch_l3 > gerInf_l2 - dutch_l2 ∧ gerPart_l3 - dutch_l3 > gerPart_l2 - dutch_l2

The syntactic complexity effect (Test − Paraphrase) grows faster for both German groups than Dutch from Level 2 to Level 3, paralleling the model's prediction that the integration cost gap grows with depth.

NP2 (middle NP) is hardest for all three groups (Table 4, Test). This is an interference effect: NP2 is distinguished by neither primacy (NP1) nor recency (NP3), making it hardest to retrieve regardless of the dependency pattern.

theorem BachBrownMarslenWilson1986.dutch_np3_advantage :

errorDiffByNP LangGroup.dutch 2 = 0 ∧ errorDiffByNP LangGroup.germanInf 2 > 0 ∧ errorDiffByNP LangGroup.germanPart 2 > 0

Dutch advantage is largest for NP3 (most deeply embedded clause).

Dutch shows zero Test−Paraphrase error difference for NP3 (errorDiffByNP .dutch 2 = 0), while both German groups show substantial differences (41, 36). The paper explains: in Dutch, NP3's verb (V₃) arrives last and integrates into an already-built matrix structure. In German, NP3's verb (V₃) arrives first: the proposition is immediately parseable but floats without a matrix root, so the information decays before it can be used.
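
The floating interval described here can be read off the integration criterion. In the core Lean 4 sketch below, floatSteps is a hypothetical measure introduced for illustration (it is neither from the paper nor from Linglib): the number of further verbs that must be heard after NP i's own verb before NP i can be matrix-integrated, computed as max(σ(0), σ(i)) − σ(i).

```lean
-- Hypothetical measure: verbs still to come between NP i's own verb
-- (position σ i) and the point where NP i can be matrix-integrated.
def floatSteps (σ : Nat → Nat) (i : Nat) : Nat :=
  max (σ 0) (σ i) - σ i

-- n = 3; indices 0, 1, 2 correspond to NP1, NP2, NP3.
#eval (List.range 3).map (floatSteps (fun i => i))          -- [0, 0, 0]
#eval (List.range 3).map (floatSteps (fun i => 3 - 1 - i))  -- [0, 1, 2]
```

Under this reading, no proposition floats in the crossed order, while in the nested order the wait grows with embedding depth and is longest for NP3, in line with the NP3 contrast in Table 5.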

theorem BachBrownMarslenWilson1986.pda_refuted :

totalIntegrationCost (Features.VerbClusterBinding.identity 3) < totalIntegrationCost (Features.VerbClusterBinding.reverse 3) ∧ testRating LangGroup.dutch 2 < testRating LangGroup.germanInf 2 ∧ testComprehension LangGroup.dutch 1 > testComprehension LangGroup.germanInf 1

A push-down automaton (PDA) parsing model predicts that difficulty should track the Chomsky hierarchy: context-free patterns (nested) should be easier than mildly-context-sensitive patterns (crossed), because PDAs can parse the former but not the latter (@cite{evers-1975}).

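The data refute this: crossed is empirically easier at n ≥ 3. This is the paper's central argument against stack-based parsing. The theorem itself checks only Level 3, comparing Dutch against German/Infinitive on ratings and comprehension accuracy alongside the model's cost inequality.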

theorem BachBrownMarslenWilson1986.model_matches_data :

totalIntegrationCost (Features.VerbClusterBinding.identity 3) < totalIntegrationCost (Features.VerbClusterBinding.reverse 3) ∧ testRating LangGroup.dutch 2 < testRating LangGroup.germanInf 2 ∧ testComprehension LangGroup.dutch 1 > testComprehension LangGroup.germanInf 1 ∧ totalNPVerbDist (Features.VerbClusterBinding.identity 3) = totalNPVerbDist (Features.VerbClusterBinding.reverse 3)

The model predicts crossed < nested, the data confirms it, and dependency length cannot explain the difference.