
Linglib.Phenomena.Phonology.Studies.Magri2025

@cite{magri-2025}: Constraint Interaction in Probabilistic Phonology #

Replication of @cite{magri-2025} "Constraint Interaction in Probabilistic Phonology: Deducing Maximum Entropy Grammars from Hayes and Zuraw's Shifted Sigmoids Generalization" (Linguistic Inquiry, Early Access).

Main result #

Within harmony-based probabilistic phonology, an n-ary harmony function predicts the shifted-sigmoids generalization (HZ) of Hayes and Zuraw (@cite{zuraw-hayes-2017}; @cite{hayes-2022}) if and only if the harmony is separable, that is, it decomposes as ∏ₖ hₖ(Cₖ)^{wₖ}. Since MaxEnt (ME) harmony is separable (each hₖ = exp(−·)), ME predicts HZ as a corollary. Conversely, any separable harmony can be construed as ME through the constraint rescaling Ĉₖ = −log hₖ(Cₖ), so the characterization is complete.
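The separability of ME harmony can be checked in isolation. The following is a minimal sketch for two constraints, assuming Mathlib is available; the variable names are illustrative and nothing here is taken from the Linglib development. It shows that exp(−(w₁c₁ + w₂c₂)) factors into per-constraint terms exp(−cₖ) raised to the weights, which is the separable form with hₖ = exp(−·).

```lean
import Mathlib

-- Illustrative two-constraint case (names are hypothetical, not Linglib's).
-- The exponent `^` below is real-valued power (rpow), since w₁ w₂ : ℝ.
example (w₁ w₂ c₁ c₂ : ℝ) :
    Real.exp (-(w₁ * c₁ + w₂ * c₂)) =
      Real.exp (-c₁) ^ w₁ * Real.exp (-c₂) ^ w₂ := by
  -- Fold each power back into an exponential, then merge the product.
  rw [← Real.exp_mul, ← Real.exp_mul, ← Real.exp_add]
  -- The goal is now an equality of exponents.
  congr 1
  ring
```

The general n-ary statement would replace the two factors with a finite product; this sketch only covers the binary case.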

Formalization #

This study file instantiates @cite{magri-2025}'s theory with the Tagalog nasal substitution case study from the paper, verifying:

The six constraints satisfy ConstraintIndependence
The violation differences inherit independence (ViolDiffIndependence)
ME predicts HZ's constant logit-rate difference identity
The identity holds for any weight assignment (not just specific values)

The 2×2 square data and constraint inventory come from Phenomena/Phonology/Studies/ZurawHayes2017.lean (Magri 2025 inherits the sub-square setup from Z&H 2017).

§ 1: Constraint Independence #

Here the constraint violation profiles are viewed as functions on underlying forms (ignoring the candidate dimension, since we work with violation differences Δₖ). For the independence check, we verify that each raw constraint is insensitive to at least one dimension of the square (row or column).

theorem Magri2025.nasSub_insensitive_to_row (o : ZurawHayes2017.NasalSubOutput) :

ZurawHayes2017.nasSub.eval (ZurawHayes2017.NasalSubInput.mang_b, o) = ZurawHayes2017.nasSub.eval (ZurawHayes2017.NasalSubInput.pang_b, o) ∧ ZurawHayes2017.nasSub.eval (ZurawHayes2017.NasalSubInput.mang_k, o) = ZurawHayes2017.nasSub.eval (ZurawHayes2017.NasalSubInput.pang_k, o)

C₁ = NasSub is insensitive to the prefix (row dimension): the violation is 1 for NO and 0 for YES regardless of prefix. Per @cite{zuraw-hayes-2017} ex. (3) (NasSub is the markedness driver against nasal+obstruent sequences).
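To make the shape of these insensitivity statements concrete, here is a self-contained toy model. The types `ToyInput` and `ToyOutput` and the function `toyNasSub` are hypothetical stand-ins, not the definitions in ZurawHayes2017; the sketch only assumes what the text states, namely that NasSub assigns 1 violation to NO and 0 to YES regardless of the input.

```lean
-- Hypothetical stand-ins for the four cells of the square and the two outputs.
inductive ToyInput where
  | mang_b
  | mang_k
  | pang_b
  | pang_k

inductive ToyOutput where
  | yes
  | no

-- Toy NasSub: one violation for NO, none for YES, whatever the input.
def toyNasSub : ToyInput × ToyOutput → Nat
  | (_, .no) => 1
  | (_, .yes) => 0

-- Row insensitivity: swapping the prefix (maŋ vs. paŋ) never changes the count.
example (o : ToyOutput) :
    toyNasSub (.mang_b, o) = toyNasSub (.pang_b, o) ∧
    toyNasSub (.mang_k, o) = toyNasSub (.pang_k, o) := by
  cases o <;> exact ⟨rfl, rfl⟩
```

Because the toy constraint ignores its input entirely, both conjuncts hold by computation once the output is fixed.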

theorem Magri2025.starNC_insensitive_to_row (o : ZurawHayes2017.NasalSubOutput) :

ZurawHayes2017.starNC.eval (ZurawHayes2017.NasalSubInput.mang_b, o) = ZurawHayes2017.starNC.eval (ZurawHayes2017.NasalSubInput.pang_b, o) ∧ ZurawHayes2017.starNC.eval (ZurawHayes2017.NasalSubInput.mang_k, o) = ZurawHayes2017.starNC.eval (ZurawHayes2017.NasalSubInput.pang_k, o)

C₂ = *NC is insensitive to the prefix. Per @cite{zuraw-2010} ex. (17): "*NC: A [+nasal] segment must not be immediately followed by a [-voice, -sonorant] segment".

theorem Magri2025.starStemVelar_insensitive_to_row (o : ZurawHayes2017.NasalSubOutput) :

ZurawHayes2017.starStemVelar.eval (ZurawHayes2017.NasalSubInput.mang_b, o) = ZurawHayes2017.starStemVelar.eval (ZurawHayes2017.NasalSubInput.pang_b, o) ∧ ZurawHayes2017.starStemVelar.eval (ZurawHayes2017.NasalSubInput.mang_k, o) = ZurawHayes2017.starStemVelar.eval (ZurawHayes2017.NasalSubInput.pang_k, o)

C₃ = *[stemŋ] is insensitive to the prefix.

theorem Magri2025.starStemVelarCoronal_insensitive_to_row (o : ZurawHayes2017.NasalSubOutput) :

ZurawHayes2017.starStemVelarCoronal.eval (ZurawHayes2017.NasalSubInput.mang_b, o) = ZurawHayes2017.starStemVelarCoronal.eval (ZurawHayes2017.NasalSubInput.pang_b, o) ∧ ZurawHayes2017.starStemVelarCoronal.eval (ZurawHayes2017.NasalSubInput.mang_k, o) = ZurawHayes2017.starStemVelarCoronal.eval (ZurawHayes2017.NasalSubInput.pang_k, o)

C₄ = *[stemŋ]/n is insensitive to the prefix.

theorem Magri2025.unifMang_insensitive_to_col (o : ZurawHayes2017.NasalSubOutput) :

ZurawHayes2017.unifMang.eval (ZurawHayes2017.NasalSubInput.mang_b, o) = ZurawHayes2017.unifMang.eval (ZurawHayes2017.NasalSubInput.mang_k, o) ∧ ZurawHayes2017.unifMang.eval (ZurawHayes2017.NasalSubInput.pang_b, o) = ZurawHayes2017.unifMang.eval (ZurawHayes2017.NasalSubInput.pang_k, o)

C₅ = UNIF(maŋ) is insensitive to the stem-initial obstruent (column).

theorem Magri2025.unifPang_insensitive_to_col (o : ZurawHayes2017.NasalSubOutput) :

ZurawHayes2017.unifPang.eval (ZurawHayes2017.NasalSubInput.mang_b, o) = ZurawHayes2017.unifPang.eval (ZurawHayes2017.NasalSubInput.mang_k, o) ∧ ZurawHayes2017.unifPang.eval (ZurawHayes2017.NasalSubInput.pang_b, o) = ZurawHayes2017.unifPang.eval (ZurawHayes2017.NasalSubInput.pang_k, o)

C₆ = UNIF(paŋ) is insensitive to the stem-initial obstruent.

theorem Magri2025.constraint_independence (o : ZurawHayes2017.NasalSubOutput) :

Core.Constraint.ConstraintIndependence (fun (k : Fin 6) (x : ZurawHayes2017.NasalSubInput) => (ZurawHayes2017.constraints k).eval (x, o)) ZurawHayes2017.nasalSubSquare

Constraint independence: for each fixed output, the six constraints satisfy ConstraintIndependence on the nasal substitution square.

C₁–C₄ (markedness) are insensitive to row (prefix); C₅–C₆ (faithfulness) are insensitive to column (stem obstruent).

§ 2: Violation Difference Consistency #

theorem Magri2025.violDiff_consistent (k : Fin 6) (x : ZurawHayes2017.NasalSubInput) :

ZurawHayes2017.violDiffProfile k x = ↑((ZurawHayes2017.constraints k).eval (x, ZurawHayes2017.NasalSubOutput.no)) - ↑((ZurawHayes2017.constraints k).eval (x, ZurawHayes2017.NasalSubOutput.yes))

The violation differences are consistent with the raw constraint profiles: Δₖ(x) = Cₖ(x, NO) − Cₖ(x, YES).
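The coercions (↑) in the statement matter because a difference of violation counts can be negative. As a small illustration, assuming the raw counts are natural numbers (an assumption; the evaluation type is not shown on this page), subtraction on `Nat` truncates at zero while the same difference taken after coercion to `Int` does not:

```lean
-- Natural-number subtraction truncates: 0 - 1 = 0.
example : (0 : Nat) - 1 = 0 := by decide

-- After coercing each count to Int, the difference is the expected -1.
example : ((0 : Nat) : Int) - ((1 : Nat) : Int) = -1 := by decide
```

This is why Δₖ(x) is formed from coerced values rather than by subtracting the raw counts directly.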

§ 3: ME Predicts HZ #

theorem Magri2025.me_predicts_hz_tagalog (w : Fin 6 → ℝ) :

Core.Constraint.ConstantLogitDiff (fun (x : ZurawHayes2017.NasalSubInput) => ∑ k : Fin 6, w k * ZurawHayes2017.deltaR k x) ZurawHayes2017.nasalSubSquare

ME predicts HZ for Tagalog nasal substitution: for any weight assignment w : Fin 6 → ℝ, the MaxEnt logit rates of nasal substitution satisfy the constant-difference identity.

LR(/maŋb/) − LR(/maŋk/) = LR(/paŋb/) − LR(/paŋk/)

This is a direct instantiation of me_predicts_hz with the Tagalog violation differences and their verified independence.
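The reason independence yields the identity can be sketched abstractly. In the following, `Bool` indexes the two rows and the two columns, and `rowPart`, `colPart` and `lr` are hypothetical names introduced only for this illustration (Mathlib is assumed). If the logit rate splits into a part that depends only on the row and a part that depends only on the column, the row parts cancel within each row difference:

```lean
import Mathlib

-- lr r c: logit rate in row r, column c (illustrative, not Linglib's API).
example (rowPart colPart : Bool → ℝ) (lr : Bool → Bool → ℝ)
    (h : ∀ r c, lr r c = rowPart r + colPart c) :
    lr true true - lr true false = lr false true - lr false false := by
  -- Expand every logit rate into its row and column parts.
  simp only [h]
  -- Both sides reduce to colPart true - colPart false.
  ring
```

In the Tagalog case, the row-only contributions come from C₅ and C₆ and the column-only contributions from the constraints insensitive to the prefix.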

§ 4: Concrete Logit-Rate Computations #

The logit rate is LR(x) = Σₖ wₖ · Δₖ(x). We verify the symbolic expressions for each cell.

theorem Magri2025.logitRate_mang_b (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.mang_b) = w 0 - w 4

LR(/maŋb/) = w₁ − w₅

theorem Magri2025.logitRate_mang_k (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.mang_k) = w 0 + w 1 - w 2 - w 3 - w 4

LR(/maŋk/) = w₁ + w₂ − w₃ − w₄ − w₅

theorem Magri2025.logitRate_pang_b (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.pang_b) = w 0 - w 5

LR(/paŋb/) = w₁ − w₆

theorem Magri2025.logitRate_pang_k (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.pang_k) = w 0 + w 1 - w 2 - w 3 - w 5

LR(/paŋk/) = w₁ + w₂ − w₃ − w₄ − w₆

theorem Magri2025.hz_constant_value (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.mang_b) - ∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.mang_k) = -w 1 + w 2 + w 3

The constant logit-rate difference equals −w₂ + w₃ + w₄ for both rows, for any weight assignment (the theorem statements use 0-based indices, so there it reads -w 1 + w 2 + w 3). This follows from the insensitivity structure of the six constraints (§ 1).

Note that w 2 and w 3 (w₃ and w₄ in the 1-based prose numbering) are not separately identifiable from the b-vs-k square data: only their sum w 2 + w 3 matters here, since *[stemŋ] and *[stemŋ]/n coincide on the b/k restriction.
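The constant value can be re-derived directly from the four symbolic logit rates stated above. The sketch below assumes Mathlib and uses the same 0-based weight indices as the theorem statements; it does not depend on any Linglib definitions.

```lean
import Mathlib

-- Top row: LR(/maŋb/) − LR(/maŋk/).
example (w : Fin 6 → ℚ) :
    (w 0 - w 4) - (w 0 + w 1 - w 2 - w 3 - w 4) = -w 1 + w 2 + w 3 := by
  ring

-- Bottom row: LR(/paŋb/) − LR(/paŋk/).
example (w : Fin 6 → ℚ) :
    (w 0 - w 5) - (w 0 + w 1 - w 2 - w 3 - w 5) = -w 1 + w 2 + w 3 := by
  ring
```

In both rows the prefix-specific weight (w 4 or w 5) and w 0 cancel, leaving only the weights of the constraints that distinguish the two columns.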

theorem Magri2025.hz_constant_value' (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.pang_b) - ∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.pang_k) = -w 1 + w 2 + w 3

theorem Magri2025.hz_identity_concrete (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.mang_b) - ∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.mang_k) = ∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.pang_b) - ∑ k : Fin 6, w k * ↑(ZurawHayes2017.violDiffProfile k ZurawHayes2017.NasalSubInput.pang_k)

The HZ identity verified concretely: both row-differences are equal.

§ 5: Empirical Rate Verification #

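The empirical rates satisfy HZ's identity to good approximation. The exact identity is logit(R(tl)) − logit(R(tr)) = logit(R(bl)) − logit(R(br)), where tl, tr, bl and br are the top-left, top-right, bottom-left and bottom-right cells of the square. Since the rates are rational, we verify the approximate version by comparing the corresponding odds ratios.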

theorem Magri2025.rate_pos (x : ZurawHayes2017.NasalSubInput) :

0 < ZurawHayes2017.nasalSubRate x

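Rates lie strictly between 0 and 1, so the odds below are well defined: rate_pos gives the lower bound and rate_lt_one the upper bound.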

theorem Magri2025.rate_lt_one (x : ZurawHayes2017.NasalSubInput) :

ZurawHayes2017.nasalSubRate x < 1

theorem Magri2025.top_row_odds_ratio :

ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.mang_b * (1 - ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.mang_k) / (ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.mang_k * (1 - ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.mang_b)) = 6412 / 83412

Odds ratio for the top row: (916/1000)·(7/1000) / ((993/1000)·(84/1000)) = 916·7 / (993·84) = 6412 / 83412.

theorem Magri2025.bottom_row_odds_ratio :

ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.pang_b * (1 - ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.pang_k) / (ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.pang_k * (1 - ZurawHayes2017.nasalSubRate ZurawHayes2017.NasalSubInput.pang_b)) = 39494 / 514494

Odds ratio for the bottom row: (434/1000)·(91/1000) / ((909/1000)·(566/1000)) = 434·91 / (909·566) = 39494 / 514494.

theorem Magri2025.odds_ratios_close :

6412 * 514494 = 3298935528 ∧ 39494 * 83412 = 3294273528

The two odds ratios are close: 6412/83412 ≈ 0.0769 and 39494/514494 ≈ 0.0768. The cross-products in odds_ratios_close, 3298935528 and 3294273528, differ by about 0.14%, which confirms HZ's empirical observation to good approximation. Exact equality of the two ratios would mean logit(R(tl)) − logit(R(tr)) = logit(R(bl)) − logit(R(br)) exactly.
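The arithmetic behind the two odds ratios and their closeness can be replayed on plain rationals. This is a sketch assuming Mathlib; the numbers are copied from the statements above, and the 15/10000 bound is an illustrative threshold chosen here, not a figure from the paper.

```lean
import Mathlib

-- Top-row odds ratio.
example : (916 / 1000 : ℚ) * (7 / 1000) / ((993 / 1000) * (84 / 1000))
    = 6412 / 83412 := by norm_num

-- Bottom-row odds ratio.
example : (434 / 1000 : ℚ) * (91 / 1000) / ((909 / 1000) * (566 / 1000))
    = 39494 / 514494 := by norm_num

-- Relative gap between the cross-products is below 0.15%.
example : ((6412 : ℚ) * 514494 - 39494 * 83412) / (39494 * 83412)
    < 15 / 10000 := by norm_num
```

Comparing cross-products avoids logarithms altogether, which keeps the check inside exact rational arithmetic.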

§ 6: Separable Forward Direction #

theorem Magri2025.me_separable_predicts_hz_tagalog (w : Fin 6 → ℝ) :

Core.Constraint.ConstantLogitDiff (fun (x : ZurawHayes2017.NasalSubInput) => Real.log (((Core.Constraint.meSeparable 6 w).eval fun (k : Fin 6) => (ZurawHayes2017.constraints k).eval (x, ZurawHayes2017.NasalSubOutput.yes)) / (Core.Constraint.meSeparable 6 w).eval fun (k : Fin 6) => (ZurawHayes2017.constraints k).eval (x, ZurawHayes2017.NasalSubOutput.no))) ZurawHayes2017.nasalSubSquare

ME predicts HZ at the probability level: the log-probability-ratio log(P(YES|x)/P(NO|x)) under ME satisfies HZ's constant-difference identity for Tagalog nasal substitution, for any weight assignment.

This instantiates separable_predicts_hz with meSeparable and the Tagalog constraints. Since ME rescaling is the identity (meSeparable_rescale), the rescaled violation differences reduce to the raw violation differences, and violDiff_independence provides the independence hypothesis.
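The step from the probability ratio to a difference of weighted violation sums can be sketched on its own. Here `vYes` and `vNo` are hypothetical names for the weighted violation totals Σₖ wₖ·Cₖ(x, YES) and Σₖ wₖ·Cₖ(x, NO); Mathlib is assumed, and the sketch is independent of `meSeparable`.

```lean
import Mathlib

-- log of the ME harmony ratio equals the difference of weighted violation sums.
example (vYes vNo : ℝ) :
    Real.log (Real.exp (-vYes) / Real.exp (-vNo)) = vNo - vYes := by
  -- Merge the quotient into a single exponential, then cancel log ∘ exp.
  rw [← Real.exp_sub, Real.log_exp]
  ring
```

Since vNo − vYes = Σₖ wₖ·Δₖ(x), the log-ratio is the same linear expression in the violation differences that appears in § 3.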