Coetzee & Pater (2011): The Place of Variation in Phonological Theory
@cite{coetzee-pater-2011}
Handbook of Phonological Theory chapter comparing three frameworks for modeling phonological variation, illustrated with English t/d-deletion.
Models formalized
Partially Ordered Constraints (POC) (@cite{anttila-1997}, @cite{kiparsky-1993b}): grammar is a partial order on OT constraints. Each evaluation randomly samples a total order consistent with the partial order. Probability of an output = fraction of total orders that select it as optimal.
MaxEnt Harmonic Grammar (@cite{goldwater-johnson-2003}): constraints have numerical weights; candidate probability ∝ exp(harmony score). More expressive than POC — can encode arbitrary probability distributions over outputs.
Bridge: the OT limit theorem (maxent_ot_limit) shows that as α → ∞, MaxEnt recovers OT's categorical optimization, connecting the two frameworks.
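The concentration effect behind the bridge can be illustrated numerically. This is a minimal sketch, assuming that α acts as a scale factor on all harmony scores (the page does not define α). `maxent_probs` is a hypothetical helper, and the hypotheses of maxent_ot_limit itself are not reproduced here.

```python
import math

def maxent_probs(harmonies, alpha=1.0):
    """Softmax over harmony scores scaled by alpha (hypothetical helper)."""
    scaled = [alpha * h for h in harmonies]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# Pre-consonantal context with the AAVE weights quoted later on this page:
# retain violates *CT (weight 100.6), delete violates MAX (weight 99.4).
HARMONIES = [-100.6, -99.4]  # order: [retain, delete]

# Assumption: alpha multiplies every harmony score.
p_delete_by_alpha = {a: maxent_probs(HARMONIES, a)[1] for a in (1, 5, 50)}

for alpha, p in p_delete_by_alpha.items():
    print(alpha, p)
```

As α grows, the probability mass moves toward the candidate with the highest harmony, which is the categorical winner in this two-candidate example.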
Key results
POC deletion counts: 4! = 24 total rankings; pre-V deletion in 8, pre-pause in 8, pre-C in 12, matching @cite{coetzee-pater-2011} §3.2.
Factorial typology: exactly 5 distinct language types across all 24 rankings, matching the 5 crucial ranking classes in table (12).
Structural implication: every ranking that produces pre-V deletion also produces pre-C deletion. This is the formal basis for the cross-dialectal generalization pre-C ≥ pre-V.
Tejano' impossibility (§4.4): a hypothetical dialect with pre-C < pre-V rates cannot be generated by POC or Stochastic OT but CAN be generated by MaxEnt with negative weights — a concrete framework separation theorem.
Output form for t/d-deletion: either retain or delete.
Equations
- CoetzeePater2011.instDecidableEqTDOutput x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
- CoetzeePater2011.instReprTDOutput = { reprPrec := CoetzeePater2011.instReprTDOutput.repr }
A candidate pairs a phonological context with an output form.
- context : Fragments.English.TDDeletion.Context
- output : TDOutput
Equations
- CoetzeePater2011.instInhabitedTDCandidate = { default := { context := default, output := default } }
Candidates for a given context.
Equations
- CoetzeePater2011.candidatesFor ctx = [{ context := ctx, output := CoetzeePater2011.TDOutput.retain }, { context := ctx, output := CoetzeePater2011.TDOutput.delete }]
*CT (markedness): penalizes word-final consonant clusters ending in a coronal stop. Violated by the faithful (retaining) candidate. @cite{coetzee-pater-2011} example (11).
Equations
- CoetzeePater2011.starCT = Core.Constraint.OT.mkMark "*CT" fun (c : CoetzeePater2011.TDCandidate) => (c.output == CoetzeePater2011.TDOutput.retain) = true
MAX (faithfulness): penalizes deletion of an input consonant. Violated by the deleting candidate in all contexts. @cite{coetzee-pater-2011} example (11).
Equations
- CoetzeePater2011.maxC = Phonology.Constraints.mkMax "MAX" fun (c : CoetzeePater2011.TDCandidate) => (c.output == CoetzeePater2011.TDOutput.delete) = true
MAX-PRE-V (contextual faithfulness): penalizes deletion specifically in pre-vocalic position, where perceptual cues for t/d are maximal. @cite{coetzee-pater-2011} example (11).
MAX-FINAL (contextual faithfulness): penalizes deletion in phrase-final position, where consonantal release provides cues. @cite{coetzee-pater-2011} example (11).
The four constraints from the analysis.
*CT is a markedness constraint.
MAX is a faithfulness constraint.
MAX-PRE-V is a faithfulness constraint.
MAX-FINAL is a faithfulness constraint.
All violations are bounded by 1 (binary constraints).
Violation profile for a candidate under the 4 constraints. Order: [*CT, MAX, MAX-PRE-V, MAX-FINAL].
Equations
- CoetzeePater2011.violationProfile c = List.map (fun (con : Core.Constraint.OT.NamedConstraint CoetzeePater2011.TDCandidate) => con.eval c) CoetzeePater2011.constraints
Retain always violates *CT once and nothing else.
Delete in pre-C violates MAX only (no contextual faithfulness active).
Delete in pre-V violates MAX and MAX-PRE-V.
Delete pre-pausally violates MAX and MAX-FINAL.
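The four violation facts above can be sketched as a small function. This is an illustrative Python rendering, not the Lean definition: the function name and the string labels for contexts and outputs are assumptions.

```python
# Constraint order follows the page: [*CT, MAX, MAX-PRE-V, MAX-FINAL].
CONSTRAINTS = ["*CT", "MAX", "MAX-PRE-V", "MAX-FINAL"]

def violation_profile(context, output):
    """Binary violation vector for a (context, output) candidate.

    context is one of "preV", "pause", "preC" (labels assumed here);
    output is "retain" or "delete".
    """
    deleted = output == "delete"
    return [
        int(not deleted),                     # *CT: violated by retention
        int(deleted),                         # MAX: violated by any deletion
        int(deleted and context == "preV"),   # MAX-PRE-V: deletion before a vowel
        int(deleted and context == "pause"),  # MAX-FINAL: deletion before a pause
    ]

for ctx in ("preV", "pause", "preC"):
    print(ctx, violation_profile(ctx, "retain"), violation_profile(ctx, "delete"))
```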
Candidate set per context for POC: both retain and delete are available
in every context. Required by PartialOrderConstraints.pocPredict.
Equations
- CoetzeePater2011.tdCands x✝ = Finset.univ
Violation profile in POC's Input → Output → Fin n → ℕ shape.
Constraint indexing matches the `constraints` list: 0 = *CT, 1 = MAX,
2 = MAX-PRE-V, 3 = MAX-FINAL.
Equations
- CoetzeePater2011.tdVp x✝ CoetzeePater2011.TDOutput.retain ⟨0, isLt⟩ = 1
- CoetzeePater2011.tdVp x✝ CoetzeePater2011.TDOutput.delete ⟨1, isLt⟩ = 1
- CoetzeePater2011.tdVp Fragments.English.TDDeletion.Context.preV CoetzeePater2011.TDOutput.delete ⟨2, isLt⟩ = 1
- CoetzeePater2011.tdVp Fragments.English.TDDeletion.Context.pause CoetzeePater2011.TDOutput.delete ⟨3, isLt⟩ = 1
- CoetzeePater2011.tdVp x✝² x✝¹ x✝ = 0
@cite{coetzee-pater-2011} §3.2 explicitly adopts the POC framework: "the grammar states a partial, rather than a total, order on the constraint set. Each time the grammar is used to evaluate a candidate set, one of the total orders consistent with the partial order is randomly chosen." Equation (9): "If a candidate is selected as optimal in n of these rankings, then this candidate's probability of occurrence is n/t."
We formalize the §3.2 t/d-deletion analysis (table 10) using POC's
`pocPredict`. Deletion probabilities are derived in closed form by
combining the `picksAt_binary_iff_permDList_head_lt` bridge with the
substrate's `perm_filter_head_in_card`.
The discrete partial order (`PartialOrderConstraints.discrete 4`)
encodes "no rankings imposed" — uniform sampling over all 4! = 24
total orders.
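The sampling scheme quoted from §3.2 can be sketched by brute force: list the total orders consistent with a partial order, then take the share n/t in which deletion is optimal. The Python names below are hypothetical, and the second partial order is an invented illustration, not one proposed in the paper.

```python
from fractions import Fraction
from itertools import permutations

CONSTRAINTS = ("*CT", "MAX", "MAX-PRE-V", "MAX-FINAL")
# Constraints violated by the delete candidate; retain violates only *CT.
DELETE_VIOLATES = {
    "preV": ("MAX", "MAX-PRE-V"),
    "pause": ("MAX", "MAX-FINAL"),
    "preC": ("MAX",),
}

def consistent_orders(partial_order):
    """Total orders (highest-ranked first) respecting every (higher, lower) pair."""
    orders = []
    for order in permutations(CONSTRAINTS):
        rank = {c: i for i, c in enumerate(order)}
        if all(rank[hi] < rank[lo] for hi, lo in partial_order):
            orders.append(order)
    return orders

def poc_deletion_prob(partial_order, context):
    """n/t: share of consistent total orders in which delete is optimal."""
    orders = consistent_orders(partial_order)
    n = 0
    for order in orders:
        rank = {c: i for i, c in enumerate(order)}
        # Delete wins iff *CT outranks every constraint that delete violates.
        if all(rank["*CT"] < rank[c] for c in DELETE_VIOLATES[context]):
            n += 1
    return Fraction(n, len(orders))

discrete = set()                       # no rankings imposed: all 4! orders
hypothetical = {("MAX-PRE-V", "*CT")}  # illustrative only, not from the paper

for name, po in (("discrete", discrete), ("hypothetical", hypothetical)):
    print(name, len(consistent_orders(po)),
          {ctx: poc_deletion_prob(po, ctx) for ctx in DELETE_VIOLATES})
```

Enumeration is only practical because there are four constraints; the formalization instead derives the discrete-order rates in closed form.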
Probability that POC sampling selects deletion at context ctx,
under the discrete partial order.
Pre-vocalic deletion probability: 8/24 = 1/3.
Closed form via picksAt_rate_eq: count / 4! = |Y ∩ D| / |D|, where D is
the set of constraints that distinguish the two candidates and Y is the
set of constraints that favor delete. Here D = {*CT, MAX, MAX-PRE-V} and
Y ∩ D = {*CT}, giving 1/3.
Phrase-final deletion probability: 8/24 = 1/3. Same closed form as preV
with D = {*CT, MAX, MAX-FINAL}, Y ∩ D = {*CT}.
Pre-consonantal deletion probability: 12/24 = 1/2 (the highest, since
no positional faithfulness applies). D = {*CT, MAX}, Y ∩ D = {*CT} →
1/2.
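The three closed-form values can be cross-checked against a direct count over all 24 rankings. This is a sketch under the reading that D collects the constraints on which retain and delete differ and Y those that favor delete. The function names are hypothetical and do not correspond to the Lean lemmas.

```python
from fractions import Fraction
from itertools import permutations

def profile(context, output):
    """Violations in the order (*CT, MAX, MAX-PRE-V, MAX-FINAL)."""
    deleted = output == "delete"
    return (int(not deleted), int(deleted),
            int(deleted and context == "preV"),
            int(deleted and context == "pause"))

def picks_delete(ranking, context):
    """True if delete beats retain under `ranking` (constraint indices, highest first)."""
    ret, dele = profile(context, "retain"), profile(context, "delete")
    # OT evaluation: compare violations lexicographically down the ranking.
    return [dele[i] for i in ranking] < [ret[i] for i in ranking]

def enumerated_prob(context):
    rankings = list(permutations(range(4)))
    count = sum(picks_delete(r, context) for r in rankings)
    return Fraction(count, len(rankings))

def closed_form_prob(context):
    ret, dele = profile(context, "retain"), profile(context, "delete")
    D = [i for i in range(4) if ret[i] != dele[i]]  # distinguishing constraints
    Y = [i for i in D if dele[i] < ret[i]]          # those favoring delete
    return Fraction(len(Y), len(D))

for ctx in ("preV", "pause", "preC"):
    print(ctx, enumerated_prob(ctx), closed_form_prob(ctx))
```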
The cross-dialectal generalization: pre-C deletion rate exceeds pre-V and pre-pause rates. Direct consequence of the closed-form arithmetic — no enumeration.
Pre-V and pre-pause have equal deletion probabilities (1/3 each).
The factorial typology over Equiv.Perm (Fin 4) rankings, using POC's
PicksAt for the per-context deletion check. Per-σ pattern is
(PicksAt σ .preV .delete, PicksAt σ .pause .delete, PicksAt σ .preC .delete).
These are per-σ patterns (not head-in-Y predicates), so the substrate
doesn't apply — we use plain `decide` on the 24-perm `Equiv.Perm (Fin 4)`.
The deletion pattern (preV?, pause?, preC?) produced by ranking σ.
Distinct categorical dialect types over all 4! = 24 total orders.
Equations
- CoetzeePater2011.factorialTypes = Finset.image CoetzeePater2011.deletionPattern Finset.univ
The factorial typology has exactly 5 distinct language types, matching the 5 crucial ranking classes in table (12). Patterns are written (preV, pause, preC):
- a. (F,F,F): MAX >> *CT, no deletion
- b. (F,T,T): MAX-PRE-V >> *CT >> {MAX, MAX-FINAL}
- c. (T,F,T): MAX-FINAL >> *CT >> {MAX, MAX-PRE-V}
- d. (F,F,T): {MAX-PRE-V, MAX-FINAL} >> *CT >> MAX
- e. (T,T,T): *CT >> {MAX, MAX-PRE-V, MAX-FINAL}
Type a (F,F,F): 12 rankings — MAX >> *CT blocks all deletion.
Type b (F,T,T): 2 rankings.
Type c (T,F,T): 2 rankings.
Type d (F,F,T): 2 rankings.
Type e (T,T,T): 6 rankings — *CT >> {MAX, MAX-PRE-V, MAX-FINAL}.
No ranking produces the (T,F,F) pattern — deletion only in pre-V without pre-C deletion is impossible.
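The type counts above can be reproduced by tabulating the deletion pattern of every ranking. This is a brute-force Python sketch with hypothetical names, standing in for the `decide`-based check rather than mirroring it.

```python
from collections import Counter
from itertools import permutations

# Index convention from the page: 0 = *CT, 1 = MAX, 2 = MAX-PRE-V, 3 = MAX-FINAL.
DELETE_VIOLATES = {"preV": (1, 2), "pause": (1, 3), "preC": (1,)}

def picks_delete(ranking, context):
    """Delete is optimal iff *CT outranks every constraint that delete violates."""
    position = {c: i for i, c in enumerate(ranking)}  # smaller = higher ranked
    return all(position[0] < position[c] for c in DELETE_VIOLATES[context])

def deletion_pattern(ranking):
    """(preV?, pause?, preC?) for one total order."""
    return tuple(picks_delete(ranking, ctx) for ctx in ("preV", "pause", "preC"))

type_counts = Counter(deletion_pattern(r) for r in permutations(range(4)))

for pattern, count in sorted(type_counts.items()):
    print(pattern, count)
```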
Every ranking that produces pre-V deletion also produces pre-C deletion. This is the structural reason POC cannot generate reversed rates: pre-V deletion requires *CT >> MAX ∧ *CT >> MAX-PRE-V, which entails *CT >> MAX, the sole condition for pre-C deletion.
Similarly, every ranking producing pause deletion also produces pre-C deletion: *CT >> MAX ∧ *CT >> MAX-FINAL entails *CT >> MAX.
No ranking produces pre-V or pause deletion without also producing
pre-C deletion. The formal basis for the cross-dialectal generalization
P(del|preC) ≥ P(del|preV). Direct corollary of the closed-form
deletionProb_* theorems.
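One way to see why the inequality extends from single rankings to rates is a containment argument. The symbols below are notation introduced here for illustration and do not appear in the source: L is the set of total orders consistent with a given partial order, and A_V and A_C are the subsets of L that yield pre-V and pre-C deletion.

```latex
\begin{align*}
A_V &= \{\sigma \in L : {}^{*}\mathrm{CT} \gg_\sigma \mathrm{MAX}
        \ \wedge\ {}^{*}\mathrm{CT} \gg_\sigma \mathrm{MAX\text{-}PRE\text{-}V}\},\\
A_C &= \{\sigma \in L : {}^{*}\mathrm{CT} \gg_\sigma \mathrm{MAX}\},\\
A_V \subseteq A_C \;&\Longrightarrow\;
  P(\mathrm{del}\mid\mathrm{preV}) = \frac{|A_V|}{|L|}
  \;\le\; \frac{|A_C|}{|L|} = P(\mathrm{del}\mid\mathrm{preC}).
\end{align*}
```

The same argument with MAX-FINAL in place of MAX-PRE-V gives the pause case.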
Weighted version of the t/d-deletion constraints for MaxEnt. Weight parameterization enables dialect-specific fitting.
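MaxEnt harmony ordering is a decidable proxy for probability ordering:
for candidates a and b sharing a normalizer, H(a) > H(b) ⟺ P(a) > P(b) by monotonicity of exp.
The comparison also carries across contexts in this model: retain has the same harmony in every context, so P(delete | ctx) increases with H(delete | ctx).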
With AAVE weights from table (23) ME-HG row, deletion probability ranks pre-C > pause > pre-V. Weights are exact ℚ transcriptions of the one-decimal-place values reported in the paper: *CT = 100.6, MAX = 99.4, MAX-PRE-V = 2.1, MAX-FINAL = 0.2.
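The ranking of the three contexts can be checked with exact rational arithmetic. This is a sketch with hypothetical Python names; the weights are the four values quoted above, and harmony is taken as the negated weighted violation sum, as in the witness arithmetic elsewhere on this page.

```python
from fractions import Fraction as F

AAVE = {"*CT": F(1006, 10), "MAX": F(994, 10),
        "MAX-PRE-V": F(21, 10), "MAX-FINAL": F(2, 10)}

def harmony(weights, context, output):
    """H = -(sum of weights of violated constraints); all violations are binary."""
    if output == "retain":
        violated = ["*CT"]
    else:
        violated = ["MAX"]
        if context == "preV":
            violated.append("MAX-PRE-V")
        if context == "pause":
            violated.append("MAX-FINAL")
    return -sum(weights[c] for c in violated)

delete_harmony = {ctx: harmony(AAVE, ctx, "delete") for ctx in ("preC", "pause", "preV")}

# Retain has the same harmony in every context, so ordering the delete
# harmonies orders the deletion probabilities.
ranked = sorted(delete_harmony, key=delete_harmony.get, reverse=True)
print(ranked)
```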
With non-negative MAX-PRE-V weight, harmony of pre-C deletion ≥ pre-V deletion. Pre-V delete violates {MAX, MAX-PRE-V} while pre-C delete violates only {MAX}, so H(del|preC) - H(del|preV) = wMaxPreV ≥ 0.
Banning negative weights thus makes MaxEnt respect the same typological restriction as POC (@cite{coetzee-pater-2011} §4.4).
Analogously, non-negative MAX-FINAL weight ensures pre-C ≥ pause. H(del|preC) - H(del|pause) = wMaxFin ≥ 0.
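The two bounds follow from a short calculation, written out here for the pre-V case. The notation is introduced for illustration: w with a constraint subscript is that constraint's weight, and H(ret) is the harmony of retention, which is context-independent because retain violates only *CT.

```latex
\begin{align*}
H(\mathrm{del}\mid\mathrm{preC}) &= -w_{\mathrm{MAX}},\\
H(\mathrm{del}\mid\mathrm{preV}) &= -\bigl(w_{\mathrm{MAX}} + w_{\mathrm{MAX\text{-}PRE\text{-}V}}\bigr),\\
H(\mathrm{del}\mid\mathrm{preC}) - H(\mathrm{del}\mid\mathrm{preV})
  &= w_{\mathrm{MAX\text{-}PRE\text{-}V}} \;\ge\; 0,\\
P(\mathrm{del}\mid c) &= \frac{1}{1 + \exp\bigl(H(\mathrm{ret}) - H(\mathrm{del}\mid c)\bigr)},
  \qquad H(\mathrm{ret}) = -w_{{}^{*}\mathrm{CT}}.
\end{align*}
```

Since P(del | c) is increasing in H(del | c), the harmony inequality transfers to the deletion probabilities. Replacing MAX-PRE-V by MAX-FINAL gives the pause case.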
Tejano' is a hypothetical dialect with reversed pre-C/pre-V rates: lowest deletion in pre-consonantal position. It is created by swapping Tejano's pre-V (25%) and pre-C (62%) rates, so Tejano' has pre-V at 62% and pre-C at 25%. @cite{coetzee-pater-2011} §4.4.
Tejano' has reversed rates: pre-V > pre-C.
POC cannot generate Tejano': every ranking that produces pre-V deletion also produces pre-C deletion (§7), so P(del|preC) ≥ P(del|preV) for any POC grammar over these 4 constraints.
MaxEnt CAN generate Tejano' with negative MAX-PRE-V weight: when MAX-PRE-V has a negative weight, violating it helps the candidate, rewarding deletion in pre-vocalic position.
Witness from @cite{coetzee-pater-2011} table (23) Tejano' ME-HG row: *CT = 99.4, MAX = 100.6, MAX-PRE-V = −1.6, MAX-FINAL = −0.8. These are the weights the paper's MaxEnt fitting procedure learned on Tejano' frequencies.
Witness arithmetic:
- Pre-V delete: H = −(100.6·1 + (−1.6)·1) = −99
- Pre-V retain: H = −(99.4·1) = −99.4 → delete > retain ✓
- Pre-C delete: H = −(100.6·1) = −100.6
- Pre-C retain: H = −(99.4·1) = −99.4 → retain > delete ✓ (REVERSED)
@cite{coetzee-pater-2011} §4.4, table (23)
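The witness arithmetic can be replayed exactly, and the implied two-candidate probabilities computed alongside it. This is a sketch with hypothetical Python names: harmonies use exact rationals and probabilities use floating point.

```python
import math
from fractions import Fraction as F

# Tejano' ME-HG weights as quoted above.
TEJANO_PRIME = {"*CT": F(994, 10), "MAX": F(1006, 10),
                "MAX-PRE-V": F(-16, 10), "MAX-FINAL": F(-8, 10)}

def harmony(weights, context, output):
    """H = -(sum of weights of violated constraints)."""
    if output == "retain":
        violated = ["*CT"]
    else:
        violated = ["MAX"]
        if context == "preV":
            violated.append("MAX-PRE-V")
        if context == "pause":
            violated.append("MAX-FINAL")
    return -sum(weights[c] for c in violated)

def p_delete(weights, context):
    """Two-candidate MaxEnt probability of deletion (logistic of the harmony gap)."""
    gap = harmony(weights, context, "delete") - harmony(weights, context, "retain")
    return 1.0 / (1.0 + math.exp(-float(gap)))

for ctx in ("preV", "pause", "preC"):
    print(ctx, harmony(TEJANO_PRIME, ctx, "delete"), p_delete(TEJANO_PRIME, ctx))
```

The negative MAX-PRE-V weight lifts the pre-V delete harmony above the retain harmony, which no non-negative weight could do here.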
Framework separation: POC/StOT and MaxEnt have different typological predictions. POC cannot generate all patterns that MaxEnt can.
Left conjunct: POC always has pre-C ≥ pre-V (structural implication). Right conjunct: MaxEnt can achieve pre-V > pre-C (negative weights).
When MAX >> *CT, the categorical OT prediction is retention (no deletion) in all contexts.
When *CT >> all faithfulness, the categorical OT prediction is deletion in all contexts.
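Both categorical predictions can be checked with a small OT evaluator that picks the candidate whose violation vector, read down the ranking, is lexicographically smallest. The names below are hypothetical and the sketch is independent of the Lean definitions.

```python
from itertools import permutations

CONSTRAINTS = ("*CT", "MAX", "MAX-PRE-V", "MAX-FINAL")
CONTEXTS = ("preV", "pause", "preC")

def ot_winner(ranking, context):
    """Optimal output under a total ranking (highest-ranked constraint first)."""
    def ranked_profile(output):
        deleted = output == "delete"
        violations = {
            "*CT": int(not deleted),
            "MAX": int(deleted),
            "MAX-PRE-V": int(deleted and context == "preV"),
            "MAX-FINAL": int(deleted and context == "pause"),
        }
        return [violations[c] for c in ranking]
    # Lists compare lexicographically, matching strict domination in OT.
    return min(("retain", "delete"), key=ranked_profile)

max_over_ct = [r for r in permutations(CONSTRAINTS) if r.index("MAX") < r.index("*CT")]
ct_on_top = [r for r in permutations(CONSTRAINTS) if r[0] == "*CT"]

all_retain = all(ot_winner(r, c) == "retain" for r in max_over_ct for c in CONTEXTS)
all_delete = all(ot_winner(r, c) == "delete" for r in ct_on_top for c in CONTEXTS)
print(len(max_over_ct), all_retain, len(ct_on_top), all_delete)
```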
At each context, the AAVE MaxEnt model is a per-context
ConstraintSystem TDOutput ℝ: candidates = {retain, delete}, score =
harmony of (ctx, ·), decoder = softmaxDecoder 1. The two-candidate
softmax is the logistic function, so predict .delete is the genuine
conditional probability P(delete | ctx).
The AAVE constraint weights from table (23) ME-HG row.
Equations
- CoetzeePater2011.aaveWeights = CoetzeePater2011.mkWeightedConstraints (1006 / 10) (994 / 10) (21 / 10) (2 / 10)
The AAVE MaxEnt model at a fixed context, packaged as a generic
ConstraintSystem. With only two candidates, predict .delete is
the conditional probability P(delete | ctx).
In the pre-consonantal context, the AAVE system predicts deletion over retention. Conditional probability claim: P(delete | preC) > P(retain | preC).
In the pre-vocalic context, the AAVE system predicts retention over deletion: pre-V deletion violates both MAX and MAX-PRE-V, costing more than the *CT violation incurred by retention. Conditional probability claim: P(retain | preV) > P(delete | preV).
The per-context AAVE system is a probability distribution over
TDOutput. Generic property of softmaxDecoder.
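The distribution property and the logistic identity for two candidates can be illustrated numerically. This is a sketch with hypothetical Python helpers rather than the softmaxDecoder implementation; it uses temperature 1 and the AAVE pre-V harmonies implied by the weights above.

```python
import math

def softmax(scores):
    """Softmax over a dict of scores, with the usual max-shift for stability."""
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# AAVE, pre-vocalic context: retain violates *CT (100.6);
# delete violates MAX and MAX-PRE-V (99.4 + 2.1).
scores = {"retain": -100.6, "delete": -(99.4 + 2.1)}
dist = softmax(scores)

print(dist)
print(logistic(scores["delete"] - scores["retain"]))
```

With two candidates, the softmax probability of delete equals the logistic function of the harmony gap, which is why predict .delete can be read as P(delete | ctx).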