Linglib.Studies.CoetzeePater2011

Coetzee & Pater (2011): The Place of Variation in Phonological Theory #

[CP11b]

Handbook of Phonological Theory chapter comparing three frameworks for modeling phonological variation, illustrated with English t/d-deletion.

Models formalized #

  1. Partially Ordered Constraints (POC) ([Ant97], [Kip93]): grammar is a partial order on OT constraints. Each evaluation randomly samples a total order consistent with the partial order. Probability of an output = fraction of total orders that select it as optimal.

  2. MaxEnt Harmonic Grammar ([GJ03]): constraints have numerical weights; candidate probability ∝ exp(harmony score). More expressive than POC — can encode arbitrary probability distributions over outputs.

  3. Categorical OT bookends: max_dominates_implies_no_deletion and ct_dominates_implies_deletion exhibit the two extreme rankings whose categorical OT optima bracket the variation that POC and MaxEnt distribute over.

Key results #

Empirical data (tables 7 and 10) #

External phonological context following the word-final cluster. [CP11b] table (10), p. 410.

      English dialects with t/d-deletion data, per [CP11b] footnote 3: AAVE (Fasold 1972), Chicano (Santa Ana 1992), Jamaican (Patrick 1992), Trinidad (Kang 1994), Tejano (Bayley 1995).

          Observed deletion rate as a percentage (integer). From [CP11b] table (10), p. 410.

            Pre-consonantal deletion ≥ pre-vocalic in every dialect.

            Pre-consonantal deletion ≥ pre-pausal in every dialect (Chicano included: 62 ≥ 37).

            In all non-Chicano dialects, pre-pausal ≥ pre-vocalic.

            Morphological conditioning (table 7) #

            Morphological status of the word-final t/d. [CP11b] table (7), p. 407.

                Dialect labels for table (7) morphological data. Table (7) uses different dialects than table (10) — Philadelphia English replaces AAVE; Jamaican and Trinidad have no morph data.

                    Monomorpheme deletion ≥ regular past in every dialect with data.

                    Monomorpheme deletion ≥ semi-weak past in every dialect with data. Together with semiWeak_ge_regular, this discharges Guy 1991b's three-way ordering monomorpheme ≥ semi-weak ≥ regular past.

                    Candidate type #

                    Output form for t/d-deletion: either retain or delete.

                        A candidate pairs a phonological context with an output form.

                          def CoetzeePater2011.instDecidableEqTDCandidate.decEq (x✝ x✝¹ : TDCandidate) :
                          Decidable (x✝ = x✝¹)

                              Candidates for a given context.

                                Constraints (example 11) #

                                *CT (markedness): penalizes word-final consonant clusters ending in a coronal stop. Violated by the faithful (retaining) candidate. [CP11b] example (11).

                                  MAX (faithfulness): penalizes deletion of an input consonant. Violated by the deleting candidate in all contexts. [CP11b] example (11).

                                    MAX-PRE-V (contextual faithfulness): penalizes deletion specifically in pre-vocalic position, where perceptual cues for t/d are maximal. [CP11b] example (11).

                                      MAX-FINAL (contextual faithfulness): penalizes deletion in phrase-final position, where consonantal release provides cues. [CP11b] example (11).

                                        All violations are bounded by 1 (binary constraints).

                                        Violation profiles #

                                        Violation profile for a candidate under the 4 constraints. Order: [*CT, MAX, MAX-PRE-V, MAX-FINAL].

                                          theorem CoetzeePater2011.retain_profile (ctx : Context) :
                                          violationProfile { context := ctx, output := TDOutput.retain } = [1, 0, 0, 0]

                                          Retain always violates *CT once and nothing else.

                                          Delete in pre-C violates MAX only (no contextual faithfulness active).

                                          Delete in pre-V violates MAX and MAX-PRE-V.

                                          Delete pre-pausally violates MAX and MAX-FINAL.
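The four profiles above can be written out as a small self-contained sketch. The names `TDSketch` and `profile` are hypothetical stand-ins, not the library's `violationProfile`; only the constraint order [*CT, MAX, MAX-PRE-V, MAX-FINAL] and the constructor names are taken from this page.

```lean
namespace TDSketch

inductive Context where
  | preV | pause | preC
  deriving DecidableEq, Repr

inductive TDOutput where
  | retain | delete
  deriving DecidableEq, Repr

/-- Violations in the order [*CT, MAX, MAX-PRE-V, MAX-FINAL]. -/
def profile : Context → TDOutput → List Nat
  | _,      .retain => [1, 0, 0, 0]  -- retention keeps the cluster: *CT only
  | .preC,  .delete => [0, 1, 0, 0]  -- MAX only
  | .preV,  .delete => [0, 1, 1, 0]  -- MAX and MAX-PRE-V
  | .pause, .delete => [0, 1, 0, 1]  -- MAX and MAX-FINAL

-- Each profile reduces by definitional unfolding.
example : profile .preC .delete = [0, 1, 0, 0] := rfl
example : profile .preV .delete = [0, 1, 1, 0] := rfl
example : profile .pause .delete = [0, 1, 0, 1] := rfl

end TDSketch
```

Every entry is 0 or 1, which is the binary-constraint bound stated earlier.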

                                          POC adapter: tdCands and tdVp for pocPredict consumption #

                                          Candidate set per context for POC: both retain and delete are available in every context. Required by PartialOrderConstraints.pocPredict.

                                            def CoetzeePater2011.tdVp :
                                            Context → TDOutput → Fin 4 → ℕ

                                            Violation profile in POC's Input → Output → Fin n → ℕ shape. Constraint indexing matches constraints: 0 = *CT, 1 = MAX, 2 = MAX-PRE-V, 3 = MAX-FINAL.

                                              POC model — substrate-derived #

                                              [CP11b] §3.2 explicitly adopts the POC framework: "the grammar states a partial, rather than a total, order on the constraint set. Each time the grammar is used to evaluate a candidate set, one of the total orders consistent with the partial order is randomly chosen." Equation (9): "If a candidate is selected as optimal in n of these rankings, then this candidate's probability of occurrence is n/t."

                                              We formalize the §3.2 t/d-deletion analysis (table 10) using POC's
                                              `pocPredict`, with deletion probabilities derived in closed form via
                                              the `picksAt_binary_iff_permDList_head_lt` bridge + the substrate's
                                              `perm_filter_head_in_card`.
                                              
                                              The discrete partial order (`PartialOrderConstraints.discrete 4`)
                                              encodes "no rankings imposed" — uniform sampling over all 4! = 24
                                              total orders. 
                                              

                                              Probability that POC sampling selects deletion at context ctx, under the discrete partial order.

                                                Pre-vocalic deletion probability: D = {*CT, MAX, MAX-PRE-V} (3 distinguishing), Y ∩ D = {*CT} (only *CT favors delete) → 1/3.

                                                Phrase-final deletion probability: D = {*CT, MAX, MAX-FINAL}, Y ∩ D = {*CT} → 1/3.

                                                Pre-consonantal deletion probability: D = {*CT, MAX}, Y ∩ D = {*CT} → 1/2 (highest, since no positional faithfulness applies).

                                                The cross-dialectal generalization: pre-C deletion rate exceeds pre-V and pre-pause rates. Direct consequence of the closed-form arithmetic — no enumeration.

                                                Pre-V and pre-pause have equal deletion probabilities (1/3 each).
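A minimal sketch of the closed-form count, assuming the reading suggested by the D / Y notation above: D is the set of constraints on which retain and delete differ, Y ∩ D is the subset favoring deletion, and under uniform sampling the deletion probability is |Y ∩ D| / |D|. The names `profiles` and `deletionOdds` are hypothetical and independent of `pocPredict`.

```lean
namespace POCSketch

inductive Context where
  | preV | pause | preC

/-- Violations of (retain, delete), ordered [*CT, MAX, MAX-PRE-V, MAX-FINAL]. -/
def profiles : Context → List (Nat × Nat)
  | .preV  => [(1, 0), (0, 1), (0, 1), (0, 0)]
  | .pause => [(1, 0), (0, 1), (0, 0), (0, 1)]
  | .preC  => [(1, 0), (0, 1), (0, 0), (0, 0)]

/-- Returns (|Y ∩ D|, |D|): D = constraints that distinguish the two
    candidates, Y ∩ D = those among them that favor deletion. -/
def deletionOdds (ctx : Context) : Nat × Nat :=
  let d := (profiles ctx).filter (fun p => p.1 != p.2)
  let y := d.filter (fun p => decide (p.2 < p.1))
  (y.length, d.length)

#eval deletionOdds .preV   -- (1, 3)
#eval deletionOdds .pause  -- (1, 3)
#eval deletionOdds .preC   -- (1, 2)

end POCSketch
```

The pairs are read as numerator and denominator, giving 1/3, 1/3 and 1/2.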

                                                Factorial typology (table 12) — POC-native #

                                                The factorial typology over Equiv.Perm (Fin 4) rankings, using POC's PicksAt for the per-context deletion check. Per-σ pattern is (PicksAt σ .preV .delete, PicksAt σ .pause .delete, PicksAt σ .preC .delete).

                                                These are per-σ patterns (not head-in-Y predicates), so the substrate
                                                doesn't apply — we use plain `decide` on the 24-perm `Equiv.Perm (Fin 4)`. 
                                                
                                                def CoetzeePater2011.deletionPattern (σ : Equiv.Perm (Fin 4)) :
                                                Bool × Bool × Bool

                                                The deletion pattern (preV?, pause?, preC?) produced by ranking σ.

                                                  def CoetzeePater2011.factorialTypes :
                                                  Finset (Bool × Bool × Bool)

                                                  Distinct categorical dialect types over all 4! = 24 total orders.

                                                    The factorial typology has exactly 5 distinct language types, matching the 5 crucial ranking classes in table (12):
                                                    a. (F,F,F): MAX >> *CT, no deletion
                                                    b. (F,T,T): MAX-PRE-V >> *CT >> {MAX, MAX-FINAL}
                                                    c. (T,F,T): MAX-FINAL >> *CT >> {MAX, MAX-PRE-V}
                                                    d. (F,F,T): {MAX-PRE-V, MAX-FINAL} >> *CT >> MAX
                                                    e. (T,T,T): *CT >> {MAX, MAX-PRE-V, MAX-FINAL}

                                                    theorem CoetzeePater2011.type_a_count :
                                                    {σ : Equiv.Perm (Fin 4) | deletionPattern σ = (false, false, false)}.card = 12

                                                    Type a (F,F,F): 12 rankings — MAX >> *CT blocks all deletion.

                                                    theorem CoetzeePater2011.type_b_count :
                                                    {σ : Equiv.Perm (Fin 4) | deletionPattern σ = (false, true, true)}.card = 2

                                                    Type b (F,T,T): 2 rankings.

                                                    theorem CoetzeePater2011.type_c_count :
                                                    {σ : Equiv.Perm (Fin 4) | deletionPattern σ = (true, false, true)}.card = 2

                                                    Type c (T,F,T): 2 rankings.

                                                    theorem CoetzeePater2011.type_d_count :
                                                    {σ : Equiv.Perm (Fin 4) | deletionPattern σ = (false, false, true)}.card = 2

                                                    Type d (F,F,T): 2 rankings.

                                                    theorem CoetzeePater2011.type_e_count :
                                                    {σ : Equiv.Perm (Fin 4) | deletionPattern σ = (true, true, true)}.card = 6

                                                    Type e (T,T,T): 6 rankings — *CT >> {MAX, MAX-PRE-V, MAX-FINAL}.

                                                    theorem CoetzeePater2011.no_preV_only :
                                                    {σ : Equiv.Perm (Fin 4) | deletionPattern σ = (true, false, false)}.card = 0

                                                    No ranking produces the (T,F,F) pattern — deletion only in pre-V without pre-C deletion is impossible.
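The counts above can be reproduced by brute-force enumeration. The sketch below is a hedged stand-in that uses plain lists of constraint indices rather than `Equiv.Perm (Fin 4)` and `PicksAt`; `insertions`, `perms`, `deletes`, `pattern` and `count` are hypothetical names. It assumes the standard OT evaluation: the highest-ranked constraint that distinguishes the two candidates decides.

```lean
namespace TypologySketch

/-- All ways to insert `x` into a list. -/
def insertions (x : Nat) : List Nat → List (List Nat)
  | [] => [[x]]
  | y :: ys => (x :: y :: ys) :: (insertions x ys).map (fun l => y :: l)

/-- All permutations of a list (24 for a 4-element list). -/
def perms : List Nat → List (List Nat)
  | [] => [[]]
  | x :: xs => (perms xs).foldr (fun p acc => insertions x p ++ acc) []

/-- Constraint indices: 0 = *CT, 1 = MAX, 2 = MAX-PRE-V, 3 = MAX-FINAL.
    `faith` lists the faithfulness constraints that deletion violates. -/
def deletes (ranking faith : List Nat) : Bool :=
  -- The highest-ranked relevant constraint decides; deletion wins iff it is *CT.
  match ranking.find? (fun c => c == 0 || faith.contains c) with
  | some c => c == 0
  | none => false

/-- Pattern (preV?, pause?, preC?) for one total order, highest-ranked first. -/
def pattern (r : List Nat) : Bool × Bool × Bool :=
  (deletes r [1, 2], deletes r [1, 3], deletes r [1])

/-- Number of total orders producing pattern `p`. -/
def count (p : Bool × Bool × Bool) : Nat :=
  ((perms [0, 1, 2, 3]).filter (fun r => pattern r == p)).length

#eval count (false, false, false)  -- 12
#eval count (false, true, true)    -- 2
#eval count (true, false, true)    -- 2
#eval count (false, false, true)   -- 2
#eval count (true, true, true)     -- 6
#eval count (true, false, false)   -- 0

end TypologySketch
```

The five non-zero counts sum to 24, so no other pattern can occur.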

                                                    Structural implication (key POC typological prediction) #

                                                    Every ranking that produces pre-V deletion also produces pre-C deletion. This is the structural reason POC cannot generate reversed rates: pre-V deletion requires *CT >> MAX ∧ *CT >> MAX-PRE-V, which entails *CT >> MAX, the sole condition for pre-C deletion.

                                                    Similarly, every ranking producing pause deletion also produces pre-C deletion: *CT >> MAX ∧ *CT >> MAX-FINAL entails *CT >> MAX.

                                                    No ranking produces pre-V or pause deletion without also producing pre-C deletion. The formal basis for the cross-dialectal generalization P(del|preC) ≥ P(del|preV). Direct corollary of the closed-form deletionProb_* theorems.

                                                    MaxEnt model #

                                                    The four t/d-deletion constraints as a CON (constraint vector) for MaxEnt. Index order matches constraints: 0 = *CT, 1 = MAX, 2 = MAX-PRE-V, 3 = MAX-FINAL.

                                                      def CoetzeePater2011.tdW (wCT wMax wMaxPreV wMaxFin : ℝ) :
                                                      Fin 4 → ℝ

                                                      MaxEnt weight vector for tdCon, in the same [*CT, MAX, MAX-PRE-V, MAX-FINAL] order. Weight parameterization enables dialect-specific fitting.

                                                        theorem CoetzeePater2011.maxent_deletion_preferred (wCT wMax wMaxPreV wMaxFin : ℝ) (ctx : Context) (h : Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := ctx, output := TDOutput.delete } > Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := ctx, output := TDOutput.retain }) :
                                                        Constraints.harmonyDominates tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := ctx, output := TDOutput.delete } { context := ctx, output := TDOutput.retain }

                                                        MaxEnt harmony ordering is a decidable proxy for probability ordering: H(a) > H(b) ⟺ P(a) > P(b) by monotonicity of exp.

                                                        theorem CoetzeePater2011.maxent_aave_ordering :
                                                        have w := tdW (1006 / 10) (994 / 10) (21 / 10) (2 / 10); Constraints.harmonyScore tdCon w { context := Context.preC, output := TDOutput.delete } > Constraints.harmonyScore tdCon w { context := Context.pause, output := TDOutput.delete } ∧ Constraints.harmonyScore tdCon w { context := Context.pause, output := TDOutput.delete } > Constraints.harmonyScore tdCon w { context := Context.preV, output := TDOutput.delete }

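                                                        With the AAVE weights from the ME-HG row of table (23), the harmony of the deleting candidate ranks pre-C > pause > pre-V. Retention has the same harmony in every context (it violates only *CT), so the deletion probabilities follow the same ordering. The weights are exact transcriptions of the one-decimal-place values reported in the paper, listed here in tdW argument order: *CT = 100.6, MAX = 99.4, MAX-PRE-V = 2.1, MAX-FINAL = 0.2.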
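The ordering can be checked by hand with integer arithmetic. This is a sketch, not the library's `harmonyScore`: it assumes harmony is the negated weighted sum of violations (the convention used in the witness arithmetic on this page) and stores the weights in tenths so that `Int` suffices. `harmony` and `aaveTenths` are hypothetical names.

```lean
namespace AaveSketch

/-- Harmony = −Σ weight × violations, with weights in tenths
    (1006 stands for 100.6). -/
def harmony (w v : List Int) : Int :=
  -((List.zip w v).foldl (fun acc p => acc + p.1 * p.2) (0 : Int))

/-- AAVE weights in the order [*CT, MAX, MAX-PRE-V, MAX-FINAL]. -/
def aaveTenths : List Int := [1006, 994, 21, 2]

#eval harmony aaveTenths [0, 1, 0, 0]  -- pre-C delete: -994
#eval harmony aaveTenths [0, 1, 0, 1]  -- pause delete: -996
#eval harmony aaveTenths [0, 1, 1, 0]  -- pre-V delete: -1015
#eval harmony aaveTenths [1, 0, 0, 0]  -- retain, any context: -1006

end AaveSketch
```

In tenths, −994 > −996 > −1015, matching pre-C > pause > pre-V.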

                                                        theorem CoetzeePater2011.nonneg_weights_preserve_ordering (wCT wMax wMaxPreV wMaxFin : ℝ) (hPreV : wMaxPreV ≥ 0) :
                                                        Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preC, output := TDOutput.delete } ≥ Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preV, output := TDOutput.delete }

                                                        With non-negative MAX-PRE-V weight, harmony of pre-C deletion ≥ pre-V deletion. Pre-V delete violates {MAX, MAX-PRE-V} while pre-C delete violates only {MAX}, so H(del|preC) - H(del|preV) = wMaxPreV ≥ 0.

                                                        Banning negative weights thus makes MaxEnt respect the same typological restriction as POC ([CP11b] §4.4).

                                                        theorem CoetzeePater2011.nonneg_weights_preserve_ordering_pause (wCT wMax wMaxPreV wMaxFin : ℝ) (hFin : wMaxFin ≥ 0) :
                                                        Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preC, output := TDOutput.delete } ≥ Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.pause, output := TDOutput.delete }

                                                        Analogously, non-negative MAX-FINAL weight ensures pre-C ≥ pause. H(del|preC) - H(del|pause) = wMaxFin ≥ 0.
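The two inequalities reduce to linear arithmetic once the profiles are substituted. The following is a sketch over `Int` rather than the library's weight type, assuming H(del|preC) = −wMax, H(del|preV) = −(wMax + wMaxPreV) and H(del|pause) = −(wMax + wMaxFin), and a recent Lean 4 toolchain in which the `omega` tactic is available without extra imports.

```lean
-- Pre-C vs pre-V: the difference is exactly wMaxPreV.
example (wMax wMaxPreV : Int) (h : 0 ≤ wMaxPreV) :
    -(wMax + wMaxPreV) ≤ -wMax := by
  omega

-- Pre-C vs pause: the difference is exactly wMaxFin.
example (wMax wMaxFin : Int) (h : 0 ≤ wMaxFin) :
    -(wMax + wMaxFin) ≤ -wMax := by
  omega
```

Neither statement mentions wCT, because the deleting candidate never violates *CT.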

                                                        Tejano' impossibility #

                                                        Tejano' is a hypothetical dialect with reversed pre-C/pre-V rates: lowest deletion in pre-consonantal position. Created by swapping Tejano's pre-V (25%) and pre-C (62%) rates. [CP11b] §4.4.

                                                          POC cannot generate Tejano': every ranking that produces pre-V deletion also produces pre-C deletion (§7), so P(del|preC) ≥ P(del|preV) for any POC grammar over these 4 constraints.

                                                          theorem CoetzeePater2011.maxent_can_generate_tejanoPrime :
                                                          ∃ (wCT : ℝ) (wMax : ℝ) (wMaxPreV : ℝ) (wMaxFin : ℝ), Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preV, output := TDOutput.delete } > Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preV, output := TDOutput.retain } ∧ Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preC, output := TDOutput.retain } > Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preC, output := TDOutput.delete } ∧ wMaxPreV < 0

                                                          MaxEnt CAN generate Tejano' with negative MAX-PRE-V weight: when MAX-PRE-V has a negative weight, violating it helps the candidate, rewarding deletion in pre-vocalic position.

                                                          Witness from [CP11b] table (23) Tejano' ME-HG row: *CT = 99.4, MAX = 100.6, MAX-PRE-V = −1.6, MAX-FINAL = −0.8. These are the weights the paper's MaxEnt fitting procedure learned on Tejano' frequencies.

                                                          Witness arithmetic:

                                                          • Pre-V delete: H = −(100.6·1 + (−1.6)·1) = −99
                                                          • Pre-V retain: H = −(99.4·1) = −99.4 → delete > retain ✓
                                                          • Pre-C delete: H = −(100.6·1) = −100.6
                                                          • Pre-C retain: H = −(99.4·1) = −99.4 → retain > delete ✓ (REVERSED)

                                                          [CP11b] §4.4, table (23)
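The witness arithmetic can be replayed mechanically. As a sketch, weights are again stored in tenths as `Int` (994 stands for 99.4) and harmony is taken to be the negated weighted violation sum; `harmony` and `w` are hypothetical names, not the library's `harmonyScore` and `tdW`.

```lean
namespace TejanoPrimeSketch

/-- Harmony = −Σ weight × violations, weights in tenths. -/
def harmony (w v : List Int) : Int :=
  -((List.zip w v).foldl (fun acc p => acc + p.1 * p.2) (0 : Int))

/-- Tejano' weights in the order [*CT, MAX, MAX-PRE-V, MAX-FINAL]. -/
def w : List Int := [994, 1006, -16, -8]

#eval harmony w [0, 1, 1, 0]  -- pre-V delete: -990
#eval harmony w [1, 0, 0, 0]  -- retain, any context: -994
#eval harmony w [0, 1, 0, 0]  -- pre-C delete: -1006

-- Pre-V: delete beats retain. Pre-C: retain beats delete.
#eval decide (harmony w [0, 1, 1, 0] > harmony w [1, 0, 0, 0])  -- true
#eval decide (harmony w [1, 0, 0, 0] > harmony w [0, 1, 0, 0])  -- true

end TejanoPrimeSketch
```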

                                                          theorem CoetzeePater2011.framework_separation :
                                                          deletionProb Context.preC ≥ deletionProb Context.preV ∧ ∃ (wCT : ℝ) (wMax : ℝ) (wMaxPreV : ℝ) (wMaxFin : ℝ), Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preV, output := TDOutput.delete } > Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preV, output := TDOutput.retain } ∧ Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preC, output := TDOutput.retain } > Constraints.harmonyScore tdCon (tdW wCT wMax wMaxPreV wMaxFin) { context := Context.preC, output := TDOutput.delete }

                                                          Framework separation: POC/StOT and MaxEnt have different typological predictions. POC cannot generate all patterns that MaxEnt can.

                                                          Left conjunct: POC always has pre-C ≥ pre-V (structural implication). Right conjunct: MaxEnt can achieve pre-V > pre-C (negative weights).

                                                          Categorical OT bookends #

                                                          theorem CoetzeePater2011.max_dominates_implies_no_deletion (ctx : Context) :
                                                          have ranking := [maxC, maxPreV, maxFinal, starCT]; have tab := OptimalityTheory.Tableau.ofRanking (candidatesFor ctx) ranking ; tab.optimal = {{ context := ctx, output := TDOutput.retain }}

                                                          When MAX >> *CT, the categorical OT prediction is retention (no deletion) in all contexts.

                                                          theorem CoetzeePater2011.ct_dominates_implies_deletion (ctx : Context) :
                                                          have ranking := [starCT, maxC, maxPreV, maxFinal]; have tab := OptimalityTheory.Tableau.ofRanking (candidatesFor ctx) ranking ; tab.optimal = {{ context := ctx, output := TDOutput.delete }}

                                                          When *CT >> all faithfulness, the categorical OT prediction is deletion in all contexts.

                                                          Generic ConstraintSystem predictions (per-context MaxEnt) #

                                                          At each context, the AAVE MaxEnt model is a per-context ConstraintSystem TDOutput: candidates = {retain, delete}, score = harmony of (ctx, ·), decoder = softmaxDecoder 1. The two-candidate softmax is the logistic function of the harmony difference, so predict .delete is a genuine conditional probability P(delete | ctx).

                                                          noncomputable def CoetzeePater2011.aaveW :
                                                          Fin 4 → ℝ

                                                          The AAVE constraint weights from table (23) ME-HG row.

                                                            The AAVE MaxEnt model at a fixed context, packaged as a generic ConstraintSystem. With only two candidates, predict .delete is the conditional probability P(delete | ctx).

                                                              In the pre-consonantal context, the AAVE system predicts deletion over retention. Conditional probability claim: P(delete | preC) > P(retain | preC).

                                                              In the pre-vocalic context, the AAVE system predicts retention over deletion: pre-V deletion violates both MAX and MAX-PRE-V, costing more than the *CT violation incurred by retention. Conditional probability claim: P(retain | preV) > P(delete | preV).

                                                              The per-context AAVE system is a probability distribution over TDOutput. Generic property of softmaxDecoder.
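For a numerical feel of the per-context predictions, the two-candidate softmax can be evaluated with `Float`. This is only a sketch: `pDelete` is a hypothetical helper, not the library's `softmaxDecoder`, and it assumes temperature 1 and harmony equal to the negated weighted violation sum.

```lean
namespace SoftmaxSketch

/-- P(delete | ctx) for a two-candidate softmax at temperature 1:
    exp(hDel) / (exp(hDel) + exp(hRet)) = 1 / (1 + exp(hRet - hDel)). -/
def pDelete (hDel hRet : Float) : Float :=
  1.0 / (1.0 + Float.exp (hRet - hDel))

-- Harmonies from the AAVE weights *CT = 100.6, MAX = 99.4,
-- MAX-PRE-V = 2.1, MAX-FINAL = 0.2. Retention always scores -100.6.
#eval pDelete (-99.4) (-100.6)          -- pre-C: above 0.5, deletion preferred
#eval pDelete (-(99.4 + 2.1)) (-100.6)  -- pre-V: below 0.5, retention preferred
#eval pDelete (-(99.4 + 0.2)) (-100.6)  -- pause: above 0.5, deletion preferred

end SoftmaxSketch
```

Only the harmony difference matters, which is why the large shared magnitude of the *CT and MAX weights does not affect the probabilities.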