Linglib.Studies.Storme2026

[Sto26b]: Systemic Constraints in MaxEnt Grammars

Replication of [Sto26b] "A Method to Evaluate Systemic Constraints in Probabilistic Grammars" (Linguistic Inquiry 57(1)).

Key idea

Standard MaxEnt grammars evaluate each input→output mapping independently. Storme shows how to incorporate systemic constraints — constraints that evaluate sets of mappings jointly. The running example is *HOMOPHONY: a constraint penalizing output tuples where distinct inputs receive identical surface forms.

Persian hiatus resolution

The case study is Persian vowel hiatus between /æ/ and /ɑ/. The classical faithfulness constraints (MAX, DEP, IDENT) and the markedness constraint *VV predict symmetric treatment of /æ.ɑ/ and /ɑ.æ/: deletion of V1 and deletion of V2 should each be equally probable for both inputs. Storme argues that *HOMOPHONY breaks this symmetry: the grammar prefers output tuples in which distinct inputs produce distinct surface forms.

Formalization

We instantiate the MaxEnt machinery — a classical constraint vector classicalCon with weight vector classicalW, plus the systemic *HOMOPHONY constraint — over the Persian hiatus domain from Farsi.Phonology, and verify the key prediction: homophony avoidance breaks the symmetry.

Constraints

Following standard OT/MaxEnt constraint families:

MAX: penalizes deletion. 1 violation for deleteV1 or deleteV2, 0 otherwise.


    DEP: penalizes epenthesis. 1 violation for epenthesis, 0 otherwise.


      *VV: markedness constraint penalizing vowel hiatus. 1 violation for faithful (hiatus preserved), 0 for all repairs.


        IDENT: penalizes coalescence (feature change). 1 violation for coalescence, 0 otherwise.


          The classical constraint set for Persian hiatus, as a CON (constraint vector). Index order: 0 = MAX, 1 = DEP, 2 = *VV, 3 = IDENT.

            def Storme2026.classicalW :
            Fin 4

            MaxEnt weights for classicalCon (MAX = 2, DEP = 1, *VV = 3, IDENT = 2).

            Note: weights are illustrative (chosen to make epenthesis the classical winner), not fitted to Storme's experimental data. The qualitative predictions (symmetry, symmetry-breaking) hold for any positive weights since they depend on constraint structure, not specific weight values.


              Harmony scores are symmetric across mirror-image inputs: H(/æ.ɑ/, deleteV1) = H(/ɑ.æ/, deleteV1) under classical constraints alone, because each classical constraint inspects only the output (the repair chosen), never the identity of the input.

              The classical grammar ranks the repairs epenthesis ≻ {deletion, coalescence} ≻ faithful (weights chosen so epenthesis is the classical winner). The harmony magnitudes are weight artifacts; the ranking is the prediction.
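The ranking above can be checked by hand from the stated weights. The following is a minimal, standalone Lean 4 sketch (core only), not the library's definitions: `Repair`, `Hiatus`, `maxV`, `depV`, `hiatusV`, `identV` and `harmony` are hypothetical stand-ins for `HiatusOutput`, the input type, the four constraints and `Constraints.harmonyScore`, and harmony is assumed to be the negated weighted sum of violations.

```lean
namespace HarmonySketch

/-- Hypothetical stand-in for `Farsi.Phonology.HiatusOutput`. -/
inductive Repair where
  | faithful | deleteV1 | deleteV2 | epenthesis | coalescence
  deriving DecidableEq, Repr

/-- Hypothetical labels for the two mirror-image inputs. -/
inductive Hiatus where
  | aeAh | ahAe

/-- MAX: one violation for either deletion. -/
def maxV : Repair → Int
  | .deleteV1 | .deleteV2 => 1
  | _ => 0

/-- DEP: one violation for epenthesis. -/
def depV : Repair → Int
  | .epenthesis => 1
  | _ => 0

/-- *VV: one violation when hiatus is preserved. -/
def hiatusV : Repair → Int
  | .faithful => 1
  | _ => 0

/-- IDENT: one violation for coalescence. -/
def identV : Repair → Int
  | .coalescence => 1
  | _ => 0

/-- Assumed harmony: negated weighted violation sum with the illustrative
weights MAX = 2, DEP = 1, *VV = 3, IDENT = 2. The input is ignored because
every constraint above inspects the output only. -/
def harmony (_ : Hiatus) (r : Repair) : Int :=
  -(2 * maxV r + 1 * depV r + 3 * hiatusV r + 2 * identV r)

-- Mirror-image symmetry follows by unfolding the definition.
example (r : Repair) : harmony .aeAh r = harmony .ahAe r := rfl

-- Ranking: epenthesis above deletion and coalescence, faithful last.
example : harmony .aeAh .epenthesis = -1 := by decide
example : harmony .aeAh .deleteV1 = -2 := by decide
example : harmony .aeAh .coalescence = -2 := by decide
example : harmony .aeAh .faithful = -3 := by decide

end HarmonySketch
```

In this sketch the symmetry is true by construction: no constraint ever reads the input argument, which is the point of the docstring above.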

              *HOMOPHONY for the Persian hiatus paradigm: penalizes output tuples where distinct inputs produce identical outputs.

                noncomputable def Storme2026.persianJointScore (f : Fin 4 → Farsi.Phonology.HiatusOutput) :

                Joint harmony score for a complete output tuple, combining classical per-mapping scores with the systemic *HOMOPHONY penalty.


                  An output tuple where the mirror-image inputs /æ.ɑ/ and /ɑ.æ/ use different repair strategies (deleteV1 vs deleteV2). Both still surface as [ɑ], so homophony remains at those positions — but the tuple has fewer *HOMOPHONY violations overall than the uniform-strategy tuple.


                    Concrete violation counts:

                    • homophonousTuple: all 4 positions use deleteV1 → C(4,2) = 6 collisions
                    • diverseTuple: 3 positions use deleteV1, 1 uses deleteV2 → C(3,2) = 3 collisions

                    *HOMOPHONY incurs more violations on the homophonous tuple than on the diverse tuple. This is the core mechanism by which systemic constraints break symmetry.

                    The diverse tuple has at least as high joint harmony as the homophonous tuple, because it incurs fewer *HOMOPHONY violations while having the same classical constraint violations.
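The violation counts above can be reproduced with a small standalone Lean 4 sketch (core only). It assumes that a collision is an unordered pair of positions with the same `HiatusOutput` value, which is what the counts 6 and 3 suggest; the library's actual *HOMOPHONY definition is not rendered on this page. `Repair` and `collisions` are hypothetical names, and which position of the diverse tuple carries deleteV2 is chosen arbitrarily.

```lean
namespace HomophonySketch

/-- Hypothetical stand-in for `Farsi.Phonology.HiatusOutput`. -/
inductive Repair where
  | faithful | deleteV1 | deleteV2 | epenthesis | coalescence
  deriving DecidableEq, Repr

/-- Count unordered pairs of positions whose outputs coincide:
for each head, count the matching elements in the tail. -/
def collisions : List Repair → Nat
  | [] => 0
  | x :: xs => (xs.filter (fun y => y == x)).length + collisions xs

/-- All four positions use deleteV1. -/
def homophonousTuple : List Repair :=
  [.deleteV1, .deleteV1, .deleteV1, .deleteV1]

/-- Three positions use deleteV1, one uses deleteV2 (position chosen arbitrarily). -/
def diverseTuple : List Repair :=
  [.deleteV1, .deleteV2, .deleteV1, .deleteV1]

example : collisions homophonousTuple = 6 := by decide  -- C(4,2)
example : collisions diverseTuple = 3 := by decide      -- C(3,2)
example : collisions diverseTuple < collisions homophonousTuple := by decide

end HomophonySketch
```

Because deleteV1 and deleteV2 carry the same classical violations (one MAX violation each), swapping one for the other changes only the collision count, which is why the diverse tuple cannot score worse under a non-negative *HOMOPHONY weight.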

                    Classical constraints alone assign the same score to deleteV1 for /æ.ɑ/ and /ɑ.æ/ — the grammar cannot distinguish mirror-image inputs at the individual mapping level. (Restated from §2 for contrast with the non-separability result below.)

                    theorem Storme2026.joint_not_separable :
                    ∃ (f : Fin 4 → Farsi.Phonology.HiatusOutput) (g : Fin 4 → Farsi.Phonology.HiatusOutput), f ⟨1, ⋯⟩ = g ⟨1, ⋯⟩ ∧ persianJointScore f - Constraints.harmonyScore classicalCon classicalW (inputsIndexed ⟨1, ⋯⟩, f ⟨1, ⋯⟩) ≠ persianJointScore g - Constraints.harmonyScore classicalCon classicalW (inputsIndexed ⟨1, ⋯⟩, g ⟨1, ⋯⟩)

                    Under *HOMOPHONY, the joint distribution over all four mappings is not a product of independent per-mapping distributions. The marginal probability of a specific output for /æ.ɑ/ depends on what outputs are chosen for the other inputs — this is the core mechanism by which systemic constraints influence individual mapping probabilities.

                    This theorem verifies that the joint score is not additively separable (i.e., there exist tuples f, g agreeing on position 1 but differing in joint score minus the classical score at position 1).
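A concrete witness for a statement of this shape can be sketched in standalone Lean 4 (core only). Everything here is an assumption made for illustration: tuples are lists rather than functions on `Fin 4`, the joint score is taken to be the sum of classical harmonies minus a weighted collision count, and the *HOMOPHONY weight `wH = 1` is hypothetical, since the page does not state one.

```lean
namespace SeparabilitySketch

/-- Hypothetical stand-in for `Farsi.Phonology.HiatusOutput`. -/
inductive Repair where
  | faithful | deleteV1 | deleteV2 | epenthesis | coalescence
  deriving DecidableEq, Repr

/-- Classical harmony under the illustrative weights
(MAX = 2, DEP = 1, *VV = 3, IDENT = 2). -/
def harmony : Repair → Int
  | .faithful => -3
  | .epenthesis => -1
  | _ => -2

/-- Unordered pairs of positions with identical outputs. -/
def collisions : List Repair → Nat
  | [] => 0
  | x :: xs => (xs.filter (fun y => y == x)).length + collisions xs

/-- Hypothetical *HOMOPHONY weight. -/
def wH : Int := 1

/-- Assumed joint score: classical sum minus the systemic penalty. -/
def jointScore (t : List Repair) : Int :=
  (t.map harmony).foldl (fun acc h => acc + h) 0 - wH * Int.ofNat (collisions t)

/-- Output at position 1 (zero-based), with a dummy fallback for short lists. -/
def atOne : List Repair → Repair
  | _ :: y :: _ => y
  | _ => .faithful

/-- Joint score with the classical contribution of position 1 removed. -/
def residual (t : List Repair) : Int := jointScore t - harmony (atOne t)

def f : List Repair := [.deleteV1, .deleteV1, .deleteV1, .deleteV1]
def g : List Repair := [.deleteV1, .deleteV1, .deleteV2, .deleteV1]

#eval residual f  -- -12
#eval residual g  -- -9

-- f and g agree at position 1, yet their residuals differ.
example : atOne f = atOne g ∧ residual f ≠ residual g := by decide

end SeparabilitySketch
```

Both tuples have the same classical sum (−8), so in this sketch the gap between the residuals comes entirely from the collision term (6 versus 3).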

                    At each input, the classical MaxEnt model is a ConstraintSystem HiatusOutput ℝ: candidates = Finset.univ, score = harmony, decoder = softmaxDecoder 1. This is the same ConstraintSystem API used by HayesWilson2008.onsetSystem — different domain, identical scaffolding.

                    The systemic-constraint (*HOMOPHONY) story in §§3–6 sits above this per-input view: it couples the per-input distributions into a joint distribution on Fin 4 → HiatusOutput. With zero systemic weight, the joint factorises and each marginal equals the per-input predict.

                    The classical MaxEnt distribution at input i, packaged as a generic ConstraintSystem. Score = harmonyScore classicalCon classicalW (i, ·), decoder = softmaxDecoder 1.


                      For input /æ.ɑ/, the system predicts a higher MaxEnt probability for epenthesis (harmony −1) than for deleteV1 (harmony −2). This is a comparison of actual softmax probabilities (numerator / partition function over all 5 outputs), not just exponentiated harmonies.
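The probability comparison can be illustrated numerically with a standalone Lean 4 sketch using core `Float`. It assumes that the argument of `softmaxDecoder 1` acts as a unit temperature, so that P(o) = exp(H(o)) / Z, and it hard-codes the classical harmonies implied by the illustrative weights; `harmonies`, `partitionZ` and `prob` are hypothetical names, not part of the library.

```lean
namespace SoftmaxSketch

/-- Classical harmonies in the order
faithful, deleteV1, deleteV2, epenthesis, coalescence. -/
def harmonies : List Float := [-3.0, -2.0, -2.0, -1.0, -2.0]

/-- Partition function: sum of exponentiated harmonies over all 5 outputs. -/
def partitionZ : Float :=
  (harmonies.map Float.exp).foldl (fun acc x => acc + x) 0.0

/-- Softmax probability of an output with harmony `h` (unit temperature assumed). -/
def prob (h : Float) : Float := Float.exp h / partitionZ

#eval prob (-1.0)  -- probability of epenthesis
#eval prob (-2.0)  -- probability of deleteV1

-- Epenthesis is strictly more probable than deleteV1.
#eval decide (prob (-2.0) < prob (-1.0))  -- true

end SoftmaxSketch
```

Since both probabilities share the same partition function, the ordering follows from the ordering of the harmonies; the floating-point check is only a numerical illustration, whereas the library statement concerns real-valued probabilities.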

                      The classical Persian system at /æ.ɑ/ is a probability distribution over HiatusOutput. Follows from the generic softmaxDecoder_isProb.