Linglib.Studies.HayesWilson2008

[HW08]: A Maximum Entropy Model of Phonotactics #

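[HW08] propose that phonotactic well-formedness is a probability: a MaxEnt grammar assigns each surface form x a score h(x) = Σ wⱼ · Cⱼ(x), where wⱼ is the weight of constraint j and Cⱼ(x) is the number of times x violates it, and well-formedness is P(x) = exp(−h(x)) / Z, where Z is the normalizing constant (partition function).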

Hayes & Wilson's "score" is the negation of harmonyScore: h(x) = −harmonyScore(x), so P(x) ∝ exp(harmonyScore(x)). Higher harmony = higher probability = better well-formedness. This is exactly softmax(harmonyScore, 1) on a finite candidate set.
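
The sign convention above can be sketched numerically. This is only an illustration in core Lean 4 using `Float`, not the Linglib definitions: the names `hwScore`, `harmony`, and `softmax1`, the list-based encoding, and the weights in the `#eval` are all hypothetical.

```lean
-- Sketch only: `Float` stands in for the real-valued weights of the formalization.
-- `hwScore`, `harmony`, and `softmax1` are hypothetical names, not Linglib's API.

/-- Hayes & Wilson's score h(x) = Σ wⱼ · Cⱼ(x), from weights and violation counts. -/
def hwScore (w : List Float) (viol : List Nat) : Float :=
  let costs := List.zipWith (fun (wj : Float) (cj : Nat) => wj * cj.toFloat) w viol
  costs.foldl (fun (acc c : Float) => acc + c) 0.0

/-- Harmony is the negated score, so higher harmony means better-formed. -/
def harmony (w : List Float) (viol : List Nat) : Float :=
  -(hwScore w viol)

/-- Softmax at temperature 1: exp(harmony) divided by the partition function Z. -/
def softmax1 (hs : List Float) : List Float :=
  let es := hs.map Float.exp
  let z : Float := es.foldl (fun (acc e : Float) => acc + e) 0.0
  es.map (fun e => e / z)

-- Two candidates under made-up weights [2.0, 1.0]:
-- the first violates nothing, the second violates the first constraint once.
#eval softmax1 [harmony [2.0, 1.0] [0, 0], harmony [2.0, 1.0] [1, 0]]
```

The candidate with fewer weighted violations has the higher harmony and therefore receives the larger share of probability.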

Key contribution: ganging #

The central empirical prediction distinguishing MaxEnt from OT is ganging: two individually weak constraints can jointly override a stronger one. This is impossible with OT's strict ranking, which corresponds to exponentially separated weights (OTLimit.lean).

The Ganging definition and anti-ganging theorems live in OTLimit.lean alongside ExponentiallySeparated, since they are two sides of the same coin.
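
A small numeric illustration of ganging, assuming made-up weights (5, 3, 3) that do not come from [HW08]. The names `weights`, `candA`, `candB`, `harmony`, and `otBetter` are hypothetical and unrelated to the `Ganging` definition in OTLimit.lean; the block only contrasts weighted harmony with a lexicographic (strict-ranking) comparison.

```lean
-- Hypothetical weights, ordered from strongest to weakest; not taken from [HW08].
def weights : List Float := [5.0, 3.0, 3.0]

-- Violation profiles (one count per constraint, in the same order as `weights`).
def candA : List Nat := [1, 0, 0]  -- violates only the strong constraint
def candB : List Nat := [0, 1, 1]  -- violates both weak constraints

/-- Weighted harmony: the negated sum of weight × violation count. -/
def harmony (w : List Float) (viol : List Nat) : Float :=
  let costs := List.zipWith (fun (wj : Float) (cj : Nat) => wj * cj.toFloat) w viol
  -(costs.foldl (fun (acc c : Float) => acc + c) 0.0)

/-- Strict ranking: `a` beats `b` iff `a` has fewer violations at the first
    constraint (in rank order) where the two profiles differ. -/
def otBetter : List Nat → List Nat → Bool
  | a :: xs, b :: ys => if a < b then true else if b < a then false else otBetter xs ys
  | _, _ => false

-- Weighted comparison: harmony -6 (candB) is below harmony -5 (candA), so A wins.
#eval decide (harmony weights candB < harmony weights candA)  -- true

-- Strict ranking: B satisfies the top-ranked constraint, so B wins.
#eval otBetter candB candA  -- true
```

Each weak constraint alone (weight 3) is outweighed by the strong one (weight 5), but together they cost 6, which reverses the outcome under weighting and leaves it unchanged under strict ranking.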

English onset data #

We encode a subset of the learned grammar (Table (4)) and verify that the model assigns higher harmony (= higher MaxEnt probability via exp_lt_exp) to attested onsets than to unattested ones (§2).

@[reducible, inline]

An English onset: a list of consonants preceding the nucleus.

    Constraint #1 from Table (4): *[+sonorant, +dorsal]. Weight 5.64.

      Constraint #5 from Table (4): *[ ][+voice, −sonorant]. Weight 5.37.

        noncomputable def HayesWilson2008.onsetW :
        Fin 4

        The learned weights for the four Table (4) constraints: 5.64, 5.17, 5.37, 6.66.

          Attested [k] (no violations) has higher harmony than unattested *[ŋ] (violates *[+son,+dors], cost 5.64). The harmony magnitude is a weight artifact; the ranking is the empirical prediction.

          MaxEnt probability ordering: higher harmony ⟹ higher exp(harmonyScore) ⟹ higher MaxEnt probability. Applies exp_lt_exp to harmonyScore.
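
The monotonicity step can be isolated as a one-line Lean statement. This sketch assumes that the `exp_lt_exp` mentioned above is Mathlib's `Real.exp_lt_exp`; the variables `harmNg` and `harmK` are hypothetical stand-ins for the harmony of *[ŋ] and [k], not Linglib terms.

```lean
import Mathlib

-- Assumption: the source's `exp_lt_exp` is Mathlib's
--   Real.exp_lt_exp : Real.exp x < Real.exp y ↔ x < y
-- `harmNg` and `harmK` are placeholder reals, not the actual harmonyScore terms.
example (harmNg harmK : ℝ) (h : harmNg < harmK) :
    Real.exp harmNg < Real.exp harmK :=
  Real.exp_lt_exp.mpr h
```

Because the partition function is a shared positive constant, the same strict ordering carries over to the normalized probabilities.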

          Gradient well-formedness: among unattested forms, *[ŋ] has higher MaxEnt probability than *[rk].

          Phonological MaxEnt is one instance of the framework-agnostic ConstraintSystem abstraction in Core.Optimization.System. The same ConstraintSystem record that scores phonological onsets here also scores syntactic candidates in HG/MaxEnt syntax models, RSA utterances in softmax pragmatic listeners, and so on. The decoder (softmaxDecoder 1) is what makes this MaxEnt rather than HG (argmaxDecoder) or OT (argminDecoder over a LexProfile).

          This section exercises that abstraction: rather than comparing exp(harmonyScore ...) directly (as in §3), we go through ConstraintSystem.predict.

          The four onsets used as MaxEnt candidates: two attested ([k], [b,r]) and two unattested (*[ŋ], *[r,k]).

            [HW08]'s grammar realised as a generic ConstraintSystem over candidateOnsets, decoded by softmax at temperature 1 (built inline). The score component is harmonyScore onsetCon onsetW (the canonical MaxEnt harmony function).

              The system predicts a higher MaxEnt probability for [k] than for *[ŋ]. Unlike maxent_prob_k_gt_ŋ, this compares actual softmax probabilities (numerator / partition function), not just exponentiated harmony scores, so the partition function over candidateOnsets is part of the claim.

              The system also predicts a higher MaxEnt probability for *[ŋ] than for *[rk], which is gradient well-formedness among unattested forms.

              The MaxEnt softmax decoder is a probability decoder, so the system's predictions are non-negative and sum to 1 over the candidate set. Follows from Decoder.IsProb.sum_eq_one for softmaxDecoder.
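
A floating-point sanity check of these properties, as a sketch only. It restricts the candidate set to the two onsets whose violation profiles are stated on this page ([k] with no violations, *[ŋ] with cost 5.64); that restriction and the names `harmonies`, `softmax1`, and `probs` are assumptions of this example, so the numbers are not the probabilities over the full four-element candidateOnsets.

```lean
-- Harmonies from the text: [k] has no violations, *[ŋ] costs 5.64.
-- Using only these two candidates is an assumption of this sketch.
def harmonies : List Float := [0.0, -5.64]

/-- Softmax at temperature 1 over a finite list of harmonies. -/
def softmax1 (hs : List Float) : List Float :=
  let es := hs.map Float.exp
  let z : Float := es.foldl (fun (acc e : Float) => acc + e) 0.0
  es.map (fun e => e / z)

def probs : List Float := softmax1 harmonies

-- The two probabilities; [k] takes almost all of the mass.
#eval probs

-- Their sum, which should be 1 up to floating-point rounding.
#eval probs.foldl (fun (acc p : Float) => acc + p) 0.0

-- Non-negativity of every entry.
#eval probs.all (fun p => decide (0.0 ≤ p))
```

In this restricted setting [k] receives roughly 0.996 of the probability mass and *[ŋ] roughly 0.004. The formal statement relies on Decoder.IsProb.sum_eq_one rather than on numerical evaluation.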