MaxEnt Constraint Systems @cite{storme-2026} #
MaxEnt grammars assign probabilities to input→output mappings using the
softmax function over weighted constraint violations — the probabilistic
generalization of OT (where α → ∞ recovers categorical optimization;
see softmax_argmax_limit).
This module extends the basic MaxEnt setup with systemic constraints (@cite{storme-2026}): constraints that evaluate tuples of mappings jointly (e.g., *HOMOPHONY penalizes distinct inputs that map to the same output). Systemic constraints require evaluating a joint distribution over the product space of all input→output mappings, then marginalizing to recover individual mapping probabilities.
The framework is generic in the candidate type — phonology is one instance, but the same machinery applies to any domain that scores candidates by weighted constraint violations.
Architecture #
MaxEntGrammarpackages inputs, candidate generation, and weighted constraints (WeightedConstraintandharmonyScorecome fromCore.Constraint.Weighted)SystemicConstraintevaluates tuples of outputs (different type signature)jointHarmonyScorecombines classical + systemic scores over the product spacemarginalProbmarginalizes the joint to recover individual mapping probabilities
Key theorem #
marginal_eq_classical_when_no_systemic: when systemic constraint weight = 0,
the marginalized probability equals classical MaxEnt. This is because the joint
score decomposes additively ⇒ the joint distribution factorizes ⇒ each marginal
equals its factor.
A MaxEnt grammar: inputs, candidate generation, and weighted constraints.
Probability of mapping input i to output o is proportional to
exp(harmonyScore(i, o)).
- inputs : List Input
The set of inputs (underlying forms).
- gen : Input → List Output
Candidate generator: produces output candidates for each input.
- constraints : List (WeightedConstraint (Input × Output))
Weighted constraints evaluating input–output pairs.
Instances For
Classical MaxEnt probability: P(o | i) = softmax over gen(i).
This is the standard MaxEnt phonology probability, without systemic
constraints. Uses softmax from RationalAction with α = 1.
Equations
- g.prob i o = Core.softmax (fun (o' : O) => Core.Constraint.harmonyScoreR g.constraints (i, o')) 1 o
Instances For
A systemic constraint evaluates a tuple of outputs — one per input — rather than individual input→output pairs.
Example: *HOMOPHONY counts pairs of distinct inputs that map to the same output. This cannot be decomposed into per-mapping evaluations.
n is the number of inputs; the tuple Fin n → O assigns an output
to each input index.
- name : String
Constraint name.
- weight : ℚ
Constraint weight.
- eval : (Fin n → O) → ℕ
Evaluation function: how many violations does this output tuple incur?
Instances For
*HOMOPHONY: penalizes output tuples where distinct inputs receive the same output. Counts the number of colliding pairs.
For n inputs, evaluates |{(i,j) : i < j ∧ f(i) = f(j)}|.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Systemic constraint score for an output tuple (ℚ, computable).
This is the coupling component: Σₖ (-wₖ · Sₖ(f)).
Equations
- Core.Constraint.systemicScore systemicConstraints f = List.foldl (fun (acc : ℚ) (sc : Core.Constraint.SystemicConstraint n O) => acc - sc.weight * ↑(sc.eval f)) 0 systemicConstraints
Instances For
Systemic constraint score as ℝ.
Equations
- Core.Constraint.systemicScoreR systemicConstraints f = ↑(Core.Constraint.systemicScore systemicConstraints f)
Instances For
Joint harmony score over the product space. Combines classical per-mapping scores with systemic tuple-level scores.
H_joint(f) = Σᵢ H_classical(iᵢ, f(i)) + Σₖ (-wₖ · Sₖ(f))
where f : Fin n → O is the full output tuple.
Equations
- One or more equations did not get rendered due to their size.
Instances For
MaxEnt grammar with systemic constraints as a CoupledSoftmax.
componentScore i v = harmonyScoreR(classicalConstraints, (inputs i, v))couplingScore f = systemicScoreR(systemicConstraints, f)
The joint probability is softmax(totalScore, 1) over all Fin n → O
output tuples. The marginal at position i recovers the individual
mapping probability under systemic pressure.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Marginal probability: marginalize the joint distribution to get the probability of a specific input→output mapping.
P_marginal(oᵢ | iᵢ) = Σ_{f : f(i) = oᵢ} P_joint(f)
This is Storme's key equation: the marginal recovers individual mapping
probabilities that reflect systemic pressure. Defined through
CoupledSoftmax.marginal so that factorization follows from
marginal_eq_independent_when_uncoupled.
Equations
- Core.Constraint.marginalProb inputs classicalConstraints systemicConstraints i o = (Core.Constraint.maxEntCoupled inputs classicalConstraints systemicConstraints).marginal i o
Instances For
Factorization: when systemic constraint weights are all zero, the marginal equals the classical MaxEnt probability.
With zero systemic weights, the coupling score is constant (= 0),
so marginal_eq_independent_when_uncoupled applies: the joint
factorizes and each marginal equals its independent per-item softmax.
Realize a MaxEntGrammar (at a fixed input) as a generic
ConstraintSystem over the universal candidate set, decoded by
softmaxDecoder at temperature 1.
This shows the legacy MaxEnt API is a special case of the
framework-agnostic abstraction in System.lean — pick the universal
candidate set, the harmony score, and the Gumbel/softmax decoder.
Equations
- g.toSystem i = { candidates := Finset.univ, score := fun (o : O) => Core.Constraint.harmonyScoreR g.constraints (i, o), decoder := Core.Constraint.softmaxDecoder 1 }
Instances For
The legacy MaxEntGrammar.prob agrees with the generic
ConstraintSystem.predict of the system produced by toSystem.
Equivalent by direct unfolding: both are softmax over the harmony
score on the universal candidate set.