Constraint Systems — The Unified Interface #
A ConstraintSystem packages the three components of any
constraint-based grammar into a single record:
- a finite candidate set,
- a score function (whatever value type the framework needs), and
- a decoder that turns scored candidates into a probability distribution.
Candidates ──score──▶ Score ──decoder──▶ P : Cand → ℝ
Different frameworks differ only in what they pick for each component. Smart constructors below assemble the standard frameworks:
otSystem— OT:Score = LexProfile Nat n, decoder =argminDecoderhgSystem— HG:Score = ℝ, decoder =argmaxDecodermaxEntSystem— MaxEnt:Score = ℝ, decoder =softmaxDecoder α
Every system exposes a single predict operation, so downstream code
can compare frameworks (or replace one with another) without touching
the rest of the analysis.
Stochastic frameworks that put noise on weights or candidates
(NHG, Normal MaxEnt, Stochastic OT) are RUM specializations: the
noise kernel composes with the deterministic decoder. See
NoiseKernel.lean for the Dirac / Gumbel / Gaussian bridges.
A constraint-based grammar: candidate set + score function + decoder.
Cand is the candidate type (often something like an output form,
a syntactic structure, or an input/output pair). Score is whatever
value the constraints aggregate to — ℝ for HG/MaxEnt, LexProfile Nat n for OT, etc.
- candidates : Finset Cand
The finite set of candidates this system evaluates.
- score : Cand → Score
The aggregated score for each candidate.
- decoder : Decoder Cand Score
The choice rule: scored candidates → probability distribution.
Instances For
The predicted probability of each candidate.
Equations
- s.predict = s.decoder.decode s.candidates s.score
Instances For
A system whose decoder is IsProb produces non-negative predictions.
A system whose decoder is IsProb predicts a probability distribution
over its candidate set.
Build an Optimality Theory system: lexicographic minimum on a violation
profile. The candidate set must be finite; profile c gives the
n-tuple of constraint violations for candidate c ranked in
order of dominance.
Equivalent to choosing the OT winner(s) by argmin on the lex order.
Equations
- Core.Constraint.otSystem candidates profile = { candidates := candidates, score := profile, decoder := Core.Constraint.argminDecoder }
Instances For
Build a (deterministic) Harmonic Grammar system: real-valued harmony score, choose the candidate(s) that maximise it.
The harmony score is harmonyScoreR constraints c = -Σⱼ wⱼ · Cⱼ(c).
Higher harmony = lower (weighted) violation = more grammatical.
Equations
- Core.Constraint.hgSystem candidates constraints = { candidates := candidates, score := Core.Constraint.harmonyScoreR constraints, decoder := Core.Constraint.argmaxDecoder }
Instances For
Build a MaxEnt Harmonic Grammar system: same harmony score as hgSystem,
but soft-decode with temperature α. As α → ∞, this converges to
hgSystem (see softmax_argmax_limit in Core.Agent.RationalAction).
The default α = 1 matches @cite{goldwater-johnson-2003}'s
standard MaxEnt formulation.
Equations
- Core.Constraint.maxEntSystem candidates constraints α = { candidates := candidates, score := Core.Constraint.harmonyScoreR constraints, decoder := Core.Constraint.softmaxDecoder α }
Instances For
Core.Constraint.Evaluation.Tableau is the established study-file API for
OT (used by mkTableau ... .optimal = {winner} patterns). The bridge below
shows that Tableau is a special case of ConstraintSystem: the OT score
profile : C → ViolationProfile n is exactly a LexProfile Nat n-valued
score (definitionally), and Tableau.optimal is exactly the support of the
argminDecoder distribution. So existing OT studies can keep their
Tableau/optimal formulation and additionally expose the unified
ConstraintSystem.predict view via tableauSystem.
An OT tableau viewed as a generic ConstraintSystem. The score type
LexProfile Nat n is definitionally ViolationProfile n, so the
argminDecoder's LinearOrder requirement is satisfied by the
standard Pi.Lex instance.
Equations
- Core.Constraint.tableauSystem t = { candidates := t.candidates, score := t.profile, decoder := Core.Constraint.argminDecoder }
Instances For
The unifying identity: tableauSystem's prediction is uniform over
Tableau.optimal. Since Tableau.optimal IS the argmin filter set
by definition, the argminDecoder reduces to the standard "uniform
over winners" formula. All other bridge results follow.
A candidate is in the support of the tableauSystem distribution
iff it is in the tableau's optimal set.
When Tableau.optimal = {winner} (the typical deterministic-OT pattern
used in study files via by decide), the unified predict view assigns
probability 1 to the winner.
And losers in a unique-winner tableau receive probability 0.
Closed-form, monotonicity, and probability-distribution lemmas for
ConstraintSystems decoded by softmaxDecoder. The hypothesis
hd : s.decoder = softmaxDecoder α is rfl for systems built via
maxEntSystem or via the explicit-record pattern study files use
({ candidates := …, score := …, decoder := softmaxDecoder 1 }), so
study-file proofs collapse to one-line delegations.
For a system whose decoder is softmaxDecoder α, the predicted
probability of an in-set candidate has the standard MaxEnt closed form
exp(α · score(c)) / Σ exp(α · score(c')).
Softmax monotonicity: for α > 0 and both candidates in the candidate
set, the candidate with strictly higher score gets strictly higher
predicted probability. Lets MaxEnt study files derive ranking facts
from raw harmony comparisons in one line.
Softmax equality: candidates with equal scores get equal predicted probability. Lets MaxEnt study files derive equiprobability from raw score equality in one line.
For a system whose decoder is softmaxDecoder α, predicted
probabilities sum to 1 over the (non-empty) candidate set.
Specialises predict_sum_eq_one so study files needn't supply
the IsProb instance argument explicitly.