[Cum15]: OT Constraints, k-ness, NSAL, and RSA Cost #
[Cum15] [KWBG14] [Las99] [WWL+23]
Self-contained study of [Cum15]: an Optimality-Theoretic
constraint system for numeral production plus the bridges to graded
roundness, RSA cost, and the [KWBG14] PrecisionMode
switch.
OT Constraints #
Speakers choose among candidate numeral expressions by trading off four constraints:
- INFO (informativeness): prefer smaller admitted set
- Granularity: match contextual precision level
- QSIMP (quantifier simplicity): prefer bare numerals
- NSAL (numeral salience): prefer round/salient numerals
The empirical bridge to the k-ness model is nsalViolations: NSAL
violations equal maxRoundnessScore - roundnessScore(n), so rounder
numbers incur fewer NSAL violations.
A ViolationProfile is Core.Optimization.Profile Nat 4 (i.e. Fin 4 → Nat),
matching the codebase's standard violation-profile abstraction. Profile
indices fix the constraint order: 0 = INFO, 1 = Granularity, 2 = QSIMP,
3 = NSAL.
Two preorders on profiles are exposed, both as Core.Order.PullbackPreorder
instances so they fit the schema established in Core/Optimization/Pareto.lean:
| view | feature | feature-space order |
|---|---|---|
paretoPreorder | id : ViolationProfile → ViolationProfile | pointwise ≤ |
qualPreorder | nonzeroSetOf : … → Finset (Fin 4) | subset ⊆ |
paretoPreorder_le_implies_qualPreorder_le is a one-line application of
PullbackPreorder.coarsen_via_monotone with nonzeroSetOf as the
connecting monotone map. The legacy criterion-based view via
SatisfactionOrdering is preserved as cumminsOrdering.
Downstream Bridges #
- NSAL as RSA cost — the OT violation count, normalised to
[0, 1], serves as the RSAcost : U → ℚparameter. Round numerals are cheap; non-round are expensive. - k-ness as
PrecisionModethreshold —roundnessScore ≥ 2triggers.approximatemode, grounding the binary meaning-projection switch (f_evsf_a) of [KWBG14] — the projection component only, not that paper's joint goal inference. The pragmatic halo width ([Las99], [Kri07a]) is also wider for round numerals.
The lexical bridge to English.NumeralModifiers documents the
expected interaction between requiresRound modifiers and high k-ness
anchors.
Form of the numeral expression. Bare numerals are simplest; modified forms add complexity (and pay a QSIMP cost).
- bare : QuantifierForm
- modified : QuantifierForm
- interval : QuantifierForm
Instances For
Equations
- Cummins2015.instReprQuantifierForm = { reprPrec := Cummins2015.instReprQuantifierForm.repr }
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Cummins2015.instDecidableEqQuantifierForm x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
A candidate numeral expression for OT evaluation.
- numeral : ℕ
- form : QuantifierForm
- contextGranularity : ℕ
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Infer granularity from a numeral's trailing zeros: 100 → 2, 110 → 1, 111 → 0.
Equations
- Cummins2015.inferGranularity n = if n = 0 then 0 else if 1000 ∣ n then 3 else if 100 ∣ n then 2 else if 10 ∣ n then 1 else 0
Instances For
INFO: violations = |admittedSet| - 1. Less informative expressions
(which admit more values) incur more violations.
Equations
- Cummins2015.infoViolations admittedCount = admittedCount - 1
Instances For
Granularity: violations = |contextGranularity − utteranceGranularity|.
Equations
- Cummins2015.granularityViolations contextGranularity uttGranularity = if contextGranularity ≥ uttGranularity then contextGranularity - uttGranularity else uttGranularity - contextGranularity
Instances For
QSIMP: bare = 0, modified = 1, interval = 2.
Equations
Instances For
NSAL: violations = maxRoundnessScore - roundnessScore(n). Maximally
round numbers (score 6) → 0; non-round (score 0) → 6. The complement
of the graded roundness score, which is the load-bearing connection
between the OT account and the k-ness model.
Equations
Instances For
NSAL violations are the complement of roundness score.
A rounder numeral always has fewer NSAL violations.
A ViolationProfile is a Profile Nat 4 — a 4-vector of violation
counts indexed by Fin 4. Index order: 0 = INFO, 1 = Granularity,
2 = QSIMP, 3 = NSAL. Reusing Core.Optimization.Profile means the
abstractions in Core/Optimization/Pareto.lean (Pareto preorder,
qualitative coarsening, coarsen_via_monotone) apply directly without
an extra projection layer.
Equations
Instances For
Compute the violation profile for a candidate. Indices: 0 = INFO, 1 = Granularity, 2 = QSIMP, 3 = NSAL.
Equations
- Cummins2015.violationProfile c admittedCount 0 = Cummins2015.infoViolations admittedCount
- Cummins2015.violationProfile c admittedCount 1 = Cummins2015.granularityViolations c.contextGranularity (Cummins2015.inferGranularity c.numeral)
- Cummins2015.violationProfile c admittedCount 2 = Cummins2015.qsimpViolations c.form
- Cummins2015.violationProfile c admittedCount 3 = Cummins2015.nsalViolations c.numeral
Instances For
Pointwise Pareto preorder on violation profiles as a PullbackPreorder
with the identity feature: v ≤ v' iff every component of v is ≤ the
corresponding component of v'. The partial-order backbone that lex-OT
extends; lex breaks ties using the constraint ranking.
Equations
- Cummins2015.paretoPreorder = Core.Order.PullbackPreorder.ofProj id fun (a a' : Cummins2015.ViolationProfile) => have this := inferInstance; this
Instances For
Qualitative coarsening as a PullbackPreorder: feature is the set of
violated constraint indices, feature-space order is Finset.⊆. The
underlying preorder is identical in extension to cumminsOrdering's, but
presented as a feature pullback so the same coarsen_via_monotone schema
applies.
Equations
- Cummins2015.qualPreorder = Core.Order.PullbackPreorder.ofProj Core.Optimization.nonzeroSetOf fun (x x_1 : Cummins2015.ViolationProfile) => inferInstance
Instances For
Pointwise dominance ⇒ qualitative dominance. A one-line corollary of
PullbackPreorder.coarsen_via_monotone with the violated-index extractor
as the connecting monotone map — identical in shape to
paretoPullbackPreorder_le_implies_supportPullbackPreorder_le in
Core/Optimization/Pareto.lean.
The four OT constraints as a criterion type. Field order matches
ViolationProfile indices (info ↔ 0, …, nsal ↔ 3).
- info : NumeralConstraint
- granularity : NumeralConstraint
- qsimp : NumeralConstraint
- nsal : NumeralConstraint
Instances For
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Cummins2015.instDecidableEqNumeralConstraint x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Coarse-grain a violation profile: a constraint is "satisfied" iff it
has 0 violations. Bool-valued for SatisfactionOrdering.
Equations
- Cummins2015.constraintSatisfied v Cummins2015.NumeralConstraint.info = (v 0 == 0)
- Cummins2015.constraintSatisfied v Cummins2015.NumeralConstraint.granularity = (v 1 == 0)
- Cummins2015.constraintSatisfied v Cummins2015.NumeralConstraint.qsimp = (v 2 == 0)
- Cummins2015.constraintSatisfied v Cummins2015.NumeralConstraint.nsal = (v 3 == 0)
Instances For
A SatisfactionOrdering over violation profiles using the four
[Cum15] constraints. Coarsens the OT system: a candidate
"satisfies" a constraint iff it incurs 0 violations. The resulting
ordering is weaker than lex-OT (which discriminates by violation
degree) but captures the structural backbone: a candidate that
satisfies a strict superset of constraints is always OT-preferred.
Equations
- One or more equations did not get rendered due to their size.
Instances For
A candidate with zero violations everywhere is at-least-as-good as any other under the satisfaction ordering.
NSAL violations as a normalised RSA cost ∈ [0, 1]. Bridges the OT NSAL
constraint to the RSA cost parameter: round numerals are "cheap"
(≈ 0), non-round are "expensive" (≈ 1). The denominator is
maxRoundnessScore, so the codomain is exactly [0, 1].
Equations
Instances For
Maximally round numerals are free (zero cost). Reduces to the
Nat-level fact nsalViolations 100 = 0 (closed by decide) plus
0 / x = 0 over ℚ.
Round numerals incur strictly lower RSA cost than non-round ones.
Nat-level reduction of nsalViolations followed by norm_num over ℚ.
inferPrecisionMode (in Numerals.Precision) is defined by
roundnessScore n ≥ 2 → .approximate. This subsection records the
grounding theorems for representative numerals plus the general bridge
that every multiple of 10 falls in .approximate mode.
100 is round (score 6) → .approximate mode.
7 is non-round (score 0) → .exact mode.
General bridge. Every multiple of 10 is interpreted in
.approximate mode. Follows from score_ge_two_of_div10: the score is
at least 2, so the roundnessScore n ≥ 2 branch of inferPrecisionMode
fires.
Pragmatic halo width is strictly wider for round numerals (100) than for non-round (7) — the quantitative content of [Las99]'s halo intuition under the k-ness operationalisation.
The lexical entry for "approximately" carries requiresRound = false.
[Cum15] predicts that combinations with low-k anchors are
grammatical but marked — naturalness correlates with k-ness rather than
being a hard constraint.