Belth (2026): A Learning-Based Account of Phonological Tiers @cite{belth-2026} #
D2L (Distant-to-Local) is an iterative learner that constructs phonological tiers as a byproduct of trying to express alternations as adjacent dependencies. Given underlying-form / surface-form pairs, D2L starts by projecting all segments and tries to predict the alternating segment from its linear neighbours. When the resulting rule fails Yang's Tolerance Principle @cite{yang-2016}, D2L deletes the offending segments from the tier and tries again, until either a tolerated rule is found or no further deletion helps.
The output of D2L is a tier-based alternation rule, modelled here by the
canonical Phonology.Alternation.TierRule schema (in
Theories/Phonology/Alternation.lean); the closely-related SPE
non-tier Phonology.LocalRewrite.Rule schema in
Theories/Phonology/Process/LocalRewrite.lean is the right substrate
when the alternation does not factor through a tier projection.
The function-level subregular classification of D2L outputs lives in
Core/Computability/Subregular/Function/: tier-mediated dissimilation
rules of the form Belth converges to are Tier-Subsequential
(@cite{aksenova-rawski-graf-heinz-2020}). For Latin -alis / -aris
allomorphy (@cite{belth-2026} §5.3, rule 54), the rule D2L converges
to is
Disagree([?lat], {lat}) / [+cons] __ ∘ proj(·, [+cons]),
i.e. dissimilation of the underspecified affix-initial liquid /L/ from
the immediately preceding [+cons] tier segment. @cite{belth-2026}
reports 0.97 accuracy on a 121-word Perseus dataset; the residual ~3%
errors are tolerated by Yang's Tolerance Principle (4 ≤ 121/ln 121),
so D2L converges to this rule rather than memorizing the lexicon.
This study formalizes:
- the rule (
latinDissimRule) and its predictions on six worked examples (Belth ex. 5/53): navalis, popularis, pluvalis, floralis, legalis, lunaris; - the genuine empirical limit: lunaris surfaces with
[r], but the[+cons]-tier rule predicts[l](the immediately preceding tier segment is the nasal/n/, which is[−lat], soDisagreeoutputs[+lat]). This is one of the ~3% errors the Tolerance Principle tolerates; - a Tolerance-Principle certificate
(
latinDissimRule_tolerated_on_examples) showing the 1-of-6 exception count on this corpus is well under Yang's threshold; - a Subregular bridge (
consTier_apply_eq_tierProject) grounding the consonant tier in the @cite{lambert-2022}/@cite{heinz-rawal-tanner-2011} TSL formalism; - an OCP-on-tier OT constraint (
latinOCP, viaPhonology.Constraints.mkOCPOnTier) and an OT tableau bridge: each empirical contrast becomes a winner-loser pair,tableauERCextracts the Elementary Ranking Condition (@cite{merchant-riggle-2016}), and the OT analysis is shown to track the rule-based analysis exactly, including the same lunaris failure (the lunaris ERC is contradictory on the two-constraint inventory ⟨OCP, *r⟩, which is the OT-side analogue of the rule's underextension).
D2L's Turkish (Belth §5.1) and Finnish (§5.2) rules are sketched in §10 below. They require multi-feature dependencies and explicit Elsewhere defaults; both extensions are admittable on demand.
On verification scope. D2L itself — the learning algorithm — is not run inside Lean. Belth's empirical claim is that, given a corpus, D2L converges to specific rules. Verifying that claim end-to-end would require running D2L on naturalistic datasets (CHILDES, MorphoChallenge, Perseus), which is out of scope for a Lean formalization. We instead formalize the learned rules and verify their predictions on representative examples, plus the Tolerance Principle inequality that licenses convergence.
A minimal Latin segment alphabet sufficient for the -alis / -aris
contrast. The unspecified affix-initial liquid is L; surface
allomorphs are l and r.
- a : LatSeg
- e : LatSeg
- i : LatSeg
- o : LatSeg
- u : LatSeg
- l : LatSeg
- r : LatSeg
- L : LatSeg
- n : LatSeg
- v : LatSeg
- s : LatSeg
- g : LatSeg
- f : LatSeg
- p : LatSeg
- b : LatSeg
Instances For
Equations
- Phenomena.Phonology.Studies.Belth2026.instDecidableEqLatSeg x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯
Equations
Equations
- One or more equations did not get rendered due to their size.
Instances For
[+cons] per @cite{cser-2010}'s feature inventory: every non-vowel
is consonantal, except the orthographic ⟨v⟩, which @cite{belth-2026}
treats as the glide [w] ([−cons]). This is the choice that lets
pluv-aLis surface as [r] (the tier-preceding consonant of pluv-
is /l/, not /v/). L is [+cons] — the underspecified /L/
projects onto the consonant tier even though its [lat] value is
not yet fixed.
Equations
- Phenomena.Phonology.Studies.Belth2026.LatSeg.a.IsCons = False
- Phenomena.Phonology.Studies.Belth2026.LatSeg.e.IsCons = False
- Phenomena.Phonology.Studies.Belth2026.LatSeg.i.IsCons = False
- Phenomena.Phonology.Studies.Belth2026.LatSeg.o.IsCons = False
- Phenomena.Phonology.Studies.Belth2026.LatSeg.u.IsCons = False
- Phenomena.Phonology.Studies.Belth2026.LatSeg.v.IsCons = False
- x✝.IsCons = True
Instances For
Equations
- One or more equations did not get rendered due to their size.
[+lat] per @cite{cser-2010}: only l is lateral, and L is
underspecified for [lat]. Returning Option Bool rather than
Bool here avoids the older "junk default" pattern where L had
to fake some concrete value — none is the honest answer.
Equations
- Phenomena.Phonology.Studies.Belth2026.LatSeg.l.isLat = some true
- Phenomena.Phonology.Studies.Belth2026.LatSeg.L.isLat = none
- x✝.isLat = some false
Instances For
The consonantal tier (Belth's learned tier for Latin, rule 54).
Every [+cons] segment projects.
Equations
Instances For
The rule D2L learns under the [+cons] tier (@cite{belth-2026}
rule 54): Disagree([?lat], {lat}) / [+cons] __ ∘ proj(·, [+cons]).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Underlying form of popularis 'popular': /popul-aLis/.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Underlying form of pluvalis 'rainy': /pluv-aLis/. With /v/ off
the consonant tier (treated as [w], [−cons]), the tier-preceding
consonant in pluv-a is /l/.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Underlying form of floralis 'floral': /flor-aLis/.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Underlying form of legalis 'legal': /leg-aLis/.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Underlying form of lunaris 'lunar': /lun-aLis/. The empirical
counterexample: surface is [r], but the rule predicts [l]
because the immediately preceding [+cons] segment in /lun-a/ is
the nasal /n/ ([−lat]), and Disagree outputs [+lat].
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
Instances For
Equations
Instances For
Equations
Instances For
Equations
Instances For
Surface value predicted by the learned rule: some true = surfaces
as l, some false = surfaces as r, none = the rule has no
opinion.
Equations
- Phenomena.Phonology.Studies.Belth2026.predicted ur lPos = Phenomena.Phonology.Studies.Belth2026.latinDissimRule.applyAt (List.take lPos ur)
Instances For
popularis: tier-preceding consonant is /l/ ([+lat]). Disagree
outputs [−lat] = r. ✓
pluvalis: with /v/ off the consonant tier, tier-preceding
consonant of /pluv-a/ is /l/ ([+lat]). Disagree outputs
[−lat] = r. ✓
floralis: tier-preceding consonant is /r/ ([−lat]). The
intervening /r/ blocks the dissimilation that popularis's
preceding /l/ would have triggered — exactly the long-distance
pattern Belth's tier-projection is designed to capture. ✓
legalis: tier-preceding consonant is /g/ ([−lat]). The
stem-initial /l/ is hidden from /L/ by the intervening /g/
on the consonant tier. ✓
lunaris: the rule predicts [l] (the last [+cons] segment in
/lun-a/ is the nasal /n/, which is [−lat], so Disagree
outputs [+lat]). This is the wrong prediction — the surface form
is [r]. @cite{belth-2026} reports this as one of the ~3% of
errors the Tolerance Principle tolerates; D2L does not refine the
tier further on this dataset.
Witness pairs (underlying-form-prefix, expected-surface-lat-value)
for each worked example.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The learned rule misses exactly one datum out of six (lunaris) — a 5/6 ≈ 83% match on this minimal corpus, the same single-error rate Belth reports at the corpus scale (~3%).
Yang's Tolerance Principle is the productivity gate inside D2L: a
rule with n items in scope and e exceptions is tolerated iff
e ≤ n / ln n. For our six worked examples with one exception
(lunaris), 1 ≤ 6 / ln 6 ≈ 3.35 — the rule is comfortably under the
threshold, matching @cite{belth-2026}'s 4-of-121 (~3%) result on the
full Perseus corpus.
The threshold inequality for the six-example demo: 1 exception ≤ 6/ln 6.
The learned rule passes Yang's Tolerance Principle on the worked
examples: its exception count (1, lunaris) fits under the threshold
latinData.length / ln latinData.length. This is the productivity
certificate that licenses D2L's convergence to latinDissimRule
rather than memorizing each form.
The consonant tier projection equals the canonical
tierProject from the TSL formalism in Core/Computability/Subregular/.
By construction both reduce to List.filter on the [+cons] predicate,
so the autosegmental and language-theoretic tiers coincide. This
grounds Belth's tier in the TSL_k subregular hierarchy.
The TSL_2 grammar witnessing the Latin allomorphy pattern as a
tier-based subregular language: project to the [+cons] tier, then
forbid adjacent identical symbols. Lambert's @cite{lambert-2022}
TSL_k schema, instantiated with IsCons as the tier predicate and
the OCP forbidden 2-factor [some x, some x].
Equations
Instances For
Belth's tier-based dissimilation has a natural OCP twin: surface
forms produced by the rule should not contain adjacent [+lat]
segments on the consonant tier. We expose this connection via
Phonology.Constraints.mkOCPOnTier, which already accepts a generic
Core.Tier. A tier-adjacent pair of identical liquids contributes
one violation.
Equations
Instances For
The OCP constraint is a markedness constraint by construction.
The OCP-on-tier evaluation of latinOCP on a candidate is zero iff
that candidate is in latinTSLGrammar.lang. Specialization of
Phonology.Subregular.mkOCPOnTier_zero_iff_in_ocp_lang to the Latin
grammar. The two perspectives — markedness constraint with zero
violations and TSL_2 grammar membership — coincide.
The OT analysis uses a minimal two-constraint inventory:
latinOCP(markedness): no adjacent laterals on the[+cons]tier.latinStarR(markedness): the default-[l]preference, penalizing each surface[r].
On this inventory, popularis and pluvalis each force OCP ≫ *r; navalis, floralis, legalis are vacuously satisfied. lunaris is unrankable — its ERC is contradictory (no W, one L), recapitulating the rule's empirical limit.
Substitute val for every L in a candidate string. The two
surface candidates for an underlying form are substL .l ur and
substL .r ur.
Equations
- Phenomena.Phonology.Studies.Belth2026.substL val = List.map fun (seg : Phenomena.Phonology.Studies.Belth2026.LatSeg) => if seg = Phenomena.Phonology.Studies.Belth2026.LatSeg.L then val else seg
Instances For
Markedness constraint penalizing each surface [r]: the
default-[l] preference.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Latin ranking implied by allomorphy: OCP/[+cons] dominates *r.
Equations
Instances For
The two surface candidates for an underlying form.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Membership in latinTSLGrammar.lang is decidable: the bridge to the
integer-valued OCP score (latinOCP_zero_iff_in_TSL) transports the
Decidable (c.eval = 0) instance to language membership.
Equations
The empirically expected TSL_2 membership table for the Belth Latin inventory: each row pairs a candidate with whether the OCP-on-tier grammar should admit it. pluvalis and floralis are the discriminating cases (one candidate excluded for tier-adjacent identicals); navalis admits both (no adjacency); lunaris admits both (the OCP-on-tier grammar's empirical limit — the dissimilation rule fires even though only one tier-lateral is adjacent).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The OCP-on-tier TSL_2 grammar agrees with the expected membership table
on every row of latinTSLExpected: the empirical payoff of the
latinOCP_zero_iff_in_TSL bridge in one shot.
popularis: OCP fires once on [p,o,p,u,l,a,l,i,s] (the tier-adjacent
(l, l)). Only *r-once on the [r]-candidate — but OCP outranks, so
[r] wins.
pluvalis: same OCP-ranking-decides pattern as popularis (the
consonant tier of /pluv-a/ is [p, l], so substituting L → l
creates a tier-adjacent (l, l)).
lunaris: the OT analysis selects [l], not the empirical [r] —
the same error the rule makes (§5). Both candidates have OCP=0
on the consonant tier (the intervening /n/ separates the laterals
in /lun-a-l-is/ so they are not tier-adjacent), so *r picks the
[l]-candidate.
The ERC induced by popularis (winner [r], loser [l]): OCP prefers
the winner (W on constraint 0); *r prefers the loser (L on constraint
1). The full vector is ⟨W, L⟩ = simpleERC 0 1.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The ERC induced by pluvalis: same shape as popularis.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The ERC induced by floralis.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The ERC induced by legalis.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The ERC induced by lunaris (taking the empirical [r] as the winner): OCP indifferent (no laterals are tier-adjacent on either candidate), *r prefers the loser [l]. ⟨e, L⟩ — contradictory (no W, one L). No ranking on the inventory ⟨OCP, *r⟩ can satisfy this; the OT formulation thus reproduces exactly the rule's empirical limit.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The five non-lunaris contrasts form a consistent ERC set: the
identity ranking (OCP at position 0, *r at position 1) satisfies
every word's ranking requirements. Two informative ERCs (popularis,
pluvalis) both reduce to simpleERC 0 1; the other three are
trivial.
Bridge to the dominance characterisation: under any ranking satisfying the empirical ERC set (sans lunaris), OCP dominates *r.
TSL_2 witness: Latin liquid dissimilation is tier-strictly-local at window-size 2.
BTSL_2 corollary (via IsTierStrictlyLocal.toIsBTSL in
Core.Computability.Subregular.Multitier): Latin liquid dissimilation
is in the multitier closure of strictly local languages, hence consumed
by the @cite{lambert-2026} BTC framework.