@cite{lambert-2026}: Multitier phonotactics with logic and algebra #
Lambert (2026) classifies attested phonotactic constraints β bounded and unbounded stress, harmony, and tone across ~13 languages β into the multitier (Boolean closure of tier-projected) extensions of definite, generalized definite, and finite-or-cofinite classes. Headline empirical claims:
- Uyghur backness harmony is multitier definite (BTD) β strictly weaker than the multiple-tier-based strictly local class of @cite{de-santo-graf-2019}, settling (categorically) the challenge raised by @cite{mayer-major-2018}.
- Karanga Shona tone is multitier generalized definite (BTLI) β no more complex than the default-to-opposite unbounded stress patterns, refining the melody-local analysis of @cite{jardine-2020}.
The propositional logic is BoolClosure (IsTierBased π) for π in
{IsDefinite k, IsGeneralizedDefinite k, IsStrictlyLocal k,
IsStrictlyPiecewise k, IsFiniteOrCofinite}; the algebraic side is
the syntactic-semigroup characterization of each class via Eilenberg
@cite{eilenberg-1976} variety equations (e.g., D = β¦sxΜ = xΜβ§,
ββ = β¦x^Ο y x^Ο z x^Ο = x^Ο y x^Οβ§ per @cite{straubing-1985} and
@cite{almeida-1995}). The Lean substrate (BoolClosure, IsBTC,
IsTierBased) lives in Subregular/Multitier.lean; the algebraic
characterization is queued for a future SyntacticMonoid PR.
Disclaimer 1: McCollum (2019) Uyghur gradience (linglib audit) #
This disclaimer is not a scope qualification carried by Lambert
(2026); the paper does not cite McCollum. It is a linglib-internal
audit annotation: Lambert's BTD analysis is faithful to
@cite{mayer-major-2018}'s categorical idealization, and a separate
literature line β @cite{mccollum-2019} β argues the suffix backness
assignment is not categorical in the way the multitier-definite formula
requires. The "arbitrarily specified, statistical tendency to be back"
clause that Mayer & Major report for the no-V no-C case is precisely
the locus where McCollum's gradient data resists categorical analysis.
The headline theorem uyghur_backness_isBTD characterizes the
categorical pattern only; the gradience is out of scope.
Disclaimer 2: Karanga Shona scope restriction #
The BTLI analysis applies to the verb-stem domain (post-hyphen
material in Lambert (2026) (45)). @cite{jardine-2020}'s motivation for a
melody local class extends across morphological boundaries and to
longer melodic patterns; the BTLI characterization is not a refutation
of the broader melody-local programme but a delimited result for the
verb-stem surface pattern. The headline theorem
karanga_shona_verb_stem_isBTLI is named accordingly.
Cross-framework dialogue #
The multitier substrate is the prohibition reading of constraints scaled to Boolean closure. Cross-references the new file makes explicit (rather than silently diverging from existing linglib formalizations):
- OT: linglib's
NamedConstraintframework places no complexity bound; Lambert says all phonotactics live inIsBTC. The supraregular counter-witness theorem and the positivemkForbidPairsOnTier β TSL_2theorem are queued for a futureTheories/Phonology/Subregular/OTBound.lean. - Harmonic Serialism:
Phenomena/Tone/Studies/McPhersonLamont2026.leanproves Poko surface tone HS-derivable but parallel-OT-impossible. Lambert's static BTC characterization, applied to Poko's surface stringset, would clarify static description β alternation explanation. Cross-reference to be added when Poko's surface stringset is independently classified. - Autosegmental: linglib's
Theories/Phonology/Autosegmental/{ RegisterTier, GrammaticalTone}.leanformalize multiply-linked tone representations. Lambert (2026) Β§5 self-confesses that string-based analysis loses information for tone; the loss theorem is stated aslambert_string_input_loses_tone_associations(sorry'd) below. - OCP:
Theories/Phonology/Subregular/OCP.leancarries a prohibition-vs-merger distinction;IsBTCis the mathematical home of the prohibition family at scale. The OCP file's docstring should gain a "see also: BTC" link in a follow-up retrofit. - Structure-sensitive MTSL @cite{de-santo-graf-2019}: not formalized in
linglib. Lambert's "BTD strictly supersedes SS-MTSL on Uyghur" is
recorded as a TODO theorem (
btd_supersedes_ss_mtsl_on_uyghur) for when SS-MTSL substrate lands.
Audit calibration note #
Per linglib's domain-expert agent calibration: the McCollum-2019 and Karanga-Shona-scope concerns are flagged HIGH but should be treated PROVISIONAL β they are corrections to scope, not refutations of the formal results. The Lean theorems below state the formal claims; the empirical disclaimers live in this docstring and the per-theorem docstrings.
Iban syllable type @cite{omar-1969}: stressed (ΟΜ) or unstressed (Ο).
The two-letter alphabet of Lambert (2026) Β§2.1.
Instances For
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.instDecidableEqIbanSyl xβ yβ = if h : xβ.ctorIdx = yβ.ctorIdx then isTrue β― else isFalse β―
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Iban stress-final language: a word is well-formed iff its final
syllable is stressed @cite{omar-1969}. As a DefiniteGrammar 1: the
permitted length-1 right-edge substring is [stressed].
Equations
Instances For
Iban stress-final β D_1 (Lambert 2026 Β§2.1, paper p. 4 example
(2)). Definitional witness: the DefiniteGrammar 1 whose permitted
final 1-suffix is [stressed]. The general k-definite Proposition (4)
characterizes this class abstractly; the Iban witness is the
specialisation for k = 1.
Amara stress @cite{thurston-1966}: penultimate-syllable stress with the monosyllabic exception (single syllable bears stress).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Amara stress-penult language.
Equations
Instances For
Amara stress-penult β D_2 (Lambert 2026 Β§2.1). Witnessed by the 2-grammar permitting either a stressed monosyllable or a stressed-then- unstressed 2-suffix.
Finnish stress @cite{suomi-toivanen-ylitalo-2008}: primary stress is
fixed to the initial syllable. The reverse-definite dual of Iban
stress-final, witnessing the IsReverseDefinite 1 class @cite{lambert-2026}
Β§2.2.
Equations
Instances For
The Finnish stress-initial language.
Equations
Instances For
Finnish stress-initial β K_1 (Lambert 2026 Β§2.2, paper p. 5
example (6)). Definitional witness: the ReverseDefiniteGrammar 1
whose permitted initial 1-prefix is [stressed].
Uyghur segment classes relevant to backness harmony @cite{mayer-major-2018} @cite{lambert-2026} (33)-(35): front vowels, back vowels, transparent (i, e), front dorsal consonants, back dorsal consonants, suffix-marked harmonizing vowels and consonants, and a residual "other" class for non-harmonizing material.
- frontVowel : UyghurSeg
- backVowel : UyghurSeg
- transparentVowel : UyghurSeg
- frontDorsal : UyghurSeg
- backDorsal : UyghurSeg
- suffixFrontVowel : UyghurSeg
- suffixBackVowel : UyghurSeg
- suffixFrontDorsal : UyghurSeg
- suffixBackDorsal : UyghurSeg
- other : UyghurSeg
Instances For
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.instDecidableEqUyghurSeg xβ yβ = if h : xβ.ctorIdx = yβ.ctorIdx then isTrue β― else isFalse β―
Equations
- One or more equations did not get rendered due to their size.
Instances For
Tier predicates #
Lambert's V_f βͺ V_b is the harmonising-vowel tier; C_f βͺ C_b is the
dorsal tier; S_f and S_b are the front- and back-suffix tiers
(V_X βͺ C_X marked as suffix). Each is a Boolean predicate on
UyghurSeg.
Harmonising-vowel tier: V_f βͺ V_b.
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.isHarmonyingVowel Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.frontVowel = true
- Phenomena.Phonotactics.Studies.Lambert2026.isHarmonyingVowel Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.backVowel = true
- Phenomena.Phonotactics.Studies.Lambert2026.isHarmonyingVowel xβ = false
Instances For
Dorsal-consonant tier: C_f βͺ C_b.
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.isDorsal Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.frontDorsal = true
- Phenomena.Phonotactics.Studies.Lambert2026.isDorsal Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.backDorsal = true
- Phenomena.Phonotactics.Studies.Lambert2026.isDorsal xβ = false
Instances For
Front-suffix tier S_f: front-marked suffix material (vowel or dorsal).
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.isSuffixFront Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.suffixFrontVowel = true
- Phenomena.Phonotactics.Studies.Lambert2026.isSuffixFront Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.suffixFrontDorsal = true
- Phenomena.Phonotactics.Studies.Lambert2026.isSuffixFront xβ = false
Instances For
Back-suffix tier S_b: back-marked suffix material (vowel or dorsal).
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.isSuffixBack Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.suffixBackVowel = true
- Phenomena.Phonotactics.Studies.Lambert2026.isSuffixBack Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg.suffixBackDorsal = true
- Phenomena.Phonotactics.Studies.Lambert2026.isSuffixBack xβ = false
Instances For
Atomic tier-projected definite languages #
Each of the four atomic predicates in (35) is a IsTierBased (IsDefinite 1) test. We build them as DefiniteGrammar 1 instances
on the filtered tier and lift via IsTierBased.
The language "tier T is empty": after filtering by T, the word
is []. Encoded as IsTierBased (IsDefinite 1) via the singleton
1-suffix {[]} (the only word whose right-1-suffix is [] is the
empty word).
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.tierEmptyLang T = {w : List Phenomena.Phonotactics.Studies.Lambert2026.UyghurSeg | List.filter T w = []}
Instances For
The language "tier T ends with x": after filtering by T, the
word's last element is x. Encoded as IsTierBased (IsDefinite 1)
via the singleton 1-suffix {[x]}.
Equations
- One or more equations did not get rendered due to their size.
Instances For
tierEmptyLang T is in IsTierBased (IsDefinite 1). The base
1-definite language is the singleton {[]} (only [] has right-1-
suffix []).
tierEndsWithLang T x is in IsTierBased (IsDefinite 1). The
base 1-definite language is the singleton {[x]}.
The Uyghur backness language as conjunction of (35a)-(35b) #
The English implication A β B is Boolean AαΆ β¨ B; written using
mathlib's Language lattice (Boolean algebra), AαΆ β B. The full
language is the intersection of four such implications.
The Uyghur backness harmony language as the conjunction of the four implications in Lambert (2026) (35a)-(35b). Categorical idealisation β see file docstring for the @cite{mccollum-2019} gradience disclaimer.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Uyghur backness harmony β BTDβ (Lambert 2026 Β§4.3, eq. (35),
refining @cite{mayer-major-2018}). Constructive witness: the
formalised uyghurBacknessLang is the intersection of four
implications, each AαΆ β B where A and B are
IsTierBased (IsDefinite 1) (each is a tier-projected single-suffix
test). The BTD membership follows from BoolClosure.{base, compl, union, inter} applied to the four atomic tier-projected definite
languages.
Karanga Shona tone alphabet @cite{odden-1984} @cite{lambert-2026} Β§5.6: low (β) and high (h).
Instances For
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.instDecidableEqKShoTone xβ yβ = if h : xβ.ctorIdx = yβ.ctorIdx then isTrue β― else isFalse β―
Equations
- One or more equations did not get rendered due to their size.
Instances For
Atomic IsGeneralizedDefinite languages (uniform k = 5) #
Each component of Lambert's Ο_F β¨ L_m β¨ H_m reduces to a Boolean
combination of edge-anchored substring tests and tier-projected
substring tests. We encode each as a Language KShoTone and prove it's
IsGeneralizedDefinite 5 directly via List.take_take /
List.drop_drop-style structural arguments. The uniform k = 5 is
chosen as 1 + max(prefix length, suffix length, tier-projection length, max Ο_F word length) = 1 + 4.
"Word starts with xs": the language {w | w.take xs.length = xs}.
Originally introduced for KShoTone but Ξ±-generic; also reused for
LugandaTone in Β§10 Kagoshima Japanese.
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.startsWithLang xs = {w : List Ξ± | List.take xs.length w = xs}
Instances For
"Word ends with xs": the language {w | w.drop (w.length - xs.length) = xs}.
Ξ±-generic, reused for LugandaTone in Β§10 Kagoshima Japanese.
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.endsWithLang xs = {w : List Ξ± | List.drop (w.length - xs.length) w = xs}
Instances For
"Tier-projection by T equals exactly xs": the language
{w | w.filter T = xs}. Ξ±-generic, reused for LugandaTone in Β§10
Kagoshima Japanese.
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.tierEqualLang T xs = {w : List Ξ± | List.filter T w = xs}
Instances For
Boolean tier predicate for h-tier (high tones only).
Equations
Instances For
IsGeneralizedDefinite witnesses at k = 5 #
Unfolding helper: Edge.left.takeAt k w = w.take k and
Edge.right.takeAt k w = w.drop (w.length - k) by rfl. The
hypotheses from IsGeneralizedDefinite k come in the Edge.takeAt
form; we unfold via show at the top of each proof.
startsWithLang xs is IsGeneralizedDefinite k for any k β₯ xs.length. Proof: same k-prefix on both words determines the
xs.length-prefix via List.take_take. Ξ±-generic.
endsWithLang xs is IsGeneralizedDefinite k for any k β₯ xs.length. Symmetric to startsWithLang_isGenDef; the underlying
identity is w.drop (w.length - xs.length) = (w.drop (w.length - k)).drop (k - xs.length) when xs.length β€ k β€ w.length. The general case
splits on whether w is shorter than k.
tierEqualLang T xs is IsTierBased (IsGeneralizedDefinite k) for
any k > xs.length (strict β without strictness, e.g. {[h, h]} is
not GeneralizedDefinite 2 since [h, h, h] and [h, h] share both
2-prefix and 2-suffix).
The seven fully specified words from Lambert (2026) Β§5.6 (just above
eq. (46), paper p. 19). Max length 4 (hhβh).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The finite-language part Ο_F of Lambert's witness β the seven
fully specified words.
Equations
Instances For
phi_F is IsGeneralizedDefinite 5. Max word length is 4, so for
k = 5 > 4, any word of length β€ 4 has k-prefix = whole word. Two
words with same 5-prefix and length β€ 4 are equal; any word with
length > 4 has a 5-prefix of length 5 (or whole) which differs from
any short word's 5-prefix.
The Karanga Shona verb-stem tone language as the disjunction Ο_F β¨ L_m β¨ H_m from Lambert (2026) Β§5.6 (formula appearing just after
eq. (49), paper p. 19).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Karanga Shona verb-stem tone β BTLIβ
(Lambert 2026 Β§5.6,
refining @cite{jardine-2020}). Constructive witness for the disjunction
Ο_F β¨ L_m β¨ H_m at uniform k = 5. Each disjunct lifts to
IsBTC (IsGeneralizedDefinite 5) via IsTierBased.of_class +
BoolClosure.base; the disjunction is closed by BoolClosure.union.
The Tsuut'ina sibilant alphabet @cite{cook-1978} is the shared
three-class SibilantTierSeg (anterior s, posterior Κ, neutral
non-sibilant) defined in Theories/Phonology/SibilantTier.lean.
Equations
Instances For
The TSL_2 grammar for Tsuut'ina asymmetric sibilant harmony
@cite{cook-1978} @cite{lambert-2026} Β§4.2: anterior preceding posterior
on the sibilant tier is forbidden. Reuses the shared substrate
asymmetricHarmonyAntFirst from Theories/Phonology/SibilantTier.lean.
Equations
Instances For
The Tsuut'ina asymmetric sibilant harmony language. Defined as the language of the TSL_2 witness so that the membership theorem is definitional.
Equations
Instances For
Tsuut'ina asymmetric harmony β TSL_2 (Lambert 2026 Β§4.2).
Definitional witness: the TSL_2 grammar tsuutinaTSLGrammar.
Refutation: Tsuut'ina β BTLI #
Lambert (2026) Β§4.2 parameterised counterexample: for every k, the
words Κα΅βΊΒΉsα΅βΊΒΉ (accepted) and Κα΅ s Κ sα΅ (rejected) share the
length-k tier-prefix and length-k tier-suffix on every Bool tier.
Both witnesses are sandwiches with asymmetric bookends posterior
on the left and anterior on the right; widths differ between
witnesses (k+1 for accepted, k for rejected). The 8-tier
enumeration (3 alphabet classes Γ 2 keep/drop) collapses to 4 since the
witnesses contain no neutrals.
The substrate Sandwich (in Core/Computability/Subregular/Sandwich.lean)
handles the (true, true) case directly via takeAt_*_sandwich (the
bookends are wide enough); the other three cases reduce to filtered
words being equal (after a replicate_succ rewrite to merge the middle
contribution into the bookend).
The accepted Tsuut'ina parameterised witness Κα΅βΊΒΉ sα΅βΊΒΉ. Sibilant
tier projection: posterior^(k+1) ++ anterior^(k+1) β no anterior
precedes any posterior.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The rejected Tsuut'ina parameterised witness Κα΅ s Κ sα΅. The
internal [anterior, posterior] is the forbidden adjacency on the
sibilant tier.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Tsuut'ina asymmetric harmony β BTLI (Lambert 2026 Β§4.2, parametrised counterexample).
Luganda tone alphabet @cite{hyman-katamba-2010}: low (β) and high (h), following @cite{jardine-2020}'s identification of "intermediate" with low in the input.
- low : LugandaTone
- high : LugandaTone
Instances For
Equations
- Phenomena.Phonotactics.Studies.Lambert2026.instDecidableEqLugandaTone xβ yβ = if h : xβ.ctorIdx = yβ.ctorIdx then isTrue β― else isFalse β―
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Luganda high-tone plateauing predicate @cite{lambert-2026} (37):
no [h, β, h] subsequence (at most one high span), and if any high tone
appears then there is a later low ([h, β] is a subsequence).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Luganda high-tone plateauing language.
Equations
Instances For
Membership in lugandaLang is membership in lugandaPred.
Luganda high-tone plateauing β PT_3 (Lambert 2026 (37)). The
predicate depends only on length-β€-3 subsequence presence: the
length-3 [h, β, h], the length-2 [h, β], and the length-1 [h].
The proof reduces each conjunct of lugandaPred to the corresponding
subseqSet 3 membership question, then transfers via
subseqSet_eq_iff.
Refutation: Luganda β BTLI #
Lambert (2026) Β§5.1 parameterised counterexample: for every k, the words
βα΅ β h h β βα΅ (accepted) and βα΅ β h β h β βα΅ (rejected) share the
length-k tier-prefix and length-k tier-suffix on every Bool tier.
Both witnesses are sandwiches with bookend low and width k; the
substrate Sandwich (in Core/Computability/Subregular/Sandwich.lean)
handles the bookend-keeps-tier-affix case directly, leaving only the
small filtered-middle equality as per-witness work.
Membership separation is predicate-level (not TSL-grammar level):
lugandaPred requires no [h, β, h] subsequence and (if any high
appears) a [h, β] subsequence. Accepted satisfies both; rejected fails
the first because its explicit middle [β, h, β, h, β] contains
[h, β, h] as a subsequence.
The accepted Luganda parameterised witness βα΅ β h h β βα΅. The two
high tones are adjacent inside the middle, so no [h, β, h] subsequence
can be formed (no low position lies strictly between them).
Equations
- One or more equations did not get rendered due to their size.
Instances For
The rejected Luganda parameterised witness βα΅ β h β h β βα΅. The
explicit middle [β, h, β, h, β] contains [h, β, h] as a
subsequence β exactly the structural pattern lugandaPred forbids.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Luganda high-tone plateauing β BTLI (Lambert 2026 Β§5.1).
Parametrised-word witness: βα΅ β h h β βα΅ is accepted while
βα΅ β h β h β βα΅ is rejected, but the two share all length-k
tier-affixes on every tier.
Lambert 2026 Β§5.2 (@cite{ding-2006}): Prinmi pitch-accent lexically selects a high-tone position within a domain (morpheme or span of adjacent morphemes); high may spread progressively to the next syllable. The resulting pattern enforces:
- Obligatoriness: at least one high tone (
h). - At most one high span: no
[h, β, h]subsequence (same as Luganda Β§5.1). - Span length β€ 2: no
[h, h, h]subsequence (new conjunct not present in Luganda).
Lambert: "the same words witness this nonmembership as for high-tone
plateauing: βα΅ h h βα΅ is valid but βα΅ h β h βα΅ is not, despite
the two having the same k-affixes on every tier." So we reuse
lugandaAccepted / lugandaRejected as witnesses and luganda_tierAffixes
as the indistinguishability proof β the substrate Sandwich's
not_sublist_sandwich discharges both the [h, β, h] and the new
[h, h, h] non-subsequence claims uniformly.
Alphabet: LugandaTone reused per Lambert's unified β/h notation
across Β§5.
Disclaimer (Lambert Β§5.2): @cite{ding-2006} assumes maximally quadrisyllabic domains with significant compounding; that finite-domain restriction would make Prinmi co/finite (a stronger classification). The PT_3-and-not-BTLI result formalised here applies to the unbounded analysis Lambert presents; the bounded analysis is out of scope.
The Prinmi pitch-accent predicate @cite{lambert-2026} (39):
[h] <+ wβ at least one high tone (obligatoriness).Β¬ [h, β, h] <+ wβ at most one high span.Β¬ [h, h, h] <+ wβ high span length β€ 2 syllables.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Prinmi pitch-accent language.
Equations
Instances For
Membership in prinmiLang is membership in prinmiPred.
Prinmi pitch-accent β PT_3 (Lambert 2026 (39)). All three
conjuncts depend only on length-β€-3 subsequence presence: the length-1
[h] and the two length-3 patterns.
Prinmi pitch-accent β BTLI (Lambert 2026 Β§5.2). Same witnesses and tier-affix proof as Luganda Β§5.1 β Lambert: "The same words witness this nonmembership as for high-tone plateauing."
Lambert 2026 Β§5.3 (@cite{donohue-1997}): Arigibi (Trans-New Guinea)
allows at most one mora with high tone (position lexically specified;
words with no high tone are allowed). The phonotactic constraint is
Β¬h..h β no [high, high] subsequence anywhere.
Lambert: "exactly analogous to culminativity in isolation, and as such it is piecewise testable and tier-based co/finite, as demonstrated in Β§2.3" (Lambert's Β§2.3 is unbounded culminativity).
Boolean tier predicate selecting LugandaTone.high. Shared by Β§8
Arigibi and Β§9 Chuave.
Equations
Instances For
The Arigibi pitch-accent language: at most one high mora.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Membership in arigibiLang is Β¬ [h, h] <+ w.
Arigibi pitch-accent β PT_2 (Lambert 2026 Β§5.3, formula Β¬h..h).
The constraint depends only on length-2 subseq [h, h].
Arigibi pitch-accent β BTN (Lambert 2026 Β§5.3). On the high
tier, the projection has length β€ 1, i.e., is either [] or [high] β
a finite set, hence co/finite.
Lambert 2026 Β§5.5 (@cite{donohue-1997}): Chuave (Trans-New Guinea)
exhibits obligatoriness β every word must contain at least one
high-tone mora. There is no restriction on placement; multiple high
spans are allowed. The phonotactic constraint is the simplest possible:
the formula h (at least one high). This is both:
- PT_1: the constraint depends only on the length-1 subsequence
[h]. - BTN (multitier finite-or-cofinite): on the high tier, the
projection must be non-empty. The non-empty list set is co/finite
(its complement is the singleton
{[]}).
The Chuave obligatoriness language: at least one mora has high tone.
Equations
Instances For
Membership in chuaveLang is high β w.
Chuave obligatoriness β PT_1 (Lambert 2026 Β§5.5). The constraint
high β w is the singleton subseq presence [high] <+ w.
Chuave obligatoriness β BTN (Lambert 2026 Β§5.5, formula
Β¬ [ββ]_{h}). On the high tier isLugHigh, the projection
w.filter isLugHigh is non-empty iff high β w. The non-empty set
{xs | xs β []} is co/finite (complement is {[]}).
Lambert 2026 Β§5.4 (@cite{kawahara-2015}, @cite{haraguchi-1977}): Kagoshima Japanese has a pitch-accent system with exactly one high tone per word, appearing on the final or penultimate mora.
Lambert formula (42), order-based PT_3:
hβ at least one high tone (obligatoriness)Β¬h..hβ no two highs (culminativity)Β¬h..β..ββ high doesn't have two lows after (forces final/penult position)
Lambert formula (43), tier-based multitier definite:
[βhβ]_{h}β high tier projection equals exactly[h](hββ β¨ hβ)β word ends with[h, β]or[h]
The PT_3 result is direct from subseqSet_eq_iff. The multitier
characterization uses tierEqualLang isLugHigh [.high] (high tier =
singleton [h]) intersected with the disjunction of two endsWithLang
cases. The Β§4 helpers (startsWithLang, endsWithLang,
tierEqualLang) were generalized to Ξ± : Type* for this section so
they apply at type LugandaTone without duplication.
We prove kagoshima_multitier_isBTLI (multitier generalized definite, k = 2).
Lambert states the stronger BTDβ classification (multitier definite β
right-edge-only); BTD substrate for endsWithLang is queued for
follow-up. The order/tier formulation equivalence (formulas 42 β 43)
is stated as a TODO theorem kagoshima_lang_eq_mt.
Alphabet: LugandaTone reused per Lambert's unified β/h notation
across Β§5.
The Kagoshima Japanese pitch-accent predicate @cite{lambert-2026}
(42): at least one high, no two highs, no [h, β, β] subseq.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The Kagoshima Japanese pitch-accent language.
Equations
Instances For
Membership in kagoshimaLang is membership in kagoshimaPred.
Kagoshima Japanese pitch-accent β PT_3 (Lambert 2026 (42)).
All three conjuncts depend only on length-β€-3 subseq presence:
length-1 [h], length-2 [h, h], length-3 [h, β, β].
The Kagoshima multitier-encoded language: high-tier-projection
equals exactly [h] AND word ends with [h, β] or [h]. Lambert
2026 (43).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Kagoshima multitier characterization β BTLIβ (Lambert 2026
(43)). Each component lifts to IsBTLI 2:
tierEqualLang isLugHigh [.high]viatierEqualLang_isTierBased(k = 2 > 1 = |xs|);endsWithLang [.high, .low]viaendsWithLang_isGenDef(k = 2 β₯ 2);endsWithLang [.high]viaendsWithLang_isGenDef(k = 2 β₯ 1). The conjunction/disjunction are closed byBoolClosure.inter/BoolClosure.union. Lambert's stronger BTDβ classification is queued for follow-up.
Equivalence (formula 42 β 43) is queued for follow-up. Lambert
states the order-based predicate kagoshimaLang and the multitier
predicate kagoshimaMTLang describe the same language, but the proof
requires structural reasoning about how "no [h, β, β] subseq"
combined with "exactly one high" forces the unique high to lie in the
final two positions. Both formulations are independently classified
above (PT_3 and BTLI_2 respectively).
Audit-flagged cross-framework engagement points. These are stated here so the disagreements (and convergences) with sibling linglib formalizations are visible rather than buried.
Each is tagged with the auditor file path it cross-references. The proofs
are deferred β typically to a follow-up substrate PR (e.g. OTBound.lean)
or to the chronologically-later sibling study.
Lambert 2026 multitier classification structurally weaker than SS-MTSL (de Santo & Graf 2019) on Uyghur: every BTD language admits an SS-MTSL acceptor, but BTD is strictly more powerful as a phonotactic classifier β Uyghur backness harmony is BTD but @cite{mayer-major-2018} shows it is not SS-MTSL. (Stated as TODO; activates when SS-MTSL substrate lands in linglib.)
String-input information loss for autosegmental tone analyses: two
distinct autosegmental representations (e.g. multiply-linked vs. singly-
linked-with-spread on a tone tier) can have the same surface string. The
BTC classification of a string language therefore is not a faithful
summary of an autosegmental analysis of the same surface tone pattern.
Cross-references Theories/Phonology/Autosegmental/RegisterTier.lean,
GrammaticalTone.lean, and Phenomena/Tone/Studies/Lionnet2025.lean.