Forbidden-Pair TSL_2 Grammars #
A reusable schema @cite{heinz-rawal-tanner-2011} for tier-based strictly
2-local languages defined by a forbidden-pair relation
R : α → α → Prop. The TSL_2 grammar TSLGrammar.ofForbiddenPairs R p
rules out any string whose tier projection contains an adjacent pair
(a, b) with R a b.
The canonical instance is R := (· = ·): no two adjacent identicals on
the tier projection. This is the OCP @cite{mccarthy-1986}
(Phonology.Subregular.OCP). The schema is general enough to capture
OCP-Feature variants (R := share-feature-X) and any single-tier
2-factor markedness constraint over a fixed subset projection.
Long-distance harmony patterns over a single segmental tier instantiate
other choices of R (e.g. R := disagree-on-feature-X).
Note that stress-rhythm constraints (*Lapse, *Clash) and the syllable
phonotactic *Coda ban are not TSL_2 instances over the segmental
alphabet: *Lapse and *Clash are SL_2 over a syllable-level alphabet
(carrying stress weight), and *Coda is SL_1 over an alphabet that
encodes syllable structure (e.g. CV templates with marked coda symbols).
The OT-side helper for the SL_1 case is mkForbidSingletonOnTier in
Theories/Phonology/OptimalityTheory/Constraints.lean; the language-side
SL_1 substrate is in Subregular/StrictlyLocal.lean (the k = 1 case).
The single bridge theorem mem_ofForbiddenPairs_lang_iff_filter_isChain
characterizes language membership as an IsChain check on the projected
string. This is the chain-side payoff of the boundary-vacuity machinery
in Subregular/Defs.lean. Phonological-constraint bridges
(mkForbidPairsOnTier_zero_iff_in_lang) layer on top in
Theories/Phonology/Subregular/.
The set of forbidden 2-factors induced by a binary relation R: pairs
[some a, some b] with R a b. Boundary-adjacent factors are not
forbidden — the chain relation CleanPair R is boundary-vacuous.
Equations
- Core.Computability.Subregular.forbiddenPairsSet R = {f : Core.Computability.Subregular.Augmented α | ∃ (a : α) (b : α), R a b ∧ f = [some a, some b]}
Instances For
Chain relation associated with a forbidden-pair relation R: two
augmented symbols are clean as a pair iff they are not both some
related by R. Boundary symbols are vacuously clean on either side
(CleanPair.isBoundaryVacuous).
Equations
- Core.Computability.Subregular.CleanPair R (some a) (some b) = ¬R a b
- Core.Computability.Subregular.CleanPair R x✝¹ x✝ = True
Instances For
The clean-pair relation is boundary-vacuous: none always satisfies it
on either side. Used to lift chain-membership of the boundary-padded
augmented string to the unpadded core.
A list of Option α has no [some a, some b] 2-factor with R a b
iff it is a chain for CleanPair R. The factor-membership/chain bridge
underlying mem_forbidPairs_lang_iff_filter_isChain.
The TSL_2 grammar capturing "no tier-adjacent pair satisfies R": tier
projection (via Tier.byClass p) followed by an SL_2 ban on
forbiddenPairsSet R. OCP and any single-tier adjacency-based markedness
constraint factor through this constructor.
Equations
- Core.Computability.Subregular.TSLGrammar.ofForbiddenPairs R p = { tier := p, decTier := inst✝, permitted := (Core.Computability.Subregular.forbiddenPairsSet R)ᶜ }
Instances For
Bridge (TSL_2 language form): membership in
TSLGrammar.ofForbiddenPairs R p reduces to an IsChain (¬ R · ·) check on
the tier-projected string. The single generic membership characterization
that all forbidden-pair markedness constraints inherit. Composes (i) the
factor/chain bridge forall_kFactors_two_not_forbiddenPairsSet_iff_isChain,
(ii) the boundary-vacuity of CleanPair R, and (iii) List.isChain_map to
drop the some.
Adjacent-pair counting #
Promoted from Theories/Phonology/OptimalityTheory/Constraints.lean:
countAdjacent R xs counts the adjacent (a, b) pairs in xs with
R a b. Alphabet-generic; nothing OT-specific. The OT-side
mkForbidPairsOnTier constructor consumes this directly, and the
chain-bridge countAdjacent_eq_zero_iff_isChain connects it to the
language-level IsChain characterization above.
Count adjacent pairs (a, b) in a list satisfying a binary relation
R. The shared engine behind any "forbidden adjacent pair" markedness
or violation count: OCP (R := (· = ·)), agreement (R := (· ≠ ·)),
and asymmetric harmony all pass through.
Equations
- Core.Computability.Subregular.countAdjacent R [] = 0
- Core.Computability.Subregular.countAdjacent R [head] = 0
- Core.Computability.Subregular.countAdjacent R (a :: b :: rest) = (if R a b then 1 else 0) + Core.Computability.Subregular.countAdjacent R (b :: rest)
Instances For
countAdjacent R xs = 0 iff no two consecutive elements of xs are
related by R. The chain-form characterisation of zero-violation
forbidden-pair candidates.
API around ofForbiddenPairs #
Membership in (ofForbiddenPairs R p).lang is decidable, derived from
the IsChain characterization.
Equations
- Core.Computability.Subregular.decidableMemOfForbiddenPairsLang R p w = decidable_of_iff (List.IsChain (fun (a b : α) => ¬R a b) (List.filter (fun (x : α) => decide (p x)) w)) ⋯
The empty string is in every forbidden-pair language: there are no tier-adjacent pairs to check.
Antitonicity in R: forbidding more pairs (R ≤ R') shrinks the
language. The chain witness for the larger R' immediately yields one for
the smaller R since ¬ R' a b → ¬ R a b.
Boundary case (R = ⊥): if no pair is forbidden, the language is
universal.
Boundary case (p = ⊥): if no symbol is on the tier, the projection is
empty and the language is universal.
Boundary case (p = ⊤): with no tier filtering the language reduces
to the SL_2 case — no two adjacent symbols of the raw string may be
R-related. Not marked @[simp]: the RHS is not a simpler normal form
than the LHS (compare lang_R_bot/lang_p_bot which simplify to
Set.univ).
Same-tier composition (membership form): forbidding the disjunction of
two relations on a single tier coincides with conjunctive membership in the
two corresponding languages. The cross-tier case (two relations on different
tiers p₁ ≠ p₂) does NOT factor through this constructor — that is precisely
the empirical and formal motivation for multi-tier strictly local languages
(MTSL).
Same-tier composition (lattice form): forbidding the disjunction of two relations on a single tier equals the meet of the two corresponding languages. The cross-tier case is not expressible — see MTSL.
Design boundary (mathematical) #
The language (ofForbiddenPairs R p).lang is neither monotone nor
antitone in the tier predicate p. Both failure directions are
witnessed by the theorems lang_p_not_monotone and lang_p_not_antitone
below. This is the formal counterpart of the linguistic intuition that
which segments project to the tier is not just a monotone refinement
but a substantive theoretical commitment, and it is why TSL_2 over
different tiers gives genuinely incomparable theories.
Equations
- One or more equations did not get rendered due to their size.
Non-monotonicity of (ofForbiddenPairs R p).lang in the tier
predicate p (forward direction): there exist p ≤ p' and a string xs
such that xs ∈ lang p but xs ∉ lang p'. Witness: R := (· = ·),
p := {s}, p' := {s, t}, xs := [s, t, t]; filter under p gives
[s] (chain), but under p' gives [s, t, t] (the adjacent t, t
violates the OCP).
Non-antitonicity of (ofForbiddenPairs R p).lang in the tier
predicate p (reverse direction): there exist p ≤ p' and a string xs
such that xs ∉ lang p but xs ∈ lang p'. Witness: R := (· = ·),
p := {s}, p' := {s, t}, xs := [s, t, s]; filter under p gives
[s, s] (the adjacent s, s violates the OCP), but under p' gives
[s, t, s] (chain — interleaving t separates the ss).
Design boundary (empirical) #
The substantive empirical force of the non-monotonicity result is the
gradient similarity-based OCP of @cite{frisch-pierrehumbert-broe-2004}:
Arabic co-occurrence restrictions decay with featural distance rather
than following a clean tier cut, so the same R paired with different
p yields incomparable predictions about which root pairs are licit.
See Phenomena/Phonology/Studies/FrischPierrehumbertBroe2004.lean for
the load-bearing instance: the natural-classes similarity metric
(FPB eq. 7), the worked examples /f, m/ → 2/9 and /b, f/ → 3/8, and
the categorical_fails_three_test_points divergence theorem witnessing
that no TSL_2 grammar with R := λ a b => similarity la b ≥ t can
reproduce three specific Table IV bins.
Dependencies on multiple tiers — e.g. simultaneous consonantal and vocalic harmony — likewise fall outside this single-tier constructor. The natural way to combine constraints across tiers is the multi-tier strictly local (MTSL) family, which is not a special case of this constructor.