Piecewise Testable Languages (PT_k) #
A language L is piecewise k-testable when membership depends only
on which length-k subsequences the input contains @cite{simon-1975}
@cite{rogers-pullum-2011} @cite{lambert-2022}. PT is to SP what LT is to
SL: the Boolean closure of the strictly variant.
Both are property-based (extensional) classifications — there is no
canonical grammar, only an indistinguishability relation on strings:
w₁ ~_PT w₂ iff their k-subsequence sets coincide. L is PT_k iff it
is closed under ~_PT.
Lambert (2026) and the multitier extension #
The multitier closure of PT (which Lambert classifies as 𝒥) covers a
large fraction of attested phonotactic constraints — including the
piecewise-testable analyses of unbounded stress patterns surveyed in
@cite{lambert-2026} §3 and the Karanga Shona tone analysis (which is
piecewise testable but not multitier generalised definite). The
substrate sits adjacent to Multitier.lean and is composed via
IsBTC IsPiecewiseTestable.
Main definitions #
subseqSet k w— the set of length-ksubsequences ofw.IsPiecewiseTestable k L— closure ofLunder equality ofsubseqSet.
The cast IsStrictlyPiecewise k L → IsPiecewiseTestable k L lives at
the end of the file (mathlib convention: cast lives with the larger class).
The set of subsequences of w of length at most k. The PT
indistinguishability relation w₁ ~_PT w₂ is equality of
subseqSet k w₁ and subseqSet k w₂.
The "up to k" semantics matches @cite{simon-1975} and the Heinz/Lambert
literature (@cite{lambert-2026} Def 11; @cite{heinz-2018}). The "exactly
k" alternative would mis-classify languages that distinguish strings
of length less than k (e.g. Luganda high-tone plateauing
@cite{lambert-2026} (37) distinguishes [h] from [ℓ] while their
length-3 subsequence sets are both empty). Unlike factorSet, no
boundary augmentation: SP/PT are insensitive to position.
Equations
- Core.Computability.Subregular.subseqSet k w = {s : List α | s.Sublist w ∧ s.length ≤ k}
Instances For
A language is piecewise k-testable iff strings with the same set
of length-≤-k subsequences are either both in L or both out.
Equations
- Core.Computability.Subregular.IsPiecewiseTestable k L = ∀ (w₁ w₂ : List α), Core.Computability.Subregular.subseqSet k w₁ = Core.Computability.Subregular.subseqSet k w₂ → (w₁ ∈ L ↔ w₂ ∈ L)
Instances For
Subsequences of length at most k are interchangeable across strings
sharing their subseqSet k. The standard ratchet for proving
IsPiecewiseTestable k L from a predicate that depends only on
length-≤-k subsequence presence.
SP_k ⊆ PT_k: every strictly-k-piecewise language is piecewise
k-testable. The SP test ("every length-k subsequence is permitted")
trivially depends only on the set of subsequences of that length.