Documentation

Linglib.Core.Computability.Subregular.PiecewiseTestable

Piecewise Testable Languages (PT_k) #

A language L is piecewise k-testable when membership depends only on which length-k subsequences the input contains @cite{simon-1975} @cite{rogers-pullum-2011} @cite{lambert-2022}. PT is to SP what LT is to SL: the Boolean closure of the strictly variant.

Both are property-based (extensional) classifications — there is no canonical grammar, only an indistinguishability relation on strings: w₁ ~_PT w₂ iff their k-subsequence sets coincide. L is PT_k iff it is closed under ~_PT.

Lambert (2026) and the multitier extension #

The multitier closure of PT (which Lambert classifies as 𝒥) covers a large fraction of attested phonotactic constraints — including the piecewise-testable analyses of unbounded stress patterns surveyed in @cite{lambert-2026} §3 and the Karanga Shona tone analysis (which is piecewise testable but not multitier generalised definite). The substrate sits adjacent to Multitier.lean and is composed via IsBTC IsPiecewiseTestable.

Main definitions #

The cast IsStrictlyPiecewise k L → IsPiecewiseTestable k L lives at the end of the file (mathlib convention: cast lives with the larger class).

def Core.Computability.Subregular.subseqSet {α : Type u_1} (k : ) (w : List α) :
Set (List α)

The set of subsequences of w of length at most k. The PT indistinguishability relation w₁ ~_PT w₂ is equality of subseqSet k w₁ and subseqSet k w₂.

The "up to k" semantics matches @cite{simon-1975} and the Heinz/Lambert literature (@cite{lambert-2026} Def 11; @cite{heinz-2018}). The "exactly k" alternative would mis-classify languages that distinguish strings of length less than k (e.g. Luganda high-tone plateauing @cite{lambert-2026} (37) distinguishes [h] from [ℓ] while their length-3 subsequence sets are both empty). Unlike factorSet, no boundary augmentation: SP/PT are insensitive to position.

Equations
Instances For
    @[simp]
    theorem Core.Computability.Subregular.mem_subseqSet {α : Type u_1} {k : } {s w : List α} :
    s subseqSet k w s.Sublist w s.length k
    def Core.Computability.Subregular.IsPiecewiseTestable {α : Type u_1} (k : ) (L : Language α) :

    A language is piecewise k-testable iff strings with the same set of length-≤-k subsequences are either both in L or both out.

    Equations
    Instances For
      theorem Core.Computability.Subregular.subseqSet_eq_iff {α : Type u_1} {k : } {w₁ w₂ : List α} (heq : subseqSet k w₁ = subseqSet k w₂) {s : List α} (hlen : s.length k) :
      s.Sublist w₁ s.Sublist w₂

      Subsequences of length at most k are interchangeable across strings sharing their subseqSet k. The standard ratchet for proving IsPiecewiseTestable k L from a predicate that depends only on length-≤-k subsequence presence.

      SP_k ⊆ PT_k: every strictly-k-piecewise language is piecewise k-testable. The SP test ("every length-k subsequence is permitted") trivially depends only on the set of subsequences of that length.