Containing (and avoiding) a fixed factor is star-free #
For any list c over an arbitrary alphabet α, the language of words that contain c as a
contiguous factor ({x | c <:+: x}) is star-free, as is its complement (the words that avoid
c). This is the workhorse fact behind forbidden-substring constraints: each forbidden factor
cuts out a star-free set, and SF is closed under the finite Boolean operations.
Over a finite alphabet c avoidance is plainly strictly local (SL_{|c|} with the single
forbidden factor c), hence star-free by Language.IsStrictlyLocal.isStarFree. The substance
here is removing the finiteness hypothesis: an arbitrary α is collapsed onto the finite
alphabet Fin |c| → Bool by the diagonal encoding enc a i = (a = c[i]), which records, for
each position of c, whether a matches it. This loses almost all information about α yet is
faithful enough to detect occurrences of c: c <:+: x ↔ (c.map enc) <:+: (x.map enc)
(containsFactor_iff_map), because at a matched window the diagonal forces x[j] = c[i]. The
finite-alphabet result then transfers back by inverse homomorphism (Language.IsStarFree.comap).
Star-free = FO[<]-definable = counter-free ([Sch65] [McNP71]).
Main results #
Language.isStarFree_containsFactor—{x | c <:+: x}is star-free,αarbitrary.Language.isStarFree_avoidsFactor—{x | ¬ c <:+: x}is star-free,αarbitrary.
Generic list lemmas (no alphabet assumptions) #
Containment over a finite alphabet #
The diagonal encoding onto a finite alphabet #
Containing a fixed factor is star-free over an arbitrary alphabet ([Sch65]
[McNP71]). For any c, the words containing c as a contiguous factor form a
star-free language. Over a finite alphabet this is strictly local; the diagonal encoding
enc c a i = (a = c[i]) collapses an arbitrary alphabet onto the finite Fin |c| → Bool while
preserving occurrences of c, and SF is closed under inverse homomorphism.
Avoiding a fixed factor is star-free over an arbitrary alphabet ([Sch65]
[McNP71]) — the complement of isStarFree_containsFactor.