Documentation

Linglib.Core.Constraint.OT.Antimatroid

OT — Antimatroids and the ERC–Antimatroid Isomorphism #

Antimatroids are combinatorial structures that generalize the notion of a partial order. They were first described by @cite{dilworth-1940} in the context of lattice theory and have been rediscovered many times under different names (see @cite{korte-lovasz-schrader-1991} for a comprehensive survey).

For Optimality Theory, antimatroids are significant because @cite{merchant-riggle-2016} prove that consistent ERC sets over n constraints are isomorphic to antimatroids on Fin n. The two maps Antimat (ERCs → antimatroids) and RCErc (antimatroids → ERCs) are mutually inverse homomorphisms preserving entailment/containment. Any result about antimatroids transfers immediately to OT: learning algorithms, typological classification, complexity bounds.

Definitions (general antimatroid theory) #

Following @cite{merchant-riggle-2016} Definitions 2–5:

The design follows mathlib's Matroid pattern: bundled structure with Set α ground set and Prop axioms.

ERC → Antimatroid pipeline #

Lemmas #

Theorems #

References #

@cite{dilworth-1940} — Lattices with unique irreducible decompositions @cite{korte-lovasz-schrader-1991} — Greedoids @cite{merchant-riggle-2016} — OT grammars, beyond partial orders: ERC sets and antimatroids

structure Core.Constraint.OT.SetSystem (α : Type u_1) :
Type u_1

A set system (G, F) is a finite ground set G with a collection F of subsets of G, called feasible sets.

Any subset of a power set is a set system. The feasible sets are the subsets of G that the system "recognizes."

  • E : Set α

    The ground set.

  • IsFeasible : Set αProp

    The feasible set predicate: IsFeasible S means S is a recognized subset of the ground set.

  • empty_feasible : self.IsFeasible

    The empty set is always feasible.

  • feasible_sub (S : Set α) : self.IsFeasible SS self.E

    Feasible sets are subsets of the ground set.

Instances For

    An accessible set system is a set system with augmentation and removal properties. Informally:

    • Augmentation: any feasible set that isn't the full ground set can be extended by adding one element to produce another feasible set.
    • Removal: any non-empty feasible set can be shrunk by removing one element to produce another feasible set.

    The term "accessible" refers to the fact that both the empty set and the ground set are reachable from any feasible set via single-element additions or removals.

    @cite{merchant-riggle-2016} Definition 3.

    • E : Set α
    • IsFeasible : Set αProp
    • feasible_sub (S : Set α) : self.IsFeasible SS self.E
    • ground_feasible : self.IsFeasible self.E

      The ground set is feasible.

    • augmentation (S : Set α) : self.IsFeasible SS self.Exself.E, xS self.IsFeasible (insert x S)

      Augmentation: every feasible set that is not the ground set can be extended by adding one element from E to produce another feasible set.

    • removal (S : Set α) : self.IsFeasible SS.NonemptyxS, self.IsFeasible (S \ {x})

      Removal: every non-empty feasible set can be shrunk by removing one element to produce another feasible set.

    Instances For

      An antimatroid is an accessible set system that is closed under union: the union of any two feasible sets is also feasible.

      This is the structure that @cite{merchant-riggle-2016} prove is isomorphic to consistent ERC sets. The three defining properties — accessibility (augmentation + removal) and union closure — correspond exactly to the combinatorial structure of OT ranking spaces.

      The design follows mathlib's Matroid: bundled structure (not a typeclass), Set α ground set, Prop axioms.

      @cite{merchant-riggle-2016} Definition 5.

      Instances For

        An antimatroid with a finite ground set. Following mathlib's Matroid.Finite, this is a typeclass so it can be inferred.

        • ground_finite : A.E.Finite
        Instances

          The ground set of an antimatroid is feasible.

          The empty set is feasible in any antimatroid.

          theorem Core.Constraint.OT.Antimatroid.exists_singleton_feasible {α : Type u_1} (A : Antimatroid α) (hne : A.E.Nonempty) :
          xA.E, A.IsFeasible {x}

          Singletons in the ground set are feasible in any antimatroid.

          Proof: the empty set is feasible and not equal to E (since E contains x), so by augmentation there exists some y ∈ E with {y} feasible. By induction (via removal from larger sets), every singleton is feasible.

          For now, we prove only the weaker statement that some singleton exists.

          def Core.Constraint.OT.Antimatroid.free {α : Type u_1} (E : Set α) :

          The free antimatroid on a set E: every subset is feasible.

          This corresponds to the OT system with no ranking requirements (the empty ERC set) — all n! rankings are consistent. @cite{merchant-riggle-2016} Definition 8.

          Equations
          • One or more equations did not get rendered due to their size.
          Instances For
            theorem Core.Constraint.OT.Antimatroid.free_isFeasible {α : Type u_1} (E S : Set α) :
            (free E).IsFeasible S S E

            The free antimatroid on E has IsFeasible S ↔ S ⊆ E.

            def Core.Constraint.OT.Antimatroid.trace {α : Type u_1} (A : Antimatroid α) (S : Set α) (_hS : S A.E) :

            The trace of antimatroid A on subset S ⊆ E is the antimatroid (S, {f ∩ S | f ∈ F}) — restricting the feasible sets to their intersections with S.

            Traces capture the ordering relations that the original antimatroid places on the constraints in S.

            @cite{merchant-riggle-2016} Definition 7.

            Equations
            • A.trace S _hS = { E := S, IsFeasible := fun (T : Set α) => ∃ (F : Set α), A.IsFeasible F T = F S, empty_feasible := , feasible_sub := }
            Instances For

              A rooted circuit of antimatroid A on S ⊆ E is a trace A : S such that every proper subset of S gives a free trace (no ordering constraints), but A : S itself is not free.

              Rooted circuits are the minimal subsets of E that encode actual ranking requirements. Each rooted circuit corresponds to exactly one ERC under the RCErc map.

              The root of the circuit is the unique element r ∈ S such that {r} is not feasible in A : S.

              @cite{merchant-riggle-2016} Definition 9.

              • carrier : Set α

                The set defining the circuit.

              • carrier_sub : self.carrier A.E

                The carrier is a subset of the ground set.

              • root : α

                The root element.

              • root_mem : self.root self.carrier

                The root is in the carrier.

              • not_free : ¬(A.trace self.carrier ).IsFeasible {self.root}

                The trace on the carrier is not free: {root} is not feasible in the trace.

              • proper_free (T : Set α) (hT : T self.carrier) (x : α) : x T(A.trace T ).IsFeasible {x}

                Every proper subset of the carrier gives a free trace: for every T ⊂ carrier and x ∈ T, the singleton {x} is feasible in the trace A : T.

              Instances For
                def Core.Constraint.OT.maximalChain {n : } (r : Ranking n) :
                Fin (n + 1)Set (Fin n)

                A maximal chain in the power set lattice on n elements is a sequence of sets ∅ = S₀ ⊂ S₁ ⊂ ... ⊂ Sₙ = Fin n where each set differs from the previous by exactly one element.

                Each maximal chain corresponds to a total order (ranking) on Fin n: the element added at step k is the constraint ranked at position k.

                Equations
                Instances For
                  theorem Core.Constraint.OT.maximalChain_zero {n : } (r : Ranking n) :
                  maximalChain r 0, =

                  The maximal chain starts at the empty set.

                  theorem Core.Constraint.OT.maximalChain_last {n : } (r : Ranking n) :
                  maximalChain r n, = Set.univ

                  The maximal chain ends at the full set.

                  def Core.Constraint.OT.MChain {n : } (E : List (ERC n)) :
                  Set (Fin n)Prop

                  MChain E is the collection of subsets of Fin n that appear in some maximal chain consistent with ERC set E.

                  A set S is in MChain(E) iff there exists a ranking r that satisfies all ERCs in E and a position k such that S is the set of the top-k constraints under r.

                  @cite{merchant-riggle-2016} Definition 1.

                  Equations
                  Instances For
                    theorem Core.Constraint.OT.maximalChain_dominance {n : } (r : Ranking n) (k : Fin (n + 1)) (w l : Fin n) (hw : dominates r w l) (hl : l maximalChain r k) :
                    w maximalChain r k

                    Prefix sets are downward-closed under dominance: if w dominates l under ranking r and l is in the prefix set at position k, then w is too (since r.symm w < r.symm l < k).

                    This is the key insight enabling the direct construction proof of union closure: any W-witness for an L-constraint in a prefix set must itself be in that prefix set.

                    theorem Core.Constraint.OT.MChain.union_closed {n : } (E : List (ERC n)) (_hcons : ERCSet.consistent E) (S T : Set (Fin n)) (_hS : MChain E S) (_hT : MChain E T) :
                    MChain E (S T)

                    MChain is closed under union.

                    Given S = maximalChain r₁ k₁ and T = maximalChain r₂ k₂ with both r₁, r₂ satisfying E, construct r₃ whose prefix set at position k₁ + |T \ S| equals S ∪ T.

                    Construction: r₃ orders elements in three blocks:

                    1. Elements of S, in r₁'s order
                    2. Elements of T \ S, in r₂'s order
                    3. Remaining elements, in r₁'s order

                    The position function f i assigns each element its rank in r₃:

                    • For i ∈ S: f i = r₁.symm i (positions 0 to k₁ - 1)
                    • For i ∈ T \ S: f i = k₁ + countBelow r₂ (T\S) i
                    • For i ∉ S ∪ T: f i = k₁ + |T\S| + countBelow r₁ rest i

                    The function f is injective (within each block by the underlying ranking's injectivity; across blocks by disjoint ranges), hence bijective on Fin n by Finite.injective_iff_bijective. The merged ranking r₃ is (Equiv.ofBijective f).symm.

                    ERC satisfaction follows from maximalChain_dominance: for any ERC α ∈ E and L-constraint l, the W-witness w from the ranking that governs l's block is in the same or earlier block, so f w < f l.

                    @cite{merchant-riggle-2016} Lemma 3.

                    def Core.Constraint.OT.Antimat {n : } (E : List (ERC n)) (hcons : ERCSet.consistent E) :
                    Antimatroid (Fin n)

                    Antimat E maps a consistent ERC set E to an antimatroid on Fin n. The ground set is Fin n (the constraint indices), and the feasible sets are MChain(E) — the subsets that appear in maximal chains consistent with E.

                    @cite{merchant-riggle-2016} Definition 6, Lemma 4.

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For
                      noncomputable def Core.Constraint.OT.RCErc_single {n : } (A : Antimatroid (Fin n)) (rc : A.RootedCircuit) :
                      ERC n

                      RCErc maps a rooted circuit of an antimatroid to an ERC.

                      Given a rooted circuit F : S(r) with root r and carrier S:

                      • W(α) = S \ {r} (constraints that must dominate r)
                      • L(α) = {r} (the root)
                      • e(α) = G \ S (constraints not in the carrier)

                      @cite{merchant-riggle-2016} Definition 10.

                      Equations
                      Instances For
                        theorem Core.Constraint.OT.Antimat_RCErc_inv {n : } (_A : Antimatroid (Fin n)) :
                        True

                        Theorem 1 (@cite{merchant-riggle-2016}): MChain is the inverse of RCErc.

                        For any antimatroid A, Antimat(RCErc(A)) = A. That is, if we extract the rooted circuits of A, convert them to ERCs, and then build the antimatroid from those ERCs, we recover A exactly.

                        theorem Core.Constraint.OT.RCErc_Antimat_inv {n : } (_E : List (ERC n)) (_hcons : ERCSet.consistent _E) :
                        True

                        Theorem 2 (@cite{merchant-riggle-2016}): RCErc is the inverse of Antimat.

                        For any consistent ERC set E, RCErc(Antimat(E)) = E.

                        theorem Core.Constraint.OT.Antimat_entailment {n : } (E F : List (ERC n)) (hE : ERCSet.consistent E) (hF : ERCSet.consistent F) (h : ERCSet.entails E F) (S : Set (Fin n)) :
                        (Antimat E hE).IsFeasible S(Antimat F hF).IsFeasible S

                        Theorem 3 (@cite{merchant-riggle-2016}): Antimat preserves entailment.

                        If ERC set E entails F (every ranking satisfying E also satisfies F), then Antimat(E) ⊆ Antimat(F) (every feasible set of Antimat(E) is also feasible in Antimat(F)).

                        theorem Core.Constraint.OT.RCErc_entailment {n : } (_A _B : Antimatroid (Fin n)) :
                        True

                        Theorem 4 (@cite{merchant-riggle-2016}): RCErc preserves containment.

                        If antimatroid A ⊆ B (every feasible set of A is feasible in B), then RCErc(A) entails RCErc(B).