Documentation

Linglib.Theories.Syntax.Minimalist.Basic

Syntactic Objects and Containment #

@cite{chomsky-2013} @cite{marcolli-chomsky-berwick-2025}

Foundation module for the Minimalist Program formalization.

Syntactic Objects #

SyntacticObject is FreeMagma LIToken: bare phrase structure as the free magma over lexical-item tokens, following @cite{marcolli-chomsky-berwick-2025}. The two constructors are FreeMagma.of (lexical leaves) and FreeMagma.mul (binary Merge). The shims SyntacticObject.leaf / SyntacticObject.node rename them at the linguistic interface.

The Y-model branches by map, not by type: PF lives natively on SyntacticObject via linearize/phonYield; the LF lift to Tree Unit String lives in SpellOut.lean (SyntacticObject.toLFTree), which handles trace detection and phonForm extraction.

Containment Relations #

inductive Minimalist.Cat :

Categorial features (Definition 10).

Enumerates the head categories of the Minimalist Program clausal spine and nominal extended projection: cartographic left periphery, inflectional spine, v/Voice/Appl event-structure layer, nominal categorizer-and-quantity sequence, and adpositional Place/Path articulation.

Instances For
    def Minimalist.instReprCat.repr :
    CatStd.Format
    Equations
    Instances For
      @[implicit_reducible]
      Equations
      @[implicit_reducible]
      instance Minimalist.instDecidableEqCat :
      DecidableEq Cat
      Equations
      @[implicit_reducible]
      instance Minimalist.instInhabitedCat :
      Inhabited Cat
      Equations
      @[reducible, inline]

      Selectional stack consumed left-to-right

      Equations
      Instances For

        A simple LI is a ⟨CAT, SEL⟩ pair (Definition 10), optionally with a phonological form for linearization.

        Instances For
          def Minimalist.instReprSimpleLI.repr :
          SimpleLIStd.Format
          Equations
          • One or more equations did not get rendered due to their size.
          Instances For
            def Minimalist.instDecidableEqSimpleLI.decEq (x✝ x✝¹ : SimpleLI) :
            Decidable (x✝ = x✝¹)
            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              Lexical item (simple or complex from head movement, Definition 88)

              Instances For
                Equations
                • One or more equations did not get rendered due to their size.
                Instances For
                  @[implicit_reducible]
                  Equations
                  def Minimalist.LexicalItem.simple (cat : Cat) (sel : SelStack) (phonForm : String := "") :

                  Create a simple (non-complex) LI

                  Equations
                  Instances For

                    Get the outer (projecting) category of an LI

                    Equations
                    Instances For

                      Get the outer selectional requirements

                      Equations
                      Instances For

                        Is this LI complex (result of head-to-head movement)?

                        Equations
                        Instances For

                          Combine two LIs for head-to-head movement (target reprojects)

                          Equations
                          Instances For

                            LI token: specific instance of an LI in a derivation

                            Instances For
                              @[implicit_reducible]
                              Equations
                              def Minimalist.instReprLIToken.repr :
                              LITokenStd.Format
                              Equations
                              • One or more equations did not get rendered due to their size.
                              Instances For
                                @[implicit_reducible]
                                Equations
                                @[reducible, inline]

                                Syntactic object: nonplanar binary tree over LIToken ⊕ Nat, realized as FreeCommMagma (LIToken ⊕ Nat).

                                Per @cite{marcolli-chomsky-berwick-2025} Definition 1.1.1 (book p. 22), SO is the free, non-associative, commutative magma Magma_{na,c}(SO_0, M) with M(α,β) = {α,β} (unordered). Linguistically, this is the position that Merge produces unordered sets, with linearization (PF / LCA / head-directionality) as a separate Externalization step.

                                The three "constructors" at the SO interface are:

                                • .leaf tok — a lexical leaf, encoded as FreeCommMagma.of (.inl tok)
                                • .trace n — a trace marker, encoded as FreeCommMagma.of (.inr n)
                                • binary Merge, written l * r (commutative: l * r = r * l as a strict equality inside the quotient)

                                Trace handling: linglib's LIToken ⊕ Nat is a deliberate divergence from MCB #

                                MCB's SO_0 (book p. 22, Def 1.1.1) consists of lexical items and syntactic features only — not trace markers. Internal Merge in MCB is a workspace-level cut-and-extract operation: an accessible term is extracted, leaving one of three possible remainder forms (Defs 1.2.5–1.2.8, book p. 31–35):

                                • T/^c F_v (contraction) — extracted term becomes a "deeper copy" labelling the leaf left by edge-contraction; visible to semantics, cancelled at PF
                                • T/^d F_v (deletion) — edge-contracted, no trace marker carried
                                • T/^ρ F_v (admissible cut) — an unlabeled structural vertex remains as the trace; used by the combined process

                                These are three forms of the remainder tree after extraction, not three carrier choices for the SO type itself. In all three, the SO type's base alphabet is SO_0 = lexical items + syntactic features only.

                                Linglib makes a different choice. SyntacticObject := FreeCommMagma (LIToken ⊕ Nat) adds labeled, indexed trace markers to the base alphabet. This is structurally distinct from MCB's ^ρ form (where the trace is an unlabeled structural vertex in the remainder tree only). Linglib's encoding admits trace-bearing SOs at any stage of derivation, not just as remainders.

                                Why this divergence is here: chain-tracking ergonomics. The Nat index lets downstream code identify which mover produced a given trace, which is load-bearing for binding theory and reconstruction effects. MCB handles chain-identification at the workspace level (a workspace forest may have multiple connected components that are isomorphic to the same tree, and those isomorphism classes are the chain). Linglib's chain-tracking-via-index is expressively sufficient but inexpressively redundant w.r.t. MCB.

                                Phase 2+ migration target: replace LIToken ⊕ Nat with LIToken and move chain-identification to the workspace layer. All current .trace / .isTrace / mkTrace / Step.im operations become projection-side rather than substrate. See project memory note project_so_carrier_rho_projection.md.

                                The migration from the prior planar TraceTree LIToken Nat carrier landed at version 0.230.857 (Phase 0.5 substrate) + 0.230.858 (mathlib-canonical DecidableEq via CommEqv, no LinearOrder needed). See docs/nonplanar-migration-plan.md and Linglib/Scratch/PreLiePlanarCheck.lean (counterexample showing the §1.7 pre-Lie identity is FALSE on planar trees, the headline reason for nonplanarity).

                                For LF composition, see SyntacticObject.toLFTree.

                                Equations
                                Instances For
                                  @[reducible, inline]

                                  Leaf shim: SyntacticObject.leaf tok ≡ FreeCommMagma.of (.inl tok).

                                  Equations
                                  Instances For
                                    @[reducible, inline]

                                    Trace shim: SyntacticObject.trace nFreeCommMagma.of (.inr n).

                                    Equations
                                    Instances For
                                      @[reducible, inline]

                                      Binary-Merge shim: SyntacticObject.mul l r ≡ l * r. The construction side of binary Merge; * is commutative on SyntacticObject.

                                      Equations
                                      • l.mul r = l * r
                                      Instances For
                                        @[reducible, inline]

                                        Construction-only alias .node l r ≡ mul l r ≡ l * r. Pattern matching against .node does NOT work (the carrier is Quot _, so Lean cannot invert through it); use induction so with | mul l r ihl ihr instead. Retained as a construction shim to keep call sites readable during the TraceTree → FreeCommMagma migration.

                                        Equations
                                        Instances For
                                          theorem Minimalist.SyntacticObject.ind {motive : SyntacticObjectProp} (leaf : ∀ (tok : LIToken), motive (leaf tok)) (trace : ∀ (n : ), motive (trace n)) (mul : ∀ (l r : SyntacticObject), motive lmotive rmotive (l * r)) (so : SyntacticObject) :
                                          motive so

                                          The induction principle for SyntacticObject with linguistic case names. For Prop motives, Quot.ind propagates through the equivalence automatically — no swap-respect obligation needed. The mul case must cover both orderings (since l * r = r * l), which Quot.ind handles transparently for propositional motives.

                                          theorem Minimalist.SyntacticObject.ind₂ {motive : SyntacticObjectSyntacticObjectProp} (h : ∀ (a b : FreeMagma (LIToken )), motive (Quot.mk FreeMagma.CommRel a) (Quot.mk FreeMagma.CommRel b)) (s t : SyntacticObject) :
                                          motive s t

                                          Two-argument version of ind. Useful for theorems about pairs of SOs.

                                          Predicates #

                                          Each predicate is defined via FreeCommMagma.lift from a swap-respecting helper on the underlying FreeMagma. The _leaf / _trace / _mul simp lemmas unfold cleanly.

                                          @[simp]
                                          @[simp]
                                          theorem Minimalist.SyntacticObject.isLeaf_trace (n : ) :
                                          (trace n).isLeaf True
                                          @[implicit_reducible]
                                          Equations
                                          @[implicit_reducible]
                                          Equations
                                          @[reducible, inline]
                                          abbrev Minimalist.liftFM {β : Type u_1} (f : FreeMagma (LIToken )β) (h : ∀ (a b : FreeMagma (LIToken )), a.CommRel bf a = f b) :

                                          Cascade combinator: lift a FreeMagma (LIToken ⊕ Nat) → β aux to SyntacticObject → β, hiding the FreeMagma.CommRel machinery from Phenomena consumers. Same as FreeCommMagma.lift modulo the SO type ascription at the SO interface. The _respects hypothesis still has to be provided, but the type signature is consumer-friendly.

                                          Lives at Minimalist.liftFM (sibling of leafShape below), not inside the SyntacticObject namespace — liftFM f h takes no SO self-argument, so dot notation wouldn't apply.

                                          Equations
                                          Instances For

                                            Merge: the fundamental structure-building operation. Equal to FreeCommMagma.mul and to (· * ·). Commutative: merge x y = merge y x as a strict equality on the quotient.

                                            Equations
                                            Instances For
                                              @[simp]

                                              Headline of the MCB Phase 1.0 nonplanar migration: Merge is symmetric on the FreeCommMagma carrier. merge x y = merge y x as strict equality on the quotient, per @cite{marcolli-chomsky-berwick-2025} Definition 1.1.1 (book p. 22) + Remark 1.1.2 (p. 23). The earlier planar TraceTree-based theorem merge_distinguishes_children (x ≠ y → merge x y ≠ merge y x) is now PROVABLY FALSE; this merge_comm lemma is what replaces it at the substrate level.

                                              Available as a simp lemma so consumers can rewrite merge x y to merge y x (or vice versa) freely.

                                              Equations
                                              Instances For
                                                def Minimalist.internalMerge (target mover : SyntacticObject) (_h : target mover) :
                                                Equations
                                                Instances For
                                                  def Minimalist.mkLeaf (cat : Cat) (sel : SelStack) (id : ) :

                                                  Create a leaf SO from category and selection

                                                  Equations
                                                  Instances For
                                                    def Minimalist.mkLeafPhon (cat : Cat) (sel : SelStack) (phon : String) (id : ) :

                                                    Create a leaf SO with a phonological form

                                                    Equations
                                                    Instances For

                                                      Linearization-dependent operations (Phase 1.0 placeholder) #

                                                      phonYield, linearize, leftmostLeaf, outerCat traverse the SO in planar order, which is a representative-choice over the unordered quotient. Per @cite{marcolli-chomsky-berwick-2025} Remark 1.1.2 (book p. 23), linearization belongs to Externalization, not to Merge proper.

                                                      Phase 1.0 placeholder: pick an arbitrary representative via Quot.out and traverse it on the underlying FreeMagma. Noncomputable because Quot.out uses Classical.choose. Phase 2 of the migration replaces this with a head-marking + head-directionality parameter (LCA), making linearization a parameterized choice rather than an arbitrary one.

                                                      def Minimalist.mkTraceToken (index : ) :

                                                      Trace marker token: synthesized when traversal needs an LIToken at a .trace n position. Encodes the trace index in the id field (sentinel ≥ 10000), preserving backward-compat for code that inspects tok.id.

                                                      Equations
                                                      Instances For
                                                        def Minimalist.phonYieldPlanar :
                                                        FreeMagma (LIToken )List String

                                                        Underlying phonological yield on a planar FreeMagma representative. Public so HeadFunction.phonYieldWith can compose it with a chosen externalize section.

                                                        Equations
                                                        Instances For
                                                          def Minimalist.linearizePlanar :
                                                          FreeMagma (LIToken )List LIToken

                                                          Underlying linearization on a planar FreeMagma representative. Public so HeadFunction.linearize can compose it with a chosen Section.σ for harmonic head-initial linearization.

                                                          Equations
                                                          Instances For
                                                            def Minimalist.leftmostLeafPlanar :
                                                            FreeMagma (LIToken )LIToken

                                                            Underlying leftmost-leaf on a planar FreeMagma representative. Exposed (not private) so HeadFunction.head can compose it with a chosen Section.σ for harmonic head-initial head extraction.

                                                            Equations
                                                            Instances For
                                                              def Minimalist.rightmostLeafPlanar :
                                                              FreeMagma (LIToken )LIToken

                                                              Underlying rightmost-leaf on a planar FreeMagma representative. Mirror of leftmostLeafPlanar for harmonic head-final convention (per @cite{marcolli-chomsky-berwick-2025} Lemma 1.13.5).

                                                              Equations
                                                              Instances For

                                                                Extract the phonological form from an LIToken.

                                                                Equations
                                                                Instances For
                                                                  theorem Minimalist.phonYield_eq_linearize_planar (fm : FreeMagma (LIToken )) :
                                                                  phonYieldPlanar fm = List.filterMap (fun (tok : LIToken) => have p := tok.phonForm; if p.isEmpty = true then none else some p) (linearizePlanar fm)

                                                                  Underlying agreement on a planar representative: phonYield is the non-empty-phonForm filter of linearize. Used as a lemma for any HeadFunction h consumer that wants to relate h.phonYield to h.linearize.

                                                                  Create a trace SO with binding index index. Detectable via isTrace.

                                                                  Equations
                                                                  Instances For
                                                                    @[simp]
                                                                    @[simp]
                                                                    theorem Minimalist.isTrace_mul (l r : SyntacticObject) :
                                                                    isTrace (l * r) = none

                                                                    Subtree enumeration — Multiset (MCB-faithful) #

                                                                    Two operators, distinguished by whether the root is included:

                                                                    Both are Multiset-valued (multiplicity-preserving) because MCB §1.6/§1.7 counting and the pre-Lie matching coefficient c^T_{T_1,T_2} (book p. 79) require multiplicities. For linglib's typical use (distinct-LIToken.id leaves), each subtree is a distinct value, so Multiset and Finset coincide extensionally — but the substrate commits to Multiset for faithfulness.

                                                                    Multiset addition (+) preserves multiplicity (additive union); cons (::ₘ) adds an element regardless of presence.

                                                                    All subtrees of a SyntacticObject, including the root itself. Returns a Multiset (multiplicity-preserving). For the MCB accessible-terms operator that excludes the root, see Acc below. Computable via FreeCommMagma.lift from a swap-respecting helper.

                                                                    Equations
                                                                    Instances For
                                                                      @[simp]

                                                                      MCB's accessible-terms operator Acc(T) (Def 1.2.2, book p. 28): the multiset of all proper subtrees of T, excluding the root. Defined as subtrees - {self} (Multiset difference subtracts one occurrence). This is the operator that satisfies the MCB vertex-counting identity #V(F) = b₀(F) + #Acc(F) (eq. 1.2.5).

                                                                      Equations
                                                                      Instances For
                                                                        @[simp]
                                                                        @[simp]

                                                                        Terminal nodes — planar List (order matters for LCA) #

                                                                        terminalNodes returns a List because the leftmost-to-rightmost order is the input to LCA-derived linearization. Phase 1.0 placeholder via Quot.out (planar representative); Phase 2 will replace with LCA + head-directionality.

                                                                        def Minimalist.terminalNodesPlanar :
                                                                        FreeMagma (LIToken )List (FreeMagma (LIToken ))

                                                                        Underlying terminal-node enumeration on a planar FreeMagma representative. Public so HeadFunction.terminalNodesWith can compose it with a chosen externalize section.

                                                                        Equations
                                                                        Instances For

                                                                          The terminal (leaf-position) SOs of a SyntacticObject. Order depends on the chosen planar representative.

                                                                          Equations
                                                                          Instances For

                                                                            Every terminal node is leaf-position.

                                                                            @[simp]

                                                                            For a leaf SO, terminalNodes is the singleton list [.leaf tok]. Same singleton-class argument as leftmostLeaf_leaf.

                                                                            Every terminal is a subtree.

                                                                            The root is always in its own subtrees.

                                                                            subtrees is downward-monotone along the subtree relation: if t is a subtree of s (at any multiplicity), then every subtree of t is also a subtree of s (Multiset.Subset ignores multiplicity, only membership).

                                                                            @[simp]
                                                                            theorem Minimalist.immediatelyContains_mul (l r y : SyntacticObject) :
                                                                            immediatelyContains (l * r) y y = l y = r
                                                                            @[implicit_reducible]

                                                                            Decidable immediate containment. Recurses via Quot.recOnSubsingleton on the underlying FreeMagma representative.

                                                                            Equations
                                                                            • One or more equations did not get rendered due to their size.

                                                                            Containment is the transitive closure of immediate containment

                                                                            "X contains Y iff X immediately contains Y or X immediately contains some SO Z such that Z contains Y"

                                                                            This is standard syntactic dominance.

                                                                            Instances For

                                                                              Immediate containment implies containment

                                                                              theorem Minimalist.contains_trans {x y z : SyntacticObject} (hxy : contains x y) (hyz : contains y z) :

                                                                              Containment is transitive

                                                                              Leaves contain nothing

                                                                              Trace markers contain nothing.

                                                                              Immediate containment strictly decreases nodeCount

                                                                              Containment strictly decreases nodeCount

                                                                              No element contains itself (containment is irreflexive)

                                                                              @[implicit_reducible]
                                                                              instance Minimalist.decContains (x y : SyntacticObject) :
                                                                              Decidable (contains x y)

                                                                              Containment is decidable by structural recursion on the containing SO's underlying FreeMagma representative (via Quot.recOnSubsingletonDecidable p is a subsingleton up to propositional equality).

                                                                              Equations

                                                                              X is a term of Y iff X = Y or Y contains X

                                                                              "X is a term of SO Y iff X = Y or Y contains X"

                                                                              This is useful for stating when an element is "somewhere in" a structure

                                                                              Equations
                                                                              Instances For

                                                                                Every SO is a term of itself

                                                                                If Y contains X, then X is a term of Y

                                                                                @[implicit_reducible]
                                                                                instance Minimalist.decIsTermOf (x y : SyntacticObject) :
                                                                                Decidable (isTermOf x y)
                                                                                Equations

                                                                                Reflexive containment (useful for stating constraints)

                                                                                Equations
                                                                                Instances For
                                                                                  @[implicit_reducible]
                                                                                  instance Minimalist.decContainsOrEq (x y : SyntacticObject) :
                                                                                  Decidable (containsOrEq x y)
                                                                                  Equations

                                                                                  Every SO reflexively contains itself

                                                                                  Reflexive containment is transitive

                                                                                  Subtree-Finset ↔ containment bridge #

                                                                                  These theorems relate subtrees : SO → Finset SO to the propositional contains/containsOrEq relations. With subtrees Finset-typed (order-blind, computable), the bridges are provable by induction — no Quot.out oracle dependence.

                                                                                  theorem Minimalist.mem_subtrees_of_imm_contains {root w z : SyntacticObject} (hw : w root.subtrees) (hwz : immediatelyContains w z) :
                                                                                  z root.subtrees

                                                                                  Children of any subtree of root are themselves subtrees of root.

                                                                                  Containment implies subtree-Finset membership.

                                                                                  Subtree-Multiset membership implies term-of relation.

                                                                                  isTermOf z y iff z is in y.subtrees. The bridge between the propositional term-of relation and the Finset-typed enumeration.

                                                                                  X and Y are sisters IN tree root: they are distinct co-daughters of some node that is a subtree of root.

                                                                                  Equations
                                                                                  Instances For
                                                                                    @[implicit_reducible]
                                                                                    instance Minimalist.decAreSistersIn (root x y : SyntacticObject) :
                                                                                    Decidable (areSistersIn root x y)
                                                                                    Equations

                                                                                    X c-commands Y IN tree root: X has a sister (in root) that contains-or-equals Y.

                                                                                    Equations
                                                                                    Instances For
                                                                                      @[implicit_reducible]
                                                                                      instance Minimalist.decCCommandsIn (root x y : SyntacticObject) :
                                                                                      Decidable (cCommandsIn root x y)

                                                                                      cCommandsIn is decidable: bounded existential over root.subtrees (Multiset). Computable since subtrees is.

                                                                                      Equations

                                                                                      X asymmetrically c-commands Y in tree root.

                                                                                      Equations
                                                                                      Instances For
                                                                                        @[implicit_reducible]
                                                                                        instance Minimalist.decAsymCCommandsIn (root x y : SyntacticObject) :
                                                                                        Decidable (asymCCommandsIn root x y)
                                                                                        Equations

                                                                                        Note on Barker-Pullum command relations #

                                                                                        An earlier draft of this file recast cCommandsIn as B&P K-command (Core.Order.commandRelation over FreeMagma.toTree's tree-order), aiming to inherit B&P's antitonicity / intersection theorems "for free" in the same lattice as HPSG o-command and DG d-command.

                                                                                        That recast does not work: B&P's commandRelation is universal ("every branching ancestor of α also dominates β") and admits pairs that sister-form c-command excludes — e.g., on tree (a, (b, c)), the leaf a B&P-commands itself and the root, since the only branching strict ancestor is the root, which trivially dominates everything. Reinhart c-command is sister-form (∃ sister of α containing β); the two relations are not biconditionally equivalent on FreeMagma.

                                                                                        The mathlib-style resolution: keep cCommandsIn value-side and sister-form (it's what binding consumers need, and it reads directly off bare phrase structure per @cite{marcolli-chomsky-berwick-2025}). Cross-tradition unification via B&P's parametric command lives in Core/Order/Command.lean and is exercised by HPSG / DG, where its universal shape matches the native primitive. Minimalism's c-command does not reduce to it, so no bridge belongs here.

                                                                                        Multiset shape — abstract geometry over the nonplanar quotient #

                                                                                        The unlabeled SO shape: replace every leaf token / trace marker with (). The result lives in FreeCommMagma Unit — the canonical nonplanar analog of the prior planar TreeShape (which was deleted because .node L R = .node R L failed to hold under MCB §1.1.3, book p. 23). Two SOs are structurally isomorphic iff their shapes are equal as elements of FreeCommMagma Unit.

                                                                                        The unlabeled multiset-shape of an SO: forget every leaf token and trace marker, keeping only the binary-tree structure modulo nonplanar quotient. Functorial via FreeCommMagma.map.

                                                                                        Equations
                                                                                        Instances For
                                                                                          @[reducible, inline]

                                                                                          The unit shape — a single leaf in FreeCommMagma Unit. Useful abbreviation for stating shape equalities like so.shape = leafShape * (leafShape * leafShape). Was previously duplicated as private def leafShape in three Phenomena Studies files (DendikkenBasic, HaddicanEtAl, Causatives); hoisted to substrate to eliminate the triplicate.

                                                                                          Equations
                                                                                          Instances For

                                                                                            Two SOs are structurally isomorphic iff they have the same nonplanar shape (ignoring all leaf labels). Replaces the prior Bool-valued structurallyIsomorphic from the planar substrate; Prop-valued + DecidableEq is more mathlib-aligned.

                                                                                            Equations
                                                                                            Instances For
                                                                                              @[implicit_reducible]
                                                                                              Equations

                                                                                              Y-model branch to LF #

                                                                                              The LF lift SyntacticObject.toLFTree : SyntacticObject → Tree Unit String lives in Theories/Syntax/Minimalism/SpellOut.lean, where trace detection and phonForm extraction are handled. PF (linearize / phonYield) operates natively on SyntacticObject and does not require a lift.