Documentation

Linglib.Morphology.Unification

Unification: subsumption and joins for feature bundles #

[Shi86]

The information ordering of unification-based grammar ([Shi86] §3.2), on UD.MorphFeatures — the depth-1, reentrancy-free fragment of Shieber's feature structures. Every feature here is atomic-valued, so paths are single features, the reentrancy clause of subsumption is vacuous, and the definition (§3.2.2: D ⊑ D′ iff D(l) ⊑ D′(l) for all l ∈ dom(D); "an atomic feature structure neither subsumes nor is subsumed by a different atomic feature structure"; "variables subsume all other feature structures") reduces to the product of flat orders. Subsumption is registered as (Shieber's ); the all-none bundle — Shieber's variable [ ] — is .

Unification (§3.2.3) is "the most general feature structure D such that D′ ⊑ D and D′′ ⊑ D", failing on conflict: MorphFeatures.unify returns some exactly on Compatible (= bounded above) inputs, and its result is the least upper bound (unify_isLUB). The example laws of §3.2.3 are theorems: unification is idempotent (unify_self), commutative (unify_comm), and variables are identity elements (bot_unify).

Main declarations #

Theory-neutrality boundary #

Three strata with different statuses: the record is annotation consensus (Data/UD/Basic.lean); the order is shared substrate every framework consumes its own way (DM's matching clause, underspecification, syncretism down-sets); the operations / are commitments of the unification tradition — [Shi86] §3.1 states unification-as-sole-combinator as a design constraint, and rival frameworks combine differently (DM matches and competes; Minimalist Agree values asymmetrically). This file is not that tradition's headquarters: unification-based grammar (PATR, HPSG, LFG — reentrant feature structures, phrasal combination) is a syntax family whose substrate belongs in Syntax/ when consuming studies demand it. What lives here is only the tradition's morphological-bundle fragment — the algebra of one token's Feats column — which it shares with rivals: at the level of claims about morphological feature combination this file is a sibling of DM/ and Nanosyntax/, not a foundation beneath them.

Implementation notes #

Morphology owns the bundle algebra: MorphFeatures is the token's morphology (UD's Feats column), and unification at the ms-word level is the morphology/syntax interface operation. The matching clause of DM's Subset Principle (an exponent is insertable iff exponent.features ≤ morpheme.features) could consume directly, but the existing Morphology/DM/VocabularyInsertion.lean matches by List-subset on [BEq F] rather than MorphFeatures.≤; bridging is left for a future PR. The competition clause — most-specified-wins — is separate argmax machinery already implemented in that same file. (Nanosyntax's Superset Principle is not a consumer: it matches by containment of syntactic trees, see Nanosyntax/TreeSpellout.lean's NanoTree.contains, not by an order on flat bundles.) Lives apart from Data/UD/Basic.lean so the (heavily imported) standard mirror stays mathlib-free — this file is the one that pays for Mathlib.Order — and it is the canonical home for order instances on UD.MorphFeatures.

The non-distributivity of the subsumption lattice is a documented obstruction (MorphFeatures.not_distrib_witness): any ≥3-value slot (here, Case) makes the per-slot lattice the diamond Mₙ, modular but not distributive ([Car92] p. 15, eq. (4), notes this explicitly: "our partial orders are not required to be distributive (and in fact, are not even required to be modular)"). This matters for generalization-then-unification reorderings in paradigm-induction learners.

The flat order on one feature slot #

Option.FlatLE is the slot-level subsumption relation ([Shi86] §3.2.2: none below everything, distinct atoms incomparable). The definition and its refl/trans/antisymm/none_le API live in Linglib/Core/Order/Flat.lean, where they also carry the bundled PartialOrder/OrderBot/SemilatticeInf/PartialUnify instances on the order-carrying alias Flat α. Re-exported here under their original names for backward compatibility with this file's own proofs and consumers.

The 14-case feature-type index for MorphFeatures — the signature of UD's Feats column treated as a fixed finite type family.

Instances For
    @[implicit_reducible]
    Equations
    Equations
    Instances For

      Subsumption is a partial order with bottom #

      Subsumption ([Shi86] §3.2.2): f carries a subset of g's information — field-wise flat order on the option slots, implication on the reflex flag.

      Equations
      • One or more equations did not get rendered due to their size.
      Instances For

        MorphFeatures as a feature bundle, and the derived order #

        MorphFeatures realizes BundleLike over the 14-case signature MorphFeatureType ([Car92]'s abstract feature structure): each slot projects to a Flat value, with the reflex flag normalized false ↦ none, true ↦ some () (the privative Unit case). The valuation is injective, so MorphFeatures is LawfulBundleLike, and the subsumption order is the per-slot Flat order pulled back along val (subsumes_iff_val_le) — the PartialOrder/OrderBot laws derive from the bundle embedding rather than being proved field by field.

        theorem UD.MorphFeatures.val_injective :
        Function.Injective val

        The valuation is injective: a MorphFeatures bundle is determined by its per-slot assignments (with reflex reconstructed from Option Unit).

        Subsumption is exactly the bundle order: the field-wise 14-conjunct form coincides with the per-slot Flat order on the valuation val.

        @[implicit_reducible]
        Equations
        • One or more equations did not get rendered due to their size.
        @[implicit_reducible]
        Equations
        @[implicit_reducible]
        instance UD.MorphFeatures.instDecidableLe (f g : MorphFeatures) :
        Decidable (f g)

        Subsumption is decidable (each slot is).

        Equations
        @[implicit_reducible]

        The empty bundle — Shieber's variable [ ] — is bottom: "variables subsume all other feature structures … they contain no information at all" (§3.2.2).

        Equations

        Compatibility is boundedness above; unification is the least upper bound #

        Prop-native compatibility: the pair is bounded above in the subsumption order — mathlib's BddAbove, so the bounds API applies directly. Characterized by the executable compatible check via compatible_iff_bddAbove, which also makes it decidable.

        Equations
        Instances For

          The left input subsumes the merge.

          theorem UD.MorphFeatures.le_merge_right {f g : MorphFeatures} (h : f.compatible g = true) :
          g f.merge g

          The right input subsumes the merge — given compatibility (the doubly committed slots agree, so the left bias is harmless).

          theorem UD.MorphFeatures.merge_le {f g u : MorphFeatures} (hf : f u) (hg : g u) :
          f.merge g u

          The merge is below every common upper bound — minimality ("the most general feature structure", §3.2.3).

          theorem UD.MorphFeatures.compatible_of_le {f g u : MorphFeatures} (hf : f u) (hg : g u) :
          f.compatible g = true

          Bounded above implies the executable check passes.

          The Bool check is exactly boundedness above in the subsumption order: the order-theoretic identity of "compatible".

          @[implicit_reducible]
          Equations
          theorem UD.MorphFeatures.unify_eq_some_iff (f g : MorphFeatures) :
          (f.unify g).isSome = true f.Compatible g

          Unification succeeds exactly on compatible inputs (§3.2.3: otherwise it "fails").

          theorem UD.MorphFeatures.unify_isLUB {f g u : MorphFeatures} (h : f.unify g = some u) :
          IsLUB {f, g} u

          Unification is the least upper bound ([Shi86] §3.2.3: "the most general feature structure D such that D′ ⊑ D and D′′ ⊑ D").

          Generalization: the meet #

          Shieber's generalization (anti-unification): the most specific bundle subsumed by both inputs. Unlike unification it is total — the meet always exists — so MorphFeatures is a genuine SemilatticeInf with .

          @[implicit_reducible]
          Equations
          • One or more equations did not get rendered due to their size.
          @[implicit_reducible]
          Equations
          • One or more equations did not get rendered due to their size.

          Unification computes least upper bounds — further laws #

          @[implicit_reducible]

          MorphFeatures carries the pairwise-LUB structure of [Car92]'s bounded complete partial order: unify is the partial join, with unify_isLUB and compatible_iff_bddAbove supplying the two class axioms. The unification laws — commutativity, associativity (with failure propagating), -identity, idempotence, monotonicity — are inherited as one-line corollaries of the generic theorems in Core/Order/PartialUnify.lean.

          Equations
          @[simp]

          The instance-projected unify is the same function as MorphFeatures.unify.

          theorem UD.MorphFeatures.unify_eq_some_iff_isLUB {f g u : MorphFeatures} :
          f.unify g = some u IsLUB {f, g} u

          Unification succeeds with value u exactly when u is the least upper bound.

          Unification is commutative — a consequence of total compatibility: doubly committed slots agree, so the per-field left bias washes out.

          @[simp]
          theorem UD.MorphFeatures.unify_self (f : MorphFeatures) :
          f.unify f = some f

          Unification is idempotent (§3.2.3's example law).

          @[simp]

          Variables are unification identity elements (§3.2.3's example law): unifying with the empty bundle returns the other input.

          theorem UD.MorphFeatures.unify_assoc (f g h : MorphFeatures) :
          ((f.unify g).bind fun (x : MorphFeatures) => x.unify h) = (g.unify h).bind fun (x : MorphFeatures) => f.unify x

          Unification is associative, with failure propagating ([Shi86] §3.2.3's order-independence): both associations compute the lub of all three bundles.

          @[simp]

          The empty bundle is a right identity for unification.

          Unification fails exactly on incompatible inputs.

          theorem UD.MorphFeatures.unify_mono {f₁ f₂ g₁ g₂ u₂ : MorphFeatures} (hf : f₁ f₂) (hg : g₁ g₂) (h2 : f₂.unify g₂ = some u₂) :
          ∃ (u₁ : MorphFeatures), f₁.unify g₁ = some u₁ u₁ u₂

          Unification is monotone where defined: shrinking both inputs preserves success and shrinks the output. (Unguarded merge-monotonicity is false — the guard is needed.)