Documentation

Linglib.Theories.Morphology.FragmentGrammars.FragmentGrammar

Fragment grammars (FG) — the central proposal #

@cite{odonnell-2015}

The "inference-based" model of @cite{odonnell-2015} §2.4.4 / §3.1.8, and the central theoretical proposal of the book. A FragmentGrammar extends an AdaptorGrammar by adding a per-rule-RHS-position beta-binomial halt prior: at each nonterminal slot in each rule's right-hand side, a Beta-distributed weight ν controls the probability of recursing (productively expanding the slot) vs. halting (storing the slot as an open NT in a memoized fragment).

This generalizes both DMPCFG (ν → 1 everywhere: always recurse, fragments are depth-one) and MAG (ν decided once per nonterminal, not per slot: fragments are full subtrees). FG learns a separate recursion probability per (rule, position), letting fragments be arbitrary partial trees with selective open NT slots — the distinguishing feature of the book's account of productivity-and- reuse.

Per @cite{odonnell-2015} §3.1.8 the corpus probability factorizes as

fg(X, Y, Z; F) = ∏_{A ∈ V} [DMPCFG-factor on X^A] · [PYP-factor on Y^A]
              · ∏_{r ∈ R_G} ∏_{B ∈ rhs(r)}
                  B(ψ_B + z_{r,B}^↻, ψ'_B + z_{r,B}^⊥) /
                  B(ψ_B, ψ'_B)

where Z = z_{r,B}^{↻/⊥} is the corpus-aggregate count of recurse/ halt decisions at each (rule, RHS-position) pair, and B(·,·) is the Beta function (here written as a ratio of Γ-products to avoid a dependency on Mathlib.Probability.Distributions.Beta).

Why corpus probability is (D, Y, Z) → ℝ #

Z is latent — the recurse/halt assignment per fragment is part of the MAP analysis, not the observed corpus. Same situation as Y in AdaptorGrammar. Marginalizing over (Y, Z) is the MH inference target distribution of @cite{odonnell-2015} §3.2 — out of scope per the Processing-scope rule.

What we inherit from AdaptorGrammar #

FragmentGrammar G extends AdaptorGrammar G, so the entire DMPCFG

Main definitions #

§2.3.7 vs §3.1.8 — sampler vs distribution #

This file is the distribution side: corpusProbGivenStorage is the density fg(F; F) of @cite{odonnell-2015} §3.1.8 (p. 94). The sampler that draws from this density is in the sibling file FragmentLambda.lean, scaffolding @cite{odonnell-2015} §2.3.7's Church macro (fragment-lambda args body) (Figure 2.21, p. 71). The two are linked by the soundness contract fragmentLambdaDepth_marginalises_to_fg, which equates the sampler's output marginal with the §3.1.8 density.

Per @cite{odonnell-2015} §3.1.8 (p. 92) the substrate here implements the actual model used in the rest of the book — biased halt coin BINOMIAL(ν) with ν itself drawn from a Beta prior — not the fair-coin presentation of §2.3.6. The §2.3.7 fair-coin macro is recovered as the haltPriorRecurse = haltPriorHalt = 1 special case (Beta(1,1) = Uniform).

References #

structure Morphology.FragmentGrammars.FragmentGrammar {T : Type} [DecidableEq T] (G : ContextFreeGrammar T) [DecidableEq G.NT] extends Morphology.FragmentGrammars.AdaptorGrammar G :

A fragment grammar over G: an adaptor grammar augmented with a per-(rule, RHS-position) beta-binomial halt prior. At each nonterminal slot in each rule's right-hand side, the weight ν of recursing-vs-halting is Beta-distributed with pseudo-counts (haltPriorRecurse r i, haltPriorHalt r i).

Allowing per-slot halt decisions (rather than per-nonterminal as in MAG) lets fragments be arbitrary partial trees — the central theoretical commitment of @cite{odonnell-2015}.

  • pseudo : ContextFreeRule T G.NT
  • pseudo_pos (r : ContextFreeRule T G.NT) : r G.rules0 < self.pseudo r
  • haltPriorRecurse : ContextFreeRule T G.NT

    Beta pseudo-count for "recurse" outcome at (rule, RHS position).

  • haltPriorHalt : ContextFreeRule T G.NT

    Beta pseudo-count for "halt" outcome at (rule, RHS position).

  • haltPriorRecurse_pos (r : ContextFreeRule T G.NT) (i : ) : 0 < self.haltPriorRecurse r i

    Recurse pseudo-counts are strictly positive.

  • haltPriorHalt_pos (r : ContextFreeRule T G.NT) (i : ) : 0 < self.haltPriorHalt r i

    Halt pseudo-counts are strictly positive.

Instances For
    theorem Morphology.FragmentGrammars.FragmentGrammar.ext {T : Type} {inst✝ : DecidableEq T} {G : ContextFreeGrammar T} {inst✝¹ : DecidableEq G.NT} {x y : FragmentGrammar G} (pseudo : x.pseudo = y.pseudo) (pyp : x.pyp = y.pyp) (haltPriorRecurse : x.haltPriorRecurse = y.haltPriorRecurse) (haltPriorHalt : x.haltPriorHalt = y.haltPriorHalt) :
    x = y
    theorem Morphology.FragmentGrammars.FragmentGrammar.ext_iff {T : Type} {inst✝ : DecidableEq T} {G : ContextFreeGrammar T} {inst✝¹ : DecidableEq G.NT} {x y : FragmentGrammar G} :
    x = y x.pseudo = y.pseudo x.pyp = y.pyp x.haltPriorRecurse = y.haltPriorRecurse x.haltPriorHalt = y.haltPriorHalt
    @[reducible, inline]
    abbrev Morphology.FragmentGrammars.FragmentGrammar.HaltCounts {T : Type} (G : ContextFreeGrammar T) :

    A halt-count assignment gives, for each rule r and each RHS-position i, the corpus-aggregate counts of "recurse" and "halt" decisions at that slot. This is the latent variable Z in @cite{odonnell-2015} §3.1.8.

    Consistency between Z, Y (table assignment), and the observed corpus D is the responsibility of the caller (or of the MAP inference step we do not formalize).

    Equations
    Instances For
      noncomputable def Morphology.FragmentGrammars.FragmentGrammar.fgFactor {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) (r : ContextFreeRule T G.NT) (i : ) (z : × ) :

      Per-(rule, position) beta-binomial factor in the eq from §3.1.8. For pseudo-counts α = haltPriorRecurse r i, β = haltPriorHalt r i, and observed counts (zr, zh) = Z r i,

      B(α + zr, β + zh) / B(α, β)
        = Γ(α+zr)·Γ(β+zh)·Γ(α+β) / (Γ(α+β+zr+zh)·Γ(α)·Γ(β)) .
      

      Inlined via Real.Gamma rather than mathlib's ProbabilityTheory.beta to avoid pulling in Mathlib.Probability.Distributions.Beta.

      Equations
      • One or more equations did not get rendered due to their size.
      Instances For
        noncomputable def Morphology.FragmentGrammars.FragmentGrammar.corpusProbGivenStorage {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) (D : Multiset (CFGTree T G.NT)) (Y : AdaptorGrammar.TableAssignment G) (Z : HaltCounts G) :

        FG corpus probability conditional on a table assignment Y and a halt-count assignment Z. Per @cite{odonnell-2015} §3.1.8:

        fg(X, Y, Z; F) = ag(X, Y; A) · ∏_r ∏_i fgFactor r i (Z r i)
        

        The AG part is inherited from AdaptorGrammar.corpusProbGivenTables (via extends); FG adds only the per-(rule, position) Beta- binomial factors.

        Equations
        Instances For
          theorem Morphology.FragmentGrammars.FragmentGrammar.fgFactor_pos {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) (r : ContextFreeRule T G.NT) (i : ) (z : × ) :
          0 < M.fgFactor r i z

          The per-(rule, position) factor is strictly positive.

          theorem Morphology.FragmentGrammars.FragmentGrammar.corpusProbGivenStorage_nonneg {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) (D : Multiset (CFGTree T G.NT)) (Y : AdaptorGrammar.TableAssignment G) (Z : HaltCounts G) :

          FG corpus probability is nonnegative on any storage assignment.

          The "empty" halt-count assignment: zero recurse-decisions and zero halt-decisions at every (rule, position) pair. The corresponding fgFactor evaluates to 1 because the Beta- binomial ratio is B(α, β) / B(α, β) = 1.

          Equations
          Instances For
            @[simp]
            theorem Morphology.FragmentGrammars.FragmentGrammar.fgFactor_zero {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) (r : ContextFreeRule T G.NT) (i : ) :
            M.fgFactor r i (0, 0) = 1

            The per-(rule, position) factor at zero counts is 1: the Beta-binomial ratio degenerates to B(α, β) / B(α, β).

            @[simp]
            theorem Morphology.FragmentGrammars.FragmentGrammar.corpusProbGivenStorage_empty {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) :

            FG corpus probability is 1 on the empty corpus paired with the empty table assignment and empty halt-count assignment. Each fgFactor is 1, the AG factor is 1, so the overall product is 1.

            Bridge: FragmentGrammar → MultinomialPCFG via posterior MAP #

            FG = AG + per-(rule, RHS-position) beta-binomial halt prior. Like PYP in AG, the halt-prior factor is multiplicative in corpusProbGivenStorage and does not enter per-LHS rule-weight inference. So FG's posterior MAP rule weights factor through AG which factors through DMPCFG.

            The "tower equality" FG.posteriorMAP D = AG.posteriorMAP D = DMPCFG.posteriorMAP D is structural — every layer above DMPCFG in the prior tower is purely additive on rule-weight inference. This is the architectural payoff: the family-of-priors structure (extends chain) is genuinely "richer hyperparameter sets," not "different rule-weight semantics."

            noncomputable def Morphology.FragmentGrammars.FragmentGrammar.posteriorMAP {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) [∀ (a : G.NT), Nonempty (G.RulesWithLHS a)] (D : Multiset (CFGTree T G.NT)) :

            The Dirichlet posterior MAP collapse for FG. Same one-liner as AG, factoring transitively through extends: FG → AG → DMPCFG. The beta-binomial halt prior contributes a multiplicative factor to corpus probability but does not alter rule-weight inference.

            Equations
            Instances For
              theorem Morphology.FragmentGrammars.FragmentGrammar.posteriorMAP_eq_toAdaptorGrammar {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) [∀ (a : G.NT), Nonempty (G.RulesWithLHS a)] (D : Multiset (CFGTree T G.NT)) :

              Coherence (one step). FG's posterior-MAP agrees with AG's posterior-MAP. Holds by rfl — FG only adds halt priors on top of AG; rule-weight inference is unchanged.

              theorem Morphology.FragmentGrammars.FragmentGrammar.posteriorMAP_eq_toDMPCFG {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) [∀ (a : G.NT), Nonempty (G.RulesWithLHS a)] (D : Multiset (CFGTree T G.NT)) :

              Coherence (two steps). FG's posterior-MAP agrees with the underlying DMPCFG's posterior-MAP. The "prior tower" of length three collapses to a single canonical MultinomialPCFG by rfl.

              Conjugacy decomposition (mirror of AG / DMPCFG) #

              FG's posterior delegates through AG which delegates through DMPCFG. Halt-prior pseudo-counts are unchanged by corpus updates (the beta-binomial layer doesn't see corpus rule counts). Mode delegates to the underlying DMPCFG. The decomposition holds by rfl at all three layers, witnessing the rfl-tower discovery (the prior tower's extends-chain is structurally orthogonal to rule-weight inference).

              noncomputable def Morphology.FragmentGrammars.FragmentGrammar.posterior {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) (D : Multiset (CFGTree T G.NT)) :

              FG's posterior update: bump the underlying AG/DMPCFG, leave halt-prior pseudo-counts unchanged.

              Equations
              Instances For
                noncomputable def Morphology.FragmentGrammars.FragmentGrammar.mode {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) [∀ (a : G.NT), Nonempty (G.RulesWithLHS a)] :

                FG's mode in MultinomialPCFG-space: delegates through AG to DMPCFG. Halt-prior pseudo-counts contribute no rule-weight mass.

                Equations
                Instances For
                  @[simp]
                  theorem Morphology.FragmentGrammars.FragmentGrammar.posterior_zero {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) :
                  M.posterior 0 = M

                  Empty corpus = identity update (delegated to AG.posterior_zero).

                  theorem Morphology.FragmentGrammars.FragmentGrammar.posterior_add {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) (D₁ D₂ : Multiset (CFGTree T G.NT)) :
                  M.posterior (D₁ + D₂) = (M.posterior D₁).posterior D₂

                  Incrementality at the FG layer.

                  theorem Morphology.FragmentGrammars.FragmentGrammar.posteriorMAP_eq_mode_posterior {T : Type} [DecidableEq T] {G : ContextFreeGrammar T} [DecidableEq G.NT] (M : FragmentGrammar G) [∀ (a : G.NT), Nonempty (G.RulesWithLHS a)] (D : Multiset (CFGTree T G.NT)) :

                  The conjugacy decomposition at FG layer. posteriorMAP = modeposterior, holding by rfl because all FG operations delegate through DMPCFG.