Fragment grammars (FG) — the central proposal #

@cite{odonnell-2015}

Fragment grammars are the "inference-based" model of @cite{odonnell-2015} §2.4.4 / §3.1.8 and the central theoretical proposal of the book. A FragmentGrammar extends an AdaptorGrammar by adding a per-rule-RHS-position beta-binomial halt prior: at each nonterminal slot in each rule's right-hand side, a Beta-distributed weight ν controls the probability of recursing (productively expanding the slot) vs. halting (storing the slot as an open NT in a memoized fragment).

This generalizes both DMPCFG (ν → 1 everywhere: always recurse, fragments are depth-one) and MAG (ν decided once per nonterminal, not per slot: fragments are full subtrees). FG learns a separate recursion probability per (rule, position), letting fragments be arbitrary partial trees with selective open NT slots. This is the distinguishing feature of the book's account of productivity-and-reuse.

Per @cite{odonnell-2015} §3.1.8 the corpus probability factorizes as

fg(X, Y, Z; F) = ∏_{A ∈ V} [DMPCFG-factor on X^A] · [PYP-factor on Y^A]
              · ∏_{r ∈ R_G} ∏_{B ∈ rhs(r)}
                  B(ψ_B + z_{r,B}^↻, ψ'_B + z_{r,B}^⊥) /
                  B(ψ_B, ψ'_B)

where Z = {z_{r,B}^{↻/⊥}} collects the corpus-aggregate counts of recurse/halt decisions at each (rule, RHS-position) pair, and B(·,·) is the Beta function, B(a, b) = Γ(a)Γ(b) / Γ(a + b) (here written as a ratio of Γ-products to avoid a dependency on Mathlib.Probability.Distributions.Beta).
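To make the Γ-ratio form concrete, the per-slot factor can be expanded as follows. This is a sketch of standard Beta-function algebra, not a transcription of the book: it assumes the counts are natural numbers, uses an overline exponent for the rising factorial (a notational choice made here), and needs `amssymb` for `\circlearrowright`.

```latex
\begin{align*}
\frac{B(\psi_B + z^{\circlearrowright}_{r,B},\ \psi'_B + z^{\perp}_{r,B})}{B(\psi_B,\ \psi'_B)}
  &= \frac{\Gamma(\psi_B + z^{\circlearrowright}_{r,B})\,\Gamma(\psi'_B + z^{\perp}_{r,B})}
          {\Gamma(\psi_B + \psi'_B + z^{\circlearrowright}_{r,B} + z^{\perp}_{r,B})}
     \cdot
     \frac{\Gamma(\psi_B + \psi'_B)}{\Gamma(\psi_B)\,\Gamma(\psi'_B)} \\
  &= \frac{\psi_B^{\overline{z^{\circlearrowright}_{r,B}}}\;
           {\psi'_B}^{\overline{z^{\perp}_{r,B}}}}
          {(\psi_B + \psi'_B)^{\overline{z^{\circlearrowright}_{r,B} + z^{\perp}_{r,B}}}},
  \qquad
  a^{\overline{n}} = a(a+1)\cdots(a+n-1).
\end{align*}
```

The second line uses Γ(a + n) / Γ(a) = a(a+1)⋯(a+n−1) for natural n. It shows that each factor is the probability of one particular sequence of recurse/halt decisions under a Pólya-urn scheme with initial weights ψ_B and ψ'_B.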

Why corpus probability is `(D, Y, Z) → ℝ` #

Z is latent: the recurse/halt assignment per fragment is part of the MAP analysis, not the observed corpus. This is the same situation as Y in AdaptorGrammar. Marginalizing over (Y, Z) is the MH inference target distribution of @cite{odonnell-2015} §3.2, which is out of scope per the Processing-scope rule.

What we inherit from `AdaptorGrammar` #

FragmentGrammar G extends AdaptorGrammar G, so the entire DMPCFG + PYP infrastructure is inherited: pseudo, pseudo_pos, lhsUrn, lhsCounts, lhsFactor, pyp, pypFactor, corpusProbGivenTables, all the positivity proofs. FG only adds the beta-binomial halt prior and the per-(rule, position) factor.

Main definitions #

FragmentGrammar G — extends AdaptorGrammar G with per-(rule, RHS-position) beta-binomial pseudo-counts (haltPriorRecurse, haltPriorHalt).
FragmentGrammar.HaltCounts — abbrev for the latent recurse/halt count assignment.
FragmentGrammar.fgFactor — per-(rule, position) beta-binomial factor.
FragmentGrammar.corpusProbGivenStorage — eq from §3.1.8, conditional on table assignment Y and halt counts Z.
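The shape of the FG-specific additions might look roughly like the Lean sketch below. Every name ending in `Sketch`, the `Rule` type parameter, the use of `ℕ` as a position index, and the explicit `Finset` of slots are hypothetical stand-ins chosen for illustration. The inherited AdaptorGrammar part is omitted, and the actual signatures in this file may differ. The sketch follows the definitions list in indexing pseudo-counts by (rule, position), whereas the displayed formula indexes ψ by the nonterminal B.

```lean
import Mathlib.Analysis.SpecialFunctions.Gamma.Basic

/-- Hypothetical halt prior: Beta pseudo-counts per (rule, RHS position). -/
structure HaltPriorSketch (Rule : Type) where
  haltPriorRecurse : Rule → ℕ → ℝ
  haltPriorHalt : Rule → ℕ → ℝ
  haltPriorRecurse_pos : ∀ r i, 0 < haltPriorRecurse r i
  haltPriorHalt_pos : ∀ r i, 0 < haltPriorHalt r i

/-- Hypothetical latent counts: (recurse, halt) tallies per slot. -/
abbrev HaltCountsSketch (Rule : Type) := Rule → ℕ → ℕ × ℕ

/-- Beta function written as a ratio of Γ-products. -/
noncomputable def betaGammaSketch (a b : ℝ) : ℝ :=
  Real.Gamma a * Real.Gamma b / Real.Gamma (a + b)

/-- Beta-binomial factor for one (rule, position) slot. -/
noncomputable def fgFactorSketch {Rule : Type} (P : HaltPriorSketch Rule)
    (Z : HaltCountsSketch Rule) (r : Rule) (i : ℕ) : ℝ :=
  betaGammaSketch (P.haltPriorRecurse r i + ((Z r i).1 : ℝ))
      (P.haltPriorHalt r i + ((Z r i).2 : ℝ))
    / betaGammaSketch (P.haltPriorRecurse r i) (P.haltPriorHalt r i)

/-- Product of the per-slot factors over an explicit finite set of slots. -/
noncomputable def haltTermSketch {Rule : Type} (P : HaltPriorSketch Rule)
    (Z : HaltCountsSketch Rule) (slots : Finset (Rule × ℕ)) : ℝ :=
  slots.prod fun s => fgFactorSketch P Z s.1 s.2
```

In this reading, a definition like corpusProbGivenStorage would multiply the inherited table-conditional probability by a term of the form `haltTermSketch`. The positivity fields are kept because the Γ-ratio is only well behaved for positive arguments.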

§2.3.7 vs §3.1.8 — sampler vs distribution #

This file is the distribution side: corpusProbGivenStorage is the density fg(F; F) of @cite{odonnell-2015} §3.1.8 (p. 94). The sampler that draws from this density is in the sibling file FragmentLambda.lean, scaffolding @cite{odonnell-2015} §2.3.7's Church macro (fragment-lambda args body) (Figure 2.21, p. 71). The two are linked by the soundness contract fragmentLambdaDepth_marginalises_to_fg, which equates the sampler's output marginal with the §3.1.8 density.

Per @cite{odonnell-2015} §3.1.8 (p. 92) the substrate here implements the actual model used in the rest of the book — biased halt coin BINOMIAL(ν) with ν itself drawn from a Beta prior — not the fair-coin presentation of §2.3.6. The §2.3.7 fair-coin macro is recovered as the haltPriorRecurse = haltPriorHalt = 1 special case (Beta(1,1) = Uniform).
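The Beta(1,1) special case can be checked directly from the beta-binomial factor. The derivation below is a sketch that writes ψ = ψ' = 1 for haltPriorRecurse = haltPriorHalt = 1 and drops the (rule, position) subscripts; it assumes natural-number counts and needs `amssymb` for `\circlearrowright`.

```latex
\begin{align*}
\frac{B(1 + z^{\circlearrowright},\ 1 + z^{\perp})}{B(1,\ 1)}
  &= \frac{\Gamma(1 + z^{\circlearrowright})\,\Gamma(1 + z^{\perp})}
          {\Gamma(2 + z^{\circlearrowright} + z^{\perp})}
   = \frac{z^{\circlearrowright}!\; z^{\perp}!}{(z^{\circlearrowright} + z^{\perp} + 1)!}, \\
(z^{\circlearrowright}, z^{\perp}) \in \{(1, 0),\ (0, 1)\}
  &\;\Longrightarrow\; \frac{1!\,0!}{2!} = \frac{1}{2}.
\end{align*}
```

A single decision at a slot therefore has probability 1/2 either way, matching a fair coin. Repeated decisions at the same slot are exchangeable but not independent once ν is integrated out: two recursions give 2!·0!/3! = 1/3 rather than 1/4. The fair-coin reading thus applies to each decision's marginal.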

References #

@cite{odonnell-2015} §2.4.4, §3.1.8.

Bridge: FragmentGrammar → MultinomialPCFG via posterior MAP #

Conjugacy decomposition (mirror of AG / DMPCFG) #
