Operational fragment-lambda: a stochastic-lazy unfold sampler #
Operational counterpart to FragmentGrammar.lean's joint density
fg(F; F) of @cite{odonnell-2015} §3.1.8 (p. 94): the sampler that draws
from that density. The architecture follows the macro-expansion of
@cite{odonnell-2015} §2.3.7 Figure 2.21 — (fragment-lambda args body) ↦ (PYmem a b (lambda args (if (delay? args) (delay body) body))) —
mapping each Church construct to a Lean component:
(PYmem a b _)⤳pypDrawoverPYPStatewithPYM := StateT _ PMF.(if (delay? args) (delay body) body)⤳PMF.bernoullihalt-coin per §3.1.8 (p. 92).- The
(delay body)thunks ⤳LazyTree.fragmentconstructor.
The single remaining sorry is on fragmentLambdaDepth_marginalises_to_fg
— the depth-→-∞ proportionality theorem, documented in detail at the
theorem itself. Probabilistic-fixed-point machinery on ω-CPPOs of
sub-probability measures is the missing infrastructure (mathlib-level
work). All other operational machinery and preservation theorems
(pypDraw_preserves_wellFormed, fragmentLambdaDepth_preserves_wellFormed)
are real proofs via the PYM.Preserves combinator algebra.
@cite{odonnell-2015} §2.3.7, §3.1.8, §3.5.5 (hyperparameters: a = 0.5,
b = 100, ψ_B = 50).
Pitman–Yor memoisation state #
Universe note: types in this file live at Type (= Type 0) rather than
universe-polymorphic Type u. Universe polymorphism is blocked by the use of
PYM α D Unit in modify's signature: Unit : Type 0, so universe-
polymorphizing PYM would require either PUnit (Type-polymorphic Unit) or
ULift threading throughout. Both are tractable; deferred until a downstream
consumer needs higher-universe support. Linguistic type domains (NTs,
terminals, rules) are all small types so Type is sufficient in practice.
A Pitman–Yor memoisation slot for one input value. We track:
dishes— the value sampled at the i-th distinct tablecustomerCounts— how many customers have sat at each table
The lists are kept length-equal by the public API (empty, seatCustomer,
addTable); we don't enforce this as a structural invariant. Per
@cite{odonnell-2015} footnote 14 (p. 60), multiple tables may share a dish,
which is why we keep dishes as a list rather than a set or finmap.
- dishes : List D
- customerCounts : List ℕ
Instances For
Equations
- Morphology.FragmentGrammars.Operational.instInhabitedPYPSlot.default = { dishes := default, customerCounts := default }
Instances For
Equations
- Morphology.FragmentGrammars.Operational.PYPSlot.empty = { dishes := [], customerCounts := [] }
Instances For
Total customers seated at this slot (N in the book's notation).
Equations
- s.numCustomers = s.customerCounts.sum
Instances For
Seat one more customer at the existing table indexed by i.
Out-of-range indices leave the slot unchanged (List.set is a no-op
for out-of-bounds indices, and getD 0 + 1 = 1 falls back to seating
a customer at a fresh single-element entry — but the set no-op means
the new entry is dropped, leaving customerCounts unchanged).
Implemented via List.set rather than List.modify because mathlib
has a developed membership API for the former (List.mem_or_eq_of_mem_set)
that lets us prove seatCustomer_wellFormed directly.
Equations
- s.seatCustomer i = { dishes := s.dishes, customerCounts := s.customerCounts.set i (s.customerCounts[i]?.getD 0 + 1) }
Instances For
Open a fresh table with dish v and one initial customer.
Equations
- s.addTable v = { dishes := s.dishes ++ [v], customerCounts := s.customerCounts ++ [1] }
Instances For
A PYPSlot is well-formed when every customer count is positive.
This invariant is maintained by (but not enforced by) the public API:
empty has empty customerCounts (vacuous); addTable appends [1];
seatCustomer increments. Lifted to PYPState and used by the sampler
to discharge the otherwise-unreachable atomic-fallback branch in
slotToFinpartition.
Equations
- s.WellFormed = ∀ c ∈ s.customerCounts, 0 < c
Instances For
seatCustomer preserves wellformedness: each element of the
resulting customerCounts is either an unchanged original (positive
by h) or the newly-set value (s.customerCounts[i]?.getD 0) + 1
(at least 1 since getD 0 ≥ 0).
Pitman–Yor hyperparameters: discount a ∈ [0, 1) and concentration
b > 0. @cite{odonnell-2015} §3.5.5 (p. 102) sets a = 0.5, b = 100.
Note: the standard PYP allows b ≥ -a (boundary case), including
negative b for a > 0. We restrict to 0 < b to (a) match the book's
hyperparameter choice and (b) guarantee the new-table PYP weight
K·a + b is strictly positive at K = 0 — needed as a positivity
witness for PMF.normalize via tsum_ne_zero_iff. The boundary case can be added
as a separate variant if a consumer needs it.
- discount : ℝ
- concentration : ℝ
Instances For
Empty memoisation state (no customers anywhere).
Equations
- Morphology.FragmentGrammars.Operational.PYPState.empty h = { slots := fun (x : α) => Morphology.FragmentGrammars.Operational.PYPSlot.empty, hyper := h }
Instances For
Update the slot at input x, leaving all others unchanged.
Equations
- st.updateSlot x s = { slots := fun (y : α) => if y = x then s else st.slots y, hyper := st.hyper }
Instances For
All slots in an empty PYP state are PYPSlot.empty.
A PYPState is well-formed when every slot is well-formed. The
sampler's invariant: pypDraw and downstream fragmentLambdaDepth
preserve this — see pypDraw_preserves_wellFormed and
fragmentLambdaDepth_preserves_wellFormed.
Equations
- st.WellFormed = ∀ (x : α), (st.slots x).WellFormed
Instances For
The PYP-memoised stochastic monad #
PYM α D γ is the type of γ-valued stochastic computations whose state
is a Pitman–Yor memoisation table over inputs α storing dishes of type D.
This is the monad-stack equivalent of the (PYmem a b _) wrapper in
@cite{odonnell-2015} Figure 2.21.
Equations
- Morphology.FragmentGrammars.Operational.PYM α D γ = StateT (Morphology.FragmentGrammars.Operational.PYPState α D) PMF γ
Instances For
Lift a state-free PMF sample into PYM.
Equations
- Morphology.FragmentGrammars.Operational.PYM.liftBase p st = p.bind fun (v : γ) => PMF.pure (v, st)
Instances For
Preserves algebra: state-property preservation through PYM combinators #
A small algebra for proving that a PYM computation preserves a state-level
predicate P : PYPState α D → Prop. Closure laws under pure, bind,
get, liftBase, modify, dite, mapM let arbitrary PYM computations
built from these primitives discharge preservation mechanically.
Used in pypDraw_preserves_wellFormed and
fragmentLambdaDepth_preserves_wellFormed to discharge the WellFormed
invariant chain. The algebra is general (not tied to WellFormed); promoted
to Linglib/Core/Probability/PYM.lean once a second consumer needs it.
A PYM computation preserves a state-property P if every output
state with positive probability satisfies P, given the input state does.
Equations
- Morphology.FragmentGrammars.Operational.PYM.Preserves P m = ∀ (init : Morphology.FragmentGrammars.Operational.PYPState α D), P init → ∀ p ∈ (m init).support, P p.2
Instances For
Dependent if-then-else preserves if both branches do.
Pitman–Yor draw #
Per @cite{odonnell-2015} §2.3.3.2 (p. 59) and §3.1.6 (p. 89), the (N+1)-th customer at a slot sits at:
- existing table
i(1 ≤ i ≤ K) with weight(yᵢ − a) / (N + b) - a fresh new table with weight
(K·a + b) / (N + b)
where yᵢ is the count at table i, N = Σyᵢ, K = number of tables,
a = hyper.discount, b = hyper.concentration.
Sample from the PYP at slot x with base distribution base.
Either reuse the dish at an existing table (with the §3.1.6 weights), or
sample a fresh dish from base and seat the new customer at a new table.
Both branches update the memo state appropriately.
The base distribution is PYM-typed (state-passing) rather than the
usual PMF-typed because in fragment-lambda the recursive children of
a fresh sample themselves invoke the memo (via mem{L^B} for B≠A in
the §3.1.8 equations). The slots for distinct inputs are independent,
so this state-threading is well-defined; a per-restaurant PMF base
would suffice if the children's states were marginalised first.
Implementation note (exchangeability caveat). When base itself
modifies state (recursive pypDraw calls add tables at other slots, or
even at the same slot if grammar recursion revisits x), the table
indices change between when this pypDraw decides "new table" and when
it actually adds the new dish. Operationally we add to the post-base
state. By PYP exchangeability the resulting joint distribution agrees
with a sequential interpretation; a faithful implementation would
either snapshot-then-restore or work in a memo-free base : α → PMF D.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Lazy partial derivation trees #
A partial derivation tree under fragment-grammar generation. Three constructors:
terminal t— a fully-evaluated terminal symbol (the result ofunfoldreaching a terminal in @cite{odonnell-2015} Figure 2.7, p. 52).fragment x— a non-terminal stored as a fragment-leaf: the type-level representation of the(delay body)thunks of Figure 2.21 (p. 71). When the body offragment-lambdaflips heads on the halt coin, the result at this position isfragment xrather than further recursion.branch r x children— a non-terminal expanded by ruler. The rule is stored alongside the NT and children so that downstream consumers (samplesToCorpusCounts) can credit halt-vs-recurse decisions to the specific (rule, position) pair, matching @cite{odonnell-2015} §3.1.8's rule-indexed beta-binomial structure.
The third type parameter R is the rule type — typically
ContextFreeRule T G.NT for CFGs, abstract here so the same LazyTree
can be used by other generative formalisms (TAG, DM-PCFG variants).
The "complete" tree (no fragment-leaves anywhere) is the result of forcing all delayed thunks until termination.
- terminal {α β R : Type} : β → LazyTree α β R
- fragment {α β R : Type} : α → LazyTree α β R
- branch {α β R : Type} : R → α → List (LazyTree α β R) → LazyTree α β R
Instances For
Equations
Instances For
The fragment-leaf frontier: the non-terminals stored as halted sub-derivations (the "open slots" of a fragment, in @cite{odonnell-2015} §3.1.8's terminology).
Equations
- (Morphology.FragmentGrammars.Operational.LazyTree.terminal a).fragmentLeaves = []
- (Morphology.FragmentGrammars.Operational.LazyTree.fragment x_1).fragmentLeaves = [x_1]
- (Morphology.FragmentGrammars.Operational.LazyTree.branch a a_1 kids).fragmentLeaves = List.flatMap Morphology.FragmentGrammars.Operational.LazyTree.fragmentLeaves kids
Instances For
The terminal yield: the in-order sequence of terminal symbols reachable without forcing any fragment-leaf.
Equations
Instances For
fragmentLambda — depth-bounded operational unfold #
The depth bound is a structural-recursion device. The mathematically- intended object is the depth-∞ limit, which terminates almost surely when the recurse probability is bounded away from 1 (geometric depth). Formalising the limit requires probabilistic-fixed-point machinery not yet in mathlib; the bounded version is genuine and ships now.
The structure of @cite{odonnell-2015} §3.1.8 (p. 93) is
G^a_FG(d) = Σ mem{L^A}(s) · ∏ G^root_FG(s'_i) -- PYP-wrap + recurse on holes
L^A(d) = Σ θ_r ∏ [ν · G^root_FG + (1-ν) · 1] -- biased halt-or-recurse
mem{L^A} ~ PYP(a^A, b^A, L^A) -- Pitman-Yor memoization
We split this into two co-located functions:
stochasticLazyUnfoldDepth— the un-memoised §2.3.5.2 model (L^Awith children recursing back into the unfold itself, no PYP). This is @cite{odonnell-2015} §2.3.5.2'sstochastic-lazy-unfold(Figure 2.18, p. 68) — a recognised standalone sub-model in the book. Fully defined; kept here as the reference for the un-memoised model and as a sub-step for understanding the §3.1.8 architecture.fragmentLambdaDepth— the faithful §3.1.8 model (G^FG = mem{L^A}). Each call wraps withpypDraw; the inner body recurses onfragmentLambdaDepthitself for non-terminal children — children's draws also consult the memo, faithfully matching §3.1.8's mutual recursion. Lean accepts this as structural recursion through thepypDrawlambda +mapM+if(the recursive call is onn, structurally smaller thann+1).
Depth-bounded stochastic-lazy unfold per @cite{odonnell-2015}
§2.3.5.2 (Figure 2.18, p. 68). At each call to non-terminal x:
- Halt-or-recurse step: flip the biased coin (§3.1.8
BINOMIAL(ν), p. 92). With probability1 − recurseProb x, halt and returnLazyTree.fragment x. Otherwise sample a (rule, RHS) pair viarecurse x, recursively expand each non-terminal child at the next- smaller depth, map each terminal toLazyTree.terminal, and assembleLazyTree.branch rule x kids.
The rule is stored on the branch so samplesToCorpusCounts can credit
halt-vs-recurse decisions to the specific (rule, position) pair —
matching the rule-indexed beta-binomial structure of §3.1.8.
Depth 0 always halts. Pure PMF (no PYP memo state) — the un-memoised
sub-model. The PYP-wrapped version is fragmentLambdaDepth below.
Equations
- One or more equations did not get rendered due to their size.
- Morphology.FragmentGrammars.Operational.stochasticLazyUnfoldDepth recurse recurseProb recurseProb_le 0 x✝ = PMF.pure (Morphology.FragmentGrammars.Operational.LazyTree.fragment x✝)
Instances For
fragmentLambdaDepth — the PYP-memoised §3.1.8 model #
Wraps each non-terminal call with pypDraw so that previously-sampled
partial subtrees at the same non-terminal can be reused; recursive
children calls go back through fragmentLambdaDepth itself (PYP-wrapped),
faithfully matching @cite{odonnell-2015} §3.1.8's mutual recursion
G^FG = mem{L^A} ↔ L^A-body-calls-G^FG-on-children.
The inner per-call body is factored as fragmentLambdaStep so the
preservation proof can apply combinator lemmas to a function with a
visible name and type; the two presentations are definitionally equal.
Equations
- One or more equations did not get rendered due to their size.
- Morphology.FragmentGrammars.Operational.fragmentLambdaDepth recurse recurseProb recurseProb_le 0 x✝ = pure (Morphology.FragmentGrammars.Operational.LazyTree.fragment x✝)
Instances For
Wellformedness preservation #
pypDraw preserves PYPState.WellFormed: every state in the support
of the output PMF is wellformed, given a wellformed input state and a
wellformed-preserving base distribution.
The proof would observe that pypDraw's two branches are:
- existing-table:
seatCustomer i(preserves wellformedness viaPYPSlot.seatCustomer_wellFormed) - new-table:
addTable dishafter sampling frombase(preserves viaPYPSlot.addTable_wellFormed+ base's preservation hypothesis)
Combined with PYPState.updateSlot_wellFormed, both branches yield a
wellformed state. The PMF support is contained in the union of these
branches' images, so all output states are wellformed.
Status: sorry. As of 0.230.519's seatCustomer-discharge iteration,
the slot-level lemmas (seatCustomer_wellFormed, addTable_wellFormed)
are real proofs. The remaining obstacle is PMF.support reasoning through
the do-block: pypDraw is a chain of binds, and the support of a bind
is {b | ∃ a ∈ p.support, b ∈ (f a).support} (PMF.mem_support_bind_iff).
Proving the result via this chain is mechanical (~50-100 LOC) but
requires patient manipulation of PMF.support_pure, support_bind,
support of PMF.normalize, etc.
fragmentLambdaDepth preserves PYPState.WellFormed: every state in
the support of the output PMF is wellformed, given a wellformed input
state. By induction on depth: depth 0 trivially preserves (no state
change, just pure (.fragment x)); depth n+1 via
pypDraw_preserves_wellFormed with the inner body's preservation —
which itself follows by IH for non-terminal children's
fragmentLambdaDepth ... n calls.
When discharged, this theorem lets samplesToCorpusCounts and the
soundness theorem assume input-state wellformedness, which in turn lets
slotToFinpartition discharge its atomic-fallback branch via absurd
(the branch is unreachable for any state the sampler can produce).
Halt-count extraction from samples #
See module note above on collectHaltCounts semantics.
Equations
- One or more equations did not get rendered due to their size.
- (Morphology.FragmentGrammars.Operational.LazyTree.terminal a).collectHaltCounts x✝¹ x✝ = (0, 0)
- (Morphology.FragmentGrammars.Operational.LazyTree.fragment a).collectHaltCounts x✝¹ x✝ = (0, 0)
Instances For
List-recursion helper for collectHaltCounts.
Equations
- One or more equations did not get rendered due to their size.
- Morphology.FragmentGrammars.Operational.LazyTree.collectKidsHaltCounts [] x✝¹ x✝ = (0, 0)
Instances For
See module note above on toCFGTree? semantics.
Equations
- One or more equations did not get rendered due to their size.
- (Morphology.FragmentGrammars.Operational.LazyTree.terminal a).toCFGTree? = some (CFGTree.leaf a)
- (Morphology.FragmentGrammars.Operational.LazyTree.fragment x_1).toCFGTree? = none
Instances For
List-recursion helper for toCFGTree?.
Equations
- One or more equations did not get rendered due to their size.
- Morphology.FragmentGrammars.Operational.LazyTree.toCFGTreesList [] = some []
Instances For
Soundness contract #
Convert a PYPSlot to a sigma-pair Σ n, OrderedFinpartition n
suitable for AdaptorGrammar.TableAssignment. Three branches:
- Empty slot (
numCustomers = 0): returns⟨0, default⟩directly. Preserves the depth-0 lemma'sfrom rflchain (the empty case is defeq toAdaptorGrammar.emptyTables G's entry). - Non-empty wellformed slot (
∀ c ∈ customerCounts, 0 < c): builds a realComposition s.numCustomersand projects via the newComposition.toOrderedFinpartition(inLinglib/Core/Probability/PitmanYor.lean). The resulting partition faithfully captures the slot's table-grouping structure — whatpypFactoractually depends on for its EPPF value. - Non-empty non-wellformed slot (a
customerCountis0): falls back toOrderedFinpartition.atomic. This branch is unreachable under the sampler's invariant (pypDraw'saddTableinitialises counts to1,seatCustomerincrements). TheWellFormedpredicate onPYPSlotandPYPStateand the preservation theoremspypDraw_preserves_wellFormed(proved viaPYM.Preservescombinator algebra) andfragmentLambdaDepth_preserves_wellFormed(induction sorry'd, sketch in docstring) capture this invariant. When the depth-induction is discharged, this branch can be eliminated by taking(_ : init.WellFormed)as a hypothesis toslotToFinpartitionand discharging the unreachable case viaabsurd.
The numCustomers = 0 special case is load-bearing: without it, the
Composition-built partition for an empty slot is not definitionally
equal to OrderedFinpartition.atomic 0 = default (both are unique
inhabitants of OrderedFinpartition 0 modulo @[ext], but the partSize
functions are syntactically distinct: one is (empty composition).blocksFun,
the other is constant 1). Splitting on numCustomers = 0 keeps the
empty branch defeq to the prior shim, preserving the depth-0 lemma.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Convenient abbreviation for the corpus-counts triple consumed by
FragmentGrammar.corpusProbGivenStorage: derivation multiset + per-NT
table assignment + per-(rule, position) halt counts.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Extract corpus-counts triple (D, Y, Z) from a completed sample.
Maps a LazyTree (with PYP final state) into CorpusCounts T G:
D : Multiset (CFGTree T G.NT)— the underlying derivation treesY : AdaptorGrammar.TableAssignment G— table-level reuse countsZ : FragmentGrammar.HaltCounts G— recurse/halt counts per (rule, position)
Status:
Zis real (viaLazyTree.collectHaltCounts).Yis faithful for sampler-reachable slots (viaslotToFinpartitionusingComposition.toOrderedFinpartition): for any slot the sampler can produce (wherecustomerCountsare all positive —addTableinitialises to1,seatCustomerincrements), the resultingOrderedFinpartitionhas the true partition structure, andpypFactorevaluated on it gives its EPPF value. The atomic-fallback branch inslotToFinpartitionis only reached for non-wellformed slots (impossible by sampler invariant; future work proves it).Dis real (viaLazyTree.toCFGTree?): if the tree projects to a completeCFGTree(no fragment-leaves),Dis the singleton multiset of that derivation; otherwise (.fragmentsomewhere in the tree)Dis empty — incomplete samples contribute no derivation.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Depth-0 base case for the un-memoised unfold: returns
PMF.pure (.fragment start).
Depth-0 base case: the sampler at depth 0 is the trivial state-passing
pure of LazyTree.fragment start, with no state changes.
This is genuinely operational content: it pins down what the sampler
does at the depth-bound boundary, where every branch halts. Provable
by rfl because pure in the PYM = StateT _ PMF monad reduces
definitionally to fun st => PMF.pure (·, st).
Depth-0 soundness corollary: at depth 0, for the trivial sample
(LazyTree.fragment start, st), the PMF mass equals the §3.1.8 density
at the empty corpus / empty tables / empty halt-counts triple — both
equal to 1.
A .fragment leaf has no branches, so collectHaltCounts returns the
zero pair for every (rule, position), making samplesToCorpusCounts.Z
extensionally equal to emptyHaltCounts G. The depth-0 sample mass is
1 (PMF.pure), and the empty-corpus density is 1 by
corpusProbGivenStorage_empty; equality holds.
This is a real, fully-proved slice of the soundness contract. The
general soundness theorem below remains sorry-marked.
The marginal mass the sampler puts on samples whose extracted
corpus-counts equal target. Defined via PMF pushforward (PMF.map)
along samplesToCorpusCounts, which avoids the function-equality
DecidableEq issue that an explicit tsum + if-then-else formulation
would face for (Multiset × TableAssignment × HaltCounts).
The pushforward is (samplerPMF.map extract) target = ∑' s, μ s · 1[extract s = target].
Equations
- One or more equations did not get rendered due to their size.
Instances For
Soundness contract (general — proportionality form). The marginal
mass the sampler puts at corpus-counts triple (D, Y, Z) is proportional
to the §3.1.8 density M.corpusProbGivenStorage D Y Z, with the same
proportionality constant for all triples. We express this as a ratio
identity (avoiding the partition function): for any two triples the cross-
products of their marginals and densities agree. This is necessary and
sufficient for the marginal to be a normalised version of the density.
Why ratios rather than equality with a normaliser. The natural
formulation marginal D Y Z = ENNReal.ofReal (density D Y Z) / Z(M)
requires defining Z(M) = ∑'_(D,Y,Z) density D Y Z — a sum over function
spaces (TableAssignment, HaltCounts are function types). Convergence
of this sum is a real-analysis problem (it sums beta/gamma integrals);
the partition function is essentially the marginal likelihood of the
fragment-grammar model, which is itself an open computational problem
(@cite{odonnell-2015} §3.2 introduces an MH sampler precisely because
this constant is intractable). The ratio formulation sidesteps Z(M)
entirely and captures the proportionality content directly.
Why this still requires depth-→-∞. At any finite depth n, the
sampler's marginals are truncated — supported only on samples that
halt within n recursion steps. The §3.1.8 density is the limiting
distribution; at finite depth the marginals only approximately match.
The proof requires:
- Showing
(λ n, marginal at depth n) → (λ ε, density-up-to-ε)(the depth-truncated marginals converge to the true §3.1.8 marginals asn → ∞). Almost-sure halting fromrecurseProb x < 1for allxgives geometric-tail bounds; pass to the limit via dominated convergence onPMF. - PYP exchangeability to handle the operational sampler's order-of-
customer-arrival vs the §3.1.8 joint distribution's order-agnostic
claim (see
pypDraw's exchangeability caveat). - Identifying the limit's marginal at
(D, Y, Z)with the §3.1.8 product formula — induction matching each PYP draw to its AG-factor contribution and each biased-coin flip to its beta-binomial-ratio contribution toM.fgFactor.
Step 1 needs probabilistic-fixed-point machinery for monotone PMF-valued recursions (Knaster–Tarski / Kleene fixed point on ω-CPPOs of sub- probability measures) absent from mathlib. The right formal home for this is mathlib's measure-theory or analysis libraries, not linglib. Steps 2 and 3 are mechanical once Step 1 is in place.