Documentation

Linglib.Core.Probability.DirichletMultinomial

Dirichlet–multinomial distribution #

@cite{odonnell-2015}

The count-vector PMF associated with a PolyaUrn. Sequentially: the distribution of (x_1, …, x_K) after drawing N times from the urn, where x_i counts how often color i was drawn. By Dirichlet–Categorical conjugacy, equivalently obtained by sampling θ ~ Dirichlet(π) and then (x_1, …, x_K) ~ Multinomial(N; θ) and integrating out θ.

Closed-form mass at x : α → ℕ with ∑ i, x i = N:

P(x | π) = (N! / ∏ x_i!) · Γ(Σ π) / Γ(Σ π + N) · ∏ Γ(π_i + x_i) / Γ(π_i)

The first factor is the multinomial coefficient (number of draw sequences realizing the count vector x); the rest is PolyaUrn.seqProb x (the per-sequence likelihood, defined in the sibling file PolyaUrn.lean).

Two-tier mathlib pattern #

Mirrors Mathlib.Probability.Distributions.Poisson.Basic's poissonPMFReal + poissonPMF: a raw- closed form pmfReal and a proper PMF (α → ℕ) wrapper pmf constructed directly from a HasSum proof.

Status #

Normalization (pmfReal_hasSum_one) is fully discharged. The two key combinatorial lemmas are proved by induction on N via Fin.snoc decomposition:

  1. multinomCount_mul_prod_factorial: the multinomial counting identity #{seq | count seq = x} · ∏ x_i! = N!. Step N+1 partitions sequences by their last element via multinomCount_snoc_fiber.
  2. exists_seq_count_eq: every count vector with Σ = N is realized by some sequence, by IH-induction (decrement at some c with x c > 0, then snoc).

Candidate to upstream to mathlib alongside a proper Mathlib/Probability/Distributions/Dirichlet.lean / DirichletMultinomial.lean.

Why split from PolyaUrn.lean #

Downstream consumers (Theories/Morphology/FragmentGrammars/DMPCFG, AdaptorGrammar, FragmentGrammar) consume only PolyaUrn.seqProb — a corpus IS a labeled derivation sequence, not a draw from the unlabeled-count distribution. Bundling the PMF machinery into PolyaUrn.lean would force every consumer of seqProb to pay the Probability.ProbabilityMassFunction.Basic import cost (~10s of olean loading via MeasureTheory.Measure.Dirac). Mathlib analog: Distributions/Poisson/Basic.lean is distinct from Distributions/Poisson/Variance.lean for the same import-discipline reason.

Why we avoid PMF.ofFinset + Finset.piAntidiag + Nat.multinomial #

Each lives in a heavier mathlib file (ProbabilityMassFunction.Constructions, Algebra.Order.Antidiag.Pi, Data.Nat.Choose.Multinomial) and we don't need them: the PMF can be built directly from a HasSum (mathlib's PMF.ofFinset is itself just ⟨_, hasSum_sum_of_ne_finset_zero ...⟩), the support characterization fits as an inline if ∑ i, x i = N then _ else 0, and the multinomial coefficient is just (∑ i, x i)! / ∏ i, (x i)! written out. Mathlib's Poisson takes the same approach (⟨ENNReal.ofReal ∘ poissonPMFReal r, _⟩).

The labeled→unlabeled .map to PMF (Nat.Partition N) is deferred to a future file (would import Combinatorics.Enumerative.Partition and ProbabilityMassFunction.Monad for .map).

Main definitions #

noncomputable def ProbabilityTheory.DirichletMultinomial.pmfReal {α : Type u_1} [Fintype α] (u : PolyaUrn α) (x : α) :

Dirichlet–multinomial closed-form mass at a count vector, as a raw . The multinomial coefficient (∑ x_i)! / ∏ (x_i!) counts how many draw sequences realize the count vector x; multiplying by PolyaUrn.seqProb x gives the count-vector mass. For x summing to N, this equals (N! / ∏ x_i!) · Γ(Σ π) / Γ(Σ π + N) · ∏ Γ(π_i + x_i) / Γ(π_i) — the standard Dirichlet–multinomial mass.

Equations
Instances For
    theorem ProbabilityTheory.DirichletMultinomial.pmfReal_nonneg {α : Type u_1} [Fintype α] (u : PolyaUrn α) [Nonempty α] (x : α) :
    0 pmfReal u x
    theorem ProbabilityTheory.DirichletMultinomial.pmfReal_hasSum_one {α : Type u_1} [Fintype α] (u : PolyaUrn α) [DecidableEq α] [Nonempty α] (N : ) :
    HasSum (fun (x : α) => if i : α, x i = N then ENNReal.ofReal (pmfReal u x) else 0) 1

    Dirichlet–multinomial normalization. The closed-form mass sums to 1 over count vectors of any fixed total N (zero elsewhere).

    Proof structure (sequence-based, not Dirichlet-integration-based):

    1. PolyaUrn.sum_seqProb_eq_one: per-sequence likelihood sums to 1 over Fin N → α (already proved).
    2. Finset.sum_fiberwise_of_maps_to: regroup the sequence sum by count vector; the inner sum over each fiber gives (card of fiber) · seqProb x.
    3. multinomCount_mul_prod_factorial (sorried): card of fiber = N!/∏x_i!, matching pmfReal x = (N!/∏x_i!) · seqProb x.
    4. exists_seq_count_eq (sorried): every x with Σ = N lies in the count image, so the function vanishes outside this finite set.
    5. hasSum_sum_of_ne_finset_zero + ENNReal.ofReal_sum_of_nonneg: lift the real-valued finite-sum result to the desired HasSum over ENNReal.

    Both sorries are routine combinatorial constructions on Fin N → α (induction on N via Fin.snocEquiv, list-based enumeration of α).

    noncomputable def ProbabilityTheory.DirichletMultinomial.pmf {α : Type u_1} [Fintype α] (u : PolyaUrn α) [DecidableEq α] [Nonempty α] (N : ) :
    PMF (α)

    The Dirichlet–multinomial distribution as a proper PMF (α → ℕ), supported on count vectors summing to N.

    Constructed directly from HasSum (mirroring mathlib's Poisson ⟨ENNReal.ofReal ∘ poissonPMFReal r, _⟩) rather than via heavier PMF.ofFinset / Finset.piAntidiag machinery — see file docstring.

    Equations
    Instances For
      @[simp]
      theorem ProbabilityTheory.DirichletMultinomial.pmf_apply {α : Type u_1} [Fintype α] (u : PolyaUrn α) [DecidableEq α] [Nonempty α] (N : ) (x : α) (hx : i : α, x i = N) :
      (pmf u N) x = ENNReal.ofReal (pmfReal u x)

      Closed-form value at a count vector summing to N.

      theorem ProbabilityTheory.DirichletMultinomial.pmf_apply_of_sum_ne {α : Type u_1} [Fintype α] (u : PolyaUrn α) [DecidableEq α] [Nonempty α] (N : ) (x : α) (hx : i : α, x i N) :
      (pmf u N) x = 0

      Outside its support (count vectors that don't sum to N), the Dirichlet–multinomial PMF is zero.