Documentation

Linglib.Core.Probability.PitmanYor

Pitman–Yor process #

@cite{pitman-2006} @cite{odonnell-2015}

The Pitman–Yor process (PYP) is a two-parameter Bayesian non-parametric distribution on partitions of [n], generalising the Chinese Restaurant Process (the one-parameter Dirichlet process). The mathematical reference is @cite{pitman-2006} §3.2 (Saint-Flour lectures); the linguistic application that motivates this file is @cite{odonnell-2015} §3.1.6 (memoization distribution for adaptor and fragment grammars in Theories/Morphology/FragmentGrammars/).

Naming convention #

@cite{pitman-2006} writes parameters as (α, θ) with α = discount and θ = concentration; @cite{odonnell-2015} writes (a, b) for the same two. We use (discount, concentration) to match neither convention's single letters but to be self-documenting.

Main definitions #

What partitionProb actually computes #

partitionProb q evaluates Pitman's EPPF formula (@cite{pitman-2006} eq 3.6) at the multiset of block sizes q.parts. The EPPF is, per @cite{pitman-2006} p. 39, the probability that the random partition Π_n equals any specific (set) partition of [n] whose blocks have sizes (n_1, …, n_k). By the EPPF's symmetry, the value depends only on the multiset of sizes — which is what makes the Nat.Partition n → ℝ signature well-typed.

partitionProb q is therefore the prob of one specific set partition with multiset of block sizes q.parts, NOT the prob of the multiset q.parts itself. The two differ by the multiplicity factor

mult(q) = n! / (∏_{m ∈ q.parts} m! · ∏_{j} (q.parts.count j)!)

i.e. the number of set partitions of [n] whose block sizes are q.parts (@cite{pitman-2006} eq 2.2 / Nat.Partition.numSetPartitions).

Sum-to-1 identities #

Pitman 2006 gives several equivalent normalisations of the EPPF:

(a) ∑_{Π : set partition of [n]} EPPF(block sizes of Π) = 1
                                                          @cite{pitman-2006} Thm 3.2
(b) ∑_{q : Nat.Partition n} mult(q) · partitionProb q = 1
                                                          @cite{pitman-2006} eq 2.2
(c) ∑_{compositions (n_1,…,n_k) of n} (n choose n_1,…,n_k)·1/k! · EPPF(n_1,…,n_k) = 1
                                                          @cite{pitman-2006} p. 42

We formalise (a) as sum_partitionProb_set_eq_one, summing over Finpartition (Finset.univ : Finset (Fin n)). This is the form the downstream AdaptorGrammar consumer needs (since AG's Y is a labeled table assignment, equivalent to a set partition under the canonical "tables labeled by order of creation" convention).

The bare sum ∑_{q : Nat.Partition n} partitionProb q does NOT equal 1 in general — every multiset appears once in the sum, but the EPPF interpretation requires counting it mult(q) times. For example, at α = 0, θ = 1, n = 3 the bare sum is 2/3.

Limitations #

def Nat.Partition.numSetPartitions {n : } (q : n.Partition) :

The number of distinct set partitions of Fin n whose multiset of block cardinalities equals q.parts. By the standard combinatorial formula (@cite{pitman-2006} eq 2.3 in implicit form):

mult(q) = n! / (∏_{m ∈ q.parts} m! · ∏_{j ∈ q.parts.toFinset} (q.parts.count j)!)

For example, mult({1, 2}) = 3 (counts {{1},{2,3}}, {{2},{1,3}}, {{3},{1,2}}); mult({1, 1, 1}) = 1; mult({3}) = 1.

The natural-number division is exact (the denominator divides n! because both quantities count the same set of objects from different angles).

Equations
  • q.numSetPartitions = n.factorial / (Multiset.map Nat.factorial q.parts).prod / jq.parts.toFinset, (Multiset.count j q.parts).factorial
Instances For
    def Nat.Partition.consOne {n : } (q : n.Partition) :
    (n + 1).Partition

    Extend a partition of n by a new singleton block of size 1, yielding a partition of n + 1. The combinatorial counterpart of "the next customer sits at a new (singleton) table" in the PYP seating plan (@cite{pitman-2006} §3.2).

    Equations
    • q.consOne = { parts := 1 ::ₘ q.parts, parts_pos := , parts_sum := }
    Instances For
      @[simp]
      theorem Nat.Partition.consOne_parts {n : } (q : n.Partition) :
      q.consOne.parts = 1 ::ₘ q.parts
      @[simp]
      theorem Nat.Partition.consOne_card {n : } (q : n.Partition) :
      q.consOne.parts.card = q.parts.card + 1
      def Nat.Partition.replaceMem {n : } (q : n.Partition) (m : ) (hm : m q.parts) :
      (n + 1).Partition

      Replace one occurrence of m (a block of size m) with m + 1, yielding a partition of n + 1. The combinatorial counterpart of "the next customer joins an existing table of size m" in the PYP seating plan (@cite{pitman-2006} §3.2).

      Equations
      • q.replaceMem m hm = { parts := (m + 1) ::ₘ q.parts.erase m, parts_pos := , parts_sum := }
      Instances For
        @[simp]
        theorem Nat.Partition.replaceMem_parts {n : } (q : n.Partition) (m : ) (hm : m q.parts) :
        (q.replaceMem m hm).parts = (m + 1) ::ₘ q.parts.erase m
        @[simp]
        theorem Nat.Partition.replaceMem_card {n : } (q : n.Partition) (m : ) (hm : m q.parts) :
        (q.replaceMem m hm).parts.card = q.parts.card
        def Finpartition.toNatPartition {n : } (P : Finpartition Finset.univ) :
        n.Partition

        Convert a set partition of Fin n (i.e., a Finpartition of Finset.univ : Finset (Fin n)) to its block-size multiset (Nat.Partition n). This is the projection that loses the set-partition structure (which elements are in which block) and keeps only the cardinalities.

        Used by AdaptorGrammar.pypFactor to evaluate Pitman's EPPF (PitmanYor.partitionProb) on a labeled table assignment, since the EPPF formula depends only on block sizes.

        Equations
        • P.toNatPartition = { parts := Multiset.map Finset.card P.parts.val, parts_pos := , parts_sum := }
        Instances For
          def OrderedFinpartition.toNatPartition {n : } (c : OrderedFinpartition n) :
          n.Partition

          Convert an ordered set partition of Fin n (mathlib's OrderedFinpartition n, parts ordered by greatest element) to its block-size multiset (Nat.Partition n). This is the projection that drops the part-ordering data and keeps only the multiset of block cardinalities.

          Used by PitmanYor.partitionProb evaluation: the EPPF formula depends only on block sizes, so the OrderedFinpartition's specific ordering is irrelevant — but the structure gives us mathlib's extendEquiv which matches Pitman's seating-plan recursion exactly.

          Equations
          • c.toNatPartition = { parts := Multiset.map c.partSize Finset.univ.val, parts_pos := , parts_sum := }
          Instances For
            @[simp]
            theorem OrderedFinpartition.toNatPartition_parts {n : } (c : OrderedFinpartition n) :
            c.toNatPartition.parts = Multiset.map c.partSize Finset.univ.val
            theorem OrderedFinpartition.partSize_mem_toNatPartition {n : } (c : OrderedFinpartition n) (k : Fin c.length) :
            c.partSize k c.toNatPartition.parts
            theorem OrderedFinpartition.extendLeft_toNatPartition {n : } (c : OrderedFinpartition n) :

            Mathlib's OrderedFinpartition.extendLeft (= "add new singleton block") corresponds to Nat.Partition.consOne at the multiset level.

            theorem OrderedFinpartition.extendMiddle_toNatPartition {n : } (c : OrderedFinpartition n) (k : Fin c.length) :
            (c.extendMiddle k).toNatPartition = c.toNatPartition.replaceMem (c.partSize k)

            Mathlib's OrderedFinpartition.extendMiddle k (= "extend block k by 1 element") corresponds to Nat.Partition.replaceMem (partSize k) at the multiset level.

            def Composition.toOrderedFinpartition {n : } (c : Composition n) :
            OrderedFinpartition n

            Convert a Composition n (mathlib's "list of positive integers summing to n" view) into the corresponding OrderedFinpartition n whose i-th block embeds to the consecutive interval [c.sizeUpTo i, c.sizeUpTo (i + 1)) of Fin n.

            Bridges mathlib's two combinatorial views: Composition n carries clean list-of-sizes data (used wherever a per-input size profile arises); the OrderedFinpartition view exposes per-element membership embeddings (needed by OrderedFinpartition.toNatPartition to project to Nat.Partition n, which is what Pitman's EPPF formula consumes — see partitionProb).

            The realization uses mathlib's Composition.embedding : Fin (c.blocksFun i) ↪o Fin n directly for the emb field. Disjointness and coverage follow from mem_range_embedding_iff plus the interval characterization range (embedding i) = [sizeUpTo i, sizeUpTo (i+1)).

            Equations
            • One or more equations did not get rendered due to their size.
            Instances For
              def ProbabilityTheory.stepPochhammer {R : Type u_1} [CommSemiring R] (x s : R) (m : ) :
              R

              The step Pochhammer product [x]_{m, s} := ∏_{k=0}^{m-1} (x + k · s) of @cite{pitman-2006} eq 3.7 (written there as (x)_{m↑s}) and @cite{odonnell-2015} eq 3.13.

              Generalises the standard rising factorial in the step s:

              Used by PitmanYor.partitionProb with two values of the step: s = 1 (block-size factors) and s = discount (table-creation factor). The latter — step-discount — is the defining generalisation of the Pitman–Yor process over the one-parameter Dirichlet process / Chinese Restaurant Process.

              Equations
              Instances For
                @[simp]
                theorem ProbabilityTheory.stepPochhammer_zero {R : Type u_1} [CommSemiring R] (x s : R) :
                stepPochhammer x s 0 = 1
                theorem ProbabilityTheory.stepPochhammer_succ {R : Type u_1} [CommSemiring R] (x s : R) (m : ) :
                stepPochhammer x s (m + 1) = stepPochhammer x s m * (x + m * s)
                theorem ProbabilityTheory.stepPochhammer_one_eq_ascPochhammer_eval {R : Type u_1} [CommSemiring R] (x : R) (m : ) :
                stepPochhammer x 1 m = Polynomial.eval x (ascPochhammer R m)

                The step-1 case is mathlib's rising factorial. Bridges the PYP file's notation to Mathlib.RingTheory.Polynomial.Pochhammer, so users grepping for Pochhammer find both names.

                theorem ProbabilityTheory.stepPochhammer_zero_step {R : Type u_1} [CommSemiring R] (x : R) (m : ) :
                stepPochhammer x 0 m = x ^ m

                The step-0 case collapses to a power: [x]_{m, 0} = x^m.

                theorem ProbabilityTheory.stepPochhammer_nonneg {R : Type u_1} [CommSemiring R] [PartialOrder R] [IsOrderedRing R] (x s : R) (m : ) (h : ∀ (k : Fin m), 0 x + k * s) :

                Step product is nonnegative when every factor is nonnegative. General-purpose; the PYP-specific factor positivity lives in PitmanYor.partitionProb_nonneg.

                A Pitman–Yor process @cite{pitman-2006} §3.2 / @cite{odonnell-2015} §3.1.6 with discount α ∈ [0, 1] and concentration θ > -α (@cite{pitman-2006} eq 3.5, second case; the first case α = -κ < 0, θ = mκ for integer m is excluded — irrelevant for the linguistic application here).

                Sequential interpretation (@cite{pitman-2006} §3.2 (α, θ) seating plan): the first customer sits at table 1; the (N + 1)-th customer sits at occupied table i (current size n_i) with probability (n_i - α) / (N + θ), or at a new table with probability (K · α + θ) / (N + θ), where K is the current number of occupied tables. Higher discount α favours new tables.

                The constraint θ > -α is strict (matching @cite{pitman-2006} eq 3.5). The boundary θ = -α is degenerate: the closed-form formula has 0/0 ratios that Lean's x/0 = 0 convention resolves to 0, giving partitionProb = 0 for partitions with K ≥ 2 blocks but mathematically should give the all-elements-together-or-singleton limit. Since this boundary is a separate (degenerate) distribution not a regular PYP, we exclude it here.

                Boundary cases (well-defined under the strict constraint):

                • discount = 0: reduces to the one-parameter Chinese Restaurant Process (Dirichlet process with parameter θ > 0).
                • discount = 1: every customer sits at a new table; partitionProb is 0 on any partition with a non-singleton block (formula has 1 - α = 0 factor) and equals 1 on the all-singletons partition.
                Instances For
                  theorem ProbabilityTheory.PitmanYor.ext {x y : PitmanYor} (discount : x.discount = y.discount) (concentration : x.concentration = y.concentration) :
                  x = y
                  noncomputable def ProbabilityTheory.PitmanYor.partitionProb (p : PitmanYor) {n : } (q : n.Partition) :

                  The Pitman–Yor exchangeable partition probability function (EPPF) of @cite{pitman-2006} Theorem 3.2 (eq 3.6) / @cite{odonnell-2015} eq 3.14, evaluated at the multiset of block sizes q.parts:

                  EPPF(n_1, ..., n_K | α, θ) = [θ + α]_{K-1, α} · ∏_{i=1}^K [1 - α]_{n_i - 1, 1}
                                               / [θ + 1]_{N - 1, 1}
                  

                  where K = q.parts.card is the number of blocks and N = n is the total number of elements.

                  Semantics: by @cite{pitman-2006} p. 39, this is the probability that the random partition Π_n equals any specific (set) partition of [n] with the given multiset of block sizes — NOT the probability of the multiset itself. The two differ by the multiplicity factor n! / (∏ m_i! · ∏ count_j!). See the module docstring for the three equivalent normalisation identities (Pitman 3.2 / 2.2 / p. 42).

                  The EPPF is symmetric in (n_1, …, n_k), so the value is well-defined as a function of the multiset q.parts. This is what makes the Nat.Partition n → ℝ signature well-typed; it does not make partitionProb a probability distribution on Nat.Partition n (it isn't — see module docstring).

                  The n - 1 and q.parts.card - 1 terms use ℕ truncating subtraction. This is intentional and correct on the boundary: at n = 0 (and hence the unique empty partition with K = 0), all three step factors collapse to the empty product 1, giving EPPF = 1 — the vacuous probability of the unique set partition of zero elements.

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For
                    theorem ProbabilityTheory.PitmanYor.partitionProb_nonneg (p : PitmanYor) {n : } (q : n.Partition) :

                    Pitman–Yor partition probabilities are nonnegative. Each step factor in @cite{pitman-2006} eq 3.6 is nonnegative under the PYP constraints (0 ≤ α ≤ 1, θ ≥ -α), and the overall quotient and product preserve nonnegativity. Used downstream by AdaptorGrammar.corpusProbGivenTables_nonneg.

                    theorem ProbabilityTheory.PitmanYor.partitionProb_eq_of_parts_eq (p : PitmanYor) {n : } (q q' : n.Partition) (h : q.parts = q'.parts) :

                    Exchangeability is by construction: partitionProb reads only the multiset q.parts, never an underlying ordering. Stated for grep-discoverability — the proof is rfl.

                    theorem ProbabilityTheory.PitmanYor.partitionProb_consOne_mul (p : PitmanYor) {n : } (q : n.Partition) :
                    (n + p.concentration) * p.partitionProb q.consOne = (q.parts.card * p.discount + p.concentration) * p.partitionProb q

                    Cross-multiplied form of the seating-plan transition for a new singleton block (@cite{pitman-2006} §3.2 (α, θ) seating plan, "new table" branch). The seating-plan probability of the (n+1)-th customer creating a new table is (K · α + θ) / (n + θ), where K is the current number of blocks; this is the multiplicative counterpart, valid in all cases including n = θ = 0.

                    Equivalent (modulo dividing both sides by (n + θ) when nonzero) to: partitionProb q.consOne = partitionProb q · (K · α + θ) / (n + θ).

                    theorem ProbabilityTheory.PitmanYor.partitionProb_replaceMem_mul (p : PitmanYor) {n : } (q : n.Partition) (m : ) (hm : m q.parts) :
                    (n + p.concentration) * p.partitionProb (q.replaceMem m hm) = (m - p.discount) * p.partitionProb q

                    Cross-multiplied form of the seating-plan transition for joining an existing block of size m (@cite{pitman-2006} §3.2 (α, θ) seating plan, "join occupied table" branch). The seating-plan probability of the (n+1)-th customer joining an existing table of size m is (m - α) / (n + θ); this is the multiplicative counterpart.

                    Equivalent (modulo dividing by (n + θ)) to: partitionProb (q.replaceMem m hm) = partitionProb q · (m - α) / (n + θ).

                    theorem OrderedFinpartition.sum_partSize_eq {n : } (c : OrderedFinpartition n) :
                    k : Fin c.length, c.partSize k = n

                    Helper: sum of OrderedFinpartition.partSize over all part-indices equals n. Follows from the cardinality of the bijection Σ k, Fin (partSize k) ≃ Fin n.

                    theorem ProbabilityTheory.PitmanYor.partitionProb_extend_sum (p : PitmanYor) (n : ) (c : OrderedFinpartition n) :
                    p.partitionProb c.extendLeft.toNatPartition + k : Fin c.length, p.partitionProb (c.extendMiddle k).toNatPartition = p.partitionProb c.toNatPartition

                    The Pitman-Yor seating-plan addition rule (@cite{pitman-2006} eq 2.9 for the (α, θ) EPPF). Summing the partition probabilities over all "ways to seat the (n+1)-th customer" recovers the partition probability of the predecessor. Proved via Lemmas A (partitionProb_consOne_mul) and B (partitionProb_replaceMem_mul) plus the arithmetic identity (K · α + θ) + ∑_k (partSize k - α) = n + θ.

                    theorem ProbabilityTheory.PitmanYor.sum_partitionProb_ord_eq_one (p : PitmanYor) (n : ) :
                    c : OrderedFinpartition n, p.partitionProb c.toNatPartition = 1

                    Pitman 2006 Theorem 3.2. The EPPF is a probability mass function on the set of set partitions of [n], expressed via mathlib's OrderedFinpartition n (canonical labeling by greatest element):

                    ∑_{c : OrderedFinpartition n} partitionProb c.toNatPartition = 1
                    

                    The proof is by induction on n using mathlib's OrderedFinpartition.extendEquiv — the bijection (c : OrderedFinpartition n) × Option (Fin c.length) ≃ OrderedFinpartition (n + 1) — to decompose the sum. The inductive step is partitionProb_extend_sum, which combines Lemma A (partitionProb_consOne_mul) and Lemma B (partitionProb_replaceMem_mul) with the arithmetic identity (K · α + θ) + ∑_k (partSize k - α) = n + θ.

                    OrderedFinpartition carries a specific canonical ordering of parts (by greatest element), but partitionProb depends only on the block-size multiset, so this is the right sample space: each set partition of [n] corresponds to exactly one OrderedFinpartition n.