Documentation

Linglib.Core.Probability.Hypergeometric

The hypergeometric PMF #

The hypergeometric distribution describes sampling without replacement from a finite population. Drawing n items from a population of N containing K successes (and N - K failures), the probability of observing exactly k successes is

P(k | N, K, n) = C(K, k) · C(N − K, n − k) / C(N, n)

This file defines PMF.hypergeometric N K n as a PMF (Fin (n + 1)) and proves the sum-to-1 condition via Chu-Vandermonde's identity (Nat.add_choose_eq).

Main definitions #

Use cases in linglib #

Both papers previously open-coded the C(K, k) · C(N - K, n - k) numerator and the C(N, n) denominator separately in their obsKernelRaw and PMF.normalize machinery. With this primitive, the observation kernel is a one-line definition.

theorem Nat.sum_choose_mul_choose_eq (N K n : ) (h_K : K N) :
kFinset.range (n + 1), K.choose k * (N - K).choose (n - k) = N.choose n

Vandermonde's identity, sum form: when K ≤ N, ∑ k ∈ range (n + 1), C(K, k) * C(N - K, n - k) = C(N, n).

A Finset.range-style restatement of Nat.add_choose_eq, with the substitution m := K, n := N - K (using K + (N - K) = N from K ≤ N), and the antidiagonal-to-range conversion.

noncomputable def PMF.hypergeometric (N K n : ) (h_n : n N) (h_K : K N) :
PMF (Fin (n + 1))

The hypergeometric PMF: probability of observing exactly k successes when drawing n items without replacement from a population of N containing K successes.

Requires n ≤ N and K ≤ N (otherwise C(N, n) = 0 would make the denominator vanish). The numerator C(K, k) · C(N - K, n - k) automatically returns 0 for impossible combinations (k > K or n - k > N - K) via the behavior of Nat.choose.

Equations
  • PMF.hypergeometric N K n h_n h_K = PMF.ofFintype (fun (k : Fin (n + 1)) => (K.choose k * (N - K).choose (n - k)) / (N.choose n))
Instances For
    theorem PMF.hypergeometric_apply (N K n : ) (h_n : n N) (h_K : K N) (k : Fin (n + 1)) :
    (hypergeometric N K n h_n h_K) k = (K.choose k * (N - K).choose (n - k)) / (N.choose n)

    Closed-form value of the hypergeometric PMF.

    theorem PMF.hypergeometric_apply_ne_zero_iff (N K n : ) (h_n : n N) (h_K : K N) (k : Fin (n + 1)) :
    (hypergeometric N K n h_n h_K) k 0 k K n - k N - K

    The hypergeometric PMF is non-zero iff the count is feasible: both k ≤ K and n - k ≤ N - K.

    theorem PMF.hypergeometric_apply_eq_ofReal (N K n : ) (h_n : n N) (h_K : K N) (k : Fin (n + 1)) :
    (hypergeometric N K n h_n h_K) k = ENNReal.ofReal ((K.choose k) * ((N - K).choose (n - k)) / (N.choose n))

    ENNReal.ofReal-friendly closed form: lifts the value to a single ENNReal.ofReal (rational), suitable for norm_num discharge.

    theorem PMF.hypergeometric_apply_zero (N K n : ) (h_n : n N) (h_K : K N) :
    (hypergeometric N K n h_n h_K) 0, = ((N - K).choose n) / (N.choose n)

    The kernel value at k = 0: C(N - K, n) / C(N, n).

    theorem PMF.hypergeometric_apply_last (N K n : ) (h_n : n N) (h_K : K N) :
    (hypergeometric N K n h_n h_K) n, = (K.choose n) / (N.choose n)

    The kernel value at k = n (when n ≤ K): C(K, n) / C(N, n).

    theorem PMF.hypergeometric_symm (N K n : ) (h_n : n N) (h_K : K N) (k : Fin (n + 1)) :
    (hypergeometric N (N - K) n h_n ) k = (hypergeometric N K n h_n h_K) n - k,

    Symmetry under K ↔ N − K and k ↔ n − k: complementary parameterization.