The hypergeometric PMF #
The hypergeometric distribution describes sampling without replacement from
a finite population. Drawing n items from a population of N containing
K successes (and N - K failures), the probability of observing exactly
k successes is
P(k | N, K, n) = C(K, k) · C(N − K, n − k) / C(N, n)
This file defines PMF.hypergeometric N K n as a PMF (Fin (n + 1)) and
proves the sum-to-1 condition via Chu-Vandermonde's identity
(Nat.add_choose_eq).
Main definitions #
PMF.hypergeometric N K n h_n h_K: the PMF for given parameters withn ≤ NandK ≤ N.PMF.hypergeometric_apply: the closed-form value at eachk.
Use cases in linglib #
- @cite{goodman-stuhlmuller-2013}: speaker observation of marbles (N = 3).
- @cite{herbstritt-franke-2019}: urn-based probability expressions (N = 10).
Both papers previously open-coded the C(K, k) · C(N - K, n - k)
numerator and the C(N, n) denominator separately in their obsKernelRaw
and PMF.normalize machinery. With this primitive, the observation kernel
is a one-line definition.
Vandermonde's identity, sum form: when K ≤ N,
∑ k ∈ range (n + 1), C(K, k) * C(N - K, n - k) = C(N, n).
A Finset.range-style restatement of Nat.add_choose_eq, with the
substitution m := K, n := N - K (using K + (N - K) = N from K ≤ N),
and the antidiagonal-to-range conversion.
The hypergeometric PMF: probability of observing exactly k successes
when drawing n items without replacement from a population of N
containing K successes.
Requires n ≤ N and K ≤ N (otherwise C(N, n) = 0 would make the
denominator vanish). The numerator C(K, k) · C(N - K, n - k) automatically
returns 0 for impossible combinations (k > K or n - k > N - K) via the
behavior of Nat.choose.
Equations
- PMF.hypergeometric N K n h_n h_K = PMF.ofFintype (fun (k : Fin (n + 1)) => ↑(K.choose ↑k * (N - K).choose (n - ↑k)) / ↑(N.choose n)) ⋯
Instances For
Closed-form value of the hypergeometric PMF.
The hypergeometric PMF is non-zero iff the count is feasible:
both k ≤ K and n - k ≤ N - K.
ENNReal.ofReal-friendly closed form: lifts the value to a single
ENNReal.ofReal (rational), suitable for norm_num discharge.
The kernel value at k = 0: C(N - K, n) / C(N, n).
The kernel value at k = n (when n ≤ K): C(K, n) / C(N, n).
Symmetry under K ↔ N − K and k ↔ n − k: complementary parameterization.