Hypergeometric distribution

The hypergeometric distribution is a discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population without replacement.

A typical example is the following: There is a shipment of N objects in which D are defective. The hypergeometric distribution describes the probability that in a sample of n distinct objects drawn from the shipment exactly k objects are defective.

In general, if a random variable X follows the hypergeometric distribution with parameters N, D and n, then the probability of getting exactly k successes is given by

<math> P(X = k) = {{{D \choose k} {{N-D} \choose {n-k}}}\over {N \choose n}}</math>

The probability is positive when k is an integer between max(0, D + n - N) and min(n, D), inclusive; for every other value of k it is zero.
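
The formula can be evaluated directly. The following is a minimal Python sketch, not a reference implementation; the function name hypergeom_pmf and the shipment numbers (N = 50, D = 5, n = 10) are assumptions chosen only for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, D, n):
    """P(X = k): exactly k defective objects in a sample of n objects
    drawn without replacement from N objects, D of which are defective."""
    # Outside the range max(0, D + n - N) .. min(n, D) the probability is zero.
    if k < max(0, D + n - N) or k > min(n, D):
        return 0.0
    return comb(D, k) * comb(N - D, n - k) / comb(N, n)

# Hypothetical shipment: N = 50 objects, D = 5 defective, sample of n = 10.
N, D, n = 50, 5, 10
for k in range(min(n, D) + 1):
    print(k, round(hypergeom_pmf(k, N, D, n), 4))
```

Summing the probabilities over all k in the range gives 1, which is a quick check that the function is consistent with the formula.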

The formula can be understood as follows: There are <math> N \choose n </math> possible samples (without replacement). There are <math> D \choose k </math> ways to obtain k defective objects and there are <math> {N-D} \choose {n-k} </math> ways to fill out the rest of the sample with non-defective objects.
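
This counting argument can be checked by brute force on a small case. The sketch below assumes a hypothetical shipment of N = 8 objects labelled 0 to 7, of which the first D = 3 are defective, and enumerates every sample of n = 4 objects.

```python
from itertools import combinations
from math import comb

# Hypothetical small shipment: objects 0..N-1, the first D are defective.
N, D, n = 8, 3, 4
defective = set(range(D))

# Tally all samples of size n by their number of defective objects.
counts = {}
for sample in combinations(range(N), n):
    k = sum(1 for obj in sample if obj in defective)
    counts[k] = counts.get(k, 0) + 1

total = sum(counts.values())  # number of possible samples
for k in sorted(counts):
    # Enumerated count next to the product of the two binomial coefficients.
    print(k, counts[k], comb(D, k) * comb(N - D, n - k))
```

For each k the enumerated count agrees with the product of the two binomial coefficients, and the total number of samples agrees with comb(N, n).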

When the population size N is large compared with the sample size n, the hypergeometric distribution can be approximated reasonably well by a binomial distribution with parameters n (number of trials) and p = D / N (probability of success in a single trial).
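
The quality of the approximation can be examined numerically. The sketch below compares the hypergeometric probabilities with those of a binomial distribution with n trials and success probability p = D / N. The helper names, the fixed sample size n = 10 and the 10% defect rate are assumptions made only for this illustration.

```python
from math import comb

def hypergeom_pmf(k, N, D, n):
    # Zero outside the range max(0, D + n - N) .. min(n, D).
    if k < max(0, D + n - N) or k > min(n, D):
        return 0.0
    return comb(D, k) * comb(N - D, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # Probability of k successes in n independent trials.
    return comb(n, k) * p**k * (1 - p)**(n - k)

def max_abs_error(N, D, n):
    """Largest absolute difference between the two distributions over k = 0..n."""
    p = D / N
    return max(abs(hypergeom_pmf(k, N, D, n) - binom_pmf(k, n, p))
               for k in range(n + 1))

# Hypothetical setting: fixed sample size, 10% defective, growing population.
n = 10
for N in (50, 500, 5000):
    D = N // 10
    print(N, max_abs_error(N, D, n))
```

In this setting the largest difference shrinks as N grows while n stays fixed.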

wikipedia.org dumped 2003-03-17 with terodump