Binomial distribution
The binomial distribution is a key concept in probability that models situations where you repeat the same experiment several times, and each time there are only two possible outcomes—success or failure. It helps us answer questions like “What’s the chance of getting 7 correct answers out of 10 guesses?” or “How likely is it to find 2 defective items in a box of 5?”
This distribution is built on a type of process called a Bernoulli sequence—a series of independent trials with constant probability of success. From this structure, we can calculate the likelihood of different outcomes, as well as work out expected values, variance, and other important properties.
Use this page to revise the following concepts of the binomial distribution:
- Bernoulli sequence
- The binomial distribution
- Conditional probability of the binomial distribution
- Expectation of the binomial distribution
- Variance and standard deviation of the binomial distribution
- Graph of the binomial distribution
Bernoulli sequence
A Bernoulli sequence is a set of independent and identical trials with only two possible outcomes, commonly called success and failure. The probability of success is constant across trials and is denoted by \(p\); the probability of failure is then \(1-p\).
Flipping a coin, testing a batch of products for defective items, or asking survey respondents a yes/no question are all Bernoulli sequences—each trial follows the same rules, has just two outcomes, and the probability of success remains constant throughout. These sequences form the structure from which binomial probabilities are calculated.
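As a minimal sketch, a Bernoulli sequence can be simulated in Python with the standard `random` module. The function name `bernoulli_sequence` and its `seed` parameter are illustrative choices, not part of any library.

```python
import random

def bernoulli_sequence(n, p, seed=None):
    """Simulate n independent trials, each a success (True) with probability p."""
    rng = random.Random(seed)  # separate generator, reproducible when a seed is given
    # rng.random() is uniform on [0, 1), so the comparison is True with probability p
    return [rng.random() < p for _ in range(n)]

# Example: 10 flips of a fair coin, counting heads as successes
flips = bernoulli_sequence(10, 0.5, seed=1)
print(flips, "successes:", sum(flips))
```

Every trial uses the same `p` and does not depend on earlier trials, which are exactly the two conditions in the definition above.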
The binomial distribution
The binomial distribution is a discrete probability distribution that models the probability of getting a certain number of successes in a sequence of Bernoulli trials.
The random variable, \(X\), represents the number of successes and the number of trials is given by \(n\). \(X\) follows a binomial distribution if the conditions of a Bernoulli sequence are met. We then call \(X\) the binomial random variable and say \(X\) has a binomial distribution with parameters \(n\) and \(p\). We denote this \(X \sim \operatorname{Bi}(n,p)\) or \(X \stackrel{d}{=} \operatorname{Bi}(n, p)\).
The full set of probabilities for all possible values of \(X\) (from \(0\) to \(n\)) makes up the binomial probability distribution.
Therefore, the binomial probability mass function is given by the formula
\[ \Pr\left(X=x\right) =\binom{n}{x}p^x\left(1-p\right)^{n-x}, \quad x\in\{0,1,\ldots,n\} \]
Where
- \( \Pr(X = x) \) is the probability of getting exactly \( x \) successes
- \( p \) is the probability of success on a single trial
- \( 1 - p \) is the probability of failure
- \( n \) is the total number of trials
- \( x \) is the number of successes
The term \(\binom{n}{x}\) used in this formula is called the combinatorial factor. We read this as \(n\) choose \(x\). The combinatorial factor represents the total number of ways to choose \(x\) elements from \(n\) different elements, given the order does not matter. We calculate it as
\[\binom{n}{x}=\frac{n!}{x!(n-x)!} \text{, for } 0 \leq x \leq n\]
Where
- \(n\) is the number of trials
- \(x\) is the number of successes
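The probability formula and the combinatorial factor can be sketched in a few lines of Python. This is an illustrative sketch only: `binomial_pmf` is a hypothetical helper name, and the success probabilities used for the two example questions (0.5 per guess, a 10% defect rate) are assumptions made here, since the questions do not specify them.

```python
from math import comb  # comb(n, x) computes n! / (x! (n - x)!)

def binomial_pmf(n, p, x):
    """Pr(X = x) for X ~ Bi(n, p); zero outside x = 0, 1, ..., n."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

# 7 correct answers out of 10 guesses, assuming each guess is right with probability 0.5
print(binomial_pmf(10, 0.5, 7))

# 2 defective items in a box of 5, assuming each item is defective with probability 0.1
print(binomial_pmf(5, 0.1, 2))

# The probabilities over x = 0, ..., n add up to 1 (up to floating-point rounding)
print(sum(binomial_pmf(10, 0.5, x) for x in range(11)))
```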
Important to binomial distributions is the symmetrical property of the combinatorial factor. Observe that for any \(0 \leq x \leq n\),
\[\binom{n}{x}=\binom{n}{n-x}\]
and, for \(1 \leq x \leq n\),
\[\binom{n+1}{x}=\binom{n}{x}+\binom{n}{x-1}\]
Consider a binomial distribution for the number of heads in \(15\) flips of a fair coin. Since each flip has two equally likely outcomes, the probability of obtaining exactly \(x\) heads is given by \(\Pr(X=x)=\binom{15}{x}\left(\frac{1}{2}\right)^{15}\).
For example, the probability of obtaining exactly \(2\) heads (and therefore \(13\) tails) is \(\Pr(X=2)=\binom{15}{2}(\frac{1}{2})^{15}\).
By symmetry, the probability of obtaining \(13\) heads (and \(2\) tails) is identical as \(\binom{15}{2}=\binom{15}{15-2}\), illustrating the symmetric nature of binomial outcomes around the midpoint.
From the symmetrical property, we can also infer the following rules
\[\binom{n}{0}=\binom{n}{n}=1 \text{ and } \binom{n}{1}=\binom{n}{n-1}=n\]
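These identities can be checked numerically. The sketch below uses Python's built-in `math.comb` with \(n=15\), matching the coin-flip example; the choice of \(n\) is otherwise arbitrary.

```python
from math import comb

n = 15
for x in range(n + 1):
    # Symmetry: choosing x successes is the same as choosing the n - x failures
    assert comb(n, x) == comb(n, n - x)
    # Addition rule, checked for x >= 1 so that x - 1 is non-negative
    if x >= 1:
        assert comb(n + 1, x) == comb(n, x) + comb(n, x - 1)

# Special cases at the ends of the range
assert comb(n, 0) == comb(n, n) == 1
assert comb(n, 1) == comb(n, n - 1) == n

print(comb(15, 2), comb(15, 13))  # both are 105
```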
Worked Example
A soccer player is to take a penalty shot at goal. There are only two possible outcomes, the player scores (success) or the player misses (failure). The probability of success for this player is \(p=0.7\).
How many different combinations of outcomes are there in which the player scores exactly 2 goals out of 10 attempts?
Hence, find the probability that if the soccer player has \(10\) penalty shots at goal, they score exactly \(2\) goals, and the probability that the player scores at least \(2\) goals.
\(n=10\) trials
\(x=2\) successes
If we denote a goal as \(G\) and a miss as \(M\), then each possible outcome sequence of 10 shots is a string of length 10 containing exactly two \(G\)s and eight \(M\)s. Examples of such outcomes include:
- GGMMMMMMMM
- GMGMMMMMMM
- MMMMMMMMGG
- MGMGMMMMMM
Rather than list all possible arrangements of two \(G\)s and eight \(M\)s, we can instead calculate the number of distinct arrangements using the combinatorial formula
\[\binom{n}{x}=\frac{n!}{x!(n-x)!}\]
In this case, \(n=10\) and \(x =2\), so \(\binom{10}{2}=\frac{10!}{2!\,8!}=45\).
The term \(\binom{10}{2}\) represents the number of distinct ways the \(2\) successful shots can occur among the \(10\) attempts. There are \(45\) such distinct arrangements, each with the same probability \(\left(0.7\right)^2\left(0.3\right)^8\), hence the need for the combinatorial factor.
Therefore, we can calculate the probability by
\[\begin{align} \Pr\left(X=x\right)&=\binom{n}{x}p^x\left(1-p\right)^{n-x} \\ \Pr\left(X=2\right)&=\binom{10}{2}\left(0.7\right)^2\left(0.3\right)^8 \\ &=0.00145 \end{align} \]
To determine the probability that the player scores \(2\) or more goals, subtract the probabilities of scoring \(0\) or \(1\) goals from \(1\):
\[\begin{align} \Pr\left(X\geq2\right)&=1-\Pr\left(X=0\right)-\Pr \left(X=1\right) \\ &=1-\binom{10}{0}\left(0.7\right)^0\left(0.3\right)^{10}-\binom{10}{1}\left(0.7\right)^1\left(0.3\right)^9 \\ &=1-0.0000059-0.0001378 \\ &=0.99986 \end{align} \]
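The worked example can be reproduced as a short Python sketch. It enumerates the arrangements explicitly with `itertools.combinations` rather than relying on the formula alone; the variable names and the local `pmf` helper are illustrative.

```python
from itertools import combinations
from math import comb

n, p = 10, 0.7  # 10 penalty shots, probability 0.7 of scoring each one

# Each arrangement is identified by the positions (0 to 9) of the two goals
arrangements = list(combinations(range(n), 2))
print(len(arrangements))  # 45, the same as comb(10, 2)

def pmf(x):
    """Pr(X = x) for the number of goals in n shots."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

prob_exactly_2 = pmf(2)
# "At least 2" is the complement of "0 or 1"
prob_at_least_2 = 1 - pmf(0) - pmf(1)
print(prob_exactly_2, prob_at_least_2)
```

Using the complement needs only two terms, whereas summing \(\Pr(X=2)\) through \(\Pr(X=10)\) directly would need nine.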
Conditional probability of the binomial distribution
Conditional probability helps us update the likelihood of an event when we know that another event has occurred. In the context of a binomial distribution, it means we can refine our probability calculations based on some given outcome or condition.
The method for calculating conditional probability is the same as in basic probability.
\[\Pr(A|B)=\frac{\Pr(A\cap B)}{\Pr(B)}\textsf{ or }\frac{\text{intersection}}{\text{condition}}\]
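As an illustration of applying this rule to a binomial random variable, the sketch below treats each event as a set of values of \(X\). The question it answers is hypothetical and not taken from this page: for \(10\) penalty shots with \(p=0.7\), given that the player scores at least \(8\) goals, how likely is it that all \(10\) were scored? The helper names are also illustrative.

```python
from math import comb

def binomial_pmf(n, p, x):
    """Pr(X = x) for X ~ Bi(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def conditional_probability(n, p, event_a, event_b):
    """Pr(A | B), where A and B are sets of possible values of X ~ Bi(n, p)."""
    pr_b = sum(binomial_pmf(n, p, x) for x in event_b)               # the condition
    pr_a_and_b = sum(binomial_pmf(n, p, x) for x in event_a & event_b)  # the intersection
    return pr_a_and_b / pr_b

n, p = 10, 0.7
at_least_8 = {8, 9, 10}
all_10 = {10}
print(conditional_probability(n, p, all_10, at_least_8))
```

Here the intersection of the two events is just \(X=10\), so the result is \(\Pr(X=10)\) divided by \(\Pr(X\geq 8)\).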
Expectation of the binomial distribution
For a binomial distribution, the expected value (or mean) of the binomial random variable \(X\) is found by the equation:
\[\mathrm{E}\left(X\right)=np\]
Where:
- \(p\) is the probability of success
- \(n\) is the number of trials
For example, if you toss a fair coin 100 times, you expect 50 of the tosses to land on heads. This is the probability of success, \(p=\frac{1}{2}\), multiplied by the number of trials, \(n=100\).
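A simulation gives an informal check of the coin-toss example: repeating the 100 tosses many times, the average number of heads should settle near \(np=50\). This is a sketch only; the function name, the number of repetitions and the seed are arbitrary choices.

```python
import random

def simulate_mean_heads(n_tosses=100, p=0.5, repetitions=2000, seed=0):
    """Average number of heads over many repetitions of n_tosses coin tosses."""
    rng = random.Random(seed)
    total_heads = 0
    for _ in range(repetitions):
        # Count the heads in one run of n_tosses independent tosses
        total_heads += sum(rng.random() < p for _ in range(n_tosses))
    return total_heads / repetitions

print(simulate_mean_heads())  # expected to be close to n * p = 50
```

An individual run of 100 tosses will often give a count other than 50; the expected value describes the long-run average, not the outcome of any single run.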
Variance and standard deviation of the binomial distribution
The variance and standard deviation of a binomial distribution can be found by:
\[\operatorname{Var}(X)=np(1-p) \quad \text{and} \quad \operatorname{SD}(X)=\sqrt{np(1-p)}\]
Where:
- \(p\) is the probability of success
- \(n\) is the number of trials
This formula for the variance of a binomial distribution is derived from the general identity \(\operatorname{Var}(X)=\mathrm{E}\left(X^2\right)-[\mathrm{E}(X)]^2\). The expected value of \(X^2\) can be calculated as
\[\mathrm{E}\left(X^2\right)=\sum_{x=1}^n x^2\binom{n}{x} p^x(1-p)^{n-x}=n^2 p^2-n p^2+n p\]
Therefore,
\[\operatorname{Var}(X)=\mathrm{E}\left(X^2\right)-[\mathrm{E}(X)]^2=-n p^2+n p=n p(1-p)\]
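The closed forms above can be checked against direct summation over the probability function. The sketch below does this for \(n=10\), \(p=0.7\), which are arbitrary illustrative parameters; the helper names are not from any library.

```python
from math import comb, isclose, sqrt

def binomial_pmf(n, p, x):
    """Pr(X = x) for X ~ Bi(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def moments_by_summation(n, p):
    """Return E(X), E(X^2) and Var(X), each computed directly from the probabilities."""
    mean = sum(x * binomial_pmf(n, p, x) for x in range(n + 1))
    mean_sq = sum(x**2 * binomial_pmf(n, p, x) for x in range(n + 1))
    return mean, mean_sq, mean_sq - mean**2

n, p = 10, 0.7
mean, mean_sq, var = moments_by_summation(n, p)

# Compare with the closed-form expressions, allowing for floating-point rounding
assert isclose(mean, n * p)
assert isclose(mean_sq, n**2 * p**2 - n * p**2 + n * p)
assert isclose(var, n * p * (1 - p))

print(var, sqrt(var))  # variance and standard deviation
```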
Graph of the binomial distribution
The binomial distribution can be represented as a graph by plotting the number of successes \((x)\) against the probability of each outcome \(\Pr(X=x)\). The shape of the graph depends on the parameters \(n\) (number of trials) and \(p\) (probability of success). This is the graph of the probability mass function (PMF).
For example, if \(n=100\) and \(p=0.5\), you would expect to get many outcomes around \(\mathrm{E}(X)=50\), and fewer outcomes closer to \(0\) or closer to \(100\). Therefore, the PMF is symmetric about \(50\).
If, instead, \(n=100\) and \(p=0.2\), you would expect to get many outcomes around \(\mathrm{E}(X)=20\). Therefore the graph has many outcomes clustered on the left-hand side, with a long tail on the right, and so the distribution is described as right-skewed.
Finally, if \(n=100\) and \(p=0.8\), you would expect to get many outcomes around \(\mathrm{E}(X)=80\). Therefore the distribution is left-skewed.
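The three shapes described above can be drawn as rough text bar charts using only the Python standard library. This is a sketch: `text_plot` and `mode` are hypothetical helper names, and bars for probabilities below an arbitrary threshold are omitted to keep the output short.

```python
from math import comb

def binomial_pmf(n, p, x):
    """Pr(X = x) for X ~ Bi(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def text_plot(n, p, width=50, threshold=0.005):
    """Return a text bar chart of the PMF, skipping bars below the threshold."""
    probs = [binomial_pmf(n, p, x) for x in range(n + 1)]
    peak = max(probs)
    lines = []
    for x, pr in enumerate(probs):
        if pr >= threshold:
            # Bar length is proportional to the probability, scaled to the tallest bar
            lines.append(f"{x:3d} | {'#' * round(width * pr / peak)} {pr:.4f}")
    return "\n".join(lines)

def mode(n, p):
    """Value of x with the highest probability."""
    return max(range(n + 1), key=lambda x: binomial_pmf(n, p, x))

for p in (0.5, 0.2, 0.8):
    print(f"n = 100, p = {p}, most likely value: {mode(100, p)}")
    print(text_plot(100, p))
```

For \(p=0.2\), the probability at a given distance above \(20\) is larger than at the same distance below it, which is the longer right-hand tail; for \(p=0.8\) the picture is mirrored.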
