Discrete random variables

A discrete random variable is a type of random variable that can take on a countable set of distinct values. Common examples include the number of children in a family, the outcome of rolling a die, or the scores awarded in a gymnastics competition.

To describe the behaviour of a discrete random variable, we use a probability distribution. A probability distribution lists all the possible values the variable can take along with the probability of each value occurring. The probabilities must be between 0 and 1, and their total must add up to 1.


Use this page to revise the following concepts relating to discrete random variables:

  • Displaying discrete random variables
  • The mean
  • Variance and standard deviation


Displaying discrete random variables

Probability distributions are used to organise and display the outcomes and probabilities of discrete random variables. This makes it easier to see all possible outcomes and their associated probabilities at a glance.

For example, the sample space for rolling a six-sided die is 1, 2, 3, 4, 5 and 6. If we let \(X\) be the result of the roll and \(x\) represent each possible outcome, the distribution can be represented in a table.

\[\begin{array}{c|cccccc} x & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline \Pr(X=x) & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} \end{array}\]

This table represents a discrete probability function, which shows the probability associated with each possible value of a discrete random variable.

Such distributions can also be displayed graphically, to make patterns and comparisons easier to see.

For example, suppose 20 people were asked how many pets they own: 4 said no pets, 8 said 1 pet, 6 said 2 pets and 2 said 3 pets. If one person is selected at random from the 20 and \(X\) is the number of pets that person owns, then \(\Pr(X=0)=0.2\), \(\Pr(X=1)=0.4\), \(\Pr(X=2)=0.3\) and \(\Pr(X=3)=0.1\). These probabilities can be represented as a graph of the probability mass function.

Figure: Probability mass function (PMF) graph showing the discrete values of the random variable \(X\) on the horizontal axis and the corresponding probabilities on the vertical axis. Red dashed vertical lines indicate the probability for each value of \(X\). The probabilities are highest around the middle values and lower at the extremes.

Each individual probability must lie between \(0\) and \(1\) and, as the distribution accounts for all possible outcomes, the sum of the probabilities over all values of \(x\) must equal \(1\). That is,

\[0 \leq \Pr(X=x) \leq 1\] \[\sum_{x} \Pr(X=x)=1\]
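
The two conditions above can be checked mechanically. The following is a minimal Python sketch, assuming the pet survey counts from the example; the names `pmf_from_counts` and `is_valid_pmf` are illustrative helpers introduced here, not part of any library. Exact fractions are used so that the sum can be compared with 1 without rounding error.

```python
from fractions import Fraction

# Survey counts from the pets example: number of pets -> number of people.
pet_counts = {0: 4, 1: 8, 2: 6, 3: 2}


def pmf_from_counts(counts):
    """Convert outcome counts into exact probabilities Pr(X = x)."""
    total = sum(counts.values())
    return {x: Fraction(n, total) for x, n in counts.items()}


def is_valid_pmf(pmf):
    """Check that every probability is in [0, 1] and that they sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in pmf.values())
    return each_in_range and sum(pmf.values()) == 1


pet_pmf = pmf_from_counts(pet_counts)
for x, p in pet_pmf.items():
    print(f"Pr(X = {x}) = {p}")
print("Valid distribution:", is_valid_pmf(pet_pmf))
```

A table whose probabilities do not sum to 1, such as one with an outcome left out, would fail the second condition and the function would return `False`.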

The Mean

The mean is a measure of the long-term average result of a chance event if it were repeated many times. It may also be called the expected value, as it gives a sense of what outcome you would "expect" on average over a large number of trials.

To find the mean for discrete random variables, multiply each possible outcome by its probability, and then sum these products.

\[\begin{align} \mu= E\left(X\right)&=\sum_{i=1}^{n}{x_ip_i}\\ &=x_1p_1+x_2p_2+\ldots+x_np_n  \end{align}  \]

Where:

  • \( \mu \) is the population mean of the random variable \( X \)
  • \( E(X) \) is the expected value of the random variable \( X \), which is equal to \( \mu \)
  • \( x_i \) is the \( i\text{-th} \) outcome value that \( X \) can take
  • \( p_i \) is the probability associated with \( x_i \), i.e., \( \Pr(X = x_i) \)

Worked Example

Calculate the mean outcome of rolling a fair six-sided die.

The possible outcomes are \(\{1,2,3,4,5,6\}\), each with a probability of \(\frac{1}{6}\). We can then calculate the expected value as

\[\begin{align} E\left(X\right)&=\left(1\times\frac{1}{6}\right)+\left(2\times\frac{1}{6}\right)+\left(3\times\frac{1}{6}\right)+\left(4\times\frac{1}{6}\right)+\left(5\times\frac{1}{6}\right)+\left(6\times\frac{1}{6}\right) \\ &=\frac{21}{6} \\ &=3.5  \end{align}  \]
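
The same calculation can be sketched in Python. The helper name `expected_value` is an illustrative choice rather than a standard function, and the distribution is assumed to be stored as a dictionary mapping each outcome to its probability.

```python
from fractions import Fraction


def expected_value(pmf):
    """Return E(X): the sum of each outcome multiplied by its probability."""
    return sum(x * p for x, p in pmf.items())


# Fair six-sided die: each face 1..6 has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expected_value(die_pmf)
print(mean, float(mean))  # prints "7/2 3.5"
```

Note that 3.5 is not itself a possible roll; the mean describes the long-run average, not an outcome that must occur.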

Variance and Standard Deviation

Variance and standard deviation are measures of how spread out the values of a probability distribution are from the mean. They help quantify the variability in the outcomes of a random variable.

The variance of a random variable \(X\), denoted \(VAR(X)\), measures the average of the squared differences between each outcome and the mean. In other words, it tells us how far the values of \(X\) tend to deviate from the expected value on average, measured in squared units.

\[VAR\left(X\right)=E\left[(X-\mu)^2\right]\]

Where \(\mu = E(X)\) is the mean (expected value) of the distribution.

An equivalent and often more convenient formula is the shortcut formula

\[VAR\left(X\right)=E\left(X^2\right)-\left(E\left(X\right)\right)^2\]

Where

  • \(E(X^2)\) is the expected value of the square of the outcomes, and
  • \(\left(E(X)\right)^2\) is the square of the expected value.
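
To see why the two formulas agree, expand the square in the definition and use the fact that \(\mu = E(X)\) is a constant. This is a brief sketch that relies on the linearity of expectation, a property not stated elsewhere on this page:

\[\begin{align} VAR\left(X\right)&=E\left[(X-\mu)^2\right] \\ &=E\left(X^2-2\mu X+\mu^2\right) \\ &=E\left(X^2\right)-2\mu E\left(X\right)+\mu^2 \\ &=E\left(X^2\right)-2\mu^2+\mu^2 \\ &=E\left(X^2\right)-\left(E\left(X\right)\right)^2 \end{align}\]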

The standard deviation, denoted \(\sigma\) or \(SD(X)\), is the square root of the variance. It is often the preferred way to describe the spread of a probability distribution, because it is in the same units as the outcomes themselves.

\[SD(X)=\sqrt{{VAR}(X)}\]

Worked Example

Find the variance and standard deviation of the random variable \(X\) with the following probability distribution.

\[\begin{array}{c|cccc} x & 1 & 2 & 3 & 4 \\ \hline \Pr(X=x) & 0.2 & 0.4 & 0.1 & 0.3 \end{array}\]

Calculate the variance

\(E(X)=1\times0.2+2\times0.4+3\times0.1+4\times0.3=2.5\)

\(E(X^2)=1^2\times0.2+2^2\times0.4+3^2\times0.1+4^2\times0.3=7.5\)

\( \begin{align} VAR\left(X\right)&=E\left(X^2\right)-\left(E\left(X\right)\right)^2 \\ &=7.5-(2.5)^2 = 1.25 \end{align} \)

Calculate the standard deviation

\(\begin{align} SD(X)&=\sqrt{VAR(X)} \\ &=\sqrt{1.25} \approx 1.12 \end{align}\)
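
As a check on this worked example, the following Python sketch computes the variance using both the definition and the shortcut formula. The helper `expected_value` and its optional argument `g` are illustrative names assumed here; floating-point arithmetic is used, so results are compared and printed to a fixed number of decimal places rather than exactly.

```python
import math

# Distribution from the worked example: outcome -> probability.
pmf = {1: 0.2, 2: 0.4, 3: 0.1, 4: 0.3}


def expected_value(pmf, g=lambda x: x):
    """Return E(g(X)); with the default g this is simply E(X)."""
    return sum(g(x) * p for x, p in pmf.items())


mean = expected_value(pmf)

# Definition: average squared deviation from the mean.
var_definition = expected_value(pmf, lambda x: (x - mean) ** 2)

# Shortcut: E(X^2) minus the square of E(X).
var_shortcut = expected_value(pmf, lambda x: x ** 2) - mean ** 2

sd = math.sqrt(var_shortcut)

print(f"E(X)   = {mean:.2f}")
print(f"VAR(X) = {var_definition:.2f} (definition), {var_shortcut:.2f} (shortcut)")
print(f"SD(X)  = {sd:.2f}")
```

Both routes should give the same variance up to rounding, which is a useful way to catch arithmetic slips such as squaring with the wrong exponent.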