Random Variables

1. Introduction

Definition 1

  • Random Variable: A random variable {X} is a mapping

    \displaystyle  X \colon \Omega \rightarrow \mathbb{R} \ \ \ \ \ (1)

    which assigns real numbers {X(\omega)} to outcomes {\omega} in {\Omega}.

    2. Distribution Functions

    Definition 2

  • Distribution Function: Given a random variable {X}, the cumulative distribution function (also called the \textsc{cdf}) is a function {F_X \colon \mathbb{R} \rightarrow [0,1]} defined by:

    \displaystyle  F_X(x) = \mathbb{P}(X \leq x) \ \ \ \ \ (2)

    Theorem 3 Let {X} have \textsc{cdf} {F} and let {Y} have \textsc{cdf} {G}. If {F(x) = G(x)} for all {x \in \mathbb{R}}, then {\mathbb{P}(X \in A) = \mathbb{P}(Y \in A)} for all {A}.

    Definition 4 {X} is discrete if it takes countably many values.

    We define the probability function or the probability mass function for {X} by {f_X(x) = \mathbb{P}(X = x)}.

    Definition 5 A random variable {X} is said to be continuous if there exists a function {f_X} such that

  • {f_X(x) \geq 0} for all {x \in \mathbb{R}},
  • {\int_{-\infty}^{\infty}f_X(x)dx = 1}
  • for all {a, b \in \mathbb{R}} with {a \leq b} we have

    \displaystyle  \int_a^b f_X(x)dx = \mathbb{P}(a < X < b) \ \ \ \ \ (3)

    The function {f_X} is called the probability density function and we have

    \displaystyle  F_X(x) = \int_{-\infty}^x f_X(t)dt \ \ \ \ \ (4)

    and {f_X(x) = F_X'(x)} at all points for which {F_X} is differentiable.
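
    As a quick numerical illustration of (3) and (4), the sketch below (Python with scipy, using an assumed example density {f_X(x) = 2x} on {(0, 1)} that is not from the text) integrates the density to recover {F_X} and differentiates {F_X} to recover the density.

        from scipy.integrate import quad

        def f_X(t):
            # assumed example density: 2t on (0, 1), zero elsewhere, so F_X(x) = x^2 there
            return 2.0 * t if 0.0 < t < 1.0 else 0.0

        def F_X(x):
            # equation (4): the cdf is the integral of the density up to x
            value, _ = quad(f_X, 0.0, x)
            return value

        x = 0.6
        print(F_X(x))                                  # ~0.36 = x^2
        h = 1e-5
        print((F_X(x + h) - F_X(x - h)) / (2.0 * h))   # ~1.2 = f_X(0.6), i.e. F_X'(x)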

    3. Important Discrete Random Variables

    Remark 1 We write {X \sim F} to denote that the random variable {X} has a \textsc{cdf} {F}.

    3.1. The Point Mass Distribution

    \textsc{The Point Mass Distribution}. X has a point mass distribution at {a}, written {X \sim \delta_a} if {\mathbb{P}(X = a) = 1}. Hence {F_X} is

    \displaystyle  F_X(x) = \begin{cases} 0& x < a \\ 1& x \geq a. \end{cases} \ \ \ \ \ (5)

    3.2. The Discrete Uniform Distribution

    \textsc{The Discrete Uniform Distribution}. Let {k > 1} be a given integer. Let {X} have a probability mass function given by:

    \displaystyle  f_X(x) = \begin{cases} 1/k & x \in \{1, \dotsc, k\} \\ 0 & \text{otherwise}. \end{cases} \ \ \ \ \ (6)

    Then {X} has a discrete uniform distribution on {\{1, \dotsc, k\}}.

    3.3. The Bernoulli Distribution

    \textsc{The Bernoulli Distribution}. Let {X} be a random variable with {\mathbb{P}(X = 1) = p} and {\mathbb{P}(X = 0) = 1 - p} for some {p \in [0, 1]}. We say that {X} has a Bernoulli distribution, written {X \sim \text{Bernoulli}(p)}. The probability function {f_X} is given by {f_X(x) = p^x(1 - p)^{(1 - x)} \text{ for } x \in \{0, 1\}}.

    3.4. The Binomial Distribution

    \textsc{The Binomial Distribution}. Flip a coin {n} times and let {X} denote the number of heads. If {p} denotes the probability of getting heads in a single coin toss and the tosses are assumed to be independent, then the probability mass function of {X} can be shown to be:

    \displaystyle  f_X(x) = \begin{cases} \binom{n}{x} p^x(1 - p)^{(n-x)} & x \in \{0, 1, \dotsc, n\} \\ 0 & \text{otherwise}. \end{cases} \ \ \ \ \ (7)
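
    A minimal sketch (Python; the values of {n}, {p}, and {x} are illustrative assumptions) that evaluates the mass function in (7) directly and cross-checks it against scipy:

        from math import comb
        from scipy.stats import binom

        n, p, x = 10, 0.3, 4
        manual = comb(n, x) * p**x * (1 - p)**(n - x)   # (n choose x) p^x (1-p)^(n-x)
        print(manual, binom.pmf(x, n, p))               # the two values should agree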

    3.5. The Geometric Distribution

    \textsc{The Geometric Distribution}. {X} has a geometric distribution with parameter {p \in [0, 1]}, written as {X \sim \text{Geom}(p)} if

    \displaystyle  f_X(x) = p(1 - p)^{(x - 1)} \text{ for } x \in \{1, 2, 3, \dotsc\}. \ \ \ \ \ (8)

    Here {X} represents the number of independent coin flips (each with heads probability {p}) needed until the first head appears.
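
    A small simulation sketch (Python; {p} and the number of trials are illustrative assumptions) that flips a coin until the first head and compares the empirical frequencies with the mass function in (8):

        import random

        p, trials = 0.25, 100_000
        counts = {}
        for _ in range(trials):
            flips = 1
            while random.random() >= p:      # keep flipping until a head (probability p) appears
                flips += 1
            counts[flips] = counts.get(flips, 0) + 1

        for x in range(1, 6):
            empirical = counts.get(x, 0) / trials
            exact = p * (1 - p) ** (x - 1)   # f_X(x) = p(1 - p)^(x - 1)
            print(x, round(empirical, 4), round(exact, 4))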

    3.6. The Poisson Distribution

    \textsc{The Poisson Distribution}. {X} has a Poisson distribution with parameter {\lambda > 0}, written as {X \sim \text{Poisson}(\lambda)} if

    \displaystyle  f_X(x) = e^{-\lambda}\frac{\lambda^{x}}{x!} \text{ for } x \in \{0, 1, 2, \dotsc\}. \ \ \ \ \ (9)

    {X} models the number of events occurring in a fixed interval of time and/or space when these events occur with a known average rate and independently of the time since the last event.
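
    A short sketch (Python; {\lambda = 3.5} is an illustrative assumption) evaluating the mass function in (9) and cross-checking it against scipy:

        from math import exp, factorial
        from scipy.stats import poisson

        lam = 3.5
        for x in range(6):
            manual = exp(-lam) * lam**x / factorial(x)   # e^{-lambda} lambda^x / x!
            print(x, round(manual, 5), round(poisson.pmf(x, lam), 5))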

    4. Important Continuous Random Variables

    4.1. The Uniform Distribution

    \textsc{The Uniform Distribution}. For {a, b \in \mathbb{R} \text{ and } a < b}, X has a uniform distribution over {(a, b)}, written {X \sim \text{Uniform}(a, b)}, if

    \displaystyle  f_X(x) = \begin{cases} \frac{1}{b - a} & x \in [a, b] \\ 0 & \text{otherwise}. \end{cases} \ \ \ \ \ (10)

    4.2. The Normal Distribution

    \textsc{The Normal Distribution}. We say that {X} has a normal (or Gaussian) distribution with parameters {\mu} and {\sigma}, written as {X \sim N(\mu, \sigma^2)} if

    \displaystyle  f_X(x;\mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\}, \text{ where } \mu, \sigma \in \mathbb{R}, \sigma > 0. \ \ \ \ \ (11)

    The parameter {\mu} is the “center” (or mean) of the distribution and {\sigma} is the “spread” (or standard deviation) of the distribution. We say that {X} has a standard Normal distribution if {\mu = 0} and {\sigma = 1}. A standard Normal random variable is denoted by {Z}. The \textsc{pdf} and \textsc{cdf} of a standard Normal are denoted by {\phi(z)} and {\Phi(z)}; the \textsc{pdf} is the familiar bell-shaped curve, and there is no closed-form expression for {\Phi}. Here are some useful facts:

  • If {X \sim N(\mu, \sigma^2)}, then {Z = (X - \mu) / \sigma \sim N(0, 1)}.
  • If {Z \sim N(0, 1)}, then {X = \mu + \sigma Z \sim N(\mu, \sigma^2)}.
  • If {X_i \sim N(\mu_i, \sigma_i^2)} for {i = 1, \dotsc , n} are independent, then we have

    \displaystyle  \sum_{i = 1}^nX_i \sim N\left(\sum_{i=1}^n \mu_i, \sum_{i=1}^n \sigma_i^2\right). \ \ \ \ \ (12)

    It follows from {(i)} that if {X \sim N(\mu, \sigma^2)}, then for any {a < b} the probability

    \displaystyle  \mathbb{P}\left(a < X < b\right) \ \ \ \ \ (13)

    can be computed by standardizing:

    \displaystyle  \mathbb{P}\left(a < X < b\right) = \mathbb{P}\left(a < \mu + \sigma Z < b\right) \ \ \ \ \ (14)

    \displaystyle  \mathbb{P}\left(a < X < b\right) = \mathbb{P}\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) \ \ \ \ \ (15)

    \displaystyle  \mathbb{P}\left(a < X < b\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right). \ \ \ \ \ (16)

    Example 1 Suppose that {X \sim N(3, 5)}. Find {P(X > 1)}.

    Solution:

    \displaystyle  \mathbb{P}\left(X > 1\right) \\ = \mathbb{P}\left(3 + Z\sqrt{5} > 1\right) \\ = \mathbb{P}\left( Z > \frac{-2}{\sqrt{5}}\right) \\ = 1 - \Phi\left(\frac{-2}{\sqrt{5}}\right) \\ = \Phi\left(\frac{2}{\sqrt{5}}\right) \\ = \Phi\left(0.894427\right) \\ \approx 0.81. \ \ \ \ \ (17)

    Example 2 For the above problem, also find the value {x} such that {\mathbb{P}(X < x) = 0.2}.

    Solution:

    \displaystyle  0.2 = \mathbb{P}\left(X < x\right) \\ = \mathbb{P}\left(3 + Z\sqrt{5} < x\right) \\ = \mathbb{P}\left(Z < \frac{x - 3}{\sqrt{5}}\right) \\ = \Phi\left(\frac{x - 3}{\sqrt{5}}\right) \ \ \ \ \ (18)

    From the normal table, we have that {\Phi(-0.8416) = 0.2}

    \displaystyle  \Phi(-0.8416) = \Phi\left(\frac{x - 3}{\sqrt{5}}\right) \\ -0.8416 = \left(\frac{x - 3}{\sqrt{5}}\right) \\ x = \left(3 - 0.8416\times\sqrt{5}\right) \\ x = 1.1181. \ \ \ \ \ (19)
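
    Both examples can be checked numerically; the sketch below (Python with scipy) is a minimal verification, keeping in mind that {N(3, 5)} has variance {5}, so scipy's scale parameter is the standard deviation {\sqrt{5}}:

        from math import sqrt
        from scipy.stats import norm

        mu, sigma = 3.0, sqrt(5.0)
        print(norm.sf(1, loc=mu, scale=sigma))      # P(X > 1) ~ 0.81   (Example 1)
        print(norm.ppf(0.2, loc=mu, scale=sigma))   # x with P(X < x) = 0.2, ~ 1.118   (Example 2)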

    4.3. The Exponential Distribution

    \textsc{The Exponential Distribution}. {X} has an exponential distribution with parameter {\beta > 0}, written as {X \sim \text{Exp}(\beta)}, if

    \displaystyle  f_X(x) = \frac{1}{\beta}e^{-x/\beta}, \text{ for } x > 0. \ \ \ \ \ (20)

    4.4. The Gamma Distribution

    \textsc{The Gamma Distribution}. For {\alpha > 0}, the Gamma function is defined as

    \displaystyle  \Gamma(\alpha) = \int_0^\infty y^{\alpha - 1} e^{-y} dy. \ \ \ \ \ (21)

    {X} has a Gamma distribution with parameters {\alpha} and {\beta} (where {\alpha, \beta > 0}), written as {X \sim \text{Gamma}(\alpha, \beta)}, if

    \displaystyle  f_X(x) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)}x^{\alpha - 1}e^{-x/\beta}, \text{ for } x > 0. \ \ \ \ \ (22)

    When {\alpha} is a positive integer, {X} models the waiting time until the {\alpha}-th event in a Poisson process; in particular, {\text{Gamma}(1, \beta)} is the {\text{Exp}(\beta)} distribution.
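
    One concrete consequence of (22): with {\alpha = 1} the Gamma density reduces to the Exponential density in (20), since {\Gamma(1) = 1}. The sketch below (Python; {\beta} and the test points are illustrative assumptions) checks this numerically:

        from math import exp, gamma

        def gamma_pdf(x, alpha, beta):
            # equation (22)
            return x**(alpha - 1) * exp(-x / beta) / (beta**alpha * gamma(alpha))

        def exp_pdf(x, beta):
            # equation (20)
            return exp(-x / beta) / beta

        beta = 2.0
        for x in (0.5, 1.0, 3.0):
            print(x, gamma_pdf(x, 1.0, beta), exp_pdf(x, beta))   # the two columns should match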

    5. Bivariate Distributions

    Definition 6 Given a pair of discrete random variables {X} and {Y}, their joint mass function is defined as {f_{X, Y}(x,y) = \mathbb{P}(X = x, Y = y)}

    Definition 7 For two continuous random variables, {X} and {Y}, we call a function {f_{X,Y}} a \textsc{pdf} of random variables {(X, Y)} if

  • {f_{X, Y}(x, y) \geq 0} for all {(x, y) \in \mathbb{R}^2},
  • {\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f_{X,Y}(x, y) \thinspace dx \thinspace dy = 1}
  • For any set {A \subseteq \mathbb{R}^2} we have

    \displaystyle  \int \int_A f_{X,Y}(x, y) \thinspace dx \thinspace dy = \mathbb{P}((X,Y) \in A). \ \ \ \ \ (23)

    Example 3 For {-1 \leq x \leq 1}, let {(X, Y)} have density

    \displaystyle  f_{X, Y}(x,y) = \begin{cases} cx^2y & x^2 \leq y \leq 1, \\ 0 & \text{otherwise}. \end{cases} \ \ \ \ \ (24)

    Find the value of {c} .

    Solution: We equate the integral of {f} over {\mathbb{R}^2} to {1} and find {c}.

    \displaystyle  1 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f_{X,Y}(x, y) \thinspace dy \thinspace dx \\ = \int_{-1}^{1}\int_{x^2}^{1}f_{X,Y}(x, y) \thinspace dy \thinspace dx \\ = \int_{-1}^{1}\int_{x^2}^{1}cyx^2 \thinspace dy \thinspace dx \\ = \int_{-1}^{1}c\left(\frac{1 - x^4}{2}\right)x^2 \thinspace dx \\ = \left(\frac{c}{2}\right)\left(\int_{-1}^{1}x^2 \thinspace dx - \int_{-1}^{1}x^6 \thinspace dx \right)\\ = \left(\frac{c}{2}\right)\left( \frac{2}{3} - \frac{2}{7}\right)\\ = \left(\frac{4c}{21}\right) \\ c = \frac{21}{4} \ \ \ \ \ (25)
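
    The value {c = 21/4} can be confirmed by numerical integration; the sketch below (Python with scipy) integrates the density over the region {x^2 \leq y \leq 1}, {-1 \leq x \leq 1} and should return approximately {1}:

        from scipy.integrate import dblquad

        c = 21.0 / 4.0
        # dblquad integrates f(y, x) with y running from x**2 to 1 and x from -1 to 1
        total, _ = dblquad(lambda y, x: c * x**2 * y, -1.0, 1.0, lambda x: x**2, lambda x: 1.0)
        print(total)   # ~1.0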

    6. Marginal Distributions

    Definition 8 For the discrete case, if {X, Y} have a joint mass function {f_{X, Y}} then the marginal distribution of {X} is given by

    \displaystyle  f_X(x) = \mathbb{P}(X = x) = \sum_y \mathbb{P}(X = x, Y = y) = \sum_y f_{X,Y}(x, y) \ \ \ \ \ (26)

    and of {Y} is given by

    \displaystyle  f_Y(y) = \mathbb{P}(Y = y) = \sum_x \mathbb{P}(X = x, Y = y) = \sum_x f_{X,Y}(x, y) \ \ \ \ \ (27)

    Definition 9 For the continuous case, if {X, Y} have a joint \textsc{pdf} {f_{X, Y}} then the marginal density of {X} is given by

    \displaystyle  f_X(x) = \int f_{X,Y}(x, y) \thinspace dy \ \ \ \ \ (28)

    and of {Y} is given by

    \displaystyle  f_Y(y) = \int f_{X,Y}(x, y) \thinspace dx \ \ \ \ \ (29)
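
    Applying (28) to the Example 3 density gives {f_X(x) = \int_{x^2}^{1} \frac{21}{4} x^2 y \thinspace dy = \frac{21}{8} x^2 (1 - x^4)} for {-1 \leq x \leq 1}. The sketch below (Python with scipy) checks this at a single illustrative point:

        from scipy.integrate import quad

        def f_XY(x, y):
            # Example 3 joint density with c = 21/4
            return (21.0 / 4.0) * x**2 * y if x**2 <= y <= 1.0 else 0.0

        x = 0.5
        numeric, _ = quad(lambda y: f_XY(x, y), x**2, 1.0)   # equation (28): integrate out y
        closed_form = (21.0 / 8.0) * x**2 * (1.0 - x**4)
        print(numeric, closed_form)                          # both ~0.615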

    7. Independent Random Variables

    Definition 10 Two random variables, {X} and {Y} are said to be independent if for every {A} and {B} we have

    \displaystyle  \mathbb{P}(X \in A, Y \in B) = \mathbb{P}(X \in A)\mathbb{P}(Y \in B) \ \ \ \ \ (30)

    Theorem 11 Let {X} and {Y} have a joint \textsc{pdf} {f_{X, Y}}. Then {X \amalg Y} if and only if {f_{X, Y}(x, y) = f_X(x)f_Y(y)} for all values of {x} and {y}.

    8. Conditional Distributions

    Definition 12 Let {X} and {Y} have a joint \textsc{pdf} {f_{X, Y}}. Then the conditional distribution of {X} given {Y = y} is defined, for {f_Y(y) > 0}, as

    \displaystyle  f_{X|Y}(x|y) = \frac{f_{X, Y}(x, y)}{f_Y(y)} \ \ \ \ \ (31)
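
    A quick sanity check of (31) using the Example 3 density (Python with scipy; the value of {y} is an illustrative assumption): for any fixed {y} with {f_Y(y) > 0}, the conditional density should integrate to {1} over {x}.

        from scipy.integrate import quad

        def f_XY(x, y):
            # Example 3 joint density with c = 21/4
            return (21.0 / 4.0) * x**2 * y if x**2 <= y <= 1.0 else 0.0

        def f_Y(y):
            # marginal of Y: integrate the joint density over x
            value, _ = quad(lambda x: f_XY(x, y), -1.0, 1.0)
            return value

        y = 0.64
        total, _ = quad(lambda x: f_XY(x, y) / f_Y(y), -1.0, 1.0)   # equation (31), integrated over x
        print(total)                                                # ~1.0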

    9. Multivariate Distributions and \textsc{iid} Samples

    Definition 13 Independence of {n} random variables: Let {X = \begin{pmatrix} X_1, \dotsc, X_n \end{pmatrix}} where {X_1, \dotsc, X_n} are random variables. Let {f(x_1, x_2, \dotsc, x_n)} denote their \textsc{pdf}. We say that {X_1, \dotsc, X_n} are independent if for every {A_1, \dotsc, A_n},

    \displaystyle  \mathbb{P}(X_1 \in A_1, \dotsc, X_n \in A_n) = \prod_{i=1}^n \mathbb{P}(X_i \in A_i) \ \ \ \ \ (32)

    Definition 14 If {X_1, \dotsc, X_n} are independent random variables with the same marginal distribution {F}, we say that {X_1, \dotsc, X_n} are \textsc{iid} (independent and identically distributed) random variables and we write:

    \displaystyle  \begin{pmatrix} X_1, \dotsc, X_n \end{pmatrix} \sim F \ \ \ \ \ (33)

    If {F} has density {f} then we also write {\begin{pmatrix} X_1, \dotsc, X_n \end{pmatrix} \sim f}. We also call {X_1, \dotsc, X_n} a random sample of size {n} from {F}.
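
    In computational terms, a random sample of size {n} from {F} is just {n} independent draws from the same distribution; the sketch below (Python with numpy, taking {F} to be the standard Normal as an illustrative assumption) produces such a sample:

        import numpy as np

        rng = np.random.default_rng(0)
        n = 5
        sample = rng.normal(loc=0.0, scale=1.0, size=n)   # n iid draws from N(0, 1)
        print(sample)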

    10. The Multivariate Normal Distribution

    \textsc{The Multivariate Normal Distribution}. In the multivariate Normal distribution, the parameter {\mu} is a vector and the scalar parameter {\sigma^2} is replaced by a covariance matrix {\Sigma}. Let

    \displaystyle  Z = \begin{pmatrix} Z_1 \\ \vdots \\ Z_k \end{pmatrix} \ \ \ \ \ (34)

    where {Z_1, \dotsc, Z_k \sim N(0, 1)} are independent. The joint density of {Z} is

    \displaystyle  f_Z(z) = \frac{1}{{(2\pi)}^{k/2}}\exp\biggl\{-\frac{1}{2}\sum_{j=1}^k {z_j}^2\biggr\} = \frac{1}{{(2\pi)}^{k/2}}\exp\left\{-\frac{1}{2}z^Tz\right\} \ \ \ \ \ (35)

    and we say that {Z} has a standard multivariate Normal distribution, written {Z \sim N(0, I)}. More generally, {X} has a multivariate Normal distribution with mean vector {\mu} and covariance matrix {\Sigma}, written {X \sim N(\mu, \Sigma)}, if its density is

    \displaystyle  f_X(x;\mu, \Sigma) = \frac{1}{{(2\pi)}^{k/2}{|\Sigma |}^{1/2}}\exp\left\{-\frac{1}{2}(x - \mu)^T\Sigma^{-1}(x - \mu)\right\} \ \ \ \ \ (36)
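
    The multivariate Normal density {f_X(x; \mu, \Sigma)} above can be evaluated directly and compared against a library implementation; the sketch below (Python with numpy and scipy; {\mu}, {\Sigma}, and {x} are illustrative assumptions) does exactly that:

        import numpy as np
        from scipy.stats import multivariate_normal

        mu = np.array([1.0, -2.0])
        Sigma = np.array([[2.0, 0.3],
                          [0.3, 1.0]])
        x = np.array([0.5, -1.0])

        k = len(mu)
        diff = x - mu
        # the multivariate Normal density, computed by hand
        manual = np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff) / (
            (2.0 * np.pi) ** (k / 2) * np.sqrt(np.linalg.det(Sigma))
        )
        print(manual, multivariate_normal(mean=mu, cov=Sigma).pdf(x))   # should agree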