Probability Distributions
Topics:
Discrete probability distribution
Binomial distribution
Continuous probability distribution
Normal distribution
Normal approximation to a binomial distribution
Central Limit Theorem
Books and resources:
Aalen 4, 5
Kirkwood and Sterne 5, 6, 15.1-15.2
Discrete probability distribution
Here we deal with stochastic trials where the set of all possible outcomes is countable and known.
A discrete random variable maps each of the trial outcomes to a numeric value.
A discrete probability distribution assigns a probability for each of the possible numeric values representing the outcomes.
The probabilities of all possible values must sum to 1.
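As a minimal sketch of these definitions, the Python snippet below writes down the distribution of a fair six-sided die and checks that the probabilities sum to 1. The fair die and the name `die_distribution` are illustrative assumptions, not part of the course material, and the course itself uses R rather than Python.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
# The random variable maps each outcome (a face) to its numeric value 1..6,
# and the distribution assigns probability 1/6 to each value.
die_distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# Every probability is non-negative and together they sum to 1.
total = sum(die_distribution.values())
print(total)  # 1

# The probability of an event is the sum over the values it contains,
# here the event "the face is even".
p_even = sum(prob for face, prob in die_distribution.items() if face % 2 == 0)
print(p_even)  # 1/2
```

Exact fractions are used here only so that the sum is exactly 1, without floating-point rounding.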
Bernoulli trials
Often a process has two outcomes.
Coin tossing: outcomes are head or tail.
HIV test looks for the presence or absence of antibodies in the blood.
A child is born with a certain condition or not.
Or from a stochastic trial with multiple outcomes, there are two outcomes of interest:
Throw a die: you are only interested in whether you get a 6 or not.
A child is born with a weight lower than 2500 grams or not.
Binomial trials
A binomial trial consists of a series of Bernoulli trials that satisfy the following:
In each trial, we can record whether certain event \(A\) occurs or not.
The probability of \(A\), \(P(A)\), is the same in each trial; it is denoted by \(p\).
All trials are independent.
Suppose we carry out \(n\) trials, looking for an event \(A\) in each trial. As a result we obtain a sequence:
\[A, \bar{A}, A, A, \bar{A}, \ldots, A\]
Say that \(A\) takes place \(x\) times. This means \(\bar{A}\) takes place \(n-x\) times.
What is the probability for a certain sequence?
Recall that probabilities for independent events can be multiplied.
\[P(\text{sequence}) = p \times (1-p) \times p \times \cdots \times p\]
With \(x\) factors of \(p\) and \(n-x\) factors of \(1-p\),
\[P(\text{sequence}) = p^{x} (1-p)^{n-x}\]
Every sequence with \(x\) occurrences of \(A\) has this same probability, so the order of the sequence does not matter; only the number of occurrences matters. The number of ways that \(x\) objects can be chosen from a total of \(n\) objects, regardless of order, is given by the binomial coefficient.
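The argument above can be checked by brute force for a small case. The following Python sketch lists every sequence of \(n\) trials with exactly \(x\) occurrences of \(A\) and confirms that each has probability \(p^{x}(1-p)^{n-x}\). The values \(n = 4\), \(x = 3\), \(p = 0.3\) and the helper `sequence_probability` are assumptions chosen for illustration only.

```python
from itertools import product

# Hypothetical values chosen for illustration.
n, x, p = 4, 3, 0.3

# Represent a sequence as a tuple of booleans: True means A occurred.
# Keep only the sequences in which A occurs exactly x times.
sequences = [s for s in product([True, False], repeat=n) if sum(s) == x]


def sequence_probability(seq, p):
    """Multiply p for each occurrence of A and (1 - p) otherwise (independent trials)."""
    prob = 1.0
    for occurred in seq:
        prob *= p if occurred else (1 - p)
    return prob


probs = [sequence_probability(s, p) for s in sequences]

# Number of sequences with exactly x occurrences.
print(len(sequences))

# Each sequence has the same probability, up to floating-point rounding.
expected = p**x * (1 - p) ** (n - x)
print(all(abs(q - expected) < 1e-12 for q in probs))
```

Enumerating sequences is only feasible for small \(n\), since there are \(2^n\) of them in total.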
Binomial coefficient
We want to find the number of ways that \(x\) objects can be chosen from a total of \(n\) objects, regardless of order.
Binomial coefficient: \(\binom nx\)
\[\binom nx = \frac{n!}{x!(n-x)!}\]
“x factorial”: \(x! = x \times (x-1) \times \cdots \times 2 \times 1\)
Example: \(\binom{4}{3} = \frac{4!}{3!\,1!} = \frac{4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1) \times 1} = 4\)
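The factorial formula translates directly into code. The Python sketch below implements it in a helper named `binomial_coefficient` (a name chosen here for illustration) and compares the result with `math.comb` from the standard library, available from Python 3.8.

```python
from math import comb, factorial


def binomial_coefficient(n: int, x: int) -> int:
    """Number of ways to choose x objects from n, ignoring order: n! / (x! (n-x)!)."""
    return factorial(n) // (factorial(x) * factorial(n - x))


# The worked example from the text: choosing 3 objects out of 4.
print(binomial_coefficient(4, 3))  # 4

# The standard-library function gives the same value.
print(comb(4, 3))  # 4

# Choosing nothing can be done in exactly one way, since 0! = 1.
print(binomial_coefficient(4, 0))  # 1
```

In practice `math.comb` is preferable, because it avoids computing the large intermediate factorials.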
Binomial distribution
The probability that the event \(A\) occurs exactly \(x\) times is given by:
\[P(X = x) = \binom n x p^{x}(1-p)^{n-x}\]
That is, the number of ways of distributing \(x\) events \(A\) in a sequence of length \(n\), times the probability that one particular sequence with \(x\) events \(A\) occurs.
For a binomial distribution, the mean (or expected value) is \(np\) and the variance is \(np(1-p)\).
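To see the formula and the stated mean and variance at work, the Python sketch below evaluates the binomial probabilities for every possible \(x\) and computes the mean and variance directly from them. The values \(n = 10\) and \(p = 0.3\) and the function name `binomial_pmf` are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binom(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical values chosen for illustration.
n, p = 10, 0.3

# Probabilities for x = 0, 1, ..., n.
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# The probabilities sum to 1 (up to floating-point rounding).
total = sum(pmf)

# Mean and variance computed from the definition of expected value.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(total)     # close to 1
print(mean)      # close to n * p = 3
print(variance)  # close to n * p * (1 - p) = 2.1
```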
Continuous probability distribution
Here we deal with continuous stochastic variables. In contrast to counting variables, a continuous variable can take any of the infinitely many values within a given range. For instance, height, cholesterol and annual salary can be considered continuous variables.
A probability distribution (or density) for a continuous variable \(X\) is a function \(f\) satisfying the following:
- \(f(x)\geq 0\) for all \(x\).
- The total area under the curve of \(f\) is 1.
- The probability that \(X\) is between two values \(a\) and \(b\), denoted \(P(a\leq X \leq b)\), equals the area under the curve from \(a\) to \(b\).
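The three conditions can be illustrated numerically. The Python sketch below uses a made-up density, \(f(x) = 2x\) on \([0, 1]\) and 0 elsewhere, and approximates areas under the curve with the midpoint rule. Both the density and the helper `area` are assumptions for illustration; they do not come from the course material.

```python
def density(x: float) -> float:
    """Hypothetical density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(f, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the area under f from a to b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# The total area under the density is 1.
total_area = area(density, 0.0, 1.0)

# P(0.25 <= X <= 0.5) is the area under the curve between 0.25 and 0.5.
# For this density the exact value is 0.5**2 - 0.25**2 = 0.1875.
prob = area(density, 0.25, 0.5)

print(total_area)
print(prob)
```

Note that the probability of any single exact value is zero for a continuous variable, which is why probabilities are stated for intervals.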
Normal distribution
This is the most important probability distribution in statistics. It will be widely used in this course.
The probability density function of the normal distribution is:
\[f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\]
\(\mu\) is the mean
\(\sigma\) is the standard deviation
\(\text{exp}(x) = e^{x}\)
Properties of the normal distribution
Symmetric, bell-shaped
\(\mu\) determines the location (center) and \(\sigma\) the spread
From the center (mean), going two standard deviations each way covers approximately 95% of the distribution
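These properties can be checked numerically. The Python sketch below codes the density formula by hand, compares it with `statistics.NormalDist` from the standard library, and computes the probability within two standard deviations of the mean. The values \(\mu = 170\) and \(\sigma = 10\) (think of height in cm) are hypothetical, as is the function name `normal_pdf`.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma), where sigma is the standard deviation."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


# Hypothetical parameters chosen for illustration.
mu, sigma = 170.0, 10.0
dist = NormalDist(mu, sigma)

# The hand-coded formula agrees with the standard-library density.
print(normal_pdf(165.0, mu, sigma), dist.pdf(165.0))

# Symmetry: the density is equal at the same distance on either side of the mean.
left = normal_pdf(mu - 5, mu, sigma)
right = normal_pdf(mu + 5, mu, sigma)
print(left, right)

# Probability within two standard deviations of the mean, roughly 0.95.
within_two_sd = dist.cdf(mu + 2 * sigma) - dist.cdf(mu - 2 * sigma)
print(within_two_sd)
```

The result for two standard deviations does not depend on the particular \(\mu\) and \(\sigma\) chosen.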
Standard normal distribution
A normal distribution \(N(\mu, \sigma)\) with \(\mu = 0, \sigma = 1\)
Any normal distribution can be transformed into a standard normal distribution \(N(0, 1)\) by subtracting the mean and dividing by the standard deviation:
if \(X \sim N(\mu, \sigma)\), then \(Y = \frac{X-\mu}{\sigma} \sim N(0, 1)\).
Standard normal distribution probabilities are commonly presented in tables, so people can check them easily. But in this course we will learn how to compute them with R.
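The course computes these probabilities in R; purely as an illustration of the standardization step, here is the same idea sketched in Python with `statistics.NormalDist`. The parameters \(\mu = 170\), \(\sigma = 10\) and the value \(x = 185\) are hypothetical.

```python
from statistics import NormalDist

# Hypothetical parameters and observation chosen for illustration.
mu, sigma = 170.0, 10.0
x = 185.0

# Standardize: subtract the mean and divide by the standard deviation.
z = (x - mu) / sigma
print(z)  # 1.5

# P(X <= x) computed on the original scale ...
p_original = NormalDist(mu, sigma).cdf(x)

# ... equals P(Y <= z) on the standard normal scale.
p_standard = NormalDist(0.0, 1.0).cdf(z)

print(p_original, p_standard)  # both about 0.933
```

This equality is why a single table for \(N(0, 1)\) is enough to look up probabilities for any normal distribution.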
Normal approximation to the binomial distribution
Recall the binomial distribution,
\[P(X = x) = \binom nx p^x (1-p)^{n-x}\]
where \(\binom nx = \frac{n!}{x!(n-x)!}\), and \(\binom n 0 = 1\)
For large \(n\), the binomial distribution can be approximated by the normal distribution.
To approximate a binomial distribution we use a normal distribution with:
\(\mu = np\)
\(\sigma = \sqrt{np(1-p)}\)
Rule of thumb:
\(\text{Binom}(n, p)\) is well approximated by \(N(\mu, \sigma)\) if \(np \geq 5\) and \(n(1-p) \geq 5\); the approximation improves as \(n \rightarrow \infty\).
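To get a feel for how good the approximation is, the Python sketch below compares an exact binomial probability with its normal approximation. The values \(n = 100\), \(p = 0.3\) and \(k = 35\) are hypothetical. Evaluating the normal distribution at \(k + 0.5\) is a continuity correction, a common refinement that is an addition here and is not described in these notes.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binom(n, p), summing the binomial probabilities."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


# Hypothetical values chosen for illustration.
n, p, k = 100, 0.3, 35

# Rule of thumb: both np and n(1-p) should be at least 5.
rule_of_thumb_ok = n * p >= 5 and n * (1 - p) >= 5
print(rule_of_thumb_ok)  # True

# Approximating normal distribution.
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx_dist = NormalDist(mu, sigma)

exact = binomial_cdf(k, n, p)

# Continuity correction (an assumption added here): evaluate at k + 0.5
# because the binomial variable only takes whole-number values.
approx = approx_dist.cdf(k + 0.5)

print(exact, approx)  # the two values should be close
```

Trying a small \(n\) or a \(p\) close to 0 or 1, where the rule of thumb fails, shows the two values drifting apart.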
Central limit theorem (CLT)
CLT in simple words: the distribution of the sample mean will be nearly normal, regardless of the distribution of the variable in the population, as long as the sample size is large enough.
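A small simulation can make this concrete. The Python sketch below repeatedly draws samples of die rolls, a population that is uniform rather than normal, and looks at how the sample means behave. The sample size of 50, the 2000 repetitions and the random seed are arbitrary choices for illustration. The comparison with \(\sigma / \sqrt{n}\), the standard error of the mean, is a standard result that is assumed here rather than taken from these notes.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)  # fixed seed so the simulation is reproducible

# Arbitrary simulation settings chosen for illustration.
sample_size = 50
n_samples = 2000

# Population: a fair die, which is uniform on 1..6 and clearly not normal.
population_mean = 3.5
population_sd = sqrt(35 / 12)


def sample_mean(size: int) -> float:
    """Mean of `size` independent rolls of a fair die."""
    return mean([random.randint(1, 6) for _ in range(size)])


sample_means = [sample_mean(sample_size) for _ in range(n_samples)]

# The sample means center on the population mean ...
mean_of_means = mean(sample_means)

# ... with spread close to population_sd / sqrt(sample_size).
sd_of_means = stdev(sample_means)
expected_se = population_sd / sqrt(sample_size)

# As for a normal distribution, roughly 95% of the sample means
# fall within two standard errors of the population mean.
fraction_within_two_se = sum(
    abs(m - population_mean) <= 2 * expected_se for m in sample_means
) / n_samples

print(mean_of_means, sd_of_means, expected_se, fraction_within_two_se)
```

A histogram of `sample_means` would show the bell shape directly, even though a single die roll is uniformly distributed.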