# Probability Distributions in Statistics

These are my notes on probability distributions of random variables in statistics.

**Probability Distributions of Random Variables**

A continuous random variable takes all possible values in a given range. An example is the distance traveled by a car using one gallon of gas. Occasionally, when a discrete variable takes lots of values, it is treated as a continuous variable.

The probability distribution of a continuous random variable, or the continuous probability distribution, is a graph or a formula giving all possible values taken by a random variable and the corresponding probabilities. It is also known as the density function, or probability density function.

Let X be a continuous random variable taking values in the range (a, b). Then, the area under the density curve over an interval is equal to the probability that X falls in that interval. The total area under the curve is 1. The probability that X takes any specific value is equal to 0. For example, the chance of it raining exactly 3.00233221 inches is 0, but the chance of it raining between 3.00 and 3.01 inches is small but measurable.
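As a sketch of this idea, suppose rainfall were uniformly distributed on (0, 10) inches (an illustrative assumption, not a claim from these notes). Then an interval's probability is its width divided by the width of the range, and a single exact value has probability 0:

```python
def uniform_interval_prob(lo, hi, a=0.0, b=10.0):
    """P(lo <= X <= hi) for X uniform on (a, b): interval width / range width."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

# A narrow interval has small but nonzero probability...
p_interval = uniform_interval_prob(3.00, 3.01)   # ~0.001
# ...while a single exact value has probability 0.
p_exact = uniform_interval_prob(3.00233221, 3.00233221)
```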

This may seem hard to understand, but remember that there are infinitely many points on your probability density function, and all of their probabilities add up to 1. Pretend you only have 10 events whose probabilities add up to 1. If they all had equal probability, then each would have a 0.1 probability. Now imagine there are 100 events: you would have a 0.01 probability for each one. With infinitely many events, the probability of each one goes to 0.
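The arithmetic in that thought experiment is just 1/n shrinking as n grows:

```python
# Probability of each of n equally likely events is 1/n, which shrinks toward 0.
probs = {n: 1 / n for n in (10, 100, 1_000_000)}
```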

The cumulative distribution function of a random variable X is \(F(x_{0}) = P(X \leq x_{0})\) for any \(a < x_{0} < b\). It is equal to 0 for any \(x_0 < a\), and it is equal to 1 for any \(x_0 > b\).

**Normal Distribution**

The discovery of the normal distribution is credited to Carl Gauss. It is also known as the bell curve or Gaussian distribution. This is the most commonly used distribution in statistics because it closely approximates the distribution of many different measurements.

If a random variable X follows a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), then it is denoted \(X=N(\mu, \sigma)\).

The standard normal is the normal distribution with a mean of 0 and a standard deviation of 1. Any normal random variable can be transformed into the standard normal using the relation \(Z=\frac{X-\mu}{\sigma}\). The value of Z for any specific value of X is known as the z-score. For example, suppose \(X=N(10,2)\). The z-score for X=12.5 is then \(Z=\frac{X-\mu}{\sigma} = \frac{12.5-10}{2}=1.25\).
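The computation above can be sketched as a small helper, using the numbers from the example:

```python
def z_score(x, mu, sigma):
    """Standardize: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

# X = N(10, 2), as in the notes: the z-score of X = 12.5
z = z_score(12.5, mu=10, sigma=2)   # 1.25
```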

This process is called z-scoring. It does not change the distribution at all. It simply changes the units on the x-axis, shifting it over by \(-\mu\) and relabeling the axis in units of standard deviation, so that 1 unit = 1 standard deviation.

**Properties of the Normal Distribution**

It is continuous. It is symmetric around its mean. It is bell-shaped. Mean = median = mode. The curve approaches the horizontal axis on both sides of the mean without ever touching or crossing it. Nearly all of the distribution lies within three standard deviations of the mean. It has two inflection points: one at \(\mu-\sigma\) and one at \(\mu+\sigma\).

The normal distribution is fully determined by two parameters: the mean and the variance (or equivalently, the standard deviation). The location of the distribution on the number line depends on the mean. The shape of the distribution depends on the standard deviation. A normal distribution with a larger standard deviation is more spread out, while one with a smaller standard deviation is more tightly bunched.

**Using the Normal Distribution Table**

If the random variable X follows a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), then the random variable \(Z=\frac{X-\mu}{\sigma}\) follows a standard normal distribution, a normal distribution with mean 0 and standard deviation 1.
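In code, a table lookup can be replaced by computing the standard normal CDF directly. A sketch using `math.erf` from Python's standard library (`statistics.NormalDist` offers the same thing via its `cdf` method):

```python
import math

def std_normal_cdf(z):
    """Phi(z): area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Area to the left of z = 1.25, the z-score from the earlier example:
area = std_normal_cdf(1.25)   # ~0.8944, matching a standard normal table
```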

To find the area under the standard normal distribution, you can simply look at the standard normal probability table. To find the area under the curve (the probability) for any normal distribution other than the standard normal, we convert it to a standard normal using the above formula. Approximately 68% of the area under the curve lies between \(\mu-\sigma\) and \(\mu+\sigma\). Approximately 95% of the area under the curve lies between \(\mu-2\sigma\) and \(\mu+2\sigma\). Lastly, approximately 99.73% of the area under the curve lies between \(\mu-3\sigma\) and \(\mu+3\sigma\).

**Combining Independent Random Variables**

Sometimes we are interested in linear combinations of independent random variables. If we know the means and the variances of two random variables, we can determine the mean and the variance of a linear combination of these variables: for independent X and Y, \(E(aX+bY)=aE(X)+bE(Y)\) and \(Var(aX+bY)=a^{2}Var(X)+b^{2}Var(Y)\). If X and Y are normally distributed, then a linear combination of the two will also be normally distributed.

**Sampling Distributions**

A parameter is a numerical measure of a population. For example, a student's GPA is computed using grades from all of the student's courses, so GPA is a parameter.

A statistic is a numerical measure of a sample. An example is the percent of votes received by a presidential candidate. Generally, not every eligible voter votes in an election. Therefore, the president is elected based on the support received from a sample of the eligible voters, so the percent of votes is a statistic. If every eligible voter did vote, then the percent of votes received would be a parameter.

The sampling distribution is the probability distribution of all possible values of a statistic. Different samples of the same size from the same population will result in different values of the statistic. Therefore, a statistic is a random variable. Any table, list, graph, or formula giving all possible values a statistic can take and their corresponding probabilities gives the sampling distribution of that statistic.

The standard error is the standard deviation of the sampling distribution of a statistic.

**Central Limit Theorem**

Regardless of the shape of the distribution of the population, if the sample size is large and the population variance is finite, then the distribution of the sample means will be approximately normal, with mean \(\mu_{\bar{x}}=\mu\) and standard deviation \(\sigma_{\bar{x}}=\frac{\sigma}{\sqrt{n}}\).
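A simulation sketch of this, using a skewed exponential population with \(\mu = 1\) and \(\sigma = 1\) (an illustrative choice, not from the notes):

```python
import random
import statistics

random.seed(1)

n = 50              # sample size
num_samples = 20_000

# Each entry is the mean of one sample of size n drawn from a skewed population.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # close to mu = 1
std_error = statistics.pstdev(sample_means)     # close to sigma / sqrt(n)
```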

Basically, the central limit theorem tells us that regardless of the shape of the population distribution, as the sample size n increases, the shape of the distribution of the sample mean becomes more symmetric and bell-shaped, or more like a normal distribution. The center of the distribution of the sample mean remains at \(\mu\). The spread of the distribution of the sample mean decreases, and the distribution becomes more peaked.

**Independent Events**

Two events, A and B, are independent if the outcome of one event does not affect the probability of the other. In other words, if \(P(A)=P(A \mid B)\) and \(P(B)=P(B \mid A)\), then A and B are independent. Knowing A would not give you any information about B.

**Geometric Distribution**

A geometric distribution tells you the probability of having your first success on the kth trial. For example, it can be used to describe the probability of getting the first heads on the 4th coin flip.

**Binomial Distribution**

The binomial distribution tells you the probability of having k successes in n trials.
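As a sketch, both probability mass functions are one-line formulas (using `math.comb` from the standard library; the fair-coin example values are illustrative):

```python
import math

def geometric_pmf(k, p):
    """P(first success on trial k) = (1 - p)^(k - 1) * p."""
    return (1 - p) ** (k - 1) * p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Fair coin: first heads on the 4th flip, and 4 heads in 5 flips.
p_first_on_4th = geometric_pmf(4, 0.5)   # 0.5^3 * 0.5 = 0.0625
p_4_of_5 = binomial_pmf(4, 5, 0.5)       # C(5, 4) * 0.5^5 = 0.15625
```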

The central limit theorem states that if you take a sample of size n from a population with finite variance, and n is large enough, the distribution of sample means will be approximately normally distributed. Even if you are sampling from a skewed distribution, like income, the distribution of sample means will be approximately normal.