# Probability Distributions in Statistics

These are my notes on probability distributions of random variables in statistics.

**Probability Distributions of Random Variables**

A continuous random variable takes all possible values in a given range. An example is the distance traveled by a car using one gallon of gas. Occasionally, when a discrete variable takes lots of values, it is treated as a continuous variable.

The probability distribution of a continuous random variable, or the continuous probability distribution, is a graph or a formula giving all possible values taken by a random variable and the corresponding probabilities. It is also known as the density function, or probability density function.

Let X be a continuous random variable taking values in the range (a, b). Then, the area under the density curve over an interval is equal to the probability that X falls in that interval. The total area under the curve is 1. The probability that X takes any specific value is equal to 0. For example, the chance of it raining exactly 3.00233221 inches is 0, but the chance of it raining between 3.00 and 3.01 inches is small but measurable.
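As a sketch of this idea, suppose rainfall were uniformly distributed on (0, 10) inches (an illustrative assumption, not a claim from these notes). Then an interval's probability is its width divided by the width of the range, and a single exact value has probability 0:

```python
def uniform_interval_prob(lo, hi, a=0.0, b=10.0):
    """P(lo <= X <= hi) for X uniform on (a, b): interval width / range width."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

# A narrow interval has small but nonzero probability...
p_interval = uniform_interval_prob(3.00, 3.01)   # ~0.001
# ...while a single exact value has probability 0.
p_exact = uniform_interval_prob(3.00233221, 3.00233221)
```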

This may seem hard to understand, but remember that there are infinitely many points on your probability density function, and all of their probabilities add up to 1. Pretend you only have 10 events whose probabilities add up to 1. If they all had equal probability, then each would have a 0.1 probability. Now imagine there are 100 events: you would have a 0.01 probability for each one. With infinitely many events, the probability of each one goes to 0.
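The arithmetic in that thought experiment is just 1/n shrinking as n grows:

```python
# Probability of each of n equally likely events is 1/n, which shrinks toward 0.
probs = {n: 1 / n for n in (10, 100, 1_000_000)}
```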

The cumulative distribution function of a random variable X is \(F(x_{0}) = P(X \leq x_{0})\) for any \(a < x_{0} < b\). It is equal to 0 for any \(x_0 < a\), and it is equal to 1 for any \(x_0 > b\).

**Normal Distribution**

The discovery of the normal distribution is credited to Carl Gauss. It is also known as the bell curve or Gaussian distribution. This is the most commonly used distribution in statistics because it closely approximates the distribution of many different measurements.

If a random variable X follows a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), then it is denoted \(X=N(\mu, \sigma)\).

The standard normal is the normal distribution with a mean of 0 and a standard deviation of 1. Any normal random variable can be transformed into the standard normal using the relation \(Z=\frac{X-\mu}{\sigma}\). The value of Z for any specific value of X is known as the z-score. For example, suppose \(X=N(10,2)\). The z-score for X=12.5 is then \(Z=\frac{X-\mu}{\sigma} = \frac{12.5-10}{2}=1.25\).
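The computation above can be sketched as a small helper, using the numbers from the example:

```python
def z_score(x, mu, sigma):
    """Standardize: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

# X = N(10, 2), as in the notes: the z-score of X = 12.5
z = z_score(12.5, mu=10, sigma=2)   # 1.25
```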

This process is called z-scoring. It does not change the distribution at all. It simply changes the units on the x-axis, shifting it over by \(-\mu\) and relabeling the axis in units of standard deviation, so that 1 unit = 1 standard deviation.

**Properties of the Normal Distribution**

It is continuous. It is symmetric around its mean. It is bell-shaped. Mean = median = mode. The curve approaches the horizontal axis on both sides of the mean without ever touching or crossing it. Nearly all of the distribution lies within three standard deviations of the mean. It has two inflection points: one at \(\mu-\sigma\) and one at \(\mu+\sigma\).

The normal distribution is fully determined by two parameters: the mean and the variance (or equivalently, the standard deviation). The location of the distribution on the number line depends on the mean. The shape of the distribution depends on the standard deviation. A normal distribution with a larger standard deviation is more spread out, while one with a smaller standard deviation is more tightly bunched.

**Using the Normal Distribution Table**

If the random variable X follows a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), then the random variable \(Z=\frac{X-\mu}{\sigma}\) follows a standard normal distribution, a normal distribution with mean 0 and standard deviation 1.
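In code, a table lookup can be replaced by computing the standard normal CDF directly. A sketch using `math.erf` from Python's standard library (`statistics.NormalDist` offers the same thing via its `cdf` method):

```python
import math

def std_normal_cdf(z):
    """Phi(z): area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Area to the left of z = 1.25, the z-score from the earlier example:
area = std_normal_cdf(1.25)   # ~0.8944, matching a standard normal table
```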

To find the area under the standard normal distribution, you can simply look at the standard normal probability table. To find the area under the curve (the probability) for any normal distribution other than the standard normal, we convert it to a standard normal using the above formula. Approximately 68% of the area under the curve lies between \(\mu-\sigma\) and \(\mu+\sigma\). Approximately 95% of the area under the curve lies between \(\mu-2\sigma\) and \(\mu+2\sigma\). Lastly, approximately 99.73% of the area under the curve lies between \(\mu-3\sigma\) and \(\mu+3\sigma\).

**Combining Independent Random Variables**

Sometimes we are interested in linear combinations of independent random variables. If we know the means and the variances of two random variables, we can determine the mean and the variance of a linear combination of these variables: for independent X and Y, \(E(aX+bY)=aE(X)+bE(Y)\) and \(Var(aX+bY)=a^{2}Var(X)+b^{2}Var(Y)\). If X and Y are normally distributed, then a linear combination of the two will also be normally distributed.

**Sampling Distributions**

A parameter is a numerical measure of a population. For example, a student's GPA is computed using grades from all of the student's courses, so GPA is a parameter.

A statistic is a numerical measure of a sample. An example is the percent of votes received by a presidential candidate. Generally, not every eligible voter votes in an election. Therefore, the president is elected based on the support received from a sample of the eligible voters, so the percent of votes is a statistic. If every eligible voter did vote, then the percent of votes received would be a parameter.

The sampling distribution is the probability distribution of all possible values of a statistic. Different samples of the same size from the same population will result in different values of the statistic. Therefore, a statistic is a random variable. Any table, list, graph, or formula giving all possible values a statistic can take and their corresponding probabilities gives the sampling distribution of that statistic.

The standard error is the standard deviation of the sampling distribution of a statistic.

**Central Limit Theorem**

Regardless of the shape of the distribution of the population, if the sample size is large and the population variance is finite, then the distribution of the sample means will be approximately normal, with mean \(\mu_{\bar{x}}=\mu\) and standard deviation \(\sigma_{\bar{x}}=\frac{\sigma}{\sqrt{n}}\).
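A simulation sketch of this, using a skewed exponential population with \(\mu = 1\) and \(\sigma = 1\) (an illustrative choice, not from the notes):

```python
import random
import statistics

random.seed(1)

n = 50              # sample size
num_samples = 20_000

# Each entry is the mean of one sample of size n drawn from a skewed population.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # close to mu = 1
std_error = statistics.pstdev(sample_means)     # close to sigma / sqrt(n)
```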

Basically, the central limit theorem tells us that regardless of the shape of the population distribution, as the sample size n increases, the shape of the distribution of the sample mean becomes more symmetric and bell-shaped, or more like a normal distribution. The center of the distribution of the sample mean remains at \(\mu\). The spread of the distribution of the sample mean decreases, and the distribution becomes more peaked.

**Independent Events**

Two events, A and B, are independent if the outcome of one event does not affect the probability of the other. In other words, if \(P(A)=P(A \mid B)\) and \(P(B)=P(B \mid A)\), then A and B are independent. Knowing A would not give you any information about B.

**Geometric Distribution**

A geometric distribution tells you the probability of having your first success on the kth trial. For example, it can be used to describe the probability of getting the first heads on the 4th coin flip.

**Binomial Distribution**

The binomial distribution tells you the probability of having k successes in n trials.
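As a sketch, both probability mass functions are one-line formulas (using `math.comb` from the standard library; the fair-coin example values are illustrative):

```python
import math

def geometric_pmf(k, p):
    """P(first success on trial k) = (1 - p)^(k - 1) * p."""
    return (1 - p) ** (k - 1) * p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Fair coin: first heads on the 4th flip, and 4 heads in 5 flips.
p_first_on_4th = geometric_pmf(4, 0.5)   # 0.5^3 * 0.5 = 0.0625
p_4_of_5 = binomial_pmf(4, 5, 0.5)       # C(5, 4) * 0.5^5 = 0.15625
```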

The central limit theorem states that if you take a sample of size n from a population with finite variance, and n is large enough, the distribution of sample means will be approximately normally distributed. Even if you are sampling from a skewed distribution, like income, the distribution of sample means will be approximately normal.