Beginning Probability in Statistics

These are my notes on beginning probability in statistics.


Random Phenomena

A random phenomenon is a situation in which we know what outcomes can possibly occur, but we don’t know which outcome will happen. In general, each occasion upon which we observe a random phenomenon is called a trial. At each trial, we note the value of the random phenomenon, and call that the trial’s outcome. When we combine outcomes, the resulting combination is an event. We call the collection of all possible outcomes the sample space. We will denote the sample space as S.

 

The law of large numbers says that as we repeat a random process over and over, the proportion of times that an event occurs settles down to one number. We call this number the probability of the event. But the law of large numbers requires two key assumptions. First, the random phenomenon we are studying must not change: the outcomes must have the same probabilities for each trial. Second, the trials must be independent. Informally, independence means that the outcome of one trial does not affect the outcomes of the others. Under these assumptions, as the number of independent trials increases, the long-run relative frequency of an event gets closer and closer to a single value.

 

Because the law of large numbers guarantees that relative frequencies settle down in the long run, we can give a name to the value that they approach. We call it the probability of the event. Because this definition is based on repeatedly observing the event’s outcome, this definition is often called empirical probability. 
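A quick simulation makes the settling-down behavior concrete. The sketch below is only an illustration, not part of the original notes: it flips a simulated fair coin with Python's standard random module and reports the proportion of heads after increasingly many trials. The helper name relative_frequency_of_heads and the seed value are arbitrary choices.

```python
import random

def relative_frequency_of_heads(num_flips, seed=0):
    """Flip a simulated fair coin num_flips times; return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

# Watch the relative frequency as the number of trials grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

For small n the proportion can be far from 0.5. For large n it should sit close to 0.5, which is the empirical probability of heads.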

 

Even though the law of large numbers seems natural, it is often misunderstood because the idea of the long run is hard to grasp. Many people believe that an outcome that hasn’t occurred in many trials is due to occur. All the law tells us is that in the long run, the relative frequency will settle down to the probability of that outcome. It says nothing about the short run.

 

Example 1

You have just flipped a fair coin and seen six heads in a row.

 

Does the coin owe you some tails? Suppose you spend that coin and your friend gets it in exchange. When she starts flipping the coin, should she expect a run of tails?

 

Of course not. Each flip is a new event. The coin cannot remember what it did in the past, so it cannot owe any particular outcomes in the future.

 

The lesson of the law of large numbers is that sequences of random events do not compensate in the short run and do not need to do so to get back to the right long-run probability. If the probability of an outcome does not change and events are independent, the probability of any outcome in another trial is always what it was, no matter what has happened in other trials.
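One way to check that the coin does not compensate is to simulate it. The following sketch, with made-up helper names and an arbitrary seed, looks only at flips that come right after six heads in a row and measures how often those flips are heads. If the coin owed us tails, this proportion would fall noticeably below 0.5.

```python
import random

def heads_rate_after_streak(num_flips=200_000, streak=6, seed=1):
    """Among flips that follow `streak` heads in a row, return the proportion that are heads."""
    rng = random.Random(seed)
    run = 0             # current number of consecutive heads
    followers = 0       # flips that came right after a streak
    follower_heads = 0  # how many of those flips were heads
    for _ in range(num_flips):
        is_head = rng.random() < 0.5
        if run >= streak:
            followers += 1
            follower_heads += is_head
        run = run + 1 if is_head else 0
    return follower_heads / followers

print(heads_rate_after_streak())
```

The printed value should be close to 0.5, up to simulation noise, which matches the claim that each flip is independent of the ones before it.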

 

Modeling Probability

Probability was first studied extensively by a group of French mathematicians who were interested in games of chance. Rather than experiment with the games, they developed mathematical models. When the probability comes from a mathematical model and not from observation, it is called theoretical probability. To make things simple, they started by looking at games in which the different outcomes were equally likely.

 

It is easy to find probabilities for events that are made up of several equally likely outcomes. We just count all the outcomes that the event contains. The probability of the event is the number of outcomes in the event divided by the total number of possible outcomes. 

\[P(A) = \frac{\text{number of outcomes in } A}{\text{total number of possible outcomes}}\]

For example, the probability of drawing a face card from a deck is:

\[P(\text{face card}) = \frac{\text{number of face cards}}{\text{number of cards}} = \frac{12}{52} = \frac{3}{13}\]
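The counting argument can be checked by listing the deck explicitly. This is a small sketch that assumes a standard 52-card deck with jack, queen, and king as the face cards. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]

# Every (rank, suit) pair is one equally likely outcome.
deck = list(product(ranks, suits))
face_cards = [card for card in deck if card[0] in {"J", "Q", "K"}]

# Probability = outcomes in the event / total possible outcomes.
p_face = Fraction(len(face_cards), len(deck))
print(len(face_cards), len(deck), p_face)  # 12 52 3/13
```

Fraction reduces 12/52 to lowest terms automatically, giving 3/13, or about 0.23.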

 

Formal Probability

If the probability is 0, the event never occurs, and likewise if it has probability 1, it always occurs. Even if you think an event is very unlikely, its probability can’t be negative, and even if you are sure it will happen, its probability can’t be greater than 1. 

 

We have been careful to discuss probabilities only for situations in which the outcomes were finite, or even countably infinite. But if the outcomes can take on any numerical value at all, we say they are continuous. 

 

If a random phenomenon has only one possible outcome, it is not very interesting. So, we need to distribute the probabilities among all the outcomes a trial can have. When we assign probabilities to these outcomes, the first thing to be sure of is that we distribute all of the available probability. If we look at all the events in the entire sample space, the probability of that collection of events has to be 1. So the probability of the entire sample space is 1. Making this more formal gives the Probability Assignment Rule: The set of all possible outcomes of a trial must have probability 1.

 

Suppose the probability that you get to class on time is 0.8. What’s the probability that you do not get to class on time? It is 0.2. The set of outcomes that are not in the event A is called the complement of A. This leads to the Complement Rule: The probability of an event not occurring is 1 minus the probability that it does occur.

 

Example 2

If P(green) = 0.35, what is the probability the light is not green when you get to your destination?

 

Not green is the complement of green, so P(not green) = 1 - P(green) = 1 - .35 = .65

There is a 65% chance I will not have a green light. 

 

Suppose the probability that a randomly selected student is a sophomore is 0.20, and the probability that they are a junior is 0.30. What is the probability that the student is either a sophomore or a junior, written P(A or B)? If you guessed 0.50, you have deduced the Addition Rule, which says that you can add the probabilities of events that are disjoint. To see whether two events are disjoint, we take them apart into their component outcomes and check whether they have any outcomes in common. Disjoint events have no outcomes in common. The Addition Rule states: For two disjoint events A and B, the probability that one or the other occurs is the sum of the probabilities of the two events. 

P(A or B) = P(A) + P(B), provided that A and B are disjoint. 

 

Example 3

Suppose we find out that P(yellow) is 0.04. What is the probability that the light is red?

 

The light must be red, green, or yellow, so if we can figure out the probability that the light is green or yellow, we can use the complement rule to find the probability that it is red. To find the probability that the light is green or yellow, I can use the Addition Rule because these are disjoint events: The light can’t be both green and yellow at the same time.

P(green or yellow) = .35 + .04 = .39

Red is the only remaining alternative, and the probabilities must add up to 1, so:

P(red) = P(not (green or yellow)) = 1 - P(green or yellow) = 1 - .39 = .61
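The traffic-light calculations in Examples 2 and 3 can be reproduced in a few lines. The sketch below uses Python's Fraction type so the arithmetic is exact. The variable names are my own.

```python
from fractions import Fraction

p_green = Fraction(35, 100)
p_yellow = Fraction(4, 100)

# Complement Rule: P(not green) = 1 - P(green).
p_not_green = 1 - p_green

# Addition Rule: green and yellow are disjoint, so their probabilities add.
p_green_or_yellow = p_green + p_yellow

# Red is the complement of (green or yellow).
p_red = 1 - p_green_or_yellow

print(float(p_not_green), float(p_green_or_yellow), float(p_red))
```

The three printed values correspond to .65, .39, and .61 from the worked examples.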

 

The Addition Rule can be extended to any number of disjoint events, and that is helpful for checking probability assignments. Because individual sample space outcomes are always disjoint, we have an easy way to check whether the probabilities we have assigned to the possible outcomes are legitimate. The Probability Assignment Rule tells us that to be a legitimate assignment of probabilities, the sum of the probabilities of all possible outcomes must be exactly 1. No more, no less. For example, if we were told that the probabilities of selecting at random a freshman, sophomore, junior, or senior from all the undergraduates at a school were .25, .23, .22, and .20, respectively, we would know that something was wrong. These probabilities add only to .90, so this is not a legitimate probability assignment. Either a value is wrong or we just missed some possible outcomes.
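This legitimacy check is easy to automate. Here is a minimal sketch, with a hypothetical helper named is_legitimate, that tests both conditions: each probability lies between 0 and 1, and the probabilities sum to 1. It uses math.isclose to allow for floating-point rounding.

```python
import math

def is_legitimate(probabilities):
    """Return True if every value is in [0, 1] and the values sum to 1."""
    in_range = all(0 <= p <= 1 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0)
    return in_range and sums_to_one

# Class-year probabilities from the text: they add to only .90.
print(is_legitimate([0.25, 0.23, 0.22, 0.20]))  # False

# Traffic-light probabilities (green, yellow, red): they add to 1.
print(is_legitimate([0.35, 0.04, 0.61]))        # True
```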

 

Suppose your job requires you to fly from Atlanta to Houston every Monday morning. The airline’s website reports that this flight is on time 85% of the time. What is the chance it will be on time two weeks in a row? That is the same as asking for the probability that your flight is on time this week and it is on time again next week. For independent events, the answer is simple. Remember that independence means that the outcome of one event does not influence the outcome of the other. What happens with your flight this week does not influence whether it will be on time next week, so it is reasonable to assume that those events are independent. 

 

The Multiplication Rule says that for independent events, to find the probability that both events occur, we just multiply the probabilities together. This rule can be extended to more than two independent events. What is the chance of your flight being on time every Monday for a month, that is, four weeks in a row? We multiply the probabilities for each week:

.85*.85*.85*.85 = .522

Of course, to calculate this probability, we have used the assumption that the four events are independent. Many statistics methods require an Independence Assumption, but assuming independence does not make it true. Always think about whether that assumption is reasonable before using the Multiplication Rule.
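As a quick check of the arithmetic, the repeated multiplication can be written as a power. The helper below is a hypothetical name of my own, and it is only valid under the Independence Assumption discussed above.

```python
def prob_all_occur(p_single, repetitions):
    """Multiplication Rule for independent repetitions of an event with probability p_single."""
    return p_single ** repetitions

# Flight on time four Mondays in a row, assuming the weeks are independent.
p_four_weeks = prob_all_occur(0.85, 4)
print(round(p_four_weeks, 3))  # 0.522
```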

 

Example 4

We have determined that the probability that we encounter a green light at the corner is .35, a yellow light .04, and a red light .61. Let us think about how many times during your morning commute in the week ahead you might hit a red light there.

What is the probability you find the light red on both Monday and Tuesday?

Because the color of the light I see on Monday does not influence the color I will see on Tuesday, these are independent events. I can use the Multiplication Rule:

P(red Monday and Tuesday) = P(red) * P(red) = .61 * .61 = .3721

There is about a 37% chance I will hit red lights both Monday and Tuesday mornings.

 

What is the probability you do not encounter a red light until Wednesday?

For that to happen, I would have to see green or yellow on Monday, green or yellow on Tuesday, and then red on Wednesday. I can simplify this by thinking of it as not red on Monday, not red on Tuesday, and then red on Wednesday.

P(not red) = 1-P(red) = 1-.61 = .39, so:

P(not red Monday and not red Tuesday and red Wednesday) = P(not red) * P(not red) * P(red)

=.39 * .39 * .61 = .092781

There is about a 9% chance that this week I will hit my first red light on Wednesday morning.

 

What is the probability that you will have to stop at least once during the week?

Having to stop at least once means that I have to stop for the light 1,2,3,4, or 5 times next week. It is easier to think about the complement, never having to stop at a red light. Having to stop at least once means that I did not make it through the week with no red lights.

P(having to stop at light at least once in 5 days)

=1-P(no red lights for 5 days in a row)

=1-P(not red and not red and not red and not red and not red)

\[= 1 - (.39)^5\]

=1-.0090 = .991

I am not likely to make it through the intersection without having to stop sometime this week. 

 

Note that the phrase at least is often a tip off to think about the complement. Something that happens at least once does happen. Happening at least once is the complement of not happening at all, and that is easier to find.
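The three parts of Example 4 can be verified with a short script. This is a sketch under the same Independence Assumption used in the example. The variable names are illustrative.

```python
p_red = 0.61
p_not_red = 1 - p_red  # Complement Rule

# Red on both Monday and Tuesday (Multiplication Rule).
p_red_mon_and_tue = p_red * p_red

# First red light on Wednesday: not red, not red, then red.
p_first_red_on_wed = p_not_red * p_not_red * p_red

# At least one red light in 5 days: complement of no red lights at all.
p_stop_at_least_once = 1 - p_not_red ** 5

print(round(p_red_mon_and_tue, 4))     # 0.3721
print(round(p_first_red_on_wed, 4))    # 0.0928
print(round(p_stop_at_least_once, 3))  # 0.991
```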

 

Example 5

In the Japanese M&M’s survey used in this example, P(pink) = .38, P(teal) = .36, and P(purple) = .16. What is the probability that a survey respondent selected at random chose either pink or teal?

Plan and decide which rules to use and check the conditions they require.

The events pink and teal are disjoint because one respondent can’t choose both. We can apply the Addition Rule.

Show your work.

P(pink or teal) = P(pink) + P(teal) = .38 + .36 = .74

Interpret your results in the proper context.

The probability that the respondent chose either pink or teal is .74 or 74%

 

If we pick two respondents at random, what is the probability that they both said purple?

The word both suggests we want P(A and B), which calls for the Multiplication Rule. Think about the assumption.

Independence Assumption: The choice made by one respondent does not affect the choice of the other, so the events are independent. I can use the Multiplication Rule.

Show your work. For both respondents to choose purple, each one has to choose purple.

P(both purple) = P(first purple and second purple) 

= P(first purple) * P(second purple) = .16 * .16 = .0256

Interpret your results in the proper context.

The probability that both chose purple is .0256

 

If we pick three respondents at random, what is the probability that at least one chose purple?

The phrase, at least, often flags a question best answered by looking at the complement, and that is the best approach here. The complement of at least one preferred purple is none of them preferred purple. Think about the assumption.

P(at least one purple) = P(not (none purple)) = 1 - P(none purple)

P(none purple) = P(not purple and not purple and not purple)

Independence Assumption: These are independent events because they are choices by three random respondents. I can use the Multiplication Rule.

We calculate P(none purple) by using the Multiplication Rule.

P(none purple) = P(first not purple) * P(second not purple) * P(third not purple)

=P(not purple)^3

P(not purple) = 1 - P(purple) = 1 - .16 = .84 

So, P(none purple) = .84^3 = .5927 

Then we can use the Complement Rule to get the probability we want.

P(at least 1 purple) = 1 - P(none purple) = 1 - .5927 = .4073

Interpret your results in the proper context.

There is about a 40.7% chance that at least one of the respondents chose purple.
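The three answers in Example 5 follow the same pattern of Addition Rule, Multiplication Rule, and Complement Rule, so they are easy to recompute. The sketch below uses the survey probabilities from the example. The variable names are my own.

```python
p_pink, p_teal, p_purple = 0.38, 0.36, 0.16

# Addition Rule: pink and teal are disjoint choices for one respondent.
p_pink_or_teal = p_pink + p_teal

# Multiplication Rule: two independent respondents both choose purple.
p_both_purple = p_purple * p_purple

# Complement Rule: at least one of three is the complement of none of three.
p_none_purple = (1 - p_purple) ** 3
p_at_least_one_purple = 1 - p_none_purple

print(round(p_pink_or_teal, 4))         # 0.74
print(round(p_both_purple, 4))          # 0.0256
print(round(p_at_least_one_purple, 4))  # 0.4073
```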

 

Beware of probabilities that don’t add up to 1. To be a legitimate probability assignment, the sum of the probabilities for all possible outcomes must total 1. If the sum is less than 1, you may need to add another category and assign the remaining probability to that outcome. If the sum is more than 1, check that the outcomes are disjoint. If they are not, then you cannot assign probabilities by just counting relative frequencies.

 

Do not add probabilities of events if they are not disjoint. Events must be disjoint to use the Addition Rule. The probability of being younger than 80 or a female is not the probability of being younger than 80 plus the probability of being female. That sum may be more than 1.

 

Do not multiply probabilities of events if they are not independent. The probability of selecting a student at random who is over 6’10” tall and on the basketball team is not the probability the student is over 6’10” tall times the probability he is on the basketball team. Knowing that the student is over 6’10” changes the probability of his being on the basketball team. You cannot multiply these probabilities. The multiplication of probabilities of events that are not independent is one of the most common errors people make in dealing with probabilities.

 

Do not confuse disjoint and independent. Disjoint events (each with nonzero probability) cannot be independent. If A = {you get an A in this class} and B = {you get a B in this class}, A and B are disjoint. Are they independent? If you find out that A is true, does that change the probability of B? Yes, it does: once you know you got an A, the probability of a B drops to 0. So they cannot be independent.
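The difference between disjoint and independent can be seen on a small sample space. The sketch below uses one roll of a fair six-sided die, which is my own example and not from the notes. It checks disjointness by looking for shared outcomes and independence by comparing P(A and B) with P(A) * P(B).

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def are_disjoint(a, b):
    return not (a & b)  # no outcomes in common

def are_independent(a, b):
    return prob(a & b) == prob(a) * prob(b)

low = {1, 2}
middle = {3, 4}
even = {2, 4, 6}

print(are_disjoint(low, middle), are_independent(low, middle))  # True False
print(are_disjoint(low, even), are_independent(low, even))      # False True
```

The events low and middle are disjoint but not independent, because knowing one occurred rules out the other. The events low and even share the outcome 2, yet they are independent because P(low and even) = 1/6 = (1/3)(1/2).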

 

Random Phenomenon

A phenomenon is random if we know what outcomes could happen, but not which particular values will happen.

 

Trial

A single attempt or realization of a particular phenomenon.

 

Outcome

The value measured, observed, or reported for an individual instance of a trial.

 

Event

A collection of outcomes. Usually, we identify events so that we can attach probabilities to them. We denote events with bold capital letters such as A, B, or C.

 

Sample Space

The collection of all possible outcome values. The collection of values in the sample space has a probability of 1. We denote the sample space with a boldface capital S.

 

Law of Large Numbers

This law states that the long-run relative frequency of an event’s occurrence gets closer and closer to the true relative frequency as the number of trials increases.

 

Independence

Two events are independent if learning that one event occurs does not change the probability that the other event occurs. 

 

Probability

The probability of an event is a number between 0 and 1 that reports the likelihood of that event’s occurrence. We write P(A) for the probability of the event A.

 

Empirical Probability

When the probability comes from the long-run relative frequency of the event’s occurrence, it is an empirical probability.

 

Theoretical Probability

When the probability comes from a model, it is theoretical probability.

 

Personal Probability

When the probability is subjective and represents your personal degree of belief, it is a personal probability.

 

Probability Assignment Rule

The probability of an entire sample space must be 1. P(S) = 1.

 

Complement Rule

The probability of an event not occurring is 1 minus the probability that it does occur.

 

Addition Rule

If A and B are disjoint events, then the probability of A or B is P(A or B) = P(A) + P(B)

 

Disjoint

Two events are disjoint if they share no outcomes in common. If A and B are disjoint, then knowing that A occurs tells us that B cannot occur. Disjoint events are also called mutually exclusive.

 

Legitimate Assignment of Probabilities

An assignment of probabilities to outcomes is legitimate if each probability is between 0 and 1 and the sum of the probabilities is 1.

 

Multiplication Rule

If A and B are independent events, then the probability of A and B is P(A and B) = P(A) * P(B)

 

Question 1

In a dresser are three blue shirts, four red shirts, and nine black shirts.

What is the probability of randomly selecting a red shirt?

4/16=.25

What is the probability that a randomly selected shirt is not black?

Not black means blue or red, and there are 3 + 4 = 7 such shirts.

7/16=.4375
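Both answers come from counting equally likely outcomes, which can be checked as follows. The dictionary of shirt counts is taken from the question. The variable names are illustrative.

```python
from fractions import Fraction

shirts = {"blue": 3, "red": 4, "black": 9}
total = sum(shirts.values())  # 16 shirts in all

p_red = Fraction(shirts["red"], total)
p_not_black = 1 - Fraction(shirts["black"], total)  # Complement Rule

print(p_red, float(p_red))              # 1/4 0.25
print(p_not_black, float(p_not_black))  # 7/16 0.4375
```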

 

Question 2

A recent study conducted by a health statistics center found that 23% of households in a certain country had no landline service. This raises concerns about the accuracy of certain surveys, as they depend on random-digit dialing to households via landlines. Pick three households from this country at random.

What is the probability that all three of them have a landline?

The Multiplication Rule says that for independent events, the probability that all of them occur is the product of their probabilities. P(has a landline) = 1 - .23 = .77, so:

.77 * .77 * .77 = .457

What is the probability that at least one of them does not have a landline?

At least one household without a landline is the complement of all three having a landline. The Complement Rule says that the probability of an event not occurring is 1 minus the probability that it does occur.

1 - .457 = .543

What is the probability that at least one of them does have a landline?

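At least one household with a landline is the complement of none of the three having a landline. P(none has a landline) = .23 * .23 * .23 = .012, so:

1 - .012 = .988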
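The three landline answers can be recomputed together. This sketch assumes, as the solution does, that the three randomly chosen households are independent. The variable names are my own.

```python
p_no_landline = 0.23
p_landline = 1 - p_no_landline  # Complement Rule: 0.77

# All three households have a landline (Multiplication Rule).
p_all_have = p_landline ** 3

# At least one lacks a landline: complement of all three having one.
p_at_least_one_without = 1 - p_all_have

# At least one has a landline: complement of none having one.
p_at_least_one_with = 1 - p_no_landline ** 3

print(round(p_all_have, 3))              # 0.457
print(round(p_at_least_one_without, 3))  # 0.543
print(round(p_at_least_one_with, 3))     # 0.988
```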

 

Question 3

For each of the following, list the sample space and tell whether you think the events are equally likely.

A sample space is the collection of all possible outcome values. The collection of values in the sample space has a probability of 1.

 

Roll two dice, record the sum of the numbers.

S = {2,3,4,5,6,7,8,9,10,11,12}

The outcomes are not equally likely. For example, a sum of 7 can occur in six ways, but a sum of 2 in only one way.

 

A family has 3 children, record each child’s sex in order of birth.

S = {bbb, bbg, bgb, bgg, gbb, gbg, ggb, ggg}

The outcomes are equally likely, assuming boys and girls are equally likely and births are independent.

 

Toss four coins and record the number of tails.

S = {0,1,2,3,4}

The outcomes are not equally likely.

 

Toss a coin 10 times and record the length of the longest run of heads.

S = {0,1,2,3,4,5,6,7,8,9,10}

The outcomes are not equally likely.
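These sample spaces are small enough to enumerate by brute force, which also shows why most of them are not equally likely. The sketch below assumes fair dice and fair coins and uses itertools.product to list every underlying equally likely result, then counts how many of them lead to each recorded value. The helper longest_heads_run is a hypothetical name of my own.

```python
from collections import Counter
from itertools import product

# Two dice: count how many of the 36 rolls give each sum.
dice_sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Three children: every birth order of b and g.
birth_orders = ["".join(kids) for kids in product("bg", repeat=3)]

# Four coins: count how many of the 16 tosses give each number of tails.
tails_counts = Counter(toss.count("T") for toss in product("HT", repeat=4))

def longest_heads_run(tosses):
    """Length of the longest run of consecutive heads in a sequence of tosses."""
    best = run = 0
    for t in tosses:
        run = run + 1 if t == "H" else 0
        best = max(best, run)
    return best

# Ten coins: count how many of the 1024 sequences give each longest run.
run_lengths = Counter(longest_heads_run(seq) for seq in product("HT", repeat=10))

print(sorted(dice_sums.items()))
print(birth_orders)
print(sorted(tails_counts.items()))
print(sorted(run_lengths.items()))
```

Unequal counts mean the recorded values are not equally likely. Only the birth orders, where each value corresponds to exactly one underlying result, come out equally likely.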

 

Question 4

The plastic arrow on a spinner for a child’s game stops rotating to point at a color that will determine what happens next. Are the given probability assignments possible?

 

Each probability is between 0 and 1 and the sum of the probabilities is 1, so this is possible.

Each probability is between 0 and 1 and the sum of the probabilities is 1, so this is possible.

The sum of the probabilities is greater than 1, so this is not possible.

Each probability is between 0 and 1 and the sum of the probabilities is 1, so this is possible.

At least one probability is not between 0 and 1, so this is not possible.