# Statistical and Critical Thinking

Here are my notes and thoughts regarding statistical and critical thinking in Statistics.

Surveys provide data that enable us to improve products or services. Surveys guide political candidates, shape business practices, influence social media, and affect many aspects of our lives.

A voluntary response sample is a sample in which respondents themselves decide whether to participate. Those with a strong interest in the topic are more likely to participate. Sample data must be collected in an appropriate way, such as through a process of random selection. If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

When using methods of statistics with sample data to form conclusions about a population, it is absolutely essential to collect sample data in a way that is appropriate.

Data are collections of observations, such as measurements, genders, or survey responses. A single data value is called a datum. The term data is plural.

Statistics is the science of planning studies and experiments, obtaining data, and organizing, summarizing, presenting, analyzing, and interpreting those data and then drawing conclusions based on them.

A population is the complete collection of all measurements or data that are being considered. Typically, a population is the complete collection of data that we would like to make inferences about.

A census is the collection of data from every member of the population.

A sample is a subcollection of members selected from a population.

Because populations are often very large, a common objective of the use of statistics is to obtain data from a sample and then use those data to form a conclusion about the population.

A voluntary response sample is one in which the respondents themselves decide whether to be included.

The word statistics is derived from the Latin word status, meaning state. Early uses of statistics involved compilations of data and graphs describing various aspects of a state or country.

The following types of polls are common examples of voluntary response samples. By their very nature, all are seriously flawed because we should not make conclusions about a population on the basis of samples with a strong possibility of bias.

- Internet polls: people online can decide whether to respond.
- Mail-in polls: people can decide whether to reply.
- Telephone polls: newspaper, radio, or television announcements ask that you call a special number to respond.

**Analyze**

After completing our preparation by considering the context, source, and sampling method, we begin to analyze the data.

**Graph and Explore**

An analysis should begin with appropriate graphs and explorations of data.

**Apply Statistical Methods**

A good statistical analysis does not require strong computational skills. A good statistical analysis does require using common sense and paying careful attention to sound statistical methods.

**Conclude**

The final step in our statistical process involves conclusions, and we should develop an ability to distinguish between statistical significance and practical significance.

Statistical significance is achieved in a study when we get a result that is very unlikely to occur by chance. A common criterion is that we have statistical significance if the likelihood of an event occurring by chance is 5 percent or less. Getting 98 girls in 100 random births is statistically significant because such an extreme outcome is not likely to result from random chance. Getting 52 girls in 100 births is not statistically significant because that event could easily occur with random chance.
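As a rough check of the two birth examples, here is a minimal Python sketch. It assumes each birth is independent and that a girl has probability 0.5, which the notes do not state explicitly; `prob_at_least` is a hypothetical helper name.

```python
from math import comb

def prob_at_least(k: int, n: int) -> float:
    """Chance of k or more girls in n births, assuming P(girl) = 0.5."""
    # Count the favorable outcomes exactly, then divide by all 2**n outcomes.
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

p_98 = prob_at_least(98, 100)
p_52 = prob_at_least(52, 100)

# Compare each probability with the 5 percent criterion.
print(p_98 <= 0.05)  # True: 98 or more girls is very unlikely by chance
print(p_52 <= 0.05)  # False: 52 or more girls could easily occur by chance
```

The sketch uses "at least that many girls" as the event, which is one reasonable reading of "such an extreme outcome".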

Practical significance concerns whether a result matters in practice. It is possible that some treatment or finding is effective, yet common sense might suggest that it does not make enough of a difference to justify its use or to be practical.

**Misleading Conclusions**

When forming a conclusion based on a statistical analysis, we should make statements that are clear even to those who have no understanding of statistics and its terminology. We should carefully avoid making statements not justified by statistical analysis.

**Sample Data Reported**

When collecting data from people, it is better to take measurements yourself instead of asking subjects to report results. Ask people what they weigh and you are likely to get their desired weights, not their actual weights.

**Loaded Questions**

If survey questions are not worded carefully, the results of a study can be misleading. Survey questions can be loaded or intentionally worded to elicit a desired response.

**Order of Questions**

Sometimes survey questions are unintentionally loaded by such factors as the order of the items being considered.

**Nonresponse**

A nonresponse occurs when someone either refuses to respond to a survey question or is unavailable. When people are asked survey questions, some firmly refuse to answer.

**Percentages**

To find a percentage of an amount, replace the % symbol with division by 100, and then interpret “of” to be multiplication.

6% of 1200 responses = \(\frac{6}{100} * 1200 = 72 \)

**Decimal to Percentage**

To convert from a decimal to a percentage, multiply by 100%.

\[ 0.25 \rightarrow 0.25 * 100\% = 25\% \]

**Fraction to Percentage**

To convert from a fraction to a percentage, divide the denominator into the numerator to get an equivalent decimal number. Then multiply by 100 percent.

\[ \frac{3}{4} = 0.75 \rightarrow 0.75 * 100\% = 75\% \]

**Percentage to Decimal**

To convert from a percentage to a decimal number, replace the % symbol with division by 100.

\[ 85\% = \frac{85}{100} = 0.85 \]
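The four percentage rules above can be sketched in Python. The function names are hypothetical and chosen only for illustration.

```python
def percent_of(percent: float, amount: float) -> float:
    """Replace % with division by 100 and read 'of' as multiplication."""
    return percent * amount / 100

def decimal_to_percent(value: float) -> float:
    """Multiply a decimal by 100 to get a percentage."""
    return value * 100

def fraction_to_percent(numerator: float, denominator: float) -> float:
    """Divide to get a decimal, then multiply by 100."""
    return numerator / denominator * 100

def percent_to_decimal(percent: float) -> float:
    """Replace % with division by 100."""
    return percent / 100

print(percent_of(6, 1200))        # 72.0
print(decimal_to_percent(0.25))   # 25.0
print(fraction_to_percent(3, 4))  # 75.0
print(percent_to_decimal(85))     # 0.85
```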

A parameter is a numerical measurement describing some characteristic of a population.

A statistic is a numerical measurement describing some characteristic of a sample.

If we have more than one statistic, we have statistics. Another meaning of statistics is the science of planning studies and experiments; obtaining data; and organizing, summarizing, presenting, analyzing, and interpreting those data.

Some data are numbers representing counts or measurements, whereas others are attributes that are not counts or measurements. Quantitative data consist of numbers representing counts or measurements.

Categorical data consist of names or labels. Categorical data are sometimes coded with numbers, with those numbers replacing names. Although such numbers might appear to be quantitative, they are actually categorical data.

**Include Units of Measurement**

With quantitative data, it is important to use the appropriate units of measurement, such as dollars, hours, feet, or meters. We should carefully observe information given about the units of measurement, such as all amounts are in thousands of dollars or all units are in kilograms.

**Discrete or Continuous**

Quantitative data can be further described by distinguishing between discrete and continuous types. Discrete data result when the data values are quantitative and the number of values is finite or countable. Continuous (numerical) data result from infinitely many possible quantitative values, where the collection of values is not countable.

The concept of countable data plays a key role in the preceding definitions, but it is not a particularly easy concept to understand. Continuous data can be measured, but not counted. If you select a particular value from continuous data, there is no next data value.

**Levels of Measurement**

Another common way of classifying data is to use four levels of measurement: nominal, ordinal, interval, and ratio. When we are applying statistics to real problems, the level of measurement of the data helps us to decide which procedure to use. Don’t do computations and don’t use statistical methods that are not appropriate for the data.

**Ratio**

There is a natural zero starting point and ratios make sense. Examples are heights, lengths, distances, and volumes.

**Interval**

Differences are meaningful, but there is no natural zero starting point and ratios are meaningless. Body temperatures in degrees Fahrenheit or Celsius are an example.

**Ordinal**

Data can be arranged in order, but differences either can’t be found or are meaningless. Examples are ranks of colleges.

**Nominal**

Categories only. Data cannot be arranged in order. An example is eye colors.

The nominal level of measurement is characterized by data that consist of names, labels, or categories only. The data cannot be arranged in some order.

Because nominal data lack any ordering or numerical significance, they should not be used for calculations. Numbers such as 1, 2, 3, or 4 are sometimes assigned to the different categories, but these numbers have no real computational significance and any average calculated from them is meaningless and possibly misleading.

Data are at the ordinal level of measurement if they can be arranged in some order, but differences between data values cannot be determined or are meaningless.

Ordinal data provide information about relative comparisons, but not the magnitudes of the differences. Usually, ordinal data should not be used for calculations such as an average, but this guideline is sometimes ignored.

Data are at the interval level of measurement if they can be arranged in order, and differences between data values can be found and are meaningful. Data at this level do not have a natural zero starting point at which none of the quantity is present.

Data are at the ratio level of measurement if they can be arranged in order, differences can be found and are meaningful, and there is a natural zero starting point. For data at this level, differences and ratios are both meaningful.

The distinction between the interval and ratio levels of measurement can be a bit tricky. To distinguish between them, use a ratio test by asking this question: does use of the term twice make sense? The term twice describes one value being double another. Twice makes sense for data at the ratio level of measurement, but it does not make sense for data at the interval level of measurement.

For the true zero test, and for ratios to make sense, there must be a value of true zero, where the value of zero indicates that none of the quantity is present, and zero is not simply an arbitrary value on a scale. The temperature of 0°F is arbitrary and does not indicate that there is no heat, so temperatures on the Fahrenheit scale are at the interval level of measurement, not the ratio level.
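To see why twice fails for Fahrenheit temperatures, the sketch below compares 80°F with 40°F. The two temperatures are hypothetical values chosen for illustration, and the Kelvin scale is brought in here only as an example of a temperature scale with a true zero.

```python
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert Fahrenheit to Kelvin, a temperature scale with a true zero."""
    return (deg_f - 32) * 5 / 9 + 273.15

# Dividing the Fahrenheit numbers suggests "twice as hot".
naive_ratio = 80 / 40

# On a scale with a true zero, the ratio is nowhere near 2.
true_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(naive_ratio)           # 2.0
print(round(true_ratio, 2))  # 1.08
```

The ratio of the Fahrenheit numbers depends on where the arbitrary zero sits, which is why ratios are meaningless at the interval level.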

Big data refers to data sets so large and so complex that their analysis is beyond the capabilities of traditional software tools. Analysis of big data may require software simultaneously running in parallel on many different computers.

Data science involves applications of statistics, computer science, and software engineering, along with some other relevant fields such as sociology or finance.

**Example of Data Set Magnitudes**

- Terabytes
- Petabytes
- Exabytes
- Zettabytes
- Yottabytes

**Statistics in Data Science**

The modern data scientist has a solid background in statistics and computer systems as well as expertise in fields that extend beyond statistics. The modern data scientist might be skilled with Hadoop software, which uses parallel processing on many computers for the analysis of big data. The modern data scientist might also have a strong background in some other field such as psychology, biology, medicine, chemistry, or economics.

**Missing Data**

When collecting sample data, it is quite common to find that some values are missing. Ignoring missing data can sometimes create misleading results. If you make the mistake of skipping over a few different samples when you are manually typing them into a statistics software program, the missing values are not likely to have a serious effect on the results. However, if a survey includes many missing salary entries because those with very low incomes are reluctant to reveal their salaries, those missing values will have the serious effect of making salaries appear higher than they really are.
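The salary example can be illustrated with a small sketch. The salary figures and the 30 (thousand dollar) cutoff are made up for illustration and are not from any real survey.

```python
from statistics import mean

# Hypothetical annual salaries, in thousands of dollars.
salaries = [18, 22, 25, 48, 60, 75, 95]

# Assume respondents earning under 30 decline to report their salary.
reported = [s for s in salaries if s >= 30]

print(mean(salaries))  # mean of everyone
print(mean(reported))  # mean of those who answered, which is higher
```

Because the low values are the ones that go missing, the mean of the reported values overstates the typical salary.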

A data value is missing completely at random if the likelihood of its being missing is independent of its value or any of the other values in the data set. That is, any data value is just as likely to be missing as any other data value.

A data value is missing not at random if the missing value is related to the reason that it is missing.

Data can be missing completely at random. For example, someone using a keyboard to manually enter the ages of survey respondents might make the mistake of failing to enter the age of 37 years. That data value is missing completely at random.

**Biased Results**

Based on the two definitions and examples above, it makes sense to conclude that if we ignore data missing completely at random, the remaining values are not likely to be biased and good results should be obtained. However, if we ignore data that are missing not at random, it is very possible that the remaining values are biased and results will be misleading.

**Correcting for Missing Data**

There are different methods for dealing with missing data. One very common method for dealing with missing data is to delete all subjects having any missing values. If the data are missing completely at random, the remaining values are not likely to be biased and good results can be obtained, but with a smaller sample size. If the data are missing not at random, deleting subjects having any missing values can easily result in a bias among the remaining values, so results can be misleading.

We can also impute missing data values, which means substituting values for them. There are different methods of determining the replacement values, such as using the mean of the other values, or using a randomly selected value from other similar cases, or using a method based on regression analysis.
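Of the replacement methods mentioned, using the mean of the other values is the simplest to sketch. The ages below are hypothetical, and `None` is used here as an assumed marker for a missing entry.

```python
from statistics import mean

# Hypothetical ages of survey respondents; None marks a missing entry.
ages = [34, None, 41, 29, None, 36]

# Mean of the values that were actually recorded.
observed = [a for a in ages if a is not None]
fill_value = mean(observed)

# Substitute the mean wherever a value is missing.
imputed = [fill_value if a is None else a for a in ages]

print(fill_value)  # 35
print(imputed)     # [34, 35, 41, 29, 35, 36]
```

This only makes sense when the data are plausibly missing completely at random; otherwise the mean of the observed values is itself biased.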

When analyzing sample data with missing values, try to determine why they are missing, then decide whether it makes sense to treat the remaining values as being representative of the population. If it appears that there are missing values that are missing not at random, know that the remaining data may well be biased and any conclusions based on those remaining values may well be misleading.

In an experiment, we apply some treatment and then proceed to observe its effects on the individuals. The individuals in experiments are called experimental units and they are often called subjects when they are people. In an observational study, we observe and measure specific characteristics, but we don’t attempt to modify the individuals being studied.

Experiments are often better than observational studies because well planned experiments typically reduce the chance of having the results affected by some variable that is not part of the study. A lurking variable is one that affects the variables included in the study, but it is not included in the study.

**Design of Experiments**

Good design of experiments includes replication, blinding, and randomization.

Replication is the repetition of an experiment on more than one individual. Good use of replication requires sample sizes that are large enough so that we can see effects of treatments.

Blinding is used when the subject doesn’t know whether he or she is receiving a treatment or a placebo. Blinding is a way to get around the placebo effect, which occurs when an untreated subject reports an improvement in symptoms.

Randomization is used when individuals are assigned to different groups through a process of random selection. The logic behind randomization is to use chance as a way to create two groups that are similar.
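A minimal sketch of randomization, assuming 20 hypothetical subjects split evenly into a treatment group and a placebo group. The subject labels and the fixed seed are illustrative choices, not part of any real study.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical subject labels.
subjects = [f"subject_{i}" for i in range(1, 21)]

# Shuffle a copy, then split it in half: chance alone decides the groups.
shuffled = subjects[:]
rng.shuffle(shuffled)
treatment = shuffled[:10]
placebo = shuffled[10:]

print(len(treatment), len(placebo))  # 10 10
```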

A simple random sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.

Unlike careless or haphazard sampling, random sampling usually requires very careful planning and execution.

**Simple Random Sample**

A sample of n subjects is selected so that every sample of the same size n has the same chance of being selected.

**Systematic Sample**

Select every kth subject.

**Convenience Sample**

Use data that are very easy to get.

**Stratified Sample**

Subdivide populations into strata or groups with the same characteristics, then randomly sample within those strata.

**Cluster Sample**

Partition the population into clusters or groups, randomly select some of those clusters, then choose all members of the selected clusters.
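The sampling methods above (other than convenience sampling) can be sketched with Python's standard `random` module. The population of 100 subject IDs, the odd/even strata, and the blocks of 10 used as clusters are all hypothetical choices made for illustration.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical subject IDs 1..100

# Simple random sample: every group of 10 subjects is equally likely.
simple = rng.sample(population, 10)

# Systematic sample: pick a random start, then take every kth subject.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sample: split into strata, then sample randomly within each.
strata = {
    "odd": [s for s in population if s % 2 == 1],
    "even": [s for s in population if s % 2 == 0],
}
stratified = [s for group in strata.values() for s in rng.sample(group, 5)]

# Cluster sample: split into clusters, randomly pick clusters,
# then keep every member of the chosen clusters.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [s for cluster in chosen_clusters for s in cluster]

print(len(simple), len(systematic), len(stratified), len(cluster_sample))
```

Note the difference in the last two: stratified sampling takes some members from every stratum, while cluster sampling takes every member from some clusters.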

**Multistage Sampling**

Professional pollsters and government researchers often collect data by using some combination of the preceding sampling methods. In a multistage sample design, pollsters select a sample in different stages, and each stage might use different methods of sampling.

In a cross sectional study, data are observed, measured, and collected at one point in time, not over a period of time.

In a retrospective study, data are collected from a past time period by going back in time.

In a prospective study, data are collected in the future from groups that share common factors.

**Experiments**

In an experiment, confounding occurs when we can see some effect, but we can’t identify the specific factor that caused it.

A randomized block design uses the same basic idea as stratified sampling, but randomized block designs are used when designing experiments, whereas stratified sampling is used for surveys.

**Matched Pairs Design**

Compare two treatment groups by using subjects matched in pairs that are somehow related or have similar characteristics.

**Rigorously Controlled Design**

Carefully assign subjects to different treatment groups, so that those given each treatment are similar in the ways that are important to the experiment. This can be extremely difficult to implement, and often we can never be sure that we have accounted for all of the relevant factors.

**Sampling Errors**

In statistics, you could use a good sampling method and do everything correctly, and yet it is possible to get wrong results. No matter how well you plan and execute the sample collection process, there is likely to be some error in the results.

A sampling error occurs when the sample has been selected with a random method, but there is a discrepancy between a sample result and the true population result. Such an error results from chance sample fluctuations.

A nonsampling error is the result of human error, including such factors as wrong data entries, computing errors, questions with biased wording, false data provided by respondents, forming biased conclusions, or applying statistical methods that are not appropriate for the circumstances.

A nonrandom sampling error is the result of using a sampling method that is not random, such as using a convenience sample or a voluntary response sample.
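A sampling error, the first of the three kinds above, can be seen in a small simulation. The population of the numbers 1 through 1000 and the sample size of 50 are hypothetical choices; the point is only that a properly random sample still differs from the population by chance.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: the whole numbers 1 through 1000.
population = list(range(1, 1001))

# A simple random sample of 50 values.
sample = rng.sample(population, 50)

# The discrepancy between the sample result and the population result.
sampling_error = mean(sample) - mean(population)

print(mean(population))  # 500.5
print(sampling_error)    # nonzero in general, purely from chance
```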

**The Gold Standard**

Randomization with placebo/treatment groups is sometimes called the gold standard because it is so effective.

List the elements in the set

{x|x is a natural number between 5 and 13}

{6,7,8,9,10,11,12}

List the elements in the set

{x|x is an integer between -1 and 1}

{0}
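Set-builder notation maps closely onto a Python comprehension. The sketch below assumes "between" excludes the endpoints, as the answers above imply; the search ranges are arbitrary but wide enough to cover each set.

```python
# {x | x is a natural number between 5 and 13}
naturals = [x for x in range(1, 20) if 5 < x < 13]

# {x | x is an integer between -1 and 1}
integers = [x for x in range(-5, 6) if -1 < x < 1]

print(naturals)  # [6, 7, 8, 9, 10, 11, 12]
print(integers)  # [0]
```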

Perform the exponentiation by hand

-5^2

= -25

Express 2*2*2*2*3*3*3*3*3 using exponents

= 2^4 * 3^5

Rewrite the expression using exponents

4*3*3*3

= 3^3 * 4^1

Perform the indicated operations

5^3 * 3^2

= 1125
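The exponent answers above can be checked in Python, where `**` is the exponent operator. Like the hand calculation, Python applies the exponent before the negative sign, so `-5**2` is the negative of 5 squared.

```python
negative_of_square = -5**2       # exponent first, then the negative sign
square_of_negative = (-5)**2     # parentheses change the meaning

product_a = 2**4 * 3**5          # 2*2*2*2*3*3*3*3*3
product_b = 3**3 * 4**1          # 4*3*3*3
product_c = 5**3 * 3**2

print(negative_of_square)        # -25
print(square_of_negative)        # 25
print(product_a == 2*2*2*2*3*3*3*3*3)  # True
print(product_b == 4*3*3*3)      # True
print(product_c)                 # 1125
```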

Write the number in scientific notation

0.0001

= 1 * 10^-4

Write the number in scientific notation

3000

= 3 * 10^3

Write the number in scientific notation

4,420,000

= 4.42 * 10^6
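Python's format specifier `e` writes a number in scientific notation, which gives a quick way to check the three answers above. In the output, `e-04` stands for "times 10 to the power -4".

```python
values = [0.0001, 3000, 4_420_000]

# ".2e" keeps two digits after the decimal point in the coefficient.
formatted = [f"{value:.2e}" for value in values]

for text in formatted:
    print(text)
# 1.00e-04
# 3.00e+03
# 4.42e+06
```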

A newspaper posted this question on its website: How often do you seek medical information online? Of 1072 internet users who chose to respond, 38 percent of them responded with frequently. What term is used to describe this type of survey in which the people surveyed consist of those who decided to respond?

- The respondents are a self selected sample
- The respondents are a voluntary response sample

What is wrong with this type of sampling method?

- Responses may not reflect the opinions of the general population
- Many people may choose not to respond to the survey

Determine whether the source given below has the potential to create a bias in a statistical study.

A certain medical organization tends to oppose the use of meat and dairy products in our diets, and that organization has received hundreds of thousands of dollars from an animal rights organization.

- There does appear to be a potential to create a bias. There is an incentive to produce results that are in line with the organization's creed and that of its founders.

An article noted that chocolate is rich in flavonoids. The article reports that regular consumption of foods rich in flavonoids may reduce the risk of coronary heart disease. The study received funding from a candy company and a chocolate manufacturers association. Identify and explain at least one source of bias in the study described.

- The researchers may have been more inclined to provide favorable results because funding was provided by a party with a definite interest. The bias could have been avoided if the researchers were not paid by the candy company and the chocolate manufacturers.

Determine whether the sampling method described below appears to be sound or is flawed.

In a survey of 572 subjects, each was asked how often he or she drank milk. The survey subjects were internet users who responded to a question that was posted on a news website.

- It is flawed because it is a voluntary response sample

Determine whether the sampling method described below appears to be sound or is flawed.

In a survey of 735 human resource professionals, each was asked about the importance of the experience of a job applicant. The survey subjects were randomly selected by pollsters from a reputable market research firm.

- It appears to be sound because the data are not biased in any way

Determine whether the results appear to have statistical significance, and also determine whether the results appear to have practical significance.

In a study of a birth sex selection method used to increase the likelihood of a baby being born female, 1929 users of the method gave birth to 946 males and 983 females. There is about a 21% chance of getting that many babies born female if the method had no effect.

- Does not have statistical significance
- Not many
- 51%
- Does not have practical significance
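The figures in this exercise can be checked with a short sketch. It assumes that, if the method had no effect, each baby is female with probability 0.5 and births are independent, and it reads "that many" as 983 or more females.

```python
from math import comb

births = 1929
females = 983

# Chance of 983 or more females in 1929 births if P(female) = 0.5.
# Exact integer counts avoid floating-point underflow for large n.
p_chance = sum(comb(births, i) for i in range(females, births + 1)) / 2**births

# Share of babies born female among users of the method.
share_female = females / births

print(round(p_chance, 2))
print(round(share_female * 100))  # 51
```

A probability near 21% is far above the 5% criterion, and 51% females is barely above the 50% expected by chance, which matches both conclusions above.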

In the data table below, the x-values are the weight of cars and the y-values are the corresponding highway fuel consumption amounts.

4034 3364 4179 3674 3599

26 32 28 29 30

Given the context of the car measurement data, what issue can be addressed by conducting a statistical analysis of the values?

- Is there a relationship or an association between the weight of a car and its fuel consumption amount?
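One way to start exploring that question is the linear correlation coefficient, a measure these notes do not cover and which is brought in here only as an illustration. The sketch uses the five pairs from the table; the units are not stated in the exercise, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

weights = [4034, 3364, 4179, 3674, 3599]  # x-values from the table
fuel = [26, 32, 28, 29, 30]               # y-values from the table

def pearson_r(xs, ys):
    """Linear correlation coefficient of paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(weights, fuel)
print(round(r, 3))  # -0.873
```

A value near -1 suggests that heavier cars in this small sample tend to have lower y-values, but five pairs are far too few to draw a firm conclusion.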

A magazine ran a survey about a website for downloading music. Readers could register their responses on the magazine’s website. Identify what is wrong.

- The sample is a voluntary response sample, so there is a good chance that the results do not reflect the population.

A polling company reported that 27 percent of 1013 surveyed adults said that cell phones are very harmful.

What is the exact value of 27% of 1013?

- = 273.51

Could the result from part A be the actual number of adults who said that cell phones are very harmful?

- No, the result from part A could not be the actual number of adults who said cell phones are very harmful because a count of people must result in a whole number.

What could be the actual number of adults who said that cellular phones are harmful?

- = 274 (just round)

Among the 1013 respondents, 406 said that cell phones are not at all harmful. What percentage said that cell phones are not harmful?

- = 40.08% (406/1013)*100
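The arithmetic in this cell phone exercise can be reproduced with a few lines of Python, using only the numbers given in the questions.

```python
respondents = 1013

# Exact value of 27% of 1013.
exact = 27 * respondents / 100
print(exact)               # 273.51

# A count of people must be a whole number, so this cannot be the actual count.
print(exact.is_integer())  # False

# The nearest whole number is a plausible actual count.
print(round(exact))        # 274

# Percentage who said cell phones are not at all harmful.
share_not_harmful = 406 / respondents * 100
print(round(share_not_harmful, 2))  # 40.08
```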

A polling company reported that 59% of 2302 surveyed adults said that they play basketball.

What is the exact value that is 59% of 2302?

- 1358.18 (.59*2302)

Could the result from part A be the actual number of adults who said they play basketball?

- No, the result from part A could not be the actual number of adults who said they play basketball because a count of people must result in a whole number.

What could be the actual number of adults who said they play basketball?

- = 1358

Among the 2302 respondents, 301 said that they only play hockey. What percentage of respondents said that they only play hockey?

- = 13.08% (301/2302)

Determine whether the data described below are qualitative or quantitative and explain why.

The types of food served by restaurants.

- The data are qualitative because they don’t measure or count anything.

State whether the data described below are discrete or continuous.

The populations of cities

- The data are discrete because the data can only take on specific values

Determine whether the given value is a statistic or a parameter

A homeowner measured the voltage supplied to his home on 5 days of a given week, and the average value is 147.6

- The given value is a statistic for the week because the data collected represent a sample

A particular country has 50 total states. If the areas of all 50 states are added and the sum is divided by 50, the result is 194,953 square kilometers. Determine whether this result is a statistic or a parameter.

- The result is a parameter because it describes some characteristic of a population

A parameter is a numerical measurement describing some characteristic of a population.

A statistic is a numerical measurement describing some characteristic of a sample.

State whether the data described below are discrete or continuous.

The numbers of people looking at a website at different times.

- The data are discrete because the data can only take on specific values

Determine whether the value given below is from a discrete or continuous data set.

When a car is randomly selected and weighed, it is found to weigh 1531.3 kg.

- A continuous data set because there are infinitely many possible values and those values cannot be counted.

Determine whether the value is from a discrete or continuous data set

Number of beats in a song is 5

- Discrete because it is countable

Nominal: categories only; the data cannot be arranged in an ordering scheme.

Ordinal: categories that can be ordered, but differences cannot be found or are meaningless.

Interval: differences are meaningful, but there is no natural zero starting point.

Ratio: there is a natural zero starting point and ratios are meaningful.

Determine which of the four levels of measurement is most appropriate for the data below

Body temperature in degrees Fahrenheit

- The interval level of measurement is most appropriate because the data can be ordered, differences can be found and are meaningful, and there is no natural starting zero point.

Determine which of the four levels of measurement is most appropriate

Favorite types of music

- Nominal

Determine which of the four levels of measurement is most appropriate

Ages of children: 5,6,7,8 and 9

- Ratio

Determine which of the four levels of measurement is most appropriate for the data below.

Volume of planets in cubic meters

- The ratio level of measurement is most appropriate because the data can be ordered, differences can be found and are meaningful, and there is a natural starting zero point.

Identify the level of measurement of the data, and explain what is wrong with the given calculation.

In a survey, the hair colors of respondents are identified as 10 for brown hair, 20 for blonde hair, 30 for black hair, and 40 for anything else. The average is calculated for 702 respondents and the result is 22.3

The data are at the _________ level of measurement

- Nominal

What is wrong with the given calculation?

- Such data are not counts or measures of anything, so it makes no sense to compute their average

Identify the level of measurement of the data, and explain what is wrong with the given calculation.

In a set of data, course grades are represented as 10 for A, 20 for B, and 30 for C. The average of the 769 course grades is 25.4.

The data are at the _____ level of measurement.

- Ordinal

What is wrong with the given calculation?

- Such data should not be used for calculations such as an average

Which of the following is associated with a parameter?

- Data that were obtained from an entire population. (A parameter is a numerical measurement describing some characteristic of a population. So, a parameter is associated with data that were obtained from an entire population.)

Which level of measurement consists of categories only where data cannot be arranged in an ordering scheme?

- Nominal. (The nominal level of measurement is characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme such as low to high.)

Determine whether the given description corresponds to an observational study or an experiment?

In a study of 405 men with a particular disease, the subjects were photographed daily.

- The given description corresponds to an observational study

In a double blind experiment designed to test the effectiveness of a new medication as a treatment for lower back pain, 1643 patients were randomly assigned to one of three groups.

What does it mean to say that the experiment was double blind?

- The subjects in the study did not know whether they were taking a placebo or the new medication, and those who administered the pills also did not know.

In a study designed to test the effectiveness of a medication as a treatment for lower back pain, 1643 patients were randomly assigned to one of three groups. In what specific way was replication applied in the study?

Replication is the repetition of an experiment on more than one individual.

- The group sample sizes are all large so the researchers could see the effects of the treatment.

Determine whether the description corresponds to an observational study or an experiment.

Fifty patients with lung cancer are divided into two groups. One group receives an experimental drug to fight cancer, the other a placebo. After two years the spread of the cancer is measured.

Does the description correspond to an observational study or an experiment?

- Experiment

Identify the type of sampling used in the situation below.

In a poll conducted by a certain researcher, 980 adults were called after their telephone numbers were randomly generated by a computer, and 61% were able to correctly identify the vice president.

- Random sampling

Identify which type of sampling is used: random, systematic, convenience, stratified, or cluster.

A magazine asks its readers to call in their opinion regarding the quality of the articles.

- Convenience sampling

Determine whether the study is an experiment or an observational study, and identify a major problem with the study.

In a survey, 1465 internet users chose to respond to this question posted on a newspaper's electronic edition: Is news online as satisfying as print and TV news? 52% of the respondents said yes.

- This is an observational study because the researchers do not attempt to modify the individuals.

What is a major problem with the study?

- This is a convenience sample with voluntary response, which has a high chance of leading to bias.

Determine whether the study is an experiment or an observational study, and then identify a major problem with the study.

A study involved 22071 male physicians. Based on random selections, 11037 of them were treated with aspirin and the other 11034 were given placebos. The study was stopped early because it became clear that aspirin reduced the risk of myocardial infarctions by a substantial amount.

This is an_______

- Experiment

Because the researchers_______

- Apply a treatment to the individuals

What is a major problem with the study?

- The results apply only to male physicians

Determine whether the study is an experiment or an observational study, and then identify a major problem with the study.

A medical researcher tested for a difference in systolic blood pressure levels between male and female students who are 12 years of age. She randomly selected four males and four females for her study.

- This is an observational study because the researcher does not attempt to modify the individuals

What is a major problem with the study?

- The sample is too small

_____ is used when subjects are assigned to different groups through a process of random selection.

Randomization is used when subjects are assigned to different groups through a process of random selection. The logic behind randomization is to use chance as a way to create two groups that are similar. Although it might seem that we should not leave anything to chance in experiments, randomization has been found to be an extremely effective method for assigning subjects to groups.

- Randomization

A study is conducted to measure children’s growth rates without any treatment applied to the children. What best classifies this study?

- Observational study (An observational study involves observing and measuring specific characteristics without attempting to modify the subjects being studied)

Which of the following corresponds to the case when every sample of size n has the same chance of being chosen?

- Simple random sample (A simple random sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen)