Final Exam STAT 200
1. True or False. Justify for full credit.
(a) If all the observations in a data set are identical, then the variance for this data set
is zero.
True. Variance measures how far the observations deviate from the mean. If all the data are
identical, the mean equals every observation, so every deviation is zero and hence the variance
is zero.
(b) If A and B are disjoint, P(A) = 0.4 and P(B) = 0.5, then P(A AND B) = 0.2.
False. For disjoint events, p(A and B) is always equal to zero.
(c) The mean is always equal to the median for a normal distribution.
True. The data for a normal distribution is symmetrical around the mean. The median
will thus be equal to the mean.
(d) A 95% confidence interval is wider than a 98% confidence interval of the same
parameter.
False. The reliability factor (critical value) associated with the 95% interval is smaller than that of
the 98% interval. The 95% interval is thus narrower.
(e) It’s easier to reject the null hypothesis in a hypothesis test at 0.05 significance
level than at 0.01 significance level.
True. The rejection region for 0.05 level is larger than that of the 0.01 level.
2. Choose the best answer. Justify for full credit.
(a) A study was conducted at a local college to analyze the average GPA of students who
graduated from UMUC in 2016. 100 students who graduated from UMUC in 2016 were randomly
selected, and the average GPA for the group is 3.5. The value 3.5 is a
(i) statistic
(ii) parameter
(iii) cannot be determined
Answer: (i) statistic. 3.5 is a sample mean, and values that describe samples are called statistics.
(b) Hotel ratings are usually on a scale from 0 stars to 5 stars. The level of this
measurement is
(i) interval
(ii) nominal
(iii) ordinal
(iv) ratio
Answer: (iii) ordinal. The ratings are categories that follow a meaningful order, but the differences
between ratings are not necessarily equal.
(c) On the day of the Virginia Primary Election, UMUC News Club organized an exit poll in which
three polling stations were randomly selected and all voters were surveyed as they left those
polling stations. This type of sampling is called:
(i) cluster
(ii) convenience
(iii) systematic
(iv) stratified
Answer: (i) cluster. The groups (polling stations) are selected at random and all individuals in each
selected group are sampled. This is characteristic of cluster sampling. Stratified sampling would
instead draw a random sample from every group.
3. A random sample of 100 students was chosen from UMUC STAT 200 classes. The
frequency distribution below shows the distribution for study time each week (in hours).
(Show all work. Just the answer, without supporting work, will receive no credit.)
(a) Complete the frequency table with frequency and relative frequency. Express the relative
frequency to two decimal places.
Relative frequency = frequency/ total
Frequency = relative frequency * total
Study Time (in hours)   Frequency               Relative Frequency
0.0 – 4.9               2                       = 2/100
5.0 – 9.9               13                      = 13/100
10.0 – 14.9             20                      = 20/100
15.0 – 19.9             = 0.45*100              0.45
20.0 – 24.9             = 100 – (2+13+20+45)    = 20/100
Total                   100                     = 100/100
Summarized table
Study Time (in hours)   Frequency   Relative Frequency
0.0 – 4.9               2           0.02
5.0 – 9.9               13          0.13
10.0 – 14.9             20          0.20
15.0 – 19.9             45          0.45
20.0 – 24.9             20          0.20
Total                   100         1.00
(b) What percentage of the study times was at least 15 hours?
Total relative frequency (≥ 15 hours) = 0.45 + 0.20 = 0.65
Percentage = 0.65*100% = 65%
(c) In what class interval must the median lie? 5.0 – 9.9, 10.0 -14.9, 15.0 – 19.9, or 20.0 – 24.9?
Why?
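15.0 – 19.9.
With 100 observations the median is the average of the 50th and 51st ordered values. The cumulative
frequencies are 2, 15, 35 and 80, so both values fall in the class 15.0 – 19.9.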
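The table completion and the median-class argument can be cross-checked with a short script. This is only a sketch: the variable names are illustrative, and the numbers are the ones given in the table above.

```python
# Known frequencies from the table; the 15.0-19.9 count comes from its
# relative frequency (0.45) and the last count from the total of 100.
total = 100
classes = ["0.0-4.9", "5.0-9.9", "10.0-14.9", "15.0-19.9", "20.0-24.9"]
freq = [2, 13, 20, round(0.45 * total), None]
freq[4] = total - sum(f for f in freq if f is not None)

rel_freq = [f / total for f in freq]

# Share of study times of at least 15 hours (the last two classes).
pct_at_least_15 = 100 * (rel_freq[3] + rel_freq[4])

# The median class is the first class whose cumulative frequency reaches n/2.
# Here the cumulative count jumps from 35 to 80, so it holds both the
# 50th and the 51st values.
cumulative = 0
median_class = None
for label, f in zip(classes, freq):
    cumulative += f
    if cumulative >= total / 2:
        median_class = label
        break

print(freq)
print(rel_freq)
print(pct_at_least_15, median_class)
```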
4. Answer each question based on the given information, and explain your answer in each case.
(a) What is the range in the grade distribution?
Range = highest value – lowest value
Range = 100-30 = 70
(b) Which of the following score bands has the most students?
(i) 30 - 50
(ii) 50 - 70
(iii) 85 - 100
(iv) Cannot be determined
We are certain that the bands 30 - 50 and 85 - 100 each contain 25% of the students. However, we cannot
know the exact share for 50 - 70. The answer thus cannot be determined from the box plot.
(c) How many students in the sample are in the score band between 65 and 100?
30 students
65 is the median and 100 is the maximum. 50% of values lie between the median and maximum.
Number of students = 50%*60 = 30
5. A basket contains 3 white balls, 2 yellow balls, and 5 red balls. Consider selecting one ball at a time
from the basket. (Show all work. Just the answer, without supporting work, will receive no credit.)
(a) Assuming the ball selection is with replacement. What is the probability that the first ball is white
and the second ball is also white?
With replacement, the ball selected is returned to the basket
= p( first ball is white) * p(second ball is white)
= 3/10 * 3/10
= 9/100
= 0.09
(b) Assuming the ball selection is without replacement. What is the probability that the first ball is
yellow and the second ball is red?
Without replacement, the ball selected is not returned to the basket
= p( first ball is yellow) * p(second ball is red)
= 2/10 * 5/9
= 10/90
= 1/9
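The two multiplications above can be checked with exact fractions. A minimal sketch using Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

white, yellow, red = 3, 2, 5
total = white + yellow + red

# With replacement: the basket is unchanged for the second draw.
p_white_white = Fraction(white, total) * Fraction(white, total)

# Without replacement: one ball is gone before the second draw.
p_yellow_then_red = Fraction(yellow, total) * Fraction(red, total - 1)

print(p_white_white, float(p_white_white))  # 9/100 0.09
print(p_yellow_then_red)                    # 1/9
```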
6. There are 1000 juniors in a college. Among the 1000 juniors, 300 students are taking STAT200,
and 150 students are taking PSYC300. There are 100 students taking both courses. Let S be the event
that a randomly selected student takes STAT200, and P be the event that a randomly selected student
takes PSYC300. (Show all work. Just the answer, without supporting work, will receive no credit.)
(a) Provide a written description of the complement event of (S OR P).
The complement of S OR P is the event that the student takes neither STAT 200 nor PSYC 300.
(b) What is the probability of complement event of (S OR P)?
P((S OR P)’) = 1- P(S OR P)
P(S OR P) = (number taking STAT 200 + number taking psyc 300 - number taking both courses)/
total number of students
P(S OR P) = ( 300 +150 -100)/1000 = 350/1000 = 0.35
P((S OR P)’) = 1- P(S OR P)
P((S OR P)’) = 1- 0.35 = 0.65
7. Consider rolling a fair 6-faced die twice. Let A be the event that the product of the two rolls is at
most 5, and B be the event that the first one is a multiple of 3.
(a) What is the probability that the product of the two rolls is at most 5 given that the first one is a
multiple of 3? Show all work. Just the answer, without supporting work, will receive no credit.
The sample space has 6*6 = 36 equally likely outcomes.
B (the first roll is a multiple of 3, i.e. 3 or 6) contains 2*6 = 12 outcomes, so P(B) = 12/36.
The only outcome in B whose product is at most 5 is (3,1), so P(A AND B) = 1/36.
P(A | B) = P(A AND B) / P(B) = (1/36) / (12/36) = 1/12 ≈ 0.083
(b) Are event A and event B independent? Explain.
The events are not independent.
For independent events,
P(A | B) = P(A)
A contains 10 outcomes: (1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (2,2), (3,1), (4,1), (5,1),
so P(A) = 10/36 ≈ 0.278.
In our case P(A | B) = 1/12 is not equal to P(A) = 10/36, so A and B are not independent.
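Because the sample space is small, the conditional probability P(A | B) = P(A AND B) / P(B) and the independence check can be verified by listing all 36 outcomes. A sketch with illustrative names:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls

A = {(a, b) for a, b in outcomes if a * b <= 5}  # product at most 5
B = {(a, b) for a, b in outcomes if a % 3 == 0}  # first roll is 3 or 6

p_A = Fraction(len(A), len(outcomes))
p_B = Fraction(len(B), len(outcomes))
p_A_and_B = Fraction(len(A & B), len(outcomes))
p_A_given_B = p_A_and_B / p_B

print(p_A, p_B, p_A_and_B, p_A_given_B)    # 5/18 1/3 1/36 1/12
print("independent:", p_A_given_B == p_A)  # independent: False
```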
8. Answer the following two questions. (Show all work. Just the answer, without supporting work,
will receive no credit).
(a) A bike courier needs to make deliveries at 6 different locations. How many different routes can he
take?
= 6!
= 6*5*4*3*2*1
= 720
(b) Mimi has eight books from the Statistics is Fun series. She plans on bringing three of the eight
books with her on a road trip. How many different ways can the three books be selected?
Order is not important
Number of ways = 8C3 = 8!/(3!*5!) = 56
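Both counts are available directly in Python's standard `math` module, which makes a quick check possible (a sketch, with illustrative names):

```python
import math

# (a) Orderings of 6 distinct delivery locations: 6!
routes = math.factorial(6)

# (b) Unordered selections of 3 books out of 8: 8C3
book_choices = math.comb(8, 3)

print(routes, book_choices)  # 720 56
```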
9. Let random variable x represent the number of heads when a fair coin is tossed three times.
(a) Construct a table describing the probability distribution.
n = 3
p = 0.5
P(x=0) = 3c0 * 0.5^3 * 0.5^0 = 0.125
P(x=1) = 3c1 * 0.5^2 * 0.5^1 = 0.375
P(x=2) = 3c2 * 0.5^1 * 0.5^2 = 0.375
P(x=3) = 3c3 * 0.5^0 * 0.5^3 = 0.125
x P(x)
0 0.125
1 0.375
2 0.375
3 0.125
(b) Determine the mean and standard deviation of x. (Round the answer to two decimal places)
Mean = ∑ x*P(x)
x       P(x)    x*P(x)
0       0.125   0
1       0.375   0.375
2       0.375   0.75
3       0.125   0.375
Total           1.5
Mean = 1.5
Standard deviation = √variance
Variance = ∑ x^2*P(x) - mean^2
x       P(x)    x^2*P(x)
0       0.125   0
1       0.375   0.375
2       0.375   1.5
3       0.125   1.125
Total           3
Variance = 3 – 1.5^2 = 0.75
Standard deviation = √variance = √0.75 = 0.866 ≈ 0.87
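The distribution table, mean and standard deviation can be reproduced from the binomial formula with n = 3 and p = 0.5. A minimal sketch; the names are illustrative:

```python
import math

n, p = 3, 0.5

# Binomial probabilities P(x) = nCx * p^x * (1 - p)^(n - x)
pmf = {x: math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

mean = sum(x * px for x, px in pmf.items())
variance = sum(x**2 * px for x, px in pmf.items()) - mean**2
std_dev = math.sqrt(variance)

print(pmf)                                # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
print(mean, variance, round(std_dev, 2))  # 1.5 0.75 0.87
```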
10. Mimi just started her tennis class three weeks ago. On average, she is able to return 20% of her
opponent’s serves. Assume her opponent serves 8 times.
(a) Let X be the number of the serves that Mimi returns. As we know, the distribution of X is a
binomial probability distribution. What is the number of trials (n), probability of successes (p) and
probability of failures (q), respectively?
n = 8
p = 0.2
q = 1 - p = 0.8
(b) Find the probability that that she returns at least 1 of the 8 serves from her opponent. (round the
answer to 3 decimal places) Show all work. Just the answer, without supporting work, will receive no
credit.
P(at least 1 of the 8 serves) = 1 - P(0 serves)
P(x=0) = 8c0 * 0.2^0 * 0.8^8 = 0.1678
P(at least 1 of the 8 serves) = 1 - 0.1678 = 0.8322 ≈ 0.832
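The complement rule used here is easy to confirm numerically. A sketch with illustrative names:

```python
import math

n, p = 8, 0.2
q = 1 - p

# P(X = 0): no serve is returned in 8 independent attempts.
p_zero = math.comb(n, 0) * p**0 * q**n

# Complement rule: P(X >= 1) = 1 - P(X = 0)
p_at_least_one = 1 - p_zero

print(round(p_zero, 4), round(p_at_least_one, 3))  # 0.1678 0.832
```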
11. The heights of pecan trees are normally distributed with a mean of 10 feet and a standard
deviation of 2 feet. Show all work. Just the answer, without supporting work, will receive no credit.
(a) What is the probability that a randomly selected pecan tree is between 7 and 11 feet tall? (round
the answer to 4 decimal places)
Z score = (X –u)/ std dev
Z score for x = 7
Z= ( 7-10)/2 = -1.5
Z score for x = 11
Z= ( 11-10)/2 = 0.5
P( -1.5< z < 0.5) = 0.6247
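Instead of a z table, the same area can be read from the normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=10, sigma=2)

# P(7 < X < 11) = CDF(11) - CDF(7), i.e. the area between z = -1.5 and z = 0.5
p_between = heights.cdf(11) - heights.cdf(7)

print(round(p_between, 4))  # 0.6247
```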
(b) Find the 40th percentile of the pecan tree height distribution. (round the answer to 2 decimal
places)
Z score associated with 40th percentile = -0.25 (from the standard normal table)
height(x) = z score * std dev + mean
height = -0.25 * 2 + 10 = 9.50 ft
(With the more precise z = -0.2533 the height is 9.49 ft.)
12. Based on the performance of all individuals who tested between July 1, 2012 and June 30, 2015,
the GRE Quantitative Reasoning scores are normally distributed with a mean of 152.47 and a standard
deviation of 8.93. (https://www.ets.org/s/gre/pdf/gre_guide_table1a.pdf). Show all work. Just the
answer, without supporting work, will receive no credit.
(a) Consider all random samples of 49 test scores. What is the standard deviation of the sample
means? (Round your answer to three decimal places)
SM = σ/√n
SM = 8.93/√49 = 8.93/7 = 1.2757 ≈ 1.276
(b) What is the probability that 49 randomly selected test scores will have a mean test score that is
greater than 150? (Round your answer to four decimal places)
Z statistic = ( x bar – u)/ SM
Z= (150 -152.47)/1.2757
Z= -1.94
P(Z> -1.94) = 0.9738
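The standard error and the tail probability can be cross-checked without a table. This sketch again relies on `statistics.NormalDist`. Because it does not round z to two decimals, its probability differs slightly in the fourth decimal place from the table-based value above.

```python
import math
from statistics import NormalDist

mu, sigma, n = 152.47, 8.93, 49

# Standard deviation of the sample mean (standard error)
standard_error = sigma / math.sqrt(n)

# Sampling distribution of the mean for samples of 49 scores
sample_means = NormalDist(mu, standard_error)
p_mean_above_150 = 1 - sample_means.cdf(150)

print(round(standard_error, 3))  # 1.276
print(p_mean_above_150)
```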
13. An insurance company checks police records on 600 randomly selected auto accidents and notes
that teenagers were at the wheel in 90 of them. Construct a 95% confidence interval estimate of the
proportion of auto accidents that involve teenage drivers. Show all work. Just the answer, without
supporting work, will receive no credit.
Confidence interval = phat ± z critical * √(phat*(1-phat)/n)
Point estimate = phat = 90/600 = 0.15
Z critical associated with 95% interval = 1.96
Confidence interval = 0.15 ± 1.96 * √(0.15*0.85/600)
Confidence interval = 0.15 ± 0.0286
Confidence interval = (0.1214, 0.1786), or about (0.12, 0.18)
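A sketch of the same interval in Python, assuming the usual large-sample (Wald) formula with the sample proportion in the standard error. The critical value comes from `NormalDist().inv_cdf` rather than a table, and the names are illustrative.

```python
import math
from statistics import NormalDist

x, n, confidence = 90, 600, 0.95
p_hat = x / n

# Two-sided critical value, about 1.96 for 95% confidence
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(round(margin, 4), round(lower, 3), round(upper, 3))  # 0.0286 0.121 0.179
```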
14. In a study designed to test the effectiveness of acupuncture for treating migraine, 100 patients
were randomly selected and treated with acupuncture. After one-month treatment, the number of
migraine attacks for the group had a mean of 2 and standard deviation of 1.5. Construct a 95%
confidence interval estimate of the mean number of migraine attacks for people treated with
acupuncture. Show all work. Just the answer, without supporting work, will receive no credit.
Confidence interval = sample mean ± z critical * σ/√n
Z critical associated with 95% interval = 1.96
Confidence interval = 2 ± 1.96 * 1.5/√100
= 2 ± 0.294
= (1.706, 2.294)
15. Mimi is interested in testing the claim that banana is the favorite fruit for more than 50% of the
adults. She conducted a survey on a random sample of 100 adults. 58 adults in the sample chose
banana as his / her favorite fruit.
Assume Mimi wants to use a 0.10 significance level to test the claim.
(a) Identify the null hypothesis and the alternative hypothesis.
Ho: p ≤ 0.5
Ha: p > 0.5
(b) Determine the test statistic. Show all work; writing the correct test statistic, without
supporting work, will receive no credit.
Z statistic = (p-hat – p)/√(p*(1-p)/n)
p-hat = 58/100 = 0.58
Z = (0.58 – 0.5)/√(0.5*0.5/100) = 0.08/0.05 = 1.6
(c) Determine the P-value for this test. Show all work; writing the correct P-value, without
supporting work, will receive no credit.
P value = p(z> 1.6) = 0.0548
(d) Is there sufficient evidence to support the claim that banana is the favorite fruit for more
than 50% of the adults.? Explain.
Yes. The p value is less than the 0.1 significance level. There is thus sufficient evidence to support
the claim that banana is the favorite fruit for more than 50% of the adults.
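The test statistic, P-value and decision can be reproduced in a few lines. This is a sketch of the one-proportion z test under the normal approximation, with illustrative names:

```python
import math
from statistics import NormalDist

x, n, p0, alpha = 58, 100, 0.5, 0.10
p_hat = x / n

# The standard error uses the hypothesized proportion p0.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Right-tailed test because Ha is p > 0.5
p_value = 1 - NormalDist().cdf(z)
reject_null = p_value < alpha

print(round(z, 2), round(p_value, 4), reject_null)  # 1.6 0.0548 True
```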
16. In a study of memory recall, 5 people were given 10 minutes to memorize a list of 20 words. Each
was asked to list as many of the words as he or she could remember both 1 hour and 24 hours later. The
result is shown in the following table.
Number of Words Recalled
Subject   1 hour later   24 hours later
1         14             12
2         18             15
3         11             9
4         13             12
5         12             12
Is there evidence to suggest that the mean number of words recalled after 1 hour exceeds the mean
recall after 24 hours? Assume we want to use a 0.05 significance level to test the claim.
(a) Identify the null hypothesis and the alternative hypothesis.
Ho: μd = 0
Ha: μd > 0
(where d = recall after 1 hour – recall after 24 hours)
(b) Determine the test statistic. Show all work; writing the correct test statistic, without supporting
work, will receive no credit.
1 hour later   24 hours later   d (1 hour later – 24 hours later)
14             12               2
18             15               3
11             9                2
13             12               1
12             12               0
Total                           8
Md (mean of d) = 8/5 = 1.6
Sd (sample standard deviation of d) = √(∑(d – Md)^2/(n – 1)) = √(5.2/4) = 1.1402
t = Md/(Sd/√n)
t = 1.6/(1.1402/√5)
t = 3.138
(c) Determine the P-value. Show all work; writing the correct P-value, without supporting work, will
receive no credit.
Df = n – 1 = 4
P value = P(t > 3.138) at 4 df ≈ 0.0175
(d) Is there sufficient evidence to support the claim that the mean number of words recalled after 1
hour exceeds the mean recall after 24 hours? Justify your conclusion.
Yes. The p value is less than the 0.05 significance level. There is thus sufficient evidence to
support the claim that the mean number of words recalled after 1 hour exceeds the mean recall
after 24 hours.
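The paired differences, their sample standard deviation and the t statistic can be recomputed with the standard library. The sketch below does not compute a P-value (the standard library has no t distribution). Instead it compares t with the one-tailed critical value 2.132 for 4 degrees of freedom at the 0.05 level, taken from a t table. The names are illustrative.

```python
import math
from statistics import mean, stdev

one_hour = [14, 18, 11, 13, 12]
day_later = [12, 15, 9, 12, 12]

# Paired differences: 1 hour later minus 24 hours later
d = [a - b for a, b in zip(one_hour, day_later)]

n = len(d)
mean_d = mean(d)
sd_d = stdev(d)  # sample standard deviation (n - 1 in the denominator)
t_stat = mean_d / (sd_d / math.sqrt(n))

# One-tailed critical value t(0.05, df = 4) from a t table
t_critical = 2.132

print(d, mean_d, round(sd_d, 4), round(t_stat, 3))
print("reject Ho:", t_stat > t_critical)
```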
17. In a pulse rate research, a simple random sample of 600 men results in a mean of 80 beats per
minute, and a standard deviation of 11.3 beats per minute. Based on the sample results, the researcher
concludes that the pulse rates of men have a standard deviation less than 12 beats per minutes. Use a
0.05 significance level to test the researcher’s claim.
(a) Identify the null hypothesis and alternative hypothesis.
Ho: σ ≥ 12
Ha: σ < 12
(b) Determine the test statistic. Show all work; writing the correct test statistic, without supporting
work, will receive no credit.
χ^2 = (n-1)*s^2/σ^2
χ^2 = (600-1) * 11.3^2/12^2 = 531.15
(c) Determine the P-value for this test. Show all work; writing the correct P-value, without supporting
work, will receive no credit.
This is a left-tailed test because Ha is σ < 12.
P value = lower tail probability of χ^2 = 531.15 at 599 df = 0.0217
(d) Is there sufficient evidence to support the researcher’s claim? Explain.
Yes. The p value is less than the 0.05 significance level. There is thus sufficient evidence to support
the claim that the pulse rates of men have a standard deviation less than 12 beats per minute.
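The test statistic is simple arithmetic, but the standard library has no chi-square CDF for the P-value. The sketch below therefore swaps in the Wilson-Hilferty normal approximation for the left-tail probability. That approximation is an assumption of this example, not the method used in the solution, although with 599 degrees of freedom it is very close to the exact value.

```python
from statistics import NormalDist

n, s, sigma0 = 600, 11.3, 12
df = n - 1

# Test statistic: (n - 1) * s^2 / sigma0^2
chi2 = df * s**2 / sigma0**2

# Wilson-Hilferty approximation (an approximation, not an exact CDF):
# (chi2/df)^(1/3) is roughly normal with mean 1 - 2/(9 df) and variance 2/(9 df).
k = 2 / (9 * df)
z = ((chi2 / df) ** (1 / 3) - (1 - k)) / k**0.5

# Left tail, because Ha is sigma < 12
p_value = NormalDist().cdf(z)

print(round(chi2, 2))  # 531.15
print(p_value)
```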
18. The UMUC MiniMart sells four different types of teddy bears. The manager reports that the four
types are equally popular. Suppose that a sample of 500 purchases yields observed counts of 150,
125, 105, and 120 for types 1, 2, 3, and 4, respectively.
Type 1 2 3 4
Number 150 125 105 120
Assume we want to use a 0.05 significance level to test the claim that the four types are equally
popular.
(a) Identify the null hypothesis and the alternative hypothesis.
Ho: The four types of teddy bears are equally popular
Ha: The four types of teddy bears are not equally popular
(b) Determine the test statistic. Show all work; writing the correct test statistic, without supporting
work, will receive no credit.
Categories   Observed (fo)   Expected (fe)    (fo-fe)^2/fe
1            150             500*0.25 = 125   (150-125)^2/125 = 5
2            125             500*0.25 = 125   (125-125)^2/125 = 0
3            105             500*0.25 = 125   (105-125)^2/125 = 3.2
4            120             500*0.25 = 125   (120-125)^2/125 = 0.2
Sum          500             500              8.4
χ^2 = ∑ (fo-fe)^2/fe = 8.4
(c) Determine the P-value. Show all work; writing the correct P- value, without supporting work, will
receive no credit.
Df = categories – 1 = 4-1 = 3
P value = upper tail probability value of X2 = 8.4 at 3 df = 0.0384
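The goodness-of-fit statistic and its P-value can be checked with the standard library alone. The upper-tail formula in this sketch is a closed form that holds only for 3 degrees of freedom. It is used here to avoid an external statistics package, and the names are illustrative.

```python
import math

observed = [150, 125, 105, 120]

# Under Ho every type is equally popular: 500 / 4 = 125 expected each.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper-tail probability of a chi-square variable with df = 3
# (closed form valid only for 3 degrees of freedom).
p_value = (math.erfc(math.sqrt(chi2 / 2))
           + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))

print(round(chi2, 1), round(p_value, 4))  # 8.4 0.0384
```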
(d) Is there sufficient evidence to support the manager’s claim that the four types are equally
popular? Justify your answer.
No. The p value is less than the 0.05 significance level. There is thus sufficient evidence to reject
the manager’s claim that the four types are equally popular.
19. A STAT 200 instructor believes that the average quiz score is a good predictor of final exam
score. A random sample of 10 students produced the following data where x is the average quiz
score and y is the final exam score.
x 80 93 50 60 100 40 85 70 75 85
y 70 96 50 70 96 38 83 65 77 87
(a) Find an equation of the least squares regression line. Show all work; writing the correct equation,
without supporting work, will receive no credit.
x (x-Mx) (x-Mx)^2 y (y-My) (y-My)^2 (x-Mx)(y-My)
80 6.2 38.44 70 -3.2 10.24 -19.84
93 19.2 368.64 96 22.8 519.84 437.76
50 -23.8 566.44 50 -23.2 538.24 552.16
60 -13.8 190.44 70 -3.2 10.24 44.16
100 26.2 686.44 96 22.8 519.84 597.36
40 -33.8 1142.44 38 -35.2 1239.04 1189.76
85 11.2 125.44 83 9.8 96.04 109.76
70 -3.8 14.44 65 -8.2 67.24 31.16
75 1.2 1.44 77 3.8 14.44 4.56
85 11.2 125.44 87 13.8 190.44 154.56
Total 738 3259.6 732 3205.6 3101.4
Mean 73.8 73.2
Mx = 73.8
My = 73.2
b1 = SSxy / SSx = 3101.4/ 3259.6 = 0.9515
bo = My - b1 *Mx = 73.2 - 0.9515* 73.8 = 2.9818
Equation is
ŷ = 0.9515x + 2.9818
(b) Based on the equation from part (a), what is the predicted final exam score if the average quiz
score is 90? Show all work and justify your answer.
ŷ = 0.9515x + 2.9818
ŷ = 0.9515*90 + 2.9818
ŷ = 88.62
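The slope, intercept and prediction can be recomputed directly from the data with the same sums of squares. This is a sketch with illustrative names. Because it keeps the coefficients unrounded, the prediction comes out as about 88.61, while the coefficients rounded to four decimals give 88.62.

```python
x = [80, 93, 50, 60, 100, 40, 85, 70, 75, 85]
y = [70, 96, 50, 70, 96, 38, 83, 65, 77, 87]

mx = sum(x) / len(x)
my = sum(y) / len(y)

# Sums of squares and cross-products around the means
ss_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
ss_x = sum((xi - mx) ** 2 for xi in x)

b1 = ss_xy / ss_x    # slope
b0 = my - b1 * mx    # intercept


def predict(quiz_avg):
    """Predicted final exam score for a given average quiz score."""
    return b0 + b1 * quiz_avg


print(round(b1, 4), round(b0, 4))  # 0.9515 2.9818
print(round(predict(90), 2))       # 88.61
```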
20. A study of 10 different weight loss programs involved 200 subjects. Each of the 10 programs had
20 subjects in it. The subjects were followed for 12 months. Weight change for each subject was
recorded. We want to test the claim that the mean weight loss is the same for the 10 programs.
(a) Complete the following ANOVA table with sum of squares, degrees of freedom, and mean square
(Show all work):
Source of Variation   Sum of Squares (SS)   Degrees of Freedom (df)   Mean Square (MS)
Factor (Between)      65.4                  9                         7.2667
Error (Within)        587.65                190                       3.0929
Total                 653.05                199                       N/A
SSwithin = SStotal –SSbetween = 653.05 -65.4 = 587.65
Df between = groups – 1 = 10 -1 = 9
Df within = Df total – Df between = 199 -9 = 190
MS between = SSbetween / df between = 65.4/9= 7.2667
MS within = SS within / df within = 587.65/ 190 = 3.0929
(b) Determine the test statistic. Show all work; writing the correct test statistic, without supporting
work, will receive no credit.
F statistic = MS between/ Ms within = 7.2667/ 3.0929 = 2.349
(c) Determine the P-value. Show all work; writing the correct P-value, without supporting work, will
receive no credit.
Df = (9,190)
P value associated with ( F(9, 190) = 2.349 ) = 0.0155
(d) Is there sufficient evidence to support the claim that the mean weight loss is the same for the 10
programs at the significance level of 0.05? Explain.
No. The p value is less than the 0.05 significance level. There is thus sufficient evidence to reject
the claim that the mean weight loss is the same for the 10 programs at the significance level of
0.05.
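The ANOVA table entries and the F statistic follow mechanically from the group counts and two sums of squares. The sketch below assumes that SS between = 65.4 and SS total = 653.05 are the values supplied in the original table, as the worked solution implies. It stops at the F statistic because the standard library has no F distribution for the P-value.

```python
groups, per_group = 10, 20
ss_between, ss_total = 65.4, 653.05  # assumed to be the given table entries

n_total = groups * per_group
df_between = groups - 1
df_total = n_total - 1
df_within = df_total - df_between

ss_within = ss_total - ss_between
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(df_between, df_within, df_total)           # 9 190 199
print(round(ss_within, 2))                       # 587.65
print(round(ms_between, 4), round(ms_within, 4)) # 7.2667 3.0929
print(round(f_stat, 3))                          # 2.349
```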