Exercises - Central Limit Theorem

  1. Compare the probability distribution for rolling a single 6-sided die to the probability distribution for the mean of two 6-sided dice (draw the histograms).

    The distribution for rolling a single 6-sided die is uniform, while the distribution for the mean of two 6-sided dice is triangular: unimodal and symmetric about $3.5$, so already closer to a bell shape than the uniform distribution. It has the same mean of $3.5$ but a smaller standard deviation (about $1.21$ versus about $1.71$).
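The comparison can be checked by enumerating all outcomes. The following is a minimal sketch in Python (standard library only); the names `single`, `mean_of_two`, and `mean_and_sd` are illustrative and not part of the exercise, and the text histogram is only a rough stand-in for a drawn one.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Single die: each face has probability 1/6 (uniform).
single = {face: Fraction(1, 6) for face in faces}

# Mean of two dice: enumerate all 36 equally likely ordered pairs.
counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))
mean_of_two = {m: Fraction(c, 36) for m, c in sorted(counts.items())}


def mean_and_sd(dist):
    """Return (mean, standard deviation) of a discrete distribution."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return float(mu), float(var) ** 0.5


for name, dist in (("single die", single), ("mean of two dice", mean_of_two)):
    mu, sd = mean_and_sd(dist)
    print(f"{name}: mean = {mu:.4f}, sd = {sd:.4f}")
    for x, p in dist.items():
        # Crude text histogram: one '#' per 1/36 of probability.
        print(f"  {float(x):>4.1f} {'#' * int(p * 36)}")
```

The flat row of bars for one die and the rising-then-falling bars for the mean of two dice show the change in shape directly.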

  2. A survey found that the American family generates an average of 17.2 pounds of glass garbage each year. Assume the standard deviation of the distribution is 2.5 pounds.

    1. Find the probability that the mean of a sample of 55 families will be between 17 and 18 pounds.

    2. Why can the central limit theorem be applied?

    1. For the distribution of sample means, $\mu = 17.2$, while $\sigma = 2.5/\sqrt{55} = 0.3371$. We want $P(17 \lt \overline{x} \lt 18)$, so we find $z_{17} = (17-17.2)/0.3371 = -0.5933$ and $z_{18} = (18-17.2)/0.3371 = 2.3732$. The related probability $P(-0.5933 \lt z \lt 2.3732) = 0.7147$ is our answer.

    2. We are considering a distribution of sample means. The population is not stated to be normal, but the sample size $55$ is greater than $30$, so by the Central Limit Theorem we can approximate this distribution of sample means as a normal distribution.
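The arithmetic for this exercise can be reproduced with technology. This is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 17.2, 2.5, 55

# Standard deviation of the distribution of sample means.
standard_error = sigma / sqrt(n)

# z-scores for the two endpoints.
z_low = (17 - mu) / standard_error
z_high = (18 - mu) / standard_error

# Probability that the sample mean falls between 17 and 18 pounds,
# using the normal approximation justified by the sample size.
sample_means = NormalDist(mu, standard_error)
prob = sample_means.cdf(18) - sample_means.cdf(17)

print(f"standard error = {standard_error:.4f}")
print(f"z_17 = {z_low:.4f}, z_18 = {z_high:.4f}")
print(f"P(17 < mean < 18) = {prob:.4f}")
```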

  3. The average teacher's salary in New Jersey is $\$52,174$. Suppose that the distribution is normal with standard deviation $\$7500$.

    1. What is the probability that a randomly selected teacher makes less than $\$50,000$ per year?

    2. If we sample 100 teachers' salaries, what is the probability that the sample mean is less than $\$50,000$ per year?

    3. Why is the probability in part (a) higher than the probability in part (b)?

    1. $\mu = 52174$ and $\sigma = 7500$. Finding $z_{50,000} = (50000 - 52174)/7500 = -0.2899$, we seek $P(x \lt 50000) = P(z \lt -0.2899) = 0.3860$.

    2. In the distribution of sample means of size $100$, we have $\mu = 52174$, while $\sigma = 7500/\sqrt{100} = 750$. So, we find $z_{50,000} = (50000 - 52174)/750 = -2.8987$, and calculate $P(\overline{x} \lt 50000)$ as $P(z \lt -2.8987) = 0.0019$.

    3. The distribution of sample means has a smaller standard deviation ($7500/\sqrt{100} = 750$) than the population ($7500$), so it is narrower than the distribution for the population -- leaving less area (and hence probability) in the tails.
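Placing the two calculations side by side makes the narrowing concrete. A minimal sketch, assuming Python 3.8+ and `statistics.NormalDist`; the names `population` and `sample_means` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 52174, 7500, 100

population = NormalDist(mu, sigma)               # individual salaries
sample_means = NormalDist(mu, sigma / sqrt(n))   # means of samples of size 100

# Same cut-off, two different distributions.
p_single = population.cdf(50000)
p_mean = sample_means.cdf(50000)

print(f"P(x < 50000)     = {p_single:.4f}")
print(f"P(x-bar < 50000) = {p_mean:.4f}")
```

The only change between the two lines is the standard deviation, which shrinks by a factor of $\sqrt{100} = 10$.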

  4. Assume SAT scores are normally distributed with mean 1518 and standard deviation 325.

    1. If one SAT score is randomly selected, find the probability that it is between 1440 and 1480.

    2. If 16 SAT scores are randomly selected, find the probability that they have a mean between 1440 and 1480.

    3. Why can the central limit theorem be used in part (b) even though the sample size does not exceed 30?

    1. $\mu = 1518$ and $\sigma = 325$. Finding $z_{1440} = (1440-1518)/325 = -0.2400$ and $z_{1480} = (1480-1518)/325 = -0.1169$, we calculate $P(1440 \lt x \lt 1480)$ as $P(-0.2400 \lt z \lt -0.1169) = 0.0483$.

    2. In the distribution of sample means of size $16$, we have $\mu = 1518$, while $\sigma = 325/\sqrt{16} = 81.25$. Finding $z_{1440} = (1440-1518)/81.25 = -0.96$ and $z_{1480} = (1480 - 1518)/81.25 = -0.4677$, we calculate $P(1440 \lt \overline{x} \lt 1480)$ as $P(-0.96 \lt z \lt -0.4677) = 0.1515$.

    3. The Central Limit Theorem tells us that the distributions of the sample means tend towards a normal distribution as the sample size increases. In this case, the original population distribution was already normally distributed, so all of the distributions of sample means must already be normal.
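Parts (a) and (b) differ only in the sample size, so one helper covers both. A minimal sketch, assuming Python 3.8+ and a normally distributed population; `prob_mean_between` is a hypothetical helper name, not something defined in the exercise.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_between(mu, sigma, low, high, n=1):
    """P(low < sample mean < high) for samples of size n from Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma / sqrt(n))
    return dist.cdf(high) - dist.cdf(low)


p_one = prob_mean_between(1518, 325, 1440, 1480)            # a single score
p_sixteen = prob_mean_between(1518, 325, 1440, 1480, n=16)  # mean of 16 scores

print(f"one score:      {p_one:.4f}")
print(f"mean of 16:     {p_sixteen:.4f}")
```

With `n=1` the helper reduces to the ordinary single-value calculation, since $\sigma/\sqrt{1} = \sigma$.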

  5. The lengths of pregnancies are normally distributed with a mean of 268 days and a standard deviation of 15 days.

    1. If one pregnant woman is randomly selected, find the probability that her length of pregnancy is less than 260 days.

    2. If 25 pregnant women are put on a special diet just before they become pregnant, find the probability that their lengths of pregnancy have a mean that is less than 260 days (assuming that the diet has no effect).

    3. If the 25 women do have a mean of less than 260 days, does it appear that the diet has an effect on the length of pregnancy, and should the medical supervisors be concerned?

    1. $\mu = 268$ and $\sigma = 15$. Finding $z_{260} = (260-268)/15 = -0.5333$, we calculate $P(x \lt 260)$ as $P(z \lt -0.5333) = 0.2969$.

    2. In the distribution of sample means of size $25$, we have $\mu = 268$, while $\sigma = 15/\sqrt{25} = 3$. Finding $z_{260} = (260 - 268)/3 = -2.6667$, we calculate $P(\overline{x} \lt 260)$ as $P(z \lt -2.6667) = 0.0038$.

    3. Seeing a sample like this (i.e., with a mean of less than 260 days) is clearly a rare event ($0.0038$ is less than one percent). So if the one and only sample we found had this mean pregnancy length, it casts doubt on whether the mean for these women is still $268$ days (much like seeing the incredibly rare event of 99 out of 100 coin flips resulting in heads casts doubt on your belief that the coin flipped is fair). The only thing that separates these women from the general population is their special diet -- so yes, it appears the diet had an effect on the length of their pregnancy. Medical supervisors should be concerned.
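How rare such a sample is can also be illustrated by simulation. The sketch below assumes Python 3.8+ and that the diet has no effect (so pregnancy lengths follow the stated normal distribution); the seed, the number of trials, and all variable names are arbitrary choices made for illustration.

```python
from statistics import NormalDist, fmean

population = NormalDist(268, 15)
n, trials = 25, 20_000

# Draw all values at once (fixed seed so the run is repeatable),
# then split them into consecutive samples of 25 women each.
values = population.samples(n * trials, seed=1)
sample_means = [fmean(values[i:i + n]) for i in range(0, n * trials, n)]

# Fraction of simulated samples whose mean is below 260 days.
simulated = sum(m < 260 for m in sample_means) / trials

# Probability from the distribution of sample means, for comparison.
exact = NormalDist(268, 15 / n ** 0.5).cdf(260)

print(f"simulated fraction: {simulated:.4f}")
print(f"normal calculation: {exact:.4f}")
```

The simulated fraction should land near the calculated probability, though it will vary somewhat with the seed and the number of trials.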

  6. Assume that a test has a mean score of 75 and a standard deviation of 10. Assume the distribution of scores is approximately normal.

    1. What is the probability that a person chosen at random will make 100 or above on the test?

    2. What score should be used to identify the top 2.5%?

    3. In a group of 100 people, how many would you expect to score below 60?

    4. What is the probability that the mean of a group of 100 will score below 70?

    1. $\mu = 75$ and $\sigma = 10$. Finding $z_{100} = (100-75)/10 = 2.5$, we calculate $P(x \gt 100)$ as $P(z \gt 2.5) = 0.0062$.

    2. Note that the top $2.5\%$ corresponds to $0.025$ in area right of some $z$-score. But then the area left of this $z$-score is $1-0.025 = 0.975$. Using a table or technology, we find this corresponds to $z = 1.960$. Recalling that a $z$-score is a number of standard deviations away from the mean (with positive $z$-scores associated with being to the right of the mean and negative ones being to the left of the mean), the cut-off test score we seek is $\mu + z\sigma = 75 + (1.960)(10) = 94.6$.

    3. Note, this problem does NOT ask about an average score of the 100 people -- so we are NOT looking at the distribution of sample means. Instead, we simply find the probability that a score is below $60$ and then multiply by $100$. Note $\mu = 75$ and $\sigma = 10$, so finding $z_{60} = (60-75)/10 = -1.5$, we calculate $P(x \lt 60)$ as $P(z \lt -1.5) = 0.0668$. Finally, multiplying by $100$ we get the expected number in a group of $100$ people to do this poorly -- namely, about 7 people.

    4. This problem IS asking about the mean of a group of $100$, so we ARE talking about the distribution of sample means. Thus, for the distribution of sample means, $\mu = 75$, while $\sigma = 10/\sqrt{100} = 1$. Finding $z_{70} = (70 - 75)/1 = -5$, we calculate $P(\overline{x} \lt 70)$ as $P(z \lt -5) \approx 2.9 \times 10^{-7}$, which is very, very small!
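All four parts of this exercise can be checked with technology. A minimal sketch, assuming Python 3.8+ and `statistics.NormalDist`; the variable names are illustrative. Note which parts use the distribution of individual scores and which use the distribution of sample means.

```python
from math import sqrt
from statistics import NormalDist

scores = NormalDist(75, 10)                     # individual test scores
group_means = NormalDist(75, 10 / sqrt(100))    # means of groups of 100

# Part 1: probability that one person scores 100 or above.
p_100_plus = 1 - scores.cdf(100)

# Part 2: score with 97.5% of the area to its left (top 2.5% cut-off).
cutoff = scores.inv_cdf(0.975)

# Part 3: expected number out of 100 individuals scoring below 60.
expected_below_60 = 100 * scores.cdf(60)

# Part 4: probability that the mean of a group of 100 is below 70.
p_mean_below_70 = group_means.cdf(70)

print(f"P(x >= 100)        = {p_100_plus:.4f}")
print(f"top 2.5% cut-off   = {cutoff:.1f}")
print(f"expected below 60  = {expected_below_60:.2f}")
print(f"P(x-bar < 70)      = {p_mean_below_70:.2e}")
```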