Exercises - Descriptive Statistics

  1. Given the following data: 100, 95, 95, 90, 85, 75, 65, 60, 55. Find the median, mean, and mode. Is there a most appropriate measure?

    median = 85, mean = 80, mode = 95. The mode is certainly inappropriate, since 95 lies near the top of the data rather than at its center. Beyond that, the data set is too small to know whether the mean or the median is more appropriate.
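
The three values above can be checked with a short Python sketch that uses the standard-library `statistics` module. The variable names are illustrative and not part of the exercise.

```python
import statistics

data = [100, 95, 95, 90, 85, 75, 65, 60, 55]

median = statistics.median(data)     # middle value of the sorted data
mean = statistics.mean(data)         # arithmetic average
modes = statistics.multimode(data)   # every value tied for most frequent

print(median, mean, modes)
```

`multimode` returns a list, which makes it easy to see whether a data set has one mode or several.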

  2. Make a sketch of the following, indicating the approximate locations for the mean, median and mode:

    1. a normal distribution
    2. a skewed distribution
    3. a rectangular distribution


    See notes.

  3. Given the data set: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, $x$. Find the smallest positive integer value for $x$ such that $x$ is an outlier. Find the value for both definitions.

    Experiment to find these values.
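
One way to run the experiment is a brute-force search in Python. This is only a sketch under stated assumptions: the two definitions are taken to be the three-standard-deviation rule and the 1.5 × IQR rule, $x$ is included when the mean, standard deviation, and quartiles are computed, the sample standard deviation is used, and the quartiles are the medians of the lower and upper halves with the overall median excluded. The function names are hypothetical.

```python
import statistics

BASE = list(range(1, 11))  # the fixed values 1, 2, ..., 10


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves.

    Assumption: when n is odd, the overall median belongs to neither half.
    """
    s = sorted(values)
    half = len(s) // 2
    return statistics.median(s[:half]), statistics.median(s[-half:])


def is_outlier_sd(x):
    """Three-standard-deviation rule, with x included in the data."""
    data = BASE + [x]
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1)
    return abs(x - mean) > 3 * sd


def is_outlier_iqr(x):
    """1.5 * IQR rule, with x included when the quartiles are computed."""
    data = BASE + [x]
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    return x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr


def smallest_outlier(test, limit=1000):
    """Try x = 1, 2, 3, ... and return the first x that the test flags."""
    for x in range(1, limit + 1):
        if test(x):
            return x
    return None


smallest_sd = smallest_outlier(is_outlier_sd)
smallest_iqr = smallest_outlier(is_outlier_iqr)
print(smallest_sd, smallest_iqr)
```

Under these assumptions the search returns 101 for the three-standard-deviation rule and 19 for the 1.5 × IQR rule. A different quartile convention, or the population standard deviation, would change these values.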

  4. Given the following set of golf scores: 67, 70, 72, 74, 76, 76, 78, 80, 82, 85. Find the median, mean, mode, and standard deviation. What percentage of the scores lie within one standard deviation of the mean?

    median = 76, mean = 76, mode = 76, standard deviation = 5.52 (the sample standard deviation). Six of the 10 scores lie between 70.48 (76 - 5.52) and 81.52 (76 + 5.52), so 60% of the scores in this data set are within one standard deviation of the mean.
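
The arithmetic for the golf scores can be reproduced with the sketch below. It assumes the sample standard deviation (dividing by $n - 1$), which is what gives 5.52; the variable names are illustrative.

```python
import statistics

scores = [67, 70, 72, 74, 76, 76, 78, 80, 82, 85]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)            # sample standard deviation (n - 1)
low, high = mean - sd, mean + sd         # one standard deviation either side

inside = [s for s in scores if low <= s <= high]
percent_inside = 100 * len(inside) / len(scores)

print(statistics.median(scores), mean, statistics.mode(scores))
print(round(sd, 2), inside, percent_inside)
```

Using `statistics.pstdev` instead would give the population standard deviation, which is slightly smaller.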

  5. Give at least five uses and at least five misuses of statistics.

  6. Give the four levels or categories of data and give an example of each.

  7. What proportion of the data does Chebyshev's Theorem guarantee is within three standard deviations of the mean? Compare this result to the empirical rule. Why are there differences?

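    With $k = 3$ in the formula $1 - 1/k^2$, $k^2 = 9$, so $1 - 1/9 = 8/9$. Thus at least $8/9$ (about $88.9\%$) of the data is guaranteed to be within three standard deviations of the mean. For normal (bell-shaped) data, the empirical rule predicts about $99.7\%$ of the data within this range. The results differ because Chebyshev's Theorem holds for every distribution, so its bound has to be conservative, while the empirical rule applies only to approximately normal data.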
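
The comparison can be made concrete with a small Python sketch. The function names are hypothetical. The normal-distribution fraction is computed from the error function, which is where the empirical rule's percentages come from.

```python
import math


def chebyshev_bound(k):
    """Minimum fraction of any data set within k standard deviations (k > 1)."""
    return 1 - 1 / k**2


def normal_fraction(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (2, 3):
    print(k, round(chebyshev_bound(k), 4), round(normal_fraction(k), 4))
```

For each $k$ the Chebyshev value is a guaranteed minimum for any data set, while the normal value is what bell-shaped data is expected to show.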

  8. Given the following grades on a test: 86, 92, 100, 93, 89, 95, 79, 98, 68, 62, 71, 75, 88, 86, 93, 81, 100, 86, 96, 52

    1. Make a stem-and-leaf plot to represent this data.

      10 | 0 0
       9 | 2 3 3 5 6 8 
       8 | 1 6 6 6 8 9 
       7 | 1 5 9
       6 | 2 8
       5 | 2
      

    2. Find the mode, median, mean, range, standard deviation, and interquartile range.

      mode = 86, median = 87, mean = 84.5, range = 48 (from 52 to 100), standard deviation = 13.17, interquartile range (IQR) = 17 (from Q1 = 77 to Q3 = 94)

    3. What percentage of scores lie within one standard deviation of the mean? Within two standard deviations?

      13 out of 20 scores lie between 71.33 and 97.67, so 65% lie within one standard deviation of the mean; 19 out of 20, or 95%, lie within two standard deviations of the mean.

    4. Are there any outliers? Explain clearly.

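      Under the three-standard-deviation test, a score is an outlier only if it falls outside the interval from 44.99 to 124.01 (the mean plus or minus three standard deviations), and no score does. The 1.5 × IQR test agrees: with Q1 = 77, Q3 = 94, and IQR = 17, the fences are 51.5 and 119.5, and the lowest score, 52, lies inside them. There are no outliers.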
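
The whole of exercise 8 can be checked with the Python sketch below. It assumes the sample standard deviation and takes the quartiles as the medians of the lower and upper halves of the sorted data. The helper names are hypothetical.

```python
import statistics
from collections import defaultdict

grades = [86, 92, 100, 93, 89, 95, 79, 98, 68, 62,
          71, 75, 88, 86, 93, 81, 100, 86, 96, 52]


def stem_and_leaf(values):
    """Return the plot as text lines, largest stem first (stem = tens, leaf = ones)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    return [f"{stem:>2} | {' '.join(map(str, leaves[stem]))}"
            for stem in sorted(leaves, reverse=True)]


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    s = sorted(values)
    half = len(s) // 2
    return statistics.median(s[:half]), statistics.median(s[-half:])


def count_within(values, center, radius):
    """Number of values no farther than radius from center."""
    return sum(1 for v in values if abs(v - center) <= radius)


mean = statistics.mean(grades)
sd = statistics.stdev(grades)            # sample standard deviation (n - 1)
q1, q3 = quartiles(grades)
iqr = q3 - q1

print("\n".join(stem_and_leaf(grades)))
print(statistics.mode(grades), statistics.median(grades), mean)
print(max(grades) - min(grades), round(sd, 2), q1, q3, iqr)
for k in (1, 2, 3):
    print(k, count_within(grades, mean, k * sd))   # scores within k SDs
print(q1 - 1.5 * iqr, q3 + 1.5 * iqr)              # 1.5 * IQR fences
```

Any score outside the printed fences, or more than three standard deviations from the mean, would count as an outlier under the corresponding test.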

  9. What is an experimental design and why is it important? Describe a completely randomized experimental design and a rigorously controlled design.

    See the notes.

  10. Given this sample of freshman GPA scores:

    1. Is there an outlier? (Check both tests and explain)

      1.8 is NOT an outlier under either test: 1) it is not below 1.35 (the mean minus three standard deviations), and 2) it is not below 1.425 (Q1 minus 1.5 times the IQR of 0.95).

    2. Draw a frequency histogram using 5 to 6 categories. Be consistent with the rules for making histograms.

      One suitable set of class boundaries is 1.65 to 2.05; 2.05 to 2.45; 2.45 to 2.85; 2.85 to 3.25; 3.25 to 3.65; 3.65 to 4.05, as shown below, although a variety of answers will work.

    3. Is the distribution significantly skewed?

      Pearson's Index gives a value of $-0.95$, which lies between $-1$ and $1$, so the distribution is not significantly skewed (although it looks skewed in the histogram).

    4. What percentage of scores is within one standard deviation of the mean? two standard deviations? three standard deviations?

      62.5%, 95.8%, 100%

    5. Are your findings consistent with the minimum amount of data within two standard deviations guaranteed by Chebyshev's Theorem?

      Yes. Chebyshev's Theorem guarantees that at least 3/4, or 75%, of the data lies within two standard deviations of the mean. The data set had 95.8%, more than the 75% minimum.
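
The skewness check in part 3 can be sketched in Python. Two assumptions apply: Pearson's index is taken to be $3(\text{mean} - \text{median})/s$ with $s$ the sample standard deviation, and, because the GPA sample is not reproduced here, the test grades from exercise 8 are used purely as a stand-in. The function name is hypothetical.

```python
import statistics


def pearson_index(values):
    """Pearson's index of skewness: 3 * (mean - median) / sample SD."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    return 3 * (mean - median) / statistics.stdev(values)


# Stand-in data: the test grades from exercise 8, NOT the GPA sample.
grades = [86, 92, 100, 93, 89, 95, 79, 98, 68, 62,
          71, 75, 88, 86, 93, 81, 100, 86, 96, 52]

index = pearson_index(grades)
significantly_skewed = abs(index) >= 1   # usual cutoff: index beyond -1 or +1

print(round(index, 2), significantly_skewed)
```

A negative index means the mean lies below the median, which points to a longer tail on the left.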