SECTION 1D Summary
• A numerical summary of a distribution of quantitative data should include
measures of center and variability.
• The mean and the median measure the center of a distribution in different
ways. The median is the midpoint of the distribution, the number such
that about half the observations are smaller and half are larger. The mean
is the average of the observations. In symbols, the sample mean is given by
$\bar{x} = \frac{\sum x_i}{n}$.
• A statistic is a number that describes a sample. A parameter is a number that
describes a population. We often use statistics (like the sample mean $\bar{x}$) to
estimate parameters (like the population mean $\mu$).
• The simplest measure of variability for a distribution of quantitative data is the
range, which is the distance from the minimum value to the maximum value.
• When you use the mean to describe the center of a distribution, use the
standard deviation to measure variability. The standard deviation gives the typical
distance of the values in a distribution from the mean. In symbols, the sample
standard deviation is given by $s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$. The value obtained before
taking the square root is known as the sample variance, denoted by $s_x^2$. The
standard deviation $s_x$ is 0 when there is no variability and gets larger as
variability from the mean increases.
• When you use the median to describe the center of a distribution, use the
interquartile range (IQR) to describe the distribution’s variability. The first
quartile $Q_1$ has about one-fourth of the individual data values at or below it,
and the third quartile $Q_3$ has about three-fourths of the individual data values
at or below it. The interquartile range measures variability in the middle half
of the distribution and is found by calculating $\mathrm{IQR} = Q_3 - Q_1$.
• The median is a resistant measure of center because it is relatively unaffected
by extreme values. The mean is not resistant. Among measures of variability,
the IQR is resistant, but the range and standard deviation are not.
• The mean and standard deviation are good descriptions for roughly symmetric
distributions with no outliers. The median and IQR are a better description for
skewed distributions or distributions with outliers.
• The most common method of identifying outliers in a distribution of quantitative
data is the 1.5 × IQR rule. According to this rule, an individual data value is an
outlier if it is less than $Q_1 - 1.5 \times \mathrm{IQR}$ or greater than $Q_3 + 1.5 \times \mathrm{IQR}$. Another
method for identifying outliers is the 2 × SD rule, which says that any data value
more than 2 standard deviations from the mean of the distribution is an outlier.
• The five-number summary of a distribution consists of the minimum, $Q_1$, the
median, $Q_3$, and the maximum. A boxplot displays the five-number summary,
marking outliers with a special symbol. The box shows the variability in the middle
half of the distribution. The median is marked within the box. Lines extend from
the box to the smallest and largest observations that are not outliers. Boxplots are
helpful for comparing the center (median) and variability (range, IQR) of multiple
distributions. Boxplots aren’t as useful for identifying the shape of a distribution
because they do not display peaks, clusters, gaps, and other interesting features.

AP® EXAM TIP
AP® Daily Videos
Review the content of this section and get extra help by watching the AP® Daily
Videos for Topics 1.7–1.9, which are available in AP® Classroom.
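
The summary measures reviewed in this section can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not a prescribed method: the data set is invented, the function names (`quartiles`, `five_number_summary`, `iqr_rule_outliers`, `sd_rule_outliers`) are hypothetical, and the quartiles are computed as medians of the lower and upper halves of the ordered data, which is one of several conventions in use.

```python
from statistics import mean, median, stdev, variance


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when the number of values is odd, the overall median is
    left out of both halves. Other conventions give slightly different
    quartiles. Needs at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q3 = quartiles(values)
    return min(values), q1, median(values), q3, max(values)


def iqr_rule_outliers(values):
    """Values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return [x for x in values if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]


def sd_rule_outliers(values):
    """Values more than 2 sample standard deviations from the mean."""
    center, spread = mean(values), stdev(values)
    return [x for x in values if abs(x - center) > 2 * spread]


# Hypothetical data, invented for illustration only.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

print("mean:", mean(data))
print("median:", median(data))
print("range:", max(data) - min(data))
print("sample variance:", variance(data))   # divides by n - 1
print("sample SD:", stdev(data))            # square root of the variance
print("five-number summary:", five_number_summary(data))
print("1.5 x IQR outliers:", iqr_rule_outliers(data))
print("2 x SD outliers:", sd_rule_outliers(data))

# Resistance: pulling the extreme value 30 in to 10 changes the mean
# but leaves the median where it was.
tamer = data[:-1] + [10]
print("mean:", mean(data), "->", mean(tamer))
print("median:", median(data), "->", median(tamer))
```

For this invented data set both outlier rules flag the value 30, and the last two lines show the mean shifting while the median stays put. Statistical software often uses a different quartile convention, so the quartiles and fences it reports may differ slightly from the ones computed here.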
© 2024 BFW Publishers PAGES NOT FINAL - For Review Purposes Only, all other uses prohibited - Do Not Copy or Post in Any Form.