Guide to Worksheet 1-2
Worksheet No.1. This worksheet shows how to describe data concisely. The data should be summarized as succinctly as possible, and two graphical techniques for doing so are introduced: the histogram and the stem-and-leaf plot. A measure of central location is also useful, and there are two common choices: the mean and the median.
- Each variable must be accompanied by its own description (qualitative or quantitative, possible values, unit of measurement, etc.) [Part (a) in every problem].
- Whenever possible, discuss the shape of the distribution, the existence of possible outliers, and the possibility of two or more distinct populations within the data, with reference to the histogram [1b, 2c].
- The median is not easily influenced by extreme values; in fact, when the highest value(s) are unknown, the median is unaffected as long as they make up fewer than half of the observations [4c, 5c]. For this reason the median may be used as a measure of center even if the data are not complete [5c].
- The overall mean can be reconstructed from the means of the subgroups (strata), provided the subgroup sizes are known, whereas the overall median cannot be found in this way [6c].
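The last two points can be checked numerically. The following is a minimal sketch in Python using only the standard `statistics` module; all data values and variable names are made up for illustration and are not taken from the worksheet problems.

```python
from statistics import mean, median

# Hypothetical sample: the two largest values were only recorded as "at least 100".
recorded = [12, 15, 18, 22, 30, 100, 100]
# One of many possible true samples consistent with that record (assumed values).
possible_truth = [12, 15, 18, 22, 30, 250, 900]

# The median depends only on the middle of the ordered data,
# so it is the same for both lists; the mean is not.
median_recorded = median(recorded)
median_truth = median(possible_truth)
mean_recorded = mean(recorded)
mean_truth = mean(possible_truth)

# Hypothetical strata (subgroups) of one data set.
stratum_a = [2, 4, 6]
stratum_b = [10, 20, 30, 40, 50]
pooled = stratum_a + stratum_b

# Overall mean rebuilt from the subgroup means, weighted by subgroup sizes.
n_a, n_b = len(stratum_a), len(stratum_b)
mean_from_strata = (n_a * mean(stratum_a) + n_b * mean(stratum_b)) / (n_a + n_b)
mean_pooled = mean(pooled)

# A second version of stratum B with the same size and the same median.
stratum_b_alt = [10, 29, 30, 31, 50]
median_pooled = median(pooled)
median_pooled_alt = median(stratum_a + stratum_b_alt)

print("medians (recorded, possible truth):", median_recorded, median_truth)
print("means   (recorded, possible truth):", mean_recorded, mean_truth)
print("overall mean (from strata, pooled):", mean_from_strata, mean_pooled)
print("overall medians for the two versions of B:", median_pooled, median_pooled_alt)
```

Both versions of stratum B have the same size and the same median, yet the pooled medians differ. Subgroup medians and sizes alone therefore do not determine the overall median, while subgroup means and sizes do determine the overall mean.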
Worksheet No.2. The objective of this worksheet is to learn when and how to use measures of variability. Accordingly, your work should reflect your understanding of this topic.
- How do you interpret a percentile value, for example the 25th percentile? Can you tell whether a certain value is smaller than the 90th percentile without knowing its actual percentile? [1c]
- Standard deviation, coefficient of variation, and interquartile range. When do you need them? How do you use them? [2,3,4,5]
- Empirical rules. What do they do? When can you apply them? When do they fail? [2,5]
- In order to answer the questions in Problem 4 correctly, it is important to understand what has been measured and how the variables are organized. In Problem 4, the data are the deviations from the target strength (power), and supplier 2 has achieved the smallest deviation [4de]. What can you read from the boxplot? How does it help you understand the data?
- Mean or median: which is more appropriate for representing the center of a distribution when the sample distribution is significantly skewed? [5d]
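The measures of variability and the empirical rule can be explored with a short Python sketch. It uses the standard `statistics` module (Python 3.8 or later for `quantiles`); the two data sets and the helper names `summarize` and `share_within` are hypothetical, chosen only for illustration. Quartile conventions differ between textbooks and software, so the interquartile range computed here may not match a hand calculation exactly.

```python
from statistics import mean, median, stdev, quantiles

def summarize(data):
    """Return center and spread measures for a list of numbers."""
    m = mean(data)
    s = stdev(data)                      # sample standard deviation (n - 1 divisor)
    q1, _, q3 = quantiles(data, n=4)     # quartiles, default "exclusive" method
    return {
        "mean": m,
        "median": median(data),
        "sd": s,
        "cv": s / m,                     # only meaningful for positive, ratio-scale data
        "iqr": q3 - q1,
    }

def share_within(data, k):
    """Fraction of observations within k standard deviations of the mean."""
    m, s = mean(data), stdev(data)
    return sum(1 for x in data if abs(x - m) <= k * s) / len(data)

# Hypothetical, roughly symmetric and mound-shaped sample.
bell = [4, 5, 5, 6, 6, 6, 7, 7, 8]
# Hypothetical, strongly right-skewed sample with one extreme value.
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 60]

for name, data in [("bell", bell), ("skewed", skewed)]:
    print(name, summarize(data))
    print("  share within 1 sd:", share_within(data, 1))
    print("  share within 2 sd:", share_within(data, 2))
```

For the skewed sample, the single extreme value pulls the mean well above the median and inflates the standard deviation, so the share of observations within one standard deviation of the mean is far from the roughly 68% that the empirical rule suggests for mound-shaped data. The median and interquartile range are much less affected.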
© TTU Mathematics