We are faced then with deciding which estimator to use for asymmetric distributions. Our choice should depend on the objective of the study and the use to be made of the estimator. Does one want to estimate where most of the data in a data set lie, or is it important to give extra weight to the extreme values for the purpose at hand? In Figure 6, the AM clearly overestimates where the bulk of the data lies. One could argue, however, that when working with a potentially harmful substance such as 239Pu in the environment, it may be preferable to be conservative in the sense that we tend to overestimate rather than underestimate average soil concentrations.

Stem-and-Leaf Displays

Probably the best approach when working with asymmetric distributions is to compute more than one estimate of the "average" of the data set. Even this, however, will not convey much information about the variability or scatter present in the data. One method for obtaining such information is to plot the data in histogram form, as was done in Figure 6. A preferred method, however, is called a "stem-and-leaf" display (Tukey, 1972). This gives all the information of a histogram in addition to retaining the actual numerical values, which makes it a simple matter to find the median. The construction of a stem-and-leaf display is illustrated using the following 239Pu concentrations in soil that are displayed in histogram form in Figure 6:

8.2 9.4 12.8 1.7 7.9 8.9 6.7 10.3 21.3 11.3 0.8 9.0 3.5 0.5 10.7 2.0 7.6 4.4 3.1 11.3 2.4 3.6 16.4 4.8 5.6 4.4 5.8 305.0 6.8 8.7 18.2 3.0 3.4 5.6 11.2 11.0 2.5 21.0 3.6 20.0 1.9 14.3 7.9 9.2 5.9 2.6 10.2

The first step is to select a "stem" which corresponds to the intervals of a histogram. For the above data set, units of 10s appear to be a reasonable choice. The stem appears as in column (a) in Table 2. The "leaf" of the display is the next digit of the number, illustrated in column (b) of Table 2 for the first 5 numbers in column 1 of the above data set.
Doing this for all 47 numbers gives column (c) in the table. Note that data with the same stem value appear in the same row, e.g., 21.3 and 21.0. By reordering the leaf values from smallest to largest for each stem, and by adding a depth column, we obtain the final stem-and-leaf display given in column (d). Note that the "leaf" part is just a histogram, but each "bar" of the histogram now contains the actual numerical values of the data. The "depth" column is constructed by counting the number of observations starting at both ends. Thus, the entry at position 7 on the stem contains the median (central observation).
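The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch, not a reproduction of Table 2. It assumes that the stem is the integer part of each value and the leaf is the first decimal digit, which is what the 21.3 and 21.0 example implies. It also assumes that only stems that actually occur are listed, and that the row holding the median shows its own count in parentheses rather than a cumulative depth. The names DATA and stem_and_leaf are hypothetical.

```python
from collections import defaultdict
from statistics import median

# The 47 soil concentrations listed in the text.
DATA = [
    8.2, 9.4, 12.8, 1.7, 7.9, 8.9, 6.7, 10.3, 21.3, 11.3,
    0.8, 9.0, 3.5, 0.5, 10.7, 2.0, 7.6, 4.4, 3.1, 11.3,
    2.4, 3.6, 16.4, 4.8, 5.6, 4.4, 5.8, 305.0, 6.8, 8.7,
    18.2, 3.0, 3.4, 5.6, 11.2, 11.0, 2.5, 21.0, 3.6, 20.0,
    1.9, 14.3, 7.9, 9.2, 5.9, 2.6, 10.2,
]


def stem_and_leaf(values):
    """Return (depth, stem, sorted leaves) rows for one-decimal data.

    Assumption: stem = integer part, leaf = first decimal digit.
    """
    rows = defaultdict(list)
    for x in values:
        # Work in tenths to avoid floating-point surprises: 21.3 -> (21, 3).
        stem, leaf = divmod(round(x * 10), 10)
        rows[stem].append(leaf)

    n = len(values)
    mid = (n + 1) / 2  # rank of the median when n is odd (24 for n = 47)
    out = []
    below = 0  # observations in rows already processed
    for stem in sorted(rows):
        leaves = sorted(rows[stem])
        count = len(leaves)
        if below < mid <= below + count:
            depth = f"({count})"             # row that holds the median
        elif below + count < mid:
            depth = str(below + count)       # counted from the low end
        else:
            depth = str(n - below)           # counted from the high end
        out.append((depth, stem, leaves))
        below += count
    return out


for depth, stem, leaves in stem_and_leaf(DATA):
    print(f"{depth:>4} {stem:>3} | {''.join(map(str, leaves))}")
print("median =", median(DATA))
```

With these assumptions, the depth entries on either side of the parenthesized row give the number of observations below and above it, so the median can be located by counting into that row's ordered leaves.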