works directly upon the likelihood function to find its maximum and the associated parameter values. The particular maximization technique used on this data was the "Simplex" algorithm; there are currently available a number of other good algorithms.

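As a sketch of the procedure just described, the following fits a single normal distribution by numerically maximizing the likelihood with scipy's Nelder-Mead ("Simplex") routine. The data are simulated stand-ins, not values from the study, and scipy's optimizer stands in for whatever Simplex implementation was actually used.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=450.0, scale=15.0, size=200)  # simulated stand-in data

def neg_log_likelihood(params, x):
    """Negative log-likelihood of a single normal N(u, sigma)."""
    u, sigma = params
    if sigma <= 0:
        return np.inf  # keep the search inside the valid region
    return -np.sum(-np.log(sigma * np.sqrt(2.0 * np.pi))
                   - 0.5 * ((x - u) / sigma) ** 2)

# The Simplex (Nelder-Mead) search works directly on the likelihood surface:
result = minimize(neg_log_likelihood, x0=[400.0, 10.0],
                  args=(data,), method="Nelder-Mead")
u_hat, sigma_hat = result.x
```

The maximizing values u_hat and sigma_hat are the maximum-likelihood estimates; for a single normal they agree with the sample mean and (population) standard deviation.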
The computations outlined above are performed for a sequence of successively more complex statistical models of mixtures of normal distributions, starting with a single normal distribution, then a mixture of two normal distributions with the same mean, then two normal distributions with different means, and so on.
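One way to keep track of this model sequence is by its free-parameter counts. The tally below is our own bookkeeping, not given in the source; these counts set the degrees of freedom for the likelihood ratio comparisons described later.

```python
# Free-parameter counts for the sequence of mixture models (our own tally):
# each normal contributes a mean and a standard deviation, and a k-component
# mixture adds k-1 free mixing proportions.
models = [
    ("one normal",                    2),   # u, sigma
    ("two normals, same mean",        4),   # u, sigma1, sigma2, p
    ("two normals, different means",  5),   # u1, u2, sigma1, sigma2, p
    ("three normals",                 8),   # 3 means, 3 sigmas, 2 proportions
    ("four normals",                 11),   # 4 means, 4 sigmas, 3 proportions
]
```

The difference in counts between two nested models is the degrees of freedom used when comparing them.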

The choice of the most complicated model to try is made from the cumulative distribution plot, such as shown in Figure 1.

Figure 2. Mathematical Expression of a Mixture of Normal Distributions

f(x) = p1*N(u1,σ1) + p2*N(u2,σ2) + (1 - p1 - p2)*N(u3,σ3)

where

N(u,σ) = [1/(σ√(2π))] * exp[-(1/2)*((x - u)/σ)²]
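The expression in Figure 2 translates directly into code. The sketch below assumes the three-component form shown there; names follow the figure, and parameter values are supplied by the caller.

```python
import numpy as np

def normal_pdf(x, u, sigma):
    """The normal density N(u, sigma) written out in Figure 2."""
    return np.exp(-0.5 * ((x - u) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def mixture_pdf(x, p1, p2, comps):
    """f(x) = p1*N(u1,s1) + p2*N(u2,s2) + (1-p1-p2)*N(u3,s3)."""
    (u1, s1), (u2, s2), (u3, s3) = comps
    return (p1 * normal_pdf(x, u1, s1)
            + p2 * normal_pdf(x, u2, s2)
            + (1.0 - p1 - p2) * normal_pdf(x, u3, s3))
```

Because the mixing proportions sum to one, f(x) is itself a proper density and integrates to one.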
In our data a maximum of four distributions was used. These models were compared using the likelihood ratio test; the details of this test are also found in elementary mathematical statistics texts. Note that, with this test, we cannot decide whether any one model is a good fit to the data or not; we can only decide the relative merits of the models used. The successively more complex models were pairwise compared using the likelihood ratio principle until no statistically significant improvement in the fit to the data was found.
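The mechanics of the likelihood ratio test can be sketched as follows. To keep the maximized likelihoods in closed form, the sketch compares two nested single-normal models (mean held fixed versus mean free); the mixture-model comparisons in the text work the same way, with the statistic 2(lnL_complex - lnL_simple) referred to a chi-squared distribution. The fixed mean of 440.0 and the simulated data are illustrative assumptions, not values from the study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(450.0, 15.0, size=100)  # simulated stand-in data

def max_log_likelihood(x, u=None):
    """Maximized log-likelihood of N(u, sigma); u is estimated when None."""
    u_hat = np.mean(x) if u is None else u
    sigma_hat = np.sqrt(np.mean((x - u_hat) ** 2))  # MLE of sigma
    return np.sum(stats.norm.logpdf(x, loc=u_hat, scale=sigma_hat))

ll_simple = max_log_likelihood(x, u=440.0)  # simpler model: mean held fixed
ll_complex = max_log_likelihood(x)          # more complex model: mean free
lr_stat = 2.0 * (ll_complex - ll_simple)    # likelihood ratio statistic
p_value = stats.chi2.sf(lr_stat, df=1)      # one extra free parameter
```

A small p-value says the more complex model fits significantly better; the pairwise comparisons described above stop as soon as this improvement is no longer significant.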
A model consisting of a mixture of three normal distributions was found to best represent our example data; these distributions are plotted as the black curves in Figure 3. The broken-line curve was calculated from the sample mean and standard deviation of all the data, assuming a single normal distribution. The relative sizes of the black curves (the areas under these curves) are drawn here to represent the proportions of the data in each of the component distributions; the broken-line curve is simply drawn a convenient size. Specifically, this data set is best modeled by 21 percent of the data having a mean of 390 and a standard deviation of 195 (the wide curve at the bottom), 22 percent of the data having a mean of 394 and a standard deviation of 11 (the lower middle curve), and 57 percent of the data having a mean of 454 and a standard deviation of 13 (the tallest curve).
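As a quick consistency check on this description of Figure 3 (a sketch using only the proportions, means, and standard deviations quoted above), each component's peak height at its own mean is p/(σ√(2π)), so the ordering of the curves can be verified directly:

```python
import math

# Fitted components reported above: (proportion, mean, standard deviation).
components = [(0.21, 390.0, 195.0),   # wide curve at the bottom
              (0.22, 394.0, 11.0),    # lower middle curve
              (0.57, 454.0, 13.0)]    # tallest curve

# A component's peak height (at x = mean) is p / (sigma * sqrt(2*pi)):
peaks = [p / (s * math.sqrt(2.0 * math.pi)) for p, u, s in components]
```

The third component does come out tallest and the wide component flattest, matching the figure.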
Some additional information is of interest. The total sample size was 37. The "known value" of the material sent to the laboratories was 452, essentially the same as the 454 mean of the tallest curve; thus we suspect that this component curve represents the "good" laboratories.

The average of all the data, represented by the broken-line curve, was 440, which, if used to characterize the data set, suggests a bias from the known value; the bias disappears if we use the component curves.

Consider the range of concentrations defining the 95-percent area of the "good" or tallest curve. Eleven percent of the wide distribution is within this good range. Since the wide group represents 21 percent of the data, about 2 percent (21% x 11%) of the data values would be misclassified as good when in fact the values are from poorly performing laboratories that just happened to hit it right this time. Less than one percent of the lower middle curve is actually within this 95-percent interval of the upper middle curve.
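This overlap arithmetic can be reproduced from the fitted parameters (a sketch using scipy in place of whatever tables or software were actually used). Our calculation gives roughly 10 percent of the wide distribution inside the good curve's central 95-percent interval, close to the quoted 11 percent; the small difference presumably reflects rounding of the reported parameters.

```python
from scipy.stats import norm

# Fitted components from the text: (mean, standard deviation, proportion).
good = (454.0, 13.0, 0.57)    # tallest curve
wide = (390.0, 195.0, 0.21)   # wide curve at the bottom
mid = (394.0, 11.0, 0.22)     # lower middle curve

# Central 95-percent interval of the "good" curve:
lo, hi = norm.interval(0.95, loc=good[0], scale=good[1])

# Fraction of each of the other distributions falling inside that interval:
frac_wide = norm.cdf(hi, wide[0], wide[1]) - norm.cdf(lo, wide[0], wide[1])
frac_mid = norm.cdf(hi, mid[0], mid[1]) - norm.cdf(lo, mid[0], mid[1])

# Misclassification rate contributed by the wide group (about 2 percent):
misclassified = wide[2] * frac_wide
```

The lower middle curve contributes well under one percent, consistent with the statement above.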

One could, of course, repeat this exercise with any acceptance criteria one wished. We suspect the lower middle group represents a group of eight laboratories with good


