                     Another View of Accuracy Testing
                           by John E. Leslie III
                   Copyright 1993 by John E. Leslie III
                            All Rights Reserved


Premise
     In his recent article entitled "Accuracy Testing" (Precision Shooting,
August 1993, p.58), Peter Craig pointed out the problems associated with using
three or five shot groups to try to determine the best load for a particular
firearm.  Mr. Craig demonstrated, using a computer simulation, that the laws
of probability associated with shooting can cause a randomly chosen shot group
fired with less consistent ammunition to be smaller than another random shot
group fired with ammunition which was in fact more consistent.
     This anomaly is much more likely to occur in shot groups containing
fewer shots than in shot groups containing a large number of shots.  Mr. Craig
proved that, as more shots are fired, the laws of probability catch up with
the looser grouping ammunition and show it up for what it really is.
     I have been doing research to try to identify the best statistical
measure of shot group dispersion or "tightness" and decided to recreate Mr.
Craig's computer simulation and add the statistics which I have been
researching.

My Simulation
     My simulation, like Mr. Craig's, examined the success rate of the group
size statistic at identifying the shot group fired by the tightest grouping
load out of four possible choices.  Each ammunition "load" was 20% less
consistent than the previous load.  This selection process was repeated 65,000
times to get an accurate representation of the statistics' success rate for
groups containing each number of shots.
     In addition to group size, I tested four other statistics: the figure of
merit, the diagonal, the mean radius, and the radial standard deviation.  A
graph showing all of these statistics' success rates for correctly determining
the tightest grouping load is included as Figure 1.  These percentages are not
absolute numbers, as we will see later in this article.  Greater or lesser
differences between the ammunition loads will change the success rates.
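     The selection process described above can be sketched in a few lines of
Python.  This is a minimal illustration, not the program used for this
article, and it rests on assumptions the text does not state:  shots are drawn
from a circular normal distribution, "20% less consistent" means each load's
standard deviation is 1.2 times that of the load before it, and a trial counts
as a success when the tightest load produces the smallest value of the
statistic.  The names fire_group and success_rate are hypothetical.

```python
import math
import random


def extreme_spread(shots):
    """Group size: the largest distance between any two shots."""
    return max(
        math.dist(a, b)
        for i, a in enumerate(shots)
        for b in shots[i + 1:]
    )


def fire_group(rng, sigma, n_shots):
    """One simulated group with independent normal x and y errors (assumed)."""
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n_shots)]


def success_rate(statistic, n_shots, n_loads=4, step=0.20,
                 trials=2000, seed=1):
    """Fraction of trials in which the tightest load scores lowest."""
    rng = random.Random(seed)
    # Assumed reading of "20% less consistent": each load's sigma is
    # (1 + step) times the sigma of the load before it.
    sigmas = [(1.0 + step) ** k for k in range(n_loads)]
    wins = 0
    for _ in range(trials):
        scores = [statistic(fire_group(rng, s, n_shots)) for s in sigmas]
        if scores.index(min(scores)) == 0:  # load 0 is the tightest
            wins += 1
    return wins / trials


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(n, success_rate(extreme_spread, n))
```

     Any function that takes a list of (x, y) shots and returns a single
number can be passed in place of extreme_spread.  The default of 2,000 trials
only keeps the sketch quick; the runs reported here used 65,000.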

Group Size
     Group size, also known as Extreme Spread, is the most widely used
measure of shot group dispersion.  It is defined as the maximum distance
between any two shots within the group.
     In my opinion, there are several problems with using group size, most
notably, the measure's domination by the group's outliers.  Outliers are shots
which have a low probability of occurrence (otherwise they would not stand out
so).  Since group size measures the distance between extreme shots, what it
really measures is the spread between the least likely to be repeated shots in
the group.  Also, by only using data from two shots within the group, it
ignores the data represented by the other (more likely to be repeated) shots.
     While group size outperformed all of the other measures for the
three-shot groups, it was the worst statistic for all groups of four or more
shots.
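     The definition, and the way a single outlier takes over the measurement,
can be illustrated with a short Python sketch.  The shot coordinates are
invented for illustration, and group_size is a hypothetical helper name.

```python
import math
from itertools import combinations


def group_size(shots):
    """Return the extreme spread and the two shots that define it."""
    a, b = max(combinations(shots, 2), key=lambda pair: math.dist(*pair))
    return math.dist(a, b), (a, b)


# Invented five-shot group (inches): four close shots plus one flier.
tight_four = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2)]
flier = (1.5, 0.0)

size_without, _ = group_size(tight_four)
size_with, defining_pair = group_size(tight_four + [flier])

print(f"without flier: {size_without:.3f}")
print(f"with flier:    {size_with:.3f}, set by {defining_pair}")
```

     In this made-up group the flier alone moves the group size from about
0.45 to about 1.61 inches, and the other three shots play no part in the
result.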

Figure of Merit
     The figure of merit (FOM) is the average of the maximum horizontal group
spread and the maximum vertical group spread.
     This measure uses data from at least two shots but more likely four
shots.  Since you are using more data points (shots), the effect of an outlier
gets diluted:  it now has a 25% influence rather than 50% as with group size.
In the simulation, the FOM proved superior to group size at correctly choosing
the tighter grouping load in groups of four or more shots.  I believe this is
due to the use of twice as many data points.
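     As a rough sketch of the calculation, the following Python computes the
FOM and also reports which shots set the edges of the group.  The coordinates
are invented, and figure_of_merit and defining_shots are hypothetical names.

```python
def figure_of_merit(shots):
    """Average of the maximum horizontal and maximum vertical spreads."""
    xs = [x for x, _ in shots]
    ys = [y for _, y in shots]
    horizontal = max(xs) - min(xs)
    vertical = max(ys) - min(ys)
    return (horizontal + vertical) / 2.0


def defining_shots(shots):
    """Shots that set the left, right, bottom and top edges of the group."""
    return {
        min(shots, key=lambda s: s[0]),
        max(shots, key=lambda s: s[0]),
        min(shots, key=lambda s: s[1]),
        max(shots, key=lambda s: s[1]),
    }


# Invented five-shot group (inches).
group = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2), (1.5, 0.0)]
print(f"FOM = {figure_of_merit(group):.3f}")
print(f"shots used: {len(defining_shots(group))}")
```

     Here one shot happens to be both the leftmost and the highest, so three
shots rather than four determine the result.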

Diagonal
     The diagonal statistic uses inputs similar to the FOM.  It is calculated
by taking the square root of the sum of the maximum horizontal spread squared
and the maximum vertical spread squared.  The success ratios for the diagonal
were almost identical to those of the FOM; in fact, these two measures are
shown on the same line on the graph in Figure 1.  I believe these results
reflect the similarity of their inputs.  All of the advantages mentioned above
for the FOM also apply to the diagonal.
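     The shared inputs are easy to see in code.  In this minimal Python
sketch (invented coordinates, hypothetical function names) both the diagonal
and the FOM are computed from the same two spreads.

```python
import math


def spreads(shots):
    """Maximum horizontal and maximum vertical spreads of a group."""
    xs = [x for x, _ in shots]
    ys = [y for _, y in shots]
    return max(xs) - min(xs), max(ys) - min(ys)


def diagonal(shots):
    """Square root of (horizontal spread squared + vertical spread squared)."""
    horizontal, vertical = spreads(shots)
    return math.hypot(horizontal, vertical)


# Invented four-shot group whose spreads are 3 wide by 4 high.
group = [(0.0, 0.0), (3.0, 1.0), (1.0, 4.0), (2.0, 2.0)]
print(diagonal(group))              # 5.0
print(sum(spreads(group)) / 2.0)    # FOM from the same two inputs: 3.5
```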

Mean Radius
     The mean radius, as the name implies, is simply the average distance
from the group center of all of the shots of the group.
     This measure uses data from every shot, not just two or four shots.
Here once again, additional information helped improve the usefulness of the
statistic:  the mean radius was a more reliable predictor of the tightest
grouping load than either the group size or FOM/diagonal statistics.
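     A minimal Python sketch of the mean radius follows.  It takes the group
center to be the average x and average y of the shots; the coordinates are
invented and mean_radius is a hypothetical name.

```python
import math
from statistics import fmean


def mean_radius(shots):
    """Average distance of the shots from the group center."""
    # Group center: the average x and the average y of all shots.
    center = (fmean([x for x, _ in shots]), fmean([y for _, y in shots]))
    return fmean([math.dist(shot, center) for shot in shots])


# Invented group: four shots on the corners of a 2 x 2 square.
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(f"{mean_radius(square):.3f}")

# A fifth shot through the exact center changes the mean radius even
# though the group size (the corner-to-corner distance) stays the same.
print(f"{mean_radius(square + [(1.0, 1.0)]):.3f}")
```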

Radial Standard Deviation
     The radial standard deviation (RSD) is similar to the standard
deviations we are all familiar with except that it is two dimensional.  It is
calculated by taking the square root of the sum of the horizontal variance and
the vertical variance.
     Like the mean radius, this statistic uses all of the available data
points of the shot group. The RSD was the most accurate measure I examined for
determining which group was from the tighter grouping load.
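     The formula translates directly into Python.  One detail is an
assumption:  the text does not say whether the variances are divided by n or
by n - 1, so this sketch offers both and defaults to n - 1.  The function
name and the coordinates are invented for illustration.

```python
import math
from statistics import pvariance, variance


def radial_standard_deviation(shots, sample=True):
    """Square root of (horizontal variance + vertical variance)."""
    xs = [x for x, _ in shots]
    ys = [y for _, y in shots]
    # Assumption: the divisor is not specified in the article.
    # sample=True divides by n - 1; sample=False divides by n.
    var = variance if sample else pvariance
    return math.sqrt(var(xs) + var(ys))


# Invented group: four shots on the corners of a 2 x 2 square.
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(f"{radial_standard_deviation(square):.3f}")
print(f"{radial_standard_deviation(square, sample=False):.3f}")
```

     Whichever divisor is chosen, it should be kept the same for every group
being compared, since the two versions give different numbers for the same
group.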

Different Sized Loads
     Having established that the RSD was superior at selecting the best load
in the previous simulation, I wanted to determine the effect of varying
magnitudes of differences among the loads.  My first simulation used four
loads, each 20% less consistent than the previous load.  I decided
to run the simulation twice more - once using half of that difference between
loads (the 10% difference loads) and once using twice the original difference
between the loads (the 40% difference loads).  Both simulations showed the
same relative accuracy rankings between the statistics as my first simulation,
but the amount of the improvement of the RSD (and the other statistics) over
group size varied greatly.  My comparison of the RSD's accuracy relative to
the group size's accuracy is shown in Figure 2.
     This study showed that the statistics' accuracy was quite sensitive to
the amount of variation between the loads.  While the RSD was always more
accurate than group size, its advantage shrank when faced with identifying the
more obviously differing loads (40% differences).  However, the RSD was
dramatically better than group size at distinguishing among the more difficult
to differentiate loads (10% differences).  When the going got tough, the RSD
clearly demonstrated its superiority.

Different Numbers of Loads
     A final dimension of the RSD versus group size question that I examined
was whether the statistics' ranking would be affected by distinguishing
between two loads rather than the four loads used in the other simulations.
The results of the two-load simulation were identical, in both relative
ranking and magnitude, to the results of the four-load simulation.

Conclusion
     This exercise has proven to me that the group size statistic, which we
all put so much faith in, is only marginally adequate for the task.  The
radial standard deviation can distinguish between loads with fewer shots fired
and with a higher degree of confidence.
     The consequences of this finding are important for all shooters, not
just reloaders.  Position shooters can not only match their ammunition to their
firearm more reliably using the RSD but they can also use this statistic to
judge changes in their position construction.  If the RSD of their groups
declined after the change, they would know that they should keep the
modification.  Shooters can also use this measure to evaluate alterations to
their equipment:  How much of an improvement did I get from fire lapping my
barrel?  Did my new stock really improve my accuracy?  Is it worthwhile for me
to separate my rimfire ammunition by rim thickness?  Does it matter, from an
accuracy point of view, how thoroughly I clean my firearm?
     The major drawback to using the RSD is the hassle of calculating it.
First you must determine the Cartesian (x and y axes) coordinates of all the
shots in the group.  Then you must average all of the x values and all of the
y values separately to find the coordinates of the group center.  Next you
would calculate the variances of the group in the x and y directions.
Finally, you would add the two variances and find the square root of the
total.  After you have done this a few times you realize why group size is
still so popular: it is so much easier to calculate!
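     The four steps can be followed one at a time in Python.  The shot
coordinates below are invented, and dividing by n - 1 in the variance step is
an assumption, since the steps above do not name the divisor.

```python
import math

# Step 1: coordinates of each shot (invented values, in inches, measured
# from the lower left corner of a hypothetical target).
shots = [(4.0, 5.0), (6.0, 5.0), (4.0, 7.0), (6.0, 7.0), (5.0, 6.0)]
n = len(shots)

# Step 2: average the x values and the y values to find the group center.
center_x = sum(x for x, _ in shots) / n
center_y = sum(y for _, y in shots) / n

# Step 3: variance in each direction.  Dividing by n - 1 (the sample
# variance) is an assumption of this sketch.
var_x = sum((x - center_x) ** 2 for x, _ in shots) / (n - 1)
var_y = sum((y - center_y) ** 2 for _, y in shots) / (n - 1)

# Step 4: add the two variances and take the square root.
rsd = math.sqrt(var_x + var_y)

print(f"center = ({center_x}, {center_y})")
print(f"variances = {var_x}, {var_y}")
print(f"RSD = {rsd:.3f}")
```

     Because step 3 subtracts the group center from every shot, the
coordinates can be measured from any convenient point on the target without
changing the result.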
     Fortunately, the personal computer revolution comes to the rescue.
There are several PC programs, including one which I wrote for IBM
compatibles named ScorStat, which can help you do some or all of the
necessary calculations.  I would expect to see additional programs become
available as this type of statistical analysis becomes more popular.
