
Thinking statistically

5/31/2017

Rifles are random number generators. Each time you pull the trigger, the bullet chooses a single outcome from infinite possibilities based on countless random factors. Without a time machine, you can never know exactly where the next bullet will go.
However, you can predict the most likely outcome, and precisely describe the chances of it being high or low, left or right, fast or slow.

Many people shy away from statistics because, well, math. It seems complicated and unnecessary. On the contrary. It is a way of thinking that hones your intuition and helps you make better decisions.

Using no equations whatsoever, I'm going to show you how to think statistically about every shot you fire. Just by understanding the relationship between a sample and a population, you can learn how to predict the future.

It starts with a roll of the dice.

Suppose I were to roll one six-sided die, once. What will the outcome be?

You could take a wild guess, but you would most likely be wrong. You could also say "I have no idea", but of course there is a better answer.
We know the result will be 1, 2, 3, 4, 5, or 6, and (unless the die is loaded) the chance of each result is equal. So the most accurate response would be to list the possible outcomes and probabilities.

You might say that I'm asking the wrong question. If I were to ask how the die outcomes are generated in the first place, then the answer is clearly "1, 2, 3, 4, 5, 6, with equal probability".

This is called a "uniform" distribution because the chance of each possible outcome is equal.

Now it doesn't matter whether I roll the die or not. We can describe every die roll that will ever take place in the future, because we understand the underlying mechanic. And that's a lot more interesting than the outcome of any one roll.
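If you like to tinker, here is a minimal Python sketch of the same idea: the population is just the list of possible outcomes with their probabilities, and a roll is one sample drawn from it.

```python
import random

# The population: every possible outcome and its probability (a fair die).
die_population = {face: 1 / 6 for face in range(1, 7)}
print(die_population)

# Rolling the die draws one sample from that population.
print("One roll:", random.choice(list(die_population)))
```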

Take a sample from a population.

Each time I roll the die, I get a result. The future is transformed into the past. What was once a distribution of probabilities is now a specific value. This is a "sample".

Each die roll is independent of any other roll from the past or future. Contrary to the belief of many gamblers, a die doesn't change how it works just because I rolled it a few seconds ago. The die just doesn't care.


You can measure a sample. A data set has an average and an extreme spread, with formulas that involve plugging each number into a calculator. This is all very straightforward and tedious.
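Here is that straightforward and tedious part in a few lines of Python, using a made-up 5-shot velocity sample:

```python
from statistics import mean

# A hypothetical 5-shot sample of muzzle velocities (fps), made up for illustration.
sample = [2748, 2755, 2739, 2761, 2750]

print("Average:", mean(sample))                      # 2750.6
print("Extreme spread:", max(sample) - min(sample))  # 22
```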

What we really want to do is predict the future. The random process itself, the underlying distribution that samples are taken from, is called the "population". This is what actually controls our fate.

True power comes from measuring and describing the population. If we understand the population, we can predict what future samples will most likely be. Not each sample (that would be nice), but the expected distribution of all future samples.

Measure a population with samples.

If we were to roll the die a few times, we would get some random numbers. The die will probably produce one outcome more often than another. It's very unlikely that 6 rolls will produce exactly 1, 2, 3, 4, 5, and 6. Just as each die roll is unpredictable, a small sample does not always represent the population.
[Figure: Number of times each value is rolled, with 6 rolls.]
However, as we keep rolling, the collection begins to take shape. Only then is it safe to conclude that the underlying distribution is uniform. If you aren't convinced, you can always keep rolling and see what happens.
[Figure: Results from 1000 die rolls.]
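If you would rather not wear out a real die, a short Python sketch will do the rolling for you:

```python
import random
from collections import Counter

rolls = [random.randint(1, 6) for _ in range(1000)]
counts = Counter(rolls)

# Each face should come up near 1000 / 6 ≈ 167 times, give or take.
for face in range(1, 7):
    print(face, counts[face])
```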
Your rifle also has a random process that drives it. It's not as simple as a uniform distribution, but it does have a way about it that can be described. If the rifle could talk, it might tell you what that is.

The rifle talks to us by generating samples at $1 a pop. If we want to know how it truly works, we need to play its game. With enough samples, we can try to measure the population, but it can be expensive.

Have you ever fired a few groups and concluded that your rifle is accurate or your velocity spread is low? Have you ever compared two loads to see which was better? Have you ever given someone else reloading advice based on your test results?

If so, then you inferred something about the population from a sample. We have all done this. Whether you drew the correct conclusion or not depends on statistics, not how desperately you wish for a positive result. The rifle just doesn't care.

The universe is normal.

The first step of measuring the population is to have a way to describe it. The population is not just a number - it is a distribution of all possible future outcomes and probabilities. How can we define this in some meaningful, concise way?

With a uniform distribution, the extreme range is a sensible description. About 30 rolls is enough to be 99% confident you have correctly found the range 1 to 6, and that range completely describes the population.
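If you want to check that claim for yourself, a quick Monte Carlo sketch like this one (the helper name is just for illustration) simulates many 30-roll sessions and counts how often all six faces show up:

```python
import random

def saw_all_faces(n_rolls: int) -> bool:
    """Roll a fair die n_rolls times and report whether every face appeared."""
    return len({random.randint(1, 6) for _ in range(n_rolls)}) == 6

trials = 100_000
hits = sum(saw_all_faces(30) for _ in range(trials))
print(f"Chance of seeing the full 1-6 range in 30 rolls: {hits / trials:.1%}")
```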

Unfortunately, rifles are not so simple. The energy of the primer, case volume, energy of the powder, weight of the bullet, and neck tension all affect the initial launch of the bullet. The dynamics of the barrel, bullet spin, and how the bullet was engraved affect the initial trajectory. After leaving the barrel, the bullet tips and turns and then stabilizes in a new direction, and then is further disturbed by air fluctuations.
It's fair to say there are a lot of random variables at play, each with its own mysterious random process. How can we ever begin to describe the nature of the overall population that controls point of impact, or even just muzzle velocity?

It turns out nature gave us a trick. We start by doubling down on our die analogy.

If you've played board games, you may know that 7 is the most popular roll when adding two dice. The outcome of one die is uniform between 1 and 6, but it's somehow more likely that two dice add to 7 than any other number. Why? Because there are more ways to get 7 than any other total (1+6, 2+5, and 3+4).
With only two dice, we see that the probability near the mean is higher than at the extremes. Even though both parts are uniform, the combination is not.

With three dice, the distribution starts to look like the well-known bell curve. This is the aptly named "normal distribution".
[Figure: Number of ways to roll each possible outcome of three dice.]
The more independent random variables in the system, the more it follows this distribution. It doesn't take much complexity for the bell curve to emerge, and it's no coincidence that almost every random process in nature works like this. It's simply how the math works out.
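You can watch the bell shape emerge by brute force: enumerate every combination of dice and count how many ways each total can occur. A minimal sketch:

```python
from collections import Counter
from itertools import product

def ways_to_roll(num_dice: int) -> Counter:
    """Count how many equally likely combinations produce each total."""
    return Counter(sum(combo) for combo in product(range(1, 7), repeat=num_dice))

# With two dice, 7 has the most combinations (6 of the 36).
# With three dice, the counts already trace out a rough bell curve.
for total, ways in sorted(ways_to_roll(3).items()):
    print(f"{total:2d} {'#' * ways}")
```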
A normal distribution is an ideal that fits most data most of the time, but it does not represent the nature of extreme events. Real world processes generally have limits and edge cases controlled by unpredictable factors that happen very rarely.

For normally distributed sample data, the extreme spread is a misleading measure of the variation because it ignores the bulk of the data and focuses entirely on whether extreme events happened to occur in that sample.

We should instead be focusing on describing the results that are most likely to happen again. We need a metric which best represents the variation that we care about describing, and takes all the data into account. This is the "standard deviation" (SD).
The standard deviation defines variation by describing where most of the samples taken from a population should be. From the mean, about 2/3 of future samples should fall within +/- 1 SD, and 95% within +/- 2 SD.
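Those percentages are easy to sanity-check by simulation. The sketch below assumes a hypothetical normal population with a mean of 2750 fps and an SD of 7 fps (made-up numbers) and counts how many draws land within one and two SDs of the mean:

```python
import random
from statistics import mean, pstdev

samples = [random.gauss(2750, 7.0) for _ in range(100_000)]

m, sd = mean(samples), pstdev(samples)
within_1 = sum(abs(x - m) <= sd for x in samples) / len(samples)
within_2 = sum(abs(x - m) <= 2 * sd for x in samples) / len(samples)

print(f"Within 1 SD: {within_1:.1%}")  # about 68%
print(f"Within 2 SD: {within_2:.1%}")  # about 95%
```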
[Figure: Example 30-shot groups from populations with different SDs.]
As you collect more and more data, the measured SD of that sample becomes closer and closer to the true SD of the population. In contrast, the extreme spread will always grow with sample size, as more and more extreme events occur over time. It's easier to measure, but it's not nearly as reliable as the SD.
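Here is a rough simulation of that contrast, again assuming a made-up normal population: as the sample grows, the measured SD settles down while the extreme spread keeps creeping upward.

```python
import random
from statistics import pstdev

random.seed(1)
shots = [random.gauss(2750, 7.0) for _ in range(500)]  # hypothetical velocities

for n in (10, 30, 100, 500):
    sample = shots[:n]
    print(f"n={n:3d}  SD={pstdev(sample):5.2f}  ES={max(sample) - min(sample):5.1f}")
```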

Measuring with confidence.

So now we understand how to describe the population distribution within the rifle. It has a shape and a size. Now how do we actually measure it in practice?

With a rifle, we have no choice but to guess what the population is from the samples it provides. The larger the sample, the more likely we are to have correctly measured the population. This is called "confidence".

Let's play a game. I'll be the rifle, and you try to test me. Your first 10 shots have velocities of:

  • 2750, 2757, 2732, 2739, 2748, 2750, 2766, 2753, 2742, 2755

I generated these numbers from a normal distribution, but I'm not going to tell you what the true mean and SD of the population were. All you have to go on is this sample. Your best guess based on this data would be that the population mean is 2749 and the SD is 9.2.
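You can reproduce those estimates from the listed velocities. The 9.2 quoted here matches the population form of the standard deviation; the sample (n-1) form comes out slightly higher.

```python
from statistics import mean, pstdev, stdev

first_group = [2750, 2757, 2732, 2739, 2748, 2750, 2766, 2753, 2742, 2755]

print(f"Mean: {mean(first_group):.1f}")                    # about 2749
print(f"SD (population form): {pstdev(first_group):.1f}")  # about 9.2
print(f"SD (sample form): {stdev(first_group):.1f}")       # about 9.7
```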

Now, do you think you've got it within 5%? 10%? Or would you like to keep shooting?

Most would say 10 shots doesn't sound like enough. If you had 10 more, and the result of the second group was similar to the first, then maybe that's enough to make you feel more confident.

  • 2751, 2747, 2771, 2752, 2756, 2752, 2760, 2763, 2757, 2754

The second group does not agree with the first. From this sample alone we would have to say the population mean is 2756 and SD is 6.6. The SD has decreased by almost 30%.

Am I trying to trick you? I promise you nothing has changed with the rifle. It's not neck tension, bad primers, or a flier. It just happened.

Now what? How about we combine both groups into a single 20-shot sample? The mean and SD of that are 2753 and 8.8. Surely that must be closer. When do we stop?

I'll break the suspense. The true mean and SD of the rifle are 2750 and 7.0. You were pretty close on the mean, which is typical, but the SD is a different story. It's a lot more difficult to measure variation than most people would assume.

Now I'm going to let you fire 80 more shots to round out a 100-shot group. With all that data and what would surely be a sore shoulder, the best guess for the SD is 7.42. That still isn't within 5% of the correct value!

The real solution here is to actually calculate the confidence intervals. You can choose the level of confidence you are comfortable with, whether it's 70%, 80%, or 90%. Then you know what kind of variation to expect and how many shots you need to be confident in a result.
[Figure: Confidence intervals as a percentage of the measured sample SD.]
For example, if you fire a 20-shot group and measure an SD of 10, there's an 80% chance the true population SD is between 8.4 and 12.8.
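If you want to calculate intervals like these yourself, the textbook approach for normally distributed data is a chi-squared interval on the variance. The sketch below uses scipy and is my own illustration rather than necessarily the exact method behind the chart, but it reproduces roughly the same 8.4 to 12.8 range for the 20-shot example:

```python
from math import sqrt
from scipy.stats import chi2

def sd_confidence_interval(measured_sd: float, n: int, confidence: float = 0.80):
    """Chi-squared confidence interval for the true SD, assuming normal data."""
    df = n - 1
    alpha = 1 - confidence
    lower = measured_sd * sqrt(df / chi2.ppf(1 - alpha / 2, df))
    upper = measured_sd * sqrt(df / chi2.ppf(alpha / 2, df))
    return lower, upper

low, high = sd_confidence_interval(10.0, 20)
print(f"80% CI for the true SD: {low:.1f} to {high:.1f}")  # about 8.4 to 12.8
```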

If you shoot long range competitively, you'd probably like your SD to be in the 5-7 range. How many times have you heard people say what their SD is? It's easy to say what the SD of any sample is - a simple calculation. But to say what the SD of your rifle and ammo truly are without recognizing confidence would be wishful thinking.
[Figure: Several 5-shot groups generated from the same population.]
The same goes for group sizes. Great rifles shoot large groups sometimes, and vice versa. You never really know for sure until you have statistical confidence.

All the above 5-shot groups were generated from the exact same population. This is a likely result from a full day of testing - even if absolutely nothing changed about your load!
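You can convince yourself with a quick simulation. The sketch below assumes a circular normal point-of-impact distribution with a made-up SD of 0.25 MOA per axis, and prints the size (extreme spread) of several 5-shot groups drawn from that one population:

```python
import random
from itertools import combinations
from math import dist

def group_size(shots):
    """Group size as extreme spread: the largest center-to-center distance."""
    return max(dist(a, b) for a, b in combinations(shots, 2))

random.seed(3)
for g in range(6):
    group = [(random.gauss(0, 0.25), random.gauss(0, 0.25)) for _ in range(5)]
    print(f"Group {g + 1}: {group_size(group):.2f} MOA")
```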

Just because you changed something for one group doesn't mean the result you see is because of that change. More often than not, you are fooling yourself with samples that just happen to be different.


The amount of testing you need is all about balancing the cost of more bullets against the risk of your results being incorrect. Set out to answer simple questions, and then statistics will tell you exactly where you stand and what you need to do.
Every time you look at two test groups and say which is better, whether it's accuracy on paper or velocity spread, you are answering a statistics question that can be calculated. If you have the right tools to answer that question correctly, then you will be able to make better decisions about what to do next.
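For comparing the velocity spreads of two loads, one standard tool is the F-test on the two sample variances (the same test mentioned in the comments below). This sketch uses scipy and two made-up 10-shot strings:

```python
from statistics import variance
from scipy.stats import f

# Two hypothetical 10-shot velocity strings (fps), one per load.
load_a = [2748, 2755, 2739, 2761, 2750, 2744, 2758, 2752, 2747, 2753]
load_b = [2751, 2733, 2765, 2742, 2770, 2738, 2760, 2729, 2768, 2745]

var_a, var_b = variance(load_a), variance(load_b)
F = max(var_a, var_b) / min(var_a, var_b)
df_a, df_b = len(load_a) - 1, len(load_b) - 1

# Two-sided p-value: how often two samples from the SAME population would
# show a variance ratio at least this large by chance alone.
p = 2 * f.sf(F, df_a, df_b)
print(f"F = {F:.2f}, p = {p:.3f}")
```

A small p suggests the difference in spread is probably real; a large one means the two groups could easily have come from the same population.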

It's not just about firing more shots. It's about understanding what to expect, and planning your tests so that you can walk away with confidence. Even 20 shot groups can be useless depending on the scenario, and the best of us make this mistake all the time.


Next week I will focus on the ways you can actually apply these concepts to improve your shooting. Some ideas I have to discuss:
  • How best to measure group size (hint - not extreme spread).
  • How many shots is enough?
  • How to compare two groups to see which is better.
  • How to design and plan tests that won't end up being a waste of time.
  • How to actually calculate confidence intervals. I will try to build an online calculator myself, because I can't find one that does what's needed.
13 Comments
Dennis F
6/2/2017 09:53:40 pm

Interesting pov regarding the pros and cons of understanding mv etc.
Just wanted to increase your field of theory here by suggesting that you look simultaneously at the projectile velocities and SD readings from a Silver Mountain target (and this being at the long ranges vs the 100 yds/m). I've found that I've had low SDs of around 4 and below (10 shots, and not always...) with my loadings/tuning and found the target end to show SDs of 15 to 21.
It seems to me that we target shooters luv the unknown, yet seek those perfect moments between both ends.
I suppose I'm throwing in another variable regarding those groupings and welcome your insight.

Adam
6/2/2017 09:55:16 pm

The SD at the muzzle and the target should be almost the same. The velocity drops predictably or else it wouldn't be possible to shoot accurately at long range. The high SD you are seeing at the target end is due to how the e-target works - it only provides a good SD measurement if many things are lined up just right.

Van Thein
6/2/2017 10:55:06 pm

This arrived at an opportune moment. Target shooting last night with my 4" barreled 9mm pistol at 25 yards. Repeated groups of 10 multiple times. Frequently had 80% inside the 6" circle, not always well inside. Started getting frustrated because I couldn't tighten up the groups. Then I realized the gun is unlikely capable of better than a 2-3" group at that distance, the ammo was not carefully loaded--bullet weights were different, charges were different, basically plinking ammo. I was tired, and the lighting was poor. I started thinking there's a degree of randomness that has to result from these variables. After looking at the hits on the array of targets and reading your blog I decided, to be fair to myself, I need to make very precise reloads and look at smaller samples next time. That's how they check these guns at the factory--a 3-shot group staying within 3 inches is considered acceptable. Hope I can do that.

Adam
6/3/2017 10:11:05 pm

In the long run it's always better to have more data than less. 3 shot groups will be smaller, but they won't help you measure the performance. Fire more shots but measure the SD instead of the ES.

Alex West
6/5/2017 10:38:46 am

I am very interested in how to tell if two groups are from the same population. I hope you go into this in your next blog post.

Adam
6/5/2017 12:16:59 pm

Will do. It's the F-test, to compare two SDs. However, it's not as simple as using the online calculators you can find. I have come up with my own Excel spreadsheet to convert the numbers the calculators provide into the true result we need. It takes some explanation, and I plan to build a one-step calculator that will do this.

Collin
6/6/2017 09:23:35 am

Hi Adam. As an Actuary - I really like this post! Always did find it funny how such small sample sizes were often used as the sole differentiator between different loads.

Given the inherent variability - is there a better way to still do adequate load development whilst not sacrificing barrel life too much (given the amount of independent error in small samples)?

Adam
6/6/2017 10:51:19 pm

This is a great question. Which load development process is best is a hard thing to know, and I hope to work towards that. It's a balance of iteratively testing while also making sure to have confidence in whatever you end up with. All the while, statistics need to be driving your decision of what to try next. I have the idea to write a simulation to test strategies and figure out which works best.

Collin
6/7/2017 06:07:45 am

Great idea!

If you go the simulation route, I wonder if you could go one step further. Some machine learning algorithms might be incredibly beneficial here.

If you set a target threshold for the total number of shots you're willing to sacrifice (in terms of barrel life) to compare two different loads, I wonder if you could then set up a neural network to help determine the best approach to testing.

A simpler and more supervised learning process might be along the lines of choosing the best m and n, where m is the number of samples and n is the size of each sample, to best differentiate between two loads (again with the constraint on total shots fired). This could then be scaled to a multi-load comparison.

Adam
6/7/2017 08:40:18 am

That's where I was thinking. I have no experience with neural networks - do you? It seems it's all about maximizing confidence in a good load vs. total number of shots fired. I have experience with simulation so I would naturally just try iterating over a range of different strategies and look for trends.

Shane Webber
4/14/2019 03:02:14 pm

For those of us just getting started, is there a qualified central acronym listing we could refer to? Looking through google is tough, dogpile is murder and youtube gets parents worried. For the rest of us we get fed up. Thank you

Steve L
6/30/2020 10:57:57 pm

Great article. I'm wondering how you get the varying percentages for the plus and minus confidence intervals, -35% and +137% for the 5-shot at 90% CI. When calculating CI in excel it gives me a single number that is +/- the mean. How would I create this function in excel is what I am ultimately asking?

Kelly
12/30/2020 12:40:58 pm

Great reading your post



