How Binomial is Skeet Shooting?

How Binomial is Skeet Shooting?#

Based on Chapter 5 of Think Stats

import os
import urllib.request
import urllib.parse

def download(url):
    filename = os.path.basename(urllib.parse.unquote(url))
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)
        print("Downloaded " + filename)


download("https://github.com/AllenDowney/ThinkStats/raw/v3/nb/thinkstats.py")

try:
    import empiricaldist
except ImportError:
    !pip install empiricaldist

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from thinkstats import decorate

The Binomial Distribution#

In the sport of skeet shooting, competitors use shotguns to shoot clay disks that are thrown into the air. In international competition, including the Olympics, there are five rounds with 25 targets per round, with additional rounds as needed to determine a winner.

As a model of a skeet-shooting competition, suppose that every participant has the same probability of hitting every target, p. Of course, this model is a simplification – in reality, it’s likely that some competitors have a higher probability than others, and even for a single competitor, it might vary from one attempt to the next. But even if it is not realistic, this model make some surprisingly accurate predictions, as we’ll see.

To simulate the model, I’ll use the following function, which takes the number of targets, n, and the probability of hitting each one, p, and returns a sequence of 1s and 0s to indicate hits and misses.

def flip(n, p):
    choices = [1, 0]
    probs = [p, 1 - p]
    return np.random.choice(choices, n, p=probs)

Here’s an example that simulates a round of 25 targets where the probability of hitting each one is 90%.

flip(25, 0.9)

array([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1,
       1, 1, 1])

If we generate a sequence of 1000 attempts, and compute the Pmf of the results, we can confirm that the proportions of 1s and 0s are correct, at least approximately.

from empiricaldist import Pmf

seq = flip(1000, 0.9)
pmf = Pmf.from_seq(seq)
pmf

	probs
0	0.116
1	0.884

Now we can use flip to simulate a round of skeet shooting and return the number of hits.

def simulate_round(n, p):
    seq = flip(n, p)
    return seq.sum()

In a large competition, suppose 200 competitors shoot 5 rounds each, all with the same probability of hitting the target, p=0.9. We can simulate a competition like that by calling simulate_round 1000 times.

n = 25
p = 0.9
results_sim = [simulate_round(n, p) for i in range(1000)]

The average score is close to 22.5, which is the product of n and p.

np.mean(results_sim), n * p

(22.537, 22.5)

Here’s what the distribution of the results looks like.

from empiricaldist import Pmf

pmf_sim = Pmf.from_seq(results_sim, name="simulation results")

pmf_sim.bar(alpha=0.6)
decorate(xlabel="Hits", ylabel="PMF")

_images/47d8ce66397ff03809c3ac9277081fa35a2b43356980d8fda9e3bddae329be90.png

Instead of running a simulation, we could have predicted this distribution. Mathematically, we can show that the distribution of these outcomes follows a binomial distribution, which has a PMF that is easy to compute.

from scipy.special import comb


def binomial_pmf(k, n, p):
    """Compute the binomial PMF.

    k (int or array-like): number of successes
    n (int): number of trials
    p (float): probability of success on a single trial

    returns: float or ndarray
    """
    return comb(n, k) * (p**k) * ((1 - p) ** (n - k))

This function computes the probability of getting k hits out of n attempts, given p. If we call this function with a range of k values, we can make a Pmf that represents the distribution of the outcomes.

ks = np.arange(16, n + 1)
ps = binomial_pmf(ks, n, p)
pmf_binom = Pmf(ps, ks, name="binomial model")

And here’s what it looks like compared to the simulation results.

from thinkstats import two_bar_plots

two_bar_plots(pmf_sim, pmf_binom)
decorate(xlabel="Hits", ylabel="PMF")

_images/ea35be730cb8c75652b0ef236be2740983a520d10df4221c9ef29a4578f7b6ed.png

They are similar, with small differences because of random variation in the simulation results.

This agreement should not be surprising, because the simulation and the model are based on the same assumptions – particularly the assumption that every attempt has the same probability of success. The real test of a model is how it compares to real data.

From the Wikipedia page for the men’s skeet shooting competition at the 2020 Summer Olympics, we can extract a table that shows the results for the qualification rounds of the competition.

Downloaded from https://en.wikipedia.org/wiki/Shooting_at_the_2020_Summer_Olympics_–_Men’s_skeet on July 15, 2024.

filename = "https://github.com/AllenDowney/ThinkStats/raw/v3/data/Shooting_at_the_2020_Summer_Olympics_%E2%80%93_Men's_skeet"
download(filename)

Downloaded Shooting_at_the_2020_Summer_Olympics_–_Men's_skeet

tables = pd.read_html("Shooting_at_the_2020_Summer_Olympics_–_Men's_skeet")
table = tables[6]
table.head()

	Rank	Athlete	Country	1	2	3	4	5	Total[3]	Shoot-off	Notes
0	1	Éric Delaunay	France	25	25	25	24	25	124	+6	Q, OR
1	2	Tammaro Cassandro	Italy	24	25	25	25	25	124	+5	Q, OR
2	3	Eetu Kallioinen	Finland	25	25	24	25	24	123	NaN	Q
3	4	Vincent Hancock	United States	25	25	25	25	22	122	+8	Q
4	5	Abdullah Al-Rashidi	Kuwait	25	25	24	25	23	122	+7	Q

filename = "https://github.com/AllenDowney/ThinkStats/raw/v3/data/Shooting_at_the_2024_Summer_Olympics_%E2%80%93_Men's_skeet"
download(filename)

tables = pd.read_html("Shooting_at_the_2024_Summer_Olympics_–_Men's_skeet")
table = tables[5]
table.head()

	Rank	Athlete	Country	1	2	3	4	5	Total	Shoot-off	Notes
0	1	Conner Prince	United States	24	25	25	25	25	124	12.0	Q, EQOR
1	2	Tammaro Cassandro	Italy	25	25	24	25	25	124	11.0	Q, EQOR
2	3	Lee Meng-yuan	Chinese Taipei	25	24	25	25	25	124	7.0	Q, EQOR
3	4	Vincent Hancock	United States	25	25	25	24	24	123	NaN	Q
4	5	Stefan Nilsson	Sweden	24	24	24	25	25	122	6.0	Q

The table has one row for each competitors, with one column for each of five rounds. We’ll select the columns that contain the results and use flatten to put them into a single array.

columns = ["1", "2", "3", "4", "5"]
results = table[columns].values.flatten()

total_shots = 25 * len(results)
total_hits = results.sum()
n, total_shots, total_hits

(25, 3750, 3545)

With 30 competitors, we have results from 150 rounds of 25 shots each, with 3750 hits out of a total of 3575 attempts. So the overall success rate is 95.3%.

p = total_hits / total_shots
p

0.9453333333333334

Now we can compute a Pmf that represents the binomial distribution with n=25 and the value of p we just computed.

ps = binomial_pmf(ks, n, p)
pmf_binom = Pmf(ps, ks, name="binomial model")

And we can compare that to the Pmf of the actual results.

pmf_results = Pmf.from_seq(results, name="actual results")

two_bar_plots(pmf_results, pmf_binom)
decorate(xlabel="Hits", ylabel="PMF")

_images/c35d19d5f7ce2c881a5db5a015bd26585327b6f6858519e9a8b52476419e362c.png

The binomial model is a good fit for the distribution of the data.

from thinkstats import two_cdf_plots

two_cdf_plots(pmf_results.make_cdf(), pmf_binom.make_cdf())
decorate(xlabel="Hits", ylabel="CDF")

_images/9c62a48f447a298945dfc4447348efe9f784370da56c0d0c7c63a11bf4bc8e0d.png

How Binomial is Skeet Shooting?

Contents

How Binomial is Skeet Shooting?#

The Binomial Distribution#