# Bayesian model for highly applied decision making in American football

*In American football, the offense gets four attempts (downs) to advance 10 yards; if it succeeds, it keeps possession and gets a new set of downs. Before the 4th attempt, coaches often face a choice: try to gain the remaining yards, at the risk of failing and handing the ball to the opponent at the current spot, or punt right away and put the defense in a safer position. In this article we build a Bayesian model to make that decision easier and better grounded.*

*Prerequisites: basic knowledge of Bayes' theorem and a solid grasp of the rules and terminology of American football.*

# The task

Decide whether to punt or to play the 4th attempt in a “4th and *j* yards” situation at the *i*-yard mark of the field.

# Events

From the statement of the problem, it follows that we must consider two possessions: our current one and the opponent's possession that follows it. Over these two possessions, four events describe all the significant outcomes (more than one of them can occur):

**if we play the 4th attempt:**

- *A*: our team scores a touchdown over the two possessions;
- *B*: our team concedes a touchdown over the two possessions (including a return touchdown into our end zone).

**if we play a punt:**

- *C*: our team concedes a touchdown over the two possessions;
- *D*: our team scores a touchdown over the two possessions (for example, a pick-six).

# Decision

## General idea

Thus, the task is reduced to comparing four probabilities:

- *P (A)*: the probability of scoring a touchdown when we choose to play the 4th attempt;
- *P (B)*: the probability of conceding a touchdown when we choose to play the 4th attempt;
- *P (C)*: the probability of conceding a touchdown when we choose to punt;
- *P (D)*: the probability of scoring a touchdown when we choose to punt.

And the choice whether to play the 4th attempt or not comes down to solving the inequality:

$$P(A) - P(B) \quad ? \quad P(C) - P(D)$$

The events that affect the probabilities on the left side of the inequality are touchdowns scored and conceded over the two possessions, and the first down gained on the 4th attempt.

These events are statistically dependent, so we will use Bayes' formula. The problem can be described in terms of ordinary probability theory (and even reduced to it), but to show the full dependence between the probabilities we will use Bayes' theorem.

The events on the right side of the inequality are a touchdown conceded and a touchdown scored over the two possessions, and the number of yards by which our punt pushes the opponent back. We treat the last quantity (the yards by which the line of scrimmage moves after the punt) as a constant and use its average from the statistics. With that simplification, these events are treated as statistically independent, so we use ordinary unconditional probabilities here.

## 4th attempt play

The probability of scoring a touchdown after deciding to play the 4th attempt, *P (A)*, depends on whether the 4th attempt succeeds. It also depends on how often our team turns a first down at that part of the field into a touchdown. These probabilities fully describe the possible outcomes and, conveniently, can be taken from the statistics accumulated for your own team:

- *P (X)*: the statistical probability of gaining *j* yards in one attempt;
- *P (A | X)*: the statistical probability of scoring a touchdown from the *i*-th yard of the field (starting from a 1st-and-10 situation).

*Here we neglect the possible yards gained on the 4th attempt and for simplicity we take i as the current mark of the second marker.*
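As a rough illustration of what "taken from statistics" could mean in practice, the sketch below estimates *P (X)* as the share of converted 4th attempts with *j* yards to go. The `FourthDownPlay` record, the `conversion_rate` helper and the sample plays are all hypothetical; the article does not prescribe a data format.

```python
from dataclasses import dataclass


@dataclass
class FourthDownPlay:
    yards_to_go: int  # j: yards needed for the first down
    converted: bool   # True if the first down was reached


def conversion_rate(plays, j):
    """Empirical P(X): share of 4th-and-j attempts that were converted."""
    outcomes = [p.converted for p in plays if p.yards_to_go == j]
    if not outcomes:
        raise ValueError(f"no recorded attempts with {j} yards to go")
    return sum(outcomes) / len(outcomes)


# Made-up sample, for illustration only.
plays = [
    FourthDownPlay(2, True),
    FourthDownPlay(2, False),
    FourthDownPlay(2, True),
    FourthDownPlay(2, True),
    FourthDownPlay(5, False),
]

p_x = conversion_rate(plays, j=2)
print(p_x)  # 0.75
```

The same counting approach would apply to *P (A | X)*, with drives grouped by starting position instead of by yards to go.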

Moving to the terms of Bayes' theorem, we treat *P (A | X)* as the posterior probability given the event *X*, and *P (A)* as the prior probability we are looking for.

Thus, the basic formula of Bayes’ theorem is:

$$P(A \mid X) = \frac{P(X \mid A)\,P(A)}{P(X)}$$

where *P (X | A)* is the probability that the 4th attempt was converted, given that our team scores a touchdown. By common sense it equals one, since the drive cannot end in a touchdown unless the 4th attempt succeeds. Thus, the prior probability we are looking for is:

$$P(A) = P(A \mid X)\,P(X)$$

As a result, *P (A)* is simply the product of the probability of gaining *j* yards and the probability of scoring from the *i*-th yard in a 1st-and-10 situation. We take both probabilities from statistics.
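In code, this step is a single multiplication. A minimal sketch, with a hypothetical function name (`p_touchdown_go`) and made-up input values:

```python
def p_touchdown_go(p_x, p_a_given_x):
    """P(A) = P(A | X) * P(X), which relies on P(X | A) = 1."""
    for name, value in (("p_x", p_x), ("p_a_given_x", p_a_given_x)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    return p_a_given_x * p_x


# Illustrative numbers only: a 50% conversion rate and a 40% chance
# of turning the resulting first down into a touchdown.
p_a = p_touchdown_go(p_x=0.5, p_a_given_x=0.4)
print(p_a)  # 0.2
```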

The probability of conceding a touchdown after deciding to play the 4th attempt, *P (B)*, is the sum of two prior probabilities:

- *P (Y)*: the probability of conceding a touchdown after an unsuccessful 4th attempt (the opponent attacks from the spot of the attempt, starting with a 1st-and-10). The unsuccessful attempt is the complement of *X*, written *1-X* in the formulas, and its probability is *1 - P (X)*;
- *P (Z)*: the probability of conceding a touchdown after a successful 4th attempt, for example after a change of possession on the following drives.

These two probabilities are priors, so each must be weighted by the probability of the corresponding outcome of the 4th attempt. For an unsuccessful 4th attempt:

$$P(Y) = P(Y \mid 1-X)\,P(1-X)$$

and for a successful 4th attempt:

$$P(Z) = P(Z \mid X)\,P(X)$$

For simplicity, let's take for *P (Z | X)* the plain statistical probability of conceding a touchdown after a kickoff. Simplifying a little more, we can reduce it to the probability of conceding from *30 + k* yards, that is, from the spot to which we push the opponent back, on average, on the kickoff.

As before, these probabilities can be taken from the statistics accumulated for your own team.

So, adding the two cases together (the law of total probability):

$$P(B) = P(Y \mid 1-X)\,P(1-X) + P(Z \mid X)\,P(X)$$

The net result of deciding to play the 4th attempt is then as follows. We are already in a vulnerable situation on the 4th attempt, so the question is which decision does less damage:

$$P(A \mid X)\,P(X) - P(Y \mid 1-X)\,P(1-X) - P(Z \mid X)\,P(X)$$

In effect, the expression is the difference between the touchdowns we are likely to score and the touchdowns we are likely to concede over the two possessions.
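The expression can be sketched as a small function. The name `go_for_it_value`, its parameter names and the sample numbers are assumptions made for this illustration; `p_y_given_fail` stands for *P (Y | (1-X))*, and the probability of a failed attempt is computed as *1 - P (X)*.

```python
def go_for_it_value(p_x, p_a_given_x, p_y_given_fail, p_z_given_x):
    """Expected touchdowns scored minus conceded when playing the 4th attempt."""
    p_fail = 1.0 - p_x                                  # P(1-X)
    p_a = p_a_given_x * p_x                             # P(A)
    p_b = p_y_given_fail * p_fail + p_z_given_x * p_x   # P(B) = P(Y) + P(Z)
    return p_a - p_b


# Illustrative numbers only.
value = go_for_it_value(
    p_x=0.6,             # chance of converting the 4th attempt
    p_a_given_x=0.5,     # chance of a touchdown after converting
    p_y_given_fail=0.4,  # chance of conceding after a failed attempt
    p_z_given_x=0.2,     # chance of conceding after a converted attempt
)
print(round(value, 3))  # 0.02
```

A positive value means that, over the two possessions, playing the 4th attempt is expected to bring more touchdowns than it gives away.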

## Punt

When we punt, we give up our own possession (one of the two we are considering), and the calculation comes down to the likely damage while our team plays defense.

To do this, we need to know where the opponent will start his possession and the probability of conceding a touchdown from there. For simplicity, the probability that we score a touchdown during the opponent's possession (a pick-six) is assumed to be zero. Thus, *P (D) = 0*.

Alternatively, you can take this value from statistics.

For simplicity, we assume that our punt always pushes the opponent back by the same average distance. So, from the accumulated statistics we take:

- *k*: the average number of yards by which our punt pushes the opponent back, with return yards taken into account;
- *P (C)*: the statistical probability of conceding a touchdown from *i + k* yards on the field (starting from a 1st-and-10 situation).

Note that *P (C)* is taken at *i + k* yards, that is, at the current position on the field plus the average number of yards gained by the punt.
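A minimal sketch of the punt side follows. It assumes that positions are measured in yards from our own goal line (which is how *i + k* reads here) and uses a made-up lookup table, `CONCEDE_BY_START`; the helper names are hypothetical, and real values would come from your team's statistics.

```python
# Hypothetical placeholder: chance that the opponent scores a touchdown when
# starting a 1st-and-10 at a given distance (in yards) from our goal line.
CONCEDE_BY_START = {10: 0.60, 30: 0.40, 50: 0.25, 70: 0.15, 90: 0.08}


def p_concede_from(yard):
    """Return the probability stored for the nearest tabulated position."""
    nearest = min(CONCEDE_BY_START, key=lambda y: abs(y - yard))
    return CONCEDE_BY_START[nearest]


def punt_probabilities(i, k, p_d=0.0):
    """Return (P(C), P(D)) for a punt from yard i with average net gain k."""
    # Simplification of this sketch: cap the start position inside the field.
    start = min(i + k, 99)
    return p_concede_from(start), p_d


p_c, p_d = punt_probabilities(i=35, k=40)
print(p_c, p_d)  # 0.15 0.0
```

Passing a non-zero `p_d` corresponds to taking *P (D)* from statistics instead of setting it to zero.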

# Outcome

To decide whether to play the 4th attempt or to punt, we compare the possible damage (net of the possible benefit) from the first decision with the possible damage from the second. All the data can be taken from the accumulated statistics. In addition, the attentive reader will notice that *P (Y | (1-X))*, *P (Z | X)* and *P (C)* are the same quantity, only taken for different *i*, that is, for different positions on the field.

$$P(A \mid X)\,P(X) - P(Y \mid 1-X)\,P(1-X) - P(Z \mid X)\,P(X) \quad ? \quad P(C)$$

Thus, with the play statistics for your own team and the team's current situation before the 4th attempt (the number of yards to the first down and the position of the line of scrimmage), you can evaluate the chances and choose the more effective option.
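The whole comparison can be sketched as one function. The direction of the "?" is left open above, so this sketch makes an explicit assumption: both options are scored on the same scale, as expected touchdowns scored minus expected touchdowns conceded, which values the punt at *P (D) - P (C)*. The function name and all numbers are hypothetical.

```python
def fourth_down_decision(p_x, p_a_given_x, p_y_given_fail, p_z_given_x,
                         p_c, p_d=0.0):
    """Compare the two options on a common net-touchdown scale.

    Returns (decision, go_value, punt_value).
    """
    # Left side: P(A|X)P(X) - P(Y|1-X)P(1-X) - P(Z|X)P(X)
    go_value = (p_a_given_x * p_x
                - p_y_given_fail * (1.0 - p_x)
                - p_z_given_x * p_x)
    # Assumption of this sketch: the punt is worth P(D) - P(C).
    punt_value = p_d - p_c
    decision = "go for it" if go_value > punt_value else "punt"
    return decision, go_value, punt_value


# Illustrative numbers only: a short 4th attempt close to midfield.
decision, go_value, punt_value = fourth_down_decision(
    p_x=0.6, p_a_given_x=0.5, p_y_given_fail=0.4, p_z_given_x=0.2, p_c=0.15
)
print(decision)  # go for it

# A long 4th attempt deep in our own half.
print(fourth_down_decision(0.2, 0.3, 0.6, 0.2, p_c=0.1)[0])  # punt
```

With a different reading of the "?", only the `punt_value` line and the final comparison would need to change.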

To demonstrate the idea, I wrote a script in a Jupyter notebook where you can play around with the parameters and the position on the field; it also plots the distribution of odds for a punt and for playing the 4th attempt.

I hope the topic is not so narrow that it is of no use to anyone. For me it was interesting to practice applying Bayesian models at a basic level. If I made a mistake somewhere, write to me and I will correct or extend the article.