Guest column by Cole Jacobson
As Tom Brady entered the huddle with 1:46 remaining in the 2020 NFC Championship Game seeking to reach his NFL-record 10th Super Bowl, his Buccaneers had two options. The Bucs faced a third-and-4 while holding on to a 31-26 lead. Option one: run the ball and (probably) not get a first down, but force the Packers to burn their final timeout before giving Aaron Rodgers possession. Option two: put the ball in the air to try to seal the game, but with the risk of an incomplete pass that would allow Green Bay to get the ball back and hold on to that coveted timeout.
Most of us remember the outcome; Tampa Bay chose to pass and converted on a pass interference penalty on cornerback Kevin King, and the Packers never touched the ball again. Brady ended up holding his seventh Lombardi Trophy two weeks later. But while the Buccaneers’ gamble worked out in this instance, individual anecdotes are not evidence. This raises the question: when one team has the ball with the lead late in a football game, is it more effective to run the ball and kill the clock, or throw the ball to try to prevent the trailing team from getting the ball again? Using data from NFLFastR, I attempted to find out.
I started with all of the play-by-play data available for the past 20 seasons, including playoffs (2001 to 2020). I originally wanted a shorter time span to account for passing becoming both more common and more efficient in recent years, but decided that maximizing sample sizes was most important. I isolated all scrimmage plays that occurred in the last two minutes of each game, excluding kneeldowns. From there, I created a set where the offensive team was leading by one possession (2,773 plays) and a set where the offensive team was trailing by one possession (11,090 plays). I limited data to the last two minutes because the two-minute warning serves as a distinct threshold after which play-calling becomes far more influenced by the clock; before that point, a few seconds matter much less (e.g., a losing team’s odds of scoring if it gets the ball back won’t be greatly altered by whether it gets possession with 2:50 to go compared to 3:30).
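For readers who want to reproduce the filtering step, here is a minimal sketch of it. The field names (`game_seconds_remaining`, `play_type`, `qb_kneel`, `score_differential`) are assumptions about how the play-by-play export is laid out, and the 8-point cutoff for "one possession" is my assumption too, since the exact threshold isn't spelled out above. The sketch also ignores overtime.

```python
# Hypothetical field names; check them against the actual play-by-play data.
ONE_POSSESSION = 8  # assumption: "one possession" means a margin of 1 to 8 points


def is_late_scrimmage_play(play):
    """True for a run or pass (kneeldowns excluded) in the last two minutes."""
    return (
        play["game_seconds_remaining"] <= 120
        and play["play_type"] in ("run", "pass")
        and not play["qb_kneel"]
    )


def split_project_plays(plays):
    """Return (leading, trailing) lists of late one-possession plays."""
    leading, trailing = [], []
    for play in plays:
        if not is_late_scrimmage_play(play):
            continue
        margin = play["score_differential"]  # offense score minus defense score
        if 1 <= margin <= ONE_POSSESSION:
            leading.append(play)
        elif -ONE_POSSESSION <= margin <= -1:
            trailing.append(play)
    return leading, trailing


# Made-up rows, purely to show the shape of the input.
sample_plays = [
    {"game_seconds_remaining": 106, "play_type": "pass", "qb_kneel": False, "score_differential": 5},
    {"game_seconds_remaining": 45, "play_type": "run", "qb_kneel": True, "score_differential": 3},
    {"game_seconds_remaining": 90, "play_type": "pass", "qb_kneel": False, "score_differential": -7},
    {"game_seconds_remaining": 600, "play_type": "run", "qb_kneel": False, "score_differential": 4},
]
leading_plays, trailing_plays = split_project_plays(sample_plays)
print(len(leading_plays), len(trailing_plays))
```

Only the first and third sample rows survive the filter: the second is a kneeldown and the fourth happens well before the two-minute mark.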
Summary of the project: while it runs counter to most football analytics discourse, running the ball tends to be more beneficial than passing in helping leading teams hold onto that lead, mainly when the trailing team is out of timeouts. Passing’s usual efficiency edge over running is even larger in these situations, due to the trailing defense’s tendency to sell out to stop the run, but this gap in efficiency generally isn’t enough to outweigh the prospect of killing 40 valuable seconds. However, the data show that, if and when the losing team gets the ball back, timeouts aren’t very valuable to it. As a result, offenses’ primary motivation for running the ball should be to kill clock, not to force the defense to burn its timeouts, because clock matters far more than timeouts to the losing team.
Is passing even more effective than usual in late-game clock-killing situations?
The first order of business is to figure out if passing is more successful in these clock-killing situations than it is at other points in a game. Basic game theory would suggest that this is the case: the defense knows that the offense wants to drain the clock, which means the defense will be selling out to stop the run by stuffing the box while playing no-help man coverage or some variation of it, which opens the door for the offense to pass more successfully than it normally would. And that’s indeed how it plays out:
The above charts show the rate at which teams leading by one possession in the final two minutes of a game got a first down or touchdown on any given play. (The black brackets represent 95% confidence intervals.) I use NFLFastR’s distinction between “pass” vs. “run,” which is based on the intent of a play rather than its result (i.e., scrambles and sacks are still “pass plays,” even though the ball wasn’t thrown). We can see that for any medium- to long-distance to go, passing is drastically more efficient than running, and even the 1- to 2-yard range is ambiguous based on the small sample size (25 passes).
By comparison, here’s how the same table and graph look if we consider all plays throughout an entire game, regardless of clock or whether the offensive team leads:
I scrapped the confidence intervals here, because the sample size is large enough that they would be very narrow. While it is true here that passing is more effective for any distance besides the 1- to 2-yard range, the gaps between passing and running are much smaller here than in the prior chart. For example, when considering our “Project Plays” (offense leading by one possession in the final two minutes) in the 3- to 6-yard range, passing earned a first down/touchdown on 43.8% of plays, while running did so on 21.4%. In contrast, for all offensive plays in the 3- to 6-yard range at any point of a game, passing earned a first down/touchdown on 47.0% of plays, while running did so on 34.2%. The latter gap (12.8 percentage points) is still significant, but less so than the former (22.4 percentage points).
If we isolate third downs, which are traditionally the most polarizing in terms of late-game play-call choices, we can verify this further. Note that there have only been 78 passing plays on first/second downs in our “Project Plays” since 2001. For what it’s worth, those plays have been extremely effective: 30 of the 78 resulted in first downs/touchdowns despite an average of 9.73 yards to go, and the plays have had an average gain of 7.77 yards, including penalties.
The above table/chart represent the third downs within our “Project Plays,” while the following ones represent all third downs at any point of a game:
With the exception of the 10-plus-yard range, which may be impacted by a small sample size (52 passes in the “Project Plays” group), we see a similar breakdown here to our first data set. This is particularly pronounced in the 3- to 6-yard range; in the “Project Plays,” we had an 18.8% chance of first down/touchdown with a run and 43.4% with a pass, whereas at any point of a game, we had a 42.3% chance with a run and 47.0% with a pass.
On its own, this is by no means groundbreaking information. Any casual football fan’s intuition would be that running becomes harder in these late-game situations where the defense is losing and selling out to stop the run. But it’s valuable to see the data confirm this assumption, and it provides a useful first step for our overall project.
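To give a feel for how wide the 95% confidence intervals get with samples this small, here is a rough sketch using the one raw count quoted above: 30 conversions on 78 first- and second-down passes. It uses the normal-approximation (Wald) interval, which is an assumption on my part and not necessarily the method behind the brackets on the charts.

```python
import math


def wald_interval(successes, attempts, z=1.96):
    """Normal-approximation confidence interval for a proportion (z=1.96 is ~95%)."""
    rate = successes / attempts
    half_width = z * math.sqrt(rate * (1 - rate) / attempts)
    # Clip to [0, 1] so the interval never leaves the range of a probability.
    return max(0.0, rate - half_width), min(1.0, rate + half_width)


low, high = wald_interval(30, 78)
print(f"conversion rate: {30 / 78:.1%}, 95% CI: {low:.1%} to {high:.1%}")
```

With only 78 plays, the interval spans more than 20 percentage points, which is why the smaller buckets in this project deserve caution.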
How much does time remaining impact the trailing team’s odds of scoring?
If our concern was only to learn which of passing or running was more effective, we could wrap it up. But the issue at hand isn’t just about whether passing creates a better chance at moving the chains than running. Rather, it’s about whether that advantage created by passing the ball is big enough to outweigh the benefits of potentially killing 40-plus seconds by running the ball.
To approach this aspect, we have to figure out just how much those few extra seconds help out the losing team if/when they get possession back. We’ll introduce the concept of “adjusted time left,” which involves adjusting the clock for how many timeouts the trailing team has. (See the bottom methodology section for more on that.)
I created a new data frame which includes every drive that began in the final two minutes with the offense trailing by one possession. This allows us to dig into how the time remaining (and other variables, such as field position and timeouts remaining) impact the trailing team’s odds of scoring. We use odds of scoring rather than net points per drive (e.g. made field goal counts as +3, pick-six counts as -7) because, in the specific context of a team trailing by one score in the final seconds, it doesn’t have any concern for a turnover that allows the defense to score. When a last-second desperation play leads to points for the defense, it doesn’t harm the offense’s chances of winning any more than a turnover on downs would. In other words, losing by 14 isn’t any different than losing by 7.
Consequently, we’ll dissect the probability of having a scoring drive instead. With this modification, we can treat a pick-six the same as we would an incompletion on a Hail Mary to end the game, as shown in this best-fit plot:
Field position refers to distance from the opponent’s end zone (i.e., “70” refers to the offense’s own 30-yard line). These curves reflect what we expect: more time on the clock leads to a higher chance of a scoring drive. While the orange line representing drives starting in opponent territory is particularly steep, perhaps due to a smaller sample size of 83 drives, the clock still plays an important role in the other situations too.
Suppose we take a drive starting at the offense’s own 25, represented by the turquoise curve. Having 100 seconds left and no timeouts leads to approximately a 15% chance of scoring, whereas having 60 seconds left and no timeouts leads to approximately an 8% chance. A gap of 7 percentage points doesn’t sound like much, but the theoretical drive starting with 1:40 left has almost double the scoring probability of one starting with 1:00 to go, which shows how important those seconds are.
To go beyond eye-balling the above plot, I created multivariate regression models to predict a team’s chances of scoring based on clock, field position, and timeouts. Below is a summary of one of those models, with examples of how it can be applied:
This linear model estimates that a team starting from its own 35-yard line, with one timeout left and 1:10 on the clock, would have a 15.5% chance of scoring, compared to an 8.9% chance if the clock was at 0:30 instead of 1:10. This predictive model isn’t perfect (needless to say, it’s impossible to have a negative probability of scoring in any context), but for the most part, the model’s projected impact of losing out on 40 seconds is extremely similar to what we saw in the best-fit plot, which was directly taken from real results.
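As a quick sanity check on that "extremely similar" claim, the sketch below converts both pairs of numbers quoted above into a change in scoring probability per second of clock. The two scenarios differ in field position and timeouts, so this is only a rough comparison. The `clamp_probability` helper is my own illustration of one simple way to handle the negative-probability issue; it is not part of the model summarized above.

```python
def per_second_effect(p_more_time, p_less_time, seconds_apart):
    """Change in scoring probability per second of clock, in probability units."""
    return (p_more_time - p_less_time) / seconds_apart


# Linear model example: 15.5% at 1:10 vs. 8.9% at 0:30 (own 35, one timeout).
model_slope = per_second_effect(0.155, 0.089, 40)

# Best-fit plot reading: about 15% at 100 s vs. about 8% at 60 s (own 25, no timeouts).
plot_slope = per_second_effect(0.15, 0.08, 40)


def clamp_probability(p):
    """Keep a linear model's output inside [0, 1] (illustrative helper only)."""
    return min(1.0, max(0.0, p))


print(f"model: {model_slope * 100:.3f} points/sec, plot: {plot_slope * 100:.3f} points/sec")
```

Both work out to a bit under 0.2 percentage points of scoring probability per second, or roughly 7 percentage points over a 40-second run-off.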
However, the model shows that the number of timeouts is not a strong indicator of the offense’s chances of scoring, with a much less significant P-value than the other variables. This is because timeouts aren’t particularly useful for the losing team once it already has the ball, since plays during a two-minute drill will never use the majority of the play clock. In other words, the losing team needs its timeouts most when it is on defense, because that’s when it can use them to prevent the leading team from burning 40-plus seconds. This distinction is extremely important: when the losing team has the ball, clock matters drastically more than timeouts do.
Combining it all: for the leading team, what’s ultimately more conducive to winning?
We’ve quantified how much more likely it is for passing to result in a first down/touchdown than running, and how much burning game clock plays a role in lessening the losing team’s odds of scoring if it gets the ball back. Which of those two traits matters more for the leading team?
I initially had two approaches to figuring out which of passing or running was more conducive to the leading team keeping its lead: one based on the actual eventual winners of each game, and one based on NFLFastR’s win probability added (WPA) metric. I ran code for both options, but I will only share the first one, primarily to save space but also because WPA’s volatility on a play-to-play basis made the results a bit noisy. While the route of looking at which team won can often be a dangerous one to take in football discourse, since it’s easy to get sucked into the faulty “running more often leads to winning” mindset, it’s safer here because we’re isolating situations where the offensive team has already gained a lead in the final two minutes. As a result, we’re avoiding the “correlation vs. causation” mishap that can often happen when a team runs the ball 20-plus times in the second half after leading 27-7 at halftime.
With that being said, below are a table and graph displaying the rate at which teams leading by one possession in the final two minutes of a game end up winning that game, based on whether they run or pass on any third-down play:
Though this is antithetical to just about every football analytics project ever made, in all four distance ranges, running the ball has led to a win more often than passing the ball. We must point out that our confidence intervals suggest that there could be some noise, particularly in the 1- to 2-yard range. But still, the fact that running has led to wins more often than passing even after we control for score, down, and clock is noteworthy, because most “running the ball leads to winning” takes don’t account for the fact that passing teams are often trailing.
One way to get more insight here is to stratify by whether the defense has a timeout or not; i.e., whether the offense is almost guaranteed to be able to kill 40-plus seconds with a run play. Below is how often teams eventually went on to win after third downs in our “Project Plays” group, when the defense had no timeouts at the time of the snap:
And here are the same plays, except when the defense did have at least one timeout:
While we can disregard the 1- to 2-yard range because of its extremely small sample size for passes, the other ranges show us an interesting discrepancy. Consider the chart where the defense had zero timeouts: in all four columns, running has led to wins more often than passing even after controlling for score, down, and clock. The confidence intervals do overlap in every yardage range, but the fact that the same trend (running over passing) exists for each one can’t be ignored. Meanwhile, consider the chart where the defense does have a timeout: running vs. passing is essentially a wash, with passing even being slightly higher in the 3- to 6-yard range. This leads us to a conclusion that makes sense based on our findings about how timeouts don’t vastly help trailing teams when they have the ball: running the ball is beneficial to the leading team when it knows it can kill 40-plus seconds, but does not appear to have a tangible impact on boosting win percentage otherwise.
We’ve established that when the defense has a timeout, there’s far more incentive to air the ball out. Have coaches behaved properly in recognition of this?
Overall, that’s a resounding yes. When defenses have no timeouts remaining on third downs in the final two minutes, passes have been extremely rare. In contrast, when the defense does have a timeout, coaches have been noticeably more willing to air the ball out. Broadly, coaches have stuck to the correct mindset: run if you know it’ll kill the clock, but don’t be afraid to put the ball in the air otherwise.
Real-World Application: Back to the 2020 NFC Championship Game
Let’s say you’re sick of all the nerd graphs, and you just want to look at how any of this can be applied in a real game. We can look back at that third down from the Buccaneers’ win over the Packers. Keep in mind that the following numbers are not based on team personnel; i.e., they treat the 2020 Packers the same as any NFL team from 2001 to 2020.
Tampa Bay faced a third-and-4 from its own 37 with 1:46 left. Suppose that Tampa Bay would have clinched the game with a first down. (Technically, it wasn’t impossible that Green Bay could have still gotten the ball back, but the chances of getting possession at all, let alone scoring a touchdown in that brief time, are so negligible even for a Hail Mary wardaddy such as Aaron Rodgers that we shouldn’t bother.)
In the past 20 years, teams facing a third down with 3 to 5 yards to go, while leading by one possession in the final two minutes, converted on 27.6% of all attempts. They ran the ball on 72.4% of such attempts, with a conversion rate of 22.5%, and attempted a pass play (including scrambles and sacks) on the other 27.6% of plays, with a success rate of 41.2%.
Let’s assume that, if Green Bay had forced a punt, it would’ve gotten the ball back with 1:30 to go at its own 25-yard line. They would have a timeout if Tampa Bay threw incomplete, but would not if Tampa Bay ran the ball.
Teams to get the ball when trailing by one possession, with at least one timeout, from their own 15- to 35-yard line and with 1:15 to 1:45 remaining, have scored on that drive 15.9% of the time. Teams in the exact same conditions, but with no timeouts, have scored 12.9% of the time.
If Tampa Bay passes the ball: 41.2% chance of winning on that play, and an 84.1% chance of getting a game-clinching stop after a punt, assuming Green Bay doesn’t burn a timeout. Total win probability: 0.412 + (0.588 * 0.841) = 90.7%.
If Tampa Bay runs the ball: 22.5% chance of winning on that play, and an 87.1% chance of getting a game-clinching stop after a punt, assuming Green Bay burns a timeout. Total win probability: 0.225 + (0.775 * 0.871) = 90.0%.
Based on this extremely general calculation that doesn’t account for either team’s strengths or weaknesses, Tampa Bay’s decision to pass the ball was the correct one, by a small margin.
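The arithmetic above can be wrapped in a small function so the same comparison can be rerun with different inputs. This is just a sketch of the calculation as laid out here, under the same simplifying assumptions (a first down clinches the game, and a failed conversion always leads to a punt). The function and variable names are mine.

```python
def hold_probability(convert_rate, opponent_score_rate):
    """Chance the leading team wins: convert now, or fail and still get a stop."""
    return convert_rate + (1 - convert_rate) * (1 - opponent_score_rate)


# Pass: 41.2% conversion; Green Bay keeps its timeout and scores 15.9% of the time.
pass_win_prob = hold_probability(0.412, 0.159)

# Run: 22.5% conversion; Green Bay burns its timeout and scores 12.9% of the time.
run_win_prob = hold_probability(0.225, 0.129)

print(f"pass: {pass_win_prob:.1%}, run: {run_win_prob:.1%}")
```

Swapping in a team-specific conversion rate or a different estimate of the opponent's scoring chance is a one-line change, which is where personnel and scouting would enter the picture.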
Possible Sources of Error/Other Comments on Methodology
Like any football analytics project, this shouldn’t be blindly obeyed in all possible contexts. Analytics are used properly when they’re helping teams make informed decisions in the moment rather than forcing coaches to disregard all other factors at play. For example, player personnel has a major impact. If a coach particularly has respect for the other team’s passing attack, like the Buccaneers facing Rodgers, he has more justification to make sure the opposing team doesn’t get another chance. Similarly, scouting plays a major role too. If one team has discerned that the opposing defensive coordinator always sends the house when trailing by a score in the final minutes, it should exploit that aggressiveness through the air.
Beyond that general disclaimer, one key caveat specific to this project is the unfortunate necessity of going back to 2001. We all understand how much better the NFL collectively is at throwing the ball than it was 15, or even five, years ago. (Not to mention that older years have more NFLFastR data entry errors.) I originally wrote this project to be since 2011 rather than 2001, but ultimately went with the option that would give more substantial sample sizes. We must keep in our minds that the “average team” in the eyes of this project’s data is worse at passing than the true “average team” in today’s NFL.
Another important trait to point out is the unfortunate necessity of classifying every play as either a run or pass. Needless to say, not every play call is that black-and-white, particularly with the explosion of RPOs in recent seasons. It’s not fair to label every play as a pass or run as if the categories are fully binary, but we do the best we can with the information supplied to us.
As for the methodology, here’s an explanation on what “adjusted time left” entails beyond the summary of “clock adjusted to include timeouts.” For situations where the losing team was on defense, the formula was simple: real clock, plus 40 seconds for each timeout the defensive team had left. A defense having 1:00 left with one timeout is more or less equivalent to 1:40 left and no timeouts, because the offense will burn all 40 seconds of the play clock when it can. When the losing team was on offense, it was a more complicated formula to approximate how many extra offensive plays a timeout might create. For example, having 0:08 left with one timeout is comparable to having 0:15 with no timeouts, in that the offense almost certainly has at least two but no more than three plays left. As such, in my formula, a trailing offense having 0:08 left with one timeout leads to 0:15 of “adjusted time” remaining.
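The defensive half of that formula is simple enough to write out directly. Below is a minimal sketch of it, with names of my own choosing. The offensive version isn't reproduced, since its exact form isn't given above.

```python
PLAY_CLOCK_SECONDS = 40  # seconds the leading offense can drain per snap


def adjusted_time_left_on_defense(clock_seconds, timeouts_left):
    """Real clock plus 40 seconds for each timeout the trailing defense still has."""
    return clock_seconds + PLAY_CLOCK_SECONDS * timeouts_left


# 1:00 left with one timeout is treated like 1:40 left with none.
print(adjusted_time_left_on_defense(60, 1))
```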
Some readers may be wondering why expected points added (EPA) was not featured. This is because EPA is based on when the next scoring play in a game will be, rather than a team’s expected point differential for the full remainder of a game. In almost any football situation, this distinction doesn’t matter much: a team that increases its chances of getting the game’s next scoring play is almost always also increasing its chances of winning the game. But in this project, a clock-killing run for a short gain is likely to increase your chances of winning despite decreasing your own chances of scoring again. EPA is a strong metric when evaluating situations where the offense cares about scoring, which is nearly always the case. But when draining clock is a bigger priority than getting points, the stat is relatively useless.
Cole Jacobson is an Editorial Researcher at the NFL Media office in Los Angeles. He played varsity sprint football as a defensive lineman at the University of Pennsylvania, where he was a 2019 graduate as a mathematics major and statistics minor. With any questions, comments, or ideas, he can be contacted via email at email@example.com and @ColeJacobson32 on Twitter.