Introducing new NFL run-blocking and run-stopping stats: How our metrics work

Two years ago, ESPN analytics created a revolutionary new way to measure the pass-block and pass-rush performance of individual NFL players using NFL Next Gen Stats player tracking data. This year, we present two new companion metrics for run-blocking and run-stopping: run block win rate (RBWR) and run stop win rate (RSWR).

Our pass-blocking metrics were a big hit, and not just with viewers and readers. More than a few general managers and front-office staff reached out to the ESPN analytics team not long after they launched, asking for full details on every player. We got feedback from some teams on why some of their players were ranked where they were. So why did it take two years to do the same thing for the running game?

Part of the answer is that we worked on some other great projects, like a pass coverage classification model, and a route and route concept classification model.

The other part of the answer is that run blocking and run defense are far more complicated than pass blocking. After studying the running game for months, I can understand its allure to coaches. It’s exceptionally strategic on multiple levels; the main broadcast view of the game does the running game an injustice. To many fans, run plays seem like a random scrum of blockers and defenders, culminating in the proverbial cloud of dust, but the chess match between coordinators on run plays is as interesting as anything else in sports.

To get this project right required months of study, watching game film and talking with experts. Hopefully it was worth the wait:

Jump to:
Details of how the model works
How do we know it’s any good?
What are the model’s flaws?

OK, what exactly are run block win rate and run stop win rate?

RBWR and RSWR tell us the proportion of plays in which a player “won” his block on designed running plays. The system uses angles, distances and speeds throughout the execution of a play to tell who is blocking whom and to determine whether the defender was able to meaningfully beat the block (or blocks in the case of a double-team).

What can these stats tell us?

Like their companion pass-block metrics, RBWR and RSWR provide a measure of blocker and defender performance in the running game. Now, for the first time, we have a complete set of objective, individual stats for players whose performances often go overlooked. Additionally, these metrics are designed to be as independent as possible from the skill and performance of the runner, so we can now better assess running back performance.

How do they work?

The running game is all about position and angles, and so are these metrics. First, each block is identified — that’s the easy part. Next, our system determines whether the defender was able to defeat the block. It does this by using a large set of rules based on relative distance to the runner, relative velocity to the runner and many other more complicated measures. A defender doesn’t have to make the tackle to win his block. He can penetrate the backfield to cause a disruption, contain the runner behind the line of scrimmage, or squarely fill his assigned gap to earn a win. If a blocker allows his defender to win, he is debited with a loss.

There are several other wrinkles. One of them is that an unblocked defender can get a win if he makes the tackle within 3 yards of the line of scrimmage. Another is that if a blocker fails to land a block, which happens when a blocker can’t get to a linebacker quickly enough, he gets what we call a “whiff” loss.
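To make the win-rate arithmetic concrete, here is a minimal sketch in Python. It assumes each of a blocker's snaps has already been labeled with an outcome; the labels, the function name and the sample data are hypothetical and are not part of our actual system. The only rule taken from the description above is that a whiff counts as a loss.

```python
from collections import Counter

# Hypothetical outcome labels for one blocker's designed-run snaps.
# A "whiff" (failing to land a block) is counted as a loss.
LOSS_OUTCOMES = {"loss", "whiff"}


def run_block_win_rate(outcomes):
    """Return the share of blocks won, or None if there are no graded blocks."""
    counts = Counter(outcomes)
    wins = counts["win"]
    losses = sum(counts[label] for label in LOSS_OUTCOMES)
    total = wins + losses
    return wins / total if total else None


# Made-up sample: 7 wins, 2 ordinary losses and 1 whiff.
sample = ["win"] * 7 + ["loss"] * 2 + ["whiff"]
print(run_block_win_rate(sample))  # 0.7
```

The same tally works for RSWR if the outcomes are recorded from the defender's side.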

How do these compare to player grading by a scout or other observers?

The running game can be quite chaotic, so it’s often difficult for anyone (or anything) to fully make sense of everything that happens on a play. As the creator of these metrics, I’d freely admit that an expert watching the all-22 video would probably do a better job grading run blocking and stopping performance than these metrics. There are two ways, however, in which these metrics can be better:

  • They are mercilessly objective. Show five talent evaluators one play, and you might get five different evaluations back. Our metrics don’t care about form or technique — only outcomes. And our system doesn’t care if it’s watching a star left tackle or a backup right tackle. There’s no subjectivity involved, and it isn’t influenced by reputation, salary or hype.

  • Our system is lightning fast. Evaluating a player on video isn’t easy, and it will take at least several views to accurately grade just one player. With many players to grade on every snap, it can take quite a while to grade a single play. I know — I’ve watched nearly a thousand run plays many times over in the course of this project. I’m told that because of the time involved, it’s common for evaluators to take notice of the highly remarkable players on each play, such as a defender penetrating the backfield for a tackle for loss, or a blocker who opened a large hole. Otherwise, the players who don’t instantly stand out merely receive a neutral grade. Because our new metrics don’t suffer from the pressure of time, they can evaluate every player with equal attention on every snap.


The nitty-gritty details of RBWR and RSWR

What can these stats do?

Similar to pass rush win rate (PRWR) and pass block win rate (PBWR), our RSWR and RBWR metrics provide objective statistics at the core of the sport for players at positions that have traditionally lacked meaningful metrics. Blocking and beating blocks are fundamental to the sport, so to understand what makes a play work or what makes a team good, it’s critically important to understand blocking performance.

Based on the raw player-tracking data from the NFL’s Next Gen Stats platform, these metrics measure block execution throughout the course of every run play. Note that quarterback scrambles are excluded, but designed runs are always included.

As a preview, here is the 2019 leaderboard for RBWR among offensive tackles who played a minimum of 250 run snaps:

The league-wide average for RBWR is about 70%. The next group of 10 includes well-regarded names, such as Mitchell Schwartz, Bryan Bulaga, Taylor Lewan and Tyron Smith.

And here is the 2019 leaderboard for RSWR among edge defenders who played a minimum of 200 run snaps. The league-wide average for edge players is about 30%.

The next group of 10 edge defenders features Melvin Ingram, Trey Flowers, Cameron Jordan and Nick Bosa. J.J. Watt would be in the top 10 (27.7% RSWR), except that he fell under the qualifying cutoff of 200 run snaps.

We can average the win rates of all the players on each team to assess team-level run blocking and stopping execution. We include only the box players, however: on offense, players lined up as linemen, tight ends, fullbacks and running backs, and on defense, players lined up as interior linemen, edge defenders, linebackers and “box” safeties.
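The team-level roll-up can be sketched in a few lines of Python. This is only an illustration: the team codes, position labels and win rates below are made up, and a simple unweighted mean is an assumption, since the exact weighting isn't spelled out here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical position-group labels for offensive box players.
BOX_POSITIONS = {"OL", "TE", "FB", "RB"}

# Made-up player rows; "AAA" and "BBB" are placeholder teams.
players = [
    {"team": "AAA", "position": "OL", "rbwr": 0.75},
    {"team": "AAA", "position": "TE", "rbwr": 0.65},
    {"team": "AAA", "position": "WR", "rbwr": 0.50},  # not a box player, so ignored
    {"team": "BBB", "position": "OL", "rbwr": 0.70},
]


def team_win_rates(rows):
    """Average the win rates of box players only, grouped by team."""
    by_team = defaultdict(list)
    for row in rows:
        if row["position"] in BOX_POSITIONS:
            by_team[row["team"]].append(row["rbwr"])
    return {team: mean(rates) for team, rates in by_team.items()}


print(team_win_rates(players))
```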

The Jets led the league in RSWR last season. Accordingly, they were second in the league in both yards per carry allowed and expected points added on run plays, which should boost confidence in the metric. The Jets were followed by the Seahawks, Cardinals, Cowboys, Broncos and Patriots in RSWR.

It should be no surprise that the Ravens topped the league in RBWR, helping them lead all teams in both total yards and efficiency. The fact that the Baltimore blockers were so strong across the board in nearly every position, including TE and FB, raises a concern, which I’ll address below. Behind the Ravens were the Packers, Panthers, Saints, Lions and Chiefs.

How the model works

The first step of the process is simply identifying each block. Similar to our pass-block metrics, these metrics use a combination of distance and orientation — the closer and squarer a blocker is to a defender, the more likely he is blocking him. But the run metrics use slightly different parameters than the pass metrics due to the more chaotic nature of run blocking.
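As a rough illustration of the distance-and-orientation idea, here is a minimal Python sketch. The thresholds, function names and heading convention (degrees counterclockwise from the positive x-axis) are assumptions made for the example; the actual parameters of our system are not shown here.

```python
import math

# Illustrative thresholds only, not the real parameters.
MAX_BLOCK_DISTANCE_YARDS = 1.5
MAX_FACING_ANGLE_DEGREES = 45.0


def facing_angle(blocker_xy, blocker_heading_deg, defender_xy):
    """Angle in degrees between the blocker's heading and the line to the defender."""
    dx = defender_xy[0] - blocker_xy[0]
    dy = defender_xy[1] - blocker_xy[1]
    bearing = math.degrees(math.atan2(dy, dx))
    diff = abs(bearing - blocker_heading_deg) % 360.0
    return min(diff, 360.0 - diff)  # fold into the range 0 to 180


def is_likely_blocking(blocker_xy, blocker_heading_deg, defender_xy):
    """True when the blocker is both close to and square to the defender."""
    distance = math.dist(blocker_xy, defender_xy)
    angle = facing_angle(blocker_xy, blocker_heading_deg, defender_xy)
    return distance <= MAX_BLOCK_DISTANCE_YARDS and angle <= MAX_FACING_ANGLE_DEGREES


# A blocker at the origin facing along the x-axis.
print(is_likely_blocking((0.0, 0.0), 0.0, (1.0, 0.0)))  # close and square
print(is_likely_blocking((0.0, 0.0), 0.0, (0.0, 1.0)))  # close but off to the side
```

A hard yes-or-no cutoff is the simplest possible choice; a real system could just as easily turn the same two inputs into a likelihood.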

Determining whether a defender wins a block relies on a long list of geometric and kinematic rules (think basic physics) designed to identify when a defender becomes a threat to a runner. This is often when he penetrates the line, fills a gap, or successfully contains an outside runner. The central idea is to capture the goals of run defense: the “fit/fill” (filling a gap), the “spill” (penetrating the line) and the “force” (containing the outside run). If a defender does any of those three things, subject to another set of conditions, he gets credited with a win on that play. Additionally, if a defender manages to make a tackle within 3 yards of the line of scrimmage, despite still being geometrically blocked, he gets a win.

The additional conditions that must be met to get a win include things like the defender having a closing velocity to the runner (so that he hasn’t been shoved away from the runner’s path) and the runner not already being downfield of the block. The defender can’t be “pancaked” off the line (pushed back into the second-level defenders), which is the worst thing a front-line defender can allow. The defender can’t be immobilized or knocked down, and he must beat his block within a reasonable time from the snap. The geometry must allow for the defender to reasonably intercept the runner by a set number of yards downfield, and he can’t be too far away laterally from the runner. There are several more rules, but I’d have to explain dot products, and you don’t really want that. Ultimately, the rules were tweaked and tested over thousands of run plays until a satisfactory combination captured the essence of performance in the run game.
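For readers who do want a dot product, the closing-velocity condition is a good place to see one. The sketch below is a generic physics calculation in Python, not our actual rule: it projects the defender's velocity relative to the runner onto the line between them. The function name and inputs are hypothetical.

```python
import math


def closing_speed(defender_xy, defender_vel, runner_xy, runner_vel):
    """Rate at which the defender-runner gap shrinks, in yards per second.

    Positive means the defender is closing on the runner; negative means
    the gap is growing (for example, the defender is being driven away).
    """
    sep = (runner_xy[0] - defender_xy[0], runner_xy[1] - defender_xy[1])
    gap = math.hypot(*sep)
    if gap == 0:
        return 0.0  # players at the same point; direction is undefined
    unit = (sep[0] / gap, sep[1] / gap)
    rel_vel = (defender_vel[0] - runner_vel[0], defender_vel[1] - runner_vel[1])
    # Dot product of the relative velocity with the unit vector toward the runner.
    return rel_vel[0] * unit[0] + rel_vel[1] * unit[1]


# Defender moving toward a stationary runner 4 yards away.
print(closing_speed((0.0, 0.0), (3.0, 0.0), (4.0, 0.0), (0.0, 0.0)))  # 3.0
# Same defender being shoved backward instead.
print(closing_speed((0.0, 0.0), (-2.0, 0.0), (4.0, 0.0), (0.0, 0.0)))  # -2.0
```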

How do we know it’s any good?

Our model isn’t like a traditional analytics model, where there is a set of data with the right answers, which would allow us to train it and then validate it by determining how many blocks the model got correct or incorrect. One way to test the model’s performance would be to simply watch a lot of plays and be satisfied it’s working as intended, which is what I did throughout its development. But we have to be more rigorous than that.

If these metrics work well (if they are truly capturing meaningful information about blocking execution), they should be able to make predictions about play outcomes. For example, if there was one run stop win by a defender on a play, we would expect that, on average, the gain would be smaller than if there were zero run stop wins. And if there were two run stop wins on a play, the average gain would be even smaller, and so on.

It turns out that average gain on running plays according to the number of run stop wins by defenders decreases exactly as expected. Here’s a table that lists the average gain for each number of run stop wins on a play. Expected points added (EPA) is listed as well, which accounts for the complexities of down, distance and yard line.

This result is cheating just a bit, because our system includes the criterion that a tackle before 3 yards of gain qualifies as a run stop win. Still, if we remove that criterion from the system, the averages follow the exact same pattern: a steady decrease in average gain as the number of run stop wins increases, from 5.4 yards for no run stop wins down to 1.3 yards for six run stop wins. This is a simple but powerful indication that the metrics are capturing meaningful aspects of player performance.
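The check described above amounts to grouping plays by the number of run stop wins and averaging the gain in each group. Here is a minimal Python sketch of that grouping; the play data is invented purely to show the mechanics and does not reproduce any of our results.

```python
from collections import defaultdict
from statistics import mean

# Made-up plays as (run stop wins on the play, yards gained).
plays = [(0, 6), (0, 5), (1, 4), (1, 3), (2, 2), (2, 1)]


def average_gain_by_wins(rows):
    """Map each run-stop-win count to the average gain on those plays."""
    gains = defaultdict(list)
    for wins, yards in rows:
        gains[wins].append(yards)
    return {wins: mean(vals) for wins, vals in sorted(gains.items())}


def is_steadily_decreasing(averages):
    """True if average gain drops every time the win count goes up."""
    values = [averages[k] for k in sorted(averages)]
    return all(a > b for a, b in zip(values, values[1:]))


averages = average_gain_by_wins(plays)
print(averages)  # {0: 5.5, 1: 3.5, 2: 1.5}
print(is_steadily_decreasing(averages))  # True
```

The same grouping could be run on EPA instead of yards by swapping the second value in each row.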

Further, if our metrics are truly measuring player skill, they should be somewhat consistent from year to year. To test the reliability of a metric, statisticians often measure how well it correlates with itself over time. For example, if you measure your height on Monday and again on Friday only to discover you’ve lost five inches, you can be sure there’s something wrong with the ruler. But if the ruler gives about the same measurement each time, you can be confident the ruler is doing its job.

A correlation coefficient is a number between minus-1 and plus-1 that measures the relationship between metrics, where minus-1 would indicate a perfectly opposite relationship, 0 would indicate no relationship and plus-1 would be a perfectly direct relationship. For reference, in baseball the correlation between batters’ on-base percentage in one season and the next is about plus-0.6. This correlation is what tells baseball analysts that getting on base is mostly attributable to skill rather than luck.
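For anyone who wants to see the calculation, here is a minimal Python sketch of the standard Pearson correlation coefficient applied to two seasons of win rates. The function name and the numbers are made up for illustration; only the formula itself is standard.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up win rates for the same five players in back-to-back seasons.
season_one = [0.62, 0.70, 0.74, 0.68, 0.80]
season_two = [0.65, 0.69, 0.71, 0.72, 0.78]
print(round(pearson_r(season_one, season_two), 2))
```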

Happily, players’ year-to-year RBWR and RSWR are remarkably consistent. Here are the year-to-year correlations for blockers’ RBWR from 2018 to ’19:

  • Tackles: +0.71

  • Guards: +0.51

  • Centers: +0.47

  • Tight ends: +0.57

And here are the year-to-year correlations for defenders in RSWR:

These correlations are strong evidence that our metrics are capturing player skill to a large degree. And together with the steady decrease in average gain as the number of run stop wins increases, we can be confident that their results are meaningful.

It can’t be perfect, though. What are the flaws?

Just like any observer who doesn’t know the play call, our system knows neither the blocker nor defender assignments. Fortunately, at the professional level missed assignments in the running game are relatively rare, so we can still be confident that these metrics, at a minimum, measure execution.

Ideally these metrics would be completely independent of who the runner is and what the scheme is. And although they’re designed to be that way, it remains difficult to fully separate out the influence of the runner. Remember when I wrote earlier that the Ravens’ broad success across nearly every position in RBWR raised a concern? That’s one example.

The Ravens’ option scheme, along with the presence of Lamar Jackson as a running threat, makes the blockers’ jobs easier. Option runs remove a defender from the play (the one being optioned against), which effectively adds one more blocker relative to the number of defenders. This provides more than just a numerical advantage: each block also becomes slightly easier on its own, because the numbers advantage means the play design can have better angles for each block. The Ravens’ extensive use of motion at the snap also helps blocking angles.

Thankfully, the Ravens provided a natural experiment to test how much scheme could influence our metrics. Nearly halfway through 2018, Baltimore replaced Joe Flacco with Jackson. For the season as a whole, Baltimore ranked first in team RBWR, but when Flacco was the starter, it still ranked seventh. So the Ravens’ blockers are probably benefiting from their scheme, and their RBWR numbers may be inflated, but not by much. And the Ravens’ offense under Jackson is likely an extreme case of when runner and scheme change the game.

How do the metrics account for double-teams and other considerations?

Not all blocks are equal. But for now, we want to keep things simple, so we aren’t going to build in an adjustment for block type or difficulty. Instead we’ll report the win rate, which is a simple percentage, but also report double-team rates at the same time. In the running game, double-team blocks are one of the harder things for an offensive lineman to do. They are usually what are known as combo blocks, in which two adjacent blockers will both hit an interior defender before one of them slides up to the second level to block a linebacker. So it’s not immediately clear how and when to make adjustments.

As with most statistics, our metrics require a reasonable sample size before conclusions can be drawn. Within that sample, many of the factors not accounted for will tend to even out, although there’s no guarantee.

The good news is we can classify almost all blocks — down blocks, reach blocks, traps, whams and so on. And these categories will help us understand how difficult they tend to be.

Where does this all lead?

The metrics by themselves can tell us about how well players perform in the trenches, but our overall system can tell us much more. Our system will soon allow us to classify every run play, so we can tell you how often each team runs inside zone, outside zone, power, counter or option. We can also marry these metrics and classifications to either traditional stats, like yards per carry, or advanced stats like EPA to gain even more insight.

Look for our blocking metrics on our air, in stories on ESPN.com and in a weekly leaderboard post that will launch after the Week 1 games.