Can machine learning make sense of the NFL’s big data?

When it comes to using data to determine how to build a team or manage a game, the National Football League appears years behind its professional sports brethren such as Major League Baseball and the National Basketball Association. But perhaps the increasing popularity of machine learning can change that by helping NFL teams make more sense of their very complex datasets.

Delving deep into the world of computer science might sound like overkill, but professional football is big business in America, and an analytic edge off the field might be just as important as athletic or strategic edges on the field. Heck, it might help create them.

The cult of gut instinct and the human eye

The New York Times highlighted the current state of statistical reliance among NFL teams in an article on Sunday. The NYT’s Judy Battista reports that teams are finally beginning to hire statisticians and take statistical analysis seriously in limited areas, but there’s always a disclaimer. Football is such a variable-rich and complex game, her sources claim, that the human eye and human intuition will always be best at assessing certain things.

As one anonymous source put it when discussing the difficulty of evaluating players before the NFL draft: “At the end of the day, the tape is going to be our first choice. They have to look good on film.”

His point and those of others with whom Battista spoke are fair. For example:

  • Offensive line play can be difficult to gauge because the line is a five-person unit designed to work well together, not as a collection of individuals.
  • How do you statistically assess a middle linebacker who doesn’t make a lot of tackles but who’s always in the mix and disrupting the offense?
  • When it comes to calling plays, there might be limited data on any given situation (e.g., a particular down and distance to go from a particular spot on the field), and the outcomes might be very much influenced by the players on the field in each of those prior situations.

This is a lot different from a sport like baseball, NFL analytics expert Tony Khan told Battista, where it’s much easier to break down statistics to an individual level and make Moneyball-like decisions about given players and circumstances.

Maybe that’s why there’s an innate anti-statistic bias among many NFL executives, as well as those in other leagues. Dashiell Bennett at The Atlantic uncovered a lot of skepticism about statistical analysis in major sports leagues, even at the MIT Sloan Sports Analytics Conference in March. “[T]ime again,” Bennett noted, “… when a reasoned and ably researched idea was presented, we heard some variation from those in the crowd of ‘That’s interesting, but…’.”

How machines can inform human knowledge

However, companies of all sorts are increasingly using machine learning algorithms (and related techniques) to detect patterns and correlations among complex datasets, and there’s a growing number of software products hitting the market that either incorporate machine learning or are built entirely upon it. It stands to reason that NFL teams might consider giving these techniques and technologies a chance, too.

Machine learning seems ideally suited to analyzing football data because of the sheer number of variables that teams have to consider and the complexity of how they interact. Rather than requiring human beings to hypothesize and test, for example, how a mix of player statistics, combine performance, situational data and weather affects a given outcome, the algorithms can step up and find out which variables actually matter most.
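To make that idea concrete, here is a minimal Python sketch of letting the data rank candidate variables. Everything in it is an assumption for illustration: the plays are synthetic, the variable names (yards_to_go, wind_mph, forty_time) are hypothetical, and the ranking uses plain correlation as a simple stand-in for a trained machine learning model, which would also capture interactions among variables.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def rank_variables(rows, outcome):
    """Rank every non-outcome column by |correlation| with the outcome."""
    names = [name for name in rows[0] if name != outcome]
    ys = [row[outcome] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], ys)) for name in names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def make_synthetic_plays(n=2000, seed=0):
    """Build fake fourth-down plays (hypothetical data, not real NFL stats)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        yards_to_go = rng.randint(1, 10)
        wind_mph = rng.uniform(0, 30)
        forty_time = rng.gauss(4.6, 0.15)
        # Assumption: in this toy world, only distance drives conversions.
        p_convert = 1.0 - 0.08 * yards_to_go
        converted = 1.0 if rng.random() < p_convert else 0.0
        rows.append(
            {
                "yards_to_go": yards_to_go,
                "wind_mph": wind_mph,
                "forty_time": forty_time,
                "converted": converted,
            }
        )
    return rows


if __name__ == "__main__":
    for name, score in rank_variables(make_synthetic_plays(), "converted"):
        print(f"{name}: {score:.3f}")
```

Because the toy data is built so that only distance affects the outcome, yards_to_go should come out on top while the other two variables score near zero. Real football data would be far messier, which is where the human judgment discussed below comes in.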


And all that game-film analysis NFL teams like to do? Well, if teams were inclined to spend their hefty profits hiring data scientists and systems engineers, perhaps they could replicate Google’s results in video recognition in order to automate some portions of the film analysis that costs coaches untold hours. Instead of detecting cats or human bodies, perhaps algorithms could uncover patterns in gameday videos that humans either can’t see or don’t even think to look for.

(The Atlantic’s Bennett also highlighted a nascent motion-tracking technology called SportVU that analyzes player location on sports fields and the movement of the ball in order to help players figure out where they want to be in any given situation. It’s not machine learning, but it’s still very interesting.)

At this point, coaches, general managers and other football experts come into the picture to apply their own expertise to the information provided by their computers. Maybe they decide, for example, that a player’s salary doesn’t deserve much weight despite a strong correlation uncovered by the algorithms between how much the players on the field are paid and the percentage of fourth downs converted. Or perhaps they instruct the engineers to recalibrate the video-recognition models to ignore officials and referees when analyzing film.

It’s the same technique that companies such as ZestFinance use in order to make their machine learning efforts more accurate. As ZestFinance Co-Founder and CEO (and former Google CIO) Douglas Merrill explained to me recently, his company was able to significantly increase the accuracy of its credit-underwriting model by applying “human artistry” to add context or prune the results after its machine learning algorithms had analyzed some 70,000 variables.

Even if NFL teams don’t go crazy building computer science divisions and predictive models, machine learning could almost certainly provide some new information to which coaches and others can apply their considerable subject-area expertise. The source who espoused his reliance on game film to the NYT’s Battista also acknowledged, “Ideally, you want the objective and subjective to match up.” Good NFL teams have the latter figured out; machine learning could help them get a better handle on the former, too.



GigaOM