– The K Zone –
How to Win at Baseball, by Ian Joffe
April 17, 2018
With limited payrolls and a finite number of high draft picks, teams are constantly forced to choose, out of dozens of options, how to build their rosters. Rosters can focus on hitting or pitching. They can look for power or on-base skills. They can make a core of speed and defense. A team might even try to build around leadership and personality traits. A roster with any kind of emphasis, or even a general well-roundedness, has the potential to be effective, but I want to figure out which kinds of teams are most effective. To do that, I turned to my FanGraphs spreadsheets and Python editor.
For data, I scraped information on all 480 team seasons from 2002-2017 (30 teams over 16 years, going back to 2002 because that’s when the pitching stats that I wanted became available). As the first step in seeing which skills are most effective to build around, I constructed a set of scatter plots that plot each statistical category against team wins. The categories I checked cover overall hitting (wRC+), on-base ability (OBP+), power (ISO+), speed (SB+), and two pitching metrics (xFIP+ and SIERA+), all of which have been normalized so that 100 is league average. In retrospect, I should have included at least one defensive statistic, but I neglected to because, given my process, it would have taken a long time to include that data, and now it’s too late. Here are the scatter plots for each stat, plus their Pearson correlation coefficients:
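For anyone who wants to reproduce this step, here is a minimal sketch of how a Pearson correlation coefficient between one stat and team wins can be computed. The function name `pearson_r` and the five sample values are assumptions made up for illustration; they are not the data or code behind the plots.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Invented illustrative values, NOT the article's data.
wrc_plus = [88, 95, 100, 104, 112]
wins = [68, 77, 81, 86, 95]

r = pearson_r(wrc_plus, wins)
print(f"wRC+ vs. wins: r = {r:.3f}")
```

A value near 1 or -1 means the points fall close to a straight line, while a value near 0 means the stat tells you almost nothing about wins.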
As we could have predicted, teams with good stats tended to win more games. The differences in P-Values are only slight, and because baseball wins are not highly controlled experiments, those differences don’t tell us much; everything is in the same ballpark. That is, every stat except one:
It turns out that steals had essentially no correlation with wins. In fact, a set of 480 randomly dispersed points might have correlated even better. It’s possible that teams only run more because they have less power, but managers tend to keep the same strategy even when they move teams, so I would instead just say that, in general, speed is not a key to winning at baseball. Sure, a steal now and then helps if there’s a high likelihood of reaching the base, but building a team around speed and hoping to win is a poor strategy, and historically it has not worked.
To create a more telling story about which teams succeed and which teams fail, I looked at how teams that ended up in certain tiers were built. I defined a “playoff team” as a roster in the top 30%, a “Championship Series team” as one in the top 12%, and a “World Series champion” team as one in the top 3% (note that this has nothing to do with how the playoffs actually went, because the playoffs are essentially random). I then applied a label to teams based on whether they emphasized hitting or pitching by subtracting xFIP+ from wRC+. A team with a difference of 20+ has a “heavy hitting emphasis,” a team with a 10-20 differential has “some hitting emphasis,” a team with a value between 10 and -10 has “no significant emphasis,” a roster between -10 and -20 has “some pitching emphasis,” and finally a team with a difference under -20 is labeled with a “heavy pitching emphasis.” Here is the overall distribution of teams by emphasis:
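The labeling rules above can be sketched in a few lines of Python. Several details here are assumptions rather than anything stated in the method: teams are ranked by wins to decide the tiers, ties in wins are broken arbitrarily, and a differential landing exactly on 10, 20, -10, or -20 is assigned as shown in the comments. The function names are hypothetical.

```python
def emphasis_label(wrc_plus, xfip_plus):
    """Label a team by wRC+ minus xFIP+ using the cutoffs described above.

    Handling of values exactly on a cutoff is an assumption.
    """
    diff = wrc_plus - xfip_plus
    if diff >= 20:
        return "heavy hitting emphasis"
    if diff >= 10:
        return "some hitting emphasis"
    if diff > -10:
        return "no significant emphasis"
    if diff >= -20:
        return "some pitching emphasis"
    return "heavy pitching emphasis"

# (tier name, percent of teams that qualify), strictest tier first.
TIERS = [
    ("World Series champion", 3),
    ("Championship Series team", 12),
    ("playoff team", 30),
]

def tier_labels(wins):
    """Return the best tier each team reaches, or None if it reaches none.

    Assumes tiers are decided by ranking teams on wins.
    """
    n = len(wins)
    order = sorted(range(n), key=lambda i: wins[i], reverse=True)
    labels = [None] * n
    for rank, i in enumerate(order):
        for name, pct in TIERS:
            # Integer comparison avoids floating-point edge cases at the cutoff.
            if (rank + 1) * 100 <= pct * n:
                labels[i] = name
                break
    return labels

print(emphasis_label(118, 95))  # differential of +23
print(emphasis_label(97, 109))  # differential of -12
```

Because every "World Series champion" team is also inside the top 12% and top 30%, counting a whole tier means counting each team in every tier at or below the one returned here.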
As you can see, and could probably have predicted, most teams have no emphasis. More important, however, is that many more teams have some pitching emphasis than have a hitting emphasis. Keeping that in mind, let’s look at the distribution within each tier:
While the strength of the balanced team largely holds, we see an immediate dropoff in the number of teams that emphasize pitching, strongly or at all, and the number of teams that emphasize hitting is starting to grow.
As we move to the top 12% of teams, no rosters that emphasized pitching remain. And, nearly half of the teams emphasize batting.
And finally, as we reach the few elite teams, the vast majority have a hitting emphasis. Out of the 10 teams total that showed a heavy batting emphasis, all of them were playoff caliber and half of them were champion caliber. While teams with a hitting emphasis made up only 9% of total rosters, they comprise 42% of CS teams and 85% of championship teams. Meanwhile, not a single team that emphasized pitching made it to the top 12%, and despite being 31% of total teams, those that focused on pitching made up only 1% of the playoff teams overall.

The lesson here seems clear: build around hitting if you want success. When given the choice between two equally talented players in the free agent pool, or even more importantly the June draft, choose the hitter. There could be a few reasons for this. One reasonable theory is that the value of defense detracts from the value of the pitcher, to the point where, if you take an extreme stance, pitchers become replaceable as long as the team retains a strong defensive cast. It’s also arguable that it’s easier to find good pitchers and that more teams have been able to build pitching depth, as seen in the overall distribution, so it would be harder to use pitching as a competitive advantage. Or, maybe because so many pitchers are used in today’s game, the value of each becomes diluted, and teams can gain a competitive advantage only by improving their hitting. To be clear, I’m not saying that pitching doesn’t help a team; we saw from the correlation plots that it certainly does. However, given limited resources, ignoring hitting in pursuit of strong pitching, or even looking at the two in equal light, is not a recipe for success.
Now, let’s take a look at another potential difference in strategy: power vs. on-base skills. This one is a little harder to quantify because, while hitting and pitching make up almost all of the factors in a baseball game (minus defense), power and contact exist in a far less controlled experiment. But it’s worth a look anyway. I labeled each team’s emphasis on power vs. on-base skills much as I did with hitting and pitching (with the same +20, -10, etc. differentials), except I used ISO+ minus OBP+. Here is the initial distribution among all teams:
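To make the comparison between groups concrete, here is a small sketch that buckets teams by ISO+ minus OBP+ and reports what percent of a group falls in each bucket. The helper names, the tie handling at the cutoffs, and the (ISO+, OBP+) pairs are all invented for illustration and are not real team data.

```python
from collections import Counter

def emphasis(diff, high, low):
    """Bucket a differential into five labels; tie handling is an assumption."""
    if diff >= 20:
        return f"heavy {high} emphasis"
    if diff >= 10:
        return f"some {high} emphasis"
    if diff > -10:
        return "no significant emphasis"
    if diff >= -20:
        return f"some {low} emphasis"
    return f"heavy {low} emphasis"

def shares(teams):
    """Percent of teams in each power/on-base bucket, given (ISO+, OBP+) pairs."""
    if not teams:
        return {}
    counts = Counter(emphasis(iso - obp, "power", "on-base") for iso, obp in teams)
    return {label: 100 * n / len(teams) for label, n in counts.items()}

# Invented (ISO+, OBP+) pairs, NOT real team data.
all_teams = [(125, 100), (112, 101), (100, 99), (98, 102),
             (95, 108), (90, 104), (101, 100), (85, 110)]
playoff_teams = [(125, 100), (112, 101), (100, 99), (95, 108)]

for name, group in [("all teams", all_teams), ("playoff teams", playoff_teams)]:
    print(name)
    for label, pct in sorted(shares(group).items()):
        print(f"  {label}: {pct:.1f}%")
```

Comparing a bucket's share among all teams with its share among playoff teams shows whether that build is over- or under-represented among winners, which is the comparison the charts make visually.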
It’s pretty similar to the full distribution among hitting and pitching, with a heavy spike in the middle. Here’s the distribution among playoff teams:
It looks like power is winning out a little, although don’t read too much into the small sample of teams with heavy on-base emphasis. Still, the distribution doesn’t change too much.
As we continue through the postseason, each category keeps shrinking at a normal rate, losing about the same percentage of its teams as the field does overall.
And the trend continues, with “some power emphasis” remaining at about 20% of teams throughout the playoffs, and the categories that started with few teams being eliminated entirely. Unlike with pitching vs. hitting, there is no clear story here. I wouldn’t even say that a balance is necessarily the best option, because the overall distribution started out so heavily weighted toward balance. So, teams can go either way. As long as the focus is on hitting, they can win through a power-heavy build, a contact-heavy build, or a balance.
There was one last thing I wanted to check out: a comparison of playoff teams to league trends. It’s possible that while power and contact have been equally valuable over the whole period since 2002, one has been more valuable in certain mini-eras because of a league trend. Perhaps the winning team is the one that’s ahead of the trend and really exaggerates it. Or, the winning teams could be the ones who zig while everyone else zags, finding bargains along the way. So, over the 16-year period, I graphed the league trend in ISO against the median ISO+ of a playoff team, and applied a polynomial regression:
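For reference, a polynomial regression of this kind can be sketched with a plain least-squares fit. The degree of the polynomial is not stated above, so the quadratic used here is an assumption, as are the function names and the sample values, which are invented. Seasons are counted from 2002 so that powers of x stay small.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    if len(xs) != len(ys) or len(set(xs)) <= degree:
        raise ValueError("need more distinct x values than the degree")
    m = degree + 1
    # Normal equations (V^T V) c = V^T y, where V[i][j] = xs[i] ** j.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def evaluate(coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

# Invented values for illustration only, NOT the article's data.
seasons = list(range(16))  # 0 = 2002, ..., 15 = 2017
playoff_median_iso_plus = [104, 106, 103, 105, 107, 104, 102, 103,
                           101, 102, 104, 103, 105, 106, 108, 107]

curve = polyfit(seasons, playoff_median_iso_plus, 2)  # degree 2 is an assumption
for t in (0, 8, 15):
    print(2002 + t, round(evaluate(curve, t), 1))
```

Fitting the league ISO series the same way and plotting both curves on a shared time axis gives the kind of trend comparison described here. Solving the normal equations directly is fine for low degrees like this one, though a library routine would be more robust for higher degrees.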
There is no clear pattern between the power emphasis of winning teams and the league trend. If anything, the playoff teams look to be behind the curve (imagine shifting the green line over about four years to the right). This further supports the original point, that teams can build around power, contact, or a mix, and will still have the same ability to win, no matter what the rest of the league is doing.
While these findings certainly apply to all methods of roster-building (such as free agency, trades, and Rule 5), they seem most important during the amateur draft, given the wide diversity of players available and the fact that there is usually little clarity on the future potential/reality of drafted players. That especially goes for systems that already lean hurler-heavy. Teams should seriously consider taking batters over pitchers, even if the pitchers appear to have slightly more raw ability. Because, simply, it works.