Monday, February 5, 2018

Baseball Prospectus' new metric has a Bartolo Colon problem

Last Tuesday, Baseball Prospectus rolled out three new metrics for evaluating pitcher performance - Power (PWR), Command (CMD) and Stamina (STM). I was particularly drawn to the PWR metric, which is described as a way of evaluating how well a pitcher fits the "power pitcher" archetype. It's an intriguing and novel approach to evaluating and classifying pitchers, and I think it's a great new lens for looking at them. But when I looked a little closer at the 2016 PWR scores, something jumped out at me.

[Figure: Baseball Prospectus PWR Leaders, SP, 2016]

Oh no. 

Bartolo. Colon. 

Oh no. Bartolo Colon is not a power pitcher. Bartolo Colon is the exact opposite of a power pitcher. Bartolo's peak fastball velocity by Baseball Prospectus's metrics ranked 72nd out of 84 pitchers with 150 IP in 2016. Bartolo does not blow anyone away with his 90 MPH fastball; he relies on pinpoint placement to generate whiffs and mixes in offspeed stuff to generate weak contact (indeed, Colon's 2016 ranked 7th in CMD). PWR is still an effective measurement - look at all of the other pitchers it (correctly) classifies as power pitchers. But there is no interpretation of the phrase under which Colon's 2016 is emblematic of a power pitcher - so what gives? Can PWR be adjusted to relieve it of its Bartolo Colon problem? As with a computer program, if I want to debug this, I have to know how PWR works. Fortunately, BP tells us how PWR is calculated in a fairly straightforward manner:
As of right now, our Power Score is comprised of these three identifiable parts: Fastball velocity (three parts), fastball percentage (two parts), and the velocity of all offspeed pitches (one part). There are some other factors that we considered when developing this metric—such as the tendency to work up in the zone, and to lean on fastballs in put-away counts—but the current version of this metric only includes the three main components discussed above.
While I don't have access to the exact numbers BP uses to calculate PWR, I rigged up a rough approximation using the PITCHf/x numbers available on FanGraphs by normalizing each of the above components and weighting them as described above. I plotted my values (xPWR) against BP's (PWR) and they look reasonable, so I'll use xPWR to mess around and see if I can resolve PWR's Bartolo Colon issue while maintaining its current level of accuracy for evaluating actual power pitchers.

[Figure: xPWR vs. PWR]
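For anyone who wants to follow along, here is a rough Python sketch of how a composite like xPWR could be assembled. It rests on two assumptions that BP does not confirm: that "normalizing" means converting each component to a z-score across the pitcher pool, and that the 3:2:1 "parts" act as linear weights. The function names and the three sample pitchers are hypothetical, made up purely for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def power_composite(fb_velo, fb_pct, offspeed_velo, weights=(3, 2, 1)):
    """Weighted average of the three standardized components, one score per pitcher."""
    columns = [zscores(fb_velo), zscores(fb_pct), zscores(offspeed_velo)]
    total_weight = sum(weights)
    return [
        sum(w * col[i] for w, col in zip(weights, columns)) / total_weight
        for i in range(len(fb_velo))
    ]


# Hypothetical pitchers (made-up numbers, not real 2016 data).
names = ["hard thrower", "average arm", "soft tosser"]
fb_velo = [97.0, 93.0, 90.0]        # average fastball velocity, mph
fb_pct = [62.0, 55.0, 58.0]         # fastball usage, percent
offspeed_velo = [88.0, 84.0, 82.0]  # average offspeed velocity, mph

scores = power_composite(fb_velo, fb_pct, offspeed_velo)
for name, score in zip(names, scores):
    print(f"{name}: {score:+.2f}")
```

The scores come out in standard-deviation units centered on zero; putting them on the same scale as PWR would take one more rescaling step, which this sketch leaves out.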

2016 Colon has an xPWR score of 54, not 59, and he's only 20th in xPWR, which doesn't seem so bad until you realize that Colon's xPWR puts him squarely between Jose Fernandez and Max Scherzer. Colon needs dramatic adjustment, and hopefully, the adjustments I make in terms of xPWR can be translated to PWR as well. The best way to fix a problem is to address the cause, so why is Colon registering an abnormally high PWR score? The main culprit is likely his fastball percentage (FB%). Here are the leaders in FB% from 2016:

MLB FB% Leaders (2016)

Name               FB%     PWR   xPWR
Bartolo Colon      89.5%   59    54
Aaron Sanchez      74.3%   58    64
J.A. Happ          73.5%   56    57
Robbie Ray         71.1%   63    62
Jimmy Nelson       71.0%   59    60
Jose Quintana      66.5%   46    50
Kevin Gausman      66.3%   59    63
Ian Kennedy        66.2%   51    52
Doug Fister        65.8%   36    37
Brandon Finnegan   65.6%   52    52

FB% is worth one-third of PWR (two of the six parts), and Bartolo is in a league of his own when it comes to FB%, which makes it the most likely culprit. There have only been two seasons in which a pitcher threw 2000+ pitches and posted an FB% above 89%, and both belong to Bartolo Colon - 2012 and 2016. Starters (and to a large extent, relievers) do not typically rely upon their fastballs so much, and since Colon is such an outlier, using normalized scores makes him stand out in a big way. The closest any starter came to Colon's crazy FB% values was Henderson Alvarez in 2014 (82.7%). As a result, Colon receives a (rather unfair) bonus in PWR for throwing so many fastballs, one that makes up for his lack of velocity.
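To get a feel for how far out Colon sits once the numbers are normalized, here is a quick Python sketch that standardizes the ten FB% values from the table above. One loud caveat: it normalizes only against those ten leaders, which is an assumption made for illustration. Normalizing against the full pitcher pool, as a real version of the metric presumably would, gives different numbers.

```python
from statistics import mean, pstdev

# FB% values for the ten 2016 leaders listed in the table above.
fb_pct = {
    "Bartolo Colon": 89.5,
    "Aaron Sanchez": 74.3,
    "J.A. Happ": 73.5,
    "Robbie Ray": 71.1,
    "Jimmy Nelson": 71.0,
    "Jose Quintana": 66.5,
    "Kevin Gausman": 66.3,
    "Ian Kennedy": 66.2,
    "Doug Fister": 65.8,
    "Brandon Finnegan": 65.6,
}

values = list(fb_pct.values())
mu = mean(values)
sigma = pstdev(values)

# z-score: how many standard deviations each pitcher sits from the group mean.
z = {name: (pct - mu) / sigma for name, pct in fb_pct.items()}

for name, score in sorted(z.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:18s} {score:+.2f}")
```

Even within a group made up entirely of fastball-heavy pitchers, Colon lands more than two standard deviations above the mean while nobody else clears one.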

Colon cheats the PWR metric by throwing pitches that are technically fastballs, and are classified as such, but aren't nearly as fast as a traditional fastball. The flaw in PWR is that it assumes that any pitch classified as a fastball is, well, fast - but this isn't the case for Bart, and so he presents an anomaly. Perhaps I can take away Bart's advantage by reducing the weight of FB% - if I drop the weight on FB% to one part instead of two, our top pitchers (min 150 IP) by xPWR (v2) look like this:

MLB xPWR (v2) Leaders (min 150 IP, 2016)

Pitcher            xPWR
Noah Syndergaard   65
Carlos Martinez    62
Yordano Ventura    62
Aaron Sanchez      62
Robbie Ray         61
Michael Fulmer     60
Jon Gray           59
Danny Duffy        59
Jose Fernandez     59
Carlos Rodon       57

And here are our best xPWR scores for relievers (min 40 IP):

MLB xPWR (v2) Leaders (min 40 IP, 2016)

Pitcher               xPWR
Aroldis Chapman       87
Arquimedes Caminero   78
Trevor Rosenthal      74
Zach Britton          73
Pedro Baez            73
Carlos Estevez        72
Craig Kimbrel         72
J.C. Ramirez          72
Edwin Diaz            71
Hunter Strickland     70

Note that I scaled the original values to best match the scale of PWR. Colon has (rather ignominiously) dropped out of the top ten, with his xPWR (v2) falling all the way to 45 - the same as John Lackey and Jake Odorizzi. The leaders in xPWR (v2) all fit the profile of a power pitcher - hard throwers with fast offspeed stuff who rely heavily on the fastball - and Colon can't cheat the metric as much.
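To make the effect of the reweighting concrete, here is a small Python sketch. It assumes the composite is a weighted average of standardized components, which is my reading of the "parts" in BP's description rather than a confirmed formula. The z-scores for the fastball-heavy soft tosser are hypothetical, chosen only to resemble Colon's general profile, and the helper names are made up.

```python
from fractions import Fraction


def component_shares(weights):
    """Fraction of the composite that each component accounts for."""
    total = sum(weights)
    return [Fraction(w, total) for w in weights]


def composite(z_scores, weights):
    """Weighted average of standardized component scores."""
    return sum(w * z for w, z in zip(weights, z_scores)) / sum(weights)


# Component order: fastball velocity, fastball percentage, offspeed velocity.
original_weights = (3, 2, 1)  # the weights BP describes
v2_weights = (3, 1, 1)        # FB% cut from two parts to one

# Hypothetical z-scores: below-average velocity, extreme FB%.
soft_tosser = (-1.0, 3.0, -0.5)

print(component_shares(original_weights)[1])  # FB% share under the original weights
print(component_shares(v2_weights)[1])        # FB% share under v2
print(composite(soft_tosser, original_weights))
print(composite(soft_tosser, v2_weights))
```

FB% goes from one-third of the composite to one-fifth, and this made-up pitcher flips from comfortably above average to slightly below it.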

But at the same time, we're still committing the same mistake as the original PWR metric in assuming that fastballs are thrown hard, just to a lesser degree. Maybe we should revamp our approach to the PWR metric. Perhaps we can simply use average pitch speed across all pitches. This approach rewards pitchers for simply throwing hard and doing so frequently. If I use total average pitch velocity and normalize those values to fit with PWR, Bart's exploit of FB% can't work. At the same time, taking a straight average of pitch velocity and normalizing it incorporates all of the tenets of PWR (fastball velocity, FB%, and offspeed velocity), so we're staying true to the spirit of the original metric. Let's use this approach for xPWR (v3). Here are the leaders for 2016 in xPWR (v3) among pitchers with 150+ IP...

MLB xPWR (v3) Leaders (min 150 IP, 2016)

Pitcher            xPWR
Noah Syndergaard   73
Aaron Sanchez      63
Carlos Martinez    63
Michael Fulmer     63
Robbie Ray         62
Yordano Ventura    61
Jon Gray           60
Jimmy Nelson       60
Jeff Samardzija    60
Carlos Rodon       60


... and relievers with 40+ IP.

MLB xPWR (v3) Leaders (min 40 IP, 2016)

Pitcher               xPWR
Aroldis Chapman       88
Arquimedes Caminero   79
Zach Britton          77
Trevor Rosenthal      75
Pedro Baez            73
Jeurys Familia        73
Carlos Estevez        73
J.C. Ramirez          72
Craig Kimbrel         72
Edwin Diaz            71

And what of our good friend Bartolo? Colon's xPWR (v3) score falls to around 47, in the same range as Kyle Gibson and Felix Hernandez. This third method gives us a lot less range in terms of scores, so it's more difficult to differentiate between players - but at the same time, it does just as good a job of identifying pitchers who fall into the power-pitcher archetype while leaving out those who do not.
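The reason a plain average of pitch velocity still captures all three ingredients is that it is just a usage-weighted average of pitch speeds. Here is a minimal Python sketch with the assumptions flagged: the 90 mph fastball and 89.5% usage for the Colon-like pitcher come from the numbers earlier in this post, while his offspeed velocity and the entire hard thrower are hypothetical. `average_velocity` is a made-up helper name.

```python
def average_velocity(pitch_mix):
    """Usage-weighted average velocity for a list of (usage, mph) pairs."""
    total_usage = sum(usage for usage, _ in pitch_mix)
    return sum(usage * mph for usage, mph in pitch_mix) / total_usage


# (share of pitches thrown, average velocity in mph)
colon_like = [
    (0.895, 90.0),  # fastballs: usage and velocity as cited in the post
    (0.105, 82.0),  # offspeed: hypothetical velocity
]
hard_thrower = [
    (0.60, 97.0),   # hypothetical fastball
    (0.40, 88.0),   # hypothetical offspeed
]

print(f"Colon-like:   {average_velocity(colon_like):.2f} mph")
print(f"Hard thrower: {average_velocity(hard_thrower):.2f} mph")
```

Throwing more fastballs only helps here to the extent that the fastball is faster than everything else, so a 90 mph heater thrown 89.5% of the time still averages out below a harder thrower's mixed arsenal.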

Is PWR "broken" in its current state? Of course not. Almost every metric has a few players who can cheat it one way or another - Colon happens to be extremely good at cheating the PWR metric. With a couple of changes, however, BP might be able to keep Colon from breaking into the top ten with a ridiculous PWR score while maintaining the integrity of the metric as a method of evaluating how well pitchers fit into the power-pitcher archetype.

Update: I talked briefly about this with Jeff Long at BP on Twitter. Long said that the metric was working as intended, in that it identified Colon's approach as representative of a power pitcher. I would tend to disagree, since the key ingredient of power pitching, velocity, is largely absent from Colon's approach.