Wednesday, October 31, 2018

Introducing rFIP and nFIP

One of the most valuable tools in all of baseball is pitch framing. The value of stealing strikes for pitchers is severely underrated by metrics such as fWAR and rWAR, and from our few measurements of it, we know that pitch framing from an individual catcher can be worth upwards of 20-30 runs over the course of a full season. Metrics like WARP assign these values to catchers, and while DRA adjusts pitcher performance to a degree to account for factors like pitch framing, it's difficult to truly grasp the impact of pitch framing on a pitcher's performance. This is where rFIP and nFIP come in.

Background, rFIP

It's intuitive that pitchers who throw a lot of strikes will register a lot of strikeouts, and pitchers who throw a lot of balls will register a lot of walks. In this sense, the relationship between strikes per ball and strikeouts per walk is quite strong. On a team-by-team level, a linear regression of strikeouts/walk on strikes/ball recorded an R² value of 0.9253 in 2018, indicating an extremely strong relationship between the two variables.

Since 2015, strikes/ball and strikeouts/walk on a pitcher-by-pitcher level recorded a strong relationship as well among qualified pitchers, albeit to a lesser degree (R² value of 0.6445).

Although the pitcher-by-pitcher relationship is weaker than the team-by-team one, both linear regressions yield the same formula for approximating strikeouts per walk from strikes per ball:

K/BB ≈ 7 × (strikes/balls) - 6

In other words, a pitcher's strikeouts per walk can be approximated reasonably well from the number of strikes and balls that they record.
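As a rough sketch, the linear relationship (applied later in this post as 7 × strikes per ball - 6) can be written as a small Python function. The function name and the round input numbers are illustrative assumptions, not part of the original SQL.

```python
def approx_k_per_bb(strikes: float, balls: float) -> float:
    """Approximate strikeouts per walk from strikes per ball.

    Uses the linear fit described in the post: K/BB = 7 * (strikes / balls) - 6.
    """
    if balls <= 0:
        raise ValueError("balls must be positive")
    return 7.0 * strikes / balls - 6.0


# Illustrative round numbers (not real data): two strikes for every ball.
print(approx_k_per_bb(2000, 1000))  # 8.0
```

Note that the fit returns a K/BB of exactly 1 when strikes equal balls, and it is only meaningful within the range of ratios that real pitchers actually post.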

Knowing that a pitcher's strikeouts and walks are affected by pitch framing, we might seek to remove the influence of the catcher and umpire on strike and ball calls. To do so, we look solely at the number of pitches a pitcher throws in the zone (which are technically strikes according to MLB rules but, through umpire mistakes and pitch framing, might be called balls) and the number of pitches a pitcher throws out of the zone (which are technically balls but may end up as strikes thanks to pitch framing). We can contextualize this by recalculating a pitcher's Fielding Independent Pitching (FIP) using the strikeout-to-walk ratio implied by their balls and strikes as called by a robotic umpire (with Statcast serving as our robot). This leads us to rFIP, robotic FIP.

We must operate with multiple caveats in calculating rFIP. The primary caveat is that we are assuming a two-dimensional strike zone, as Statcast does: Statcast defines the strike zone as the imaginary plane that runs perpendicular to the ground and parallel to the front edge of the plate, as shown in an illustration from MLB's official rulebook below.

However, baseball's strike zone is not two-dimensional, but three-dimensional. According to Major League Baseball's official rules, "The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap". Thus, the strike zone is not a plane, but a prism. A more accurate representation of the strike zone is shown below from Wikipedia.

Our approximation of the strike zone will miss some borderline pitches that do, in fact, cross through the strike zone but do not touch the front plane. Still, the number of strikes and balls that our methodology misses will be quite small, as it is quite difficult to throw a pitch that is a strike while avoiding that front plane.

Our methodology also assumes that a pitcher would record the same sum of strikeouts and walks regardless of the ratio of strikeouts to walks that they recorded. In other words, a pitcher might expect to record the same number of balls in play, HBP, and HR in a season regardless of how many walks and strikeouts they yield, so this is not a completely unfair assumption.

To demonstrate exactly how rFIP is calculated, I will carry through an example using 2018 Mets starter Jacob deGrom, who had the greatest single-season pitching performance of the past four seasons by rFIP.

I went into Statcast's database and pulled deGrom's strikeouts, walks, hit-by-pitches, home runs, and IP outs to calculate deGrom's FIP. deGrom recorded 268 K, 43 BB, 5 HBP, 10 HR, and 643 IP outs in 2018 according to Statcast. Note that these values are slightly off from deGrom's actual totals - 269 K, 46 BB, 5 HBP, 10 HR, 651 IP outs - because Statcast has some missing values. Still, our calculated FIP for deGrom (1.94) is only marginally different from deGrom's actual FIP (1.99). This is the baseline FIP for deGrom.
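For readers who want to check the baseline, here is a minimal Python sketch of the standard FIP formula applied to both sets of totals. The FIP constant is not stated in this post; the value of 3.161 below is an assumption (roughly the published 2018 league constant), chosen because it reproduces the figures above.

```python
# Assumption: approximate 2018 league FIP constant; the post does not state it.
FIP_CONSTANT_2018 = 3.161


def fip(k, bb, hbp, hr, ip_outs, constant=FIP_CONSTANT_2018):
    """Standard FIP: (13*HR + 3*(BB + HBP) - 2*K) / IP + constant."""
    innings = ip_outs / 3.0  # IP outs are recorded outs; three per inning
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


statcast_fip = fip(k=268, bb=43, hbp=5, hr=10, ip_outs=643)  # Statcast totals
actual_fip = fip(k=269, bb=46, hbp=5, hr=10, ip_outs=651)    # official totals

print(f"{statcast_fip:.2f} {actual_fip:.2f}")  # 1.94 1.99
```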

I then pulled deGrom's total strikes and total balls, then I measured how many strikes were "stolen strikes" - that is to say, called strikes recorded as outside of the strike zone - and how many balls were "lost strikes" - called balls recorded as inside the strike zone. In 2018, deGrom recorded 1698 strikes, 80 of which were stolen strikes, and 999 balls, 54 of which were lost strikes. To calculate deGrom's strike total as called by a robotic umpire, I subtracted the number of stolen strikes from deGrom's total number of strikes and then added his lost strike total to that figure. deGrom's ball total as called by a robotic umpire was equivalent to the number of balls deGrom recorded minus his lost strike total plus his stolen strike total. deGrom's rStrikes (robotic strike calls) were 1672 (1698 - 80 + 54 = 1672) and his rBalls were 1025 (999 - 54 + 80 = 1025).
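The bookkeeping in that step is simple enough to sketch in a few lines of Python. The function and argument names here are my own illustrative choices, not identifiers from the SQL.

```python
def robo_counts(strikes, balls, stolen_strikes, lost_strikes):
    """Re-score strikes and balls as a robotic umpire would call them.

    stolen_strikes: called strikes located outside the zone (become balls)
    lost_strikes:   called balls located inside the zone (become strikes)
    """
    r_strikes = strikes - stolen_strikes + lost_strikes
    r_balls = balls - lost_strikes + stolen_strikes
    return r_strikes, r_balls


# deGrom's 2018 totals from the paragraph above
r_strikes, r_balls = robo_counts(strikes=1698, balls=999,
                                 stolen_strikes=80, lost_strikes=54)
print(r_strikes, r_balls)  # 1672 1025
```

Because every reclassified pitch moves from one bucket to the other, the total pitch count (2697 here) is unchanged.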

From there, I can use deGrom's rStrike/rBall ratio and the relationship between strikes per ball and strikeouts per walk to approximate deGrom's strikeouts per walk assuming a robotic umpire. deGrom's rStrike/rBall ratio was 1.63, so the equivalent strikeouts per walk ratio would be 7 × 1.63 - 6 = 5.41 (not far off from deGrom's 2018 figure of 5.85). deGrom recorded 268 + 43 = 311 strikeouts and walks in 2018, so his adjusted totals according to his rK/BB and the sum of his strikeouts and walks would be 263 strikeouts and 48 walks. Plugging these back into the FIP formula yields an rFIP of 2.07.
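The redistribution step can be sketched as follows: hold the sum of strikeouts and walks fixed at 311, split it according to the approximated K/BB, and plug the result into FIP. This is a sketch under two assumptions: the FIP constant of 3.161 is not given in the post, and the code carries the unrounded rStrike/rBall ratio (which gives a K/BB nearer 5.42 than the 5.41 obtained from the rounded 1.63).

```python
FIP_CONSTANT_2018 = 3.161  # assumption: not stated in the post


def split_k_bb(total_k_bb, k_per_bb):
    """Split a fixed K + BB total into strikeouts and walks at a given K/BB."""
    walks = total_k_bb / (1.0 + k_per_bb)
    return total_k_bb - walks, walks


def fip(k, bb, hbp, hr, ip_outs, constant=FIP_CONSTANT_2018):
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / (ip_outs / 3.0) + constant


r_strikes, r_balls = 1672, 1025
k_per_bb = 7.0 * r_strikes / r_balls - 6.0    # linear fit from the post
r_k, r_bb = split_k_bb(268 + 43, k_per_bb)    # K + BB held constant at 311
r_fip = fip(r_k, r_bb, hbp=5, hr=10, ip_outs=643)

print(round(r_k), round(r_bb), f"{r_fip:.2f}")  # 263 48 2.07
```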

The benefit of rFIP is that it isolates the influence of pitch framing, bad umpires, and other factors, distilling a pitcher's ability to throw strikes while still incorporating the pitcher's other tendencies (HBP and HR). It also serves to identify which pitchers do not throw many strikes but are saved from these tendencies thanks to their catchers (Zack Greinke is a prime example of this, running an extremely high FIP - rFIP differential during his time in Arizona with Jeff Mathis as his personal catcher).

Background, nFIP

But Major League Baseball does not have robotic umpires at the moment, and given the current strength of the umpires' union, it might not for a very long time. In that sense, we might wish to know what a pitcher's performance might look like if they were pitching to a normal catcher/umpire duo. To accomplish this, we will use nFIP - normal FIP - which captures a pitcher's performance based on their strikes and balls as if they had an average catcher.

Since 2015, pitchers have seen 7.8% of their out-of-zone pitches converted from balls to strikes, and 5.4% of their zone pitches converted from strikes to balls. Following our deGrom example, deGrom's expected strikes with an average catcher/umpire duo would be 1672 rStrikes - 1672 rStrikes × 5.4% + 1025 rBalls × 7.8% = 1662 nStrikes, and deGrom's expected balls would be 1025 rBalls - 1025 rBalls × 7.8% + 1672 rStrikes × 5.4% = 1035 nBalls.
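The same conversion, sketched in Python using the two league-wide rates quoted above (the function and constant names are illustrative assumptions):

```python
STOLEN_RATE = 0.078  # share of out-of-zone pitches called strikes, since 2015
LOST_RATE = 0.054    # share of in-zone pitches called balls, since 2015


def normal_counts(r_strikes, r_balls,
                  stolen_rate=STOLEN_RATE, lost_rate=LOST_RATE):
    """Apply league-average framing/umpire conversion rates to robo counts."""
    lost = r_strikes * lost_rate     # zone pitches that get called balls
    stolen = r_balls * stolen_rate   # out-of-zone pitches called strikes
    return r_strikes - lost + stolen, r_balls - stolen + lost


n_strikes, n_balls = normal_counts(1672, 1025)
print(round(n_strikes), round(n_balls))  # 1662 1035
```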

Using the same methodology as rFIP to approximate K/BB from strikes and balls, deGrom's nFIP comes out to 2.10 - about 0.16 higher than his FIP, which indicates that deGrom had a little bit of help from his catchers in getting his value.
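Putting the pieces together, a compact sketch of the whole pitch-count-to-FIP pipeline reproduces that nFIP. As before, the FIP constant of 3.161 and the function name are assumptions on my part rather than anything taken from the SQL.

```python
FIP_CONSTANT_2018 = 3.161  # assumption: not stated in the post


def fip_from_pitch_counts(strikes, balls, total_k_bb, hbp, hr, ip_outs,
                          constant=FIP_CONSTANT_2018):
    """FIP implied by a strike/ball split, holding K + BB, HBP and HR fixed."""
    k_per_bb = 7.0 * strikes / balls - 6.0       # linear fit from the post
    walks = total_k_bb / (1.0 + k_per_bb)
    strikeouts = total_k_bb - walks
    innings = ip_outs / 3.0
    return (13 * hr + 3 * (walks + hbp) - 2 * strikeouts) / innings + constant


# Normalize deGrom's robo counts with the league-average conversion rates.
r_strikes, r_balls = 1672, 1025
n_strikes = r_strikes - r_strikes * 0.054 + r_balls * 0.078
n_balls = r_balls - r_balls * 0.078 + r_strikes * 0.054

n_fip = fip_from_pitch_counts(n_strikes, n_balls, total_k_bb=311,
                              hbp=5, hr=10, ip_outs=643)
print(f"{n_fip:.2f}")  # 2.10
```

Passing the robo counts (1672, 1025) to the same function instead of the normalized counts gives the rFIP.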

As a quick "stupid-check", we can also see that our league FIP (4.12) is extremely close to our league nFIP (4.17), so we know that our conversion methods are rather effective.

Advantages of rFIP and nFIP

rFIP is quite useful for us in that it gives us a solid idea of how effective a pitcher is at throwing strikes. In terms of reliability from year N to year N+1 among pitchers from 2015-2018 with at least 100 IP in both year N and year N+1, FIP and nFIP have roughly the same correlation (R² of 0.2607 and 0.2643 respectively), while rFIP has an improved year-to-year reliability with an R² of 0.2860.
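For anyone wanting to run this kind of year-to-year check on their own data, here is a minimal sketch of the R² calculation for a simple linear regression, which equals the squared Pearson correlation between the year N and year N+1 values. The function name is my own, and the numbers in the usage line are placeholders for illustration only, not real pitcher data.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared Pearson r)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either sample is constant")
    return (sxy * sxy) / (sxx * syy)


# Placeholder values only: pair each pitcher's year N metric with year N+1.
print(r_squared([1, 2, 3], [1, 3, 2]))  # 0.25
```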

Neither rFIP nor nFIP appears to predict year N+1 FIP with any reliability (R² of 0.1796 and 0.1846 respectively), which is to be expected: since we have neutralized the impact of catching on a pitcher's ability, the influence of the catcher is seen only in FIP.


As an important reminder, rFIP and nFIP are simplified models that are not attempting to be one-hundred percent accurate. We are approximating strike and ball calls, we are approximating a pitcher's strikeout and walk totals from those approximations, and we are assuming that a pitcher has no influence on pitch framing, which is most likely not entirely true. Still, our model passes the eye test in that it capably identifies terrific seasons and does not appear to unduly punish pitchers for their catchers' skill.


A full leaderboard of rFIP and nFIP leaders since 2015, both split by season and career totals over that span, can be seen below. The default selection for the filter includes qualified pitching leaders.


The SQL code used to calculate rFIP and nFIP from a Statcast database can be found here.

As always, if you have any questions, suggestions, or feedback, please leave a comment or tweet me @John_Edwards_.