Friday, March 15, 2019

Don't Draft Him, Draft Him! Points Edition

Photo: Ian D'Andrea/Flickr


The Ringer just released an article called "The 2019 Fantasy Baseball Do Not Draft Team" - an article in which the author claims "This piece isn’t a warning to entirely avoid drafting any of those players, but rather a plea to think twice before making an investment" but whose title betrays the nature of the piece. In it, the author argues that guys like Trevor Story (22.3 ADP on Yahoo, 6th highest ranked SS) and Blake Snell (30.5 ADP on Yahoo, 7th highest ranked SP) are too highly ranked because of "regression", but pretty clearly ignores positional depth - which plays a large part in ADP. The article's author fails to mention projections for these players even once - a little mind-boggling given that projection systems are designed entirely around regressing breakouts and tend to do a pretty good job of it.

I'm here to arm you with a better model for deciding which players to avoid on your fantasy draft day. We're not going to wave hands at ADP and say "yeah, he broke out, but I'm going to apply a general rule to everyone who broke out in 2018 and say that you shouldn't draft this guy". We're also not going to ignore positional depth or projection systems - on the contrary, we'll make sure that they're baked in. And the basis of our model won't be "regressing from a breakout" - we'll look at where we can get equivalent production for a much better ADP.

N.B. This list is based on ESPN's standard points system. I might do a Roto piece with this same premise a little later this week, we'll see.

Catcher

Don't Draft: J.T. Realmuto, Gary Sanchez
Do Draft: Yadier Molina, Buster Posey

It can be tempting to go grab a T1 catcher just to have the position locked up, especially if Realmuto is on the board and you're feeling anxious. However, at an ADP of 55.4 on ESPN, taking Realmuto often means losing out on a fairly valuable S-tier relief ace like Kenley Jansen or Edwin Diaz. Gary "I Hit .186 Last Season" Sanchez looks primed for a rebound, and if it really means that much to you, you can probably grab him at his 79.7 ADP - but you might be better off waiting for ol' reliable Molina and Posey to fall to you.

Player              Position   ESPN Points (Razzball)   ESPN Points (Depth Charts)   ESPN Points (ESPN)   ESPN ADP
J.T. Realmuto       C          292                      314                          337                  55.4
Gary Sanchez        C          297                      327                          270                  79.7
Willson Contreras   C          237                      235                          261                  121.9
Buster Posey        C          262                      316                          289                  127.2
Yasmani Grandal     C          241                      258                          262                  139.1
Yadier Molina       C          271                      264                          284                  144.8
Wilson Ramos        C          214                      197                          248                  167.9
Danny Jansen        C          212                      233                          210                  242.9

Sanchez projects to be as good a catcher as Realmuto, if not better, on the rebound, and Posey and Molina offer comparable production at ADPs 70+ picks later than Realmuto's. Sure, you can bank on Realmuto breaking out and scoring runs in a loaded Phillies lineup, but breakouts for catchers can only go so far and carry a similar base degree of risk across all players at the position. With Molina, the closest thing the position has had to an Iron Man in the past decade, you could reap similar value, even at Molina's advanced age.

Besides, remember that S-tier relief ace I was talking to you about? According to Razzball, RPs with an ADP of 70 or lower - Diaz, Treinen, Jansen, and Chapman - are projected at about 350 fantasy points, but relievers with an ADP between 120 and 140 are projected at about 268 fantasy points. Sanchez and Realmuto are projected for an average of 295 points, but Molina and Posey are projected for an average of 267 points. You'll lose more points by missing out on Diaz than you will if you can't get Realmuto. Make the smart move.
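
The trade-off above is just a comparison of two drop-offs. Here's a minimal Python sketch of it, using only the Razzball averages quoted in this section (the variable names are mine, purely for illustration):

```python
# Average projected ESPN points, as quoted in the paragraph above (Razzball).
elite_rp = 350        # RPs with an ADP of 70 or lower
late_rp = 268         # RPs with an ADP between 120 and 140
early_catcher = 295   # Sanchez / Realmuto average
late_catcher = 267    # Molina / Posey average

# Points you give up by waiting at each position.
rp_drop = elite_rp - late_rp
catcher_drop = early_catcher - late_catcher

# Net swing from taking the elite reliever early and waiting on catcher.
net_gain = rp_drop - catcher_drop

print(f"Waiting on RP costs {rp_drop} points")
print(f"Waiting on C costs {catcher_drop} points")
print(f"Net gain from grabbing the RP first: {net_gain} points")
```

By these averages, waiting on a reliever costs roughly three times as many points as waiting on a catcher.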

First Base

Don't Draft: Freddie Freeman, Paul Goldschmidt
Do Draft: Anthony Rizzo, Carlos Santana

This is one of those where you're probably fine if you don't take my advice. But still! You can snag a great degree of value here if you play your cards right.

It is absolutely mind-boggling to me that people are taking Rizzo (32.8 ADP) after Freeman (21.3 ADP) and Goldschmidt (21.8 ADP). Mind-boggling. Sure, you can look at their numbers from last season and say, "Well, Freeman was better!"

Player             Position   Runs   TB    RBI   BB   K     SB   ESPN Points
Paul Goldschmidt   1B         95     316   83    90   173   7    418
Freddie Freeman    1B         94     312   98    76   132   10   458
Anthony Rizzo      1B         74     266   101   70   80    6    437

But Rizzo hit just .149 in March/April, fueled by a .172 BABIP. Here's how the rest of the season shook out for these 1B, beyond April:

Player             Position   Runs   TB    RBI   BB   K     SB   ESPN Points
Paul Goldschmidt   1B         74     266   72    72   137   6    353
Freddie Freeman    1B         75     255   79    55   112   8    360
Anthony Rizzo      1B         66     252   92    66   65    5    416

Rizzo has always been a little slow out of the gate (.243/.371/.460 hitter in March and April), but that difference isn't large enough to call it tangibly different from his career line, and it's a damn good line for a first baseman. Rizzo is easily the best fantasy 1B in the league, and he's worth an early reach at a fairly skinny position.
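
If you want to check the point totals yourself, the ESPN Points column in both tables is consistent with a simple rule: +1 for each run, total base, RBI, walk, and stolen base, and -1 for each strikeout. That rule is my reading of the numbers, so treat the sketch below as an assumption rather than ESPN's official scoring code; `espn_points` is a made-up helper name.

```python
def espn_points(runs, tb, rbi, bb, k, sb):
    """Assumed scoring: +1 per R, TB, RBI, BB and SB; -1 per K."""
    return runs + tb + rbi + bb + sb - k

# 2018 full-season lines from the first table: (R, TB, RBI, BB, K, SB)
full_season = {
    "Paul Goldschmidt": (95, 316, 83, 90, 173, 7),
    "Freddie Freeman": (94, 312, 98, 76, 132, 10),
    "Anthony Rizzo": (74, 266, 101, 70, 80, 6),
}

totals = {name: espn_points(*line) for name, line in full_season.items()}
for name, points in totals.items():
    print(f"{name}: {points}")

# Rizzo's line from May onward, taken from the second table.
rizzo_after_april = espn_points(66, 252, 92, 66, 65, 5)
print(f"Rizzo after April: {rizzo_after_april}")
```

The strikeout penalty is what separates Rizzo here: he gives back far fewer points on Ks than either Freeman or Goldschmidt.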

If you miss out on Rizzo, all is not lost! Give Carlos Santana a look. Here are both players' Razzball projections:

Player           Position   Runs   TB    RBI   BB   K    SB   ESPN Points   ESPN ADP
Anthony Rizzo    1B         95     277   92    77   91   7    457           32.9
Carlos Santana   1B         83     233   82    92   88   3    405           139.9

Santana's elite strikeout avoidance makes him as good of a play as Freeman (409 points) or Goldschmidt (400) at a much higher ADP. Spend a top 10 pick elsewhere knowing that you can get Santana with a reach and find a similar level of production. But don't forget to reach! 1B is a bloodbath this year past Matt Carpenter, so I can't blame you if you want to grab Freeman or Goldschmidt just for the certainty. But if you get either of those two players with Rizzo still on the board, shame on you, shame!

Second Base

Don't Draft: Javier Baez
Do Draft: Daniel Murphy

Javier Baez is a very good player! He's generally a very good fantasy player. But his ADP is so ridiculously high that, coupled with his risk profile, I can't in good conscience recommend drafting him.

In a format that punishes strikeouts and rewards walks, Baez's value is significantly overrated - even in the midst of a fringe MVP campaign, Baez was one of the worst hitters in the league in terms of discipline.

Name             Team        PA    BB%     K%       BB/K
Dee Gordon       Mariners    588   1.50%   13.60%   0.11
Salvador Perez   Royals      544   3.10%   19.90%   0.16
Javier Baez      Cubs        645   4.50%   25.90%   0.17
Kevin Pillar     Blue Jays   542   3.30%   18.10%   0.18
Tim Anderson     White Sox   606   5.00%   24.60%   0.20
Derek Dietrich   Marlins     551   5.30%   25.40%   0.21
Chris Davis      Orioles     522   7.90%   36.80%   0.21
Evan Longoria    Giants      512   4.30%   19.70%   0.22
Ryon Healy       Mariners    524   5.20%   21.60%   0.24
Amed Rosario     Mets        592   4.90%   20.10%   0.24

Already, Baez's value is going to be limited by his strikeout tendencies. Does that make him a bad fantasy play? Not really - power is the main attraction for Baez. Unfortunately for Baez, he's not the most outstanding player at the position in that regard. Here are the projected leaders in SLG by FanGraphs' Depth Charts projections.

Name                Team        2B   HR   SLG     ISO
Daniel Murphy       Rockies     40   22   0.510   0.199
Javier Baez         Cubs        31   31   0.501   0.229
Jose Altuve         Astros      33   17   0.464   0.158
Jonathan Schoop     Twins       26   21   0.461   0.198
Gleyber Torres      Yankees     25   27   0.459   0.198
Asdrubal Cabrera    Rangers     33   20   0.454   0.179
Ozzie Albies        Braves      34   20   0.452   0.180
Rougned Odor        Rangers     29   27   0.449   0.200
Adalberto Mondesi   Royals      30   24   0.449   0.192
Brian Dozier        Nationals   27   22   0.446   0.200

Atop the SLG leaderboard is none other than Daniel Murphy. Even after a difficult and injury-filled age-32 season, Murphy is projected for a big rebound, especially within the doubles-friendly confines of Coors Field. Couple that with Murphy's elite strikeout rates (since 2015, Murphy is 6th in K% among qualified hitters, just behind Michael Brantley), and Murphy represents both a much better fantasy player than Baez (409 versus 373 fantasy points per Razzball) and a much better value (65.7 ADP versus 19.8 on ESPN). Sure, Murphy will probably lose 2B eligibility after this season, but for now, he's the second-best option at the position behind Jose Altuve.

Third Base

Don't Draft: Kris Bryant
Do Draft: Justin Turner, Mike Moustakas

Third base is stacked this year. Jose Ramirez, Nolan Arenado, and Alex Bregman can each anchor your fantasy team, and you're probably in good shape if they are. If they're not, however, you're still in pretty good shape, because 3B is also deep this year - which gives you plenty of opportunities to grab value.

And this isn't to say that Kris Bryant is overrated, or a bad pick - yeah, there are some injury concerns, but a guy who can deliver a 30 HR/100 R/100 RBI campaign at 3rd is usually worth the 29.1 ADP, especially with OF eligibility. Bryant is the target of this piece not because he's a bad bet or overvalued, but because he's at the bottom of the first tier of 3B, and he's the closest to the next tier - which is itself not that far away from the first.

Player           Position   ESPN Points (Razzball)   ESPN Points (Depth Charts)   ESPN Points (ESPN)   ESPN ADP
Jose Ramirez     3B         519                      516                          557                  4.6
Nolan Arenado    3B         487                      480                          516                  5.9
Alex Bregman     SS/3B      484                      460                          491                  15.2
Anthony Rendon   3B         427                      427                          434                  41.3
Kris Bryant      3B/OF      409                      393                          387                  29.1
Matt Carpenter   1B/3B      387                      363                          407                  64.7
Justin Turner    3B         375                      409                          375                  90.2
Mike Moustakas   3B         369                      395                          395                  117.8

Immediately, Rendon jumps out as a target - his remarkable consistency coupled with his high ADP indicates that he could be a nice target in the 3rd or 4th round. But I instead want to highlight Turner and Moustakas - guys who have been similarly consistent fantasy assets but have fallen by the wayside in terms of ADP. Turner is 34 and missed a lot of time with injury, but when he was healthy, he demolished opposing pitching, hitting an impressive .312/.406/.518. Here are two players' stat lines since 2016 - Turner and mystery player A.

Player             Position   BB%     K%      AVG     OBP     SLG
Justin Turner      3B         10.9%   11.4%   0.318   0.411   0.524
Mystery Player A   3B         11.3%   13.7%   0.285   0.374   0.504

Who's mystery player A? Alex Bregman. Sure, it's a tad misleading - Bregman is on the upswing and Turner is on the wrong side of the aging curve, but it speaks to the kind of player Turner is. When he's healthy, he gives Arenado a run for his money for "Best 3B in the NL West".

And don't think that I forgot about Moustakas! Even though he's going to be playing a lot of games at 2B this season (which also makes him an excellent 2B target, for those of you playing along at home), he profiles similarly to Turner in terms of strikeout avoidance while bringing a lot more HR power to the fold (Depth Charts projects Moustakas to record the 2nd most HR in the majors for 3B, behind Arenado). Both guys are excellent later targets at their ADP, and a guy like Moustakas would be the best CI in your fantasy league.

Shortstop

Don't Draft: ???
Do Draft: Andrelton Simmons

Imma be real with y'all - ADP for shortstops is pretty fair this year. That's part of the reason why I'm ragging on the Ringer for advising you not to draft Trevor Story - even if you buy into the regression, he's still pretty fairly valued at a position with a decent amount of fantasy depth. There are enough good shortstops that in eight and ten team leagues, everybody can have a shortstop that they're happy with by the end of the 50th round. Chances are if there's a shortstop you want, you're going to be just fine to go out and get them.

This is more of a sleeper pick suggestion, then, but it's a damn good one! Andrelton Simmons is principally known as a glove-first defensive shortstop, but rather quietly he's become a genuinely good hitter, as well as one of the most disciplined hitters at his position. Here are last year's BB/K leaders at shortstop:

Name                Team        PA    BB%    K%      BB/K
Andrelton Simmons   Angels      600   5.8%   7.3%    0.80
Didi Gregorius      Yankees     569   8.4%   12.1%   0.70
Manny Machado       - - -       709   9.9%   14.7%   0.67
Francisco Lindor    Indians     745   9.4%   14.4%   0.65
Jurickson Profar    Rangers     594   9.1%   14.8%   0.61
Xander Bogaerts     Red Sox     580   9.5%   17.6%   0.54
Trea Turner         Nationals   740   9.3%   17.8%   0.52
Marcus Semien       Athletics   703   8.7%   18.6%   0.47
Jean Segura         Mariners    632   5.1%   10.9%   0.46

That's a big difference between Simmons and the rest of the field. Given that Simmons has averaged 150+ games, 72+ runs, 70+ RBI, and 14+ SB over the past two seasons, it looks like Simmons is a terrific asset if you missed out on those top-tier SS or you're trying to find a top-tier MI given current ADP (a stupid 170.6 on ESPN).
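
For anyone who wants to rebuild that BB/K column, it is just walk rate divided by strikeout rate. Here's a quick sketch using the BB% and K% values from the shortstop table; because those percentages are already rounded, expect the results to land within about a hundredth of the published column rather than match it exactly. The `bb_per_k` helper is a name I made up.

```python
def bb_per_k(bb_pct, k_pct):
    """Walks per strikeout, computed from the two rate stats."""
    return bb_pct / k_pct

# (BB%, K%) for each shortstop in the table above.
shortstops = {
    "Andrelton Simmons": (5.8, 7.3),
    "Didi Gregorius": (8.4, 12.1),
    "Manny Machado": (9.9, 14.7),
    "Francisco Lindor": (9.4, 14.4),
    "Jurickson Profar": (9.1, 14.8),
    "Xander Bogaerts": (9.5, 17.6),
    "Trea Turner": (9.3, 17.8),
    "Marcus Semien": (8.7, 18.6),
    "Jean Segura": (5.1, 10.9),
}

ratios = {name: bb_per_k(bb, k) for name, (bb, k) in shortstops.items()}
leader = max(ratios, key=ratios.get)

for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20}{ratio:.2f}")
print(f"Leader: {leader}")
```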

Outfield

Don't Draft: Bryce Harper, J.D. Martinez
Do Draft: Andrew Benintendi, Juan Soto

This is another instance of, "those guys are some good players, but you can do better for cheaper!". Harper (who, as I was writing this article, took a 96 MPH fastball off his ankle, having already missed valuable training time by arriving late to Phillies camp) and Martinez are among the best outfielders in baseball, and their ADPs are justifiably high (14.5 and 5.6). However, you should be more bullish on some of the younger blood in the league - namely Benintendi and Soto (38.2 and 33.9) because of some upcoming role changes.

Benintendi is a stud - the kind of guy who's perfect for leagues that punish Ks. He scores a lot of runs, he gets his walks, he hits for decent power, steals 20+ bases, and he runs well-below-average K rates. He was a stud last year (445 points), and he's going to be even more of a stud this year now that he's going to be the primary leadoff hitter for the Sox. Last year, Benintendi was principally the no. 2 hitter for the Sox, but now he's going to be batting ahead of Mookie Betts and J.D. Martinez, literally the 2nd and 3rd best hitters in the major leagues by wRC+ last season. That's going to be a lot of runs, and it should be plenty to offset the drop in RBI totals from the move (leadoff hitters for Boston recorded 32.6% of plate appearances with runners on in 2018, compared to 47.5% for hitters batting second).

Soto, meanwhile, looks primed to become the centerpiece of the Nationals' lineup following Harper's departure. Soto most frequently batted 5th in the Nationals' lineup, but with Harper gone, he could move into the cleanup spot this season - with Adam Eaton, Trea Turner, and Anthony Rendon likely ahead of him, Soto could easily bust 100 RBIs this year given the speed and on-base prowess of that trio.

Starting Pitching

Don't Draft: ???
Do Draft: German Marquez, Chris Archer

Like shortstop, starting pitching doesn't really have any bad values hanging around. There are the obvious three top aces - deGrom, Scherzer, and Sale - and then a bunch of very good pitchers, a bunch of okay pitchers, a bunch of bad pitchers, and a bunch of guys who should really never be drafted. I'm going to talk about two guys who belong in the "very good" tier but are being drafted like they're in the "okay" tier: German Marquez and Chris Archer. Here's what SPs between 100 and 150 ADP look like in terms of projected points.
Player            Position   ESPN Points (Razzball)   ESPN Points (Depth Charts)   ESPN Points (ESPN)   ESPN ADP
Zack Wheeler      SP         348                      330                          386                  114.3
Miles Mikolas     SP         329                      356                          364                  120.7
Masahiro Tanaka   SP         341.2                    314                          376                  122.4
German Marquez    SP         375.1                    406                          335                  129.9
Charlie Morton    SP         304.5                    324                          367                  133.8
Chris Archer      SP         351.2                    383                          344                  141.5
J.A. Happ         SP         309.5                    324                          358                  142.4
Luis Castillo     SP         301.8                    320                          387                  143.8

For both Razzball and Depth Charts, Marquez and Archer jump out as seriously undervalued assets. It's not just some love from Steamer - Steamer, ZiPS, and ATC project Marquez for an ERA around 3.80 (though PECOTA pegs Marquez for a 3.31 ERA - hubba hubba!), and the spread for Archer is a little wider but still quite good.
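
One quick way to see how far those two stand out is to average the Razzball and Depth Charts projections for each pitcher in the table and sort. This is only a sketch: the equal-weight average is my own choice for illustration, not a method either projection system endorses.

```python
# (Razzball, Depth Charts) projected ESPN points for SPs with a 100-150 ADP.
sp_projections = {
    "Zack Wheeler": (348, 330),
    "Miles Mikolas": (329, 356),
    "Masahiro Tanaka": (341.2, 314),
    "German Marquez": (375.1, 406),
    "Charlie Morton": (304.5, 324),
    "Chris Archer": (351.2, 383),
    "J.A. Happ": (309.5, 324),
    "Luis Castillo": (301.8, 320),
}

# Equal-weight average of the two systems (an illustrative assumption).
average = {name: (razz + dc) / 2 for name, (razz, dc) in sp_projections.items()}
ranked = sorted(average, key=average.get, reverse=True)

for name in ranked:
    print(f"{name:<18}{average[name]:.1f}")
```

Marquez and Archer come out first and second on that blended list despite being the fourth and sixth pitchers off the board in this group.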

Marquez is a skilled pitcher who always plays to the Coors caveat, but at a bare minimum, he brings a considerable asset to the table that many more don't - health. Here are Marquez's professional innings thrown across levels since 2015: 139.0, 187.1, 172.0, 196.0. Maybe this screams "health risk" to you in terms of workload, especially with Marquez's FBv, but the fact remains that Marquez has managed high workloads and could be one of very few pitchers to hit 200 IP this year. Even if you're not sure about his skill level, the playing time alone should sell you on Marquez as a target.

Archer, meanwhile, feels like a familiar mistake - every year, somebody drafts him with the expectation that he'll finally regain his All-Star form and every year that somebody is disappointed. But maybe you should buy into him on the grounds of change of scenery. The Pirates started throwing a lot more sliders in 2018 before acquiring Archer, so there's some hope here that Archer might finally rediscover the sauce on his beautiful slider with the help of a coaching/analytics staff that has revitalized their rotation with the addition of the pitch. At a bare minimum, last season was only the first time since 2014 that Archer hadn't reached at least 190 innings, so you can go after him for the workload and consider anything else gravy.

Relief Pitching

Don't Draft: Roberto Osuna, Felipe Vazquez
Do Draft: Raisel Iglesias

There's a big drop-off in terms of ADP with closers - Felipe Vazquez (93.9 ADP) is the last reliever off the board with an ADP under 100, and Kirby Yates (117.6 ADP) is the first with an ADP over 100. The point values generally align with those draft tiers - as I mentioned earlier, the difference between a top-tier reliever and a guy you're looking to get at 130+ ADP is nearly an 80 point drop at a minimum. But you can cheat the system here and avoid going after Osuna (87.1 ADP) or Vazquez by targeting Raisel Iglesias (133.8 ADP).

Iglesias was a so-so fantasy RP last season, racking up fewer points than Jeremy Jeffress but more than Sean Doolittle. He's a much surer bet as a fringe top-tier RP this year, however, because of an increase in save opportunities - the Reds swung a number of trades to bolster their roster, and the NL Central looks crowded as all get out. According to FanGraphs's playoff odds, the Cubs have the best odds in the division of making the playoffs at just 65.5%, and just 10 games behind them, the Reds are projected for fourth. As we saw last season in the AL West, a crowded playoff field leads to plenty of close, competitive games, which in turn leads to a number of save opportunities - factors that helped Edwin Diaz and Blake Treinen rack up absurd numbers of saves. Iglesias is nowhere near the level of those elite closers, but he'll certainly get a boost from playing on a competitive team in a competitive division. Couple that with some HR regression and you have a serious value at a 133.8 ADP.


Sunday, February 17, 2019

Overwatch League Fantasy: Should you start players in sweeps?

In fantasy football, team match-ups play a huge role in deciding when to start and bench players - fantasy owners usually started wide receivers against the Jets (who allowed an average of 28.5 fantasy points per game to WR in 2018), but tended to sit anyone who wasn't a superstar against the Jaguars (who allowed just 16.8 points per game to WR in 2018, best in the NFL). Overwatch Fantasy is little different - fantasy owners should pay close attention to who teams are playing against, since team strength can play a huge role in your fantasy points.

Should you start a player even when you think their team will get curb-stomped? How about when you think your team will do the curb-stomping? What about betting on a crucial fifth map, where players can boost their play time with some extra minutes? Here's your guide to figuring out who/when to start in the Overwatch League based on how you think team units will play.

A brief note on terminology
  • The word "Map" is used to refer to a single discrete instance of team competition within an Overwatch League game, or "Match". An example of a map might be Busan, Dorado, or Route 66.
  • The word "Match" is used to refer to multiple maps which are played in a game. Teams generally play four maps in a single match but may play a fifth map should the teams be tied after playing four maps.

Part One: Establishing Baselines

To determine when you should be starting and sitting players in matchups, we'll first need to establish baselines for fantasy play. We'll use OWL Stats from 2018's regular season with HighNoon.GG's fantasy scoring system. It is of note that these stats represent fantasy stats from a 2/2/2 meta as opposed to a GOATS meta, but many of the same principles hold true, as we are generally looking at the view from 20,000 feet as opposed to breaking these stats down by hero choice or role.

We'll first calculate the average number of fantasy points accumulated per game. In Overwatch League stage play in 2018, the league recorded collectively the following values:

Total Damage     Total Elims   Total Heals
130,285,657.71   212,440       48,777,512.61

HighNoon.GG uses the following scoring system based on these metrics:
  • 1 point per 1,000 damage recorded
  • 1 point per 1,000 healing recorded
  • 0.5 points per elimination recorded
The breakdown of fantasy points recorded leaguewide in each category is as follows:

Damage Fantasy Points   Eliminations Fantasy Points   Healing Fantasy Points
130285.66               106220.00                     48777.51

Thus, a total of 285,283.17 fantasy points were recorded by OWL players in 2018. The Overwatch League had 250 matches during stage play, and a total of 12 players were fielded by both teams at any given time, so the average fantasy points per player-slot per match was 285,283.17 / (250 × 12) = 95.09. That 95.09 is our overall baseline for comparison. The breakdown of average points per category per match per player-slot is as follows:

Damage Fantasy Points   Eliminations Fantasy Points   Healing Fantasy Points
43.43                   35.41                         16.26
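
The whole baseline calculation fits in a few lines of Python. This sketch uses the league totals and the scoring weights listed above; the variable names are my own.

```python
# League-wide 2018 stage-play totals from the table above.
total_damage = 130_285_657.71
total_elims = 212_440
total_heals = 48_777_512.61

# Scoring weights: 1 point per 1,000 damage or healing, 0.5 per elimination.
damage_pts = total_damage / 1000
elim_pts = total_elims * 0.5
heal_pts = total_heals / 1000
total_pts = damage_pts + elim_pts + heal_pts

# 250 matches, with 12 player-slots (6 per team) in each match.
matches = 250
slots_per_match = 12
player_slots = matches * slots_per_match

baseline = total_pts / player_slots
per_category = {
    "damage": damage_pts / player_slots,
    "eliminations": elim_pts / player_slots,
    "healing": heal_pts / player_slots,
}

print(f"Total fantasy points: {total_pts:,.2f}")
print(f"Baseline per player-slot per match: {baseline:.2f}")
for category, value in per_category.items():
    print(f"  {category}: {value:.2f}")
```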

Part Two: Winners and Losers

One of the considerations in terms of starting/sitting might be who is expected to win and lose a game. In general, winning teams recorded an average of 101.76 fantasy points per match per player-slot, and losing teams recorded 88.42 fantasy points per match per player-slot. This revelation is patently obvious - fantasy points generally measure positive objectives, and a team will reach these objectives frequently en route to winning.

However, it is of note that this differential comes almost entirely from eliminations.
                Damage Fantasy Points   Eliminations Fantasy Points   Healing Fantasy Points
Average         43.43                   35.41                         16.26
Losing Teams    42.04                   30.19                         16.19
Winning Teams   44.82                   40.62                         16.33

There is a spread of just over ten points in eliminations between winning and losing teams, but a spread of under three points in damage and less than 0.2 points in healing. This result suggests that main supports whose value largely comes from healing, such as Unkoe, Gido, and Revenge, do not necessarily need to have the outcome of the match factored into the decision to start or sit those players.
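
To put a number on "almost entirely", here is a short sketch that computes the win/loss spread in each category from the table above and the share of the total gap that comes from eliminations. The dictionary names are mine.

```python
# Average fantasy points per player-slot per match, by category.
winning = {"damage": 44.82, "eliminations": 40.62, "healing": 16.33}
losing = {"damage": 42.04, "eliminations": 30.19, "healing": 16.19}

# Spread between winners and losers in each category.
spread = {cat: round(winning[cat] - losing[cat], 2) for cat in winning}

# Total gap, and the fraction of it explained by eliminations.
total_gap = round(sum(winning.values()) - sum(losing.values()), 2)
elim_share = spread["eliminations"] / total_gap

for cat, value in spread.items():
    print(f"{cat}: {value:+.2f}")
print(f"Total gap: {total_gap:.2f}")
print(f"Share from eliminations: {elim_share:.0%}")
```

Eliminations account for more than three quarters of the gap, which is why healing-heavy players are comparatively insulated from the match result.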

A quick check of last year's fantasy point totals confirms this. The following players had the smallest differential in fantasy point totals between wins and losses among players with at least 300 minutes played in both wins and losses, with % of Points from healing representing that player's rate stat across both wins and losses:
Player      Points/Game in Wins   Points/Game in Losses   Differential   % of Points from Healing
Closer      47.00                 55.33                   -8.33          77.2%
Mistakes    89.56                 95.13                   -5.56          5.1%
Bani        45.09                 50.63                   -5.53          91.1%
sinatraa    81.27                 85.40                   -4.13          4.6%
Libero      89.92                 93.75                   -3.83          1.4%
Kellex      66.37                 67.53                   -1.16          82.2%
Hydration   79.00                 80.13                   -1.13          5.2%
Moth        62.08                 62.08                   0.00           81.9%
Gesture     86.64                 86.12                   0.52           0.2%
Coolmatt    89.77                 89.16                   0.61           0.1%

And the following players had the largest differentials:
Player       Points/Game in Wins   Points/Game in Losses   Differential   % of Points from Eliminations
ShaDowBurn   110.00                73.78                   36.22          44.5%
Eqo          112.19                79.33                   32.85          40.4%
Asher        79.93                 47.40                   32.53          48.7%
Seagull      115.90                85.67                   30.23          39.1%
Agilities    100.60                73.62                   26.98          37.9%
Envy         130.09                103.33                  26.76          47.8%
Carpe        112.86                88.25                   24.61          47.1%
NotE         117.23                92.83                   24.40          44.5%
Boombox      132.54                108.69                  23.85          25.8%
FLETA        105.05                81.50                   23.55          41.3%

Again, it is not unreasonable to expect players to perform poorly against better opponents, but this information confirms that it is more difficult for players - especially elim-heavy fantasy players - to rack up more fantasy points in losses than wins.

This information indicates that expected match outcome is a non-factor in determining whether or not to start a main-support player. For a DPS player who relies on eliminations for points, however, it may be worthwhile to bench them when you anticipate a loss, even in favor of a lesser DPS player whose team is expected to win.

Part Three: Expected Map Differential and Fantasy Points

There are a number of different lines of thinking about starting fantasy players when taking team strength into consideration. We will define three different rationales, and then examine them objectively. Please do not read too much into my characterizations of each team - the point is not how I evaluate each team; rather, the names stand in for example teams of hypothetical strength.

Example A: Curb Stomping
A fantasy owner owns Meko, a player for the notable powerhouse NYXL. NYXL's only match this week is against the Florida Mayhem, a fairly poor team that NYXL is expected to sweep with ease. The fantasy manager starts Meko on the grounds that Meko will pick up many points in an easy victory over an inferior Mayhem team. However, should this manager consider that these games may be over more quickly, thus robbing Meko of the chance to pick up more fantasy points?

Example B: Getting Curb-Stomped
A fantasy owner owns Geguri, a player for the fairly weak Shanghai Dragons. The Dragons' only match this week is against the Philadelphia Fusion, the runners-up from the Overwatch League championships in 2018 and a very strong team this season, who are overwhelmingly favored. The fantasy manager starts Geguri on the basis that she is an excellent flex-tank. However, will Geguri's production struggle given that she is playing against a superior team and that, in a sweep, the games may be over more quickly?

Example C: The Even Match
A fantasy owner owns Shadowburn, a player for the middle-of-the-road Paris Eternal. The Eternal play the Atlanta Reign in their only match this week, and it is expected to be a close and tight game. Despite the fact that Shadowburn may be receiving a healthy degree of competition, the owner starts Shadowburn on the basis that the games will be long and drawn out, thus giving Shadowburn more time to accumulate fantasy points.

Which of these lines of thinking are logical? Let's examine how many fantasy points players in different map spreads tend to receive.

There are a number of map-differentials that teams might encounter. A clean sweep represents a 4-0, a somewhat closer match would result in a 3-1 win, and a tied match after four maps means that one team will be walking away with a 3-2 win. There is also the possibility for draws, meaning that 3-0 and 2-1 outcomes are possible.

From 2018, here are the average points per match per player slot for teams in these different situations:

Map Differential   Outcome   Points/Game
3 to 2             Winner    115.30
2 to 1             Winner    113.13
3 to 2             Loser     109.09
3 to 0             Winner    106.53
2 to 1             Loser     101.76
3 to 1             Winner    95.07
4 to 0             Winner    94.57
3 to 0             Loser     93.92
3 to 1             Loser     83.56
4 to 0             Loser     72.99

In general, 3-2 matches tend to be the most productive in terms of fantasy outcomes for both teams - indicating that Example C represents the greatest potential for fantasy points in a vacuum. It is also apparent that 3-2 wins and losses are dragging the averages for wins and losses overall upwards.

Starting a player in a game where the team is expected to win 3-1 or 4-0 represents an average fantasy opportunity, with both values appreciably close to the overall average for fantasy points per game per player slot. However, starting a player in a game that they might be expected to lose 4-0 means that they might stand to finish in excess of twenty points below average - a significant handicap.

However, evenly matched games present the greatest fantasy opportunity - a match which goes to a fifth map represents an opportunity for about fifteen additional points for players on both sides. These kinds of match-ups should be targeted - given two players of identical caliber, the correct play is to start the player in the match that would be more evenly matched.

What about the risk of losing a 3-1? The 3-1 loss row likely exaggerates the fantasy penalty for a team losing 3-1 to an evenly matched opponent, as the 3-1 loss category includes many more teams who faced off against a much stronger opponent yet managed to take a map off of them. But even if we assume the penalty to be consistent across all teams as a worst-case scenario, the expected value in terms of fantasy points relative to the league average is as follows for a game between two evenly matched teams (such that P(Team A wins) = P(Team B wins) = 0.50):

                               Team A loses 4-0   Team A loses 3-1   Team A loses 3-2   Team A wins 3-2   Team A wins 3-1   Team A wins 4-0   Expected Value
Odds of Map Differential       6%                 25%                19%                19%               25%               6%                100%
Fantasy Points Above Average   -22.10             -11.53             14.00              20.21             -0.02             -0.52             N/A
Expected Value                 -1.38              -2.88              2.63               3.79              -0.01             -0.03             2.11

By a back-of-the-napkin calculation, it certainly appears as though starting players in evenly matched games is worth the risk of the 3-1 or 4-0 loss, as our expected gain in terms of average points is positive (+2.11).
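
Here is that napkin math as a Python sketch, computed directly from the points-per-game table and the 95.09 baseline. It rests on assumptions of mine: every map is an independent coin flip, all four maps are played, no map ends in a draw, and a 2-2 tie goes to a fifth map that is also a coin flip. Under those assumptions the odds are exact binomial values (6.25%, 25%, 18.75%), which round to the percentages in the table.

```python
from math import comb

BASELINE = 95.09   # league-average points per player-slot per match
P_MAP = 0.5        # assumed chance that Team A wins any single map

# Average points per player-slot by (result for Team A, map score).
points = {
    ("loss", "4-0"): 72.99,
    ("loss", "3-1"): 83.56,
    ("loss", "3-2"): 109.09,
    ("win", "3-2"): 115.30,
    ("win", "3-1"): 95.07,
    ("win", "4-0"): 94.57,
}

def prob_wins_in_four(wins):
    """Binomial chance that Team A takes exactly `wins` of the first four maps."""
    return comb(4, wins) * P_MAP**wins * (1 - P_MAP) ** (4 - wins)

odds = {
    ("loss", "4-0"): prob_wins_in_four(0),
    ("loss", "3-1"): prob_wins_in_four(1),
    ("loss", "3-2"): prob_wins_in_four(2) * (1 - P_MAP),  # tied 2-2, lose map five
    ("win", "3-2"): prob_wins_in_four(2) * P_MAP,         # tied 2-2, win map five
    ("win", "3-1"): prob_wins_in_four(3),
    ("win", "4-0"): prob_wins_in_four(4),
}

expected_gain = sum(odds[k] * (points[k] - BASELINE) for k in points)
print(f"Expected points above average: {expected_gain:+.2f}")
```

The result comes out a little over two points above average per player-slot, driven almost entirely by the 37.5% chance of reaching a fifth map.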

Why might players stand to gain so much from playing in close 3-2 matches? It certainly appears to be match-time. 3-2 matches do, by virtue of that fifth map, record significantly more play-time than other match differentials.
Map Differential   Average Match Length (Min)
3 to 2             63.63
2 to 1             58.13
3 to 0             55.92
3 to 1             49.93
4 to 0             47.33

However, we should not discount the possibility of the strength of competition driving point totals as well. In terms of rate stats, it appears as though evenly matched teams score at close to the league-average rate against each other, whereas teams curb-stomping opponents generate points at a noticeably higher rate (the league average is 17.79 points per 10 minutes).
Map Differential   Outcome   Fantasy Points Per Slot Per 10 Min
4 to 0             Winner    20.03
2 to 1             Winner    19.47
3 to 0             Winner    19.05
3 to 1             Winner    19.04
3 to 2             Winner    18.12
2 to 1             Loser     17.51
3 to 2             Loser     17.15
3 to 0             Loser     16.80
3 to 1             Loser     16.76
4 to 0             Loser     15.42

Yet, as demonstrated above, the difference in rates of accumulation does not compensate for the brevity of four-map games.
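
A rough way to see this is to multiply each winner's scoring rate by the average match length for that map score. The sketch below does that with the figures from the two tables above. Treat the products as approximations: both inputs are averages, so they will not reproduce the points-per-game column exactly.

```python
# Average match length in minutes, by map score.
match_minutes = {"3-2": 63.63, "3-1": 49.93, "4-0": 47.33}

# Winners' fantasy points per player-slot per 10 minutes, by map score.
winner_rate = {"3-2": 18.12, "3-1": 19.04, "4-0": 20.03}

# Approximate points per match = rate per 10 minutes * (minutes / 10).
approx_points = {
    score: winner_rate[score] * match_minutes[score] / 10
    for score in match_minutes
}

for score, pts in approx_points.items():
    print(f"{score} winner: ~{pts:.1f} points")

# How much the extra map length outweighs the sweep's faster scoring rate.
edge = approx_points["3-2"] - approx_points["4-0"]
print(f"3-2 winner edge over 4-0 winner: ~{edge:.1f} points")
```

The 4-0 winner scores about 10% faster, but the 3-2 match runs about a third longer, so the longer match still comes out roughly twenty points ahead.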

It is also of note that games with drawn-maps also tend to display longer times and similar points-per-ten - indicating that these games are quite close as well. However, map draws are fairly rare and unpredictable enough that I have felt comfortable not including them in the larger discussion of this analysis.

Conclusion: Notes on Synergy and Impact

There is a natural question of, "The chicken or the egg": do fantasy teams truly post high totals by virtue of winning, or do they simply accumulate these high totals en route to winning, and we are mistaking the disease for the symptom? It is obvious that evenly matched teams present an opportunity for additional points by virtue of the fifth map, yet teams who win tend to simply have better players overall, and this is what is ultimately measured by fantasy points, not simply wins.

The answer is, it is probably both the chicken and the egg. In baseball, it is rather easy for a good player to have an excellent performance in a losing effort - like Mike Trout going 2-3 with two doubles and a walk in an 8-2 loss - but it is more difficult to accomplish that feat in Overwatch, especially in a GOATS meta where getting the first kill tends to result in the rest of the team dying or running away. It is obvious that winning is a function of player skill, yet it is also a function of multiple players' skills, and the other players on the team ultimately affect each other's fantasy point totals. Janus was not an awful fantasy player with NYXL, but watching him falter against his former teammates while playing on a much weaker team on Saturday was a reminder that team strength plays an important role in fantasy, as does the quality of opponent. Both of these factors are baked into the discussion of winning/losing and map differential.

In that respect, consider these statistics to be overstated, but only to a degree. Yes, good fantasy players play for good teams, and teams tend to put up more points in fantasy wins. But at the same time, expectations regarding winning/losing can help predict fantasy stats. And by recognizing the potential for a 3-2 match, you might pick up some bonus points with ease.

Wednesday, October 31, 2018

Introducing rFIP and nFIP

One of the most valuable tools in all of baseball is pitch framing: the idea of stealing strikes for pitchers is severely underrated by metrics such as fWAR and rWAR, and from our few measurements of it, we know that pitch framing from an individual catcher can be worth upwards of 20-30 runs over the course of a full season. Metrics like WARP assign these values to catchers, and while DRA adjusts pitcher performance to a degree to account for factors like pitch framing, it's difficult to truly grasp the impact of pitch framing on a pitcher's performance. This is where rFIP and nFIP come in.

Background, rFIP

It's intuitive that pitchers who throw a lot of strikes will register a lot of strikeouts, and pitchers who throw a lot of balls will register a lot of walks. In this sense, the relationship between strikes per ball and strikeouts per walk is quite strong. On a team-by-team level, the relationship between strikes/ball and strikeouts/walk recorded an R² value of 0.9253 in 2018, indicating an extremely strong relationship between the two variables.


Since 2015, strikes/ball and strikeouts/walk on a pitcher-by-pitcher level have recorded a strong relationship as well among qualified pitchers, albeit to a lesser degree (R² value of 0.6445).

Despite the slightly weaker fit at the pitcher level, both linear regressions reveal the same formula for approximating strikeouts per walk from strikes per ball:

K/BB ≈ 7 × (strikes/balls) - 6

In other words, a pitcher's strikeouts per walk can be approximated reasonably well from the number of strikes and balls that they record.

Knowing that a pitcher's strikeouts and walks are affected by pitch framing, we might seek to remove the influence of the catcher and umpire on strike and ball calls by looking solely at the number of pitches a pitcher throws in the zone (which are technically strikes according to MLB rules but, through umpire mistakes and pitch framing efforts, might be called balls) and the number of pitches a pitcher throws out of the zone (which are technically balls but may end up as strikes thanks to pitch framing). We can contextualize this by calculating a pitcher's fielding independent pitching (FIP) from the strikeout-to-walk ratio implied by their balls and strikes as called by a robotic umpire (with Statcast serving as our robot). This leads us to rFIP, robotic FIP.

We must operate with multiple caveats in calculating rFIP. The primary caveat is that we are assuming a two-dimensional strike zone, as Statcast does: Statcast defines the strike zone as the imaginary plane that runs perpendicular to the ground and parallel to the front edge of the plate, as shown in an illustration from MLB's official rulebook below.




However, baseball's strike zone is not two-dimensional, but three-dimensional. According to Major League Baseball's official rules, "The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap". Thus, the strike zone is not a plane, but a prism. A more accurate representation of the strike zone is shown below from Wikipedia.



Our approximation of the strike zone will miss some borderline pitches that do, in fact, cross through the strike zone but do not touch the front plane. Still, the number of strikes and balls that our methodology misses will be quite small, as it is quite difficult to throw a pitch that is a strike while avoiding that front plane.

Our methodology also assumes that a pitcher would record the same sum of strikeouts and walks regardless of the ratio of strikeouts to walks that they recorded. Put differently, a pitcher might expect to record the same number of balls in play, HBP, and HR in a season regardless of how many walks and strikeouts they yield, so this is not a completely unfair assumption.

To demonstrate exactly how rFIP is calculated, I will carry through an example using 2018 Mets starter Jacob deGrom, who had the greatest single-season pitching performance of the past four seasons by rFIP.

I went into Statcast's database and pulled deGrom's strikeouts, walks, hit-by-pitches, home runs, and IP outs to calculate deGrom's FIP. deGrom recorded 268 K, 43 BB, 5 HBP, 10 HR, and 643 IP outs in 2018 according to Statcast. Note that these values are slightly off from deGrom's actual totals - 269 K, 46 BB, 5 HBP, 10 HR, 651 IP outs - because Statcast has some missing values. Still, our calculated FIP for deGrom (1.94) is only marginally different from deGrom's actual FIP (1.99). This is the baseline FIP for deGrom.
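As a rough sketch of that baseline calculation, here is the standard FIP formula applied to both sets of totals. The league constant (`FIP_CONSTANT`) is not stated in this post; the value below is an assumption for 2018, and the function name is mine.

```python
# Assumed 2018 league FIP constant; not given in the post.
FIP_CONSTANT = 3.161


def fip(k, bb, hbp, hr, ip_outs, constant=FIP_CONSTANT):
    """Standard FIP: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant."""
    innings = ip_outs / 3  # convert outs recorded to innings pitched
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# deGrom's 2018 totals as pulled from Statcast (some pitches missing)
statcast_fip = fip(k=268, bb=43, hbp=5, hr=10, ip_outs=643)
# deGrom's actual 2018 totals
official_fip = fip(k=269, bb=46, hbp=5, hr=10, ip_outs=651)

print(round(statcast_fip, 2), round(official_fip, 2))
```

With that assumed constant, the two results round to the 1.94 and 1.99 quoted above.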

I then pulled deGrom's total strikes and total balls, then I measured how many strikes were "stolen strikes" - that is to say, called strikes recorded as outside of the strike zone - and how many balls were "lost strikes" - called balls recorded as inside the strike zone. In 2018, deGrom recorded 1698 strikes, 80 of which were stolen strikes, and 999 balls, 54 of which were lost strikes. To calculate deGrom's strike total as called by a robotic umpire, I subtracted the number of stolen strikes from deGrom's total number of strikes and then added his lost strike total to that figure. deGrom's ball total as called by a robotic umpire was equivalent to the number of balls deGrom recorded minus his lost strike total plus his stolen strike total. deGrom's rStrikes (robotic strike calls) were 1672 (1698 - 80 + 54 = 1672) and his rBalls were 1025 (999 - 54 + 80 = 1025).
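The bookkeeping in that paragraph is just two swaps, which can be sketched as below. The function and argument names are hypothetical; the numbers are deGrom's 2018 totals from the text.

```python
def robo_counts(strikes, balls, stolen_strikes, lost_strikes):
    """Re-call strikes and balls as a robotic umpire would.

    stolen_strikes: called strikes located outside the zone (should be balls)
    lost_strikes:   called balls located inside the zone (should be strikes)
    """
    r_strikes = strikes - stolen_strikes + lost_strikes
    r_balls = balls - lost_strikes + stolen_strikes
    return r_strikes, r_balls


r_strikes, r_balls = robo_counts(
    strikes=1698, balls=999, stolen_strikes=80, lost_strikes=54
)
print(r_strikes, r_balls)  # 1672 1025
```

Note that the total number of pitches is unchanged; only the split between strikes and balls moves.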

From there, I can use deGrom's rStrike/rBall ratio and the relationship between strikes per ball and strikeouts per walk to approximate deGrom's strikeouts per walk assuming a robotic umpire. deGrom's rStrike/rBall ratio was 1.63, so the equivalent strikeouts per walk ratio would be 7 × 1.63 - 6 = 5.41 (not far off from deGrom's 2018 figure of 5.85). deGrom recorded 268 + 43 = 311 strikeouts and walks in 2018, so his adjusted totals according to his rK/BB and the sum of his strikeouts and walks would be 263 strikeouts and 48 walks. Plugging these back into the FIP formula yields an rFIP of 2.07.
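Here is a minimal sketch of that last step, assuming the K/BB ≈ 7 × (strikes/balls) - 6 relationship used in the arithmetic above and an assumed 2018 FIP constant of 3.161 (the post does not state the constant). The helper names are mine. Because the sketch carries unrounded values, the intermediate K/BB comes out near 5.42 instead of 5.41.

```python
FIP_CONSTANT = 3.161  # assumed 2018 league constant; not given in the post


def k_per_bb_from_counts(strikes, balls):
    """Approximate strikeouts per walk from strikes per ball."""
    return 7 * (strikes / balls) - 6


def split_k_and_bb(k_plus_bb, k_per_bb):
    """Hold K + BB fixed and redistribute it according to a K/BB ratio."""
    walks = k_plus_bb / (1 + k_per_bb)
    return k_plus_bb - walks, walks


def fip(k, bb, hbp, hr, ip_outs, constant=FIP_CONSTANT):
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / (ip_outs / 3) + constant


ratio = k_per_bb_from_counts(strikes=1672, balls=1025)  # rStrikes, rBalls
r_k, r_bb = split_k_and_bb(k_plus_bb=268 + 43, k_per_bb=ratio)
r_fip = fip(k=r_k, bb=r_bb, hbp=5, hr=10, ip_outs=643)

print(round(r_k), round(r_bb), round(r_fip, 2))
```

Under those assumptions the adjusted totals round to 263 strikeouts and 48 walks, and the result rounds to the 2.07 rFIP reported above.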

The benefit of rFIP is that it isolates the influence of pitch framing, bad umpires, or other factors and distills a pitcher's ability to throw strikes while incorporating a pitcher's other tendencies (HBP and HR). It also serves to identify which pitchers do not throw many strikes but are saved from these tendencies thanks to their catchers (Zack Greinke is a prime example of this, running an extremely large FIP - rFIP differential during his time in Arizona with Jeff Mathis as his personal catcher).

Background, nFIP

But Major League Baseball does not have robotic umpires at the moment, and given the current strength of the umpires' union, it might not for a very long time. In that sense, we might wish to know what a pitcher's performance might look like if they were pitching to a normal catcher/umpire duo. To accomplish this, we will use nFIP - normal FIP - which captures a pitcher's performance based on their strikes and balls as if they had an average catcher.

Since 2015, pitchers have seen 7.8% of their out-of-zone pitches converted from balls to strikes, and 5.4% of their zone pitches converted from strikes to balls. Following our deGrom example, deGrom's expected strikes with an average catcher/umpire duo would be 1672 rStrikes - 1672 rStrikes × 5.4% + 1025 rBalls × 7.8% = 1662 nStrikes, and deGrom's expected balls would be 1025 rBalls - 1025 rBalls × 7.8% + 1672 rStrikes × 5.4% = 1035 nBalls.

Using the same methodology as rFIP to approximate K/BB from strikes and balls, deGrom's nFIP comes out to 2.10, about 0.16 higher than his FIP, which indicates that deGrom had a little bit of help from his catchers in getting his value.
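The nFIP steps can be sketched the same way. The 7.8% and 5.4% conversion rates and deGrom's totals come from the text; the 2018 FIP constant of 3.161 and all of the names are assumptions of mine.

```python
FIP_CONSTANT = 3.161   # assumed 2018 league constant; not given in the post
STOLEN_RATE = 0.078    # share of rBalls converted into called strikes
LOST_RATE = 0.054      # share of rStrikes converted into called balls


def normal_counts(r_strikes, r_balls):
    """Apply league-average framing/umpire conversion rates to robo counts."""
    lost = r_strikes * LOST_RATE
    stolen = r_balls * STOLEN_RATE
    return r_strikes - lost + stolen, r_balls - stolen + lost


def fip(k, bb, hbp, hr, ip_outs, constant=FIP_CONSTANT):
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / (ip_outs / 3) + constant


n_strikes, n_balls = normal_counts(r_strikes=1672, r_balls=1025)
n_k_per_bb = 7 * (n_strikes / n_balls) - 6     # same strikes/ball relationship
n_bb = (268 + 43) / (1 + n_k_per_bb)           # hold K + BB fixed at 311
n_k = (268 + 43) - n_bb
n_fip = fip(k=n_k, bb=n_bb, hbp=5, hr=10, ip_outs=643)

print(round(n_strikes), round(n_balls), round(n_fip, 2))
```

With those inputs the counts round to 1662 nStrikes and 1035 nBalls, and the result rounds to 2.10.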

As a quick "stupid-check", we can also see that our league FIP (4.12) is extremely close to our league nFIP (4.17), so we know that our conversion methods are rather effective.

Advantages of rFIP and nFIP

rFIP is quite useful in that it gives us a solid idea of how effective a pitcher is at throwing strikes. In terms of year N to year N+1 reliability among pitchers from 2015-2018 with at least 100 IP in both year N and year N+1, FIP and nFIP have roughly the same correlation (R² of 0.2607 and 0.2643 respectively), while rFIP has improved year-to-year reliability with an R² of 0.2860.

Neither rFIP nor nFIP appears to predict year N+1 FIP with much reliability (R² of 0.1796 and 0.1846 respectively), which is to be expected: since we have neutralized the impact of catching on a pitcher's ability, the influence of the catcher is seen only in FIP.

Caveats

As an important reminder, rFIP and nFIP are simplified models that are not attempting to be one-hundred percent accurate. We are approximating strike and ball calls, we are approximating a pitcher's strikeout/walk rates from those approximations, and we are assuming that a pitcher has no influence on pitch framing, which is most likely not a completely safe assumption. Still, our model passes the eye test in that it capably identifies terrific seasons and does not appear to unduly punish pitchers for their catchers' skill.

Leaders

A full leaderboard of rFIP and nFIP leaders since 2015, both split by season and career totals over that span, can be seen below. The default selection for the filter includes qualified pitching leaders.


Code

The SQL code used to calculate rFIP and nFIP from a Statcast database can be found here.

As always, if you have any questions, suggestions, or feedback, please leave a comment or tweet me @John_Edwards_.

Sunday, July 8, 2018

Effective Chase Score

Swinging at pitches outside the zone is generally bad. After all, players are essentially giving up free balls in exchange for either a strike or a poorly hit ball. But hey, if you can put the ball in play, it's not the worst thing in the world. With this in mind, yesterday, I looked at which hitters were best at avoiding chasing those pitches, while making contact on said pitches.
I decided to refine this methodology further, and talk about players' ability to effectively chase in that they A. don't chase frequently, B. make contact on pitches that they chase, and C. make quality contact on pitches that they chase.

The three components I incorporated were 1 - O-Swing% (how frequently players did not swing at outside pitches), O-Contact% (how frequently players make contact on their swings outside the zone), and xwOBA on O-zone pitches (the quality of contact made on pitches outside of the zone). After pulling all of these figures from Baseball Savant for players with 1000+ pitches this season, I calculated each player's z-score for each metric, then added the three z-scores together. The end result I called the "Effective Chase Score".
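For anyone who wants to reproduce the idea, here is a minimal sketch of the z-score sum. The three hitters and their numbers are made up for illustration (they are not real 2018 figures), and the choice of population standard deviation is my assumption, since the post does not say which one was used.

```python
from statistics import mean, pstdev

# Hypothetical inputs for illustration only; NOT real 2018 figures.
players = {
    "Hitter A": {"o_swing": 0.16, "o_contact": 0.75, "o_xwoba": 0.330},
    "Hitter B": {"o_swing": 0.30, "o_contact": 0.62, "o_xwoba": 0.280},
    "Hitter C": {"o_swing": 0.46, "o_contact": 0.55, "o_xwoba": 0.237},
}


def z_scores(values):
    """Standardize a list of values (population standard deviation assumed)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def effective_chase_scores(players):
    names = list(players)
    components = [
        [1 - players[n]["o_swing"] for n in names],  # how often they lay off
        [players[n]["o_contact"] for n in names],    # contact when chasing
        [players[n]["o_xwoba"] for n in names],      # quality of that contact
    ]
    standardized = [z_scores(component) for component in components]
    # A player's score is the sum of their three z-scores.
    return {n: sum(z[i] for z in standardized) for i, n in enumerate(names)}


scores = effective_chase_scores(players)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Because each component is standardized across the same player pool, the scores sum to zero across that pool, so a positive score means better than the pool average overall.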

Here are 2018's leaders in Effective Chase Score.

| Player | Effective Chase Score |
|---|---|
| Joey Votto | 6.85 |
| Mookie Betts | 5.43 |
| Brett Gardner | 4.45 |
| Alex Bregman | 4.42 |
| Nick Markakis | 4.23 |
| Jesse Winker | 4.20 |
| Ben Zobrist | 3.61 |
| Andrew Benintendi | 3.56 |
| Aaron Hicks | 3.43 |
| Jose Ramirez | 3.40 |
| Andrelton Simmons | 3.36 |
| Travis Shaw | 3.22 |
| Carlos Santana | 3.15 |
| Mike Trout | 3.01 |
| Shin-Soo Choo | 2.96 |
| Lorenzo Cain | 2.87 |
| Matt Chapman | 2.82 |
| Buster Posey | 2.81 |
| Ian Kinsler | 2.80 |
| Denard Span | 2.70 |

As we would expect, Votto is far and away the best player in terms of Effective Chase Score - in addition to having extremely low chase rates, Votto makes contact frequently on his outside swings and makes extremely effective contact on outside pitches.

Here are the worst batters by the same metric.

| Player | Effective Chase Score |
|---|---|
| Freddy Galvis | -2.54 |
| Michael A. Taylor | -2.65 |
| Kevin Pillar | -2.76 |
| Joey Gallo | -2.83 |
| Chris Davis | -2.84 |
| Carlos Gomez | -2.84 |
| Odubel Herrera | -2.85 |
| Eduardo Escobar | -3.06 |
| Robinson Chirinos | -3.11 |
| Teoscar Hernandez | -3.61 |
| Adam Jones | -3.62 |
| Tim Anderson | -3.62 |
| Giancarlo Stanton | -3.63 |
| Luis Valbuena | -3.67 |
| JaCoby Jones | -4.07 |
| Nicholas Castellanos | -4.13 |
| Jonathan Schoop | -4.16 |
| Lewis Brinson | -4.76 |
| Ryon Healy | -4.82 |
| Javier Baez | -5.65 |

There are a lot of free swingers here, including Gomez, Davis, and Gallo. Baez, however, is almost as bad as Votto is good - Baez has the worst O-Swing% by 5.5 percentage points (Baez at 46.0%, with Kevin Pillar second at 40.5%), a bottom-tier O-Contact%, and just a .237 xwOBA on outside pitches.

For the full list of hitters with 1000+ pitches faced, I published my spreadsheet below.

Friday, July 6, 2018

MiLB Statcast Project Part Five: Next Steps

What's next for MiLB batted ball data? Clearly, there are issues with it, thanks to biased stringers, but there's also a wealth of valuable information in here.

Having already calculated launch angle, it seems logical that the next step would be to calculate exit velocity. It would seem as though some relationship between hit distance (calculated using the home plate location found in part three and the coordinates of the batted balls) and launch angle would yield an approximation for exit velocity, and indeed, such a relationship appears to exist at the major league level.


Despite this, using the model that I reverse engineered from Statcast and correcting for differences in hit-tracking between the stringers and the MiLB, I found that such a model was grossly inaccurate at the minor league level. Shown below are MiLB hitters with at least 200 BIP in 2016 and 200+ BIP in the majors in 2017.

Perhaps the depth of batted ball locations is inaccurate, or perhaps the model itself has issues. I think this is a difficult challenge because we're trying to measure the size of an intangible object using its shadow - it's not as simple as plugging the values into Excel's equation solver, as we need to have a method behind our model. I think of this challenge as a WIP, and I hope to update this post with a solution soon, but for now I have no clear way of estimating MiLB exit velocity.

Still, the rest of the data that we're working with appears solid and powerful. I've already revealed a couple functions that I've been using, and I hope to develop an R-package for all of these functions, including heatmaps, splits, date-ranges, a built-in R scraper, and more. I hope to keep y'all posted on this later this summer.

Thank you for reading this series! I hope this was insightful or at least entertaining. In my opinion, not enough public analysts are using MiLB data, and while it's certainly rough around the edges, there's still valuable information to be gleaned from it.



Wednesday, July 4, 2018

MiLB Statcast Project Part Four: Reverse Engineering Launch Angle

In the previous two parts, we focused heavily on creating visualizations of MiLB pitch and batted ball data, but our data was not really used to create any workable numbers or analogs for MiLB data. I consider it trivial to calculate things like batting average or slugging percentage, but not metrics like the ones available from Statcast, such as launch angle and exit velocity. We have neither of these values available to us in any form with minor league data, but we can approximate them. This section will focus on launch angle for MiLB hitters.

While we do not have launch angle available in any form for MiLB hitters, we do have limited batted ball classification data - stringers will manually tabulate which balls are ground balls, which are fly balls, which are line drives, and which are pop-flies. While the stringers do not operate with anything close to the precision of BIS's ball classification system, it still gives a rough idea of players' batted ball tendencies. 

For example, let's say we want to know how frequently Ozzie Albies hit fly balls in 2017 in AAA. There are two ways of calculating fly ball rates - FanGraphs includes pop-ups in their calculation of FB%, but Baseball Savant does not. We'll calculate both for posterity.
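A minimal sketch of the two fly ball rate conventions, assuming both use all classified batted balls as the denominator. The counts below are hypothetical, not Albies' actual totals, and the function name is mine.

```python
def fly_ball_rates(ground_balls, line_drives, fly_balls, popups):
    """Fly ball rate with and without pop-ups counted as fly balls."""
    total = ground_balls + line_drives + fly_balls + popups
    if total == 0:
        raise ValueError("no batted balls to classify")
    return {
        # FanGraphs-style: pop-ups are counted as fly balls
        "fangraphs_style": (fly_balls + popups) / total,
        # Baseball Savant-style: pop-ups are their own category
        "savant_style": fly_balls / total,
    }


# Hypothetical stringer counts for one hitter-season
rates = fly_ball_rates(ground_balls=180, line_drives=80, fly_balls=110, popups=30)
print(rates)
```

The gap between the two numbers is simply the pop-up rate, which is worth keeping in mind when comparing figures across sites.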


FanGraphs has limited MiLB batted ball data from STATS, and Albies' figure for 2017 in AAA is fairly consistent with what we calculated from our dataset (37.9% from FanGraphs compared to 38.4% from our dataset). Albies hit fly balls in the MLB in 2017 at a 40.3% rate according to FanGraphs, and at a 32.1% rate according to Baseball Savant, so our minor league figures appear fairly accurate given that Albies' fly ball rate was consistent with his measured values both in MiLB play and in the majors.

So how can we extrapolate launch angle from this? Launch angle plays a large part in batted ball classification. We can use batted ball tendencies to reverse engineer launch angle at the MLB level, and apply that to MiLB data. Using my personal Statcast DB, I found the average launch angle for each batted ball classification for all batted balls ever recorded by Statcast.

| BB Type | Launch Angle |
|---|---|
| Fly Ball | 36.646 |
| Popup | 63.285 |
| Line Drive | 16.756 |
| Ground Ball | -12.553 |

If we treat each batted ball as having been hit with its average launch angle, we can theoretically get a solid estimation of average launch angle from batted ball classifications alone. I pulled 2017 hitters with at least 200 PA and compared their estimated launch angle from batted ball classification to their actual launch angle, and the results were extremely promising. To clarify, the exact equation used was:

xLA = 36.646 × FB% + 63.285 × POP% + 16.756 × LD% - 12.553 × GB%


Our R-squared value is .93, indicating that our formula does an excellent job of estimating launch angle solely from batted ball data - not surprising considering that Baseball Savant likely uses launch angle as a major factor in classifying batted balls.
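The idea of treating every batted ball as if it were hit at the average angle for its class boils down to a weighted average, which can be sketched as below. The angles are the class averages from the table above; the function name, dictionary keys, and example counts are hypothetical, and this is the raw estimate before any re-scaling.

```python
# Average Statcast launch angle (degrees) for each batted ball class
AVG_LAUNCH_ANGLE = {
    "fly_ball": 36.646,
    "popup": 63.285,
    "line_drive": 16.756,
    "ground_ball": -12.553,
}


def estimated_launch_angle(counts, angles=AVG_LAUNCH_ANGLE):
    """Weighted average launch angle from batted ball class counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no batted balls")
    return sum(angles[kind] * n for kind, n in counts.items()) / total


# Hypothetical hitter: counts by class, not real data
counts = {"fly_ball": 90, "popup": 20, "line_drive": 70, "ground_ball": 170}
print(round(estimated_launch_angle(counts), 2))
```

Working from counts is equivalent to multiplying each class's rate by its average angle and summing, so it works with stringer tallies directly.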

If we re-scale our values to get a 1:1 relationship, we have a fairly strong model for estimating launch angle from batted ball classification.


As strong as the correlation is between xLA and LA, our RMSE is a bit weak. In looking at the relationship between residual values and batted ball frequencies, it looks like we're introducing a bit of error with our POP% value.


I found that my RMSE was minimized at a pop-fly coefficient of about 60.65 - my guess is that since Statcast has difficulties tracking some balls at extreme launch angles, the true pop-fly angle is skewed upward.
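One simple way to search for a coefficient like that is a grid search over candidate pop-fly values, keeping the one with the lowest RMSE. The sketch below is not the code I used; it ignores the re-scaling step, and the hitters are synthetic, with "actual" launch angles generated from a pop-fly coefficient of 60.65 so that the search has a known answer to recover.

```python
from math import sqrt

# Class averages held fixed; only the pop-fly coefficient is searched.
FIXED = {"fb": 36.646, "ld": 16.756, "gb": -12.553}


def xla(rates, pop_coef):
    """Estimated launch angle from batted ball rates (fractions summing to 1)."""
    fixed_part = sum(FIXED[kind] * rates[kind] for kind in FIXED)
    return fixed_part + pop_coef * rates["pop"]


def rmse(hitters, pop_coef):
    squared = [(xla(h["rates"], pop_coef) - h["la"]) ** 2 for h in hitters]
    return sqrt(sum(squared) / len(squared))


def best_pop_coefficient(hitters, low=55.0, high=70.0, step=0.05):
    """Grid search for the pop-fly coefficient that minimizes RMSE."""
    n_steps = int(round((high - low) / step))
    candidates = [low + i * step for i in range(n_steps + 1)]
    return min(candidates, key=lambda c: rmse(hitters, c))


# Synthetic hitters (NOT real data), built with a known coefficient of 60.65
rate_sets = [
    {"fb": 0.25, "ld": 0.20, "gb": 0.48, "pop": 0.07},
    {"fb": 0.32, "ld": 0.22, "gb": 0.38, "pop": 0.08},
    {"fb": 0.20, "ld": 0.24, "gb": 0.52, "pop": 0.04},
]
hitters = [{"rates": r, "la": xla(r, 60.65)} for r in rate_sets]

best = best_pop_coefficient(hitters)
print(round(best, 2))
```

On real data the minimum will not be zero, but the same loop applies; a proper optimizer would also work in place of the grid.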


We've marginally improved our RMSE and r-squared with our model. I think there are probably some bigger steps we could take to improve the model's accuracy, but at the moment, I think our r-squared value is superb, and our RMSE value is acceptable as a model of launch angle.

Armed with our model, we are now prepared to determine MiLB launch angle from the batted ball data found in our dataset.

This somewhat-intimidating wall of code grabs batted ball values and calculates estimated launch angle from them using batted ball data. In our csv, we now have estimated launch angle values for hitters in 2016 and 2017 for minor league players! Of course, we need to check ourselves - how accurate are these launch angle values?

To determine the accuracy of our results, we'll compare year n to year n+1 correlation. I pulled hitters who registered 200+ balls in play in 2016 and 2017 (210 of them), and found a correlation between 2016's launch angle and 2017's launch angle of .6606, so this is our benchmark.

We're not going to compare players with 200+ BIP in the minors from 2016 to players with 200+ BIP in the minors from 2017 - it just tells us the correlation between our measured values of FB%, GB%, LD%, and POP% in a rougher form. Instead, we're going to compare hitters with 200+ BIP in the minors from 2016 to hitters with 200+ BIP in the majors from 2017 - in this sense, we're looking at how well MiLB launch angle predicts MLB launch angle.

After pulling these values, I only found 27 hitters who registered both 200+ BIP in AAA in 2016 and 200+ BIP in the MLB in 2017, which was a bit of a disappointment. Still - our r-squared value between these hitters' estimated launch angle from their 2016 MiLB campaign and their launch angle from their 2017 MLB campaign was .7274. Because we're dealing with fewer hitters (27 versus 210) and because we're dealing with consistent young hitters (there's no decline due to age or dramatic changes in LA, unlike in our MLB dataset), our r-squared looks better than our benchmark of .6606, but I don't think for a second that our xLA is somehow a better predictor of launch angle than the previous year's launch angle. (EDIT: it also might have something to do with the fact that I accidentally included multiple Jose Martinezes here)

Still, xLA appears to have undeniable predictive value.

We have a reasonable predictor of launch angle using minor league data! If we want to compare MiLB launch angles to MLB launch angles to draw comparisons between hitters, we now have that ability, and we can be reasonably confident in our ability to do so.

Monday, July 2, 2018

MiLB Statcast Project Part Three: Cleaning up and visualizing batted ball data

In our previous section, we looked at the issues involved with minor league pitch placement data, strategies for cleaning and visualizing the data, and then compared that data to MLB data. In this section, we'll grab MiLB hit data, and use similar strategies for cleaning and visualizing that data.

Looking at our data, we can see that we have similar issues with our batted ball data-set as we did with our pitching data-set.



The issues with the batted ball data-set are as follows:

  1. There exists bias in the way that batted balls are grouped - batted balls are clustered around where fielders play, especially in the outfield.
  2. The units of the x and y coordinates are not immediately apparent.
  3. The field's dimensions are not cleanly defined.
We have little realistic approach for fixing our issues with the bias in clustering, but we can address problems 2 and 3.

Let's start by discussing the units. When stringers are tracking a game, in order to place a batted ball on the map, they use a 250x250 pixel map of the field. Where they click is then recorded in pixels as the location of the hit. We have to determine a realistic scale from pixels to a real-world unit in order to calculate factors like hit distance.

So then, let's try to establish some concrete markers for scale. If we look at the 10th lowest y-value for ground balls for a stadium, we get a rough idea of where home-plate is.



If we move that intercept down slightly and plot the median x-value of all batted balls, we should find the tip of the baseball "diamond".



From here, we can construct foul-lines knowing that a baseball field is constructed with a 90 degree angle between the lines. As long as the field is not rotated beyond what we've already done, we can simply construct perpendicular lines from the tip of our diamond outward.
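The home plate and foul line steps can be sketched roughly as follows. This assumes the coordinates have already been oriented so that y grows toward the outfield; the record layout, function names, and the synthetic batted balls are all hypothetical.

```python
from statistics import median


def estimate_home_plate(batted_balls, rank=10):
    """Home plate estimate: median x of all batted balls,
    and the 10th-lowest y among ground balls."""
    gb_ys = sorted(b["y"] for b in batted_balls if b["type"] == "ground_ball")
    if len(gb_ys) < rank:
        raise ValueError("not enough ground balls to estimate home plate")
    home_x = median(b["x"] for b in batted_balls)
    home_y = gb_ys[rank - 1]
    return home_x, home_y


def is_fair(x, y, home):
    """Foul lines meet at home plate at a 90 degree angle, so they run at
    45 degrees to either side of the line through home and center field."""
    home_x, home_y = home
    return (y - home_y) >= abs(x - home_x)


# Synthetic ground balls (NOT real data): x from 115 to 135, y from 41 to 61
batted_balls = [
    {"x": 115 + i, "y": 41 + i, "type": "ground_ball"} for i in range(21)
]
home = estimate_home_plate(batted_balls)
print(home, is_fair(125, 100, home), is_fair(200, 60, home))
```

In practice the y-intercept would then be nudged down slightly, as described above, before drawing the lines.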



To determine the dimensions of our park (to both plot the outfield lines and to figure out the scale of pixels to park), let's look at the placement of home runs in the park.



There are a surprising number of misplaced HR balls - a bunch of long balls never left the infield, according to our data. We'll filter them out. Then, we'll plot a line of best fit along the outfield wall.



This looks like a decent approximation of Coca-Cola Field's outfield wall. If we shift all the values downward and mess with the colors a bit, we have a decent approximation of what Coca-Cola Field looks like.


The wall is slightly below almost all home runs. Coca-Cola Field does not have a perfectly round wall, but this approximation gives a good visualization, and looks useful for spray charts. And we can finally approximate the pixel-to-feet conversion factor! Home plate is at ~50 pixels, and the centerfield wall is at ~210, which gives a pixel distance of 160 pixels. In real life, Coca-Cola Field measures ~400' from home to centerfield, so our conversion factor is 400/160 = 2.5 feet per pixel. It's ~140 pixels down the left and right field lines, which works out to 350' down the lines. Coca-Cola Field is actually 325' down both lines, but the field itself curves inwards quite a bit. We're not quite capable of doing this with our approximation, but the values line up quite well.
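With the conversion factor in hand, hit distance is just the pixel distance from home plate times 2.5. A small sketch, assuming home plate sits at x = 125 (the middle of the 250-pixel map, which is my assumption) and y = 50; the names are hypothetical.

```python
from math import hypot

# ~400 ft from home to the centerfield wall, spanning ~160 pixels
FEET_PER_PIXEL = 400 / 160


def hit_distance_feet(x, y, home_x, home_y, feet_per_pixel=FEET_PER_PIXEL):
    """Straight-line distance from home plate to the ball's recorded spot."""
    return hypot(x - home_x, y - home_y) * feet_per_pixel


HOME_X, HOME_Y = 125, 50  # assumed home plate location in pixels

# A ball recorded at the centerfield wall (y of about 210)
print(hit_distance_feet(125, 210, HOME_X, HOME_Y))  # 400.0
# A ball recorded 140 pixels from home plate
print(hit_distance_feet(125 + 84, 50 + 112, HOME_X, HOME_Y))  # 350.0
```

Keep in mind this measures where the stringer clicked, which for fielded balls is closer to where the ball was picked up than where it would have landed.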

With all of this implemented, let's turn this into a function!



I've overlaid Rhys Hoskins' 2017 batted balls over Coca-Cola Park (no relation to Coca-Cola Field). Our estimations of power look fairly accurate - Rhys has ~25 HR by my count on this chart, when he recorded 29 total in 2017 in AAA. Not bad for completely estimating the outfield wall as a semi-circle.

But what's more important is the information that the chart presents - from this spray chart alone, it's apparent that Hoskins hits a lot of ground balls to the right side of the infield, making him an excellent shift target. He also has substantial pull power.

We can glean this information from a scouting report, but it's important to have a visual confirmation of what's reported, and we can also pick up on systematic changes in approach. We can go deeper in terms of visualizing prospects and MiLB players.