Boyd's World-> Breadcrumbs Back to Omaha-> Park Factors with Teeth | About the author, Boyd Nation |
Publication Date: February 11, 2003
Context Is Everything
I apologize for missing my self-imposed deadline last week. In some ways, I'm combining the worst aspects of sportswriting and software development here: the rigid deadlines of sportswriting sometimes collide with the unpredictability of software debugging, with rather ugly results. Last week I had one more bug that I could not find despite more effort than I had time for. I appreciate your patience; I think the result will be worth it for the more statistically minded among you.
I finally have some park factors that I like. Half (warning, huge approximation alert!) of you are going, "Woooo hoooo," at this point, and the other half just flipped channels over to another Peter Gammons story on trade rumors involving six teams and a goat. For those of you who are left but need a little background, I'll explain my cunning plan to the audience, Exposition Boy.
The first step that's taken when you begin to seriously analyze sports statistics rather than just memorize them is to try to put them in context. Having a defense that holds opponents to 105 points a game is horrible in today's NBA (I think that's still true); having a defense that held opponents to 105 in 1986 was above average. Putting up a 3.80 ERA in the SEC is harder than putting up a 2.80 ERA in the NEC. Hitting 20 home runs at Memphis is a considerably harder feat than hitting 20 home runs at New Mexico State, even though the two teams play roughly similar levels of competition.
When baseball fans talk about context, one of the first things they adjust for is what's called park effects, or the park factor. If you could somehow magically play the exact same game in two different parks, you'd get different results. A ball that travels 390 feet in Starkville is a long fly ball to the center fielder. The same ball is a 430-foot home run in Colorado Springs. That same batter pops out in the extra-large foul territory in Akron (if I remember right). There are lots of factors that control how many runs are scored in any given stadium relative to a "normal" park. Some of them have to do with the park itself, such as the size of the field and the amount of foul territory available for the fielders. Others have to do with the location of the park, such as altitude and weather patterns. These things can obviously change from year to year, as the weather changes (this factor is smaller in college than in the pros, because most of the comparable stadiums tend to be in the same weather region) or as stadiums are modified, but that has to be balanced against the need to have enough data points for meaningful comparison, so a period of time has to be chosen for comparison. There's no absolute measure, so you can only compare a stadium to all the other stadiums; the addition of Coors Field to the National League, for example, lowered every other stadium's park factor by a noticeable amount.
Before the mail starts up: the one thing park factors have nothing to do with, theoretically at least, is the team that plays there. Teams with good pitching, for example, have good pitching both at home and on the road, so all of their games feature fewer runs than normal. Since park factors are a measure of how many runs both teams score in a game based on where they play, the quality of the team doesn't enter in.
The Method
In the pros, park factors are more or less easy to compute -- decide how long a period you want to count, compute the runs per game (by both teams) for all of a team's games at home and on the road, divide the runs per game at home by the runs per game on the road, and you get a park factor. You can get fancy and use runs per inning if you want, to eliminate the effect of teams that win a lot at home, but that's about it.
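For the statistically minded, here is a minimal Python sketch of that home/road ratio. The function name and arguments are made up for illustration, and multiplying by 100 is an assumption chosen so that a neutral park comes out at 100, matching the scale used in the lists later in this piece.

```python
def simple_park_factor(home_runs, home_games, road_runs, road_games):
    """Home/road park factor for one team, on a scale where 100 is neutral.

    Run totals count runs scored by BOTH teams in the team's games.
    All names here are illustrative, not taken from any real library.
    """
    if home_games <= 0 or road_games <= 0:
        raise ValueError("need at least one home game and one road game")
    home_rpg = home_runs / home_games   # runs per game in the home park
    road_rpg = road_runs / road_games   # runs per game everywhere else
    return 100.0 * home_rpg / road_rpg


# Hypothetical team: 600 total runs in 50 home games, 500 in 50 road games.
print(simple_park_factor(600, 50, 500, 50))
```

The runs-per-inning variant would work the same way, with innings played in place of games in both denominators.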
The problem is that, for that to work, you need a complete, more-or-less balanced schedule. In college, where each team plays only around 10% of the available opponents in any given season, it just doesn't work. What I've done in the past is to just compare each team to its conference opponents, but all that gives you is numbers relative to the conference, and some conferences were obviously playing in parks that were more offense-based than others. A couple of weeks ago, though, I realized that this was somewhat similar to the problem of ranking teams; you could look at valid comparisons between conferences in cases where you had home-and-away series to look at, and you could use that to generate a full park factor. For example, if teams A, B, and C were in Conference Red and teams D, E, F, and G were in Conference Blue, and A had a home-and-away with D while C had a home-and-away with G, you could produce a reasonable guess at a full ranking of the seven parks. At this point, I'm boring even myself, so if you want more detail on how I did it (it starts with pairwise comparisons for all pairs of teams who played home-and-away between 1999 and 2002, and goes from there into something similar to the ISR algorithm), drop me a line.
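To give a feel for how pairwise home-and-away comparisons can be chained into a full set of numbers, here is a rough Python sketch. It is not the algorithm used for this column, whose details are not spelled out here; it is a stand-in that assumes each home-and-away pairing yields runs per game at each of the two parks, treats the log of their ratio as the gap between the two parks, and repeatedly nudges every park toward the average implied by its pairings. The function name, the input layout, and the equal weighting of every pairing are all assumptions.

```python
import math
from collections import defaultdict


def pairwise_park_factors(series, sweeps=500):
    """Estimate park factors (100 = average park) from home-and-away pairings.

    series: iterable of (park_a, park_b, rpg_at_a, rpg_at_b), where the two
    rpg values are runs per game (both teams) when the same two teams met
    at park_a and at park_b. Hypothetical layout, for illustration only.
    """
    links = defaultdict(list)
    for a, b, rpg_a, rpg_b in series:
        diff = math.log(rpg_a / rpg_b)   # estimate of log(pf_a) - log(pf_b)
        links[a].append((b, diff))
        links[b].append((a, -diff))
    if not links:
        return {}

    log_pf = {park: 0.0 for park in links}
    for _ in range(sweeps):
        # Move each park to the average value implied by its paired parks,
        # using already-updated neighbours within the same sweep.
        for park, edges in links.items():
            log_pf[park] = sum(log_pf[o] + d for o, d in edges) / len(edges)
        # Only differences are determined, so pin the average log to zero.
        mean = sum(log_pf.values()) / len(log_pf)
        for park in log_pf:
            log_pf[park] -= mean
    return {park: 100.0 * math.exp(v) for park, v in log_pf.items()}


# Toy data: A plays twice as high-scoring as B, and B twice as high as C.
print(pairwise_park_factors([("A", "B", 12.0, 6.0), ("B", "C", 8.0, 4.0)]))
```

Parks A and C never met in the toy data, but both are tied to B, which is the same chaining idea as linking Conference Red to Conference Blue through a single home-and-away series. If the pairings split into groups with no link between them, numbers from different groups are not comparable in this sketch.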
The Numbers
Both by smell and by deduction, I'm comfortable that these are the most accurate park factors we're going to get for college in the reasonably near future. There are a few individual numbers that I can't fully explain, so take any given number with a grain of salt, but the overall trend looks accurate. I'm putting a full list in a separate page to keep the download time for this piece at a reasonable level, and I'll just go over some of the highlights here. Overall, the numbers follow a nice little bell curve; almost half of the parks fall within 10% of normal, and it tails out from there. The numbers represent a percentage; a game played in a park with a park factor of 125 will feature one-fourth more runs than the same game played in a park with a park factor of 100.
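As a quick illustration of that percentage scale, the arithmetic for moving a run total from one park to another can be sketched in Python as below. The function name is made up for this example, and it assumes run scoring simply scales in proportion to the park factor.

```python
def translate_runs(runs, from_pf, to_pf):
    """Rescale a run total from a park with factor from_pf to one with to_pf.

    Assumes runs scale in direct proportion to the park factor (100 = normal).
    """
    if from_pf <= 0 or to_pf <= 0:
        raise ValueError("park factors must be positive")
    return runs * to_pf / from_pf


# A 10-run game in a normal park, replayed in a park with a factor of 125.
print(translate_runs(10, 100, 125))
```

Run the other direction, the same function puts totals from an extreme park back on a normal-park footing, which is the usual reason for wanting park factors in the first place.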
Here are the twenty highest park factors:
211 New Mexico
162 New Mexico State
159 Illinois
156 Air Force
154 Brigham Young
151 Nevada-Las Vegas
146 Southern Utah
139 Ball State
138 Western Illinois
136 Maryland-Eastern Shore
135 Iowa
134 Indiana
133 Coppin State
132 Virginia Military
131 Iona
129 C. W. Post
129 Albany
128 Towson
126 Western Carolina
126 Maryland-Baltimore County
One of the nice things about having 287 teams to deal with rather than just 30 is that you get just about the full range of what's possible. 211. That means that, if you take two hypothetical teams and let them play magically identical games in an average park and at New Mexico, they'll score more than twice as many runs in the game at New Mexico. They're in a conference where most of the teams have parks that are higher in altitude than Coors Field, and their park is the most offense-producing of them all. Some of the others are interesting -- Illinois, Ball State, and Western Illinois all finishing in the top ten is something I'm still working on. My theory, given that none of those parks are particularly small and that the Illinois-Indiana mountain range is still a few million years away, is that those schools are in conferences where most of their paired-up opponents are farther north, and the relative weather differences make their home parks more friendly to offense. I'm open to other suggestions, though. Some of the other parks are rather small.
Now, the other end of the spectrum:
59 Binghamton
61 Cal-Irvine
65 Texas-Pan American
66 Northeastern
69 Louisiana-Lafayette
71 Arkansas State
71 Florida International
73 Akron
74 Long Beach State
75 North Carolina
76 LeMoyne
76 Northwestern State
77 Duke
77 Mississippi
77 North Carolina-Wilmington
77 Oregon State
77 Portland
78 California
78 Connecticut
78 Elon
The first two only have one year of data, so those numbers may be a little suspect. On the other hand, in Binghamton's case, it's cold and the park is fairly large (I'm doing that from memory; if I'm wrong, someone will correct me), so it's probably fairly close. Essentially, here it's obvious that it's easier to suppress runs if you're closer to sea level or if it's cold.
There are some interesting follow-up questions that I may try to get around to after this: Is there a relationship between the park factor and winning? Theoretically, it might be easier to recruit hitters in a high-offense park, but it might be easier to manage a pitching staff when they naturally don't have to throw as many pitches in a low-offense park. What are the relationships between numbers for the conferences, and how does that affect perception? Who takes best advantage of their home park? All of those are worth a look, now that we have some data to work with. Let me hear from you.
If you're interested in reprinting this or any other Boyd's World material for your publication or Web site, please read the reprint policy and contact me.