Park Factors with Teeth

Boyd's World-> Breadcrumbs Back to Omaha-> Park Factors with Teeth About the author, Boyd Nation

Publication Date: February 11, 2003

Context Is Everything

I apologize for not meeting my self-imposed deadline last week. In some ways, I'm combining the worst aspects of sportswriting and software development here: the rigid deadlines of sportswriting and the unpredictability of software debugging sometimes collide with rather ugly results. Last week I just had one more bug I could not find despite more effort than I had time for. I appreciate your patience; I think the result will be worth it for the more statistically minded among you.

I finally have some park factors that I like. Half (warning, huge approximation alert!) of you are going, "Woooo hoooo," at this point, and the other half just flipped channels over to another Peter Gammons story on trade rumors involving six teams and a goat. For those of you who are left but need a little background, I'll explain my cunning plan to the audience, Exposition Boy.

The first step that's taken when you begin to seriously analyze sports statistics rather than just memorize them is to try to put them in context. Having a defense that holds teams to 105 points a game is horrible in today's NBA (I think that's still true); having a defense that held opponents to 105 in 1986 was above average. Putting up a 3.80 ERA in the SEC is harder than putting up a 2.80 ERA in the NEC. Hitting 20 home runs at Memphis is a considerably harder feat than hitting 20 home runs at New Mexico State, even though the two teams play roughly similar levels of competition.

When baseball fans talk about context, one of the first things they adjust for is what's called park effects, or the park factor. If you could somehow magically play the exact same game in two different parks, you'd get different results. A ball that travels 390 feet in Starkville is a long fly ball to the center fielder. The same ball is a 430-foot home run in Colorado Springs. In Akron, that same batter might have popped out in the extra-large foul territory instead (if I remember right). There are lots of factors that control how many runs are scored in any given stadium relative to a "normal" park. Some of them have to do with the park itself, such as the size of the field and the amount of foul territory available for the fielders. Others have to do with the location of the park, such as altitude and weather patterns. These things can obviously change from year to year as the weather changes (this factor is smaller in college than in the pros, because most of the comparable stadiums tend to be in the same weather region) or as stadiums are modified, but that has to be balanced against the need to have enough data points for meaningful comparison, so a period of time has to be chosen for comparison. There's no absolute measure, so you can only compare a stadium to all the other stadiums; the addition of Coors Field to the National League, for example, lowered every other stadium's park factor by a noticeable amount.

Before the mail starts up: park factors don't have anything to do with the team that plays there, theoretically at least. Teams with good pitching, for example, have good pitching both at home and on the road, so all of their games feature fewer runs than normal. Since park factors are a measure of how many runs both teams score in a game based on where they play, the quality of the team doesn't enter in.

The Method

In the pros, park factors are more or less easy to compute -- decide how long you want to count, compute the runs per game for all of a team's games home and away, divide the runs at home by the runs on the road, and you get a park factor. You can get fancy and use runs per inning if you want, to eliminate the effect of teams that win a lot at home, but that's about it.
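For the code-minded, the pro-style calculation above comes down to a couple of divisions. Here's a minimal sketch in Python; the function name and the run totals are made up for illustration, and I've scaled the ratio by 100 to match the percentage convention used for the numbers later in this piece.

```python
def park_factor(home_runs, home_games, road_runs, road_games):
    """Runs per game at home divided by runs per game on the road, times 100.

    Run totals count both teams' runs in each game, so the quality of the
    home team largely cancels out.
    """
    home_rpg = home_runs / home_games
    road_rpg = road_runs / road_games
    return 100.0 * home_rpg / road_rpg


# Hypothetical totals for one team over the chosen period (not real data).
pf = park_factor(home_runs=900, home_games=81, road_runs=750, road_games=81)
print(round(pf))
```

Swapping games for innings in the denominators gives the runs-per-inning variant.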

The problem is that, for that to work, you need a complete, more-or-less balanced schedule. In college, where each team plays only around 10% of the available opponents in any given season, it just doesn't work. What I've done in the past is to just compare each team to its conference opponents, but all that gives you is numbers relative to the conference, and some conferences were obviously playing in parks that were more offense-friendly than others. A couple of weeks ago, though, I realized that this was somewhat similar to the problem of ranking teams; you could look at valid comparisons between conferences in cases where you had home-and-away series to look at, and you could use that to generate a full park factor. For example, if teams A, B, and C were in Conference Red and teams D, E, F, and G were in Conference Blue, and A had a home-and-away with D while C had a home-and-away with G, you could produce a reasonable guess at a full ranking of the seven parks. At this point, I'm boring even myself, so if you want more detail on how I did it (it starts with pairwise comparisons for all pairs of teams who played home-and-away between 1999 and 2002, and goes from there into something similar to the ISR algorithm), drop me a line.
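To give a flavor of how pairwise comparisons can be chained into a full set of factors, here is a rough sketch in Python. It is not the algorithm behind the numbers in this piece, which isn't spelled out here; it's a plain iterative averaging scheme that I'm substituting for illustration. It assumes that, for each pair of teams with a home-and-away, you already have the ratio of runs per game in one park to runs per game in the other. The function name, the seven-team setup (borrowed from the Red and Blue example above), and the "true" factors used to build the test ratios are all made up.

```python
import math


def solve_park_factors(pair_ratios, iterations=200):
    """Estimate park factors from pairwise home-and-away ratios.

    pair_ratios maps (park_a, park_b) to runs per game at park_a divided by
    runs per game at park_b, counting only games between those two teams.
    """
    # Record each comparison in both directions, on a log scale.
    links = {}
    for (a, b), ratio in pair_ratios.items():
        links.setdefault(a, []).append((b, math.log(ratio)))
        links.setdefault(b, []).append((a, -math.log(ratio)))

    # Repeatedly reset each park to the average of what its paired
    # opponents imply, updating in place so later parks see the new values.
    log_pf = {park: 0.0 for park in links}
    for _ in range(iterations):
        for park, neighbors in links.items():
            implied = [log_pf[other] + diff for other, diff in neighbors]
            log_pf[park] = sum(implied) / len(implied)

    # Only relative values are meaningful, so center the average log at zero
    # (a geometric mean of 100 across all parks).
    mean = sum(log_pf.values()) / len(log_pf)
    return {park: 100.0 * math.exp(v - mean) for park, v in log_pf.items()}


# Made-up "true" factors, used only to generate consistent test ratios.
true_pf = {"A": 120, "B": 100, "C": 90, "D": 110, "E": 95, "F": 105, "G": 80}
pairs = [("A", "B"), ("B", "C"), ("A", "C"),               # Conference Red
         ("D", "E"), ("E", "F"), ("F", "G"), ("D", "G"),   # Conference Blue
         ("A", "D"), ("C", "G")]                           # cross-conference
ratios = {(a, b): true_pf[a] / true_pf[b] for a, b in pairs}

factors = solve_park_factors(ratios)
for park in sorted(factors):
    print(park, round(factors[park]))
```

With perfectly consistent inputs like these, the relative factors come back intact; with real, noisy data the averaging step is what spreads the disagreement around. The two cross-conference series are the only thing tying the Red parks to the Blue parks, which is the whole point of looking for them.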

The Numbers

Both by smell and by deduction, I'm comfortable that these are the most accurate park factors we're going to get for college in the reasonably near future. There are a few individual numbers that I can't fully explain, so take any given number with a grain of salt, but the overall trend looks accurate. I'm putting a full list in a separate page to keep the download time for this piece at a reasonable level, and I'll just go over some of the highlights here. Overall, the numbers follow a nice little bell curve; almost half of the parks fall within 10% of normal, and it tails out from there. The numbers represent a percentage; a game scored in a park with a park factor of 125 will feature one-fourth more runs than the same game scored in a park with a park factor of 100.
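If you want to use the percentages to move a run total from one park to another, the arithmetic is a single ratio. A small sketch in Python, with a hypothetical function name and a made-up run total:

```python
def translate_runs(runs, source_pf, target_pf):
    """Scale a run total from a park with source_pf to a park with target_pf."""
    return runs * target_pf / source_pf


# Eight runs in a neutral park (100) becomes ten in a park rated 125.
print(translate_runs(8, source_pf=100, target_pf=125))
```

Going the other way (dividing out a park's factor) gives a rough park-neutral number for comparing teams that play in very different stadiums.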

Here are the twenty highest park factors:

211 New Mexico
162 New Mexico State
159 Illinois
156 Air Force
154 Brigham Young
151 Nevada-Las Vegas
146 Southern Utah
139 Ball State
138 Western Illinois
136 Maryland-Eastern Shore
135 Iowa
134 Indiana
133 Coppin State
132 Virginia Military
131 Iona
129 C. W. Post
129 Albany
128 Towson
126 Western Carolina
126 Maryland-Baltimore County

One of the nice things about having 287 teams to deal with rather than just 30 is that you get just about the full range of what's possible. 211. That means that, if you take two hypothetical teams and let them play magically identical games in an average park and at New Mexico, they'll score more than twice as many runs in the game at New Mexico. They're in a conference where most of the teams have parks that are higher in altitude than Coors Field, and their park is the most offense-producing of them all. Some of the others are interesting -- Illinois, Ball State, and WIU all finishing in the top ten is something I'm still working on. My theory, given that none of those parks are particularly small and that the Illinois-Indiana mountain range is still a few million years away, is that those schools are in conferences where most of their paired-up opponents are farther north, and the relative weather differences make their home parks more friendly to offense. I'm open to other suggestions, though. Some of the other parks are rather small.

Now, the other end of the spectrum:

 59 Binghamton
 61 Cal-Irvine
 65 Texas-Pan American
 66 Northeastern
 69 Louisiana-Lafayette
 71 Arkansas State
 71 Florida International
 73 Akron
 74 Long Beach State
 75 North Carolina
 76 LeMoyne
 76 Northwestern State
 77 Duke
 77 Mississippi
 77 North Carolina-Wilmington
 77 Oregon State
 77 Portland
 78 California
 78 Connecticut
 78 Elon

The first two only have one year of data, so those numbers may be a little suspect. On the other hand, in Binghamton's case, it's cold and the park is fairly large (I'm doing that from memory; if I'm wrong, someone will correct me), so it's probably fairly close. Essentially, here it's obvious that it's easier to suppress runs if you're closer to sea level or if it's cold.

There are some interesting follow-up questions that I may try to get around to after this: Is there a relationship between the park factor and winning? Theoretically, it might be easier to recruit hitters in a high-offense park, but it might be easier to manage a pitching staff when they naturally don't have to throw as many pitches in a low-offense park. What are the relationships between numbers for the conferences, and how does that affect perception? Who takes best advantage of their home park? All of those are worth a look, now that we have some data to work with. Let me hear from you.

If you're interested in reprinting this or any other Boyd's World material for your publication or Web site, please read the reprint policy and contact me.
