Boyd's World-> Breadcrumbs Back to Omaha-> Unlucky? About the author, Boyd Nation


Publication Date: April 6, 2004

It Could Get Better

This week's topic comes to you courtesy of a conversation with Paul Kislanko. Paul's a relative newcomer to the game, which gives him a different perspective to go with quite a bit of analytical experience. He asked about Pythagorean Projections, and in explaining them to him, I finally found a way to clarify some thoughts I've been stumbling over for a few years.

Pythagorean Projections come from one of Bill James' multitudinous observations, in this case the fact that most teams' final win totals fall within a game or two of what this formula predicts from their runs scored (RS) and runs allowed (RA):

WP% = RS^2 / (RS^2 + RA^2)

You can get an even better fit if you use something slightly smaller for the exponent in most cases, usually something between 1.85 and 1.9, but that's just a bonus -- even the basic formula holds up really well for the major leagues.
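
As a rough sketch, the formula and the exponent tweak can be written as a small Python function. The function name and the sample run totals are made up for illustration; only the formula itself and the 1.85 to 1.9 range come from the text above.

```python
def pythagorean_wp(runs_scored, runs_allowed, exponent=2.0):
    """Projected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
basic = pythagorean_wp(800, 700)         # basic exponent of 2
tuned = pythagorean_wp(800, 700, 1.87)   # somewhere in the 1.85-1.9 range

print(f"basic: {basic:.3f} ({basic * 162:.1f} wins)")
print(f"tuned: {tuned:.3f} ({tuned * 162:.1f} wins)")
```

A smaller exponent pulls every projection a little closer to .500, which is why the tuned number for a winning team comes out slightly below the basic one.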

Now, sometimes this gets called the Pythagorean Theorem, presumably in reference to the original Pythagorean Theorem (I don't remember if James himself used that term), but that's a misnomer -- there's not really any theory involved. There's no particular reason why it works; it just does. Most years roughly 90% of the major league teams fall within two games of the Pythagorean Projection. Knowing that, the formula can also be used mid-season to see who's likely to improve their fortunes even if they don't improve their teams; teams who have underperformed their projection usually improve and vice versa.

Now, college baseball is not that clean-cut, I've found as I played with it over the years, and the reason is the massively unbalanced schedule. Not only is the schedule different for each team, it's uneven within the season -- teams in major conferences tend to play easier schedules at the beginning of the year than they do at the end, which makes predictions based on projections difficult. However, there are a couple of different ways that the data can be looked at, and I think there's some potential there to identify teams that might improve from this point on.

The math gets complicated here, but if you adjust the actual runs scored and allowed by the difficulty of the opponent for each game, you get a different and possibly more accurate projection. The difference, I suppose, is that the traditional projection works if you assume an average schedule from this point forward, while the adjusted projection works if you assume a schedule of the same difficulty as the season to date for the rest of the year. There's no guarantee that either is the case, of course, but using the two together, I've identified some teams that may be in for a surprise, whether pleasant or not.

First and foremost, freshly identified by Baseball America as a disappointment, are the Baylor Bears. One-run games are generally a crapshoot, won by luck more often than by any actual characteristic of the teams involved. The Bears are 3-12 in one-run games. In the meantime, they've outscored their opponents 161-142 against a strong schedule. Turning that 3-12 into a neutral record would move them from 12-18 to 17-13 or so and put them squarely into the Big 12 hunt. It's probably too late for a title run, but it wouldn't be at all surprising for them to make a good recovery.
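
The Baylor numbers can be checked by hand. Here is a quick sketch in Python, using only the figures quoted above (161 runs scored, 142 allowed, a 12-18 record, 3-12 in one-run games) and the basic exponent of 2. The variable names are illustrative, and treating a "neutral" one-run record as an even split is an assumption about what the paragraph means.

```python
runs_scored, runs_allowed = 161, 142
wins, losses = 12, 18
one_run_wins, one_run_losses = 3, 12

games = wins + losses

# Raw Pythagorean projection with the basic exponent of 2.
wp = runs_scored**2 / (runs_scored**2 + runs_allowed**2)
projected_wins = wp * games

# Alternative check: replace the one-run record with an even split
# (an assumed reading of "neutral").
one_run_games = one_run_wins + one_run_losses
neutral_wins = wins - one_run_wins + one_run_games / 2

print(f"projected:       {projected_wins:.1f}-{games - projected_wins:.1f}")
print(f"one-run neutral: {neutral_wins:.1f}-{games - neutral_wins:.1f}")
```

Both routes land within about half a game of the 17-13 figure, roughly five wins better than the actual 12-18.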

Also sitting at the bottom of a power conference and looking up are the Alabama Crimson Tide. The Tide fit the classic profile of an under-projection team; they tend to score runs in bunches but get shut down sometimes. Historically, teams like that have had a tendency to put together strong runs at times. On the other hand, their non-conference schedule was fairly weak, so we'll consider them a test case and see how they do against the tough schedule approaching. Kansas falls into the same category.

Another category of teams who may improve because they've stopped hitting themselves in the head -- in other words, teams whose conference schedules are much less tough than their non-conference schedules and whose overall records should go up from this point -- includes Houston, Cal State Fullerton, and Fresno State. All of those are on the bubble right now, or just off of it; all of them could play themselves in with a strong conference finish.

The teams that have outplayed their projections are, for the most part, not as interesting, but Auburn and Loyola Marymount fans may want to brace themselves. In addition, UC Irvine has been very good, but they haven't been quite that good; they've outplayed their raw projection by three games. On the other hand, their schedule is strong enough that they're actually even on their adjusted projection, so they may actually be that good.


In response to a question from Long Beach baseball SID Niall Adler, the biggest single-season turnaround in conference play is one of these, depending on where you set the threshold for statistical significance:

Navy, EIBL; 1953, 1-8; 1954, 8-1
Texas Christian, SWC; 1955, 2-13; 1956, 13-2
Long Beach State, PCAA; 1988, 4-17; 1989, 17-4

Tournament Watch

This means absolutely nothing, ignore it.

Actually, this is an experiment for me to see how predictable the postseason makeup is. I want to see how accurate my picks are (using myself as the test subject as a moderately knowledgeable observer with no input into the results) at various distances from the selection. I'm not going to bother picking a team from the one-bid conferences, since the conference tournament will just be a crapshoot, but if I only list one team from a conference, they'll get an at-large bid if they don't get the automatic bid.

Southern Conf.     Florida State        Notre Dame             Louisiana State
Atlantic 10        North Carolina St.   Birmingham-Southern    Mississippi
CAA                Virginia             UC Irvine              South Carolina
Horizon            Clemson              Long Beach State       Florida
MAAC               North Carolina       Albany                 Arkansas
MAC                Florida Atlantic     Cal State Fullerton    Tennessee
MEAC               Central Florida      Southern Mississippi   Auburn
Mountain West      Texas                East Carolina          Vanderbilt
NEC                Texas A&M            Tulane                 Texas State
OVC                Nebraska             Texas Christian        Lamar
Patriot            Oklahoma             Stanford               La.-Lafayette
SWAC               Southern California  Arizona State          Georgia Tech
Mid-Continent      Houston              Washington             Rice
Coastal Carolina   Oklahoma State       Arizona                South Florida
Miami, Florida     Minnesota            St. John's             Loyola Marymount
Wichita State      Penn State           Washington State       Georgia

Pitch Count Watch

Rather than keep returning to the subject of pitch counts and pitcher usage in general too often for my main theme, I'm just going to run a standard feature down here where I point out potential problems; feel free to stop reading above this if the subject doesn't interest you. This will just be a quick listing of questionable starts that have caught my eye -- the general threshold for listing is 120 actual pitches or 130 estimated, although short rest will also get a pitcher listed if I catch it. Don't blame me; I'm just the messenger.

Date    Team                      Pitcher           Opponent                   IP   H   R   ER  BB  SO  AB  BF  Pitches
Mar 27  Bucknell                  Kevin Miller      Navy                       9.0  5   1   0   2   10  32  35  112
Apr 2   North Carolina            Daniel Bard       Wake Forest                6.2  9   6   4   4   2   29  35  129
Apr 2   Northwestern              J. A. Happ        Indiana                    9.0  2   0   0   1   12  27  29  126
Apr 2   Ohio State                Josh Newman       Illinois                   9.0  7   1   1   0   14  33  33  123
Apr 2   Coastal Carolina          Steven Carter     Winthrop                   8.0  5   1   1   3   13  29  34  139
Apr 2   North Carolina-Charlotte  Zachary Treadway  East Carolina              9.0  8   8   4   3   5   33  40  122
Apr 2   Houston                   Garrett Mock      Alabama-Birmingham         7.2  6   3   3   3   6   29  34  136
Apr 2   Northern Illinois         Joe Piekarz       Ball State                 8.1  8   3   3   2   9   31  34  132
Apr 2   Southwest Missouri State  Derek Drage       Southern Illinois          9.0  5   2   2   4   10  32  37  149(*)
Apr 2   Utah                      Jason Price       Air Force                  7.0  12  5   5   2   11  30  33  122
Apr 2   Stanford                  Mark Romanczuk    UCLA                       8.0  10  4   4   2   10  33  36  146(*)
Apr 2   Mississippi               Mark Holliman     Mississippi State          7.1  5   0   0   3   8   25  29  124
Apr 2   Vanderbilt                Jeremy Sowers     South Carolina             9.0  8   2   2   2   9   32  34  122
Apr 2   Appalachian State         Peterson          Citadel                    6.0  4   6   3   5   8   22  30  127
Apr 2   Georgia Southern          Carroll           North Carolina-Greensboro  9.0  8   2   2   1   5   36  39  124
Apr 2   Lamar                     Kyle Stutes       Texas-San Antonio          9.0  5   2   2   3   6   30  35  123
Apr 2   Gonzaga                   Eric Dworkis      San Diego                  9.0  11  5   5   1   4   38  41  150(*)
Apr 2   Louisiana Tech            Clayton Meyer     Fresno State               7.2  13  8   5   0   9   37  37  128
Apr 3   Wake Forest               Justin Keadle     North Carolina             8.2  10  6   5   5   6   35  41  133
Apr 3   Texas                     J. P. Howell      Texas Tech                 8.0  2   0   0   2   11  26  29  132
Apr 3   William and Mary          Jeff Dagenhart    Seton Hall                 8.0  3   1   1   2   11  27  30  125
Apr 3   Houston                   Brad Lincoln      Alabama-Birmingham         9.0  5   0   0   2   4   31  34  125
Apr 3   Centenary                 Kevin Willborn    Texas A&M-Corpus Christi   8.2  11  2   2   3   5   35  40  122
Apr 3   Texas A&M-Corpus Christi  Mike Garcia       Centenary                  9.0  8   4   4   2   8   34  36  132
Apr 3   Eastern Illinois          Kirk Miller       Murray State               7.0  9   5   5   8   3   27  35  151
Apr 3   Pepperdine                Kea Kometani      San Francisco              9.0  8   2   2   1   3   33  36  121
Apr 3   Fresno State              David Griffin     Louisiana Tech             9.0  9   3   3   0   13  36  36  130
Apr 4   Duke                      Greg Burke        Virginia                   9.0  9   3   1   2   6   36  39  146(*)
Apr 4   Alabama-Birmingham        Jeff Brown        Houston                    8.1  9   5   2   4   5   33  39  145
Apr 4   Charleston Southern       Bissell           North Carolina-Asheville   9.0  5   0   0   3   8   32  36  146(*)
Apr 4   Texas A&M-Corpus Christi  Trey Hearne       Centenary                  7.0  5   3   1   2   12  26  29  126

The Miller count from March 27 is a correction based on an actual pitch count.

(*) Pitch count is estimated.

If you're interested in reprinting this or any other Boyd's World material for your publication or Web site, please read the reprint policy and contact me.

