Friday, June 22, 2007

True Colors

35-35.

Among the many quotes attributed to and about Bill Parcells, the most famous one might be, “You are what your record says you are.”

My personal favorite is from Keith Byars, a former fullback who played under Parcells: “When Coach says there’s cheese on the mountain, you better bring crackers.”

But I digress.

The Yankees are indeed what their record says they are: a .500 baseball team. For every strength, there is a weakness to match. For every glimmer of hope, there is a cold dose of reality waiting around the bend.

What happens next?

That question will surround this team for the entire season, no matter the outcome. They have become a model of unpredictability: a team that is hard to figure, difficult to gauge and impossible to read.

Case in point: their expected won-loss record.

Bill James’ Pythagorean Theorem (formally, the Pythagorean expectation) is considered a reliable tool for predicting a team’s winning percentage from its runs scored and runs allowed. A simple equation, and generally very accurate for a crack baseball writer’s purposes.

The formula used by Major League Baseball on mlb.com varies somewhat, swapping James’ original exponent of 2 for 1.82: RS^1.82/((RS^1.82)+(RA^1.82)), where RS is runs scored and RA is runs allowed.
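For anyone who wants to run the numbers at home, here is a minimal Python sketch of that formula. The function names are my own invention, and the run totals in the example are hypothetical, not any team's real numbers.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.82):
    """Expected winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_record(runs_scored, runs_allowed, games, exponent=1.82):
    """Expected (wins, losses) over `games`, rounded to whole games."""
    pct = pythagorean_win_pct(runs_scored, runs_allowed, exponent)
    wins = round(pct * games)
    return wins, games - wins


if __name__ == "__main__":
    # Hypothetical totals: 400 runs scored, 330 allowed, 70 games played.
    print(expected_record(400, 330, 70))
```

Passing exponent=2 gives James' original version, so the two can be compared side by side.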

In the American League last year, the average team played to within 3.71 games of its “expected” record.

That number is skewed quite a bit by the Indians, who played a whopping 12 games under their expected won-loss record, the largest margin I’ve ever seen using this formula.

In the National League in 2006, the average team played to within 2.68 games of its expected record.

Going back to 2005, these numbers were 3.68 and 4 in the N.L. and A.L. respectively.

So, over the course of a full season, a typical team will finish within about three games of its projected record.

Back to 2007.

According to Pythagoras, the two best teams in baseball to this point have been the Boston Red Sox and the San Diego Padres. They’re both playing to an expected record of 44-27.

The Padres are under-performing this mark by three games. Boston is over-performing their record by two games. Both teams are basically right on their expected record, indicating there isn’t much going on that can be chalked up to luck: what you see is what you get.

There is a general tendency to believe that the gap between a team’s actual and expected record will gravitate to zero as the season continues. Sometimes this happens, sometimes it doesn’t.

Cleveland didn’t last year. Arizona, who was 11 games off their projected record in 2005, didn’t either.

I would venture that the further that number gets away from zero, the more likely it is to gravitate back towards zero in the future. I’m not a statistician, so I admit I could be wrong on this. But as the theorem tries to balance those clumsy dancing partners known as “luck” and “reality,” the longer you go, typically, the more accurately the runs scored and allowed will be reflected in the actual record.

Next up is Detroit with an expected record of 43-28, one game better than their actual record of 42-29.

Following the Tigers are the Yankees. Based on the number of runs they’ve scored, and the number of runs they’ve allowed, Baseball Pythagoras (whoever he is) would predict the Yankees to have a record of 41-29, six games better than the 35-35 record they actually have right now.

The only other American League team that is more than three games off its Pythagorean pace is the Orioles, who have under-performed their expected record by four games.
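The "games off" figure is just actual wins minus expected wins. Here is a small sketch using only the records quoted in this post; the three-game cutoff is my own rule of thumb taken from the league averages above, and the function names are made up for illustration.

```python
# Actual and expected (wins, losses) as quoted in this post.
RECORDS = {
    "Yankees": {"actual": (35, 35), "expected": (41, 29)},
    "Tigers": {"actual": (42, 29), "expected": (43, 28)},
}

# Assumed cutoff: the "typical" season-long gap discussed above.
THRESHOLD = 3


def games_vs_expected(actual, expected):
    """Actual wins minus expected wins; negative means under-performing."""
    return actual[0] - expected[0]


def report(records, threshold=THRESHOLD):
    """Map each team to (gap in games, whether the gap exceeds threshold)."""
    result = {}
    for team, rec in records.items():
        diff = games_vs_expected(rec["actual"], rec["expected"])
        result[team] = (diff, abs(diff) > threshold)
    return result


if __name__ == "__main__":
    for team, (diff, flagged) in report(RECORDS).items():
        note = " (more than 3 games off)" if flagged else ""
        print(f"{team}: {diff:+d}{note}")
```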

In the N.L., there are a few more teams that are off their Pythagorean record by more than three games:

San Francisco: -6
Chicago: -5
Arizona: +5
Cincinnati: -4
Atlanta: +4
St. Louis: +4

What does it all mean for the Yankees? It might not mean anything. Their RS vs. RA numbers suggest a good baseball team, a team that should be two games behind Boston, and in 1st place in the Wild Card race.

The reality says the team is a far cry from the type of consistent outfit that would be 12 games over .500 at this point in the season.

I don’t know how to explain a team that was scoring over five runs per game through the first few weeks of June, and then totaled five runs over three games at Coors Field.

Their putrid record in one-run games, 4-11, worst in the majors, accounts for most of their good position in the expected W/L standings. When you lose a one-run game, it’s only a minimal hit to your overall run margin. And the Yankees are a team, in general, that scores a lot of runs, and will have games where they’ll be +4, +5, etc.
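To see how that plays out, here is a rough sketch with six made-up scores: three one-run losses and three comfortable wins. These are hypothetical games, not actual Yankees results, and the 1.82 exponent is the mlb.com figure mentioned earlier.

```python
# Hypothetical (runs scored, runs allowed) per game: NOT real results.
# Three one-run losses followed by three lopsided wins.
HYPOTHETICAL_GAMES = [(3, 4), (2, 3), (5, 6), (8, 3), (7, 2), (6, 2)]


def summarize(games, exponent=1.82):
    """Return (wins, losses, run margin, expected winning percentage).

    Assumes no ties, so every game that is not a win is a loss.
    """
    wins = sum(1 for scored, allowed in games if scored > allowed)
    losses = len(games) - wins
    rs = sum(scored for scored, _ in games)
    ra = sum(allowed for _, allowed in games)
    expected_pct = rs ** exponent / (rs ** exponent + ra ** exponent)
    return wins, losses, rs - ra, expected_pct


if __name__ == "__main__":
    wins, losses, margin, pct = summarize(HYPOTHETICAL_GAMES)
    print(f"Actual: {wins}-{losses}, run margin {margin:+d}, expected pct {pct:.3f}")
```

The team in this toy example is a .500 club with a healthy run margin, so the formula pegs it well above .500. Each one-run loss costs a full game in the standings but only a single run in the margin.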

I’m out of predictions. I don’t know what direction this team is heading in.

This week was a bad blow to the team’s post-season aspirations, nullifying nearly all the strides they made against the Red Sox in June, and setting them too far back from the Wild Card (6.5 games) to be considered close. They have a lot of work to do to make the post-season, possibly an inordinate amount.

Looking at the expected won-loss record, I could make the assumption that their actual record will start to better reflect their solid run margin, and they’ll be in the Wild Card race through September.

I also could go the other way and say this team just doesn’t have it. Their record in one-run games will continue to go south; the beleaguered bullpen will remain beleaguered; the inept bench will remain inept; and the injuries to expected contributors will eventually take their toll on their ability to score runs over an extended period of time.

The general consensus among sabermetricians is that a team’s record in one-run games is simply a reflection of the most nebulous of elements in team sports: luck.

I’m not a big believer in luck. I believe that in each individual instance, there is an explanation that will at least partially explain a one-run loss. Just because there may be a dozen different explanations for 12 different one-run losses doesn’t mean that a team is unlucky; it may mean they’re susceptible to a myriad of ways of losing a tight ballgame.

This is the case for this year’s Yankee team. They can suffer from an almost unexplainable drought of runs, as they did this week, and at the same time get a solid effort from a starting pitcher. They can suffer from short outings from their starters and bad outings from their relievers on nights when they score six or seven runs. They are the type of team that fits the mold of an outfit that would be prone to losing one-run games, if there is such a thing.

There are two easy-to-see, easy-to-define numbers that suggest the Yankees have been an unlucky team: expected record and record in one-run games. The general belief would be that luck turns around, and the Yankees’ record will begin reflecting their ability to score runs and prevent runs to a more accurate extent.

I’m not convinced that their “luck” is going to turn around. Instead I look at their record in one-run games, for example, as a reflection of how flawed this team really is. I look at their 4-11 record in such games, and wonder how much they’re capable of improving on that record over the course of the remainder of the season.

Pythagoras sees a team contending for a division title. I see a team that is as average as their .500 record suggests.

Comments:
Great post... Question: Do you have any clue why MLB doesn't use the "actual" formula?
 
Mike,

Thanks for the kind words, and for reading.

The "actual" formula, or original formula first devised by James in the 80s, has been tweaked and revised over the years to more accurately reflect a team's actual winning pct.

This is from the Wikipedia page that I linked to in my original post:

Empirically, this formula correlates fairly well with how baseball teams actually perform, although an exponent of 1.81 is slightly more accurate. This correlation is one justification for using runs as a unit of measurement for player performance. Efforts have been made to find the ideal exponent for the formula, the most widely known being the Pythagenport formula[1] developed by Clay Davenport of Baseball Prospectus (1.5 log((r + ra)/g) + 0.45) and the less well known but equally effective Pythagenpat formula (((r + ra)/g)^0.287), developed by David Smyth.[2]
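Both of those formulas compute the exponent itself from a team's scoring environment, where r is runs scored, ra is runs allowed and g is games played. A quick sketch, with two caveats: the function names are mine, and I'm assuming "log" in the quoted passage means base 10.

```python
import math


def pythagenport_exponent(runs_scored, runs_allowed, games):
    """Davenport's exponent: 1.5 * log10((r + ra) / g) + 0.45.

    Base-10 log is an assumption; the quoted passage only says "log".
    """
    return 1.5 * math.log10((runs_scored + runs_allowed) / games) + 0.45


def pythagenpat_exponent(runs_scored, runs_allowed, games):
    """Smyth's exponent: ((r + ra) / g) ** 0.287."""
    return ((runs_scored + runs_allowed) / games) ** 0.287


if __name__ == "__main__":
    # Hypothetical totals: 400 scored, 300 allowed over 70 games
    # (10 total runs per game).
    print(pythagenport_exponent(400, 300, 70))
    print(pythagenpat_exponent(400, 300, 70))
```

Either exponent can then be plugged into RS^x/((RS^x)+(RA^x)) in place of a fixed 1.82 or 2.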

So kudos to MLB for using a more updated version of the formula, which appears to be more accurate.
 