Saturday, October 6, 2007

Why MLB Playoff Odds?

You may be familiar with some or all of the three sites that have produced postseason odds for MLB. Why are my numbers different, and more importantly, why are they better?

The three sites above use what is known as a Monte Carlo simulation. In effect, the Monte Carlo engine creates a million different possible futures. In each future timeline, every team has a fixed percentage chance to win each of its scheduled games. If the Yankees are a 60-40 favorite over the Indians at Yankee Stadium, they are assigned a 60% chance to win that game in each simulation.
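To make the idea concrete, here is a minimal sketch of such a simulation in Python. It is not the code any of those sites run. The function name `simulate_race`, the two-team setup, the win probabilities, and the ten-game schedule are all made up for illustration, and each game is treated as an independent coin flip with a fixed bias.

```python
import random

def simulate_race(p_a, p_b, games_left, trials=20000, seed=2007):
    """Estimate how often team A finishes ahead of team B (hypothetical setup).

    Both teams start level and play `games_left` more games against other
    opponents, each with a fixed chance of winning every game.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    credit = 0.0
    for _ in range(trials):
        # One simulated "future": play out every remaining game for both teams.
        wins_a = sum(rng.random() < p_a for _ in range(games_left))
        wins_b = sum(rng.random() < p_b for _ in range(games_left))
        if wins_a > wins_b:
            credit += 1.0
        elif wins_a == wins_b:
            credit += 0.5  # simplification: a tie is split evenly
    return credit / trials

# A .600 team against a .500 team with ten games left each.
estimate = simulate_race(0.60, 0.50, games_left=10)
print(f"Team A finishes ahead in about {estimate:.1%} of simulated seasons")
```

More trials give a steadier estimate, but the answer is only as good as the win probabilities fed into it.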

This is a good method, but to use it optimally, we need accurate inputs. All of the above simulators assume that a .500 team is a .500 team on any given day. In our example, the Yankees will be assigned a 60% chance to beat the Indians regardless of whether the pitching matchup is Wang-Laffey or DeSalvo-Sabathia. If A-Rod and Jeter collide while going for a pop-up and knock each other out for tomorrow's game, the .600 figure remains unchanged.

Furthermore, of the three, only Coolstandings applies tiebreakers to determine a champion in deadlocked playoff races. There are many instances where one team has clinched a tiebreaker advantage well before the end of the season, but the simulator simply counts a tie as half a win for each team. That's fine for rough estimates, but we can do better.

Now, over a long season, these things tend to even out. But late in the season, or during the playoffs, these factors have a great influence on the race and the probabilities of each team advancing.

I'm not familiar with running my own Monte Carlo simulations, although Clay Davenport or the Coolstandings guys are welcome to give me recommendations for learning. What I can work with is probability distributions. Say you handicap a team's chances to win the five games of a best-of-five series at .629, .514, .723, .450, and .498, respectively. From there, it's a relatively simple process to determine their probability of winning the series (61.8%). If the team wins Game 1, this figure goes up (75.9%); if they lose, it declines (38.0%).
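For anyone who wants to check those figures, here is one way the calculation might be done in Python. This is a sketch rather than my actual worksheet: the function name `series_win_probability` is made up for illustration, and it assumes the games are independent, so the result of one game does not change the handicap for the next.

```python
def series_win_probability(game_probs, wins_needed, wins_so_far=0):
    """Chance of reaching `wins_needed` wins over the remaining games.

    `game_probs` holds the team's win probability for each remaining game.
    Every scheduled game is "played out" even if the series is already
    decided, which does not change the chance of reaching `wins_needed`.
    """
    # dist[w] = probability of having exactly w wins so far
    dist = {wins_so_far: 1.0}
    for p in game_probs:
        nxt = {}
        for wins, prob in dist.items():
            nxt[wins + 1] = nxt.get(wins + 1, 0.0) + prob * p    # win this game
            nxt[wins] = nxt.get(wins, 0.0) + prob * (1 - p)      # lose this game
        dist = nxt
    return sum(prob for wins, prob in dist.items() if wins >= wins_needed)

probs = [0.629, 0.514, 0.723, 0.450, 0.498]

overall = series_win_probability(probs, wins_needed=3)
after_win = series_win_probability(probs[1:], wins_needed=3, wins_so_far=1)
after_loss = series_win_probability(probs[1:], wins_needed=3, wins_so_far=0)

print(f"{overall:.1%} {after_win:.1%} {after_loss:.1%}")  # 61.8% 75.9% 38.0%
```

Because there are only a handful of games, the distribution can be worked out exactly, with no random sampling needed. Changing one game's number (a new pitching matchup, an injury) just means rerunning the same arithmetic.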

Similarly, in the last month of a playoff race, handicapping each individual game should give you substantially better accuracy than you'd get from a Monte Carlo simulation that treats every team's strength as fixed. If your team is playing the Yankees in the final week but Joe Torre is resting all his regulars for the playoffs, you shouldn't be rated as a tremendous underdog. If the schedule lines up so that the Padres can pitch Jake Peavy twice in their final five games, this is a big advantage for them.

My goal is to integrate these considerations. Since they're most useful in a simple and predictable format like the playoffs, there's no time like the present to debut them.
