Oct 14, 2006

Oct 15, 2006

*
Which team had the best offense
in the history of baseball?
*

- A simple statistical measure
- Mod A: Excluding the top team itself
- Mod B: Smoothing team statistics over several years

There are many different ways to rate the offensive
performance of baseball teams.
A simple way is to count the number of runs scored
over the course of the season.
If we try to compare teams over long periods of time,
we need to normalize by the number of games played,
since the season increased in length from 154
to 162 games in 1961.
This leads us to compare teams by the
**number of Runs scored Per Game**,
or **RPG** for short.
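As a small illustration of why the normalization matters, here is a Python sketch. The function name `runs_per_game` and the 800-run team are my own inventions for this example, not part of the data set.

```python
def runs_per_game(runs, games):
    """Return runs scored per game (RPG) for one team-season."""
    if games <= 0:
        raise ValueError("games must be a positive number")
    return runs / games


# A hypothetical team scoring 800 runs looks better over a 154-game
# schedule than over a 162-game one, which is why we normalize.
print(f"154-game season: {runs_per_game(800, 154):.3f} RPG")
print(f"162-game season: {runs_per_game(800, 162):.3f} RPG")
```

The same run total is worth about a quarter of a run per game less in the longer season.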

So, which teams scored the largest number of runs per game? I'll consider only statistics from the "modern era" of baseball history, starting in 1901 and continuing to 2006. I will also restrict this study to the American League. Here are the top 10 teams by RPG:

| Year | Team | Runs | RPG |
|------|------|-----:|------:|
| 1930 | NYY | 1062 | 6.900 |
| 1931 | NYY | 1067 | 6.880 |
| 1936 | NYY | 1065 | 6.870 |
| 1950 | BOS | 1027 | 6.670 |
| 1932 | NYY | 1002 | 6.420 |
| 1932 | PHA | 981 | 6.370 |
| 1939 | NYY | 967 | 6.360 |
| 1927 | NYY | 975 | 6.290 |
| 1937 | NYY | 979 | 6.240 |
| 1999 | CLE | 1009 | 6.230 |

Does this mean that the 1930 Yankees were the most outstanding offense ever? Perhaps ... but perhaps not. Notice that 7 out of the top 10 teams played in the nineteen-thirties. This is not a coincidence. Over the decades, various aspects of the game have changed: new ballparks are built, different manufacturers supply baseballs, the height of the mound decreases, etc. Let's take a look at the average number of runs scored per game over the entire period of this study; you should see some clear trends.

The squares show the average number of runs scored per game for each year. Several regimes are obvious:

- the dead ball era, from 1903 to 1919
- the "live ball" times which followed, in the 1920s and 1930s
- the dip in scoring during World War II
- another dip in the late 1960s and early 1970s
- the current period of high-scoring games, which started in 1994

The errorbars attached to each square show the standard deviation from the mean value for each year; in other words, they provide a rough measure of the scatter in scoring for that year. If all the teams in some year had very similar values of RPG, then the errorbars will be small; for example, in 1960, the standard deviation was just 0.300 runs per game. If the spread in team scoring was large in some year (as in 1948, when the standard deviation was 0.822 runs per game), then the errorbars will be large.
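Here is a rough sketch of how the yearly mean and scatter could be computed. The RPG values below are made up for two imaginary 8-team seasons, and I'm assuming the sample standard deviation (`statistics.stdev`, with n-1 in the denominator); the population form would give slightly smaller numbers.

```python
from statistics import mean, stdev

# Made-up RPG values for an 8-team league in two imaginary seasons.
rpg_by_season = {
    "tight season": [4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.5, 4.6],
    "spread-out season": [3.4, 3.8, 4.1, 4.3, 4.6, 4.9, 5.3, 5.8],
}


def league_summary(team_rpgs):
    """Return (mean, sample standard deviation) of the team RPG values."""
    return mean(team_rpgs), stdev(team_rpgs)


for label, values in rpg_by_season.items():
    avg, sd = league_summary(values)
    print(f"{label}: mean = {avg:.3f}, stdev = {sd:.3f}")
```

The season in which the teams are bunched together yields the smaller standard deviation, and so the smaller errorbar.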

It is clear that
simply computing RPG
won't tell us which teams
are the most outstanding.
A team which scored
5 runs per game in the nineteen-thirties
was just average,
while a team which scored 5 runs per game
in the nineteen sixties would have
been terrific.
One way to account for the changing
conditions and put all teams
on an even footing is to
compute a statistic which
takes the number of runs per game
above the average for that year,
and normalizes it by the standard
deviation for that year.
In mathematical terms,
we can compute a statistic I'll call
**D**:

$$
D = \frac{(\text{team RPG}) - (\text{league average RPG})}{(\text{league standard deviation})}
$$

For example, consider the 1975 Boston Red Sox.
They scored 796 runs over 160 games,
for an average of 4.980 RPG.
The league average for 1975 was 4.300 runs per game,
so the Red Sox were 0.680 runs above the average.
The standard deviation from the mean that year
was 0.362 runs per game.
Thus, the 1975 Boston Red Sox had a **D** statistic of

$$
D = \frac{(4.980\ \text{RPG}) - (4.300\ \text{RPG})}{(0.362\ \text{RPG})} = 1.878
$$

A statistician might say that the 1975 Red Sox were 1.878 normal deviates above the mean number of runs scored per game.
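The same arithmetic as a short Python sketch. The function name `d_statistic` is just a label I've chosen for illustration; the inputs are the 1975 Red Sox figures quoted above.

```python
def d_statistic(team_rpg, league_mean, league_stdev):
    """Normalized deviate: how many standard deviations a team's RPG
    sits above (or below) the league average for that year."""
    if league_stdev <= 0:
        raise ValueError("league_stdev must be positive")
    return (team_rpg - league_mean) / league_stdev


# 1975 Boston Red Sox: 4.980 RPG, league mean 4.300, league stdev 0.362
d_1975_bos = d_statistic(4.980, 4.300, 0.362)
print(f"D = {d_1975_bos:.3f}")  # D = 1.878
```

A negative value would simply mean the team scored less than the league average that year.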

By computing the **D** statistic for each team,
we can compare teams from different eras fairly.
In essence, we are asking
"How much did this team stand out above its peers?"

So, let's look at the top 20 teams in American League history according to this normalized statistic. First, I'll show them graphically, by plotting each team's location in the diagram you've seen before.

And now the list:

| Year | Team | Runs | Team RPG | League avg RPG | League stdev | D |
|------|------|-----:|---------:|---------------:|-------------:|------:|
| 1914 | PHA | 749 | 4.740 | 3.652 | 0.507 | 2.146 |
| 2006 | NYY | 930 | 5.740 | 4.964 | 0.373 | 2.080 |
| 1982 | MIL | 891 | 5.470 | 4.478 | 0.486 | 2.041 |
| 1965 | MIN | 774 | 4.780 | 3.943 | 0.413 | 2.027 |
| 1913 | PHA | 794 | 5.190 | 3.930 | 0.626 | 2.013 |
| 2005 | BOS | 910 | 5.620 | 4.758 | 0.435 | 1.982 |
| 1984 | DET | 829 | 5.120 | 4.420 | 0.354 | 1.977 |
| 1966 | BAL | 755 | 4.720 | 3.892 | 0.426 | 1.944 |
| 1993 | DET | 899 | 5.550 | 4.706 | 0.435 | 1.940 |
| 1935 | DET | 919 | 6.050 | 5.091 | 0.497 | 1.930 |
| 1968 | DET | 671 | 4.090 | 3.405 | 0.358 | 1.913 |
| 1999 | CLE | 1009 | 6.230 | 5.176 | 0.552 | 1.909 |
| 2003 | BOS | 961 | 5.930 | 4.859 | 0.567 | 1.889 |
| 1987 | DET | 896 | 5.530 | 4.899 | 0.335 | 1.884 |
| 1908 | DET | 647 | 4.200 | 3.445 | 0.401 | 1.883 |
| 1975 | BOS | 796 | 4.980 | 4.300 | 0.362 | 1.878 |
| 1963 | MIN | 767 | 4.760 | 4.085 | 0.362 | 1.865 |
| 1934 | DET | 958 | 6.220 | 5.126 | 0.589 | 1.857 |
| 1985 | NYY | 839 | 5.210 | 4.557 | 0.353 | 1.850 |
| 1931 | NYY | 1067 | 6.880 | 5.139 | 0.941 | 1.850 |

The 1931 New York Yankees fall from second (in raw runs scored per game) to twentieth (in normalized runs scored per game above league average). There is a nice mix of teams from different periods: 1 team from the 1900s, 2 from the 1910s, 3 from the 1930s, 4 from the 1960s, 1 from the 1970s, 4 from the 1980s, 2 from the 1990s, and 3 so far from the 2000s.

The "most outstanding offensive season" by this measure
belongs to the 1914 Philadelphia Athletics.
*
"Who were they?"
*
you might ask.
I admit I didn't know myself.
Take a peek at the starting position players
(thanks to
baseball-reference.com):

| Pos | Player | Age | G | AB | BA | OBP | SLG | OPS+ |
|-----|--------|----:|----:|----:|-----:|-----:|-----:|-----:|
| C | #Wally Schang | 24 | 107 | 307 | .287 | .371 | .404 | 137 |
| 1B | Stuffy McInnis | 23 | 149 | 576 | .314 | .341 | .368 | 117 |
| 2B | \*Eddie Collins | 27 | 152 | 526 | .344 | .452 | .452 | 176 |
| 3B | \*Frank Baker | 28 | 150 | 570 | .319 | .380 | .442 | 151 |
| SS | Jack Barry | 27 | 140 | 467 | .242 | .324 | .268 | 81 |
| OF | \*Eddie Murphy | 22 | 148 | 573 | .272 | .379 | .340 | 120 |
| OF | \*Amos Strunk | 25 | 122 | 404 | .275 | .364 | .342 | 116 |
| OF | Rube Oldring | 30 | 119 | 466 | .277 | .308 | .371 | 108 |

A second baseman who led his team in slugging? Nice.
Eddie Collins had a great year, and made a heck of a
1-2 punch with Frank "Home Run" Baker.
Of course, even Collins' stats that year pale next to those
of Ty Cobb, who led the league
in **both** slugging (0.513) and on-base percentage (0.466).
Despite their all-time offense,
the 1914 Athletics lost the World Series in four straight
games to the Boston Braves.

The 1913 version of this Athletics team, by the way,
was also very strong: they appear as number 5 in the list.
They were mostly the same players,
with Jack Lapp instead of Wally Schang behind the plate
and Jimmy Walsh getting slightly more at-bats than Amos Strunk.
The 1913 Athletics **did** win it all,
beating the New York Giants in 5 games.

From my own point of view as a Red Sox fan of the present age, I find it interesting to note

- the Sox teams of 2005 (sixth) and 2003 (thirteenth) make the top twenty "most outstanding" offenses, but the World Series winners of 2004 don't; they came in twenty-second place with **D = 1.812**.
- the 2006 New York Yankees may really have been as strong a lineup as many have claimed. I thought it was mostly hype, but they put together the second "most outstanding" season of all time.

Is this the best way to rate all-time offenses? Well, probably not, but it seems to me at least a reasonable one. I have an idea for another scheme, but that will have to wait for another week.

Vermonter at Large and Rice4HOF pointed out a flaw
in my analysis:
by including the top team each year in the calculation
of mean runs per game and standard deviation thereof,
we penalize truly outstanding teams to some extent.
Why?
If one team scores far more runs than all the others,
then its performance will increase both the
average and standard deviation significantly;
then, when we subtract that average from the team's
own number of runs per game, and divide the difference
by the standard deviation, we diminish the
value of **D** by which we are ranking teams.
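A toy example makes the effect easy to see. The league below is entirely made up (seven ordinary offenses plus one runaway leader), and I'm again assuming the sample standard deviation.

```python
from statistics import mean, stdev

# Made-up 8-team league: seven ordinary offenses plus one runaway leader.
others = [3.2, 3.3, 3.4, 3.5, 3.5, 3.6, 3.7]
leader = 4.8
everyone = others + [leader]


def deviate(value, sample):
    """Normalized deviate of `value` relative to `sample`."""
    return (value - mean(sample)) / stdev(sample)


for label, sample in (("with leader", everyone), ("without leader", others)):
    print(
        f"{label:>14}: mean = {mean(sample):.3f}, "
        f"stdev = {stdev(sample):.3f}, "
        f"leader's deviate = {deviate(leader, sample):.3f}"
    )
```

Including the leader raises both the mean and the standard deviation, so the leader's own normalized deviate comes out much smaller than it does when the leader is left out of the statistics.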

Note that this effect will be larger when there are fewer teams: removing one team from the statistical calculations makes a bigger difference. Thus, using this small modification will bias the results towards teams which played in the "old days":

- 1901-1960: 8 teams in the AL
- 1961-1968: 10 teams
- 1969-1976: 12 teams
- 1977-2006: 14 teams

With this caveat in mind, let us repeat the calculations,
but exclude the top team each year when computing
the mean and standard deviation of runs per game.
I'll call this modified statistic **D'**.

$$
D' = \frac{(\text{team RPG}) - (\text{all-but-top-team league average RPG})}{(\text{all-but-top-team league standard deviation})}
$$
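One way this calculation might be coded for a single season is sketched below. The function name `d_prime`, the team names, and the RPG values are all hypothetical, and I'm assuming the sample standard deviation; this is not the program used to build the table.

```python
from statistics import mean, stdev


def d_prime(season_rpg):
    """Return {team: D'} for one season.

    The league mean and standard deviation are computed with the
    top-scoring team left out, then applied to every team.
    If two teams tie for the top spot, only the first one is dropped.
    """
    top_team = max(season_rpg, key=season_rpg.get)
    rest = [rpg for team, rpg in season_rpg.items() if team != top_team]
    avg, sd = mean(rest), stdev(rest)
    return {team: (rpg - avg) / sd for team, rpg in season_rpg.items()}


# Made-up season for an 8-team league (not real data).
season = {
    "Aces": 4.8, "Bears": 3.7, "Colts": 3.6, "Ducks": 3.5,
    "Eagles": 3.5, "Foxes": 3.4, "Gulls": 3.3, "Hawks": 3.2,
}
scores = d_prime(season)
best = max(scores, key=scores.get)
print(f"{best}: D' = {scores[best]:.3f}")
```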

Here are the top 20 teams ranked by **D'**:

| Year | Team | Runs | Team RPG | League avg RPG | League stdev | D' | New rank | Old rank |
|------|------|-----:|---------:|---------------:|-------------:|------:|---------:|---------:|
| 1914 | PHA | 749 | 4.740 | 3.497 | 0.273 | 4.553 | 1 | 1 |
| 1913 | PHA | 794 | 5.190 | 3.750 | 0.393 | 3.664 | 2 | 5 |
| 1935 | DET | 919 | 6.050 | 4.954 | 0.337 | 3.252 | 3 | 10 |
| 1908 | DET | 647 | 4.200 | 3.337 | 0.280 | 3.082 | 4 | 15 |
| 1965 | MIN | 774 | 4.780 | 3.850 | 0.307 | 3.029 | 5 | 4 |
| 1934 | DET | 958 | 6.220 | 4.970 | 0.421 | 2.969 | 6 | 18 |
| 1931 | NYY | 1067 | 6.880 | 4.890 | 0.674 | 2.953 | 7 | 20 |
| 1907 | DET | 694 | 4.540 | 3.534 | 0.355 | 2.834 | 8 | 21 |
| 1966 | BAL | 755 | 4.720 | 3.800 | 0.330 | 2.788 | 9 | 8 |
| 1950 | BOS | 1027 | 6.670 | 4.811 | 0.674 | 2.758 | 10 | 25 |
| 1939 | NYY | 967 | 6.360 | 5.047 | 0.480 | 2.735 | 11 | 26 |
| 1968 | DET | 671 | 4.090 | 3.329 | 0.281 | 2.708 | 12 | 11 |
| 2006 | NYY | 930 | 5.740 | 4.905 | 0.311 | 2.685 | 13 | 2 |
| 1927 | NYY | 975 | 6.290 | 4.726 | 0.584 | 2.678 | 14 | 29 |
| 1982 | MIL | 891 | 5.470 | 4.402 | 0.409 | 2.611 | 15 | 3 |
| 1963 | MIN | 767 | 4.760 | 4.010 | 0.289 | 2.595 | 16 | 17 |
| 1946 | BOS | 792 | 5.080 | 3.909 | 0.453 | 2.585 | 17 | 30 |
| 1949 | BOS | 896 | 5.780 | 4.513 | 0.505 | 2.509 | 18 | 32 |
| 2005 | BOS | 910 | 5.620 | 4.692 | 0.372 | 2.495 | 19 | 6 |
| 1936 | NYY | 1065 | 6.870 | 5.500 | 0.549 | 2.495 | 20 | 34 |

As expected, the modified statistic favors teams from small leagues. Note that the top team remains the same: the 1914 Philadelphia Athletics. The 1913 edition of the team now ranks second. The 2006 Yankees, on the other hand, fall from second to thirteenth.

Rice4HOF pointed out that the method I've chosen suffers from another statistical shortcoming: the number of teams is so small -- even in the current 14-team era -- that the standard deviation from the mean is not robustly determined. He suggested grouping together statistics from all the years within a particular period before computing statistics. I will take his suggestion and run with it a bit later ...

Several people have suggested that one can't extract reliable
statistics on the mean and standard deviation of a distribution
with only 7 or 8 samples;
they have suggested that I group together a number of consecutive
years worth of team statistics to compute the mean and standard
deviation of runs scored per game,
and **then**
compute the normalized deviate.
Good idea!
Let's try it.

I'll choose boxcar smoothing (which gives equal weight to all values within a fixed span of years) and a full width of 5 years. Thus, for example, when evaluating the performance of the 1938 Yankees, I'll compute the mean and standard deviation of runs scored using all American League teams from 1936, 1937, 1938, 1939, and 1940. Note that teams at each end of our period of study, 1901-2006, will be compared to a smaller group; we don't yet know how many runs teams will score in 2007.

I'll refer to this statistic as **D<sub>smooth</sub>**.

$$
D_{\text{smooth}} = \frac{(\text{team RPG}) - (\text{5-year-boxcar average RPG})}{(\text{5-year-boxcar standard deviation})}
$$
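Here is a minimal sketch of the boxcar calculation, pooling every team from every year in the window before taking the mean and standard deviation. The names `boxcar_stats` and `d_smooth` are hypothetical, the RPG values are made up for a 4-team league (they are NOT real American League data), and I'm assuming the sample standard deviation.

```python
from statistics import mean, stdev


def boxcar_stats(rpg_by_year, year, full_width=5):
    """Pool all team RPG values in the window centred on `year` and
    return (mean, sample standard deviation) of the pooled values.

    Years missing from `rpg_by_year` are skipped, so seasons near either
    end of the data are compared to a smaller group. `stdev` raises
    StatisticsError if fewer than two values end up in the window.
    """
    half = full_width // 2
    pooled = []
    for y in range(year - half, year + half + 1):
        pooled.extend(rpg_by_year.get(y, []))
    return mean(pooled), stdev(pooled)


def d_smooth(team_rpg, rpg_by_year, year, full_width=5):
    """Normalized deviate of one team against the boxcar statistics."""
    avg, sd = boxcar_stats(rpg_by_year, year, full_width)
    return (team_rpg - avg) / sd


# Made-up 4-team league, NOT real data: scoring jumps in the last year.
toy_league = {
    1936: [4.0, 4.5, 5.0, 5.5],
    1937: [4.0, 4.5, 5.0, 5.5],
    1938: [4.0, 4.5, 5.0, 5.5],
    1939: [4.0, 4.5, 5.0, 5.5],
    1940: [5.0, 5.5, 6.0, 6.5],
}

avg_1938, sd_1938 = boxcar_stats(toy_league, 1938)  # uses 1936-1940
avg_1936, sd_1936 = boxcar_stats(toy_league, 1936)  # only 1936-1938 exist
print(f"1938 window: mean = {avg_1938:.3f}, stdev = {sd_1938:.3f}")
print(f"1936 window: mean = {avg_1936:.3f}, stdev = {sd_1936:.3f}")
print(f"D_smooth for a 5.5 RPG team in 1938: {d_smooth(5.5, toy_league, 1938):.3f}")
```

Pooling the teams, rather than averaging the five yearly means, follows the description above and gives the standard deviation many more samples to work with.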

First, let's look at the graph of runs scored per game, using all teams. I show the mean and stdev computed using individual years in red, and using 5-year groups in green.

**Warning Will Robinson!**
Note that this method of computing statistics will
give improper weights to teams during years when the
overall face of baseball was changing rapidly.
For example, look at the brief peak in runs scored
for 1911 and 1912.
When we smooth over 5 years, the sudden jump of
one full run per game from 1910 to 1911 is moderated;
we will be comparing each team's performance in 1911
to an artificially lower average performance.
Instead of subtracting 4.607 runs per game
(the average in 1911 alone)
from each team's performance,
we will subtract only 4.013 runs per game
(the average over the period 1909 to 1913).
This will favor teams during rising periods of scoring,
and punish teams during falling periods of scoring.
Keep this in mind...
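
To put rough numbers on this, take the 1911 Athletics, who scored 5.660 runs per game. The two choices of baseline give quite different numerators for the normalized deviate:

$$
5.660 - 4.607 = 1.053 \qquad \text{versus} \qquad 5.660 - 4.013 = 1.647
$$

The smoothed baseline hands them an extra 0.594 runs per game before we even divide by the standard deviation, which also changes when we smooth.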

Here are the top 20 teams ranked by
**D<sub>smooth</sub>**:

| Year | Team | Runs | Team RPG | 5-year avg RPG | 5-year stdev | D<sub>smooth</sub> |
|------|------|-----:|---------:|---------------:|-------------:|------:|
| 1950 | BOS | 1027 | 6.670 | 4.650 | 0.702 | 2.877 |
| 1936 | NYY | 1065 | 6.870 | 5.296 | 0.606 | 2.597 |
| 1987 | DET | 896 | 5.530 | 4.543 | 0.410 | 2.407 |
| 1915 | DET | 778 | 4.990 | 3.776 | 0.509 | 2.385 |
| 1982 | MIL | 891 | 5.470 | 4.392 | 0.464 | 2.323 |
| 1999 | CLE | 1009 | 6.230 | 5.055 | 0.524 | 2.242 |
| 1911 | PHA | 861 | 5.660 | 4.013 | 0.741 | 2.223 |
| 1907 | DET | 694 | 4.540 | 3.579 | 0.446 | 2.155 |
| 1939 | NYY | 967 | 6.360 | 5.103 | 0.588 | 2.138 |
| 1956 | NYY | 857 | 5.560 | 4.338 | 0.574 | 2.129 |
| 2003 | BOS | 961 | 5.930 | 4.860 | 0.504 | 2.123 |
| 1996 | SEA | 993 | 6.170 | 5.124 | 0.494 | 2.117 |
| 1930 | NYY | 1062 | 6.900 | 5.110 | 0.847 | 2.113 |
| 1921 | NYY | 948 | 6.200 | 4.700 | 0.710 | 2.113 |
| 1931 | NYY | 1067 | 6.880 | 5.157 | 0.839 | 2.054 |
| 1961 | DET | 841 | 5.160 | 4.358 | 0.393 | 2.041 |
| 1965 | MIN | 774 | 4.780 | 3.935 | 0.415 | 2.036 |
| 2004 | BOS | 949 | 5.860 | 4.881 | 0.481 | 2.035 |
| 1966 | BAL | 755 | 4.720 | 3.799 | 0.453 | 2.033 |
| 1927 | NYY | 975 | 6.290 | 4.923 | 0.673 | 2.031 |

Wow. This is quite a change from our previous
lists.
The 1913 Philadelphia Athletics drop to number 47 overall,
and the 1914 Athletics rank 97th!
The 1911 edition of this dynasty now comes
to the fore, reaching number 7 in **D<sub>smooth</sub>**,
but in large part due to the overall jump
in scoring from 1910 to 1911.