We can be reasonably sure that, ON AVERAGE, it exists. With a
sample size of 1,450, all three methods I use to calculate the 95%
CI put the lower bound at 51.7%, clearly above 50.0%.
Do I wish I had a figure to plug in for each individual player?
Of course. But that data simply is not available, and it would
take recording every point played for months, maybe years (as
one would have to figure in the relative strength of the opponent)
to make even a crude stab at it.
While we are at it, I don't think it is intuitively obvious that
a .450 player should win 45% of his points against an "average"
.500 player. The .500 player could have gotten that percentage
by winning every point against players .499 and below, and losing
every point to .501 players and above. Same sort of logic for
the .450 player -- he always wins points against (even slight)
inferiors, and loses against (even slight) superiors -- only there
are more of the latter.
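To make that extreme case concrete, here is a rough Python sketch. It assumes (purely for illustration) that opponents' skill levels are spread evenly between 0 and 1 and that the stronger player takes every single point. The function names and the even spread are hypothetical, not anything drawn from the data set.

```python
def overall_win_fraction(skill, opponents):
    """Share of points won if the stronger player wins every point."""
    wins = sum(1 for opp in opponents if opp < skill)
    return wins / len(opponents)


def head_to_head(skill_a, skill_b):
    """Point-win chance for A against B under the same all-or-nothing rule."""
    if skill_a == skill_b:
        return 0.5
    return 1.0 if skill_a > skill_b else 0.0


# Assumption: 1,000 opponents with evenly spaced skill levels in (0, 1).
opponents = [(i + 0.5) / 1000 for i in range(1000)]

print(overall_win_fraction(0.450, opponents))  # overall record of the weaker player
print(overall_win_fraction(0.500, opponents))  # overall record of the "average" player
print(head_to_head(0.450, 0.500))              # weaker player's chance head to head
```

Under these assumptions both players post exactly their .450 and .500 overall records, yet the .450 player never wins a point against the .500 player. Real play is nowhere near this lopsided, but it shows that an overall percentage does not by itself pin down the head-to-head figure.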
I have read the By the Numbers material on Log5 -- at least the
article you cited. More and more I lean toward using alpha
in a Skiena probability formulation, because I think it varies
between singles and doubles games -- lower for singles, and higher
for doubles. Once again, I use the tennis analogy.
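For comparison, the two formulations can be sketched side by side in Python. This assumes the Skiena formulation meant here is the one from his jai alai modeling, where the gap between the two players' point-win percentages is raised to the power alpha. The alpha values below are illustrative only and are not fitted to any data.

```python
def log5(p_a, p_b):
    """Log5 estimate of A's chance of winning a point against B."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def skiena(p_a, p_b, alpha):
    """Assumed Skiena-style form: (1 +/- |p_a - p_b|**alpha) / 2."""
    gap = abs(p_a - p_b) ** alpha
    return (1 + gap) / 2 if p_a >= p_b else (1 - gap) / 2


# A .450 player against an "average" .500 player.
print(log5(0.450, 0.500))
for alpha in (0.4, 0.7, 1.0):  # illustrative values, not estimates
    print(alpha, skiena(0.450, 0.500, alpha))
```

Log5 returns 45% against a .500 opponent by construction. In the assumed Skiena form, a smaller alpha magnifies a given gap in percentages, so the weaker player's chance drops further below 50% as alpha falls.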
There are three questions I am attempting to answer with my data
set:
What is the magnitude of the average serve advantage for singles
and doubles?
What is the optimum method to calculate point-win for singles
and doubles teams?
What is the optimum alpha to use to calculate point wins in doubles
and singles games?
Finally, for the record, the data set has 3,506 points in doubles
games. The receiving teams won 1,893, or 54.0% of them.
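For anyone who wants to check the doubles figure, here is a minimal Python sketch of three common 95% interval calculations for a proportion (Wald, Wilson, and Agresti-Coull). These are assumed stand-ins chosen for illustration and are not necessarily the three methods used for the 1,450-point figure above.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_ci(wins, n, z=Z95):
    """Plain normal-approximation interval."""
    p = wins / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_ci(wins, n, z=Z95):
    """Wilson score interval."""
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def agresti_coull_ci(wins, n, z=Z95):
    """Agresti-Coull interval (adds z^2/2 pseudo-wins and pseudo-losses)."""
    n_adj = n + z * z
    p_adj = (wins + z * z / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half


# Doubles data from the post: receivers won 1,893 of 3,506 points.
for name, fn in (("Wald", wald_ci), ("Wilson", wilson_ci),
                 ("Agresti-Coull", agresti_coull_ci)):
    lo, hi = fn(1893, 3506)
    print(f"{name}: {lo:.3f} to {hi:.3f}")
```

With 3,506 points the three intervals nearly coincide, and all three lower bounds sit above 50%.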
-- S.t.