
21 - 40 of 111 Posts

·
Registered
Joined
·
1,463 Posts
FiveThirtyEight:

http://fivethirtyeight.com/features/serena-williams-and-the-difference-between-all-time-great-and-greatest-of-all-time/

In this case, there were two main choices to make: First, what level of granularity to cover. That is, do we treat each game in a tennis match as its own “match” — thus increasing our sample size, but measuring something very different from overall wins? Or do we treat a match win as a win — regardless of whether it’s 6-0, 6-0; or 6-4, 3-6, 6-7(7), 7-6(3), 70-68? Or do we go set by set? This tradeoff always exists, even in other settings like football or baseball, where margin of victory and run differential are commonly used. The question is always what we gain in prediction strength by using a less accurate but more detailed metric than wins. For this system, we tested and optimized all three, and found that any predictive gains from using sets or games instead of matches were extremely small. While those versions of our system may yet prove to have their uses, we’ve stuck with a match-based system for most of this analysis.
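To make the granularity question concrete, here is a rough sketch (mine, not from the article) of how one result turns into a different number of "observations" at match, set, and game level. The function name `outcomes_by_granularity` is made up for illustration, and it assumes the score is written from the winner's side with tiebreak points given as a single number in parentheses.

```python
import re

def outcomes_by_granularity(score: str) -> dict:
    """Tally (won, lost) for the match winner at match, set and game level."""
    # Each "a-b" pair is one set; tiebreak points in parentheses such as "(7)" are skipped
    set_scores = [(int(a), int(b)) for a, b in re.findall(r"(\d+)-(\d+)", score)]
    sets_won = sum(1 for a, b in set_scores if a > b)
    sets_lost = sum(1 for a, b in set_scores if a < b)
    games_won = sum(a for a, _ in set_scores)
    games_lost = sum(b for _, b in set_scores)
    return {
        "match": (1, 0),  # match-based system: one win, nothing else
        "sets": (sets_won, sets_lost),
        "games": (games_won, games_lost),
    }

print(outcomes_by_granularity("6-4, 3-6, 6-7(7), 7-6(3), 70-68"))
# {'match': (1, 0), 'sets': (3, 2), 'games': (92, 91)}
```

So the marathon example from the article is a clean 1-0 at match level but nearly a coin flip (92-91) at game level, which is the "measuring something very different" point.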

The second choice is how to update ratings after a match. All Elo systems take the difference between the number of wins a player earned and the number of wins expected, and then do something with that number to determine the appropriate adjustment to the player’s rating. The crudest thing to do is to multiply that difference by some constant number, K, where K is chosen empirically to match the context. In chess, a common K for new players is 40, meaning that for each 10 percent above expectation a player ran, she would gain 4 points in Elo per game. For FiveThirtyEight’s NBA Elo, we used a K of 20. Chess uses a K function that depends on the number of games a player has played. This is the approach taken by other Elo variants such as Glicko and Stephenson, which use additional parameters as well.
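For anyone who hasn't seen it written out, the constant-K update described above looks roughly like this. The article doesn't give the expectation formula, so the logistic curve with a 400-point scale below is my assumption (it is the usual chess convention), and the function names are just for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for player A against player B.

    Assumption: the standard chess-style logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_constant_k(rating: float, expected: float, actual: float, k: float = 40.0) -> float:
    """Move the rating by K times (actual wins minus expected wins)."""
    return rating + k * (actual - expected)

# The article's chess example: running 10 percent above expectation with K = 40
new_rating = update_constant_k(1500.0, expected=0.5, actual=0.6, k=40.0)
print(round(new_rating - 1500.0, 2))  # 4.0 points gained
```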

We tried and tested all these methods extensively but ultimately settled on our own variant (which substantially outperformed the alternatives) in which the multiplier is determined by a function in the form K/((games in player’s dataset+offset)^shape). K is a constant multiplier much like that used by other systems, offset is a small adjustment to keep new players from shooting up or down too much, and shape tells us what shape the curve should take (essentially, the larger the number, the more stable the ratings for players with lots of games). With this structure in place, we simply had to test to see which parameters performed best over our data set. The values we settled on are a K of 250, offset of 5, and shape of 0.4.
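Here is how I read that multiplier, as a minimal sketch. The K, offset, and shape values are the ones quoted above; everything else is my assumption: that "games" means matches played so far, that expectation uses the usual 400-point logistic curve, and the function names themselves.

```python
def k_factor(matches_played: int, k: float = 250.0, offset: float = 5.0, shape: float = 0.4) -> float:
    """Multiplier of the form K / ((matches played + offset) ** shape)."""
    return k / ((matches_played + offset) ** shape)

def expected_score(rating_a: float, rating_b: float) -> float:
    """Assumed logistic expectation with a 400-point scale (not stated in the article)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_rating(rating: float, opponent: float, won: bool, matches_played: int) -> float:
    """Apply one match result using the decaying multiplier."""
    actual = 1.0 if won else 0.0
    return rating + k_factor(matches_played) * (actual - expected_score(rating, opponent))

# The multiplier shrinks as a player accumulates matches
for n in (0, 10, 100, 500):
    print(n, round(k_factor(n), 1))
```

With these parameters a brand-new player moves by roughly 130 times the surprise in her result, while a player with several hundred matches moves by closer to 20, so veterans' ratings are much more stable.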

This is an empirical approximation of what this adjustment curve should look like, and can likely be improved with a more accurate function or by incorporating more inputs than games and performance, but we tried to keep it as simple as possible to avoid the potential for overfitting. (The data we used is parsed from that on Jeff Sackmann’s GitHub page, totalling more than 250,000 matches at the tour level).
Thoughts? Would love to see this thread updated.
 

·
Registered
Joined
·
18,425 Posts
Is there a historical database or Elo ratings for the WTA? Like back to the '90s?
 

·
Registered
Joined
·
2,901 Posts
Discussion Starter #38
Is there a historical database or Elo ratings for the WTA? Like back to the '90s?
I have rankings for the last 1200 weeks or so

Top 30 - January 11, 1993



Curves for Kournikova & Dementieva



Some Links: Klick Klick
 

·
Registered
Joined
·
18,425 Posts
Curves for Kournikova & Dementieva



Klick
Very cool. Could I get this chart for Hingis and Dokic, with the corresponding data for 1999?

I used to play chess competitively so I'm quite familiar with Elo but I didn't realize the data was readily available for tennis... makes sense though now that I think about it.

Many thanks in advance.

Cheers,
 

·
Registered
Joined
·
2,901 Posts
Discussion Starter #40
Very cool. Could I get this chart for Hingis and Dokic, with the corresponding data for 1999?

I used to play chess competitively so I'm quite familiar with Elo but I didn't realize the data was readily available for tennis... makes sense though now that I think about it.

Many thanks in advance.

Cheers,


Code:
Date		MH (Hingis)	JD (Dokic)

04.01.1999	2292	1536
11.01.1999	2292	1536
18.01.1999	2287	1536
25.01.1999	2287	1536
01.02.1999	2313	1567
08.02.1999	2340	1567
15.02.1999	2340	1567
22.02.1999	2340	1573
01.03.1999	2317	1558
08.03.1999	2317	1558
15.03.1999	2299	1558
22.03.1999	2299	1558
29.03.1999	2287	1558
05.04.1999	2309	1558
12.04.1999	2309	1557
19.04.1999	2309	1557
26.04.1999	2309	1591
03.05.1999	2309	1591
10.05.1999	2303	1598
17.05.1999	2319	1598
24.05.1999	2319	1598
31.05.1999	2319	1598
07.06.1999	2316	1582
14.06.1999	2316	1597
21.06.1999	2316	1597
28.06.1999	2316	1597
05.07.1999	2291	1692
12.07.1999	2291	1692
19.07.1999	2291	1692
26.07.1999	2291	1692
02.08.1999	2291	1692
09.08.1999	2312	1692
16.08.1999	2298	1692
23.08.1999	2320	1705
30.08.1999	2320	1705
06.09.1999	2320	1705
13.09.1999	2322	1701
20.09.1999	2322	1701
27.09.1999	2322	1698
04.10.1999	2311	1698
11.10.1999	2324	1698
18.10.1999	2315	1698
25.10.1999	2315	1698
01.11.1999	2315	1691
08.11.1999	2315	1679
 