Who is (probably) today's best male tennis player?
- Written by Christopher Drovandi, Professor of Statistics, Queensland University of Technology
When you ask that question, three names come to mind: Roger Federer, Rafael Nadal and Novak Djokovic.
A simple way to compare tennis players is to look at how many grand slam tournaments they have won. That includes victories at the Australian Open, the French Open, Wimbledon in the UK and the US Open.
But this doesn’t take into account how many tournaments they’ve played, which tournaments they’ve played, how far they progressed in each tournament, and who they played against.
Probably the best player
My method estimates the probability of a player winning a match in a grand slam tournament. The player with the highest estimated probability of winning a match is then deemed the best player.
Using probability naturally accommodates how many matches and tournaments the player has played, and acknowledges the strong performance of a player who makes a final but doesn’t win the tournament.
The method builds a statistical model to estimate winning probabilities for each player from grand slam data.
By using a technique called regression modelling, it accounts for the fact that the winning probability may depend on the quality of the opposition and on the grand slam being played. For example, some players have a preference for hard courts (used at the Australian and US Opens) over clay (used at Roland Garros, home of the French Open).
The opposition quality is inferred from their ranking, and we consider five groups: the top 10, top 20, top 50, top 100 and outside the top 100. These group choices are consistent with terminology used by commentators and pundits.
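To make the idea concrete, here is a minimal sketch of how such a model could turn a player's strength, the opposition group and the tournament into a winning probability. It assumes a logistic form for the regression and non-overlapping ranking groups (1 to 10, 11 to 20, and so on); the article specifies neither. The function names and any effect values passed in are hypothetical, chosen only for illustration, and are not estimates from the actual analysis.

```python
import math

# Assumed non-overlapping reading of the five ranking groups:
# each entry is (highest rank in the group, label).
GROUPS = [(10, "top 10"), (20, "top 20"), (50, "top 50"), (100, "top 100")]


def opposition_group(rank: int) -> str:
    """Map an opponent's ranking to one of the five groups."""
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    for upper, label in GROUPS:
        if rank <= upper:
            return label
    return "outside top 100"


def win_probability(player_effect: float,
                    opposition_effect: float,
                    slam_effect: float) -> float:
    """Convert additive effects on the log-odds scale into a probability.

    All three effects are hypothetical inputs; in a fitted model they
    would be estimated from match results.
    """
    log_odds = player_effect + opposition_effect + slam_effect
    return 1.0 / (1.0 + math.exp(-log_odds))


# Illustrative call with made-up effects: a strong player facing a
# top-10 opponent at a tournament with no particular advantage.
group = opposition_group(7)
probability = win_probability(2.0, -1.0, 0.0)
```

With all effects set to zero the function returns 0.5, an even match. Positive effects push the probability towards 1, negative ones towards 0.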
Another advantage of using a statistical model is that we can make the most of the available data, which are limited because there are only four grand slam tournaments per year.
For example, if the data support it, the model can enforce a similar pattern of performance against the quality of opposition across tournaments. This is a form of “borrowing of strength” to increase the accuracy of probability estimates from small datasets.
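One way to see why sharing a pattern across tournaments helps is to count how many quantities have to be estimated per player. The sketch below compares a model with one shared opposition pattern plus a tournament adjustment against one that estimates every opposition group and tournament combination separately. It assumes a simple additive structure with one baseline level per factor, which is an illustration rather than the exact model used in the analysis.

```python
OPPOSITION_GROUPS = 5  # top 10, top 20, top 50, top 100, outside top 100
GRAND_SLAMS = 4        # Australian Open, French Open, Wimbledon, US Open


def additive_parameter_count(groups: int, slams: int) -> int:
    """Parameters when the opposition pattern is shared across tournaments.

    One intercept, plus one effect per non-baseline opposition group,
    plus one effect per non-baseline tournament.
    """
    return 1 + (groups - 1) + (slams - 1)


def separate_cell_count(groups: int, slams: int) -> int:
    """Parameters when every group and tournament pair is estimated alone."""
    return groups * slams


shared = additive_parameter_count(OPPOSITION_GROUPS, GRAND_SLAMS)   # 8
separate = separate_cell_count(OPPOSITION_GROUPS, GRAND_SLAMS)      # 20
```

Fewer parameters means each one is informed by more matches, which is where the gain in accuracy from small datasets comes from.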
Oh, the uncertainty
Using a statistical approach allows us to quantify the uncertainty in probability estimates. Here we communicate uncertainty as an interval (a lower and an upper limit) that contains the true winning probability with a 95% chance.
So, for example, if the estimated winning probability for a player is 0.77 with an interval of 0.63 to 0.86, it means that our best guess of the winning probability is 0.77. But there is a 95% chance the actual winning probability is between 0.63 and 0.86. This tells us how much uncertainty there is about our best guess.
The amount of uncertainty depends on the number of matches played and on the winning probability itself. There will naturally be more uncertainty if the actual winning probability is around 0.5, which means an even chance of winning or losing.
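Both effects can be seen in the textbook standard error of a proportion, sqrt(p(1 - p) / n), where p is the winning probability and n is the number of matches. The sketch below uses that simple formula purely as an illustration; the intervals reported here come from the statistical model, not from this calculation, and the function name is hypothetical.

```python
import math


def binomial_standard_error(p: float, n: int) -> float:
    """Standard error of an observed win proportion over n matches."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive number of matches")
    return math.sqrt(p * (1.0 - p) / n)


# Same number of matches, different winning probabilities:
even_match = binomial_standard_error(0.5, 100)
dominant = binomial_standard_error(0.9, 100)

# Same winning probability, four times as many matches:
more_matches = binomial_standard_error(0.5, 400)
```

The standard error is largest at p = 0.5 and shrinks as p moves towards 0 or 1. Quadrupling the number of matches halves it.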
The results are shown in the figures (below). Each square represents the best probability estimate for Federer, Nadal and Djokovic, and the vertical line represents the uncertainty interval.
Read more https://theconversation.com/who-is-probably-todays-best-male-tennis-player-154185