Computation of tournament performance ratings • Lichess Feedback • lichess.org

BerserkRoad

Based on a discussion with @globa9, we conjecture that the current lichess tournament performance rating is computed as follows:

Suppose a player plays against opponents with the Glicko ratings r_1, ..., r_n, where n is the number of games played by the player, and gets the points p_1, ..., p_n (each p_k can be either 0 (loss), 1/2 (draw) or 1 (win)).

Currently, lichess calculates the average rating R = (r_1+...+r_n) / n and the "percentage of points achieved" P = (p_1+...+p_n) / n.

It then defines the tournament performance rating (TPR) as TPR = R + (P - 0.5) * 1000. For example, if you win all of your games, you get 500 points plus the average opponent rating as your TPR.
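As a minimal sketch of the conjectured formula (this is not taken from the lichess source code, and the function name `linear_tpr` is made up for illustration):

```python
def linear_tpr(opponent_ratings, points):
    """Conjectured formula: average opponent rating plus (P - 0.5) * 1000."""
    n = len(opponent_ratings)
    if n == 0 or n != len(points):
        raise ValueError("need at least one game and one result per opponent")
    avg_rating = sum(opponent_ratings) / n   # R
    score_fraction = sum(points) / n         # P
    return avg_rating + (score_fraction - 0.5) * 1000


# Winning every game gives the average opponent rating plus 500.
print(linear_tpr([1500, 1600, 1700], [1, 1, 1]))  # 2100.0
```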

This approach has a number of downsides:

- The estimation error gets systematically higher when the variance of your opponents' ratings increases. For example, if you play against 10 players that are rated 2700 in a row, lose all of those games, and then win one game against a player rated 1000, your performance rating will be roughly 2136 (R is about 2545 and P = 1/11). In fact, even if you *lose all 11 games*, your performance rating would still be about 2045.
- Your TPR can drop even if you win (against a weaker opponent) and increase even if you lose (against a stronger opponent).
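Both downsides can be checked numerically. The sketch below assumes the conjectured formula TPR = R + (P - 0.5) * 1000 and uses a hypothetical helper `linear_tpr`; it is not lichess code.

```python
def linear_tpr(opponent_ratings, points):
    # Conjectured formula: R + (P - 0.5) * 1000
    n = len(opponent_ratings)
    return sum(opponent_ratings) / n + (sum(points) / n - 0.5) * 1000


# Ten losses against 2700-rated opponents.
ten_losses = linear_tpr([2700] * 10, [0] * 10)
# The same ten losses followed by a win against a 1000-rated player.
then_a_win = linear_tpr([2700] * 10 + [1000], [0] * 10 + [1])
# The same ten losses followed by a loss against the 1000-rated player.
then_a_loss = linear_tpr([2700] * 10 + [1000], [0] * 11)

print(round(ten_losses), round(then_a_win), round(then_a_loss))  # 2200 2136 2045
```

Under this formula the extra win lowers the TPR from 2200 to about 2136, and a player who loses every game still ends up above 2000.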

I thus suggest the following, different approach: Let E(x, y) = 1 / (1 + 10^((y-x)/400)), which is the expected score of a player with Elo rating x playing against another player with Elo rating y according to the Elo system. [Side note: One could also use Glicko expected scores to take rating deviation into account; however, this might be overkill.]
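For reference, a small sketch of the standard Elo expected score, with the usual scale of 400 rating points per factor of 10 in odds (the function name `expected_score` is just illustrative):

```python
def expected_score(x, y):
    """Elo expected score of a player rated x against an opponent rated y."""
    return 1.0 / (1.0 + 10.0 ** ((y - x) / 400.0))


print(expected_score(1500, 1500))            # 0.5
print(round(expected_score(1900, 1500), 3))  # 0.909
```

The value increases with x and always stays strictly between 0 and 1.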

Then the TPR should be defined as the solution x to the equation

E(x, r_1)+...+E(x, r_n) = p_1+...+p_n

In other words, the TPR x is the rating such that a player with Elo rating x would have an expected score (according to Elo) equal to the score that was actually achieved by the player in the tournament.
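One possible way to solve this equation numerically is bisection, since the left-hand side is strictly increasing in x. The sketch below is only an illustration under stated assumptions: the names `implicit_tpr` and `tol`, the search bracket of 4000 points around the opponents' ratings, and returning `None` when no finite solution exists are all my own choices, and the points are assumed to be 0, 1/2 or 1 per game.

```python
def expected_score(x, y):
    """Standard Elo expected score of a player rated x against one rated y."""
    return 1.0 / (1.0 + 10.0 ** ((y - x) / 400.0))


def implicit_tpr(opponent_ratings, points, tol=0.01):
    """Rating x whose total expected score equals the achieved score."""
    n = len(opponent_ratings)
    if n == 0 or n != len(points):
        raise ValueError("need at least one game and one result per opponent")
    total = sum(points)
    if total <= 0 or total >= n:
        return None  # all losses or all wins: no finite solution exists
    # Assumed bracket: 4000 points beyond the weakest/strongest opponent,
    # where the expected score per game is within about 1e-10 of 0 or 1.
    lo = min(opponent_ratings) - 4000.0
    hi = max(opponent_ratings) + 4000.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(expected_score(mid, r) for r in opponent_ratings) < total:
            lo = mid  # expected score too low, so the TPR is higher
        else:
            hi = mid
    return (lo + hi) / 2.0


# A single draw against a 1500-rated opponent gives a TPR of about 1500.
print(implicit_tpr([1500], [0.5]))
# Ten losses against 2700 followed by a win against 1000.
print(implicit_tpr([2700] * 10 + [1000], [0] * 10 + [1]))
```

For the earlier example of ten losses against 2700 and one win against 1000, this lands in the 1600s, far below the value above 2100 produced by the linear formula.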

That way, your TPR cannot drop if you win a game (since E(x, r_k)<1 for all real x,r_k) and, similarly, it cannot increase if you lose a game.

A caveat is that, if a player wins or loses all of their games, the TPR defined by the above system would be infinity/-infinity because of how Elo/Glicko work. I suggest using a different method in that case, or assigning a provisional rating until someone has won and lost at least one game or drawn at least once.

MoistvonLipwig

I agree that this is a more natural approach. It breaks down at the extremes but for those cases the concept of tournament performance does not make much sense anyway.

With Elo this should be quite cheap: if nothing else, a simple bisection approach will get there in roughly log(rating range) iterations.
I think it would be desirable to do this in Glicko, but that might be a bit expensive? If you recalculate this after every tournament game of a player, then over the course of the tournament you need a quadratic number of game evaluations. With Elo that shouldn't be a problem though, since there the evaluation is just a simple lookup based on rating difference. Or possibly use Elo for most iterations of the bisection approach, and then finish off with Glicko, banking on the fact that Glicko will give a similar performance to Elo?

Unless you have a different approach in mind than the naive "just re-rate the tournament a number of times and try to make the rating gain 0"?