What is TrueSkill?

If you’ve been on the F1CRL Discord server (here is an invite), you’ve likely seen many references to TrueSkill. If you’ve ever wondered what TrueSkill is, you’re in the right place. Note that I am shamelessly taking this info from Microsoft Research’s TrueSkill Ranking System explanation page, while slightly adapting the language to apply directly to our league. Microsoft Research does a great job explaining the system if you want to dig deeper.


The TrueSkill ranking system is a skill-based ranking system developed at Microsoft Research. It uses the final standings of all drivers in a race to update the skill estimates (ranks) of those drivers.

Ranking Drivers

So, what is so special about the TrueSkill ranking system? Compared to the Elo rating system, the biggest difference is that in the TrueSkill ranking system skill is characterized by two numbers:

  • The average skill of the driver (μ in the picture).
  • The degree of uncertainty in the driver’s skill (σ in the picture).

[Figure: Belief curve]

The ranking system maintains a belief in every driver’s skill using these two numbers. If the uncertainty is still high, the ranking system does not yet know the driver’s skill precisely. In contrast, if the uncertainty is small, the ranking system has a strong belief that the driver’s skill is close to the average skill.

The figure shows a belief curve of the TrueSkill ranking system. For example, the green area is the system’s belief that the driver has a skill between level 15 and 20.
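To make the shaded-area idea concrete, here is a minimal Python sketch. It assumes the belief is a bell curve (a normal distribution) with mean μ and standard deviation σ, and the driver's numbers are made up for illustration. The function name belief_between is hypothetical, not something from F1CRL's tooling.

```python
from statistics import NormalDist


def belief_between(mu, sigma, low, high):
    """Share of the belief N(mu, sigma^2) that falls on skills in [low, high]."""
    belief = NormalDist(mu, sigma)
    return belief.cdf(high) - belief.cdf(low)


# Hypothetical driver, values chosen only for illustration
p = belief_between(mu=25.0, sigma=8.333, low=15.0, high=20.0)
print(f"Belief that skill is between 15 and 20: {p:.3f}")
```

Shrinking σ while keeping μ fixed squeezes the curve around μ, so the same 15 to 20 band would hold far less of the belief.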

Maintaining an uncertainty allows the system to make big changes to the skill estimates early on but small changes after a series of consistent races has been conducted. As a result, the TrueSkill ranking system can identify the skills of individual drivers from a very small number of races. The following table gives an idea of the minimum number of races per driver that the system needs to identify the skill level:

Race Type         Number of Races per Driver
16-Driver Grid    3
8-Driver Grid     3
4-Driver Grid     5
2-Driver Grid     12

The actual number of races needed per driver can be up to three times higher depending on several factors, such as how much a driver’s performance varies from race to race and the availability of well-matched opponents. If you want to learn more about how these numbers are calculated and how the TrueSkill ranking system identifies drivers’ skills, please read the Detailed Description of the TrueSkill™ Ranking Algorithm on Microsoft Research’s TrueSkill Ranking System explanation page.

How TrueSkill is Shown in F1CRL

Using the two parameters μ and σ, which characterize a belief in a driver’s skill, the TrueSkill ranking system ranks drivers using the so-called conservative skill estimate = μ - k*σ (this metric is also called confidence score in the F1CRL Tier Eligibility output). This estimate is called conservative because it is a conservative approximation of the driver’s skill: it is extremely likely that the driver’s actual skill is higher than the conservative estimate. The bigger the value of k, the more conservative the estimate; a common value of k is 3, which is also the value F1CRL uses.
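As a rough illustration, the leaderboard metric is a one-line calculation. The sketch below is not F1CRL's actual code: the function name, the driver names, and their μ and σ values are all invented for the example.

```python
def conservative_estimate(mu, sigma, k=3.0):
    """Conservative skill estimate: mu - k * sigma."""
    return mu - k * sigma


# Hypothetical drivers as (mu, sigma); not real F1CRL data
drivers = {
    "Driver A": (30.0, 5.0),
    "Driver B": (27.0, 1.5),
}

# Rank by the conservative estimate, highest first
leaderboard = sorted(
    drivers,
    key=lambda name: conservative_estimate(*drivers[name]),
    reverse=True,
)
for name in leaderboard:
    print(name, conservative_estimate(*drivers[name]))
```

Driver B ends up ahead even though Driver A has the higher μ, because Driver A's larger σ pulls the conservative estimate down further.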

Frequently Asked Questions (FAQ)

Q: What is the difference between skill and performance?

A: The TrueSkill ranking system implicitly uses a performance model that represents your (hypothetical) performance in a particular race. Skill is the average performance. The TrueSkill ranking system maintains a belief in your skill and assumes that your performance in a particular race varies around your skill.
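A small simulation can show the distinction. This is only a sketch: it assumes race performances scatter around skill following a normal distribution, and the spread BETA = 25/6 is an assumed value (a commonly quoted TrueSkill default), not a confirmed F1CRL setting.

```python
import random
from statistics import mean

BETA = 25.0 / 6.0  # assumed race-to-race performance spread (not from F1CRL)


def simulate_performances(skill, races, rng, beta=BETA):
    """Draw one hypothetical performance per race, scattered around skill."""
    return [rng.gauss(skill, beta) for _ in range(races)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
performances = simulate_performances(skill=25.0, races=10_000, rng=rng)

print("best race: ", round(max(performances), 1))
print("worst race:", round(min(performances), 1))
print("average:   ", round(mean(performances), 1))
```

Individual races land well above or below 25, but the average over many races settles close to the underlying skill.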

Q: The default TrueSkill of a new driver is 25, right?

A: That’s not fully correct. The TrueSkill value that is displayed in the TrueSkill leaderboard is the conservative estimate of a driver’s skill, computed from two hidden parameters that are used to track a driver’s skill: the mean skill μ and the skill uncertainty σ. The TrueSkill value is then μ-3*σ. What is correct is that a new driver is assigned a mean skill of μ=25 and a skill uncertainty of σ=8.333. Thus, the TrueSkill of a new driver is 25-3*8.333 = 0. Note that these two choices for μ and σ effectively mean that a new driver’s skill can be anywhere from 0 to 50, representing a state of complete uncertainty about their skill.
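The 0 to 50 range is just μ plus or minus 3*σ. A quick sketch of that arithmetic, taking the quoted 8.333 to be 25/3 (an assumption based on the rounding); the helper name plausible_range is made up for this example:

```python
MU_0 = 25.0          # starting mean skill from the answer above
SIGMA_0 = MU_0 / 3   # 8.333..., the starting uncertainty (assumed to be 25/3)
K = 3                # the k value F1CRL uses


def plausible_range(mu, sigma, k=K):
    """Return (mu - k*sigma, mu + k*sigma), the band the belief mostly covers."""
    return mu - k * sigma, mu + k * sigma


low, high = plausible_range(MU_0, SIGMA_0)
print(f"displayed TrueSkill of a new driver: {low:.1f}")
print(f"plausible skill range: {low:.1f} to {high:.1f}")
```

The lower end of the band is the displayed TrueSkill, which is why every new driver starts at the bottom of the leaderboard.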

Q: If I understand the TrueSkill update formula correctly then the change in μ is largest for the first few races and decreases over time. Thus, my first few races are most important; if I lose these races, it will take the TrueSkill much longer to converge to my skill. Right?

A: Not exactly. It is correct that the change in μ gets smaller with every race you contest, but this happens regardless of whether you win or lose. Moreover, TrueSkill always weights more recent race outcomes more heavily than older ones. Hence, when racing against a set of drivers of the same skill multiple times, a late win counts more than an early win.
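To see the shrinking updates in numbers, here is a simplified sketch of the published two-player, no-draw TrueSkill update. It is not how F1CRL processes full grids: it assumes only two drivers, no ties, no extra uncertainty added between races, and an assumed performance spread of BETA = 25/6. The function name and the rival's numbers are made up for the example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
BETA = 25.0 / 6.0  # assumed performance spread (a commonly quoted default)


def update_head_to_head(winner, loser, beta=BETA):
    """Return updated (mu, sigma) pairs after winner beats loser (no draws)."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    c = math.sqrt(2 * beta**2 + sigma_w**2 + sigma_l**2)
    t = (mu_w - mu_l) / c
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)  # how surprising the result was
    w = v * (v + t)                            # how much uncertainty to remove
    new_winner = (
        mu_w + sigma_w**2 / c * v,
        sigma_w * math.sqrt(1 - sigma_w**2 / c**2 * w),
    )
    new_loser = (
        mu_l - sigma_l**2 / c * v,
        sigma_l * math.sqrt(1 - sigma_l**2 / c**2 * w),
    )
    return new_winner, new_loser


# A new driver keeps beating a fresh, well-established rival (hypothetical values)
driver = (25.0, 25.0 / 3.0)
rival = (25.0, 1.0)
gains = []
for race in range(5):
    mu_before = driver[0]
    driver, _rival_after = update_head_to_head(driver, rival)
    gains.append(driver[0] - mu_before)

print([round(g, 2) for g in gains])
print("sigma after 5 races:", round(driver[1], 2))
```

Each win moves μ less than the one before, and σ keeps dropping, which is the "big changes early, small changes later" behaviour in action.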

Q: If the skill of every driver is represented by two numbers, how is it possible to rank drivers in a leaderboard?

A: The TrueSkill ranking system uses the so-called conservative skill estimate, which lies at or below the 1% quantile of the belief distribution: it is extremely likely (to be precise, with a belief of at least 99%) that the driver’s actual skill is higher than the conservative estimate.
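If you want to check the quantile claim yourself, the sketch below does so under the assumption that the belief is a normal distribution. The function names are made up for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def belief_above_estimate(k):
    """Belief that the true skill exceeds mu - k*sigma under a normal belief."""
    return STD_NORMAL.cdf(k)


def k_for_quantile(q):
    """The k for which mu - k*sigma is exactly the q-quantile of the belief."""
    return -STD_NORMAL.inv_cdf(q)


print(f"belief above estimate for k=3: {belief_above_estimate(3.0):.4f}")
print(f"k matching the 1% quantile:    {k_for_quantile(0.01):.3f}")
```

Under that assumption, k = 3 is a little stricter than the 1% quantile (which corresponds to a k of roughly 2.33), so the belief that a driver's skill sits above the displayed value comes out slightly higher than 99%.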

Q: Who is the better driver: Someone with a large μ and a large σ or a small μ and a small σ?

A: The answer to this question is not straightforward. For someone with a large σ the TrueSkill ranking system is still uncertain about the skill. Thus, the driver with the large μ and a large σ may be better. The best way to find out is to ask the driver with the large σ to race more.
