+
After running my pickleball league with Glicko-2 for over a month, I realized the system had problems. So I did what any reasonable person would do: I threw it out and rebuilt it from scratch with an ELO system.
+
And yes, I happen to be the biggest beneficiary of the change. Coincidence? Probably. Let me explain the math, and you can be the judge.
+
The Problem: Glicko-2 Was Overkill
+
Glicko-2 is a sophisticated rating system designed for competitive chess. It tracks three values per player:
+
- Rating: Your skill estimate (default: 1500)
- Rating Deviation: How uncertain the system is about your skill
- Volatility: How consistent your results are (higher means more erratic)
+
+
The math involves converting to a different scale, computing expected scores with logistic functions, and solving iteratively for the new volatility. It’s clever, but for a casual league of six players, it’s like bringing a sports car to a parking lot.
+
But the real problem was this: I added a margin bonus so that a blowout counted for more than a squeaker (winning 11-2 vs. 11-9). The formula?
+
weighted_score = base_score + tanh(margin/11 × 0.3) × (base_score - 0.5)
+
Translation: I took the hyperbolic tangent of a fraction, multiplied by an arbitrary constant (why 0.3? No particular reason), and called it science.
+
This is what’s known as “making stuff up.” It had no theoretical basis and was impossible to explain to players.
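To see what that formula actually produces, here is a minimal Python sketch. It assumes base_score is 1 for a win and 0 for a loss, which the formula above doesn't spell out, and the function name is made up for illustration.

```python
import math

def weighted_score(won: bool, margin: int, scale: float = 0.3) -> float:
    """Old margin bonus: base_score + tanh(margin/11 * scale) * (base_score - 0.5)."""
    base_score = 1.0 if won else 0.0  # assumption: plain win/loss score
    return base_score + math.tanh(margin / 11 * scale) * (base_score - 0.5)

# Compare a close win, a blowout win, and a blowout loss
for label, won, margin in [("11-9 win", True, 2), ("11-2 win", True, 9), ("2-11 loss", False, 9)]:
    print(f"{label}: {weighted_score(won, margin):.3f}")
```

Under that assumption, a blowout win scores above 1 and the matching loss scores below 0, so the bonus pushes results outside the usual 0-to-1 range.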
+
The Doubles Problem
+
The old system calculated team ratings by averaging both partners’ ratings. Sounds reasonable, right?
+
Until you think about it: If you (1400) play with a strong partner (1700) against two 1550s, the system thinks it’s an even match. But you were carried by a stronger player! Winning that match shouldn’t boost your rating as much as winning with a weaker partner.
+
Averaging hid how much each partner actually contributed, which made the ratings unfair for everyone.
+
Enter: Pure ELO
+
ELO is elegantly simple. Every player has one number representing their skill. When two players compete:
+
- Calculate the probability that one player beats the other based on rating difference
- Compare expected performance to actual performance
- Adjust ratings based on the difference
+
+
The key formula is:
+
Expected Win Probability = 1 / (1 + 10^((opponent_rating - your_rating) / 400))
+
If you’re 1500 and your opponent is 1500, you should win 50% of the time. If you’re 1600 and they’re 1500, you should win about 64% of the time. Simple.
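As a quick sanity check, the formula fits in a few lines of Python. This is only a sketch; the function name is chosen for illustration.

```python
def expected_score(your_rating: float, opponent_rating: float) -> float:
    """Expected win probability from the rating difference (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - your_rating) / 400))

print(f"{expected_score(1500, 1500):.2f}")  # 0.50
print(f"{expected_score(1600, 1500):.2f}")  # 0.64
```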
+
After a match:
+
Rating Change = K × (Actual Performance - Expected Performance)
+
Where K = 32 (how much weight each match carries) and Actual Performance is your per-point performance:
+
Actual Performance = Points Scored / Total Points Played
+
Win 11-9? That’s 0.55 (55% of points). Win 11-2? That’s 0.846 (84.6%). This captures match quality far better than binary win/loss.
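Putting the two formulas together, a singles update might look like the sketch below. The function names are illustrative, and it assumes the formulas are applied exactly as written above, with nothing extra added for winning the match itself.

```python
K = 32  # weight each match carries

def expected_score(your_rating: float, opponent_rating: float) -> float:
    return 1.0 / (1.0 + 10 ** ((opponent_rating - your_rating) / 400))

def actual_performance(points_for: int, points_against: int) -> float:
    """Share of all points played that you scored."""
    return points_for / (points_for + points_against)

def rating_change(your_rating: float, opponent_rating: float,
                  points_for: int, points_against: int, k: float = K) -> float:
    expected = expected_score(your_rating, opponent_rating)
    actual = actual_performance(points_for, points_against)
    return k * (actual - expected)

print(f"{rating_change(1500, 1500, 11, 9):+.1f}")  # +1.6
print(f"{rating_change(1500, 1500, 11, 2):+.1f}")  # +11.1
print(f"{rating_change(1600, 1500, 11, 9):+.1f}")  # -2.9
```

One consequence worth knowing, if the formulas are applied exactly as written: a 1600 who beats a 1500 only 11-9 took 55% of the points against an expected 64%, so the rating change comes out slightly negative.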
+
+
In doubles, we use:
+
Effective Opponent Rating = Opponent1 + Opponent2 - Your Teammate
+
Why this works:
+
If your teammate is strong, the effective opponent rating drops, because your teammate made the match easier. If your teammate is weak, the effective opponent rating rises, because you were undermanned.
+
Beating 1500-rated opponents with a 1600-rated partner? Effective opponent: 1400. You gain less because your partner carried you.
+
Beating 1500-rated opponents with a 1400-rated partner? Effective opponent: 1600. You gain more because you did the heavy lifting.
+
This is fair.
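Here is a rough Python sketch of the doubles adjustment, reusing the same expected-score formula. The function names and the example ratings are made up for illustration.

```python
K = 32

def expected_score(your_rating: float, opponent_rating: float) -> float:
    return 1.0 / (1.0 + 10 ** ((opponent_rating - your_rating) / 400))

def effective_opponent_rating(opp1: float, opp2: float, teammate: float) -> float:
    """Opponent1 + Opponent2 - Your Teammate."""
    return opp1 + opp2 - teammate

def doubles_rating_change(your_rating: float, teammate: float,
                          opp1: float, opp2: float,
                          points_for: int, points_against: int,
                          k: float = K) -> float:
    opponent = effective_opponent_rating(opp1, opp2, teammate)
    actual = points_for / (points_for + points_against)
    return k * (actual - expected_score(your_rating, opponent))

# The same 11-2 win over two 1500s, as a 1500-rated player
with_strong = doubles_rating_change(1500, 1600, 1500, 1500, 11, 2)
with_weak = doubles_rating_change(1500, 1400, 1500, 1500, 11, 2)
print(f"strong partner: {with_strong:+.1f}, weak partner: {with_weak:+.1f}")
```

The identical scoreline is worth more with the 1400-rated partner, because the effective opponent is 1600 instead of 1400.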
+
The Migration: Before and After
+
Here’s where things get spicy. I replayed all 29 historical matches through the new ELO system:
+
+
| Player | Old Glicko-2 | New ELO | Change | Matches |
| --- | --- | --- | --- | --- |
| Andrew Stricklin | 1651 | 1538 | -113 | 19 |
| David Pabst | 1562 | 1522 | -40 | 11 |
| Jacklyn Wyszynski | 1557 | 1514 | -43 | 9 |
| Eliana Crew | 1485 | 1497 | +11 | 13 |
| Krzysztof Radziszeski | 1473 | 1476 | +3 | 25 |
| Dane Sabo | 1290 | 1449 | +159 | 25 |
+
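A replay like this boils down to a loop over the matches in chronological order. The sketch below is not the league's actual code: the Match structure, the 1500 starting rating, and the placeholder players are all assumptions for illustration. Every player in a match is updated from the pre-match ratings.

```python
from dataclasses import dataclass

K = 32
START = 1500.0  # assumed starting rating for every player

@dataclass
class Match:
    team_a: tuple[str, ...]  # one name for singles, two for doubles
    team_b: tuple[str, ...]
    score_a: int
    score_b: int

def expected_score(your_rating: float, opponent_rating: float) -> float:
    return 1.0 / (1.0 + 10 ** ((opponent_rating - your_rating) / 400))

def effective_opponent(ratings: dict[str, float], player: str,
                       own_team: tuple[str, ...], other_team: tuple[str, ...]) -> float:
    """Sum of opponents minus teammates; in singles this is just the opponent."""
    opponents = sum(ratings[p] for p in other_team)
    teammates = sum(ratings[p] for p in own_team if p != player)
    return opponents - teammates

def replay(matches: list[Match], start: float = START, k: float = K) -> dict[str, float]:
    ratings: dict[str, float] = {}
    for m in matches:
        for p in m.team_a + m.team_b:
            ratings.setdefault(p, start)
        total = m.score_a + m.score_b
        deltas: dict[str, float] = {}
        sides = ((m.team_a, m.team_b, m.score_a), (m.team_b, m.team_a, m.score_b))
        for team, other, points in sides:
            for p in team:
                opponent = effective_opponent(ratings, p, team, other)
                deltas[p] = k * (points / total - expected_score(ratings[p], opponent))
        # Apply all changes only after every delta is computed from pre-match ratings
        for p, delta in deltas.items():
            ratings[p] += delta
    return ratings

# Placeholder players and scores, not real league data
history = [Match(("A", "B"), ("C", "D"), 11, 9), Match(("A",), ("C",), 7, 11)]
print(replay(history))
```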
+
Observations
+
The Rating Spread Compressed
+
The old system spread players across 361 rating points. The new system compresses them into 89 points. This makes sense: we’re a recreational group, not chess grandmasters. The new system rates us fairly within a tighter band.
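Both spreads can be checked straight from the table. The snippet below is a small sketch with the ratings copied in by hand; the helper names are illustrative.

```python
old = {"Andrew Stricklin": 1651, "David Pabst": 1562, "Jacklyn Wyszynski": 1557,
       "Eliana Crew": 1485, "Krzysztof Radziszeski": 1473, "Dane Sabo": 1290}
new = {"Andrew Stricklin": 1538, "David Pabst": 1522, "Jacklyn Wyszynski": 1514,
       "Eliana Crew": 1497, "Krzysztof Radziszeski": 1476, "Dane Sabo": 1449}

def spread(ratings: dict[str, int]) -> int:
    """Gap between the highest and lowest rating."""
    return max(ratings.values()) - min(ratings.values())

def ranking(ratings: dict[str, int]) -> list[str]:
    """Player names from highest to lowest rating."""
    return sorted(ratings, key=ratings.get, reverse=True)

print(spread(old), spread(new))      # 361 89
print(ranking(old) == ranking(new))  # True
```

The order of players is the same in both columns; only the gaps between them changed.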
+
The Winners
+
- Dane Sabo: +159 points. The old system penalized him for losses with weaker partners. The effective opponent formula gives credit for “carrying.” (Purely coincidental that I benefit from my own math.)
- Eliana Crew: +11 points
- Krzysztof Radziszeski: +3 points
+
+
The Losers
+
- Andrew Stricklin: -113 points. Still ranked #1, but the old system over-credited wins with strong partners.
- Jacklyn Wyszynski: -43 points
- David Pabst: -40 points
+
+
A Note on Conflicts of Interest
+
You may notice that the system designer (me) is also the biggest beneficiary of the new ratings, gaining a convenient 159 points.
+
I want to assure you this is purely coincidental and the result of rigorous mathematical analysis, not at all influenced by the fact that I was tired of being ranked last.
+
The new formulas are based on sound theoretical principles that just happen to conclude I was being unfairly penalized all along.
+
Trust the math.
+
Why This System Works
+
For a small league:
+
- Simple to understand (one rating per player)
- Fair to individual skill (per-point scoring)
- Respects partnership (effective opponent formula)
- Transparent (you can calculate rating changes yourself)
- Fast convergence (5-10 matches to stabilize a rating)
+
+
The bottom line: Your rating now reflects your true skill more accurately than before. Even if it means Dane finally looks respectable.