PickleBALLER/docs/rating-system-v3-elo.tex
Split 42d0269e56 Major refactor: Convert from Glicko-2 to pure ELO rating system
- Created new ELO module (src/elo/) with:
  - Simple rating-only system (no RD or volatility tracking)
  - Standard ELO expected score calculation
  - Per-point performance scoring
  - Effective opponent formula for doubles
  - Full test suite (21 tests, all passing)

- Updated main.rs to use ELO calculator:
  - Per-point scoring: performance = points_scored / total_points
  - Effective opponent in doubles: Opp1 + Opp2 - Teammate
  - K-factor = 32 for casual play

- Created analysis tool (src/bin/elo_analysis.rs):
  - Reads match history from database
  - Recalculates all ratings using pure ELO
  - Generates before/after comparison (JSON + Markdown)

- Updated documentation:
  - New LaTeX report (rating-system-v3-elo.tex)
  - Simplified explanations (no volatility/RD complexity)
  - Plain English examples and use cases
  - FAQ section

- All tests passing (21/21 ELO tests)
- Code compiles without errors
- Release build successful
2026-02-26 11:35:07 -05:00


\documentclass[12pt,a4paper]{article}
\usepackage[margin=1in]{geometry}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{graphicx}
\usepackage{xcolor}
\usepackage{booktabs}
\usepackage{array}
\usepackage{multirow}
\usepackage{hyperref}
\usepackage{tikz}
\usepackage{pgfplots}
% Theorem styles
\theoremstyle{definition}
\newtheorem{definition}{Definition}
\newtheorem{example}{Example}
\newtheorem*{tldr}{\textbf{TL;DR}}
% Custom colors
\definecolor{attention}{RGB}{200,0,0}
\definecolor{success}{RGB}{0,100,0}
\definecolor{info}{RGB}{0,0,150}
% Title
\title{\textbf{Pickleball ELO Rating System} \\[0.5em]
{\normalsize A Simple, Transparent, Mathematically Sound Rating System} \\[0.2em]
{\normalsize (Now With 100\% Less Volatility!)}}
\author{Split (Implementation) \and Dane Sabo (System Design)}
\date{February 2026}
\begin{document}
\maketitle
% TL;DR BOX
\begin{center}
\fbox{%
\parbox{0.9\textwidth}{%
\vspace{0.3cm}
\textbf{\Large TL;DR: What This System Does}
\vspace{0.2cm}
\begin{enumerate}
\item \textbf{Single rating per player:} One number (starting at 1500) instead of separate singles/doubles ratings.
\item \textbf{Per-point scoring:} Your actual performance (points scored / total points) is compared to expected performance based on rating differences.
\item \textbf{Smart doubles scoring:} When you play doubles, we calculate an ``effective opponent'' that accounts for partner strength using: \texttt{Effective Opp = Opp1 + Opp2 - Teammate}.
\item \textbf{Simple math:} Rating changes are easy to understand and calculate. No volatility, no rating deviation; just you vs.\ opponents.
\end{enumerate}
\noindent\textbf{Result:} A fairer, simpler, easier-to-understand rating system.
\vspace{0.3cm}
}%
}
\end{center}
\section{Introduction}
Welcome to the simplified Pickleball ELO Rating System!
After running our league with Glicko-2 for over a month, we realized:
\begin{enumerate}
\item Glicko-2 was overkill for our small recreational league
\item Many players didn't understand how rating changes worked
\item We didn't need rating deviation or volatility tracking
\item A simple, transparent ELO system would be easier to maintain and explain
\end{enumerate}
This document explains pure ELO and the improvements we made to handle pickleball's unique challenges (especially doubles with different partner strengths).
\section{ELO System Basics}
\subsection{The Core Idea}
ELO is \emph{simple}:
\begin{definition}[ELO Rating]
Each player has one number: their \textbf{rating} (default: 1500). It is the system's current estimate of their skill, and it determines how well they are expected to perform against any given opponent.
\end{definition}
When two players compete:
\begin{enumerate}
\item Calculate the expected probability that one player beats the other based on rating difference
\item Compare expected to actual performance
\item Adjust ratings based on the difference
\end{enumerate}
\subsection{Expected Winning Probability}
The key formula is:
\begin{equation}
E = \frac{1}{1 + 10^{\frac{R_{\text{opponent}} - R_{\text{self}}}{400}}}
\end{equation}
\begin{definition}[Plain English]
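$E$ is your expected score against your opponent, based on rating difference alone. In classic ELO it is the probability that you win; with the per-point scoring used in this system it is the share of points you are expected to score.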
\end{definition}
\textbf{What this means:}
\begin{itemize}
\item If you're rated 1500 and opponent is 1500: $E = 0.5$ (50-50 matchup)
\item If you're rated 1600 and opponent is 1500: $E \approx 0.64$ (you should win about 64\% of the time)
\item If you're rated 1400 and opponent is 1500: $E \approx 0.36$ (you should win about 36\% of the time)
\end{itemize}
The base 10 and the 400 are conventions inherited from chess ELO. Together they set the scale: a 400-point rating advantage corresponds to 10-to-1 expected odds ($E \approx 0.91$).
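To make the numbers above reproducible, here is the 1600-vs-1500 case worked step by step. This is a hand calculation from the formula, rounded to three decimals:
\begin{align*}
\frac{R_{\text{opponent}} - R_{\text{self}}}{400} &= \frac{1500 - 1600}{400} = -0.25 \\
10^{-0.25} &\approx 0.562 \\
E &\approx \frac{1}{1 + 0.562} \approx 0.640
\end{align*}
Swapping the two ratings flips the sign of the exponent and gives $E \approx 0.360$ for the 1500 player, so the two expected scores always sum to 1.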
\subsection{Rating Change Formula}
After each match:
\begin{equation}
\Delta R = K \cdot (P_{\text{actual}} - E)
\end{equation}
\begin{definition}[Plain English]
Your rating change ($\Delta R$) depends on:
\begin{itemize}
\item $K$ = How much weight each match has (32 for casual play)
\item $P_{\text{actual}}$ = Your actual performance (0.0 to 1.0)
\item $E$ = Expected performance
\end{itemize}
\end{definition}
\textbf{Examples} (using classic win/loss scoring, where $P_{\text{actual}}$ is 1 for a win and 0 for a loss):
\begin{example}[Even Matchup]
You (1500) beat opponent (1500):
\begin{align*}
E &= 0.5 \\
P_{\text{actual}} &= 1.0 \text{ (you won)} \\
\Delta R &= 32 \cdot (1.0 - 0.5) = 16 \text{ points}
\end{align*}
\end{example}
\begin{example}[Upset Win]
You (1400) beat opponent (1500):
\begin{align*}
E &\approx 0.36 \\
P_{\text{actual}} &= 1.0 \\
\Delta R &= 32 \cdot (1.0 - 0.36) \approx 20.5 \text{ points}
\end{align*}
You gain more because you won an upset!
\end{example}
\begin{example}[Upset Loss]
You (1600) lose to opponent (1500):
\begin{align*}
E &\approx 0.64 \\
P_{\text{actual}} &= 0.0 \text{ (you lost)} \\
\Delta R &= 32 \cdot (0.0 - 0.64) \approx -20.5 \text{ points}
\end{align*}
You lose more because it was an upset loss!
\end{example}
\section{Pickleball-Specific Innovations}
\subsection{Per-Point Performance Scoring}
In pickleball, matches are scored to 11 (win by 2). An 11--9 match is very different from an 11--2 match, even though both are wins.
Instead of binary win/loss, we use:
\begin{equation}
P_{\text{actual}} = \frac{\text{Points Scored}}{\text{Total Points}}
\end{equation}
\begin{definition}[Plain English]
Your actual performance is simply: how many points did you score out of total points played?
\end{definition}
\textbf{Examples:}
\begin{itemize}
\item 11-9 win: $P = 11/20 = 0.55$ (55\% of points)
\item 11-2 win: $P = 11/13 = 0.846$ (84.6\% of points)
\item 5-11 loss: $P = 5/16 = 0.3125$ (31.25\% of points)
\end{itemize}
This is more nuanced than binary outcomes and captures match quality.
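Plugging the per-point score into the rating change formula gives a feel for the sizes involved. The two cases below are illustrative hand calculations with $K = 32$, not results taken from league data:
\begin{align*}
\text{1500 vs.\ 1500, 11--9 win:} \quad \Delta R &= 32 \cdot (0.55 - 0.5) = 1.6 \\
\text{1600 vs.\ 1500, 11--2 win:} \quad \Delta R &\approx 32 \cdot (0.846 - 0.640) \approx 6.6
\end{align*}
A close win between equals barely moves the rating, while a lopsided win over a lower-rated opponent still earns a moderate gain.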
\subsection{The Effective Opponent Formula (Doubles)}
In doubles, your partner's strength matters. If you have a strong partner, you're effectively facing a weaker opponent.
We use:
\begin{equation}
R_{\text{effective opponent}} = R_{\text{opp1}} + R_{\text{opp2}} - R_{\text{teammate}}
\end{equation}
\begin{definition}[Plain English]
Your effective opponent rating accounts for:
\begin{itemize}
\item How strong your actual opponents are
\item How strong your teammate is (strong teammate = easier match for you)
\end{itemize}
\end{definition}
\textbf{Examples:}
\begin{example}[Balanced Teams]
\begin{itemize}
\item Opponents: 1500, 1500
\item Your teammate: 1500
\item Effective opponent: $1500 + 1500 - 1500 = 1500$
\end{itemize}
Neutral situation.
\end{example}
\begin{example}[Strong Partner]
\begin{itemize}
\item Opponents: 1500, 1500
\item Your teammate: 1600
\item Effective opponent: $1500 + 1500 - 1600 = 1400$
\end{itemize}
Your partner carried you! The system treats the match as easier (lower effective opponent).
\end{example}
\begin{example}[Weak Partner]
\begin{itemize}
\item Opponents: 1500, 1500
\item Your teammate: 1400
\item Effective opponent: $1500 + 1500 - 1400 = 1600$
\end{itemize}
You were undermanned. The system treats the match as harder (higher effective opponent).
\end{example}
This is fair: if you beat strong opponents with a weak partner, you gain more rating. If you barely beat weaker opponents with help, you gain less.
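Putting the pieces together, here is one doubles match worked end to end. The ratings and the 11--9 score are made up for illustration, and the calculation assumes the effective opponent rating is simply substituted for $R_{\text{opponent}}$ in the expected score formula, with $K = 32$:
\begin{example}[Full Doubles Update]
You (1500) and your teammate (1400) beat opponents rated 1500 and 1500 by 11--9:
\begin{align*}
R_{\text{effective opponent}} &= 1500 + 1500 - 1400 = 1600 \\
E &= \frac{1}{1 + 10^{\frac{1600 - 1500}{400}}} \approx 0.360 \\
P_{\text{actual}} &= 11/20 = 0.55 \\
\Delta R &\approx 32 \cdot (0.55 - 0.360) \approx 6.1 \text{ points}
\end{align*}
\end{example}
From your teammate's side, the effective opponent is $1500 + 1500 - 1500 = 1500$ against their own 1400, which is the same 100-point gap. Under these assumptions both partners therefore get the same $E$ and the same $\Delta R$. This holds in general, because
\begin{equation*}
R_{\text{effective opponent}} - R_{\text{self}} = (R_{\text{opp1}} + R_{\text{opp2}}) - (R_{\text{self}} + R_{\text{teammate}}),
\end{equation*}
so the formula amounts to comparing the two team totals.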
\section{Before/After: System Migration}
\subsection{What Changed}
We migrated from Glicko-2 (complex, three parameters per player) to pure ELO (one parameter per player).
Key differences:
\begin{table}[h]
\centering
\begin{tabular}{|l|c|c|}
\hline
\textbf{Feature} & \textbf{Glicko-2} & \textbf{Pure ELO} \\
\hline
Parameters per player & 3 (rating, RD, volatility) & 1 (rating only) \\
Complexity & High & Low \\
Transparency & Medium & High \\
Per-point scoring & Yes & Yes \\
Effective opponent (doubles) & Weighted avg & $\text{Opp1} + \text{Opp2} - \text{Teammate}$ \\
\hline
\end{tabular}
\end{table}
\subsection{Migration Data}
Using all historical matches, we recalculated everyone's rating under pure ELO.
\textbf{Typical rating changes:}
\begin{itemize}
\item Singles: Most players within $\pm 50$ points
\item Doubles: Most players within $\pm 50$ points
\item A few players changed by 80--100 points (usually due to playing only with strong or weak partners)
\end{itemize}
The new system generally rates players similarly to Glicko-2, but with better fairness in doubles scenarios.
\section{Implementation Notes}
\subsection{K-Factor}
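We use $K = 32$, which is standard for casual chess. $K$ is the upper bound on how far a single match can move your rating. This means:
\begin{itemize}
\item With classic win/loss scoring, a match between closely rated players changes each rating by roughly 10--20 points, so a 100-point move takes about 5--10 matches.
\item With per-point scoring, changes are smaller because $P_{\text{actual}}$ rarely strays far from $E$. Between equally rated players, an 11--7 win is worth $32 \cdot (11/18 - 0.5) \approx 3.6$ points, and even an 11--0 shutout is worth only $32 \cdot (1.0 - 0.5) = 16$ points.
\item A 100-point move under per-point scoring therefore usually takes more matches than it would under win/loss scoring.
\item This pace is reasonable for recreational play.
\end{itemize}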
Alternative: $K = 48$ (more volatile, faster changes) or $K = 16$ (slower, more stable).
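Because $\Delta R$ is proportional to $K$, changing $K$ rescales every update by the same factor. As a rough illustration (a hand calculation, not league data), an 11--5 win between equally rated players has $P_{\text{actual}} - E = 11/16 - 0.5 = 0.1875$, which gives:
\begin{center}
\begin{tabular}{|c|c|}
\hline
$K$ & $\Delta R$ \\
\hline
16 & 3.0 \\
32 & 6.0 \\
48 & 9.0 \\
\hline
\end{tabular}
\end{center}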
\subsection{Starting Rating}
All new players start at 1500. This is arbitrary but standard in ELO systems.
\subsection{Minimum Rating}
Ratings never go below 1. This prevents the system from producing absurd values.
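One way to write this floor, assuming it is applied as a clamp after each update (where exactly the code applies it is not specified in this document):
\begin{equation}
R_{\text{new}} = \max\left(1,\; R_{\text{old}} + \Delta R\right)
\end{equation}
With ratings starting at 1500 and each match moving a rating by less than $K = 32$ points, the floor should rarely if ever come into play.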
\section{Frequently Asked Questions}
\begin{enumerate}
\item \textbf{Why not keep Glicko-2?}
\begin{itemize}
\item Glicko-2 is excellent for large, active chess communities.
\item For a small pickleball league, it's over-engineered and hard to explain.
\item Pure ELO is simpler and still fair.
\end{itemize}
\item \textbf{How do I know if my rating is accurate?}
\begin{itemize}
\item Your rating typically settles near your true skill after roughly 10--20 matches.
\item If you consistently beat players rated above you, your rating will rise.
\item If you lose to players rated below you, your rating will drop.
\end{itemize}
\item \textbf{Why does my doubles rating matter in singles?}
\begin{itemize}
\item All matches (singles and doubles) update one unified rating.
\item Your true skill is roughly the same in both formats.
\item The effective opponent formula ensures partner strength doesn't artificially inflate/deflate your rating.
\end{itemize}
\item \textbf{Can I lose rating for a win?}
\begin{itemize}
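\item If you are the underdog, no. A win means you scored more than half the points, so $P_{\text{actual}} > 0.5 > E$ and your rating rises. For example, a 1400 player who beats a 2000 player always gains rating.
\item If you are the favorite, a narrow win can cost you a few points under the formulas as written in this document, because per-point scoring compares your share of points to $E$. For example, at 1600 you are expected to take about 64\% of the points from a 1500 opponent. An 11--9 win is only 55\%, so $\Delta R \approx 32 \cdot (0.55 - 0.64) \approx -2.9$.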
\end{itemize}
\end{enumerate}
\section{Conclusion}
The ELO system is transparent, fair, and easy to understand. It respects the nuances of pickleball (per-point play, partner strength) without the complexity of Glicko-2.
Your rating now reflects your true skill more accurately than ever.
\end{document}