\documentclass[12pt,a4paper]{article}

\usepackage[margin=1in]{geometry}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{graphicx}
\usepackage{xcolor}
\usepackage{booktabs}
\usepackage{array}
\usepackage{multirow}
\usepackage{hyperref}
\usepackage{tikz}
\usepackage{pgfplots}

% Theorem styles
\theoremstyle{definition}
\newtheorem{definition}{Definition}
\newtheorem{example}{Example}
\newtheorem*{tldr}{\textbf{TL;DR}}

% Custom colors
\definecolor{attention}{RGB}{200,0,0}
\definecolor{success}{RGB}{0,100,0}
\definecolor{info}{RGB}{0,0,150}

% Title
\title{\textbf{How Bad Am I, Actually?} \\[0.5em]
{\Large Building a Pickleball Rating System That Doesn't Lie} \\[0.2em]
{\normalsize (Now With 100\% Less Volatility and 100\% More Accountability)}}
\author{Split (Implementation) \and Dane Sabo (System Design)}
\date{February 2026}

\begin{document}

\maketitle
% TL;DR BOX
\begin{center}
\fbox{%
\parbox{0.9\textwidth}{%
\vspace{0.3cm}
\textbf{\Large TL;DR: What This System Does}
\vspace{0.2cm}

\begin{enumerate}
\item \textbf{Single rating per player:} One number (starting at 1500) instead of separate singles and doubles ratings.
\item \textbf{Per-point scoring:} Your actual performance (points scored / total points) is compared to your expected performance, which is based on the rating difference.
\item \textbf{Smart doubles scoring:} In doubles, we calculate an ``effective opponent'' rating that accounts for partner strength: \texttt{Effective Opp = Opp1 + Opp2 - Teammate}.
\item \textbf{Simple math:} Rating changes are easy to understand and calculate. No volatility, no rating deviation: just you vs.\ your opponents.
\end{enumerate}

\noindent\textbf{Result:} A fairer, simpler, easier-to-understand rating system.
\vspace{0.3cm}
}%
}
\end{center}
\section{Introduction}

Welcome to the simplified Pickleball ELO Rating System!

After running our league with Glicko-2 for over a month, we realized:
\begin{enumerate}
\item Glicko-2 was overkill for our small recreational league
\item Many players didn't understand how rating changes worked
\item We didn't need rating deviation or volatility tracking
\item A simple, transparent ELO system would be easier to maintain and explain
\end{enumerate}

This document explains pure ELO and the improvements we made to handle pickleball's unique challenges (especially doubles with different partner strengths).
\section{ELO System Basics}

\subsection{The Core Idea}

ELO is \emph{simple}:

\begin{definition}[ELO Rating]
Each player has one number: their \textbf{rating} (default: 1500). This number is the system's estimate of their skill.
\end{definition}

When two players compete:
\begin{enumerate}
\item Calculate the expected probability that one player beats the other, based on the rating difference
\item Compare expected performance to actual performance
\item Adjust ratings based on the difference
\end{enumerate}
\subsection{Expected Winning Probability}

The key formula is:

\begin{equation}
E = \frac{1}{1 + 10^{\frac{R_{\text{opponent}} - R_{\text{self}}}{400}}}
\end{equation}

\begin{definition}[Plain English]
$E$ is the probability you ``should'' win against your opponent, based on rating difference alone.
\end{definition}

\textbf{What this means:}
\begin{itemize}
\item If you're rated 1500 and your opponent is rated 1500: $E = 0.5$ (50-50 matchup)
\item If you're rated 1600 and your opponent is rated 1500: $E \approx 0.64$ (you should win about 64\% of the time)
\item If you're rated 1400 and your opponent is rated 1500: $E \approx 0.36$ (you should win about 36\% of the time)
\end{itemize}

The formula uses $10^x$ (powers of 10) because that is traditional in chess ELO. The 400 in the denominator is a scaling factor: a 400-point rating advantage corresponds to 10-to-1 odds ($E \approx 0.91$).
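\begin{example}[Worked Calculation of $E$]
As a sketch of where the numbers above come from, take a player rated 1600 facing a 1500-rated opponent (values rounded to three decimals):
\begin{align*}
\frac{R_{\text{opponent}} - R_{\text{self}}}{400} &= \frac{1500 - 1600}{400} = -0.25 \\
10^{-0.25} &\approx 0.562 \\
E &= \frac{1}{1 + 0.562} \approx 0.640
\end{align*}
Swapping the two ratings flips the sign of the exponent: $10^{0.25} \approx 1.778$, so $E \approx 1/2.778 \approx 0.360$. The two players' expectations always sum to 1.
\end{example}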
\subsection{Rating Change Formula}

After each match:

\begin{equation}
\Delta R = K \cdot (P_{\text{actual}} - E)
\end{equation}

\begin{definition}[Plain English]
Your rating change ($\Delta R$) depends on:
\begin{itemize}
\item $K$ = How much weight each match has (32 for casual play)
\item $P_{\text{actual}}$ = Your actual performance (0.0 to 1.0)
\item $E$ = Expected performance
\end{itemize}
\end{definition}

\textbf{Examples:}

\begin{example}[Even Match]
You (1500) beat opponent (1500):
\begin{align*}
E &= 0.5 \\
P_{\text{actual}} &= 1.0 \text{ (you won)} \\
\Delta R &= 32 \cdot (1.0 - 0.5) = 16 \text{ points}
\end{align*}
\end{example}

\begin{example}[Upset Win]
You (1400) beat opponent (1500):
\begin{align*}
E &\approx 0.36 \\
P_{\text{actual}} &= 1.0 \\
\Delta R &= 32 \cdot (1.0 - 0.36) \approx 20.5 \text{ points}
\end{align*}
You gain more because you won an upset!
\end{example}

\begin{example}[Upset Loss]
You (1600) lose to opponent (1500):
\begin{align*}
E &\approx 0.64 \\
P_{\text{actual}} &= 0.0 \text{ (you lost)} \\
\Delta R &= 32 \cdot (0.0 - 0.64) \approx -20.5 \text{ points}
\end{align*}
You lose more because it was an upset loss!
\end{example}
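\begin{example}[Both Sides of the Same Match]
A sketch of the opponent's side of the upset win above, assuming (the text does not say so explicitly) that both players are updated with the same $K = 32$. The 1500-rated player had $E \approx 0.64$ and lost, so:
\begin{align*}
\Delta R_{\text{you}} &= 32 \cdot (1.0 - 0.36) \approx +20.5 \\
\Delta R_{\text{opponent}} &= 32 \cdot (0.0 - 0.64) \approx -20.5
\end{align*}
Because the two expected scores sum to 1 and the two actual scores sum to 1, the two changes cancel: whatever one player gains, the other loses.
\end{example}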
\section{Pickleball-Specific Innovations}

\subsection{Per-Point Performance Scoring}

In pickleball, matches are played to 11 (win by 2). An 11--9 match is very different from an 11--2 match, even though both are wins.

Instead of binary win/loss, we use:

\begin{equation}
P_{\text{actual}} = \frac{\text{Points Scored}}{\text{Total Points}}
\end{equation}

\begin{definition}[Plain English]
Your actual performance is simply the share of all points played that you scored.
\end{definition}

\textbf{Examples:}
\begin{itemize}
\item 11--9 win: $P = 11/20 = 0.55$ (55\% of points)
\item 11--2 win: $P = 11/13 \approx 0.846$ (84.6\% of points)
\item 5--11 loss: $P = 5/16 = 0.3125$ (31.25\% of points)
\end{itemize}

This is more nuanced than binary outcomes and captures how decisive the result was.
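\begin{example}[Per-Point Scoring in the Rating Update]
A sketch of how margin of victory feeds into the update, assuming $P_{\text{actual}}$ is substituted directly into $\Delta R = K \cdot (P_{\text{actual}} - E)$ with $K = 32$. Two 1500-rated players meet, so $E = 0.5$:
\begin{align*}
\text{11--9 win:} \quad \Delta R &= 32 \cdot (0.55 - 0.5) = 1.6 \\
\text{11--2 win:} \quad \Delta R &= 32 \cdot (0.846 - 0.5) \approx 11.1
\end{align*}
Under this reading, the same win is worth about seven times as much when it is a blowout.
\end{example}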
\subsection{The Effective Opponent Formula (Doubles)}

In doubles, your partner's strength matters. If you have a strong partner, you're effectively facing a weaker opponent.

We use:

\begin{equation}
R_{\text{effective opponent}} = R_{\text{opp1}} + R_{\text{opp2}} - R_{\text{teammate}}
\end{equation}

\begin{definition}[Plain English]
Your effective opponent rating accounts for:
\begin{itemize}
\item How strong your actual opponents are
\item How strong your teammate is (strong teammate = easier match for you)
\end{itemize}
\end{definition}

\textbf{Examples:}

\begin{example}[Balanced Teams]
\begin{itemize}
\item Opponents: 1500, 1500
\item Your teammate: 1500
\item Effective opponent: $1500 + 1500 - 1500 = 1500$
\end{itemize}
Neutral situation.
\end{example}

\begin{example}[Strong Partner]
\begin{itemize}
\item Opponents: 1500, 1500
\item Your teammate: 1600
\item Effective opponent: $1500 + 1500 - 1600 = 1400$
\end{itemize}
Your partner carried you! The system treats the match as easier (lower effective opponent).
\end{example}

\begin{example}[Weak Partner]
\begin{itemize}
\item Opponents: 1500, 1500
\item Your teammate: 1400
\item Effective opponent: $1500 + 1500 - 1400 = 1600$
\end{itemize}
You were undermanned. The system treats the match as harder (higher effective opponent).
\end{example}

This is fair: if you beat strong opponents with a weak partner, you gain more rating. If you barely beat weaker opponents with help, you gain less.
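\begin{example}[Doubles Update, Start to Finish]
A sketch of a full doubles update, assuming the effective opponent rating is plugged into the expected-score formula in place of $R_{\text{opponent}}$ and that $K = 32$. You (1500) and a 1400-rated teammate beat two 1500-rated opponents 11--8:
\begin{align*}
R_{\text{effective opponent}} &= 1500 + 1500 - 1400 = 1600 \\
E &= \frac{1}{1 + 10^{\frac{1600 - 1500}{400}}} \approx 0.36 \\
P_{\text{actual}} &= \frac{11}{19} \approx 0.579 \\
\Delta R &= 32 \cdot (0.579 - 0.36) \approx 7.0
\end{align*}
Under the same assumption, the exponent in $E$ depends on
\begin{equation*}
R_{\text{effective opponent}} - R_{\text{self}} = (R_{\text{opp1}} + R_{\text{opp2}}) - (R_{\text{self}} + R_{\text{teammate}}),
\end{equation*}
the difference between the two teams' rating sums. Both partners therefore share the same $E$: here your teammate's effective opponent is $1500 + 1500 - 1500 = 1500$, which is also 100 points above their own rating of 1400.
\end{example}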
\section{Before/After: System Migration}

\subsection{What Changed}

We migrated from Glicko-2 (complex, three parameters per player) to pure ELO (one parameter per player).

Key differences:

\begin{table}[h]
\centering
\begin{tabular}{|l|c|c|}
\hline
\textbf{Feature} & \textbf{Glicko-2} & \textbf{Pure ELO} \\
\hline
Parameters per player & 3 (rating, RD, volatility) & 1 (rating only) \\
Complexity & High & Low \\
Transparency & Medium & High \\
Per-point scoring & Yes & Yes \\
Effective opponent (doubles) & Weighted avg & Opp1 + Opp2 $-$ Teammate \\
\hline
\end{tabular}
\end{table}
\subsection{Migration Data: Old vs New Ratings}

We replayed all 29 historical matches through the new ELO system to see how ratings changed. Here's the comparison:

\begin{table}[h]
\centering
\begin{tabular}{|l|r|r|r|r|}
\hline
\textbf{Player} & \textbf{Old Glicko Avg} & \textbf{New ELO} & \textbf{Change} & \textbf{Matches} \\
\hline
Andrew Stricklin & 1651 & 1538 & \textcolor{attention}{$-113$} & 19 \\
David Pabst & 1562 & 1522 & \textcolor{attention}{$-40$} & 11 \\
Jacklyn Wyszynski & 1557 & 1514 & \textcolor{attention}{$-43$} & 9 \\
Eliana Crew & 1485 & 1497 & \textcolor{success}{$+11$} & 13 \\
Krzysztof Radziszeski & 1473 & 1476 & \textcolor{success}{$+3$} & 25 \\
Dane Sabo & 1290 & 1449 & \textcolor{success}{$+159$} & 25 \\
\hline
\end{tabular}
\caption{Rating comparison after replaying all matches through the new system}
\end{table}

\textbf{Key observations:}
\begin{itemize}
\item \textbf{Rating spread compressed:} The old system had 361 points between top and bottom ($1651 - 1290$); the new system has only 89 ($1538 - 1449$). This makes sense: we're a recreational group, not pros.
\item \textbf{Biggest winner:} Dane ($+159$ points). The old system was penalizing him for losses with weaker partners. The new effective opponent formula gives credit for ``carrying.''
\item \textbf{Biggest loser:} Andrew ($-113$ points). Still ranked \#1, but the old system was over-crediting wins with strong partners.
\item \textbf{Per-point scoring matters:} Close losses (9--11) hurt less than blowout losses (2--11). This rewards competitive play even in defeat.
\end{itemize}

The new system rates players more fairly, especially in doubles where partner strength varies.
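\section{Implementation Notes}

\subsection{K-Factor}

We use $K = 32$, which is standard for casual chess. This means:
\begin{itemize}
\item No single match can change your rating by more than 32 points, and an even matchup ($E = 0.5$) can move it by at most 16
\item With per-point scoring, $P_{\text{actual}}$ usually lands much closer to $E$ than a binary win/loss would, so typical changes are a few points per match rather than the 10--20 of win/loss ELO
\item A 100-point move therefore takes correspondingly more matches than the 5--10 that win/loss scoring would need
\item Reasonable for recreational play
\end{itemize}

Alternatives: $K = 48$ (more volatile, faster changes) or $K = 16$ (slower, more stable).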
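\begin{example}[Effect of $K$]
A sketch of how the choice of $K$ scales a single result, using a hypothetical 11--7 win between two 1500-rated players ($E = 0.5$, $P_{\text{actual}} = 11/18 \approx 0.611$):
\begin{align*}
K = 16: \quad \Delta R &= 16 \cdot (0.611 - 0.5) \approx 1.8 \\
K = 32: \quad \Delta R &= 32 \cdot (0.611 - 0.5) \approx 3.6 \\
K = 48: \quad \Delta R &= 48 \cdot (0.611 - 0.5) \approx 5.3
\end{align*}
The change is directly proportional to $K$, so doubling $K$ doubles every gain and every loss.
\end{example}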
\subsection{Starting Rating}

All new players start at 1500. This is arbitrary but standard in ELO systems.

\subsection{Minimum Rating}

Ratings never go below 1. This prevents the system from producing zero or negative ratings.
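\begin{example}[Applying the Floor]
One way to write the floor, assuming it is applied after the rating change is added (the text does not specify the order):
\begin{equation*}
R_{\text{new}} = \max\bigl(1,\; R_{\text{old}} + \Delta R\bigr)
\end{equation*}
With $K = 32$, a single match moves a rating by less than 32 points, so a player starting at 1500 would need a very long losing streak before the floor ever applied.
\end{example}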
\section{Frequently Asked Questions}

\begin{enumerate}
\item \textbf{Why not keep Glicko-2?}
\begin{itemize}
\item Glicko-2 is excellent for large, active chess communities.
\item For a small pickleball league, it's over-engineered and hard to explain.
\item Pure ELO is simpler and still fair.
\end{itemize}

\item \textbf{How do I know if my rating is accurate?}
\begin{itemize}
\item Your rating should settle near your true skill after roughly 10--20 matches.
\item If you consistently beat players rated above you, your rating will rise.
\item If you lose to players rated below you, your rating will drop.
\end{itemize}

\item \textbf{Why does my doubles rating matter in singles?}
\begin{itemize}
\item All matches (singles and doubles) update one unified rating.
\item Your true skill is roughly the same in both formats.
\item The effective opponent formula ensures partner strength doesn't artificially inflate or deflate your rating.
\end{itemize}

\item \textbf{Can I lose rating for a win?}
\begin{itemize}
\item Not as the underdog. A winner always scores more than half the points, so whenever $E \le 0.5$ (say you're rated 1400 and your opponent is rated 2000), any win gains rating.
\item As the favorite, yes, if the win is narrow. Under the formulas above, a 1600-rated player has $E \approx 0.64$ against a 1500-rated opponent; an 11--9 win gives $P_{\text{actual}} = 0.55$, so $\Delta R = 32 \cdot (0.55 - 0.64) \approx -2.9$.
\end{itemize}

\end{enumerate}
\section{Conclusion}

The ELO system is transparent, fair, and easy to understand. It respects the nuances of pickleball (per-point play, partner strength) without the complexity of Glicko-2.

Your rating now reflects your true skill more accurately than ever.

\end{document}