Update v3 report: sassy title, real comparison data table

2026-02-26 11:57:16 -05:00 · 858636018b
commit 858636018b
parent 16e21346c2
1 changed file with 28 additions and 10 deletions
--- a/docs/rating-system-v3-elo.tex
+++ b/docs/rating-system-v3-elo.tex
@@ -25,9 +25,9 @@
 \definecolor{info}{RGB}{0,0,150}

 % Title
-\title{\textbf{Pickleball ELO Rating System} \\[0.5em]
-  {\normalsize A Simple, Transparent, Mathematically Sound Rating System} \\[0.2em]
-  {\normalsize (Now With 100\% Less Volatility!)}}
+\title{\textbf{How Bad Am I, Actually?} \\[0.5em]
+  {\Large Building a Pickleball Rating System That Doesn't Lie} \\[0.2em]
+  {\normalsize (Now With 100\% Less Volatility and 100\% More Accountability)}}
 \author{Split (Implementation) \and Dane Sabo (System Design)}
 \date{February 2026}

@@ -253,18 +253,36 @@ Effective opponent (doubles) & Weighted avg & Opp1+Opp2-Teammate \\
 \end{tabular}
 \end{table}

-\subsection{Migration Data}
+\subsection{Migration Data: Old vs New Ratings}

-Using all historical matches, we recalculated everyone's rating under pure ELO.
+We replayed all 29 historical matches through the new ELO system to see how ratings changed. Here's the comparison:

-\textbf{Average rating changes:}
+\begin{table}[h]
+\centering
+\begin{tabular}{|l|r|r|r|r|}
+\hline
+\textbf{Player} & \textbf{Old Glicko Avg} & \textbf{New ELO} & \textbf{Change} & \textbf{Matches} \\
+\hline
+Andrew Stricklin & 1651 & 1538 & \textcolor{attention}{$-113$} & 19 \\
+David Pabst & 1562 & 1522 & \textcolor{attention}{$-40$} & 11 \\
+Jacklyn Wyszynski & 1557 & 1514 & \textcolor{attention}{$-43$} & 9 \\
+Eliana Crew & 1485 & 1497 & \textcolor{success}{$+11$} & 13 \\
+Krzysztof Radziszeski & 1473 & 1476 & \textcolor{success}{$+3$} & 25 \\
+Dane Sabo & 1290 & 1449 & \textcolor{success}{$+159$} & 25 \\
+\hline
+\end{tabular}
+\caption{Rating comparison after replaying all matches through the new system}
+\end{table}
+
+\textbf{Key observations:}
 \begin{itemize}
-\item Singles: Most players within $\pm 50$ points
-\item Doubles: Most players within $\pm 50$ points
-\item A few players changed by 80--100 points (usually due to playing only with strong or weak partners)
+\item \textbf{Rating spread compressed:} The old system put 361 points between top and bottom (1651 vs.\ 1290); the new system puts only 89 between them (1538 vs.\ 1449). This makes sense: we're a recreational group, not pros.
+\item \textbf{Biggest winner:} Dane ($+159$ points). The old system was penalizing him for losses with weaker partners. The new effective-opponent formula gives credit for ``carrying.''
+\item \textbf{Biggest loser:} Andrew ($-113$ points). He is still ranked \#1, but the old system was over-crediting his wins with strong partners.
+\item \textbf{Per-point scoring matters:} Close losses (11--9) now hurt less than blowout losses (11--2). This rewards competitive play even in defeat.
 \end{itemize}

-The new system generally rates players similarly to Glicko-2, but with better fairness in doubles scenarios.
+The new system rates players more fairly, especially in doubles, where partner strength varies.

 \section{Implementation Notes}