# Pickleball ELO System Refactoring

## Changes Made

### ✅ Change 1: Replace Arbitrary Margin Bonus with Per-Point Expected Value

**Status:** COMPLETE

**File:** `src/glicko/score_weight.rs`

**What Changed:**
- Replaced the `tanh` formula based on margin of victory
- New formula: `performance = actual_points / total_points`
- Expected point probability: `P(win point) = 1 / (1 + 10^((R_opp - R_self)/400))`
- Output: performance ratio (0.0-1.0) instead of an arbitrary margin-weighted score (0.0-1.2)

**Why This Matters:**
- More mathematically sound (uses point-based probability)
- Accounts for the rating difference when calculating expectations
- Under- or overperforming by a single point is now meaningful
- Prevents arbitrary bonuses for blowouts against a much weaker opponent

**Updated Files:**
- `src/glicko/score_weight.rs` - Core calculation
- `src/glicko/calculator.rs` - Test updated
- `examples/email_demo.rs` - Usage updated
- `src/demo.rs` - Usage updated
- `src/simple_demo.rs` - Usage updated

**New Function Signature:**
```rust
pub fn calculate_weighted_score(
    player_rating: f64,
    opponent_rating: f64,
    points_scored: i32,
    points_allowed: i32,
) -> f64
```

---

### ✅ Change 2: Fix RD-Based Distribution (Backwards Logic)

**Status:** COMPLETE

**File:** `src/glicko/doubles.rs`

**What Changed:**
- Changed the weight formula from `1.0 / rd²` to `rd²`
- Higher RD (more uncertain) now gets more of the rating change
- Lower RD (more certain) now gets less of the rating change

**Why This Matters:**
- **Correct Principle:** Uncertain ratings should converge to true skill faster
- **Wrong Before:** Certain players were changing too much, uncertain players too little
- **Real Impact:** New or returning players now update faster; established players update slower

**Updated Function:**
```rust
pub fn distribute_rating_change(
    partner1_rd: f64,
    partner2_rd: f64,
    team_change: f64,
) -> (f64, f64)
```

Example: If the team gains +20 rating points, partner1 has RD=100, and partner2 has RD=200:
- Before: partner1 got ~80% (+16), partner2 got ~20% (+4) (WRONG)
- Now: partner1 gets ~20% (+4), partner2 gets ~80% (+16) (CORRECT)

---

### ✅ Change 3: New Effective Opponent Calculation for Doubles

**Status:** COMPLETE

**File:** `src/glicko/doubles.rs`

**What Was Added:**
- `calculate_effective_opponent_rating()` - Takes the opponent ratings and the teammate rating
- `calculate_effective_opponent()` - Returns a full GlickoRating with appropriate RD/volatility

**Formula:**
```
Effective Opponent Rating = Opp1_rating + Opp2_rating - Teammate_rating
```

**Why This Matters:**
- **Personalizes the rating change** based on partner strength
- **Strong teammate?** Effective opponent rating is lower (they helped)
- **Weak teammate?** Effective opponent rating is higher (you did the work)
- Reflects reality: beating opponents is easier with a strong partner

**Examples:**
- Opponents: 1500, 1500 | Partner: 1500 → Effective: 1500 (neutral)
- Opponents: 1500, 1500 | Partner: 1600 → Effective: 1400 (team was favored)
- Opponents: 1500, 1500 | Partner: 1400 → Effective: 1600 (team was undermanned)

---

### ⏳ Change 4: Combine Singles/Doubles into One Unified Rating

**Status:** IN PROGRESS - DOCUMENTED

**Scope:** This is a significant schema change that requires:

#### Database Schema Changes

**Current Structure:**
```sql
-- Rating-related columns only; id and other columns omitted
CREATE TABLE players (
    singles_rating REAL,
    singles_rd REAL,
    singles_volatility REAL,
    doubles_rating REAL,
    doubles_rd REAL,
    doubles_volatility REAL
);
```

**Proposed New Structure:**
```sql
-- Rating-related columns only; id and other columns omitted
CREATE TABLE players (
    rating REAL,      -- Unified rating
    rd REAL,
    volatility REAL
);
```

**Additional Tables Needed:**
```sql
CREATE TABLE rating_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    player_id INTEGER NOT NULL,
    match_id INTEGER NOT NULL,
    rating_before REAL NOT NULL,
    rating_after REAL NOT NULL,
    rd_before REAL NOT NULL,
    rd_after REAL NOT NULL,
    volatility_before REAL NOT NULL,
    volatility_after REAL NOT NULL,
    match_type TEXT CHECK(match_type IN ('singles', 'doubles')),
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    FOREIGN KEY (player_id) REFERENCES players(id),
    FOREIGN KEY (match_id) REFERENCES matches(id)
);
```

#### Code Changes Needed

1. **`src/models/mod.rs`** - Update `Player` struct
   - Remove `singles_rating`, `singles_rd`, `singles_volatility`
   - Remove `doubles_rating`, `doubles_rd`, `doubles_volatility`
   - Add unified `rating`, `rd`, `volatility`

2. **`src/main.rs`** - Update Web UI
   - Single rating display instead of two
   - Leaderboard shows one rating
   - Match type (singles/doubles) is still tracked in match records

3. **Database Migration** `migrations/002_unified_rating.sql`
   ```sql
   -- Create new columns for unified rating
   -- (named to match the proposed schema: rating, rd, volatility)
   ALTER TABLE players ADD COLUMN rating REAL DEFAULT 1500.0;
   ALTER TABLE players ADD COLUMN rd REAL DEFAULT 350.0;
   ALTER TABLE players ADD COLUMN volatility REAL DEFAULT 0.06;

   -- Copy data (average or weighted average)
   -- SQLite has no ^ operator, so squares are written as products.
   -- sqrt() requires SQLite built with SQLITE_ENABLE_MATH_FUNCTIONS.
   UPDATE players SET
       rating = (singles_rating * 0.5 + doubles_rating * 0.5),
       rd = sqrt((singles_rd * singles_rd + doubles_rd * doubles_rd) / 2),
       volatility = (singles_volatility + doubles_volatility) / 2;

   -- Create rating_history table (already in schema file)
   -- Phase out old columns (keep for backwards compatibility or drop later)
   ```

4. **Demo/Test Files** - Update to use unified rating
   - `src/simple_demo.rs`
   - `src/demo.rs`
   - `examples/email_demo.rs`

#### Implementation Strategy (For Next Iteration)

**Phase 1: Migration & Dual Write** (Current)
- Add new unified rating columns to the `players` table
- Maintain old singles/doubles columns
- Code writes to both (ensures backwards compatibility)

**Phase 2: Testing**
- Verify unified rating calculations
- Compare results with separate singles/doubles ratings
- Test backwards compatibility

**Phase 3: Cutover**
- Switch web UI to show unified rating
- Archive historical singles/doubles data
- Deprecate old columns

**Phase 4: Cleanup** (Optional)
- Remove old columns if no longer needed
- Prune `rating_history` if size becomes an issue

#### Why One Unified Rating?

**Pros:**
- Simpler mental model
- Match type is still tracked in history
- Reduces database complexity
- Single leaderboard

**Cons:**
- Loses the distinction between formats (some players are better at doubles)
- Rating becomes a weighted average of both

**Trade-off Solution:** Keep the match type in the `matches` table. Leaderboards can still be filtered by format in the future, while each player has a single rating.

---

## Compilation & Testing

### Build Status
```bash
cd /Users/split/Projects/pickleball-elo
cargo build --release
```

Expected: ✅ All code should compile successfully

### Test Commands
```bash
cargo test --lib
cargo test --lib glicko::doubles
cargo test --lib glicko::score_weight
```

---

## Files Modified

### Core Changes
- ✅ `src/glicko/score_weight.rs` - Margin bonus → performance ratio
- ✅ `src/glicko/doubles.rs` - RD flip + effective opponent
- ✅ `src/glicko/calculator.rs` - Test update

### Usage Sites
- ✅ `examples/email_demo.rs` - New function signature
- ✅ `src/demo.rs` - New function signature
- ✅ `src/simple_demo.rs` - New function signature

### Not Yet Changed (Deferred to Phase 2)
- ⏳ `src/models/mod.rs` - Player struct update
- ⏳ `src/main.rs` - Web UI updates
- ⏳ `migrations/002_unified_rating.sql` - New migration

---

## Database Backup

- Current: `pickleball.db.backup-20260226-105326` ✅ Available
- Safe to proceed with code changes
- Schema migration can be done in a separate phase

---

## Next Steps

1. ✅ Verify compilation: `cargo build --release`
2. ✅ Run tests: `cargo test`
3. ⏳ Implement unified rating schema changes
4. ⏳ Update Player struct and main.rs
5. ⏳ Test end-to-end with new system
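
---

## Formula Sketch (Changes 1-3)

For quick reference, the formulas from the three completed changes can be sketched in a few lines of Rust. This is a minimal illustration only: the function names below are hypothetical and are not the actual implementations in `src/glicko/`, and the fallbacks for a 0-0 score and for zero RD weights are assumptions. How the real `calculate_weighted_score` combines the expected and actual values is not shown here.

```rust
/// Expected probability of winning a single point, from Change 1:
/// P = 1 / (1 + 10^((R_opp - R_self) / 400)).
fn expected_point_probability(player_rating: f64, opponent_rating: f64) -> f64 {
    1.0 / (1.0 + 10f64.powf((opponent_rating - player_rating) / 400.0))
}

/// Actual share of points won (0.0 to 1.0), from Change 1.
/// Assumption: a 0-0 score is treated as a neutral 0.5.
fn performance_ratio(points_scored: i32, points_allowed: i32) -> f64 {
    let total = points_scored + points_allowed;
    if total <= 0 {
        return 0.5;
    }
    f64::from(points_scored) / f64::from(total)
}

/// Split a team rating change in proportion to rd^2, from Change 2,
/// so the more uncertain partner absorbs more of the change.
/// Assumption: an even split is used if both weights are zero.
fn split_by_rd_squared(partner1_rd: f64, partner2_rd: f64, team_change: f64) -> (f64, f64) {
    let w1 = partner1_rd * partner1_rd;
    let w2 = partner2_rd * partner2_rd;
    let total = w1 + w2;
    if total == 0.0 {
        return (team_change / 2.0, team_change / 2.0);
    }
    (team_change * w1 / total, team_change * w2 / total)
}

/// Effective opponent rating for doubles, from Change 3:
/// Opp1_rating + Opp2_rating - Teammate_rating.
fn effective_opponent_rating(opp1: f64, opp2: f64, teammate: f64) -> f64 {
    opp1 + opp2 - teammate
}

fn main() {
    // An 11-7 win between equally rated players: expected 0.5 per point.
    let expected = expected_point_probability(1500.0, 1500.0);
    let actual = performance_ratio(11, 7);
    println!("expected = {expected:.3}, actual = {actual:.3}");

    // The Change 2 example: +20 team points, RD 100 vs RD 200.
    let (p1, p2) = split_by_rd_squared(100.0, 200.0, 20.0);
    println!("partner1 = {p1:+.1}, partner2 = {p2:+.1}");

    // The Change 3 example with a stronger partner (1600).
    println!("effective = {}", effective_opponent_rating(1500.0, 1500.0, 1600.0));
}
```

With the RD=100 / RD=200 example from Change 2, the split gives partner1 20% of the team change and partner2 80%, matching the corrected behavior described above.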