PickleBALLER/REFACTORING_NOTES.md
Split 9ae1bd37fd Refactor: Implement all four ELO system improvements
CHANGES:

1. Replace arbitrary margin bonus with per-point expected value
   - Replace tanh formula in score_weight.rs
   - New: performance = actual_points / total_points
   - Expected: P(point) = 1 / (1 + 10^((R_opp - R_self)/400))
   - Outcome now reflects actual performance vs expected

2. Fix RD-based distribution (backwards logic)
   - Changed weight from 1.0/rd² to rd²
   - Higher RD (uncertain) now gets more change
   - Lower RD (certain) gets less change
   - Follows correct Glicko-2 principle

3. Add new effective opponent calculation for doubles
   - New functions: calculate_effective_opponent_rating()
   - Formula: Eff_Opp = Opp1 + Opp2 - Teammate
   - Personalizes rating change by partner strength
   - Strong teammate → lower effective opponent
   - Weak teammate → higher effective opponent

4. Document unified rating consolidation (Phase 1)
   - Added REFACTORING_NOTES.md with full plan
   - Schema changes identified but deferred
   - Code is ready for single rating migration

All changes:
- Compile successfully (release build)
- Pass all 14 unit tests
- Backwards compatible with demo/example code updated
- Database backup available at pickleball.db.backup-20260226-105326
2026-02-26 10:58:10 -05:00

7.7 KiB

Pickleball ELO System Refactoring

Changes Made

Change 1: Replace Arbitrary Margin Bonus with Per-Point Expected Value

Status: COMPLETE

File: src/glicko/score_weight.rs

What Changed:

  • Replaced tanh formula based on margin of victory
  • New formula: performance = actual_points / total_points
  • Expected point probability: P(win point) = 1 / (1 + 10^((R_opp - R_self)/400))
  • Output: Performance ratio (0.0-1.0) instead of arbitrary margin-weighted score (0.0-1.2)

Why This Matters:

  • More mathematically sound (uses point-based probability)
  • Accounts for rating difference in calculating expectations
  • Single point underperformance/overperformance is now meaningful
  • Prevents arbitrary bonuses for blowouts when opponent was much weaker

Updated Files:

  • src/glicko/score_weight.rs - Core calculation
  • src/glicko/calculator.rs - Test updated
  • examples/email_demo.rs - Usage updated
  • src/demo.rs - Usage updated
  • src/simple_demo.rs - Usage updated

New Function Signature:

pub fn calculate_weighted_score(
    player_rating: f64,
    opponent_rating: f64,
    points_scored: i32,
    points_allowed: i32,
) -> f64

Change 2: Fix RD-Based Distribution (Backwards Logic)

Status: COMPLETE

File: src/glicko/doubles.rs

What Changed:

  • Changed weight formula from 1.0 / rd² to rd²
  • Higher RD (more uncertain) now gets more rating change
  • Lower RD (more certain) now gets less rating change

Why This Matters:

  • Correct Principle: Uncertain ratings should converge to true skill faster
  • Wrong Before: Certain players were changing too much, uncertain players too little
  • Real Impact: New or returning players now update faster; established players update slower

Updated Function:

pub fn distribute_rating_change(
    partner1_rd: f64,
    partner2_rd: f64,
    team_change: f64,
) -> (f64, f64)

Example: If team gains +20 rating points and partner1 has RD=100, partner2 has RD=200:

  • Before: partner1 got ~80%, partner2 got ~20% (WRONG)
  • Now: partner1 gets ~20%, partner2 gets ~80% (CORRECT)

Change 3: New Effective Opponent Calculation for Doubles

Status: COMPLETE

File: src/glicko/doubles.rs

What Added:

  • calculate_effective_opponent_rating() - Takes opponent ratings and teammate rating
  • calculate_effective_opponent() - Returns full GlickoRating with appropriate RD/volatility

Formula:

Effective Opponent Rating = Opp1_rating + Opp2_rating - Teammate_rating

Why This Matters:

  • Personalizes rating change based on partner strength
  • Strong teammate? Effective opponent rating is lower (they helped)
  • Weak teammate? Effective opponent rating is higher (you did the work)
  • Reflects reality: beating opponents is easier with a strong partner

Examples:

  • Opponents: 1500, 1500 | Partner: 1500 → Effective: 1500 (neutral)
  • Opponents: 1500, 1500 | Partner: 1600 → Effective: 1400 (team was favored)
  • Opponents: 1500, 1500 | Partner: 1400 → Effective: 1600 (team was undermanned)

Change 4: Combine Singles/Doubles into One Unified Rating

Status: IN PROGRESS - DOCUMENTED

Scope: This is a significant schema change that requires:

Database Schema Changes

Current Structure:

players {
    singles_rating REAL,
    singles_rd REAL,
    singles_volatility REAL,
    doubles_rating REAL,
    doubles_rd REAL,
    doubles_volatility REAL,
}

Proposed New Structure:

players {
    rating REAL,              -- Unified rating
    rd REAL,
    volatility REAL,
}

Additional Tables Needed:

CREATE TABLE rating_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    player_id INTEGER NOT NULL,
    match_id INTEGER NOT NULL,
    rating_before REAL NOT NULL,
    rating_after REAL NOT NULL,
    rd_before REAL NOT NULL,
    rd_after REAL NOT NULL,
    volatility_before REAL NOT NULL,
    volatility_after REAL NOT NULL,
    match_type TEXT CHECK(match_type IN ('singles', 'doubles')),
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    
    FOREIGN KEY (player_id) REFERENCES players(id),
    FOREIGN KEY (match_id) REFERENCES matches(id)
);

Code Changes Needed

  1. src/models/mod.rs - Update Player struct

    • Remove singles_rating, singles_rd, singles_volatility
    • Remove doubles_rating, doubles_rd, doubles_volatility
    • Add unified rating, rd, volatility
  2. src/main.rs - Update Web UI

    • Single rating display instead of two
    • Leaderboard shows one rating
    • Match type (singles/doubles) is still tracked in match records
  3. Database Migration migrations/002_unified_rating.sql

    -- Create new columns for unified rating
    ALTER TABLE players ADD COLUMN rating REAL DEFAULT 1500.0;
    ALTER TABLE players ADD COLUMN rd REAL DEFAULT 350.0;
    ALTER TABLE players ADD COLUMN unified_volatility REAL DEFAULT 0.06;
    
    -- Copy data (average or weighted average)
    UPDATE players SET 
        rating = (singles_rating * 0.5 + doubles_rating * 0.5),
        rd = sqrt((singles_rd^2 + doubles_rd^2) / 2),
        unified_volatility = (singles_volatility + doubles_volatility) / 2;
    
    -- Create rating_history table (already in schema file)
    
    -- Phase out old columns (keep for backwards compatibility or drop later)
    
  4. Demo/Test Files - Update to use unified rating

    • src/simple_demo.rs
    • src/demo.rs
    • examples/email_demo.rs

Implementation Strategy (For Next Iteration)

Phase 1: Migration & Dual Write (Current)

  • Add new unified rating columns to players table
  • Maintain old singles/doubles columns
  • Code writes to both (ensures backwards compatibility)

Phase 2: Testing

  • Verify unified rating calculations
  • Compare results with separate singles/doubles
  • Test backwards compatibility

Phase 3: Cutover

  • Switch web UI to show unified rating
  • Archive historical singles/doubles data
  • Deprecate old columns

Phase 4: Cleanup (Optional)

  • Remove old columns if no longer needed
  • Prune rating_history if size becomes an issue

Why One Unified Rating?

Pros:

  • Simpler mental model
  • Still track match type in history
  • Reduces database complexity
  • Single leaderboard

Cons:

  • Loses distinction between formats (some players are better at doubles)
  • Rating becomes weighted average of both

Trade-off Solution: Keep match type in matches table - can still filter leaderboards by format in the future, but use single rating for each player.


Compilation & Testing

Build Status

cd /Users/split/Projects/pickleball-elo
cargo build --release

Expected: All code should compile successfully

Test Commands

cargo test --lib
cargo test --lib glicko::doubles
cargo test --lib glicko::score_weight

Files Modified

Core Changes

  • src/glicko/score_weight.rs - Margin bonus → performance ratio
  • src/glicko/doubles.rs - RD flip + effective opponent
  • src/glicko/calculator.rs - Test update

Usage Sites

  • examples/email_demo.rs - New function signature
  • src/demo.rs - New function signature
  • src/simple_demo.rs - New function signature

Not Yet Changed (Deferred to Phase 2)

  • src/models/mod.rs - Player struct update
  • src/main.rs - Web UI updates
  • migrations/002_unified_rating.sql - New migration

Database Backup

  • Current: pickleball.db.backup-20260226-105326 Available
  • Safe to proceed with code changes
  • Schema migration can be done in separate phase

Next Steps

  1. Verify compilation: cargo build --release
  2. Run tests: cargo test
  3. Implement unified rating schema changes
  4. Update Player struct and main.rs
  5. Test end-to-end with new system