PickleBALLER/REFACTORING_NOTES.md

# Pickleball ELO System Refactoring

## Changes Made

### ✅ Change 1: Replace Arbitrary Margin Bonus with Per-Point Expected Value
**Status:** COMPLETE

**File:** `src/glicko/score_weight.rs`

**What Changed:**
- Replaced the `tanh`-based margin-of-victory formula with a per-point performance ratio
- New formula: `performance = points_scored / (points_scored + points_allowed)`
- Expected point probability: `P(win point) = 1 / (1 + 10^((R_opp - R_self)/400))`
- Output: a performance ratio in the range 0.0-1.0 instead of the previous margin-weighted score in the range 0.0-1.2

**Why This Matters:**
- More mathematically sound: the score is grounded in a per-point win probability rather than a hand-tuned bonus
- Accounts for the rating difference when calculating expectations
- Every point counts: scoring one point more or fewer than expected now shifts the result
- Prevents arbitrary bonuses for blowouts when the opponent was much weaker

**Updated Files:**
- `src/glicko/score_weight.rs` - Core calculation
- `src/glicko/calculator.rs` - Test updated
- `examples/email_demo.rs` - Usage updated
- `src/demo.rs` - Usage updated
- `src/simple_demo.rs` - Usage updated

**New Function Signature:**
```rust
pub fn calculate_weighted_score(
    player_rating: f64,
    opponent_rating: f64,
    points_scored: i32,
    points_allowed: i32,
) -> f64
```
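
The two quantities named above can be sketched as small helpers. This is a minimal illustration, not the contents of `score_weight.rs`: the helper names `expected_point_probability` and `performance_ratio` are hypothetical, the 0.5 fallback for a game with no points is an assumption, and these notes do not specify how `calculate_weighted_score` combines the two values.

```rust
/// Elo-style probability that the player wins a single point,
/// following the formula quoted in "What Changed".
/// (Hypothetical helper name, not taken from the codebase.)
pub fn expected_point_probability(player_rating: f64, opponent_rating: f64) -> f64 {
    1.0 / (1.0 + 10.0_f64.powf((opponent_rating - player_rating) / 400.0))
}

/// Share of all points in the game won by the player.
/// Assumption: a game with no recorded points is treated as neutral (0.5).
pub fn performance_ratio(points_scored: i32, points_allowed: i32) -> f64 {
    let total = points_scored + points_allowed;
    if total <= 0 {
        return 0.5;
    }
    f64::from(points_scored) / f64::from(total)
}
```

For two equally rated players the expected point probability is 0.5, so an 11-9 win (performance 0.55) is only slightly above expectation, while the same 11-9 win against a much lower-rated opponent falls below it.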

---

### ✅ Change 2: Fix RD-Based Distribution (Backwards Logic)
**Status:** COMPLETE

**File:** `src/glicko/doubles.rs`

**What Changed:**
- Changed weight formula from `1.0 / rd²` to `rd²`
- Higher RD (more uncertain) now gets more rating change
- Lower RD (more certain) now gets less rating change

**Why This Matters:**
- **Correct Principle:** Uncertain ratings (high RD) should converge to true skill faster
- **Wrong Before:** Low-RD players were changing too much, high-RD players too little
- **Real Impact:** New or returning players now update faster; established players update slower

**Updated Function:**
```rust
pub fn distribute_rating_change(
    partner1_rd: f64,
    partner2_rd: f64,
    team_change: f64,
) -> (f64, f64)
```

Example: if the team gains +20 rating points, partner1 has RD=100, and partner2 has RD=200, the new `rd²` weights are 10,000 and 40,000, so the split changes as follows:
- Before: partner1 got ~80%, partner2 got ~20% (WRONG)
- Now: partner1 gets ~20%, partner2 gets ~80% (CORRECT)
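
The corrected weighting can be sketched as follows. This is a minimal version under two assumptions not confirmed by these notes: the two shares sum exactly to `team_change`, and a degenerate case where both RDs are zero falls back to an equal split. The actual body in `src/glicko/doubles.rs` may differ.

```rust
/// Splits a team rating change between two partners in proportion to rd²,
/// so the partner with the more uncertain rating absorbs more of the change.
pub fn distribute_rating_change(
    partner1_rd: f64,
    partner2_rd: f64,
    team_change: f64,
) -> (f64, f64) {
    let weight1 = partner1_rd * partner1_rd;
    let weight2 = partner2_rd * partner2_rd;
    let total = weight1 + weight2;

    // Assumption: with no usable weights, split the change evenly.
    if total <= 0.0 {
        return (team_change / 2.0, team_change / 2.0);
    }

    (team_change * weight1 / total, team_change * weight2 / total)
}
```

With the example above, `distribute_rating_change(100.0, 200.0, 20.0)` gives partner1 4 points and partner2 16 points, matching the 20% / 80% split.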

---

### ✅ Change 3: New Effective Opponent Calculation for Doubles
**Status:** COMPLETE

**File:** `src/glicko/doubles.rs`

**What Was Added:**
- `calculate_effective_opponent_rating()` - Takes both opponents' ratings and the teammate's rating, and returns the effective opponent rating
- `calculate_effective_opponent()` - Returns a full `GlickoRating` with appropriate RD/volatility

**Formula:**
```
Effective Opponent Rating = Opp1_rating + Opp2_rating - Teammate_rating
```

**Why This Matters:**
- **Personalizes rating change** based on partner strength
- **Strong teammate?** Effective opponent rating is lower (they helped)
- **Weak teammate?** Effective opponent rating is higher (you did the work)
- Reflects reality: beating opponents is easier with a strong partner

**Examples:**
- Opponents: 1500, 1500 | Partner: 1500 → Effective: 1500 (neutral)
- Opponents: 1500, 1500 | Partner: 1600 → Effective: 1400 (team was favored)
- Opponents: 1500, 1500 | Partner: 1400 → Effective: 1600 (team was undermanned)
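
The formula and examples above translate directly into code. The parameter list below is an assumption, since these notes name the function but do not show its signature.

```rust
/// Effective opponent rating for one player in a doubles match:
/// the sum of both opponents' ratings minus the teammate's rating.
/// (Parameter names and order are assumed for illustration.)
pub fn calculate_effective_opponent_rating(
    opp1_rating: f64,
    opp2_rating: f64,
    teammate_rating: f64,
) -> f64 {
    opp1_rating + opp2_rating - teammate_rating
}
```

One way to read the formula: the gap between a player's own rating and the effective opponent rating equals the gap between the two teams' rating sums, since `R_self - (R_opp1 + R_opp2 - R_teammate) = (R_self + R_teammate) - (R_opp1 + R_opp2)`.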

---

### ⏳ Change 4: Combine Singles/Doubles into One Unified Rating
**Status:** IN PROGRESS - DOCUMENTED

**Scope:** This is a significant schema change that requires the database and code changes listed below.

#### Database Schema Changes

**Current Structure:**
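```sql
-- Rating columns on the players table (other columns omitted)
CREATE TABLE players (
    singles_rating REAL,
    singles_rd REAL,
    singles_volatility REAL,
    doubles_rating REAL,
    doubles_rd REAL,
    doubles_volatility REAL
);
```

**Proposed New Structure:**
```sql
-- Rating columns on the players table (other columns omitted)
CREATE TABLE players (
    rating REAL,              -- Unified rating
    rd REAL,
    volatility REAL
);
```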

**Additional Tables Needed:**
```sql
CREATE TABLE rating_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    player_id INTEGER NOT NULL,
    match_id INTEGER NOT NULL,
    rating_before REAL NOT NULL,
    rating_after REAL NOT NULL,
    rd_before REAL NOT NULL,
    rd_after REAL NOT NULL,
    volatility_before REAL NOT NULL,
    volatility_after REAL NOT NULL,
    match_type TEXT CHECK(match_type IN ('singles', 'doubles')),
    created_at TEXT NOT NULL DEFAULT (datetime('now')),

    FOREIGN KEY (player_id) REFERENCES players(id),
    FOREIGN KEY (match_id) REFERENCES matches(id)
);
```

#### Code Changes Needed

1. **`src/models/mod.rs`** - Update `Player` struct
   - Remove `singles_rating`, `singles_rd`, `singles_volatility`
   - Remove `doubles_rating`, `doubles_rd`, `doubles_volatility`
   - Add unified `rating`, `rd`, `volatility`

2. **`src/main.rs`** - Update Web UI
   - Single rating display instead of two
   - Leaderboard shows one rating
   - Match type (singles/doubles) is still tracked in match records

3. **Database Migration** `migrations/002_unified_rating.sql`
   ```sql
   -- Create new columns for the unified rating
   -- (column names match the proposed schema above)
   ALTER TABLE players ADD COLUMN rating REAL DEFAULT 1500.0;
   ALTER TABLE players ADD COLUMN rd REAL DEFAULT 350.0;
   ALTER TABLE players ADD COLUMN volatility REAL DEFAULT 0.06;

   -- Copy data: equal-weight average of the ratings and volatilities,
   -- root-mean-square of the two RDs.
   -- Assuming SQLite: there is no ^ power operator, so squares are written
   -- as products, and sqrt() needs a build with the math functions enabled.
   UPDATE players SET
       rating = (singles_rating * 0.5 + doubles_rating * 0.5),
       rd = sqrt((singles_rd * singles_rd + doubles_rd * doubles_rd) / 2.0),
       volatility = (singles_volatility + doubles_volatility) / 2.0;

   -- Create rating_history table (already in schema file)

   -- Phase out old columns (keep for backwards compatibility or drop later)
   ```

4. **Demo/Test Files** - Update to use unified rating
   - `src/simple_demo.rs`
   - `src/demo.rs`
   - `examples/email_demo.rs`
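
The merge rule used by the migration in item 3 can also be expressed in Rust, for example to check migrated values in a test. This is a sketch only: the `Rating` struct and `merge_ratings` function are hypothetical names, and the equal 0.5 / 0.5 weighting simply mirrors the migration SQL.

```rust
/// Hypothetical container for one set of Glicko-style values.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rating {
    pub rating: f64,
    pub rd: f64,
    pub volatility: f64,
}

/// Combines separate singles and doubles values into one unified rating:
/// plain average for rating and volatility, root-mean-square for RD.
pub fn merge_ratings(singles: Rating, doubles: Rating) -> Rating {
    Rating {
        rating: 0.5 * singles.rating + 0.5 * doubles.rating,
        rd: ((singles.rd * singles.rd + doubles.rd * doubles.rd) / 2.0).sqrt(),
        volatility: (singles.volatility + doubles.volatility) / 2.0,
    }
}
```

For a player with a singles rating of 1400 (RD 100) and a doubles rating of 1600 (RD 200), this yields a unified rating of 1500 with an RD of about 158.1. The root-mean-square keeps the merged RD closer to the larger of the two, so the unified rating does not look more certain than its inputs justify.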

#### Implementation Strategy (For Next Iteration)

**Phase 1: Migration & Dual Write** (Current)
- Add new unified rating columns to `players` table
- Maintain old singles/doubles columns
- Code writes to both (ensures backwards compatibility)
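
A dual write could look roughly like the sketch below, which uses an in-memory row instead of real database calls. All names here (`MatchType`, `Rating`, `PlayerRow`, `dual_write`) are hypothetical, and it assumes the legacy and unified values are computed separately before being stored.

```rust
/// Hypothetical match format tag.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MatchType {
    Singles,
    Doubles,
}

/// Hypothetical container for one set of Glicko-style values.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rating {
    pub rating: f64,
    pub rd: f64,
    pub volatility: f64,
}

/// In-memory stand-in for a `players` row during Phase 1:
/// the legacy per-format columns live alongside the unified ones.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PlayerRow {
    pub singles: Rating,
    pub doubles: Rating,
    pub unified: Rating,
}

impl PlayerRow {
    /// Stores a match result in both places: the legacy columns for the
    /// format that was played, and the unified columns.
    pub fn dual_write(&mut self, match_type: MatchType, legacy: Rating, unified: Rating) {
        match match_type {
            MatchType::Singles => self.singles = legacy,
            MatchType::Doubles => self.doubles = legacy,
        }
        self.unified = unified;
    }
}
```

Keeping both writes in one method makes it harder for a call site to update one set of columns and forget the other, which is the main risk while old and new columns coexist.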

**Phase 2: Testing**
- Verify unified rating calculations
- Compare results with separate singles/doubles
- Test backwards compatibility

**Phase 3: Cutover**
- Switch web UI to show unified rating
- Archive historical singles/doubles data
- Deprecate old columns

**Phase 4: Cleanup** (Optional)
- Remove old columns if no longer needed
- Prune rating_history if size becomes an issue

#### Why One Unified Rating?

**Pros:**
- Simpler mental model
- Match type is still tracked in history
- Reduces database complexity
- Single leaderboard

**Cons:**
- Loses the distinction between formats (some players are better at doubles)
- The rating becomes a blend of singles and doubles results

**Trade-off Solution:**
Keep the match type in the `matches` table and use a single rating for each player. Leaderboards can still be filtered by format in the future.

---

## Compilation & Testing

### Build Status
```bash
cd /Users/split/Projects/pickleball-elo
cargo build --release
```

Expected: ✅ All code should compile successfully

### Test Commands
```bash
cargo test --lib
cargo test --lib glicko::doubles
cargo test --lib glicko::score_weight
```

---

## Files Modified

### Core Changes
- ✅ `src/glicko/score_weight.rs` - Margin bonus → performance ratio
- ✅ `src/glicko/doubles.rs` - RD flip + effective opponent
- ✅ `src/glicko/calculator.rs` - Test update

### Usage Sites
- ✅ `examples/email_demo.rs` - New function signature
- ✅ `src/demo.rs` - New function signature
- ✅ `src/simple_demo.rs` - New function signature

### Not Yet Changed (Deferred to Change 4)
- ⏳ `src/models/mod.rs` - Player struct update
- ⏳ `src/main.rs` - Web UI updates
- ⏳ `migrations/002_unified_rating.sql` - New migration

---

## Database Backup
- Current: `pickleball.db.backup-20260226-105326` ✅ Available
- Safe to proceed with code changes
- Schema migration can be done in separate phase

---

## Next Steps

1. ✅ Verify compilation: `cargo build --release`
2. ✅ Run tests: `cargo test`
3. ⏳ Implement unified rating schema changes
4. ⏳ Update Player struct and main.rs
5. ⏳ Test end-to-end with new system