Migration: rebuild battle-test learnings + opening-balance orphan fix
- build_rebuild_dataset.py: subtract orphan paired-transfer amounts from
destination card's derived opening; html.unescape descriptions.
- merchant_map.json: +110 auto-tail rules from rebuild long-tail, +20
recurring rules + 135 auto-cluster acceptances; stripped all cached
account_ids; Rock Auto -> Z(Mizumi) review:true; Duquesne Light ->
Utilities; categories stripped from _auto_tail rules per user policy.
- migration/README.md: 'Lessons from the first rebuild' section.
- migration/rebuild_clusters.{json,md}: clustering proposal artifact.
parent 26fb19ca9a
commit e446c4097a
1769
merchant_map.json
File diff suppressed because it is too large
@@ -76,3 +76,32 @@ transfers.
 - `*review_preview*.html` -- review-UI previews on real data
 
 Nothing here writes to Firefly except the final `--post` in step 6.
+
+## Lessons from the first rebuild (2026-05-20)
+
+Captured here so a second rebuild doesn't re-discover them.
+
+- **Orphan paired transfers**: the PNC->Apple payment from 2025-08-01 has no
+  Apple-side line (Apple's QFX starts 08-02). Its effect was already in
+  Apple's derived opening; posting the transfer ALSO crediting Apple
+  double-counted by $3,218. Fix: `build_rebuild_dataset.py` now subtracts
+  orphan transfer amounts from the destination card's opening. See
+  `references/transfers.md` in the skill.
+- **Asset accounts require `account_role`** on POST /accounts. `defaultAsset`
+  works universally.
+- **Budgets do not auto-create.** If wiping to scratch, recreate Needs /
+  Wants / Savings via UI or POST before the import.
+- **Wipe via UI leaves stale revenue accounts / categories** (only
+  transaction-referenced asset accounts go). Prune manually if you want a
+  truly clean slate.
+- **Strip cached `account_id` from `merchant_map.json` before any rebuild.**
+  Pre-wipe ids are invalid post-wipe. The skill no longer caches to the map
+  (in-memory only) but old maps may still carry stale ids.
+- **Background Python with `nohup ... &` can lose stdout to buffering.** Use
+  `python -u` for the import step. The first rebuild's log was empty because
+  Python buffered everything and we mistook it for "ran but did nothing."
+- **`error_if_duplicate_hash` is now off** -- Firefly's content-hash dedup
+  was too eager (rejected legit-distinct rows with same date+amt+desc, like
+  two parking sessions same garage). `external_id` precheck is the only dedup.
+- **Wipe by deleting transactions, not by deleting accounts.** Otherwise you
+  end up with stale ids referenced by merchant_map cache.
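The "strip cached `account_id`" lesson above could be applied with a small helper along these lines. This is a minimal sketch, not the skill's actual code: the layout of `merchant_map.json` is not shown in this commit, so the function simply removes every `account_id` key at any nesting depth, and the `sample` rule below is a hypothetical entry for illustration only.

```python
import json


def strip_key(node, key="account_id"):
    """Return a copy of a JSON-like structure with every `key` entry removed."""
    if isinstance(node, dict):
        return {k: strip_key(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [strip_key(v, key) for v in node]
    return node  # scalars are returned unchanged


# Hypothetical map layout; the real merchant_map.json may differ.
sample = {
    "rules": [
        {"match": "DUQUESNE LIGHT", "category": "Utilities", "account_id": 42},
    ]
}

cleaned = strip_key(sample)
print(json.dumps(cleaned))
```

Because the helper builds a new structure instead of mutating in place, the original map can be kept as a backup until the rebuild has been verified.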
@@ -13,7 +13,7 @@ Costco, with:
 
 Nothing is posted. Output feeds `firefly_import.py --emit-plan/--review-html`.
 """
-import re, json, hashlib, sys
+import re, json, hashlib, sys, html
 from collections import Counter
 
 D = "/Users/danesabo/Documents/Finances/EXPORTS/-MAY172026"
@@ -35,7 +35,7 @@ def parse(path):
     for b in blocks:
         out.append({"date": g(b, "DTPOSTED")[:8], "amt": float(g(b, "TRNAMT")),
                     "ttype": g(b, "TRNTYPE").upper(),
-                    "desc": (g(b, "NAME") + " " + g(b, "MEMO")).strip(),
+                    "desc": html.unescape((g(b, "NAME") + " " + g(b, "MEMO")).strip()),
                     "fitid": g(b, "FITID")})
     return ledger, out
 
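For readers unfamiliar with the change in the hunk above: `html.unescape` from the standard library turns HTML entities that some QFX exports leave in `NAME`/`MEMO` back into plain characters. A minimal sketch of the effect follows; `clean_desc` is a hypothetical stand-in for the inline expression in `parse()`, and the sample strings are made up.

```python
import html


def clean_desc(name, memo):
    """Join NAME and MEMO, trim whitespace, then decode HTML entities."""
    return html.unescape((name + " " + memo).strip())


print(clean_desc("AT&amp;T", "BILL PAY"))   # AT&T BILL PAY
print(clean_desc("TRADER JOE&#39;S", ""))   # TRADER JOE'S
```

Decoding at parse time means that merchant rules matched against the description see `AT&T` rather than `AT&amp;T`.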
@@ -100,6 +100,24 @@ for acct, (path, tag) in SRC.items():
         rec["type"] = "withdrawal" if amt < 0 else "deposit"
         records.append(rec)
 
+# --- Orphan adjustment: a PNC->Apple/Costco payment whose date predates the
+# card QFX window has its card-side effect already baked into the card's
+# DERIVED opening (because opening = ledger - sum_kept_card_lines, and the
+# orphan never appeared on the card side). If we ALSO post the PNC->card
+# transfer in the rebuild, the card account gets credited twice. So subtract
+# orphan transfer amounts from the card opening.
+APPLE_WINDOW_START = "2025-08-02"
+COSTCO_WINDOW_START = "2025-08-02"
+for r in records:
+    if r.get("type") == "transfer" and r["asset_account"] == "PNC Checking":
+        dest = r.get("destination_account")
+        if dest == "Apple Credit Card" and r["date"] < APPLE_WINDOW_START:
+            recon["Apple Credit Card"]["opening"] -= float(r["amount"])
+            recon["Apple Credit Card"]["opening"] = round(recon["Apple Credit Card"]["opening"], 2)
+        elif dest == "Costco Visa Card" and r["date"] < COSTCO_WINDOW_START:
+            recon["Costco Visa Card"]["opening"] -= float(r["amount"])
+            recon["Costco Visa Card"]["opening"] = round(recon["Costco Visa Card"]["opening"], 2)
+
 print("=== RECONCILIATION (must all tie) ===")
 ok = True
 for a, r in recon.items():
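The arithmetic behind the orphan adjustment can be checked in isolation. The sketch below is illustrative and is not the code in `build_rebuild_dataset.py`: `adjust_opening` and `to_iso` are hypothetical helpers, the $3,218 amount is the figure quoted in the README, and the -1782.00 derived opening is a made-up number. It also assumes record dates may arrive either as `YYYYMMDD` (the form `parse()` keeps from `DTPOSTED`) or as ISO `YYYY-MM-DD`, and normalizes them first, because a raw `YYYYMMDD` string never compares as less than an ISO string from the same year (`"0"` sorts after `"-"`).

```python
def to_iso(date_str):
    """Normalize 'YYYYMMDD' to 'YYYY-MM-DD'; leave ISO dates untouched."""
    if len(date_str) == 8 and date_str.isdigit():
        return f"{date_str[:4]}-{date_str[4:6]}-{date_str[6:]}"
    return date_str


def adjust_opening(derived_opening, transfers, window_start):
    """Subtract transfers dated before the card's QFX window from its opening."""
    orphan_total = sum(
        float(t["amount"]) for t in transfers if to_iso(t["date"]) < window_start
    )
    return round(derived_opening - orphan_total, 2)


transfers = [
    {"date": "20250801", "amount": "3218.00"},   # orphan: predates the window
    {"date": "2025-08-15", "amount": "500.00"},  # in window: has a card-side line
]

# Hypothetical derived opening that already reflects the 08-01 payment.
opening = adjust_opening(-1782.00, transfers, "2025-08-02")
print(opening)  # -5000.0
```

With the adjusted opening of -5000.00, posting the 3,218.00 transfer brings the card to -1782.00, which matches the derived figure, so the payment is counted once.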
File diff suppressed because one or more lines are too long
86
sam-bachelor-party-invoice.pdf
Normal file
@@ -0,0 +1,86 @@
(Raw PDF source, 86 lines, not reproduced: PDF 1.4 generated by the ReportLab PDF Library, one page, 306 x 576 pt; Title "Sam's Bachelor Party - Settle Up"; Author Dane Sabo; created 2026-05-25 19:22:22 -04:00.)