Ingestion methods
Ledger supports three ingestion paths. They all funnel into the same normalise / dedup / categorise / tag pipeline, so you can mix and match per bank.
| Method | Setup cost | Reliability | Best for |
|---|---|---|---|
| Playwright MCP | Author a skill once per bank | High once written | Banks you ingest from often |
| Manual download | None | Always works | One-off catch-ups, banks without a skill |
| Basiq API | One-time consent flow | Flaky — connections expire | Quick sync when it happens to be working |
1. Playwright MCP
Browser automation via the Playwright MCP server. You log in (so 2FA / random keypads / captcha all work normally), and Claude Code automates everything after that — navigating to the statements page, iterating accounts, intercepting downloads, and saving files into staging/<source>/ with the correct filename prefix.
Setup

    claude mcp add playwright -- npx -y @playwright/mcp@latest

Then invoke from Claude Code:

    /ingest-bank-statements

This runs the umbrella skill at `.claude/skills/ingest-bank-statements.md`, which routes to a bank-specific skill if one exists.
Existing bank skills
- ING Australia: `.claude/skills/ing-statements.md`. Handles the login pause for the randomised keypad, iterates over each account in the dropdown, and downloads up to 7 years of statements per account.
Adding a new bank
Copy ing-statements.md and adapt:
- Login: Navigate to the bank’s login URL. Fill the username field. Stop and ask the user to handle the password, 2FA, or keypad. Never store or type credentials.
- Navigate: Find selectors for the e-statements / exports page. Many bank sites use Web Components, where `page.evaluate()` can’t see the DOM but `getByRole({ name })` accessibility selectors still work.
- Download loop: Use `browser_run_code` with `Promise.all([page.waitForEvent('download'), <click>])` to intercept download events. Save with `download.saveAs(stagingDir + filename)` because most banks serve generic filenames like `Statement.pdf`.
- Filename: `{file_prefix}_{YYYY-MM-DD}_to_{YYYY-MM-DD}.pdf` (or `.csv`). The `file_prefix` must match an entry in `config/accounts.yaml` so the parser binds to the right account.
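The filename convention can be checked before anything reaches the parser. The following is a minimal sketch, not Ledger's actual code: `staged_filename`, `parse_staged_filename`, and the `known_prefixes` argument are hypothetical names, and the set of prefixes is assumed to have been loaded from `config/accounts.yaml` by the caller.

```python
import re
from datetime import date

# Matches {file_prefix}_{YYYY-MM-DD}_to_{YYYY-MM-DD}.pdf (or .csv).
STAGED_NAME = re.compile(
    r"^(?P<prefix>.+)_(?P<start>\d{4}-\d{2}-\d{2})"
    r"_to_(?P<end>\d{4}-\d{2}-\d{2})\.(?P<ext>pdf|csv)$"
)


def staged_filename(file_prefix: str, start: date, end: date, ext: str = "pdf") -> str:
    """Build a staged filename from an account prefix and a statement period."""
    if ext not in ("pdf", "csv"):
        raise ValueError(f"unsupported extension: {ext}")
    if end < start:
        raise ValueError("statement period ends before it starts")
    return f"{file_prefix}_{start.isoformat()}_to_{end.isoformat()}.{ext}"


def parse_staged_filename(name: str, known_prefixes: set[str]) -> tuple[str, date, date]:
    """Split a staged filename into (prefix, start, end).

    Raises ValueError if the name does not follow the convention or the
    prefix is not one of the known account prefixes (assumed to come from
    config/accounts.yaml).
    """
    match = STAGED_NAME.match(name)
    if match is None:
        raise ValueError(f"not a staged filename: {name}")
    prefix = match["prefix"]
    if prefix not in known_prefixes:
        raise ValueError(f"unknown file_prefix: {prefix}")
    return prefix, date.fromisoformat(match["start"]), date.fromisoformat(match["end"])
```

A download loop could call `staged_filename(...)` to name each file, and a pre-flight check could run `parse_staged_filename(...)` over a staging folder to catch misnamed files early.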
Partial flow (no skill yet)
If there is no skill for your bank, you can still get help once you’re logged in:
- Open the bank’s site via Playwright MCP.
- Log in manually and navigate to the e-statements page.
- Ask Claude to take a snapshot, find the statement list, and run a download loop to save them all into `staging/<source>/`.
If it goes well, capture what worked into a new skill file under .claude/skills/.
2. Manual download
Always works, no setup. The fallback when nothing else does.
- Log into the bank in a normal browser.
- Download statements (PDF or CSV — whichever the parser supports for that source).
- Drop files into the matching `staging/` folder:
| Source | Folder | Format |
|---|---|---|
| ING | staging/ing/ | *.pdf |
| ING (CSV export) | staging/ing-csv/ | *.csv |
| PayPal | staging/paypal/ | *.csv |
| Bankwest | staging/bankwest/ | *.pdf |
| Bankwest (CSV) | staging/bankwest-csv/ | *.csv |
| HSBC | staging/hsbc/ | *.pdf |
| Coles Mastercard | staging/coles/ | *.pdf |
| Amex | staging/amex/ | *.csv |
| Airbnb | staging/airbnb/ | *.csv |
- Prefix filenames with the `file_prefix` from `config/accounts.yaml` so the parser binds to the right account. Example: `isaac_business_2025-07-01_to_2025-09-30.pdf`.
- Run:

      ledger ingest
      # or scope to one source
      ledger ingest --source ing
      # or preview only
      ledger ingest --dry-run

Processed files are moved to `data/archive/` so re-running is safe.
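Moving each file out of staging once it has been processed is what makes a second run harmless. The sketch below illustrates that pattern only; `ingest_staged` and the `process` callback are hypothetical names, and the real `ledger ingest` may order or handle things differently.

```python
import shutil
from pathlib import Path
from typing import Callable


def ingest_staged(staging: Path, archive: Path, process: Callable[[Path], None]) -> list[str]:
    """Process each staged file, then move it to the archive.

    Returns the names of the files handled in this run.
    """
    archive.mkdir(parents=True, exist_ok=True)
    handled = []
    for path in sorted(staging.iterdir()):
        if not path.is_file():
            continue
        process(path)  # hypothetical parser hook; assumed to raise on failure
        # Move only after processing succeeds, so a failed file stays staged.
        shutil.move(str(path), str(archive / path.name))
        handled.append(path.name)
    return handled
```

Because the staging folder is empty after a successful run, calling the function again handles nothing, and a file that failed to parse stays in place for the next attempt.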
3. Basiq API
Basiq is an Australian open-banking aggregator. One consent flow gives Ledger read-only access to ING, HSBC, Bankwest, and Coles.
It’s the fastest sync when it works. In practice, connections expire, MFA re-prompts on the bank side, and Basiq returns 5xx more often than you’d want. Treat it as opportunistic — if the next sync fails, fall back to Playwright or manual download instead of debugging.
One-time setup
Add your API key to .env:

    BASIQ_API_KEY=your-base64-encoded-api-key

Then run:

    ledger connect

This prints a consent URL. Open it in a browser, link your bank accounts, then come back to the terminal. State is persisted to `data/basiq_state.json`.
Sync

    ledger sync
    # scope to one source
    ledger sync --source ing
    # only fetch since a date
    ledger sync --since 2025-01-01
    # preview only
    ledger sync --dry-run

`since` is auto-detected from the most recent `basiq:%` transaction in the DB if you don’t pass it.
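The auto-detection can be pictured as a single query for the newest transaction whose identifier matches `basiq:%`. This is a sketch under stated assumptions: the table name `transactions`, the columns `source_id` and `date`, and the use of SQLite are all hypothetical, since the source does not describe the schema.

```python
import sqlite3
from typing import Optional


def detect_since(conn: sqlite3.Connection) -> Optional[str]:
    """Return the date of the newest Basiq-sourced transaction, or None if there is none."""
    row = conn.execute(
        "SELECT MAX(date) FROM transactions WHERE source_id LIKE 'basiq:%'"
    ).fetchone()
    return row[0]


# Demo against an in-memory database using the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (source_id TEXT, date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("basiq:abc", "2025-01-14", -12.50),
        ("basiq:def", "2025-02-03", -40.00),
        ("ing:statement:7", "2025-03-01", -9.99),  # not from Basiq, so ignored
    ],
)
since = detect_since(conn)
print(since)  # 2025-02-03
```

Storing dates as ISO 8601 strings keeps `MAX(date)` correct, because lexicographic and chronological order coincide for that format.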
Supported institutions
Defined in etl/basiq.py:
| Source | Basiq institution ID |
|---|---|
| ING | AU00201 |
| HSBC | AU07201 |
| Bankwest | AU00401 |
| Coles | AU15301 |
Other banks need Method 1 or 2.
Troubleshooting
- “No bank connections found”: run `ledger connect`; consent may have expired.
- HTTP 401: check `BASIQ_API_KEY`.
- Partial history: some institutions return only ~90 days. Backfill earlier periods using Method 1 or 2.
After ingestion
All three methods feed the same pipeline:
- Dedup: SHA-256 transaction hash.
- Source-of-truth: credit-card payments from your bank are auto-marked as transfers (see Source of Truth).
- Categorise: regex rules from `config/categories.yaml`.
- Tag: multi-tag rules.
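The dedup and categorise stages can be sketched in a few lines. Everything specific here is an assumption made for illustration: the fields that go into the hash, the description normalisation, and the two example rules are hypothetical, and the real rules live in `config/categories.yaml`.

```python
import hashlib
import re


def transaction_hash(account: str, date: str, amount_cents: int, description: str) -> str:
    """SHA-256 over a normalised rendering of the fields assumed to identify a transaction."""
    normalised = " ".join(description.split()).upper()
    key = "|".join([account, date, str(amount_cents), normalised])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def dedup(transactions: list[dict]) -> list[dict]:
    """Keep the first occurrence of each hash, preserving input order."""
    seen = set()
    unique = []
    for txn in transactions:
        digest = transaction_hash(
            txn["account"], txn["date"], txn["amount_cents"], txn["description"]
        )
        if digest not in seen:
            seen.add(digest)
            unique.append(txn)
    return unique


# Hypothetical rules standing in for the contents of config/categories.yaml.
RULES = [
    (re.compile(r"\bWOOLWORTHS\b", re.IGNORECASE), "groceries"),
    (re.compile(r"\bUBER\b", re.IGNORECASE), "transport"),
]


def categorise(description: str, rules=RULES, default: str = "uncategorised") -> str:
    """Return the category of the first rule whose pattern matches the description."""
    for pattern, category in rules:
        if pattern.search(description):
            return category
    return default
```

Hashing a normalised key means the same transaction arriving through two ingestion methods, for example once from a PDF and once from Basiq, collapses to one row as long as both paths produce the same fields.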
Then:

    ledger split --backfill --fy 2025   # business splits for tax
    ledger dedup                        # cross-account duplicate resolution
    python -m api                       # dashboard