Guide 38: AI Configuration

Set up and optimize AI-powered transaction categorization for your workflow

Overview

OtterLedger uses a five-tier AI system to categorize transactions automatically. Each tier acts as a fallback for the one before it: fast, rule-based methods run first, and progressively more powerful AI models take over when earlier tiers cannot produce a confident result. The system is designed to give you high accuracy without requiring a cloud subscription — every tier except the cloud providers works entirely offline.

This guide explains each tier, how to configure it, and how the settings interact with one another.

Settings location: Menu -> Settings -> AI

Machine-wide AI settings are stored in settings.db (a SQLite database) in your AppData folder. API keys are encrypted at rest. These settings, including cloud API keys and provider toggles, are per-machine and shared across every OtterLedger database file you open on that machine. The exceptions are per-database settings, such as the confidence threshold and the XGBoost enable/disable toggle, which are stored in each database file.

Accessing AI Settings

1. Open the main menu (top-left)
2. Select Settings
3. Select the AI tab

The AI settings page is divided into sections: provider status overview, per-tier configuration, web search enrichment, hardware detection, privacy controls, and performance tuning.

[Screenshot: AI Settings page showing the five-tier provider status list at the top]

The Five-Tier Categorization System

When OtterLedger receives a transaction to categorize, it walks through the tiers in order and stops at the first tier that returns a result above your confidence threshold.

Tier	Name	Speed	Privacy	Requires Setup
1	Rules Engine	Instant	Full (local)	No
2	Payee Learning	Instant	Full (local)	No
3	XGBoost ML	Very fast	Full (local)	Optional
4	Local LLM	Seconds	Full (local)	Yes
5	Cloud LLM	1-3 seconds	Data sent to provider	Yes

Recommended starting point: Use tiers 1-3, which are on by default (tiers 1 and 2 cannot be disabled; XGBoost can be turned off per database file), and add a Local LLM or Cloud LLM if you want higher accuracy on unfamiliar payees.

Tip: The tier system is cumulative — enabling Cloud LLM does not disable Local LLM. Each tier only activates when the tiers above it produce no confident result.
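
The fallback behavior described above can be sketched in a few lines of Python. This is an illustration only, not OtterLedger's implementation: the `Suggestion` type, the `categorize` function, and the three stand-in tiers are hypothetical names, and treating "above the threshold" as "at or above" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Suggestion:
    category: str
    confidence: float  # 0.0 to 1.0
    tier: str


# A tier is any callable that takes a payee and returns a Suggestion or None.
Tier = Callable[[str], Optional[Suggestion]]


def categorize(payee: str, tiers: list[Tier], threshold: float) -> Optional[Suggestion]:
    """Walk the tiers in order and stop at the first confident result."""
    for tier in tiers:
        result = tier(payee)
        # Assumption: a result exactly at the threshold counts as confident.
        if result is not None and result.confidence >= threshold:
            return result
    return None  # no tier was confident; the transaction needs manual review


# Hypothetical stand-ins for three of the five tiers.
def rules_engine(payee: str) -> Optional[Suggestion]:
    if payee.upper().startswith("NETFLIX"):
        return Suggestion("Subscriptions", 0.95, "rules")
    return None


def xgboost_ml(payee: str) -> Optional[Suggestion]:
    return Suggestion("Shopping", 0.55, "xgboost")  # a low-confidence guess


def local_llm(payee: str) -> Optional[Suggestion]:
    return Suggestion("Groceries", 0.82, "local_llm")
```

With a 70% threshold, `categorize("NETFLIX.COM", [rules_engine, xgboost_ml, local_llm], 0.70)` stops at the rules tier, while an unfamiliar payee falls through the low-confidence XGBoost guess to the Local LLM stand-in.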

Tier 1: Rules Engine

The Rules Engine runs first on every transaction and is always active. It cannot be disabled.

Rules match transactions based on payee name patterns, keywords, and amounts. When a rule matches, the transaction is categorized immediately with up to 95% confidence — no AI processing occurs.

To manage rules: See Guide 23: AI Categorization.

Rules are the fastest and most predictable tier. If you frequently see the same payees, creating rules for them is the most efficient way to categorize your transactions.

Tier 2: Payee Learning

Payee Learning is always active and requires no configuration. Every time you manually assign or correct a category on a transaction, OtterLedger records the payee-to-category mapping. On future imports, the same payee name is matched against this history and categorized automatically at up to 95% confidence.

Learning data is stored locally in your database file and is never sent to any external service.

Tip: The more corrections you make early on, the more accurate tier 2 becomes. After a few months of use, most common payees are covered by learning alone.
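
As a rough mental model, Payee Learning behaves like a lookup table from payee name to category. The sketch below is hypothetical: the class name, the whitespace and case normalization, and the fixed 0.95 confidence are assumptions chosen to mirror the description above, not the actual storage format.

```python
from typing import Optional


class PayeeLearning:
    """Minimal sketch of a payee-to-category history."""

    def __init__(self) -> None:
        self._history: dict[str, str] = {}

    @staticmethod
    def _key(payee: str) -> str:
        # Assumption: matching ignores case and repeated whitespace.
        return " ".join(payee.upper().split())

    def record_correction(self, payee: str, category: str) -> None:
        # The most recent manual correction wins.
        self._history[self._key(payee)] = category

    def suggest(self, payee: str) -> Optional[tuple[str, float]]:
        category = self._history.get(self._key(payee))
        if category is None:
            return None  # unknown payee; fall through to tier 3
        return category, 0.95  # the guide states "up to 95% confidence"
```

Recording a correction for "Corner  Market" and later importing "CORNER MARKET" would return the learned category in this sketch.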

Tier 3: XGBoost ML

XGBoost is a machine learning classifier that runs locally on your machine. It operates as a Python microservice that OtterLedger communicates with over a local HTTP connection. Unlike the LLM tiers, XGBoost is extremely fast (milliseconds per transaction) and uses very little memory.

The XGBoost service ships with a pre-built merchant dictionary covering thousands of common payees. When the dictionary matches a payee, confidence is high (90%+). For payees not in the dictionary, the ML model uses transaction features (payee name, amount, date) to make a prediction.

[Screenshot: XGBoost section in AI Settings showing enabled toggle and service status]

XGBoost Settings

Setting	Description
Enable XGBoost ML	Turns the XGBoost classifier on or off for this database file
Service URL	Internal address of the XGBoost microservice (default: `http://localhost:8101`)
Minimum confidence	Suggestions below this threshold are discarded (default: 70%)

Service Status

The settings page shows whether the XGBoost service is reachable. If the service is not running, OtterLedger skips tier 3 and proceeds to tier 4 without error.
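
If you want to check reachability yourself, a plain TCP connection test against the Service URL is enough to tell whether anything is listening. The sketch below uses only the Python standard library; the function names are hypothetical, and it deliberately does not call any HTTP endpoint because the service's routes are not documented here.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(service_url: str) -> tuple[str, int]:
    """Extract the host and port from a URL such as http://localhost:8101."""
    parts = urlsplit(service_url)
    host = parts.hostname or "localhost"
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return host, port


def service_reachable(service_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the service address succeeds."""
    try:
        with socket.create_connection(host_and_port(service_url), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unknown: treat as not running.
        return False
```

Calling `service_reachable("http://localhost:8101")` returns False when nothing is listening on the default port, which corresponds to the case where tier 3 is skipped.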

The service starts automatically with OtterLedger on supported platforms. If it fails to start, check the application log at:

C:\Users\<you>\AppData\Local\OtterLedger\logs\openledger-<date>.log

Incremental Learning

XGBoost learns from your corrections the same way Payee Learning does. Each time you manually correct a category, the correction is sent to the XGBoost service as a learning signal. This improves future predictions for that payee without requiring a full model retrain.

Note: XGBoost is enabled per database file. If you open multiple OtterLedger files, each has its own XGBoost enable/disable setting.

Tier 4: Local LLM

The Local LLM tier uses a large language model running entirely on your machine. It handles payees that rules, learning, and XGBoost cannot confidently categorize — particularly obscure, new, or abbreviated payee names.

Local LLM requires a one-time model download and works completely offline after setup. No transaction data is ever sent outside your machine.

[Screenshot: Local LLM section showing model selector dropdown, download button, and connection status]

Supported Model Formats

OtterLedger uses llama.cpp-compatible GGUF models. Models are downloaded directly from OtterLedger's release servers.

Available Models

Model	Size	Min VRAM	Min RAM (CPU)	Best For
Llama 3.2 3B	~1.9 GB	3 GB	8 GB	Low-end hardware, fast CPU-only
Llama 3.1 8B	~4.6 GB	6 GB	16 GB	Balanced accuracy and speed
Qwen 2.5 7B	~4.4 GB	6 GB	16 GB	Strong multilingual support

OtterLedger's hardware detection (see below) recommends the best model for your machine automatically.
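
The table above can be read as a simple fit check. The following sketch lists which models meet the stated minimums for a given machine. It assumes a model fits if either the VRAM minimum (GPU) or the RAM minimum (CPU-only) is met; how OtterLedger actually ranks the candidates is not documented here, so the sketch returns all matches rather than a single recommendation.

```python
# (name, minimum VRAM in GB, minimum RAM in GB for CPU-only), from the table above.
MODELS = [
    ("Llama 3.2 3B", 3, 8),
    ("Llama 3.1 8B", 6, 16),
    ("Qwen 2.5 7B", 6, 16),
]


def models_that_fit(vram_gb: float, ram_gb: float) -> list[str]:
    """Return the models whose minimum requirements this machine meets."""
    fits = []
    for name, min_vram, min_ram in MODELS:
        # Assumption: GPU inference needs min_vram, CPU-only inference needs min_ram.
        if vram_gb >= min_vram or ram_gb >= min_ram:
            fits.append(name)
    return fits
```

For example, a laptop with no dedicated GPU and 8 GB of RAM only qualifies for Llama 3.2 3B in this sketch.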

Downloading a Model

1. Open Settings -> AI -> Local LLM
2. Select a model from the dropdown
3. Click Download Model
4. Wait for the download to complete; progress is shown in the settings panel

Downloads use parallel chunked transfers (8 concurrent connections) and support resume if interrupted. A model download can be cancelled at any time without leaving a corrupted file.
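
To illustrate how a chunked, resumable download can work in general, the sketch below splits a file into byte ranges for 8 connections and builds the standard HTTP Range header for each one. The function names and the resume logic are assumptions for illustration; the downloader's real chunking strategy is not documented here.

```python
def chunk_ranges(total_bytes: int, connections: int = 8) -> list[tuple[int, int]]:
    """Split a file of total_bytes into inclusive (start, end) byte ranges."""
    if total_bytes <= 0:
        return []
    connections = min(connections, total_bytes)
    base, extra = divmod(total_bytes, connections)
    ranges = []
    start = 0
    for i in range(connections):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first chunks
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def range_header(start: int, end: int, already_have: int = 0) -> str:
    """Build a Range header value, skipping bytes saved before an interruption."""
    return f"bytes={start + already_have}-{end}"
```

On resume, each connection only needs to request the part of its range that was not yet written to disk, which is what the `already_have` offset models.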

[Screenshot: Model download in progress showing percentage bar and status text]

Warning: Model files are large (1.9-4.6 GB). Ensure you have sufficient disk space before downloading. Models are stored in the application's Assets/Models directory.

Local LLM Settings

Setting	Description
Enable Local LLM	Turns the local language model on or off
Selected Model	Which GGUF model file to use for inference
Connection status	Shows whether the model loaded successfully

Testing the Connection

Click Test Connection to verify the model loads and responds correctly. The test sends a simple categorization request and reports success or failure. If the model file is missing, a warning banner appears with a Download button.

Hot-Swap Model

Changing the selected model in the dropdown immediately unloads the current model. The new model loads on the next categorization request. There is no need to restart OtterLedger.

Ollama (Legacy)

If you prefer to run your own Ollama instance, you can configure the Ollama endpoint URL and select a model name. This is provided for backward compatibility. The built-in Local LLM (llama.cpp) is recommended for most users as it requires no separate installation.

Ollama Setting	Description
Ollama URL	Endpoint for your Ollama instance (e.g., `http://localhost:11434`)
Model name	Name of the Ollama model to use (e.g., `llama3.2`, `mistral`)

Tier 5: Cloud LLM

Cloud LLM providers offer the highest accuracy, particularly for unusual or international payee names. Transaction data (payee name, amount, and date, but no account numbers or personal identifiers) is sent to the provider's API for processing. The Privacy Controls section describes how to limit which fields are included.

Three providers are supported. You can enable one or more simultaneously. When multiple cloud providers are enabled, OtterLedger can process them in parallel for faster batch categorization.

[Screenshot: Cloud LLM section showing three provider cards (Claude, Gemini, OpenAI) each with enable toggle and API key field]

Supported Cloud Providers

Provider	Model	Notes
Anthropic Claude	claude-3-haiku / claude-3-5-sonnet	Conservative, accurate
Google Gemini	gemini-1.5-flash / gemini-1.5-pro	Fast, cost-effective, recommended
OpenAI ChatGPT	gpt-4o-mini / gpt-4o	Widely used, high accuracy

Tip: Google Gemini is the recommended cloud provider. It has strong accuracy, low latency, and a generous free tier that covers typical personal finance usage.

Setting Up a Cloud Provider

1. Obtain an API key from the provider's developer portal:
   - Claude: https://console.anthropic.com
   - Gemini: https://aistudio.google.com/apikey
   - OpenAI: https://platform.openai.com/api-keys
2. Open Settings -> AI -> Cloud LLM
3. Find the provider card and toggle Enable
4. Paste your API key into the key field
5. Click Test Connection to verify

API keys are encrypted using Windows Data Protection API before being written to settings.db. The key is never stored in plain text.

Cloud Provider Settings

Setting	Description
Enable [Provider]	Turns that provider on or off globally (across all database files)
API Key	Your provider API key (masked after entry; click show to reveal)
Connection status	Not Configured / Not Tested / Connected / Error
Test Connection	Sends a test request to verify the key works

Prefer Cloud Over Local LLM

Enable Prefer Cloud Over Local LLM if you want cloud providers to run before the Local LLM (swapping tiers 4 and 5). This is useful if you have a slow CPU with no GPU and a cloud subscription you want to use as the primary AI.
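
The effect of this setting on tier order can be pictured as a swap of the last two entries. The tier identifiers below are hypothetical labels used only for this sketch.

```python
DEFAULT_ORDER = ["rules", "payee_learning", "xgboost", "local_llm", "cloud_llm"]


def tier_order(prefer_cloud_over_local: bool) -> list[str]:
    """Return the tier order, swapping tiers 4 and 5 when the setting is on."""
    order = list(DEFAULT_ORDER)  # copy so the default list is never mutated
    if prefer_cloud_over_local:
        i, j = order.index("local_llm"), order.index("cloud_llm")
        order[i], order[j] = order[j], order[i]
    return order
```

Tiers 1-3 keep their positions in both cases; only the two LLM tiers trade places.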

Parallel Cloud Processing

When multiple cloud providers are enabled, OtterLedger can send requests to all of them concurrently and use the first response that meets the confidence threshold.

Setting	Description
Use parallel cloud processing	Send requests to multiple providers simultaneously
Concurrent requests	Number of simultaneous API requests (1-20, default: 10)
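
The "first response that meets the confidence threshold" behavior can be sketched with Python's standard `concurrent.futures` module. The provider callables here are hypothetical placeholders that return a `(category, confidence)` pair; real provider clients, error handling, and timeouts would differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional

Provider = Callable[[str], tuple[str, float]]


def first_confident(payee: str, providers: dict[str, Provider],
                    threshold: float) -> Optional[tuple[str, str, float]]:
    """Query all providers at once; return the first confident answer."""
    if not providers:
        return None
    pool = ThreadPoolExecutor(max_workers=len(providers))
    futures = {pool.submit(fn, payee): name for name, fn in providers.items()}
    try:
        for future in as_completed(futures):  # yields futures as they finish
            try:
                category, confidence = future.result()
            except Exception:
                continue  # a failing provider should not block the others
            if confidence >= threshold:
                return futures[future], category, confidence
        return None  # every provider answered below the threshold or failed
    finally:
        # Do not wait for slower providers once a result has been chosen.
        pool.shutdown(wait=False, cancel_futures=True)
```

In this sketch a fast provider with a confident answer wins even if a slower provider would have been more confident, which matches the "first response" wording above.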

Web Search Enrichment

Web search enrichment runs before any AI tier for transactions with cryptic or abbreviated payee names. OtterLedger searches the web for the payee name and uses the result to provide better context to the AI model (e.g., identifying "TSC 0047382" as a farm supply store before sending it to the LLM).

[Screenshot: Web Search Enrichment section with enable toggle and search engine settings]

How It Works

1. A transaction arrives with a cryptic payee name (e.g., "WM SUPERCENTER #4281")
2. OtterLedger strips the store number and searches for "WM SUPERCENTER"
3. Search results identify it as Walmart
4. The enriched payee name "Walmart" is passed to the AI tier
5. The AI tier categorizes the recognizable name more reliably than it would the raw string
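
The store-number stripping step can be approximated with a regular expression. This is a sketch under stated assumptions: the pattern (a trailing run of three or more digits, optionally preceded by "#") is a guess at what counts as a store number, not the rule OtterLedger applies.

```python
import re

# Assumption: a store number is a trailing run of 3+ digits, optionally after "#".
_STORE_NUMBER = re.compile(r"\s*#?\s*\d{3,}\s*$")


def strip_store_number(payee: str) -> str:
    """Remove a trailing store or terminal number before searching."""
    return _STORE_NUMBER.sub("", payee).strip()
```

Applied to the examples in this guide, "WM SUPERCENTER #4281" becomes "WM SUPERCENTER" and "TSC 0047382" becomes "TSC", while names without a trailing number are left unchanged.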

Search Engine Options

OtterLedger supports two search backends:

Engine	Notes
Google (web scraping)	Higher coverage, but may be rate-limited (HTTP 429) with large transaction batches
DuckDuckGo Instant Answer	Privacy-focused, less likely to rate-limit, but misses some payees

Known limitation: Google scraping is subject to rate limiting when processing many transactions at once. If you import large statement files regularly, consider using DuckDuckGo as the primary engine or disabling web search for bulk imports.

Web Search Settings

Setting	Description
Enable web search enrichment	Turns web search on or off globally
Enable Google search	Allows Google to be used as a search backend
Prefer Google over DuckDuckGo	When enabled, tries Google first and falls back to DuckDuckGo

Note: Web search enrichment sends only the payee name (not amounts, dates, or account information) to the search engine. If you require strict data privacy, disable web search enrichment and rely on the local tiers only.
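
The interaction between the two Google settings and the fallback on rate limiting can be sketched as follows. The backend callables and the `RateLimited` exception are hypothetical, and the order used when Google is enabled but not preferred (DuckDuckGo first, Google second) is an assumption.

```python
from typing import Callable, Optional


class RateLimited(Exception):
    """Raised by a backend when the search engine answers HTTP 429."""


def backend_order(enable_google: bool, prefer_google: bool) -> list[str]:
    """Decide which search backends to try, and in what order."""
    if not enable_google:
        return ["duckduckgo"]
    if prefer_google:
        return ["google", "duckduckgo"]
    return ["duckduckgo", "google"]  # assumption: Google remains a fallback


def enrich_payee(payee: str, backends: dict[str, Callable[[str], Optional[str]]],
                 order: list[str]) -> Optional[str]:
    """Try each backend in order; skip any that is rate-limited or finds nothing."""
    for name in order:
        try:
            result = backends[name](payee)
        except RateLimited:
            continue  # fall back to the next backend
        if result:
            return result
    return None  # no enrichment available; the raw payee name is used as-is
```

Only the payee string is passed to a backend in this sketch, mirroring the note above that amounts, dates, and account information are not sent.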

Hardware Detection

OtterLedger automatically detects your system's hardware capabilities when you open the AI settings page. Detection results are used to recommend the best Local LLM model and to warn you if Local LLM will run slowly on your hardware.

[Screenshot: Hardware Detection section showing detected GPU, CPU, and performance estimates]

What Gets Detected

Detected Item	Used For
CPU model and core count	Estimating CPU-only inference speed
Total system RAM	Determining if CPU-only mode is feasible
GPU backend (CUDA, Metal, Vulkan)	Enabling GPU-accelerated inference
GPU memory (VRAM)	Recommending max model size
Estimated tokens/second	Showing expected categorization speed

GPU Backends

Backend	Platform	GPU Vendor
CUDA	Windows, Linux	NVIDIA
Metal	macOS	Apple Silicon, AMD
Vulkan	Windows, Linux	AMD, Intel
CPU	All (fallback)	No GPU required

Performance Estimates

The settings page shows:

- Estimated tokens per second: higher is faster
- Estimated time per transaction: typical categorization time
- Recommended max model size: largest model your hardware can run well

If no GPU is detected, a warning appears: "Running in CPU-only mode. Local LLM categorization will be slow." In CPU-only mode, consider using the Llama 3.2 3B model or relying on Cloud LLM for speed.

Refresh Detection

Click Refresh Hardware Detection to re-run the detection after connecting a new GPU or changing drivers. Detection results are cached for the session.

Privacy Controls

OtterLedger gives you control over what data, if any, is shared with AI services.

[Screenshot: Privacy Controls section with toggle switches]

Setting	Description
Allow anonymized data	Permits sending payee names to cloud providers (required for Cloud LLM to function)
Include merchant names	Includes the payee name in cloud requests
Include amounts	Includes transaction amounts in cloud requests (improves accuracy for amount-dependent categories)

Local tiers (Rules, Payee Learning, XGBoost, Local LLM) never send data anywhere, regardless of these settings. Privacy controls only affect cloud provider requests.

Warning: Disabling "Allow anonymized data" prevents all cloud providers (tier 5) from running. Tiers 1-4 continue to operate normally.

What Data Is Never Sent

Regardless of privacy settings, OtterLedger never sends:

- Account numbers or bank identifiers
- Your name or personal information
- Account balances
- Full transaction notes or memos (only the payee name is used)
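
One way to picture these controls is as an allow-list: a cloud request starts empty and only the fields you have permitted are copied into it. The field names and function below are hypothetical, and the transaction date is left out because the settings table does not say which toggle governs it.

```python
from typing import Optional


def build_cloud_request(txn: dict, allow_anonymized_data: bool,
                        include_merchant_names: bool,
                        include_amounts: bool) -> Optional[dict]:
    """Copy only the permitted fields into a cloud request."""
    if not allow_anonymized_data:
        return None  # cloud providers are skipped entirely
    request = {}
    if include_merchant_names:
        request["payee"] = txn["payee"]
    if include_amounts:
        request["amount"] = txn["amount"]
    # Account numbers, balances, and memos are never copied, whatever the settings.
    return request
```

Because the request is built from an allow-list rather than by deleting fields from the full transaction, a field that is not explicitly named can never be sent by accident.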

General Categorization Settings

These settings apply across all tiers.

[Screenshot: General Categorization section at the top of the AI settings page]

Setting	Description
Enable AI categorization	Master switch — turns all AI categorization on or off
Auto-apply confidence threshold	Automatically apply suggestions above this confidence level without prompting you
Allow AI to create categories	Permits the AI to suggest category names not already in your category list

Confidence Threshold

The confidence threshold controls when AI suggestions are applied automatically vs. when they are flagged for your review.

Threshold	Behavior
0% (Never)	All suggestions require manual review
70%	Auto-apply confident suggestions, review uncertain ones
90%	Only auto-apply very confident suggestions
100%	Effectively never auto-applies (very rare to reach)

A threshold of 70-80% is recommended for most users. Review the Uncategorized filter in the transaction list to see what the AI flagged for your attention.
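
The table above reduces to a small decision function. This sketch treats 0% as the special "Never" value shown in the table and assumes a suggestion exactly at the threshold is auto-applied; both the function name and that boundary behavior are assumptions.

```python
def should_auto_apply(confidence: float, threshold: float) -> bool:
    """Decide whether a suggestion is applied automatically or flagged for review."""
    if threshold <= 0:
        return False  # 0% is labeled "Never": everything goes to manual review
    return confidence >= threshold
```

With a 70% threshold, a 95% rule match is applied automatically and a 65% guess is flagged for review. With a 100% threshold, even the 95% match is flagged.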

Performance Tuning

Sequential vs. Parallel Processing

For large imports (100+ transactions), parallel cloud processing significantly reduces the total time. Each transaction can be sent to a cloud provider independently.

Setting	Recommended Value
Use parallel cloud processing	Enabled
Concurrent requests	5-10 for most API plans; 15-20 for unlimited plans

Reducing concurrency lowers the risk of hitting API rate limits on cloud providers.
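
The Concurrent requests setting acts as a cap on how many transactions are in flight at once. A minimal sketch with a thread pool follows; `categorize_one` is a hypothetical placeholder for a single cloud request, and the clamp to 1-20 mirrors the range given in the settings table.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def clamp_concurrency(requested: int) -> int:
    """Keep the value inside the 1-20 range offered by the setting."""
    return max(1, min(20, requested))


def categorize_batch(payees: list[str], categorize_one: Callable[[str], str],
                     concurrent_requests: int = 10) -> list[str]:
    """Categorize a batch with at most concurrent_requests calls in flight."""
    workers = clamp_concurrency(concurrent_requests)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input payees.
        return list(pool.map(categorize_one, payees))
```

Lowering `concurrent_requests` makes the batch take longer but keeps the request rate under a provider's limit.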

Prefer Cloud Over Local LLM

If your Local LLM is slow (CPU-only mode) and you have a cloud API key, enabling Prefer Cloud Over Local LLM runs your cloud providers before the Local LLM, swapping tiers 4 and 5. The change requires a restart to take effect because it alters the provider pipeline order.

Tips

- Start with tiers 1-3. Rules, Payee Learning, and XGBoost handle the majority of transactions for most users without any cloud setup.
- Download the 3B model first if you are unsure about your hardware. It is small, fast, and sufficient for most payees.
- Use Gemini as your first cloud provider. It has a free API tier and strong accuracy.
- Create rules for your most common payees. Tier 1 rules are faster than any AI and always reliable.
- Review uncategorized transactions weekly. Each correction teaches Payee Learning and XGBoost, making future imports more accurate.
- Disable web search if you do large imports. Rate limiting from Google can slow down bulk categorization. Run web search on smaller batches or use DuckDuckGo.

Troubleshooting

XGBoost service is not available

The XGBoost service is a Python microservice that must be running. Check the application log for startup errors:

C:\Users\<you>\AppData\Local\OtterLedger\logs\openledger-<date>.log

If XGBoost is unavailable, OtterLedger skips it silently and uses tier 4 (Local LLM) instead.

Local LLM shows "Model file missing"

Click Download Model in the settings panel. If you previously downloaded a model and it is missing, it may have been deleted manually. Re-download it from Settings.

Cloud API test returns "Unauthorized"

- Verify you copied the full API key with no leading or trailing spaces
- Confirm the key has not been revoked or expired in the provider's portal
- Check that billing is active for your provider account (some providers disable keys on overdue accounts)

Categorization accuracy is low

- Check that Payee Learning is accumulating corrections: make sure you are correcting wrong categories in the transaction list
- Verify XGBoost is enabled and the service is available (green status in settings)
- If you are running a small Local LLM model (for example, Llama 3.2 3B in CPU-only mode), consider switching to a cloud provider for better accuracy on ambiguous payees
- Enable web search enrichment to improve context for cryptic payee names

Hardware detection shows wrong GPU

Click Refresh Hardware Detection. If the GPU is still not detected, ensure your GPU drivers are up to date and that the appropriate backend (CUDA for NVIDIA, Vulkan for AMD/Intel) is installed.

Google search is rate-limited

Under Settings -> AI -> Web Search Enrichment, disable Prefer Google over DuckDuckGo so that DuckDuckGo becomes the primary search engine. DuckDuckGo is much less likely to be rate-limited in normal usage.

See also: Guide 23: AI Categorization — how categorization works and how to review AI suggestions

OtterLedger User Guide | AI Configuration | February 2026