Guide 38: AI Configuration
Set up and optimize AI-powered transaction categorization for your workflow
Overview
OtterLedger uses a five-tier AI system to categorize transactions automatically. Each tier acts as a fallback for the one before it: fast, rule-based methods run first, and progressively more powerful AI models take over when earlier tiers cannot produce a confident result. The system is designed to give you high accuracy without requiring a cloud subscription — every tier except the cloud providers works entirely offline.
This guide explains each tier, how to configure it, and how the settings interact with one another.
Settings location: Menu -> Settings -> AI
AI settings are stored in two places. Machine-wide settings, such as cloud API keys and provider toggles, are kept in settings.db (a SQLite database) in your AppData folder and apply to every OtterLedger database file you open on that machine. API keys are encrypted at rest. Per-database settings, such as the confidence threshold and the XGBoost enable/disable switch, are stored in each database file.
Accessing AI Settings
- Open the main menu (top-left)
- Select Settings
- Select the AI tab
The AI settings page is divided into sections: provider status overview, per-tier configuration, web search enrichment, hardware detection, privacy controls, and performance tuning.
[Screenshot: AI Settings page showing the five-tier provider status list at the top]
The Five-Tier Categorization System
When OtterLedger receives a transaction to categorize, it walks through the tiers in order and stops at the first tier that returns a result above your confidence threshold.
| Tier | Name | Speed | Privacy | Requires Setup |
|---|---|---|---|---|
| 1 | Rules Engine | Instant | Full (local) | No |
| 2 | Payee Learning | Instant | Full (local) | No |
| 3 | XGBoost ML | Very fast | Full (local) | Optional |
| 4 | Local LLM | Seconds | Full (local) | Yes |
| 5 | Cloud LLM | 1-3 seconds | Data sent to provider | Yes |
Recommended starting point: Use tiers 1-3 (tiers 1 and 2 are always on; tier 3 can be toggled per database file) and add a Local LLM or Cloud LLM if you want higher accuracy on unfamiliar payees.
Tip: The tier system is cumulative — enabling Cloud LLM does not disable Local LLM. Each tier only activates when the tiers above it produce no confident result.
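The fallback behavior can be pictured as a simple loop over tiers. The following is a minimal sketch, not OtterLedger's actual code: the `Suggestion` type, the `categorize` function, and the three toy tiers are hypothetical names invented for illustration, and treating a result exactly at the threshold as confident is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Suggestion:
    category: str
    confidence: float  # 0.0 to 1.0
    tier: str


# A tier takes a payee string and returns a Suggestion, or None if it has no answer.
Tier = Callable[[str], Optional[Suggestion]]


def categorize(payee: str, tiers: Sequence[Tier], threshold: float = 0.70) -> Optional[Suggestion]:
    """Walk the tiers in order and return the first suggestion at or above the threshold."""
    for tier in tiers:
        suggestion = tier(payee)
        if suggestion is not None and suggestion.confidence >= threshold:
            return suggestion
    return None  # no tier was confident enough; leave the transaction for review


# Toy stand-ins for real tiers (hypothetical behavior).
def rules_tier(payee: str) -> Optional[Suggestion]:
    if payee.upper().startswith("NETFLIX"):
        return Suggestion("Subscriptions", 0.95, "rules")
    return None


def weak_ml_tier(payee: str) -> Optional[Suggestion]:
    return Suggestion("Shopping", 0.40, "xgboost")  # below threshold, so skipped


def llm_tier(payee: str) -> Optional[Suggestion]:
    return Suggestion("Groceries", 0.85, "local_llm")


result = categorize("CORNER MARKET 12", [rules_tier, weak_ml_tier, llm_tier])
print(result.tier, result.category)  # prints: local_llm Groceries
```

Because later tiers are only called when earlier ones decline or fall short, adding a slower tier to the end of the list does not slow down transactions that the fast tiers already handle.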
Tier 1: Rules Engine
The Rules Engine runs first on every transaction and is always active. It cannot be disabled.
Rules match transactions based on payee name patterns, keywords, and amounts. When a rule matches, the transaction is categorized immediately with up to 95% confidence — no AI processing occurs.
To manage rules: See Guide 23: AI Categorization.
Rules are the fastest and most predictable tier. If you frequently see the same payees, creating rules for them is the most efficient way to categorize your transactions.
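To illustrate what matching on payee patterns and amounts can look like, here is a small sketch. The `Rule` fields and the `first_match` helper are hypothetical and do not reflect OtterLedger's rule format; the use of regular expressions for payee patterns is also an assumption.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Rule:
    category: str
    payee_pattern: str                 # regular expression, matched case-insensitively
    min_amount: Optional[float] = None
    max_amount: Optional[float] = None

    def matches(self, payee: str, amount: float) -> bool:
        if re.search(self.payee_pattern, payee, re.IGNORECASE) is None:
            return False
        if self.min_amount is not None and amount < self.min_amount:
            return False
        if self.max_amount is not None and amount > self.max_amount:
            return False
        return True


def first_match(rules: Iterable[Rule], payee: str, amount: float) -> Optional[str]:
    """Return the category of the first rule that matches, or None."""
    for rule in rules:
        if rule.matches(payee, amount):
            return rule.category
    return None


rules = [
    Rule("Fuel", r"\bshell\b", max_amount=150.0),
    Rule("Rent", r"acme property", min_amount=500.0),
]
print(first_match(rules, "SHELL OIL 5521", 42.10))  # prints: Fuel
```

The amount bounds show why rules stay predictable: the same payee at an unusual amount simply does not match and falls through to the next tier.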
Tier 2: Payee Learning
Payee Learning is always active and requires no configuration. Every time you manually assign or correct a category on a transaction, OtterLedger records the payee-to-category mapping. On future imports, the same payee name is matched against this history and categorized automatically at up to 95% confidence.
Learning data is stored locally in your database file and is never sent to any external service.
Tip: The more corrections you make early on, the more accurate tier 2 becomes. After a few months of use, most common payees are covered by learning alone.
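Conceptually, Payee Learning is a lookup from payee name to the category you chose most often. The sketch below is an illustration under assumptions: the `PayeeHistory` class is hypothetical, and the normalization (lowercasing and collapsing whitespace) is a guess at what a reasonable implementation might do.

```python
from collections import Counter, defaultdict
from typing import Optional


class PayeeHistory:
    """Remembers which category the user assigned to each payee."""

    def __init__(self) -> None:
        self._counts = defaultdict(Counter)

    @staticmethod
    def _key(payee: str) -> str:
        # Assumed normalization: case-insensitive, whitespace collapsed.
        return " ".join(payee.lower().split())

    def record(self, payee: str, category: str) -> None:
        self._counts[self._key(payee)][category] += 1

    def suggest(self, payee: str) -> Optional[str]:
        counts = self._counts.get(self._key(payee))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


history = PayeeHistory()
history.record("Blue Bottle  Coffee", "Dining")
history.record("blue bottle coffee", "Dining")
history.record("BLUE BOTTLE COFFEE", "Groceries")
print(history.suggest("Blue Bottle Coffee"))  # prints: Dining
```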
Tier 3: XGBoost ML
XGBoost is a machine learning classifier that runs locally on your machine. It operates as a Python microservice that OtterLedger communicates with over a local HTTP connection. Unlike the LLM tiers, XGBoost is extremely fast (milliseconds per transaction) and uses very little memory.
The XGBoost service ships with a pre-built merchant dictionary covering thousands of common payees. When the dictionary matches a payee, confidence is high (90%+). For payees not in the dictionary, the ML model uses transaction features (payee name, amount, date) to make a prediction.
[Screenshot: XGBoost section in AI Settings showing enabled toggle and service status]
XGBoost Settings
| Setting | Description |
|---|---|
| Enable XGBoost ML | Turns the XGBoost classifier on or off for this database file |
| Service URL | Internal address of the XGBoost microservice (default: http://localhost:8101) |
| Minimum confidence | Suggestions below this threshold are discarded (default: 70%) |
Service Status
The settings page shows whether the XGBoost service is reachable. If the service is not running, OtterLedger skips tier 3 and proceeds to tier 4 without error.
The service starts automatically with OtterLedger on supported platforms. If it fails to start, check the application log at:
C:\Users\<you>\AppData\Local\OtterLedger\logs\openledger-<date>.log
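If you want to check by hand whether anything is listening at the Service URL, a plain TCP connection test is enough. This sketch deliberately avoids calling any HTTP endpoint, because the service's routes are not documented here; `is_listening` is a hypothetical helper, and a successful connection only shows that some process accepts connections on that port.

```python
import socket
from urllib.parse import urlsplit


def is_listening(service_url: str = "http://localhost:8101", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(service_url)
    host = parts.hostname or "localhost"
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("reachable" if is_listening() else "not reachable")
```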
Incremental Learning
XGBoost learns from your corrections the same way Payee Learning does. Each time you manually correct a category, the correction is sent to the XGBoost service as a learning signal. This improves future predictions for that payee without requiring a full model retrain.
Note: XGBoost is enabled per database file. If you open multiple OtterLedger files, each has its own XGBoost enable/disable setting.
Tier 4: Local LLM
The Local LLM tier uses a large language model running entirely on your machine. It handles payees that rules, learning, and XGBoost cannot confidently categorize — particularly obscure, new, or abbreviated payee names.
Local LLM requires a one-time model download and works completely offline after setup. No transaction data is ever sent outside your machine.
[Screenshot: Local LLM section showing model selector dropdown, download button, and connection status]
Supported Model Formats
OtterLedger uses llama.cpp-compatible GGUF models. Models are downloaded directly from OtterLedger's release servers.
Available Models
| Model | Size | Min VRAM | Min RAM (CPU) | Best For |
|---|---|---|---|---|
| Llama 3.2 3B | ~1.9 GB | 3 GB | 8 GB | Low-end hardware, fast CPU-only |
| Llama 3.1 8B | ~4.6 GB | 6 GB | 16 GB | Balanced accuracy and speed |
| Qwen 2.5 7B | ~4.4 GB | 6 GB | 16 GB | Strong multilingual support |
OtterLedger's hardware detection (see below) recommends the best model for your machine automatically.
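The minimums in the table can be read as a simple fit check. The sketch below applies them to a machine's VRAM and RAM; it is not the recommendation logic OtterLedger uses, and the `candidates` function and its output labels are hypothetical.

```python
# (model name, minimum VRAM in GB, minimum RAM in GB for CPU-only), from the table above.
MODELS = [
    ("Llama 3.1 8B", 6, 16),
    ("Qwen 2.5 7B", 6, 16),
    ("Llama 3.2 3B", 3, 8),
]


def candidates(vram_gb: float, ram_gb: float) -> list:
    """List models that meet the table's minimums, with the mode they would run in."""
    fits = []
    for name, min_vram, min_ram in MODELS:
        if vram_gb >= min_vram:
            fits.append((name, "gpu"))
        elif ram_gb >= min_ram:
            fits.append((name, "cpu"))
    return fits


print(candidates(vram_gb=4, ram_gb=8))   # prints: [('Llama 3.2 3B', 'gpu')]
print(candidates(vram_gb=0, ram_gb=4))   # prints: []
```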
Downloading a Model
- Open Settings -> AI -> Local LLM
- Select a model from the dropdown
- Click Download Model
- Wait for the download to complete — progress is shown in the settings panel
Downloads use parallel chunked transfers (8 concurrent connections) and support resume if interrupted. A model download can be cancelled at any time without leaving a corrupted file.
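As an illustration of how a chunked transfer can divide a file among 8 connections, the sketch below computes contiguous byte ranges and the matching HTTP `Range` headers. That OtterLedger uses HTTP range requests is an assumption, and `byte_ranges` is a hypothetical helper; no download is performed.

```python
def byte_ranges(total_size: int, parts: int = 8) -> list:
    """Split [0, total_size) into up to `parts` contiguous (start, end) ranges, end inclusive."""
    if total_size <= 0 or parts <= 0:
        return []
    parts = min(parts, total_size)          # never create empty ranges
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)  # spread the remainder over the first ranges
        ranges.append((start, start + length - 1))
        start += length
    return ranges


# One header per connection; HTTP byte ranges are inclusive on both ends.
headers = [{"Range": f"bytes={start}-{end}"} for start, end in byte_ranges(1000, 8)]
print(headers[0], headers[-1])  # prints: {'Range': 'bytes=0-124'} {'Range': 'bytes=875-999'}
```

With fixed ranges like these, resuming an interrupted download amounts to re-requesting only the ranges (or the unfinished tail of a range) that were not fully written.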
[Screenshot: Model download in progress showing percentage bar and status text]
Warning: Model files are large (1.9-4.6 GB). Ensure you have sufficient disk space before downloading. Models are stored in the application's Assets/Models directory.
Local LLM Settings
| Setting | Description |
|---|---|
| Enable Local LLM | Turns the local language model on or off |
| Selected Model | Which GGUF model file to use for inference |
| Connection status | Shows whether the model loaded successfully |
Testing the Connection
Click Test Connection to verify the model loads and responds correctly. The test sends a simple categorization request and reports success or failure. If the model file is missing, a warning banner appears with a Download button.
Hot-Swap Model
Changing the selected model in the dropdown immediately unloads the current model. The new model loads on the next categorization request. There is no need to restart OtterLedger.
Ollama (Legacy)
If you prefer to run your own Ollama instance, you can configure the Ollama endpoint URL and select a model name. This is provided for backward compatibility. The built-in Local LLM (llama.cpp) is recommended for most users as it requires no separate installation.
| Ollama Setting | Description |
|---|---|
| Ollama URL | Endpoint for your Ollama instance (e.g., http://localhost:11434) |
| Model name | Name of the Ollama model to use (e.g., llama3.2, mistral) |
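To confirm that your own Ollama instance answers before pointing OtterLedger at it, you can build a request against Ollama's `/api/generate` endpoint. The sketch below only constructs the request; the prompt text and the `build_ollama_request` helper are hypothetical, and how OtterLedger itself talks to Ollama is not documented here.

```python
import json
from urllib.request import Request


def build_ollama_request(base_url: str, model: str, payee: str) -> Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    # Hypothetical prompt, for illustration only.
    prompt = f"Suggest one spending category for the payee: {payee}"
    body = {"model": model, "prompt": prompt, "stream": False}
    return Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_ollama_request("http://localhost:11434", "llama3.2", "WM SUPERCENTER")
print(req.get_method(), req.full_url)  # prints: POST http://localhost:11434/api/generate
```

To send it, pass `req` to `urllib.request.urlopen` and read the `response` field of the returned JSON. The model name must match one that is already pulled in your Ollama instance.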
Tier 5: Cloud LLM
Cloud LLM providers offer the highest accuracy, particularly for unusual or international payee names. Transaction data (payee name, amount, date — no account numbers or personal identifiers) is sent to the provider's API for processing.
Three providers are supported. You can enable one or more simultaneously. When multiple cloud providers are enabled, OtterLedger can process them in parallel for faster batch categorization.
[Screenshot: Cloud LLM section showing three provider cards (Claude, Gemini, OpenAI) each with enable toggle and API key field]
Supported Cloud Providers
| Provider | Model | Notes |
|---|---|---|
| Anthropic Claude | claude-3-haiku / claude-3-5-sonnet | Conservative, accurate |
| Google Gemini | gemini-1.5-flash / gemini-1.5-pro | Fast, cost-effective, recommended |
| OpenAI ChatGPT | gpt-4o-mini / gpt-4o | Widely used, high accuracy |
Tip: Google Gemini is the recommended cloud provider. It has strong accuracy, low latency, and a generous free tier that covers typical personal finance usage.
Setting Up a Cloud Provider
- Obtain an API key from the provider's developer portal:
- Claude: https://console.anthropic.com
- Gemini: https://aistudio.google.com/apikey
- OpenAI: https://platform.openai.com/api-keys
- Open Settings -> AI -> Cloud LLM
- Find the provider card and toggle Enable
- Paste your API key into the key field
- Click Test Connection to verify
API keys are encrypted using the Windows Data Protection API (DPAPI) before being written to settings.db. The key is never stored in plain text.
Cloud Provider Settings
| Setting | Description |
|---|---|
| Enable [Provider] | Turns that provider on or off globally (across all database files) |
| API Key | Your provider API key (masked after entry; click show to reveal) |
| Connection status | Not Configured / Not Tested / Connected / Error |
| Test Connection | Sends a test request to verify the key works |
Prefer Cloud Over Local LLM
Enable Prefer Cloud Over Local LLM if you want cloud providers to run before the Local LLM (swapping tiers 4 and 5). This is useful if you have a slow CPU with no GPU and a cloud subscription you want to use as the primary AI.
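The effect of this setting on tier order can be stated in a few lines. This is a sketch only; the tier identifiers and the `tier_order` function are hypothetical names.

```python
def tier_order(prefer_cloud: bool = False) -> list:
    """Return the order in which tiers are tried."""
    order = ["rules", "payee_learning", "xgboost", "local_llm", "cloud_llm"]
    if prefer_cloud:
        # Swap tiers 4 and 5 so cloud providers run before the Local LLM.
        order[3], order[4] = order[4], order[3]
    return order


print(tier_order(prefer_cloud=True)[3:])  # prints: ['cloud_llm', 'local_llm']
```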
Parallel Cloud Processing
When multiple cloud providers are enabled, OtterLedger can send requests to all of them concurrently and use the first response that meets the confidence threshold.
| Setting | Description |
|---|---|
| Use parallel cloud processing | Send requests to multiple providers simultaneously |
| Concurrent requests | Number of simultaneous API requests (1-20, default: 10) |
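The "first response that meets the confidence threshold" behavior can be sketched with Python's standard thread pool. Everything here is illustrative: `first_confident` and the three fake providers are hypothetical, real providers would make network calls, and the code assumes Python 3.9 or later for `cancel_futures`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def first_confident(providers: dict, payee: str, threshold: float = 0.70):
    """Query all providers at once; return (name, category, confidence) for the
    first completed answer at or above the threshold, or None."""
    if not providers:
        return None
    pool = ThreadPoolExecutor(max_workers=len(providers))
    try:
        futures = {pool.submit(fn, payee): name for name, fn in providers.items()}
        for future in as_completed(futures):
            try:
                category, confidence = future.result()
            except Exception:
                continue  # one failing provider should not block the others
            if confidence >= threshold:
                return futures[future], category, confidence
        return None
    finally:
        # Do not wait for slower providers once a winner is found.
        pool.shutdown(wait=False, cancel_futures=True)


# Fake providers with different latencies and confidence levels.
def slow_sure(payee):
    time.sleep(0.2)
    return ("Groceries", 0.9)


def fast_unsure(payee):
    time.sleep(0.01)
    return ("Shopping", 0.4)


def medium_sure(payee):
    time.sleep(0.05)
    return ("Groceries", 0.8)


winner = first_confident({"a": slow_sure, "b": fast_unsure, "c": medium_sure}, "CORNER MARKET")
print(winner)
```

In this toy run the fastest provider answers first but falls below the threshold, so the result comes from the next provider to finish with a confident answer.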
Web Search Enrichment
Web search enrichment runs before any AI tier for transactions with cryptic or abbreviated payee names. OtterLedger searches the web for the payee name and uses the result to provide better context to the AI model (e.g., identifying "TSC 0047382" as a farm supply store before sending it to the LLM).
[Screenshot: Web Search Enrichment section with enable toggle and search engine settings]
How It Works
- Transaction arrives with a cryptic payee name (e.g., "WM SUPERCENTER #4281")
- OtterLedger strips the store number and searches for "WM SUPERCENTER"
- Search results identify it as Walmart
- The enriched payee name "Walmart" is passed to the AI tier
- The AI tier now works from a recognizable merchant name instead of the raw statement text, which improves categorization accuracy
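The store-number stripping step in the list above can be sketched with a regular expression. The pattern below is an assumption chosen to handle the two examples in this guide; the `search_query` helper is hypothetical and OtterLedger's actual cleaning rules may differ.

```python
import re

# Trailing identifier: optional "#", then three or more digits at the end of the string.
_TRAILING_ID = re.compile(r"\s*#?\s*\d{3,}\s*$")


def search_query(payee: str) -> str:
    """Remove a trailing store or reference number before searching."""
    stripped = payee.strip()
    cleaned = _TRAILING_ID.sub("", stripped)
    return cleaned or stripped  # keep the original if nothing else would remain


print(search_query("WM SUPERCENTER #4281"))  # prints: WM SUPERCENTER
print(search_query("TSC 0047382"))           # prints: TSC
```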
Search Engine Options
OtterLedger supports two search backends:
| Engine | Notes |
|---|---|
| Google (web scraping) | Higher coverage, but may be rate-limited (HTTP 429) with large transaction batches |
| DuckDuckGo Instant Answer | Privacy-focused, less likely to rate-limit, but misses some payees |
Known limitation: Google scraping is subject to rate limiting when processing many transactions at once. If you import large statement files regularly, consider using DuckDuckGo as the primary engine or disabling web search for bulk imports.
Web Search Settings
| Setting | Description |
|---|---|
| Enable web search enrichment | Turns web search on or off globally |
| Enable Google search | Allows Google to be used as a search backend |
| Prefer Google over DuckDuckGo | When enabled, tries Google first and falls back to DuckDuckGo |
Note: Web search enrichment sends only the payee name (not amounts, dates, or account information) to the search engine. If you require strict data privacy, disable web search enrichment and rely on the local tiers only.
Hardware Detection
OtterLedger automatically detects your system's hardware capabilities when you open the AI settings page. Detection results are used to recommend the best Local LLM model and to warn you if Local LLM will run slowly on your hardware.
[Screenshot: Hardware Detection section showing detected GPU, CPU, and performance estimates]
What Gets Detected
| Detected Item | Used For |
|---|---|
| CPU model and core count | Estimating CPU-only inference speed |
| Total system RAM | Determining if CPU-only mode is feasible |
| GPU backend (CUDA, Metal, Vulkan) | Enabling GPU-accelerated inference |
| GPU memory (VRAM) | Recommending max model size |
| Estimated tokens/second | Showing expected categorization speed |
GPU Backends
| Backend | Platform | GPU Vendor |
|---|---|---|
| CUDA | Windows, Linux | NVIDIA |
| Metal | macOS | Apple Silicon, AMD |
| Vulkan | Windows, Linux | AMD, Intel |
| CPU | All (fallback) | No GPU required |
Performance Estimates
The settings page shows:
- Estimated tokens per second — Higher is faster
- Estimated time per transaction — Typical categorization time
- Recommended max model size — Largest model your hardware can run well
If no GPU is detected, a warning appears: "Running in CPU-only mode. Local LLM categorization will be slow." In CPU-only mode, consider using the Llama 3.2 3B model or relying on Cloud LLM for speed.
Refresh Detection
Click Refresh Hardware Detection to re-run the detection after connecting a new GPU or changing drivers. Detection results are cached for the session.
Privacy Controls
OtterLedger gives you control over what data, if any, is shared with AI services.
[Screenshot: Privacy Controls section with toggle switches]
| Setting | Description |
|---|---|
| Allow anonymized data | Permits sending payee names to cloud providers (required for Cloud LLM to function) |
| Include merchant names | Includes the payee name in cloud requests |
| Include amounts | Includes transaction amounts in cloud requests (improves accuracy for amount-dependent categories) |
Local tiers (Rules, Payee Learning, XGBoost, Local LLM) never send data anywhere, regardless of these settings. Privacy controls only affect cloud provider requests.
Warning: Disabling "Allow anonymized data" prevents all cloud LLM tiers from running. Tiers 1-4 continue to operate normally.
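One way to picture how these toggles shape a cloud request is an allow-list: only fields that a toggle explicitly permits are copied into the payload. The sketch below covers only the toggles in the table above; the field names and the `build_cloud_payload` function are hypothetical.

```python
from typing import Optional


def build_cloud_payload(
    txn: dict,
    allow_anonymized: bool,
    include_merchant: bool,
    include_amount: bool,
) -> Optional[dict]:
    """Return the fields permitted by the privacy toggles, or None if cloud use is off."""
    if not allow_anonymized:
        return None  # cloud tier is skipped entirely
    payload = {}
    if include_merchant:
        payload["payee"] = txn["payee"]
    if include_amount:
        payload["amount"] = txn["amount"]
    # Nothing else is copied, so account numbers, balances, and memos cannot leak.
    return payload


txn = {"payee": "CORNER MARKET", "amount": 23.50, "account_number": "000123", "memo": "weekly shop"}
print(build_cloud_payload(txn, allow_anonymized=True, include_merchant=True, include_amount=False))
# prints: {'payee': 'CORNER MARKET'}
```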
What Data Is Never Sent
Regardless of privacy settings, OtterLedger never sends:
- Account numbers or bank identifiers
- Your name or personal information
- Account balances
- Full transaction notes or memos (only the payee name is used)
General Categorization Settings
These settings apply across all tiers.
[Screenshot: General Categorization section at the top of the AI settings page]
| Setting | Description |
|---|---|
| Enable AI categorization | Master switch — turns all AI categorization on or off |
| Auto-apply confidence threshold | Automatically apply suggestions above this confidence level without prompting you |
| Allow AI to create categories | Permits the AI to suggest category names not already in your category list |
Confidence Threshold
The confidence threshold controls when AI suggestions are applied automatically vs. when they are flagged for your review.
| Threshold | Behavior |
|---|---|
| 0% (Never) | Auto-apply is off; all suggestions require manual review |
| 70% | Auto-apply confident suggestions, review uncertain ones |
| 90% | Only auto-apply very confident suggestions |
| 100% | Effectively never auto-applies, because suggestions rarely reach 100% confidence |
A threshold of 70-80% is recommended for most users. Review the Uncategorized filter in the transaction list to see what the AI flagged for your attention.
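The threshold table reduces to a short decision function. This is a sketch under assumptions: `disposition` is a hypothetical name, a threshold of 0 is treated as the "Never" setting from the table, and whether a suggestion exactly at the threshold is auto-applied is not stated in this guide (the sketch treats it as applied).

```python
def disposition(confidence: float, threshold: float) -> str:
    """Return 'auto-apply' or 'review' for a suggestion.

    Both values are fractions between 0.0 and 1.0.
    """
    if threshold == 0:
        return "review"  # 0% is the "Never" setting: nothing is auto-applied
    return "auto-apply" if confidence >= threshold else "review"


print(disposition(0.82, 0.70))  # prints: auto-apply
print(disposition(0.82, 0.90))  # prints: review
```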
Performance Tuning
Sequential vs. Parallel Processing
For large imports (100+ transactions), parallel cloud processing significantly reduces the total time. Each transaction can be sent to a cloud provider independently.
| Setting | Recommended Value |
|---|---|
| Use parallel cloud processing | Enabled |
| Concurrent requests | 5-10 for most API plans; 15-20 for unlimited plans |
Reducing concurrency lowers the risk of hitting API rate limits on cloud providers.
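The Concurrent requests setting acts as a cap on how many transactions are in flight at once. A minimal sketch with a bounded thread pool follows; `clamp_concurrency` and `categorize_batch` are hypothetical names, and `categorize_one` stands in for a real cloud call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def clamp_concurrency(requested: int, low: int = 1, high: int = 20) -> int:
    """Keep the setting inside the documented 1-20 range."""
    return max(low, min(high, requested))


def categorize_batch(
    payees: Sequence[str],
    categorize_one: Callable[[str], str],
    concurrent_requests: int = 10,
) -> list:
    """Categorize payees with at most `concurrent_requests` calls running at once."""
    workers = clamp_concurrency(concurrent_requests)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input payees.
        return list(pool.map(categorize_one, payees))


print(categorize_batch(["a", "b", "c"], str.upper, concurrent_requests=2))  # prints: ['A', 'B', 'C']
```

Lowering the worker count trades total import time for fewer simultaneous requests, which is the lever to pull when a provider starts returning rate-limit errors.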
Prefer Cloud Over Local LLM
If your Local LLM is slow (CPU-only mode) and you have a cloud API key, enabling Prefer Cloud Over Local LLM runs your cloud provider before the Local LLM. Because this changes the provider pipeline order, it requires a restart to take effect.
Tips
- Start with tiers 1-3. Rules, Payee Learning, and XGBoost handle the majority of transactions for most users without any cloud setup.
- Download the 3B model first if you are unsure about your hardware. It is small, fast, and sufficient for most payees.
- Use Gemini as your first cloud provider. It has a free API tier and strong accuracy.
- Create rules for your most common payees. Tier 1 rules are faster than any AI and always reliable.
- Review uncategorized transactions weekly. Each correction teaches Payee Learning and XGBoost, making future imports more accurate.
- Disable web search if you do large imports. Rate limiting from Google can slow down bulk categorization. Run web search on smaller batches or use DuckDuckGo.
Troubleshooting
XGBoost service is not available
The XGBoost service is a Python microservice that must be running. Check the application log for startup errors:
C:\Users\<you>\AppData\Local\OtterLedger\logs\openledger-<date>.log
If XGBoost is unavailable, OtterLedger skips it silently and uses tier 4 (Local LLM) instead.
Local LLM shows "Model file missing"
Click Download Model in the settings panel. If you previously downloaded a model and it is missing, it may have been deleted manually. Re-download it from Settings.
Cloud API test returns "Unauthorized"
- Verify you copied the full API key with no leading or trailing spaces
- Confirm the key has not been revoked or expired in the provider's portal
- Check that billing is active for your provider account (some providers disable keys on overdue accounts)
Categorization accuracy is low
- Check that Payee Learning is accumulating corrections — make sure you are correcting wrong categories in the transaction list
- Verify XGBoost is enabled and the service is available (green status in settings)
- If using Local LLM in CPU-only mode, consider switching to a cloud provider for better accuracy on ambiguous payees
- Enable web search enrichment to improve context for cryptic payee names
Hardware detection shows wrong GPU
Click Refresh Hardware Detection. If the GPU is still not detected, ensure your GPU drivers are up to date and that the appropriate backend (CUDA for NVIDIA, Vulkan for AMD/Intel) is installed.
Google search is rate-limited
Make DuckDuckGo the primary search engine: under Settings -> AI -> Web Search Enrichment, disable Prefer Google over DuckDuckGo. DuckDuckGo is rarely rate-limited in normal usage.
See also: Guide 23: AI Categorization — how categorization works and how to review AI suggestions
OtterLedger User Guide | AI Configuration | February 2026