Settings
Configure your LLM provider, search method, tuning parameters, chunk settings, and input/output token limits to optimize search quality and AI response behavior.
Who can access this?
analyst, manager, superadmin
Overview
The Settings page lets authorized users configure how BABEH processes queries and generates responses. Changes made here take effect immediately for all new queries; no restart is required.
Screenshot: Settings page with search method radios, sliders, toggles, and model info card
All Configurable Settings
| Setting | Type | Default | Range | Description |
|---|---|---|---|---|
| Model Provider | Select | bedrock | bedrock / gemini | Which LLM provider to use for generating AI responses |
| Search Method | Radio | hybrid | hybrid / vector / bm25 | Which search algorithm to use (see How Search Works) |
| Temperature | Slider | 0.7 | 0.0 – 1.0 | Controls LLM creativity. Lower = more factual, higher = more creative |
| Relevance Threshold | Slider | 0.25 | 0.0 – 1.0 | Minimum score for search results to be included. Higher = stricter filtering |
| Top K | Number | 5 | 1 – 50 | Maximum number of search results sent to the LLM as context |
| Chunk Size | Slider | 500 | 100 – 10,000 | Number of characters per document chunk; affects new uploads only (see the chunking sketch below) |
| Chunk Overlap | Slider | 50 | 0 – 5,000 | Characters shared between consecutive chunks |
| Use LLM | Toggle | On | On / Off | When off, queries return raw search results without AI-generated answers |
| Streaming | Toggle | On | On / Off | When on, responses are streamed word-by-word via SSE |
| Max Query Length | Slider | 100 | 10 – 200 chars | Maximum characters a user may send per query. Requests exceeding this are rejected with HTTP 400. |
| Max Input Tokens | Slider | 600 | 10 – 900 | Maximum tokens sent to the LLM as context. The prompt is truncated (≈4 chars/token) when exceeded; see the truncation sketch below. |
| Max Output Tokens | Slider | 900 | 1 – 1200 | Maximum tokens the LLM may generate. Maps to max_new_tokens (Nova), max_tokens (Claude), maxOutputTokens (Gemini). |
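The chunking settings interact: each new chunk repeats the last Chunk Overlap characters of the previous one, so text that straddles a boundary still appears whole in at least one chunk. Here is a minimal sketch of that behavior, assuming simple character-based splitting; the helper name and implementation are illustrative, not BABEH's actual code:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunk_size-character pieces, each sharing
    overlap characters with its predecessor."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With the defaults (500/50), the window advances 450 characters at a time, so a 1,200-character document yields three overlapping chunks.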
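Similarly, the Max Input Tokens limit is enforced with the ≈4 chars/token heuristic noted in the table. A hedged sketch of that truncation step (the constant and function are illustrative; BABEH's actual tokenization may differ):

```python
CHARS_PER_TOKEN = 4  # rough heuristic from the table above

def truncate_context(context: str, max_input_tokens: int = 600) -> str:
    """Trim context so its estimated token count fits max_input_tokens."""
    max_chars = max_input_tokens * CHARS_PER_TOKEN  # 600 tokens ≈ 2,400 chars
    return context[:max_chars]
```

At the default of 600 tokens, anything past roughly 2,400 characters of context is dropped before the prompt reaches the LLM.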
LLM Provider
BABEH supports two LLM providers, and you can switch between them at any time. A sketch of how the shared settings map onto each provider's request fields follows the tables below.
AWS Bedrock
| Property | Value |
|---|---|
| Default model | amazon.nova-lite-v1:0 |
| Also supports | Anthropic Claude models via Bedrock |
| Features | Sync + streaming responses |
| Auth | AWS credentials (Access Key, Secret Key, Region) |
Google Gemini
| Property | Value |
|---|---|
| Default model | gemini-2.0-flash-exp |
| Also supports | Any Gemini model (configurable via env var) |
| Features | Sync + streaming responses |
| Auth | Gemini API key |
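As a concrete illustration of the Max Output Tokens mapping from the settings table, here is a hedged sketch of a request builder. Only the field names (max_new_tokens, max_tokens, maxOutputTokens) come from the table above; the function, its signature, and the surrounding payload shapes are assumptions for illustration, not BABEH's actual code:

```python
def build_generation_config(provider: str, model: str,
                            temperature: float = 0.7,
                            max_output_tokens: int = 900) -> dict:
    """Translate the shared settings into provider-native field names."""
    if provider == "bedrock" and model.startswith("amazon.nova"):
        # Amazon Nova models take max_new_tokens
        return {"inferenceConfig": {"max_new_tokens": max_output_tokens,
                                    "temperature": temperature}}
    if provider == "bedrock":
        # Anthropic Claude models via Bedrock take max_tokens
        return {"max_tokens": max_output_tokens, "temperature": temperature}
    if provider == "gemini":
        # Gemini takes maxOutputTokens
        return {"generationConfig": {"maxOutputTokens": max_output_tokens,
                                     "temperature": temperature}}
    raise ValueError(f"unknown provider: {provider}")
```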
Temperature Guide
Temperature controls the randomness/creativity of the AI's responses:
| Value | Behavior | Best For |
|---|---|---|
| 0.0 – 0.3 | Very factual, deterministic | Pricing queries, spec lookups, factual Q&A |
| 0.3 – 0.7 | Balanced (default: 0.7) | General queries, explanations |
| 0.7 – 1.0 | More creative, varied | Content generation, brainstorming |
Tuning Tips
Best Practices
- Low relevance threshold (0.1–0.2) — More results, but may include less relevant ones. Good for broad queries.
- High relevance threshold (0.4–0.6) — Fewer, more precise results. Good when you need accuracy.
- Higher Top-K (10–20) — Gives the LLM more context, useful for comparison queries (see the selection sketch after this list).
- Smaller chunks (200–400) — Better for precise, specific questions.
- Larger chunks (800–1500) — Better for context-heavy, explanatory questions.
- Max Query Length (10–200) — Keep at 100 for typical conversational queries; lower it to curb abusive or runaway input.
- Max Input Tokens (10–900) — 600 is a good balance; increase to give the LLM more context at the cost of speed.
- Max Output Tokens (1–1200) — 900 suits most answers; lower for concise replies, raise for detailed explanations.
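To make the threshold and Top-K interaction concrete, here is a minimal sketch of result selection under the defaults (0.25 threshold, Top K of 5). The function name and result shape are illustrative assumptions, not BABEH's actual code:

```python
def select_context(results: list[tuple[str, float]],
                   threshold: float = 0.25, top_k: int = 5) -> list[str]:
    """Keep results scoring at or above threshold, then pass at most
    top_k of the highest-scoring chunks to the LLM."""
    kept = [(chunk, score) for chunk, score in results if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in kept[:top_k]]
```

Raising the threshold shrinks the candidate pool before Top-K applies, which is why a strict threshold can return fewer than Top K results.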
Save & Reset
- Save — Persists all current settings to the database. Changes take effect immediately for new queries.
- Reset — Reverts all settings to their default values.