Settings

Configure your LLM provider, search method, tuning parameters, chunk settings, and input/output token limits to optimize search quality and AI response behavior.

Who can access this?

analyst, manager, superadmin

Overview

The Settings page lets administrators configure how BABEH processes queries and generates responses. Changes made here take effect immediately for all new queries — no restart required.

[Screenshot: Settings page with search method radios, sliders, toggles, and model info card]

All Configurable Settings

| Setting | Type | Default | Range | Description |
| --- | --- | --- | --- | --- |
| Model Provider | Select | bedrock | bedrock / gemini | Which LLM provider to use for generating AI responses |
| Search Method | Radio | hybrid | hybrid / vector / bm25 | Which search algorithm to use (see How Search Works) |
| Temperature | Slider | 0.7 | 0.0 – 1.0 | Controls LLM creativity. Lower = more factual, higher = more creative |
| Relevance Threshold | Slider | 0.25 | 0.0 – 1.0 | Minimum score a search result needs to be included. Higher = stricter filtering |
| Top K | Number | 5 | 1 – 50 | Maximum number of search results sent to the LLM as context |
| Chunk Size | Slider | 500 | 100 – 10,000 | Characters per document chunk (affects new uploads only) |
| Chunk Overlap | Slider | 50 | 0 – 5,000 | Characters shared between consecutive chunks |
| Use LLM | Toggle | On | On / Off | When off, queries return raw search results without an AI-generated answer |
| Streaming | Toggle | On | On / Off | When on, responses are streamed word-by-word via SSE |
| Max Query Length | Slider | 100 | 10 – 200 chars | Maximum characters a user may send per query. Longer requests are rejected with HTTP 400 |
| Max Input Tokens | Slider | 600 | 10 – 900 | Maximum tokens sent to the LLM as context. The prompt is truncated (≈4 chars/token) when this is exceeded; see the sketch after this table |
| Max Output Tokens | Slider | 900 | 1 – 1200 | Maximum tokens the LLM may generate. Maps to max_new_tokens (Nova), max_tokens (Claude), maxOutputTokens (Gemini) |
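
To make the two input limits concrete, here is a minimal sketch of how the Max Query Length check and the Max Input Tokens truncation might be applied. The function names and error handling are hypothetical, not BABEH's actual code; the numbers mirror the defaults above, including the ≈4 chars/token heuristic.

```python
CHARS_PER_TOKEN = 4  # the ~4 chars/token heuristic noted in the table


def validate_query(query: str, max_query_length: int = 100) -> None:
    """Reject over-long queries up front; surfaced to clients as HTTP 400."""
    if len(query) > max_query_length:
        raise ValueError(f"Query exceeds {max_query_length} characters")


def truncate_prompt(prompt: str, max_input_tokens: int = 600) -> str:
    """Trim the assembled context to the input-token budget."""
    max_chars = max_input_tokens * CHARS_PER_TOKEN  # 600 tokens ≈ 2,400 chars
    return prompt[:max_chars]
```

With the defaults, any query over 100 characters is rejected outright, while an oversized prompt is silently cut to roughly 2,400 characters of context.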

LLM Provider

BABEH supports two LLM providers. You can switch between them at any time.

AWS Bedrock

| Property | Value |
| --- | --- |
| Default model | amazon.nova-lite-v1:0 |
| Also supports | Anthropic Claude models via Bedrock |
| Features | Sync + streaming responses |
| Auth | AWS credentials (Access Key, Secret Key, Region) |

Google Gemini

| Property | Value |
| --- | --- |
| Default model | gemini-2.0-flash-exp |
| Also supports | Any Gemini model (configurable via env var) |
| Features | Sync + streaming responses |
| Auth | Gemini API key |
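
Because each provider names its output-token parameter differently (see the settings table above), switching providers means remapping that one setting. Here is a minimal sketch of the mapping; the helper name is hypothetical, but the parameter names match the table.

```python
def output_token_param(provider: str, model_id: str) -> str:
    """Return the provider-specific name for the Max Output Tokens setting."""
    if provider == "bedrock":
        # Claude models on Bedrock expect max_tokens; Nova expects max_new_tokens
        return "max_tokens" if "claude" in model_id else "max_new_tokens"
    if provider == "gemini":
        return "maxOutputTokens"
    raise ValueError(f"Unknown provider: {provider}")


print(output_token_param("bedrock", "amazon.nova-lite-v1:0"))  # max_new_tokens
print(output_token_param("gemini", "gemini-2.0-flash-exp"))    # maxOutputTokens
```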

Temperature Guide

Temperature controls the randomness/creativity of the AI's responses:

| Value | Behavior | Best For |
| --- | --- | --- |
| 0.0 – 0.3 | Very factual, deterministic | Pricing queries, spec lookups, factual Q&A |
| 0.3 – 0.7 | Balanced (default: 0.7) | General queries, explanations |
| 0.7 – 1.0 | More creative, varied | Content generation, brainstorming |
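
To show where Temperature ends up, here is a sketch of a call to the default Bedrock model using boto3's Converse API. Whether BABEH calls Bedrock exactly this way is an assumption, and the region is a placeholder; the inferenceConfig fields are standard Converse parameters.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")  # placeholder region

response = client.converse(
    modelId="amazon.nova-lite-v1:0",  # the default Bedrock model above
    messages=[{"role": "user", "content": [{"text": "What does the Pro plan cost?"}]}],
    inferenceConfig={
        "temperature": 0.2,  # low value: factual pricing lookup, per the guide
        "maxTokens": 900,    # the Max Output Tokens default
    },
)
print(response["output"]["message"]["content"][0]["text"])
```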

Tuning Tips

Best Practices
  • Low relevance threshold (0.1–0.2) — More results, but may include less relevant ones. Good for broad queries.
  • High relevance threshold (0.4–0.6) — Fewer, more precise results. Good when you need accuracy.
  • Higher Top-K (10–20) — Gives the LLM more context, useful for comparison queries.
  • Smaller chunks (200–400) — Better for precise, specific questions.
  • Larger chunks (800–1500) — Better for context-heavy, explanatory questions (see the chunking sketch after this list).
  • Max Query Length (10–200) — Keep at 100 for typical conversational queries; lower it to restrict abuse.
  • Max Input Tokens (10–900) — 600 is a good balance; increase to give the LLM more context at the cost of speed.
  • Max Output Tokens (1–1200) — 900 suits most answers; lower for concise replies, raise for detailed explanations.
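
The chunking sketch referenced above shows how Chunk Size and Chunk Overlap interact, using the defaults (500 and 50). It illustrates the parameters' effect and is not BABEH's actual splitter:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` chars."""
    step = chunk_size - overlap  # each chunk starts 450 chars after the previous one
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


doc = "x" * 1200
print([len(c) for c in chunk_text(doc)])  # [500, 500, 300]
```

Smaller chunk sizes make each window more focused (good for pinpoint answers); larger ones keep more surrounding context together, at the cost of coarser search hits.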

Save & Reset