How Search Works
Understand the hybrid search pipeline — from query decomposition through vector and keyword search to LLM-generated answers with source citations.
The Full Search Pipeline
When a user submits a query, it passes through a multi-stage pipeline that combines different search strategies for maximum relevance. This is the most critical diagram to understand:
1. Query: 'harga honda veloz vs toyota avanza'
2. Query Decomposition. Multi-topic detected, so the query is split into:
   - Sub-query 1: 'harga honda veloz'
   - Sub-query 2: 'harga toyota avanza'
3. Keyword Detection, run on each sub-query, selects the searches to perform:
   - 'harga' → pricing keyword: Pricing Database Search
   - Always: Vector Search (Semantic / ChromaDB)
   - Always: BM25 Search (Keyword / FTS5)
4. Hybrid Fusion (Weighted Combination) of the search results
5. Relevance Threshold (filter low scores)
6. Top-K Selection (default: 5)
7. LLM (Bedrock or Gemini)
8. Response with [1][2] Citations
Search Methods
BABEH supports three search methods, configurable in Settings:
| Method | How it Works | Best For |
|---|---|---|
| Hybrid (default) | Combines Vector + BM25 results with weighted fusion (default: 50% each) | Most queries — balances semantic understanding with keyword precision |
| Vector Only | Pure semantic search via ChromaDB cosine similarity using Cohere multilingual embeddings | Conceptual queries, paraphrased questions, cross-language search |
| BM25 Only | Pure keyword search via SQLite FTS5 full-text indexing | Exact term matching, product codes, technical terms |
Vector Search (Semantic)
The user's query is converted to a 1024-dimensional vector using the
cohere.embed-multilingual-v3 model, then compared against all stored document chunks
using cosine similarity in ChromaDB. This finds results that are
semantically similar even if they use different words.
The Cohere multilingual model understands 100+ languages. A query in Indonesian can match content in English, and vice versa.
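To make the comparison step concrete, here is a minimal sketch of cosine-similarity ranking in plain Python. The three-dimensional vectors and chunk names are made-up stand-ins for the 1024-dimensional embeddings. In BABEH the embedding and comparison are handled by Cohere and ChromaDB, not by code like this.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings; real ones have 1024 dimensions.
query_vec = [0.9, 0.1, 0.0]
chunks = {
    "chunk_pricing": [0.8, 0.2, 0.1],
    "chunk_warranty": [0.0, 0.3, 0.9],
}

# Rank chunk names from most to least similar to the query.
ranked = sorted(
    chunks,
    key=lambda name: cosine_similarity(query_vec, chunks[name]),
    reverse=True,
)
```

Because cosine similarity depends only on direction, two texts with similar meaning can score highly even when they share no words.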
BM25 Search (Keyword)
Uses SQLite's FTS5 (Full-Text Search version 5) engine with the BM25 ranking algorithm. Scores are based on term frequency (how often the keyword appears in a chunk) and inverse document frequency (how rare the keyword is across all chunks).
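The following sketch shows what an FTS5 query ranked by BM25 can look like using Python's built-in sqlite3 module. The table name, column name, and sample chunks are hypothetical, and it assumes the local SQLite build has FTS5 enabled. BABEH's actual schema is not documented here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and sample chunks, for illustration only.
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(content)")
conn.executemany(
    "INSERT INTO chunks(content) VALUES (?)",
    [
        ("harga honda civic terbaru",),
        ("spesifikasi mesin toyota avanza",),
        ("promo dan harga toyota avanza",),
        ("jadwal servis berkala",),
        ("garansi dan layanan purna jual",),
    ],
)

# FTS5's bm25() returns lower (more negative) values for better matches,
# so ascending order puts the best match first.
rows = conn.execute(
    "SELECT content, bm25(chunks) AS score FROM chunks "
    "WHERE chunks MATCH ? ORDER BY score",
    ("harga",),
).fetchall()
```

Only the two chunks containing "harga" are returned. Note that raw bm25() values are negative and unbounded, so a system that mixes them with similarity scores would need to normalize them first.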
Hybrid Fusion
In hybrid mode, results from both search methods are combined using weighted scores:
final_score = (vector_weight × vector_score) + (bm25_weight × bm25_score)
Default weights: Vector 50% + BM25 50%. These weights are configurable in the system configuration.
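As an illustration of the weighted formula, the sketch below fuses two score dictionaries keyed by chunk ID. It assumes both score sets are already normalized to the 0 to 1 range and that a chunk missing from one result list contributes 0 for that method. Neither assumption is confirmed by this page, and the function name is hypothetical.

```python
def fuse_scores(
    vector_scores: dict[str, float],
    bm25_scores: dict[str, float],
    vector_weight: float = 0.5,
    bm25_weight: float = 0.5,
) -> list[tuple[str, float]]:
    """Apply final_score = vector_weight * vector + bm25_weight * bm25."""
    chunk_ids = set(vector_scores) | set(bm25_scores)
    fused = {
        cid: vector_weight * vector_scores.get(cid, 0.0)  # assumed 0 if absent
        + bm25_weight * bm25_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores: "a" is found by both methods, "b" and "c" by one each.
fused = fuse_scores({"a": 0.8, "b": 0.4}, {"a": 0.2, "c": 0.6})
```

With equal weights, a chunk found by only one method keeps half of its score, so chunks that both methods agree on tend to rise to the top.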
Query Decomposition
BABEH automatically detects multi-topic queries — questions that compare or ask about multiple items simultaneously. These are split into sub-queries for better retrieval.
Detection Patterns
| Pattern | Language | Example |
|---|---|---|
| A vs B | EN / ID | "Honda Civic vs Toyota Corolla" |
| A atau B | ID | "Civic atau Corolla?" |
| A or B | EN | "Civic or Corolla?" |
| A dan B | ID | "fitur Civic dan Corolla" |
| A and B | EN | "features of Civic and Corolla" |
| perbedaan A dan B | ID | "perbedaan Civic dan Corolla" |
| difference between A and B | EN | "difference between Civic and Corolla" |
When a multi-topic query is detected, each sub-topic is searched independently, and results are merged for more diverse and comprehensive coverage.
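The detection patterns can be approximated with two regular expressions, as in this deliberately naive sketch. The function name and regexes are assumptions, not BABEH's implementation. In particular, it does not carry shared words into each sub-query, whereas the pipeline example shows 'harga' being repeated in both sub-queries.

```python
import re

# Leading phrases from the "difference" patterns, removed before splitting.
_PREFIXES = re.compile(r"^(?:perbedaan|difference between)\s+", re.IGNORECASE)
# Connectors from the detection-pattern table.
_CONNECTORS = re.compile(r"\s+(?:vs\.?|atau|or|dan|and)\s+", re.IGNORECASE)

def decompose(query: str) -> list[str]:
    """Split a multi-topic query into sub-queries; single-topic queries pass through."""
    cleaned = _PREFIXES.sub("", query.strip().rstrip("?"))
    parts = [p.strip() for p in _CONNECTORS.split(cleaned) if p.strip()]
    return parts if len(parts) > 1 else [query]
```

For example, `decompose("perbedaan Civic dan Corolla")` yields `["Civic", "Corolla"]`, while a query with no connector is returned unchanged as a single-item list.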
Automatic Database Detection
BABEH automatically scans the query for keywords that indicate the user is asking about pricing or specifications. When detected, the relevant database is searched in addition to the document knowledge base.
Pricing Keywords
If any of these words appear in the query, the Pricing Database is searched:
harga, price, berapa, biaya, cost, tarif, pricing,
promo, diskon, discount, budget, kisaran, sekitar
Specification Keywords
If any of these words appear, the Product Specifications database is searched:
spesifikasi, spec, fitur, feature, transmisi, transmission,
mesin, engine, cc, tenaga, hp, hybrid, electric, listrik,
bensin, petrol, 4wd, awd, fwd, sunroof, carplay,
kamera, kamera 360, 360 camera, wireless, kursi,
kapasitas, seat, cooling seat, heated seat, cruise control,
lane assist, android auto, keyless, push start, rear camera,
electric seat, punya, ada fitur, apakah ada
A query like "harga dan spesifikasi Honda Civic" triggers both pricing and spec detection, searching all three sources (documents + pricing + specs) simultaneously.
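A minimal sketch of this keyword routing is shown below. The source names, function names, and the whole-word matching rule are assumptions, and the specification set is abbreviated for space. Whole-word matching is used here so that a short keyword such as "cc" does not fire inside an unrelated word.

```python
import re

PRICING_KEYWORDS = {
    "harga", "price", "berapa", "biaya", "cost", "tarif", "pricing",
    "promo", "diskon", "discount", "budget", "kisaran", "sekitar",
}
# Abbreviated subset of the specification keywords listed above.
SPEC_KEYWORDS = {
    "spesifikasi", "spec", "fitur", "feature", "mesin", "engine",
    "cc", "kamera 360", "cruise control",
}

def _contains_any(query: str, keywords: set[str]) -> bool:
    """True if any keyword appears in the query as a whole word or phrase."""
    q = query.lower()
    return any(re.search(rf"\b{re.escape(k)}\b", q) for k in keywords)

def detect_sources(query: str) -> list[str]:
    """Documents are always searched; databases are added on keyword hits."""
    sources = ["documents"]
    if _contains_any(query, PRICING_KEYWORDS):
        sources.append("pricing")
    if _contains_any(query, SPEC_KEYWORDS):
        sources.append("specs")
    return sources
```

With these sets, `detect_sources("harga dan spesifikasi Honda Civic")` returns all three sources, matching the example above.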
Relevance Threshold & Scoring
After search results are collected, they are filtered by a relevance threshold to ensure only meaningful results reach the LLM.
| Setting | Default | Range | Effect |
|---|---|---|---|
| Relevance Threshold | 0.25 | 0.0 – 1.0 | Results below this score are filtered out before reaching the LLM |
| Top K | 5 | 1 – 50 | Maximum number of results sent to the LLM as context |
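The two settings can be read as a filter followed by a cut-off. The sketch below assumes results are (chunk_id, score) pairs and that a score exactly equal to the threshold is kept, since the table says only results below it are removed. The function name is hypothetical.

```python
def select_context(
    results: list[tuple[str, float]],
    threshold: float = 0.25,
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Drop results scoring below the threshold, then keep the top_k best."""
    kept = [r for r in results if r[1] >= threshold]
    kept.sort(key=lambda r: r[1], reverse=True)
    return kept[:top_k]

# Hypothetical fused results.
results = [("a", 0.9), ("b", 0.2), ("c", 0.25), ("d", 0.6)]
context = select_context(results)
```

Raising the threshold trades recall for precision: fewer chunks reach the LLM, but those that do are more likely to be relevant.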
Score Color Coding
In the Search Debug tool, scores are color-coded:
| Color | Score Range | Meaning |
|---|---|---|
| Green | ≥ 0.5 | High relevance, strong match |
| Yellow | ≥ 0.3 and < 0.5 | Moderate relevance, decent match |
| Red | < 0.3 | Low relevance, may be filtered out |
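The color bands amount to a simple lookup, sketched here with a hypothetical function name. Note that with the default threshold of 0.25, a score such as 0.27 is shown in red but still passes the filter.

```python
def score_color(score: float) -> str:
    """Map a relevance score to the Search Debug color band."""
    if score >= 0.5:
        return "green"
    if score >= 0.3:
        return "yellow"
    return "red"
```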
Citation System
BABEH uses a numbered citation system to ensure AI-generated answers are traceable back to their source documents.
How Citations Work
1. The retrieved chunks are provided to the LLM as numbered sources:
   [1] Document A - chunk content...
   [2] Document B - chunk content...
2. The LLM generates an answer using the sources.
3. The LLM returns the answer to the user, for example: "The Honda Civic has 150HP [1] and costs around 500 juta [2]".
4. Citations [1], [2] link back to source documents.
The response includes a citations JSON that maps each reference number to its source document, enabling the frontend to display clickable source links.
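One way to picture the numbering is the sketch below, which builds the numbered source list for the prompt together with a citations mapping. The function name and the JSON field name ("document") are hypothetical, since the actual schema of the citations JSON is not specified on this page.

```python
import json

def build_context(chunks: list[tuple[str, str]]) -> tuple[str, dict]:
    """chunks: (document_name, chunk_text) pairs, already ranked.

    Returns the numbered context text and a number-to-source mapping.
    """
    lines = []
    citations = {}
    for number, (document, text) in enumerate(chunks, start=1):
        lines.append(f"[{number}] {document} - {text}")
        citations[str(number)] = {"document": document}  # hypothetical field
    return "\n".join(lines), citations

context, citations = build_context([
    ("Document A", "The Honda Civic has 150HP"),
    ("Document B", "Costs around 500 juta"),
])
citations_json = json.dumps(citations)
```

Because the same numbers appear in the prompt and in the mapping, a marker such as [2] in the answer can be resolved to its source without parsing the answer text any further.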
No Information Response
When no relevant results are found (all scores below threshold, or no matching documents), the system returns a default response:
"Maaf, saya tidak memiliki informasi tersebut."
(Sorry, I don't have that information.)
This prevents the AI from hallucinating answers when no relevant context is available.
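The guard can be sketched as a check that runs before the LLM is called at all. Here `generate` is a hypothetical callable standing in for the LLM step, and results are assumed to be (chunk_id, score) pairs.

```python
from typing import Callable

NO_INFO_MESSAGE = "Maaf, saya tidak memiliki informasi tersebut."

def answer(
    results: list[tuple[str, float]],
    generate: Callable[[list[tuple[str, float]]], str],
    threshold: float = 0.25,
) -> str:
    """Return the default message when nothing passes the threshold."""
    relevant = [r for r in results if r[1] >= threshold]
    if not relevant:
        return NO_INFO_MESSAGE  # The LLM step is skipped in this sketch
    return generate(relevant)
```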
Streaming Responses
BABEH supports Server-Sent Events (SSE) streaming for real-time, word-by-word response delivery. The stream includes these event types:
| Event | Description |
|---|---|
| search_complete | Search phase finished, results available |
| llm_chunk | A piece of the LLM's response text |
| citations | Source citation mapping |
| metadata | Processing time, model used, token counts |
| done | Stream complete |
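For orientation, the sketch below formats these events in the standard SSE wire format (an `event:` line, a `data:` line, then a blank line). The payload fields, the event order, and the function names are assumptions. Only the event names come from the table above.

```python
import json
from typing import Iterator

def sse_event(event: str, data: dict) -> str:
    """Serialize one Server-Sent Event with a JSON payload."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def stream_answer(text_chunks: list[str]) -> Iterator[str]:
    """Yield a hypothetical event sequence for one answer."""
    yield sse_event("search_complete", {"results": 2})
    for text in text_chunks:
        yield sse_event("llm_chunk", {"text": text})
    yield sse_event("citations", {"1": {"document": "Document A"}})
    yield sse_event("metadata", {"model": "example-model"})
    yield sse_event("done", {})

events = list(stream_answer(["The Honda Civic ", "has 150HP [1]"]))
```

A client would typically append each llm_chunk payload to the visible answer as it arrives and treat done as the signal to close the connection.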