Product Search & Retrieval

Start typing below and retrieve results from the 200 sample products


Start typing to search across product names, EPD IDs, SKUs, and specifications

Score Guide

Semantic Score

Cosine similarity between the query embedding and the document embedding.

sem = cos(E(query), E(doc))
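A minimal sketch of the formula above in plain Python. The short vectors stand in for the outputs of E; their values are made up for illustration and are not real 384-dimensional embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy stand-ins for E(query) and E(doc) (hypothetical values).
query_vec = [0.2, 0.7, 0.1]
doc_vec = [0.25, 0.6, 0.05]
sem = cosine(query_vec, doc_vec)
```

Vectors pointing in nearly the same direction, like the two above, score close to 1; orthogonal vectors score 0.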

TF-IDF Score

Cosine similarity on TF-IDF vectors. Works well for exact token matches such as EPD IDs, SKUs, and manufacturer names.

tfidf = cos(TF-IDF(query), TF-IDF(doc))
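The following sketch shows one way such a score could be computed without external libraries. The whitespace tokenizer, the smoothed IDF formula, and the three sample product strings (including their SKUs) are assumptions for illustration; the guide does not specify the actual tokenization or weighting.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def build_idf(docs):
    """Smoothed inverse document frequency per token (an assumed variant)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(text, idf):
    counts = Counter(tokenize(text))
    # Tokens never seen in the corpus get no weight.
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def cosine_sparse(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical products, not taken from the sample catalogue.
docs = [
    "CEM-0012 portland cement 42.5N",
    "STL-0450 steel cable 12mm",
    "INS-0100 mineral wool insulation 80mm",
]
idf = build_idf(docs)
doc_vecs = [tfidf_vector(d, idf) for d in docs]
query_vec = tfidf_vector("STL-0450", idf)
scores = [cosine_sparse(query_vec, dv) for dv in doc_vecs]
```

Only the document containing the token "stl-0450" gets a non-zero score, which is why this signal suits identifier lookups.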

Hybrid Score

Weighted combination of semantic and TF-IDF. The weight α is auto-detected from query type:

hybrid = α · sem + (1 - α) · tfidf

α = 0.0 for EPD/SKU queries (pure keyword)
α = 0.3 for numeric specs (e.g. "12mm", "40kg")
α = 0.6 for natural language (balanced)
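As a rough sketch, the query-type detection and the weighted combination might look like the code below. The two regular expressions are hypothetical: the guide does not state the real EPD ID or SKU formats, nor which units count as numeric specs.

```python
import re

# Hypothetical patterns; the real identifier formats are not given in this guide.
ID_PATTERN = re.compile(r"\b(?:EPD-\d+|[A-Z]{3}-\d{3,})\b", re.IGNORECASE)
NUMERIC_SPEC_PATTERN = re.compile(
    r"\b\d+(?:\.\d+)?\s?(?:mm|cm|m|kg|g|mpa)\b", re.IGNORECASE
)

def detect_alpha(query):
    """Pick the semantic weight from the query type, using the values listed above."""
    if ID_PATTERN.search(query):
        return 0.0  # EPD/SKU query: pure keyword
    if NUMERIC_SPEC_PATTERN.search(query):
        return 0.3  # numeric spec such as "12mm"
    return 0.6      # natural language

def hybrid_score(sem, tfidf, alpha):
    return alpha * sem + (1.0 - alpha) * tfidf
```

For a natural-language query with sem = 0.8 and tfidf = 0.4, the result is 0.6 · 0.8 + 0.4 · 0.4 = 0.64.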

Exact Match

When the query contains a structured identifier (EPD ID or SKU) that matches a product verbatim, that product is returned immediately with α = 1.0.
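The short-circuit described above could be sketched as follows. The field names `sku` and `epd_id`, the whitespace tokenization, and the sample records are assumptions for illustration.

```python
def find_exact_match(query, products):
    """Return the first product whose SKU or EPD ID appears verbatim
    as a token in the query, else None."""
    tokens = set(query.upper().split())
    for product in products:
        ids = {product["sku"].upper(), product["epd_id"].upper()}
        if ids & tokens:
            return product
    return None

# Hypothetical records, not taken from the sample catalogue.
products = [
    {"sku": "CEM-0012", "epd_id": "EPD-10001", "name": "Portland cement"},
    {"sku": "STL-0450", "epd_id": "EPD-10002", "name": "Steel cable 12mm"},
]

hit = find_exact_match("specs for STL-0450", products)
miss = find_exact_match("steel cable", products)
```

A caller would check for a hit first and fall back to scored retrieval only when the function returns None.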

Rerank Score

Cross-encoder relevance score. Unlike bi-encoders, the cross-encoder attends jointly over (query, document), i.e., it sees both texts simultaneously, enabling fine-grained comparison (e.g. 1960 vs 1770 MPa).

rerank = CrossEncoder(query, doc)

Raw logit from ms-marco-MiniLM-L-6-v2. Higher is more relevant. The reranker is applied to the top-20 hybrid candidates and returns the top 5.
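The two-stage flow (shortlist 20 by hybrid score, rescore each pair, keep 5) can be sketched with a pluggable scoring function. The `toy_pair_score` function below is a deliberately crude stand-in that counts shared tokens; it is not the cross-encoder and only keeps the example runnable without a model download. The candidate list is also made up.

```python
def rerank(query, candidates, score_pair, first_stage_k=20, final_k=5):
    """candidates: list of (doc, hybrid_score) tuples.
    score_pair: callable(query, doc) -> float, e.g. a cross-encoder."""
    shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[:first_stage_k]
    rescored = [(doc, score_pair(query, doc)) for doc, _ in shortlist]
    return sorted(rescored, key=lambda c: c[1], reverse=True)[:final_k]

def toy_pair_score(query, doc):
    """Stand-in scorer: number of shared tokens. NOT a real relevance model."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))

# 25 hypothetical candidates with descending hybrid scores.
candidates = [
    (f"steel cable {mpa} MPa", 0.9 - 0.01 * i)
    for i, mpa in enumerate(range(1770, 2020, 10))
]
top = rerank("steel cable 1960 MPa", candidates, toy_pair_score)
```

Here the 1960 MPa product sits last in the hybrid shortlist but moves to the top after rescoring, because the pairwise scorer sees the query and document together. If the sentence-transformers library is used (an assumption; the guide does not name the library), `score_pair` could wrap `CrossEncoder.predict` on the (query, doc) pair.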

Multilingual

Select the multilingual-MiniLM model to search in 50+ languages. It maps queries like "acier" (FR) or "Stahl" (DE) into the same embedding space as English documents.

When active, α is set to 0.85, so semantic similarity dominates, because TF-IDF cannot match tokens across languages.
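A small worked sketch of why the override matters. For a cross-language query the TF-IDF score is typically 0, so the hybrid score reduces to α · sem. The scores below are hypothetical, and `choose_alpha` is an illustrative helper, not a documented function.

```python
MULTILINGUAL_ALPHA = 0.85  # value stated in this guide

def choose_alpha(detected_alpha, multilingual_active):
    """Override the auto-detected weight when the multilingual model is selected."""
    return MULTILINGUAL_ALPHA if multilingual_active else detected_alpha

def hybrid_score(sem, tfidf, alpha):
    return alpha * sem + (1.0 - alpha) * tfidf

# Hypothetical scores for a French query against an English document:
# no shared tokens, so tfidf is 0.
sem, tfidf = 0.70, 0.0
default_score = hybrid_score(sem, tfidf, choose_alpha(0.6, False))
multilingual_score = hybrid_score(sem, tfidf, choose_alpha(0.6, True))
```

With these numbers the score rises from 0.6 · 0.70 = 0.42 to 0.85 · 0.70 = 0.595, so less of the weight is spent on a keyword signal that cannot fire.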

Example queries:

"acier 12mm" → steel cables

"beton C30" → cement products

"isolierung 80mm" → insulation

"bois PEFC" → timber

Enriched Indexing

Products are indexed with enriched text: tail descriptions are stripped, and English category synonyms are injected based on the SKU prefix (e.g. CEM → cement, concrete). As a result, a query for "cement" matches all 20 cement products, not just the ~5 that literally contain the word.
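The synonym-injection step might be sketched like this. Only the CEM mapping comes from the guide; the STL entry, the field names, and the sample product are assumptions, and the stripping of tail descriptions is not covered here.

```python
# Prefix-to-synonym table. Only the CEM entry is taken from the guide.
PREFIX_SYNONYMS = {
    "CEM": ["cement", "concrete"],
    "STL": ["steel"],  # assumed entry for illustration
}

def enrich(product):
    """Build the text to index: SKU, name, then synonyms for the SKU prefix."""
    prefix = product["sku"].split("-", 1)[0].upper()
    synonyms = PREFIX_SYNONYMS.get(prefix, [])
    return " ".join([product["sku"], product["name"], *synonyms])

# Hypothetical product whose name never contains the word "cement".
product = {"sku": "CEM-0012", "name": "CEM II/A-LL 42.5N"}
text = enrich(product)
```

The indexed text now carries the token "cement", so both the TF-IDF and the embedding side can match a plain-English query against a product named only by its standard designation.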

Models

all-MiniLM-L6-v2: 22M parameters · 384-dim · default bi-encoder

multilingual-MiniLM-L12-v2: 118M parameters · 384-dim · cross-lingual (50+ languages)

lca-qwen3-embedding: 600M parameters · 1024-dim · LCA-domain fine-tuned

ms-marco-MiniLM-L-6-v2: 22M parameters · cross-encoder reranker