labelf.ai
API Documentation
Base URL: https://api.labelf.ai/v2 (REST + JSON, Bearer Auth)
POST /v2/similarity

Semantic Search

Find texts by meaning, not keywords. Search across conversations, detect near-duplicates, cluster related content, and discover patterns that keyword search misses entirely. Supports up to 200 texts per request with cross-language matching.

Three search modes

Labelf uses hybrid search that automatically selects the best strategy based on your query. You don't need to configure anything — just send your query and the engine does the right thing.

BM25

Keyword search

Triggered when your query is 3 words or fewer. Uses exact keyword matching optimized for short, specific lookups. Supports * wildcards for partial matches — e.g. *stream* matches "streaming", "livestream", "streamline".
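
To build intuition for the wildcard behavior described above, here is a small sketch using Python's standard fnmatch module. It assumes that the * wildcard behaves like a shell-style glob applied to single words; that is an assumption for illustration, not a description of how the Labelf engine is implemented. The helper name wildcard_matches is hypothetical.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match a glob-style pattern, ignoring case."""
    lowered = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), lowered)]


words = ["streaming", "livestream", "streamline", "router"]
print(wildcard_matches("*stream*", words))
# prints ['streaming', 'livestream', 'streamline']
```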

VECTOR

Semantic search

Triggered when your query is 4 or more words. Searches by meaning, not exact terms. A query like "customer can't access their account" finds conversations about "login problems", "authentication failure", and "locked out" — even when none of those words appear in the query.

HYBRID

Combined search

The API automatically blends keyword and semantic results for the best of both worlds. Exact matches rank highest, while semantically related content surfaces alongside. No configuration needed — the engine handles mode selection and result merging.
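
If you want to log or predict which strategy a query is likely to trigger, the documented word-count rule can be mirrored on the client. The sketch below is only a hint: the function name expected_mode and its return labels are hypothetical, and counting words by splitting on whitespace is an assumption about how the engine tokenizes a query. Because the API blends results on its own, nothing here needs to be sent with the request.

```python
def expected_mode(query: str) -> str:
    """Guess the strategy a query is likely to trigger, per the documented rule.

    3 words or fewer -> keyword search; 4 words or more -> semantic search.
    """
    word_count = len(query.split())  # assumption: words are whitespace-separated
    return "keyword" if word_count <= 3 else "semantic"


print(expected_mode("*stream*"))
print(expected_mode("customer can't access their account"))
```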

Cross-language search

Because semantic search operates on meaning, it works across languages. A search query in English will find relevant conversations in Swedish, German, Arabic, or any other language in your dataset. This is particularly valuable for Nordic companies with multilingual customer bases — you can build one search index and query it in any language.

Example: Searching "customer was promised a callback but never received one" finds a Swedish conversation containing "Jag blev lovad att någon skulle ringa tillbaka men ingen har hört av sig" — with a high similarity score — because the meaning is the same.

Request body

Parameter | Type | Required | Description
base_texts | object | Yes | The query texts you want to search with, keyed by an ID of your choice. Each text is matched against the candidate pool.
compare_to_texts | object | Yes | The candidate texts to search through, keyed by ID. Up to 200 texts.
top_n | integer | No | Number of most similar results to return per base text. Defaults to all candidates, ranked by similarity.
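
A request body matching these parameters could be assembled and checked on the client before sending. This is a minimal sketch: build_payload is a hypothetical helper, and rejecting empty inputs or a top_n below 1 is an assumption about sensible input, not documented API behavior. Only the 200-candidate limit comes from the table above.

```python
from typing import Optional

MAX_CANDIDATES = 200  # documented limit for compare_to_texts


def build_payload(base_texts: dict[str, str],
                  compare_to_texts: dict[str, str],
                  top_n: Optional[int] = None) -> dict:
    """Assemble a /v2/similarity request body and check it client-side."""
    if not base_texts:
        raise ValueError("base_texts is required")
    if not compare_to_texts:
        raise ValueError("compare_to_texts is required")
    if len(compare_to_texts) > MAX_CANDIDATES:
        raise ValueError(f"compare_to_texts accepts at most {MAX_CANDIDATES} texts")

    payload: dict = {"base_texts": base_texts, "compare_to_texts": compare_to_texts}
    if top_n is not None:
        if top_n < 1:  # assumption: a non-positive top_n is never useful
            raise ValueError("top_n must be a positive integer")
        payload["top_n"] = top_n
    return payload
```

Leaving top_n out of the payload keeps the documented default, which returns all candidates ranked by similarity.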

Request example

Find conversations similar to a known churn conversation. The query is in English; the candidates include Swedish, English, and German texts.

POST /v2/similarity
Authorization: Bearer your-api-key
Content-Type: application/json

{
  "base_texts": {
    "text_1": "I've been a customer for 8 years but I'm seriously considering switching to another provider. The price keeps going up and support takes forever."
  },
  "compare_to_texts": {
    "conv_4821": "Jag har varit kund i tio år men nu får det vara nog. Priset har höjts tre gånger på två år.",
    "conv_4822": "Can you help me upgrade my plan to the 100 Mbit package?",
    "conv_4823": "I want to cancel my subscription. I've found a better deal elsewhere and your retention offer wasn't enough.",
    "conv_4824": "The internet has been dropping every evening for two weeks.",
    "conv_4825": "Ich bin seit fünf Jahren Kunde und die Preise steigen ständig. Ich denke über einen Wechsel nach."
  },
  "top_n": 3
}
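
The same request could be sent from Python using only the standard library. The sketch below assumes the endpoint URL is the base URL from this page joined with /v2/similarity, and that the API key is passed as a Bearer token as shown above. The helper names build_request and send, and the 30-second timeout, are hypothetical choices for illustration.

```python
import json
import urllib.request

# Assumption: base URL from this page plus the documented path.
API_URL = "https://api.labelf.ai/v2/similarity"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare a POST request with a JSON body and Bearer authentication."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request) -> dict:
    """Send the prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling send(build_request("your-api-key", payload)) would perform the network call; keeping request construction separate makes it easy to inspect headers and body without touching the network.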

Response

Returns the top N most similar candidates per base text, ranked by similarity score (0 to 1). Notice that the Swedish and German churn conversations rank directly behind the English cancellation request, even though they are in a different language from the query.

{
  "similarities": {
    "text_1": [
      {
        "id": "conv_4823",
        "score": 0.91,
        "text": "I want to cancel my subscription. I've found a better deal..."
      },
      {
        "id": "conv_4821",
        "score": 0.87,
        "text": "Jag har varit kund i tio år men nu får det vara nog..."
      },
      {
        "id": "conv_4825",
        "score": 0.84,
        "text": "Ich bin seit fünf Jahren Kunde und die Preise steigen..."
      }
    ]
  }
}
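
A response of this shape can be filtered on the client, for example to keep only strong matches. The sketch below assumes the response has already been decoded into a Python dict with the structure shown above. The helper name matches_above and the 0.85 cutoff are hypothetical, and the sample omits the text field for brevity.

```python
def matches_above(response: dict, base_id: str, min_score: float) -> list[tuple[str, float]]:
    """Return (candidate id, score) pairs for one base text at or above min_score."""
    hits = response.get("similarities", {}).get(base_id, [])
    return [(hit["id"], hit["score"]) for hit in hits if hit["score"] >= min_score]


sample = {
    "similarities": {
        "text_1": [
            {"id": "conv_4823", "score": 0.91},
            {"id": "conv_4821", "score": 0.87},
            {"id": "conv_4825", "score": 0.84},
        ]
    }
}
print(matches_above(sample, "text_1", 0.85))
# prints [('conv_4823', 0.91), ('conv_4821', 0.87)]
```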

Use cases

Find churn patterns

Take a known churn conversation and search your entire dataset for similar ones. Uncover how widespread a specific complaint is — across languages, channels, and time periods — before it shows up in your churn metrics.
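
Since a request accepts up to 200 candidate texts, searching a larger dataset means splitting it into batches and merging the results. The sketch below shows one way to do that. The helper names chunk_candidates and merge_hits are hypothetical, and it assumes that scores returned by separate requests for the same base text are directly comparable, which this page does not state.

```python
from itertools import islice
from typing import Iterator


def chunk_candidates(candidates: dict[str, str], size: int = 200) -> Iterator[dict[str, str]]:
    """Yield successive batches of at most `size` candidate texts."""
    items = iter(candidates.items())
    while True:
        batch = dict(islice(items, size))
        if not batch:
            return
        yield batch


def merge_hits(per_chunk_hits: list[list[dict]], top_n: int) -> list[dict]:
    """Combine the hit lists from several requests and keep the best top_n.

    Assumption: scores from different requests can be ranked together.
    """
    all_hits = [hit for hits in per_chunk_hits for hit in hits]
    return sorted(all_hits, key=lambda hit: hit["score"], reverse=True)[:top_n]
```

Each batch would become the compare_to_texts of one request, with the same base_texts repeated across requests.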

Detect duplicate tickets

A Swedish complaint and an English complaint about the same issue will match by meaning. Use similarity search to de-duplicate across languages and surface recurring issues that span multiple markets.
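
One way to turn similarity scores into duplicate groups is to send the same tickets as both base_texts and compare_to_texts, then link any pair whose score clears a threshold. The sketch below does the grouping with a small union-find. The function name group_duplicates and the 0.9 threshold are hypothetical; a suitable threshold would need to be tuned on your own data.

```python
def group_duplicates(similarities: dict[str, list[dict]], threshold: float = 0.9) -> list[set[str]]:
    """Group ticket IDs linked, directly or transitively, by scores >= threshold."""
    parent: dict[str, str] = {}

    def find(ticket_id: str) -> str:
        # Register unseen IDs as their own root, then walk up with path halving.
        parent.setdefault(ticket_id, ticket_id)
        while parent[ticket_id] != ticket_id:
            parent[ticket_id] = parent[parent[ticket_id]]
            ticket_id = parent[ticket_id]
        return ticket_id

    for base_id, hits in similarities.items():
        find(base_id)
        for hit in hits:
            # Skip the trivial match of a ticket against itself.
            if hit["id"] != base_id and hit["score"] >= threshold:
                parent[find(hit["id"])] = find(base_id)

    groups: dict[str, set[str]] = {}
    for ticket_id in list(parent):
        groups.setdefault(find(ticket_id), set()).add(ticket_id)
    # Only groups with more than one ticket are duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

The input is the "similarities" object from a response. Transitive linking means that if A matches B and B matches C, all three land in one group even when A and C score below the threshold.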

Search for specific issue patterns

Describe the pattern in plain language: "customer was promised a callback but never received one". The API finds every conversation matching that pattern, regardless of how the customer or agent phrased it.

Build training sets for new models

Need examples of "broken promise" complaints to train a new classifier? Search semantically for the pattern, review the top matches, and you have a curated training set in minutes instead of days.

Address:
Gamla Brogatan 26, Stockholm, Sweden
© 2026 Labelf. All rights reserved.