API Documentation
https://api.labelf.ai/v2 · REST + JSON · Bearer Auth

Datasets

Datasets are where your interaction data lives — tickets, calls, chats, CSAT responses, CRM metadata. They support the full data lifecycle: ingestion, enrichment, search, and discovery.

Data Structure

Each record in a dataset represents a unit of customer interaction — an individual utterance, a full conversation, or an entire errand. The text column holds the content your models classify. Everything else is metadata: who handled it, when, which queue, what product, and any custom fields your team tracks.

Property            Description
id                  Unique dataset identifier (UUID)
name                Human-readable dataset name
text_column         The primary text field used for classification
metadata_columns    Structured fields: dates, agent IDs, scores, custom fields
row_count           Total number of records in the dataset
created_at          ISO 8601 timestamp of dataset creation
last_ingested_at    When the most recent record was added
integration_id      Connected data source (null for manual uploads)

Record Example

A single dataset record with text and metadata fields. When models classify this record, the predictions are stored alongside the original data — making everything queryable in dashboards and exports.

{
  "text": "I've been waiting 3 weeks for a replacement router and no one has called me back",
  "agent_id": "agent_4821",
  "customer_id": "cust_90312",
  "queue": "technical_support",
  "channel": "phone",
  "product": "fiber_100",
  "csat_score": 2,
  "handle_time_sec": 487,
  "created_at": "2025-11-14T09:22:00Z"
}
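
The split between the text column and metadata can be sketched in a few lines of Python. This is only an illustration of the record shape shown above, not part of any Labelf client library; the helper name split_record is hypothetical, and the text column is assumed to be named "text" as in the example record.

```python
from typing import Any


def split_record(record: dict[str, Any], text_column: str = "text") -> tuple[str, dict[str, Any]]:
    """Return (text, metadata): metadata is every field except the text column."""
    if text_column not in record:
        raise KeyError(f"record has no {text_column!r} field")
    metadata = {key: value for key, value in record.items() if key != text_column}
    return record[text_column], metadata


# A shortened copy of the example record above.
record = {
    "text": "I've been waiting 3 weeks for a replacement router and no one has called me back",
    "agent_id": "agent_4821",
    "queue": "technical_support",
    "channel": "phone",
    "csat_score": 2,
    "created_at": "2025-11-14T09:22:00Z",
}

text, metadata = split_record(record)
print(text)
print(sorted(metadata))
```

Only the text value is what a model would classify; the remaining keys travel with the record as metadata.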

Data Sources

Labelf ingests data from every channel where customers interact with your team. Tickets, calls, chats, emails, CSAT/NPS surveys, and CRM metadata — all flow into datasets with full metadata preserved.

Zendesk

Tickets & chats

Pull tickets from Support and chat transcripts from Zendesk Chat. Tags, custom fields, and satisfaction ratings included.

Freshdesk

Tickets

Import tickets with full conversation threads, SLA data, and custom fields.

ServiceNow

Incidents & requests

Ingest incidents, service requests, and knowledge articles with assignment group metadata.

Intercom

Chats & conversations

Stream live chat conversations, bot handoff events, and user attributes in real time.

Genesys

Call transcripts

Receive call transcripts and interaction metadata — AHT, disposition codes, and queue data.

Salesforce

Cases & interactions

Sync Service Cloud cases, email-to-case threads, and customer contact history.

Call transcripts can be imported from your existing transcription provider or transcribed directly through Labelf's built-in speech-to-text pipeline.

Ingestion Methods

Four ways to get data in, from real-time streaming to bulk historical imports.

Real-time API push

POST individual records or small batches as they happen. Ideal for live classification pipelines where every interaction is scored the moment it ends.
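
As a rough sketch of what a push could look like, the snippet below builds (but does not send) an authenticated POST request with Python's standard library. The base URL and Bearer authentication come from the header of this page; the endpoint path /datasets/{id}/records and the {"records": [...]} body shape are assumptions made for illustration, since the Datasets endpoints are only documented on request.

```python
import json
import urllib.request

BASE_URL = "https://api.labelf.ai/v2"  # base URL as stated at the top of this page


def build_push_request(dataset_id: str, records: list[dict], token: str) -> urllib.request.Request:
    """Build a POST request carrying a small batch of records."""
    # ASSUMPTION: this path and body shape are hypothetical, not confirmed by the docs.
    url = f"{BASE_URL}/datasets/{dataset_id}/records"
    body = json.dumps({"records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


batch = [{"text": "No one has called me back about my router", "channel": "phone"}]
request = build_push_request("YOUR_DATASET_ID", batch, token="YOUR_API_TOKEN")
print(request.get_method(), request.full_url)
# Sending is deliberately left out; urllib.request.urlopen(request) would perform the call.
```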

Scheduled batch refills

Configure hourly, daily, or weekly syncs from connected integrations. Labelf pulls new records automatically — no cron jobs required.

File upload

Upload CSV or JSON files through the dashboard or API. Supports files up to 500 MB with automatic schema detection and column mapping.
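
Before uploading a large CSV it can help to check the file locally. The sketch below is a client-side pre-flight check only and says nothing about how Labelf detects schemas; the function name preflight_csv is hypothetical, and "500 MB" is assumed to mean 500 * 1024 * 1024 bytes because the page does not define the unit precisely.

```python
import csv
import os
import tempfile

# ASSUMPTION: "500 MB" is interpreted as 500 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 500 * 1024 * 1024


def preflight_csv(path: str, text_column: str) -> list[str]:
    """Check file size and header locally; return the CSV column names."""
    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        raise ValueError(f"{path} is {size} bytes, above the {MAX_UPLOAD_BYTES}-byte limit")
    with open(path, newline="", encoding="utf-8") as handle:
        columns = next(csv.reader(handle), [])
    if text_column not in columns:
        raise ValueError(f"text column {text_column!r} not found in header {columns}")
    return columns


# Write a tiny sample file, then run the check against it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tickets.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text", "queue", "csat_score"])
        writer.writerow(["Router never arrived", "technical_support", 2])
    columns = preflight_csv(path, text_column="text")

print(columns)
```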

Direct integrations

One-click OAuth connections to Zendesk, Freshdesk, ServiceNow, Intercom, Genesys, and Salesforce. Data flows in with full metadata preserved.

Search Within Datasets

Every dataset is fully searchable. Combine semantic search (find conversations similar to a query, regardless of exact wording), keyword search (exact term matching), and metadata filters — date ranges, agent, team, product, channel, CSAT score — to zero in on exactly the interactions you need.

This is how quality teams audit thousands of calls efficiently: search for "mentioned competitor" across all calls this quarter, filtered to the retention queue, sorted by churn risk score.
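
To make the combination of criteria concrete, here is a small in-memory sketch of keyword matching plus metadata filters plus sorting, run over made-up sample records. It does not call the Labelf search API and leaves out semantic search, which would need an embedding model; the churn_risk field name and the sample data are assumptions for illustration.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def search(records, keyword, queue, start, end, sort_key):
    """Keyword match + queue filter + date range, sorted by sort_key (highest first)."""
    keyword = keyword.lower()
    hits = [
        r for r in records
        if keyword in r["text"].lower()
        and r["queue"] == queue
        and start <= parse_ts(r["created_at"]) < end
    ]
    return sorted(hits, key=lambda r: r[sort_key], reverse=True)


# Made-up sample records; "churn_risk" is a hypothetical model score field.
records = [
    {"text": "Competitor X offered me a cheaper plan", "queue": "retention",
     "created_at": "2025-11-02T10:00:00Z", "churn_risk": 0.91},
    {"text": "A competitor has better coverage here", "queue": "retention",
     "created_at": "2025-10-20T08:30:00Z", "churn_risk": 0.64},
    {"text": "Competitor pricing looks good", "queue": "billing",
     "created_at": "2025-11-05T12:00:00Z", "churn_risk": 0.80},
    {"text": "My invoice is wrong", "queue": "retention",
     "created_at": "2025-11-07T09:00:00Z", "churn_risk": 0.30},
    {"text": "Thinking of moving to a competitor", "queue": "retention",
     "created_at": "2025-07-15T09:00:00Z", "churn_risk": 0.95},
]

quarter_start = datetime(2025, 10, 1, tzinfo=timezone.utc)
quarter_end = datetime(2026, 1, 1, tzinfo=timezone.utc)
hits = search(records, "competitor", "retention", quarter_start, quarter_end, "churn_risk")
for hit in hits:
    print(hit["churn_risk"], hit["text"])
```

Each condition narrows the candidate set independently, so the order in which filters are applied does not change the result, only the sort does.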

Auto-Clustering & Discovery

Not sure what categories exist in your data? Labelf's auto-clustering analyzes your dataset and discovers groupings automatically, with no predefined labels required. It can surface patterns that are easy to miss in manual review: a telecom might discover that 12% of "billing" calls are actually about a confusing invoice redesign, or that a specific product SKU drives 3x the complaint volume.

Clusters become the starting point for your classification models. Instead of guessing what labels you need, you let the data tell you.
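
The idea of grouping records without predefined labels can be illustrated with a deliberately simple toy. The sketch below is not Labelf's clustering algorithm, which this page does not describe; it swaps in greedy grouping by word overlap (Jaccard similarity), with a stopword list and threshold chosen purely for the example.

```python
import re

STOPWORDS = {"the", "is", "i", "my", "a", "to", "and", "of"}


def tokens(text: str) -> set[str]:
    """Lowercase word set with a few stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_clusters(texts: list[str], threshold: float = 0.25) -> list[list[int]]:
    """Assign each text to the most similar existing cluster, or start a new one."""
    clusters = []  # each cluster: {"vocab": set of words, "members": list of indices}
    for index, text in enumerate(texts):
        words = tokens(text)
        best, best_score = None, 0.0
        for cluster in clusters:
            score = jaccard(words, cluster["vocab"])
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= threshold:
            best["members"].append(index)
            best["vocab"] |= words
        else:
            clusters.append({"vocab": set(words), "members": [index]})
    return [cluster["members"] for cluster in clusters]


texts = [
    "The new invoice layout is confusing",
    "I cannot read the new invoice layout",
    "My router keeps dropping the connection",
    "Router connection drops every evening",
]
groups = greedy_clusters(texts)
print(groups)
```

With these four texts the invoice complaints and the router complaints end up in separate groups, without any label having been defined up front.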

Data Lifecycle

The typical flow:

  1. Create a dataset with a schema — define the text column and any metadata fields
  2. Connect a data source or upload historical data (CSV, JSON, API push)
  3. Discover — run auto-clustering to understand what topics exist in your data
  4. Apply models for classification — every record gets scored automatically
  5. Query results via the API, dashboards, or exports to your data warehouse
  6. Refill — scheduled syncs keep the dataset current without manual uploads
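
Step 1 of the flow above, defining a schema, can be pictured as follows. This is a local sketch only: the DatasetSchema class and its validate method are hypothetical and do not mirror the Labelf schema API, which is available on request. The field names reuse text_column and metadata_columns from the property table.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DatasetSchema:
    """Hypothetical local stand-in for a dataset schema."""
    text_column: str
    metadata_columns: dict[str, type] = field(default_factory=dict)

    def validate(self, record: dict[str, Any]) -> list[str]:
        """Return a list of problems; an empty list means the record fits the schema."""
        problems = []
        text = record.get(self.text_column)
        if not isinstance(text, str) or not text.strip():
            problems.append(f"missing or empty text column {self.text_column!r}")
        for name, expected in self.metadata_columns.items():
            if name in record and not isinstance(record[name], expected):
                problems.append(f"{name!r} should be {expected.__name__}")
        return problems


schema = DatasetSchema(
    text_column="text",
    metadata_columns={"queue": str, "csat_score": int, "handle_time_sec": int},
)

good = {"text": "Router never arrived", "queue": "technical_support", "csat_score": 2}
bad = {"text": "", "queue": "technical_support", "csat_score": "2"}

print(schema.validate(good))
print(schema.validate(bad))
```

Checking records against the schema before ingestion keeps later steps (clustering, classification, querying) working on consistently typed metadata.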

Available on Request

The full Datasets API is available to enterprise customers. This includes programmatic dataset creation, schema management, bulk ingestion endpoints, search APIs, and clustering configuration. Contact us to enable these endpoints for your workspace.

Create & configure datasets
Upload (CSV, JSON, API)
Scheduled refills from integrations
Semantic & keyword search
Metadata filtering & facets
Auto-clustering & discovery
Conversation threading
Transcription management
Custom field definitions
Data retention policies

© 2026 Labelf. All rights reserved.