Datasets
Datasets are where your interaction data lives — tickets, calls, chats, CSAT responses, CRM metadata. They support the full data lifecycle: ingestion, enrichment, search, and discovery.
Data Structure
Each record in a dataset represents a unit of customer interaction — an individual utterance, a full conversation, or an entire errand. The text column holds the content your models classify. Everything else is metadata: who handled it, when, which queue, what product, and any custom fields your team tracks.
| Property | Description |
|---|---|
| id | Unique dataset identifier (UUID) |
| name | Human-readable dataset name |
| text_column | The primary text field used for classification |
| metadata_columns | Structured fields — dates, agent IDs, scores, custom fields |
| row_count | Total number of records in the dataset |
| created_at | ISO 8601 timestamp of dataset creation |
| last_ingested_at | When the most recent record was added |
| integration_id | Connected data source (null for manual uploads) |
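The property table above can be mirrored in client code as a small typed container. This is only a sketch: the `Dataset` class and the `parse_timestamp` helper are hypothetical names, not part of any official SDK, and representing `metadata_columns` as a list of column names is an assumption, since the table does not specify its shape.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting the trailing 'Z' for UTC."""
    # Older Python versions do not parse "Z" in fromisoformat, so normalise it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Dataset:
    """Illustrative mirror of the dataset properties; not an official type."""
    id: str                                   # UUID, kept as a string here
    name: str                                 # human-readable name
    text_column: str                          # field used for classification
    metadata_columns: List[str] = field(default_factory=list)  # assumed: column names
    row_count: int = 0
    created_at: Optional[datetime] = None
    last_ingested_at: Optional[datetime] = None
    integration_id: Optional[str] = None      # None for manual uploads

    @property
    def is_manual_upload(self) -> bool:
        # The table states integration_id is null for manual uploads.
        return self.integration_id is None


dataset = Dataset(
    id="0b9f6c1e-5d2a-4c7b-9a51-3e8f2d6c4a10",  # made-up UUID for illustration
    name="support-calls",
    text_column="text",
    metadata_columns=["agent_id", "queue", "csat_score"],
    created_at=parse_timestamp("2025-11-14T09:22:00Z"),
)
```

Keeping timestamps as timezone-aware `datetime` values makes comparisons between `created_at` and `last_ingested_at` straightforward.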
Record Example
A single dataset record with text and metadata fields. When models classify this record, the predictions are stored alongside the original data — making everything queryable in dashboards and exports.
{
  "text": "I've been waiting 3 weeks for a replacement router and no one has called me back",
  "agent_id": "agent_4821",
  "customer_id": "cust_90312",
  "queue": "technical_support",
  "channel": "phone",
  "product": "fiber_100",
  "csat_score": 2,
  "handle_time_sec": 487,
  "created_at": "2025-11-14T09:22:00Z"
}
Data Sources
Labelf ingests data from every channel where customers interact with your team. Tickets, calls, chats, emails, CSAT/NPS surveys, and CRM metadata — all flow into datasets with full metadata preserved.
| Source | Data | Description |
|---|---|---|
| Zendesk | Tickets & chats | Pull tickets from Support and chat transcripts from Zendesk Chat. Tags, custom fields, and satisfaction ratings included. |
| Freshdesk | Tickets | Import tickets with full conversation threads, SLA data, and custom fields. |
| ServiceNow | Incidents & requests | Ingest incidents, service requests, and knowledge articles with assignment group metadata. |
| Intercom | Chats & conversations | Stream live chat conversations, bot handoff events, and user attributes in real time. |
| Genesys | Call transcripts | Receive call transcripts and interaction metadata: AHT, disposition codes, and queue data. |
| Salesforce | Cases & interactions | Sync Service Cloud cases, email-to-case threads, and customer contact history. |
Call transcripts can be imported from your existing transcription provider or transcribed directly through Labelf's built-in speech-to-text pipeline.
Ingestion Methods
Four ways to get data in, from real-time streaming to bulk historical imports.
Real-time API push
POST individual records or small batches as they happen. Ideal for live classification pipelines where every interaction is scored the moment it ends.
Scheduled batch refills
Configure hourly, daily, or weekly syncs from connected integrations. Labelf pulls new records automatically — no cron jobs required.
File upload
Upload CSV or JSON files through the dashboard or API. Supports files up to 500 MB with automatic schema detection and column mapping.
Direct integrations
One-click OAuth connections to Zendesk, Freshdesk, ServiceNow, Intercom, Genesys, and Salesforce. Data flows in with full metadata preserved.
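As a rough illustration of the real-time API push method described above, the sketch below builds a POST request for a small batch of records without sending it. Everything specific here is an assumption: the base URL, the `/datasets/{id}/records` path, the `{"records": [...]}` body shape, the bearer-token header, and the batch size of 100 are placeholders, since the actual endpoints are enabled per workspace and are not documented on this page.

```python
import json
import urllib.request

# Placeholder values: the real host, path, and limits must come from Labelf.
BASE_URL = "https://api.example.com"
MAX_BATCH = 100  # assumed batch size, not a documented limit


def chunk(records, size=MAX_BATCH):
    """Yield successive small batches from a list of records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_push_request(dataset_id, records, token):
    """Build (but do not send) a POST request carrying one batch of records."""
    body = json.dumps({"records": records}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/datasets/{dataset_id}/records",  # hypothetical path
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


record = {
    "text": "I've been waiting 3 weeks for a replacement router and no one has called me back",
    "queue": "technical_support",
    "channel": "phone",
    "created_at": "2025-11-14T09:22:00Z",
}
request = build_push_request("my-dataset-id", [record], token="YOUR_TOKEN")
# urllib.request.urlopen(request) would send it; omitted because the URL is a placeholder.
```

Batching with `chunk` keeps each request small, which suits the "individual records or small batches" pattern; larger historical loads are better served by file upload or a scheduled sync.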
Search Within Datasets
Every dataset is fully searchable. Combine semantic search (find conversations similar to a query, regardless of exact wording), keyword search (exact term matching), and metadata filters — date ranges, agent, team, product, channel, CSAT score — to zero in on exactly the interactions you need.
This is how quality teams audit thousands of calls efficiently: search for "mentioned competitor" across all calls this quarter, filtered to the retention queue, sorted by churn risk score.
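To make the combination of keyword matching, metadata filters, and sorting concrete, here is a minimal in-memory sketch. It is not how Labelf implements search, and it leaves out semantic search entirely; the `search` function and the `churn_risk` field are hypothetical names chosen to match the example in the text.

```python
from datetime import datetime, timezone


def search(records, keyword=None, filters=None, since=None, sort_by=None):
    """Toy keyword + metadata filter over a list of dict records."""
    filters = filters or {}
    hits = []
    for rec in records:
        # Keyword search: case-insensitive exact phrase match on the text.
        if keyword and keyword.lower() not in rec["text"].lower():
            continue
        # Metadata filters: every given field must match exactly.
        if any(rec.get(key) != value for key, value in filters.items()):
            continue
        # Date range: keep records created at or after `since`.
        created = datetime.fromisoformat(rec["created_at"].replace("Z", "+00:00"))
        if since and created < since:
            continue
        hits.append(rec)
    if sort_by:
        # Highest score first; records missing the field sort as 0.
        hits.sort(key=lambda r: r.get(sort_by, 0), reverse=True)
    return hits


records = [
    {"text": "They mentioned competitor pricing twice", "queue": "retention",
     "churn_risk": 0.4, "created_at": "2025-10-02T10:00:00Z"},
    {"text": "Customer mentioned competitor offer and asked to cancel", "queue": "retention",
     "churn_risk": 0.9, "created_at": "2025-11-14T09:22:00Z"},
    {"text": "Mentioned competitor during a billing call", "queue": "billing",
     "churn_risk": 0.7, "created_at": "2025-11-01T08:00:00Z"},
    {"text": "Mentioned competitor last year", "queue": "retention",
     "churn_risk": 0.95, "created_at": "2024-12-01T08:00:00Z"},
]

results = search(
    records,
    keyword="mentioned competitor",
    filters={"queue": "retention"},
    since=datetime(2025, 10, 1, tzinfo=timezone.utc),  # start of the quarter
    sort_by="churn_risk",
)
```

Here `results` holds the two retention-queue records from the current quarter, ordered from highest to lowest churn risk.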
Auto-Clustering & Discovery
Not sure what categories exist in your data? Labelf's auto-clustering analyzes your dataset and automatically discovers groupings — no predefined labels required. It finds patterns humans miss: a telecom might discover that 12% of "billing" calls are actually about a confusing invoice redesign, or that a specific product SKU drives 3x the complaint volume.
Clusters become the starting point for your classification models. Instead of guessing what labels you need, you let the data tell you.
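The idea of discovering groupings without predefined labels can be shown with a deliberately simple stand-in: a greedy, single-pass clustering on word overlap (Jaccard similarity). This is an assumption-laden toy, not Labelf's clustering method, which is not described here; the function names, stop-word list, and threshold are all illustrative.

```python
import re

STOP_WORDS = {"the", "a", "my", "is", "i", "to", "on", "and", "of", "was"}


def tokens(text):
    """Lowercased set of word tokens, minus a few very common words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def jaccard(a, b):
    """Share of tokens two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(texts, threshold=0.3):
    """Greedy clustering: join the most similar cluster seed or start a new cluster."""
    clusters = []  # each entry: (seed_tokens, member_indices)
    for index, text in enumerate(texts):
        toks = tokens(text)
        best, best_sim = None, 0.0
        for seed, members in clusters:
            sim = jaccard(toks, seed)
            if sim > best_sim:
                best, best_sim = members, sim
        if best is not None and best_sim >= threshold:
            best.append(index)
        else:
            clusters.append((toks, [index]))
    return [members for _, members in clusters]


texts = [
    "I cannot read the new invoice layout",
    "The new invoice layout is confusing",
    "My router keeps dropping the connection",
    "Router keeps dropping connection every evening",
    "New invoice layout hides the total",
]
groups = cluster(texts)
# Share of records per cluster, the kind of figure behind "12% of billing calls".
shares = [len(group) / len(texts) for group in groups]
```

On this sample the invoice-layout complaints and the router complaints land in separate groups, and `shares` gives each group's fraction of the dataset. A production system would typically use embeddings rather than raw word overlap.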
Data Lifecycle
The typical flow:
- Create a dataset with a schema — define the text column and any metadata fields
- Connect a data source or upload historical data (CSV, JSON, API push)
- Discover — run auto-clustering to understand what topics exist in your data
- Apply models for classification — every record gets scored automatically
- Query results via the API, dashboards, or exports to your data warehouse
- Refill — scheduled syncs keep the dataset current without manual uploads
Available on Request
The full Datasets API is available to enterprise customers. This includes programmatic dataset creation, schema management, bulk ingestion endpoints, search APIs, and clustering configuration. Contact us to enable these endpoints for your workspace.