Argilla

Free tier

The open-source collaboration tool for AI engineers and domain experts to build high-quality datasets

Free·All audiences·Powered by Hugging Face·API available·Open source

Key strengths

  • Open-source and self-hostable with full data control
  • Intuitive API for seamless integration into existing ML pipelines
  • Supports RLHF, fine-tuning, and active learning workflows
  • Combines AI automation with human-in-the-loop feedback
  • Strong community support and ecosystem via Hugging Face

Completely free
Madrid, Spain
Founded 2021
Self-hostable

Developer Documentation

Argilla provides a full-featured Python SDK and REST API for integrating dataset curation into your MLOps pipeline.

Installation

pip install argilla

Quickstart Example

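import argilla as rg
import pandas as pd

# Note: rg.init and rg.log belong to the legacy Argilla 1.x SDK (pip install "argilla<2")
# Connect to your Argilla server
rg.init(api_url="http://localhost:6900", api_key="YOUR_API_KEY")

# Example input (placeholder data): one row per record, with a "text" column
df = pd.DataFrame({"text": ["Great product!", "Arrived broken."]})

# Create and log a dataset
dataset = rg.DatasetForTextClassification.from_pandas(df)
rg.log(dataset, name="my-classification-dataset")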

Key Capabilities for Developers

  • Distilabel integration — Use the companion distilabel library for scalable AI-assisted data generation and synthetic labeling pipelines.
  • Hugging Face Hub — Push and pull datasets directly to/from the Hugging Face Hub using native integrations.
  • Active learning support — Plug in your models to prioritize uncertain samples for annotation, reducing labeling costs.
  • REST API — Full REST API available for custom integrations and programmatic dataset management.
  • Self-hosting — Deploy via Docker with full control over your data and infrastructure.
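
To make the active learning point concrete, here is a minimal sketch of uncertainty sampling: records whose predicted class probabilities have the highest entropy are sent to annotators first. It uses only the Python standard library and does not call Argilla itself. The function names (entropy, rank_by_uncertainty), the record ids, and the probabilities are hypothetical and chosen for illustration only.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rank_by_uncertainty(predictions):
    """Return record ids ordered from most to least uncertain.

    `predictions` maps a record id to the model's class probabilities.
    """
    return sorted(
        predictions,
        key=lambda record_id: entropy(predictions[record_id]),
        reverse=True,
    )


# Hypothetical model outputs for three unlabeled records (assumed data)
predictions = {
    "rec-1": [0.98, 0.02],
    "rec-2": [0.55, 0.45],
    "rec-3": [0.80, 0.20],
}

queue = rank_by_uncertainty(predictions)
print(queue)  # ['rec-2', 'rec-3', 'rec-1']
```

In a real pipeline the probabilities would come from your own model, and the ordered ids would decide which records annotators see first. Entropy is only one possible uncertainty score; margin or least-confidence scoring could be swapped in without changing the structure.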

Refer to the Argilla Docs and Distilabel Docs for full API references and advanced guides.