Modal

High-performance AI infrastructure with sub-second cold starts and instant autoscaling

Free tier available · Technical · API available

Key strengths

  • Sub-second cold starts with instant container boot times
  • Autoscale from 0 to 1000+ GPUs on demand with no capacity planning
  • Python-native SDK: define infrastructure and logic in a single file
  • Full support for inference, training, sandboxes, and batch processing
  • SOC2 & HIPAA compliant with battle-tested isolation and data residency controls
Free tier + paid plans · from $30 USD/mo
San Francisco, USA
Founded 2021

Developer Documentation

Installation & Authentication

pip install modal
modal token new  # authenticates via browser

Defining a GPU Function

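import modal

app = modal.App("my-inference-app")
image = modal.Image.debian_slim().pip_install("torch", "transformers")

@app.function(gpu="H100", image=image)
def run_inference(prompt: str) -> str:
    # Placeholder: replace this line with your model logic.
    result = f"received prompt: {prompt}"
    return result

@app.local_entrypoint()
def main():
    # `modal run <file>.py` starts here and calls the function remotely.
    print(run_inference.remote("Hello, Modal"))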

Key Primitives

  • @app.function() — decorates any Python function to run remotely; accepts gpu, image, timeout, concurrency_limit, and more.
  • modal.Image — defines the container environment (base OS, pip packages, system dependencies, custom Dockerfiles).
  • modal.Sandbox — programmatically spins up ephemeral, isolated execution environments for running untrusted or agent-generated code.
  • modal.web_endpoint() — exposes a function as an HTTPS endpoint with built-in support for streaming, WebSocket, and WebRTC.
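As a rough illustration of the modal.Sandbox primitive listed above, the sketch below runs a short Python snippet in an ephemeral sandbox and returns what it printed. The helper names python_argv and run_in_sandbox and the app name "sandbox-demo" are placeholders invented for this example. It assumes the modal package is installed, a token is configured, and that modal.App.lookup and modal.Sandbox.create behave as in recent SDK releases, so check the signatures against your installed version.

```python
def python_argv(code: str) -> list[str]:
    """Build the command line that runs `code` with the container's Python."""
    return ["python", "-c", code]


def run_in_sandbox(code: str, app_name: str = "sandbox-demo") -> str:
    """Run `code` in an ephemeral Modal sandbox and return its stdout."""
    # Imported lazily so python_argv stays usable without the SDK installed.
    import modal

    # Sandboxes are attached to an app; look one up, creating it if needed.
    # "sandbox-demo" is a hypothetical app name.
    app = modal.App.lookup(app_name, create_if_missing=True)
    sandbox = modal.Sandbox.create(*python_argv(code), app=app, timeout=60)
    try:
        sandbox.wait()                # block until the process exits
        return sandbox.stdout.read()  # captured standard output
    finally:
        sandbox.terminate()           # release the container either way
```

Calling run_in_sandbox("print(2 + 2)") should return the snippet's standard output as a string. The call needs network access and Modal credentials, and the timeout bounds how long untrusted code can run.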

Scaling & Deployment

  • Autoscaling is handled automatically; for latency-sensitive workloads, set the allow_concurrent_inputs and keep_warm parameters on the function.
  • Multi-node training uses modal.Cluster with gang scheduling and InfiniBand networking configured in a single line.
  • Secrets and environment variables are managed via modal.Secret, injectable at function level.
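Values from a modal.Secret reach the function as environment variables, so the function body reads them through os.environ. The sketch below covers only that in-function side. The variable name API_TOKEN and the helper read_secret are hypothetical, and attaching the secret is assumed to happen on the decorator, for example with secrets=[modal.Secret.from_name("my-secret")], where "my-secret" is a placeholder.

```python
import os


def read_secret(name: str = "API_TOKEN") -> str:
    """Return an injected secret, failing loudly if it was not attached."""
    value = os.environ.get(name)
    if not value:
        # An unset or empty variable usually means the secret is missing
        # from the function's decorator.
        raise RuntimeError(
            f"{name} is not set; attach the secret to the function first"
        )
    return value
```

Failing early with a clear message tends to be easier to debug than an authentication error that surfaces later inside a model or API client.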