AI Cloud Compute Tool

What can I build with Modal?
Modal enables AI and data workloads at scale. Use cases include generative AI inference, fine-tuning and training, large-scale batch processing, sandboxed code execution, building OpenAI-compatible LLM services, and Notebooks. You can deploy custom models or popular frameworks, run workloads in flexible environments, autoscale to meet demand, and serve functions as web endpoints or APIs.
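As a rough sketch of the programming model (the app name, image contents, and function body here are hypothetical, not Modal's canonical example), a GPU-backed function defined with the `modal` Python SDK looks like this:

```python
import modal

app = modal.App("example-inference")

# Hypothetical image: install whatever your model needs.
image = modal.Image.debian_slim().pip_install("transformers", "torch")

@app.function(image=image, gpu="H100")
def generate(prompt: str) -> str:
    # Placeholder for real model inference.
    return f"completion for: {prompt}"

@app.local_entrypoint()
def main():
    # Executes remotely on Modal's infrastructure via `modal run`.
    print(generate.remote("hello"))
```

Running `modal run app.py` executes the function in the cloud; `modal deploy app.py` keeps it deployed.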
How does Modal’s pricing work?
Modal uses a pay-as-you-go pricing model billed by actual compute used: you pay per second of GPU or CPU usage and per GiB of memory. Every month you receive $30 in free compute credits, and startups can apply for additional credits. A range of Nvidia GPUs is billed at per-second rates, and each container has a minimum core allocation; see the pricing details for GPU, per-core CPU, and per-GiB memory rates.
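Per-second billing can be estimated with simple arithmetic. The rates below are placeholders, not Modal's actual prices; check the pricing page for current figures.

```python
def estimate_cost(seconds: float, gpu_rate_per_sec: float,
                  cpu_cores: float, cpu_rate_per_core_sec: float,
                  mem_gib: float, mem_rate_per_gib_sec: float) -> float:
    """Estimate the cost of one container run under per-second billing."""
    gpu = seconds * gpu_rate_per_sec
    cpu = seconds * cpu_cores * cpu_rate_per_core_sec
    mem = seconds * mem_gib * mem_rate_per_gib_sec
    return gpu + cpu + mem

# Hypothetical rates: $0.001/s GPU, $0.0000131/core-s CPU, $0.00000222/GiB-s memory.
cost = estimate_cost(600, 0.001, 2, 0.0000131, 8, 0.00000222)
print(f"${cost:.6f}")
```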
What workloads and scale can Modal handle?
Modal is designed for large-scale AI and compute workloads. It autoscales containers to thousands of GPUs and back down to zero in seconds as demand changes, with fast cold boots and a Rust-based container stack engineered for performance, plus robust tooling for scheduling, debugging, and observability.
What hardware and environments are supported?
- Flexible environments: Bring your own image or build one in Python, and scale resources as needed.
- Hardware: GPUs such as Nvidia B200, H200, H100, A100 (80 GB and 40 GB), L40S, A10, L4, and T4 are available.
- Runtime flexibility: Run code with state-of-the-art GPUs and scalable compute primitives; leverage built-in logging and integrations.
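A hedged sketch of how a specific accelerator is requested per function (the GPU strings and function body are illustrative; consult Modal's docs for the exact type names):

```python
import modal

app = modal.App("gpu-example")

# Request a GPU type per container by name (e.g. "H100", "A100-80GB", "T4").
@app.function(gpu="H100")
def train_step():
    import subprocess
    # Confirm the GPU is visible inside the container.
    print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)
```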
How do I deploy web endpoints and APIs?
Web Endpoints let you deploy and manage web services with custom domains, streaming, and websockets, serving functions as secure HTTPS endpoints.
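A minimal sketch of serving a function over HTTPS (app and function names are hypothetical; recent SDK versions use the `@modal.fastapi_endpoint` decorator, which was previously named `@modal.web_endpoint`):

```python
import modal

app = modal.App("web-example")
image = modal.Image.debian_slim().pip_install("fastapi[standard]")

@app.function(image=image)
@modal.fastapi_endpoint(method="POST")
def predict(item: dict) -> dict:
    # Served as a secure HTTPS endpoint after `modal deploy`.
    return {"echo": item}
```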
How do I store and access data?
Data Storage provides network volumes, key-value stores, and queues. You can mount weights and data in distributed volumes and interact with them from your Python code, with easy integration to major cloud storage providers (S3, R2, etc.).
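As a sketch of mounting a network volume (the volume name and mount path are illustrative), weights stored in a distributed volume can be attached to any function:

```python
import modal

app = modal.App("volume-example")
# Create-or-attach a named network volume shared across functions.
volume = modal.Volume.from_name("model-weights", create_if_missing=True)

@app.function(volumes={"/weights": volume})
def load_weights():
    import os
    # The volume's contents appear as a normal filesystem path.
    return os.listdir("/weights")
```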
What debugging and observability features are built in?
Built-In Debugging includes the modal shell for interactive debugging and breakpoints to pinpoint issues quickly. You can export function logs to Datadog or any OpenTelemetry-compatible provider to monitor your workloads.
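As an illustrative sketch (the file and function names are hypothetical), the `modal shell` command opens an interactive shell inside a function's container environment:

```
modal shell my_app.py::my_function
```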
What security and governance features does Modal offer?
Modal is built on gVisor for sandboxing and isolation. It offers SOC 2 and HIPAA compliance, region selection, and SSO sign-in for enterprises. These controls help you manage security, compliance, and access at scale.
Is Modal’s Notebooks feature available?
Yes. Modal Notebooks are generally available, enabling notebook-style development and experimentation alongside other Modal workloads.
What are the pricing plans and what do they include?
Modal offers plan options for teams of all sizes:
- Starter: For small teams and independent developers.
- Team: For startups and larger organizations needing more capacity.
- Enterprise: For security, governance, and customized support.
All plans use the same per-second compute pricing, with $30 in free compute credits each month and additional startup credits where offered.
Are there free compute credits or startup programs?
Yes. Every month you receive $30 in free compute credits, and early-stage startups can access up to $25,000 in additional free compute credits through startup programs.