beta9
📕 Learn how we fine-tuned Mixtral on 128x RTX 4090s

The only tool you need to use GPUs at scale

Beta9 is used to run the most demanding GPU workloads across any cloud provider -- VMs or bare metal.

Run thousands of GPU containers across clouds

Meet Beta9, the container runtime for the most ambitious Python applications

Use GPUs on any cloud

Find the best GPU pricing and availability. Launch workloads on any cloud provider -- VMs or bare metal.

Scale to Zero

Run serverless workloads that automatically scale down between requests.

Autoscale to 1000s of GPUs

Automatically scale out to thousands of GPUs across cloud providers. Stop worrying about quota limits on AWS.

Largest Catalog of GPU Providers

Our open-source approach allows us to run your workloads on any hardware -- even your laptop.

Run LLMs your way

Beta9 ships with primitives for serving LLMs in the cloud: task queues, web endpoints, and scale-to-zero.
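A minimal sketch of how these primitives might look in code. This is a hypothetical example, not the canonical API: the decorator name, the `gpu`, `cpu`, `memory`, and `keep_warm_seconds` parameters, and the GPU identifier are all assumptions modeled on Beta9's Python SDK; check the SDK docs for the exact signatures.

```python
from beta9 import endpoint  # assumed import path; verify against the SDK docs

# Hypothetical configuration sketch: parameter names are assumptions.
@endpoint(
    name="llm-inference",
    gpu="A100-40",        # assumed GPU type identifier
    cpu=4,
    memory="16Gi",
    keep_warm_seconds=0,  # assumed knob for scaling to zero between requests
)
def generate(prompt: str) -> str:
    # A real handler would load a model and run inference here;
    # this stub just echoes the prompt.
    return f"echo: {prompt}"
```

Deploying a function like this through the Beta9 CLI would expose it as a web endpoint that spins up GPU containers on demand and scales back to zero when idle.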

Securely Self-Host

Self-host Beta9 on your own infrastructure, ensuring no data ever leaves your VPC.

Beta9 is the fastest, most secure, and most flexible way to run serverless GPU workloads at massive scale.

Proudly Open Source

Licensed under Apache 2.0
beta9 is used to power production GPU workloads at Beam.