Simple, Transparent Pricing
Pay only for what you use. No monthly subscriptions. No hidden fees. Choose from API access, dedicated GPUs, or multi-node clusters.
How It Works
1. Add Credits
Add any amount starting from $10. Your credits never expire.
2. Use AI Models
Access 500+ models for chat, audio, code, images, embeddings & moderation. Credits deduct based on actual usage.
3. Top Up Anytime
Add more credits whenever you need. No commitments.
No credit card required to sign up
GPU Cluster Pricing
High-performance multi-node clusters for distributed training, HPC, and large-scale inference
60-75% cheaper than AWS, Azure, or GCP
| GPU Type | Per GPU | 2 Nodes (16 GPUs) | 4 Nodes (32 GPUs) | Interconnect |
|---|---|---|---|---|
| NVIDIA H200 141GB (next-gen flagship) | $4.29/hr | $68.64/hr | $137.28/hr | 3.2 Tbps |
| NVIDIA B200 (latest Blackwell architecture) | $5.49/hr | $87.84/hr | $175.68/hr | 3.2 Tbps |
| NVIDIA H100 NVL (optimized for inference) | $3.89/hr | $62.24/hr | $124.48/hr | 3.2 Tbps |
| NVIDIA L40s (cost-effective option) | $0.99/hr | $15.84/hr | $31.68/hr | 1.6 Tbps |
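The multi-node rates follow directly from the per-GPU rate. A minimal sketch of the arithmetic, assuming 8 GPUs per node as the "2 Nodes (16 GPUs)" column implies:

```python
# Sketch: multi-node cluster rates derived from the per-GPU hourly rate.
# Assumes 8 GPUs per node, matching the "2 Nodes (16 GPUs)" column above.
PER_GPU_HOURLY = {       # USD per GPU-hour, from the table above
    "H200 141GB": 4.29,
    "B200": 5.49,
    "H100 NVL": 3.89,
    "L40s": 0.99,
}
GPUS_PER_NODE = 8

def cluster_hourly_rate(gpu: str, nodes: int) -> float:
    """Hourly cost of an N-node cluster at the listed per-GPU rate."""
    return round(PER_GPU_HOURLY[gpu] * GPUS_PER_NODE * nodes, 2)

print(cluster_hourly_rate("H200 141GB", 2))  # 68.64  -> the 16-GPU column
print(cluster_hourly_rate("H200 141GB", 4))  # 137.28 -> the 32-GPU column
```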
GPU Instance Pricing
Single GPU instances for development, fine-tuning, and inference
Per-second billing • No setup fees • Instant deployment
| GPU Model | VRAM | Per Hour | Per Day | Per Month | Best For |
|---|---|---|---|---|---|
| NVIDIA H100 80GB | 80GB | $3.23 | $77.52 | $2,357.90 | Large model training |
| NVIDIA A100 80GB | 80GB | $1.69 | $40.56 | $1,233.70 | Fine-tuning, training |
| NVIDIA A100 40GB | 40GB | $1.29 | $30.96 | $941.70 | Medium models |
| NVIDIA L40s | 48GB | $0.99 | $23.76 | $722.70 | Inference, dev |
| NVIDIA RTX A6000 | 48GB | $0.79 | $18.96 | $576.70 | Development, testing |
| NVIDIA RTX 4090 | 24GB | $0.59 | $14.16 | $430.70 | Small models, prototyping |
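The Per Day and Per Month columns are simple multiples of the hourly rate. A minimal sketch, assuming a 24-hour day and a 730-hour month (which reproduces the listed figures):

```python
# Sketch: deriving the Per Day / Per Month columns from the hourly rate.
# A 730-hour month (365 * 24 / 12) is an assumption; it matches the table exactly.
HOURS_PER_DAY = 24
HOURS_PER_MONTH = 730

def daily_rate(hourly: float) -> float:
    return round(hourly * HOURS_PER_DAY, 2)

def monthly_rate(hourly: float) -> float:
    return round(hourly * HOURS_PER_MONTH, 2)

print(daily_rate(3.23), monthly_rate(3.23))  # 77.52 2357.9  (H100 80GB row)
print(daily_rate(0.59), monthly_rate(0.59))  # 14.16 430.7   (RTX 4090 row)
```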
Compare Our Pricing
See how much you can save compared to major cloud providers
| Configuration | RunAICloud | AWS | Azure | GCP | Savings |
|---|---|---|---|---|---|
| 1x H100 80GB (per hour) | $3.23 | $8.14 | $9.45 | $8.92 | 60-65% off |
| 1x A100 80GB (per hour) | $1.69 | $4.95 | $5.61 | $5.23 | 65-70% off |
| 8x H100 cluster (per hour) | $25.84 | $65.12 | $75.60 | $71.36 | 60-65% off |
| 16x H100 cluster, 2 nodes (per hour) | $51.65 | $130.24 | $151.20 | $142.72 | 60-65% off |
Massive Savings on GPU Compute
By optimizing our infrastructure and passing the savings to you, we offer 60-75% lower prices than AWS, Azure, and GCP for equivalent GPU compute.
Example: A 16-GPU H100 cluster that costs $130+/hr on AWS costs only $51.65/hr on RunAICloud. That's over $78/hr in savings, or $56,000+ per month!
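A minimal sketch of that arithmetic, using the comparison-table rates and the same 730-hour month assumed in the instance pricing above:

```python
# Sketch: reproducing the 16-GPU H100 savings example above.
RUNAICLOUD_RATE = 51.65   # $/hr for a 16x H100 cluster (2 nodes), from the table
AWS_RATE = 130.24         # $/hr for the equivalent AWS configuration, from the table
HOURS_PER_MONTH = 730     # same assumption as the instance pricing table

hourly_savings = AWS_RATE - RUNAICLOUD_RATE
monthly_savings = hourly_savings * HOURS_PER_MONTH

print(f"${hourly_savings:.2f}/hr saved")       # $78.59/hr saved
print(f"${monthly_savings:,.0f}/month saved")  # $57,371/month saved
```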
AI Model API Pricing
Access 500+ AI models through our unified API. Transparent pricing per million tokens.
| Model Category | Price per 1M Tokens |
|---|---|
| Small Models (3B-7B): Llama 8B, Gemma, etc. | $0.035 - $0.05 |
| Medium Models (8B-34B): DeepSeek, Qwen, etc. | $0.07 - $0.20 |
| Large Models (70B+): Llama 70B, Mixtral, etc. | $0.14 - $0.20 |
| Premium Models: GPT-4, Claude, Gemini | $0.16 - $3.90 |
| Code Models: specialized coding models | $0.04 - $0.20 |
| Image Models: text-to-image generation | View models page |
| Audio Models: speech-to-text, TTS, audio processing | $0.01 - $0.50 |
| Embedding Models: vector embeddings for RAG & search | $0.02 - $0.10 |
| Moderation Models: content safety & filtering | $0.02 - $0.15 |
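As a rough illustration of how credits translate into API usage, a chat-completion request might look like the sketch below. The base URL, header, and model identifier are placeholders rather than documented values; the real endpoint and model IDs come from the dashboard and docs.

```python
# Hypothetical sketch of a token-metered chat request against the unified API.
# BASE_URL and the model ID are placeholders, not documented values.
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.example-runaicloud.com/v1"  # placeholder endpoint

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "llama-3.1-8b-instruct",  # placeholder model ID
        "messages": [{"role": "user", "content": "Summarize per-second GPU billing."}],
    },
    timeout=30,
)
print(resp.json())  # token counts in the response would drive the per-1M-token deduction
```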
Frequently Asked Questions
GPU Clusters & Instances
How is GPU usage billed?
Both GPU instances and clusters are billed per-second with no minimum charges. You only pay for the exact time your GPUs are running. For example, if you use a $3/hour GPU for 30 minutes, you'll be charged $1.50.
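A minimal sketch of that calculation (rounding to whole cents is an assumption about how fractional amounts are displayed):

```python
# Sketch of per-second billing: cost = (hourly rate / 3600) * seconds used.
def usage_cost(hourly_rate: float, seconds: float) -> float:
    return round(hourly_rate / 3600 * seconds, 2)  # cent rounding is an assumption

print(usage_cost(3.00, 30 * 60))  # 1.5   -> the 30-minute example above
print(usage_cost(3.23, 45 * 60))  # 2.42  -> 45 minutes on an H100 80GB instance
```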
What's the difference between GPU instances and clusters?
GPU Instances: Single GPU machines perfect for development, fine-tuning, and inference. Deploy in seconds, SSH access included.
GPU Clusters: Multi-node systems (2-8 nodes) with high-speed InfiniBand networking, ideal for distributed training, HPC workloads, and large-scale inference.
Can I scale my cluster up or down?
Currently, cluster configurations are fixed at creation time. However, you can terminate a cluster and create a new one with different specifications at any time. We're working on dynamic scaling capabilities.
What templates are available for clusters?
We offer 5 pre-configured templates: PyTorch Distributed Training, Slurm HPC Cluster, Axolotl LLM Fine-Tuning, TensorFlow Distributed, and Ray Distributed Computing. All templates come with NCCL, CUDA, and necessary environment variables pre-configured.
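As a rough illustration of what those pre-configured environment variables enable (a generic PyTorch sketch, not the template's actual code): with NCCL and the launcher-provided rank variables in place, per-process setup reduces to a few lines.

```python
# Generic PyTorch distributed-init sketch; not the template's actual contents.
# A torchrun-style launcher supplies RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR,
# and MASTER_PORT, so each process only needs to pick NCCL and bind its GPU.
import os
import torch
import torch.distributed as dist

def init_distributed() -> int:
    dist.init_process_group(backend="nccl")  # reads the launcher-set env vars
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)        # one process per GPU
    return local_rank

if __name__ == "__main__":
    local_rank = init_distributed()
    print(f"rank {dist.get_rank()}/{dist.get_world_size()} on GPU {local_rank}")
    dist.destroy_process_group()
```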
What network speeds do clusters offer?
Clusters feature ultra-fast InfiniBand or RoCE v2 interconnects: 1.6 Tbps for A100/L40s clusters and up to 3.2 Tbps for H100/H200/B200 clusters. This ensures minimal communication overhead for distributed training.
How quickly can I deploy a GPU or cluster?
GPU instances deploy instantly (typically under 30 seconds). Clusters deploy in 1-2 minutes. All come with pre-configured environments and SSH access.
Is there a minimum usage time for GPUs?
No minimum! Billing is per-second. Use a GPU for 5 seconds or 5 months - you only pay for actual usage. However, we recommend keeping GPUs running for at least a few minutes to make setup worthwhile.
API & Credits
Do credits expire?
No, your credits never expire. Use them at your own pace for API calls, GPU instances, or clusters.
What's the minimum credit purchase?
The minimum is $10. You can add any amount above that. Credits can be used for AI model APIs, GPU instances, and GPU clusters.
Are there any hidden fees?
Absolutely no hidden fees. You only pay for what you use - API tokens, GPU seconds, or cluster compute time. No setup fees, no bandwidth charges, no surprise costs.
Can I get a refund?
Unused credits can be refunded within 30 days of purchase. Contact support for refund requests. Note that used credits (API calls, GPU time) are non-refundable.
Do you offer volume discounts?
Yes! Contact us for enterprise pricing if you plan to spend $1,000+ per month. We offer custom pricing for high-volume API usage and dedicated GPU commitments.
Can I use the same credits for APIs and GPUs?
Yes! Credits are universal across our platform. Use them for AI model API calls, single GPU instances, or multi-node clusters - whatever your project needs.
Savings & Comparison
How can you offer 60-75% savings vs AWS/Azure/GCP?
We optimize our infrastructure, leverage spot capacity efficiently, and maintain lower overhead. We pass these savings directly to customers rather than pocketing the difference. Our pricing is transparent and competitive.
Are there any compromises with lower pricing?
No compromises! You get the same enterprise-grade NVIDIA GPUs (H100, A100, etc.), ultra-fast networking, and reliable infrastructure. The only difference is the price.
How much can I save on a typical workload?
Example 1: Training a large language model on a 16-GPU H100 cluster for 24 hours:
- AWS: ~$3,125 (24 hrs × $130.24/hr)
- RunAICloud: $1,240 (24 hrs × $51.65/hr)
- Savings: $1,885 per day!