From prototype to production scale.
Pay-as-you-go from $10. Committed plans from $499/mo. Dedicated GPU instances from $5,000/mo. All 200+ models, one API.
Provision dedicated GPU instances for inference, fine-tuning, and training. Available on Dedicated and Enterprise tiers.
| GPU | VRAM | Price / hr | Best For |
|---|---|---|---|
| NVIDIA H100 SXM | 80 GB | $3.49/hr | Large models, fine-tuning, training |
| NVIDIA A100 SXM | 80 GB | $2.09/hr | Training, high-throughput inference |
| NVIDIA L40S | 48 GB | $1.19/hr | Inference, cost-efficient production |
| NVIDIA A10G | 24 GB | $0.75/hr | Lightweight inference, experimentation |
GPU instances billed per hour. Minimum 1-hour commitment. Multi-GPU clusters available on Enterprise. Pricing may vary by availability and region.
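As a back-of-envelope illustration, the hourly rates above translate into monthly costs as follows. This is a sketch only: it assumes ~730 hours in a month and uses the list prices, which (per the note above) may vary by availability and region.

```python
# Rough monthly cost estimate for dedicated GPU instances, using the
# hourly list rates from the table above (illustrative only; actual
# pricing varies by availability and region).

GPU_HOURLY_USD = {
    "H100": 3.49,
    "A100": 2.09,
    "L40S": 1.19,
    "A10G": 0.75,
}

def monthly_gpu_cost(gpu: str, count: int = 1, hours: float = 730) -> float:
    """Cost in USD of running `count` GPUs for `hours` (~730 hours/month)."""
    return round(GPU_HOURLY_USD[gpu] * count * hours, 2)

# A single H100 running continuously for a month:
print(monthly_gpu_cost("H100"))
# A 4x L40S inference cluster:
print(monthly_gpu_cost("L40S", count=4))
```

Note that these figures exceed the $5,000/mo Dedicated tier floor quickly once clusters grow, which is why exact pricing goes through sales.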
From shared multi-tenant pools to fully isolated on-premises deployments. Pick the tenancy model that matches your compliance and performance needs.
Every model has clear input and output pricing. Use our Vedika models for faith-domain tasks, or route to any open-source model at competitive rates.
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context |
|---|---|---|---|---|
| Vedika Standard | XALEN | $0.60 | $1.80 | 128K |
| Vedika Fast | XALEN | $0.10 | $0.30 | 128K |
| Vedika Voice | XALEN | $0.02/sec | — | 31 languages |
| Llama 3.1 405B | Meta | $0.88 | $2.64 | 128K |
| Llama 3.1 70B | Meta | $0.54 | $1.62 | 128K |
| Mixtral 8x22B | Mistral AI | $0.60 | $1.80 | 65K |
| Qwen 2.5 72B | Alibaba | $0.54 | $1.62 | 128K |
| DeepSeek V3 | DeepSeek | $0.27 | $0.81 | 128K |
| Gemma 2 27B | Google | $0.20 | $0.60 | 8K |
| Command R+ | Cohere | $2.50 | $7.50 | 128K |
| +190 more models | | Full pricing in docs | | |
Batch processing pricing is 50% of the rates shown above. All prices in USD. Custom pricing available for Enterprise plans.
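A quick sketch of how the per-million-token rates above turn into a request cost, including the 50% batch discount. The rates passed in are taken from the table; the token counts are illustrative.

```python
# Estimate request cost from the per-1M-token rates in the table above.
# Batch processing is billed at 50% of the listed rates.

def token_cost(input_tokens: int, output_tokens: int,
               input_rate: float, output_rate: float,
               batch: bool = False) -> float:
    """Cost in USD; rates are per 1M tokens, as in the pricing table."""
    cost = (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * output_rate
    return cost * 0.5 if batch else cost

# Vedika Standard: $0.60 input / $1.80 output per 1M tokens
print(token_cost(100_000, 20_000, 0.60, 1.80))              # real-time
print(token_cost(100_000, 20_000, 0.60, 1.80, batch=True))  # batch, half price
```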
Add a minimum of $10 to your wallet via Razorpay (UPI, cards, net banking). Use any of the 200+ models and pay per token consumed. Credits are valid for 1 year from purchase. No monthly fees, no commitments. When your balance hits zero, API requests return 402 until you top up.
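Clients on pay-as-you-go should treat the 402 as a distinct, recoverable condition rather than a generic failure. A minimal sketch of that handling, where `call` stands in for whatever function performs the HTTP request and returns a `(status, body)` pair (the request itself is not shown here):

```python
# Minimal sketch of client-side handling for the zero-balance case.
# The platform returns HTTP 402 when pay-as-you-go credits run out;
# `call` is a placeholder for the actual request function.

class WalletEmpty(Exception):
    """Raised when the API returns 402 (pay-as-you-go balance at zero)."""

def guarded_call(call):
    status, body = call()
    if status == 402:
        raise WalletEmpty("Balance is zero; top up (min $10) to resume requests.")
    if status != 200:
        raise RuntimeError(f"API error {status}")
    return body

# Usage with a stand-in callable simulating an empty wallet:
try:
    guarded_call(lambda: (402, None))
except WalletEmpty as e:
    print(e)
```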
Growth ($499/mo) gives you 300 req/min, 500K tokens/min, 5 API keys, priority support, and usage analytics. Scale ($2,499/mo) upgrades to 1,000 req/min, 2M tokens/min, 10 API keys, priority inference queue, dedicated account manager, and 99.9% SLA. Both tiers include all 200+ models at standard token pricing.
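To decide between tiers, it helps to check a workload against both limits at once, since either the request ceiling or the token ceiling can be the binding constraint. A sketch using the Growth and Scale figures above:

```python
# Check whether a workload fits a tier's rate limits, using the
# Growth (300 req/min, 500K tokens/min) and Scale (1,000 req/min,
# 2M tokens/min) figures quoted above.

TIER_LIMITS = {
    "growth": {"req_per_min": 300, "tokens_per_min": 500_000},
    "scale":  {"req_per_min": 1_000, "tokens_per_min": 2_000_000},
}

def fits_tier(tier: str, req_per_min: float, avg_tokens_per_req: float) -> bool:
    """True if both the request and token throughput fit within the tier."""
    limits = TIER_LIMITS[tier]
    return (req_per_min <= limits["req_per_min"]
            and req_per_min * avg_tokens_per_req <= limits["tokens_per_min"])

# 250 requests/min at ~1,500 tokens each -> 375K tokens/min, fits Growth:
print(fits_tier("growth", 250, 1_500))
```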
Dedicated tier starts at $5,000/mo. You get isolated GPU instances billed per GPU-hour: H100 at ~$3.49/hr, A100 at ~$2.09/hr, L40S at ~$1.19/hr. Includes custom fine-tuning, 99.99% SLA, single-tenant isolation, and private endpoints. Contact sales for exact pricing based on your configuration.
Upgrades are immediate — your new rate limits apply instantly. Downgrades take effect at the end of your current billing cycle. You can always fall back to Pay As You Go with no penalty. Contact billing@xalen.io for tier changes.
Yes. Verified religious organizations and registered nonprofits receive 30% off all token pricing on any tier. Contact enterprise@xalen.io with your organization verification documents and we will apply the discount within 48 hours.
Enterprise tier includes SOC 2 compliance and HIPAA support for health-faith applications. We also offer SSO/SAML integration, data residency options, and custom compliance documentation. Contact our enterprise team for specific certification requirements.
Processing millions of tokens monthly? Need dedicated infrastructure, custom SLAs, or on-premise deployment? Let us build a plan for your organization.
From pay-as-you-go prototyping to dedicated GPU clusters. Pick the tier that fits your stage.