Now running DeepSeek R1, Llama 3.1 & more

Frontier AI. In Your Control.

Self-hosted AI platform powered by open-source models. Web search, memory, skills, full API access — your data never leaves your server.

Quick start — 30 seconds
# OpenAI-compatible API
curl https://bundleai.cloud/v1/chat/completions \
  -H "Authorization: Bearer YOUR_KEY" \
  -d '{"model":"deepseek-r1:14b",
      "messages":[{"role":"user","content":"Hello!"}]}'

# Response streams in real-time
"Hello! How can I assist you today?"
7+
AI Models
100K
Free Tokens
5
Built-in Tools
100%
Private & Self-hosted
OpenAI
Compatible API
Powered by open-source frontier models
DeepSeek R1 14B
Llama 3.1 8B
Llama 3.2 3B
Gemma 3
Gemma 2
DeepSeek R1 7B
+ Pull any Ollama model

Everything you need.
Nothing you don't.

BundleAI packs enterprise-grade AI capabilities into a clean, fast platform anyone can run.

🧠
Persistent Memory
AI remembers your conversations across sessions. Pin facts, preferences, and context that always stay with you.
🔍
Live Web Search
Automatic real-time web search for current events, prices, news, and anything beyond training data.
Smart Routing
Messages auto-route to the right model — fast 3B for greetings, powerful 14B for complex reasoning.
📚
RAG Knowledge Base
Upload PDFs, docs, and text files. AI answers from your documents with full source attribution.
🎯
Custom Skills
Create specialized AI personas — code reviewer, researcher, writer, analyst — with custom system prompts.
🔌
OpenAI-Compatible API
Drop-in replacement for OpenAI. Same API format — switch any app by changing one URL.
🧮
Built-in Tools
Calculator, weather, URL fetcher, web search, page summarizer — all available to the AI automatically.
🔐
Full Data Privacy
Every conversation stays on your server. No third-party APIs. No data logging. 100% self-hosted.
💳
Token Economy
Pay per use with Stripe or crypto. Top up tokens, track usage, manage users from the admin panel.

Thinks before it answers.

DeepSeek R1's reasoning model shows its thought process — collapsible thinking blocks let you see exactly how it arrived at every answer.

  • Collapsible "💭 Thinking..." reasoning blocks
  • Real-time streaming responses
  • Multi-turn conversation memory
  • Code syntax highlighting with copy buttons
  • 5 themes including dark, midnight, forest
👤
Explain how neural networks learn
🧠
💭 Reasoning · 2.1s
Let me think through backpropagation and gradient descent...
Neural networks learn through **backpropagation** — adjusting millions of weights to minimize prediction error. Each training example nudges the network slightly closer to the right answer...
deepseek-r1:14b · streaming

Switch from OpenAI in 30 seconds.

BundleAI speaks OpenAI's API format natively. Change one URL and your existing apps just work — with full streaming, tool calls, and vision support.

  • OpenAI-compatible /v1/chat/completions
  • Personal API key management
  • Token-based rate limiting
  • Webhook support for integrations
  • Full API documentation at /docs
Python example
from openai import OpenAI

client = OpenAI(
  base_url="https://bundleai.cloud/v1",
  api_key="bai_your_key_here"
)

response = client.chat.completions.create(
  model="deepseek-r1:14b",
  messages=[{
    "role": "user",
    "content": "Hello!"
  }]
)

print(response.choices[0].message.content)

Up and running in minutes.

1
Create account
Sign up and get 100,000 free tokens instantly. No credit card required to start.
2
Choose your model
Select from 7+ models — DeepSeek for reasoning, Llama for speed, or pull any Ollama model.
3
Start chatting or building
Use the chat UI directly or integrate the API into your app with one URL change.
4
Top up when needed
Buy token packs with Stripe or crypto. Tokens never expire. Pay only for what you use.

Pay for what you use.

Tokens never expire. Start free with 100K tokens. Scale as you grow.

Starter
$4.99
500K tokens
Get started
Basic
$12.99
2M tokens
Get started
Business
$99.99
30M tokens
Get started
Enterprise
$299
100M tokens
Get started

All plans include every feature. View full pricing →

Your AI.
Your server. Your rules.

100,000 free tokens. No card required. Full platform access.