Get a Sipsa Inference API key
OpenAI-compatible inference API serving compressed transformer models. Type your email below, get a key in 3 seconds, $5 free credit. Same SDK as OpenAI — just change the base URL.
Instant signup
Free $5 credit. No card. No waitlist. Key shown once — save it now.
By signing up you agree to the privacy policy. We don’t log prompts or completions.
What you can do with $5 of free credit
- ~5,000 chat completions on smaller models (Qwen3-0.6B, OLMo-2-1B) at default settings
- ~500 completions on Mistral-7B, Qwen3-8B, Mixtral-8x7B
- ~50 completions on Hermes-3-Llama-3.1-405B (the largest model we have perplexity-verified)
- Drop-in openai SDK compatibility: no code rewrite
- Verifiable, reproducible reconstruction: SHA-256 manifest available per model on demand
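As a rough check on the completion estimates above, dividing the $5 credit by each approximate count gives the implied average cost per completion. This is a back-of-the-envelope sketch: the counts are the approximate figures from the list, and the tier labels are illustrative names, not API model IDs.

```python
# Implied average cost per completion for the $5 free credit.
# Counts are the approximate figures quoted in the list above;
# the tier labels are illustrative only, not model IDs.
FREE_CREDIT_USD = 5.00

approx_completions = {
    "small (Qwen3-0.6B, OLMo-2-1B)": 5_000,
    "mid (Mistral-7B, Qwen3-8B, Mixtral-8x7B)": 500,
    "large (Hermes-3-Llama-3.1-405B)": 50,
}


def implied_cost_per_completion(credit_usd: float, completions: int) -> float:
    """Average dollars per completion if the credit is spent evenly."""
    return credit_usd / completions


costs = {
    tier: implied_cost_per_completion(FREE_CREDIT_USD, count)
    for tier, count in approx_completions.items()
}

for tier, cost in costs.items():
    print(f"{tier}: ~${cost:.3f} per completion")
```

These are averages at default settings; the cost of any single request will likely vary with prompt and completion length.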
Two paths if you outgrow free credit
Path 1 — Subscribe (live, instant access)
Subscribe via Stripe:
- Hobby, $10/mo: 20K reqs/mo
- Solo, $20/mo: 60K reqs/mo
- Pro, $100/mo: 250K reqs/mo, priority queue, 99.5% SLA
- Team, $200/mo: 600K reqs/mo, 3 seats, audit logs
Sign up above first to get your API key, then click Subscribe to link your Stripe subscription to your account.
Full pricing: /pricing. Want compression on your model? Start a $5K Phase 0 POC — five business days, signed reconstruction audit.
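To compare the subscription tiers under Path 1, the monthly price can be divided by the monthly request quota. The sketch below uses only the prices and quotas quoted above and assumes the full quota is consumed each month; the function name is illustrative.

```python
# Effective price per 1,000 requests for each tier, assuming the full
# monthly quota is used. Figures are the ones quoted under Path 1.
tiers = {
    "Hobby": (10, 20_000),    # (USD per month, requests per month)
    "Solo": (20, 60_000),
    "Pro": (100, 250_000),
    "Team": (200, 600_000),
}


def usd_per_1k_requests(monthly_usd: float, monthly_requests: int) -> float:
    """Dollars per 1,000 requests at full quota utilisation."""
    return monthly_usd / monthly_requests * 1_000


per_1k = {name: usd_per_1k_requests(usd, reqs) for name, (usd, reqs) in tiers.items()}

for name, price in per_1k.items():
    print(f"{name}: ${price:.2f} per 1K requests")
```

On these numbers Solo and Team come out cheapest per request, while Pro costs slightly more per request and adds the priority queue and SLA.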
Path 2 — Self-host (no API key required)
The compressed model substrate is fully production-ready. Pull any of the 22 customer-side reproducible artifacts from our Hugging Face org and run it locally:
pip install ultracompress
hf download SipsaLabs/qwen3-8b-uc-v3-bpw5 --local-dir ./qwen3-8b
uc verify ./qwen3-8b  # confirms pack structure + SHA-256 download integrity
Free for sub-$1M ARR companies, research, and individuals (BUSL-1.1 + Additional Use Grant). Auto-converts to Apache 2.0 four years after each release.
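To illustrate what a SHA-256 integrity check involves, here is a minimal standard-library sketch. It is not the implementation of uc verify: the manifest format (a mapping from relative file path to hex digest) and both function names are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose digest differs.

    `manifest` maps relative path -> expected hex digest. This layout is
    hypothetical, not the documented manifest format.
    """
    bad = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            bad.append(rel_path)
    return bad
```

Under these assumptions, find_mismatches(Path("./qwen3-8b"), manifest) returns an empty list when every listed file is present and matches its expected digest.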
Code sample (managed API)
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.sipsalabs.com/v1",
    api_key=os.environ["SIPSA_API_KEY"],
)

response = client.chat.completions.create(
    model="sipsa-qwen3-0.6b",  # instant. Mixtral-8x7B, 235B MoE, 405B available; see /pricing
    messages=[{"role": "user", "content": "hello, lossless world"}],
)

print(response.choices[0].message.content)
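Because the API is described as OpenAI-compatible, the same call can be sketched without the SDK. The example below only builds the HTTP request and does not send it. It assumes the endpoint follows the OpenAI chat-completions wire format (POST to /chat/completions under the base URL, Bearer-token auth, JSON body); the helper name and the placeholder key are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://api.sipsalabs.com/v1"  # base URL from the SDK sample


def build_chat_request(api_key: str, model: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request.

    Assumes the OpenAI wire format: JSON body and Bearer-token auth.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "sk-example" is a placeholder, not a real key.
req = build_chat_request("sk-example", "sipsa-qwen3-0.6b", "hello")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen and decoding the JSON response would complete the call; in practice the SDK route shown earlier is simpler and handles retries and errors.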
Compliance & enterprise
For SOC 2 / SR-11-7 / FDA / DoD / HIPAA-bound deploys, the on-prem MSA path is available today (no waitlist). See the enterprise page for the three deployment paths and current compliance status, or email founder@sipsalabs.com for the contract template. SHA-256 verifiable, reproducible reconstruction is the regulatory-equivalence floor that makes this the only 5-bit-class substrate viable for those workloads.