Sipsa Labs · Portfolio · 2 products live · 1 service · 1 on roadmap

Sipsa Labs Products.

Sipsa Labs is a parent company that builds verifiable AI infrastructure. Two products live, one contract-only service, one research-stage product on a public roadmap. Each carries the same posture: SHA-256 reproducibility, published benchmarks, and a self-serve path before a sales motion.

Shipping — self-serve today Available — contract-only Research — in development

/ Portfolio

The lineup, today.

Two shipped products, one contract-only service, one research-stage product on a public roadmap. Each card lists status, who it's for, and the next step.

Shipping · flagship

UltraCompress

Lossless 5-bit transformer compression + OpenAI-compatible inference

5× smaller weights at the same task quality, with bit-identical reconstruction you can verify on your own hardware. Headline records: Hermes-3-Llama-3.1-405B at 1.0066× perplexity ratio on a single 32 GB consumer GPU; Mistral-7B-v0.3 at 1.00548× — tightest dense 7B-class 5-bit number published. 22 architectures shipped, 14 PPL-verified end-to-end. Same OpenAI SDK, just change the base URL.

Status Live on PyPI v0.6.9 · self-serve API · $5 free credit, no card Install pip install ultracompress Stack PyTorch · CUDA · safetensors · HF Hub Verify uc verify reproduces the SHA-256 contract on your hardware

Visit product → Try free demo Get API credit GitHub ↑ Hugging Face ↑

Shipping · managed API

Sipsa Inference

OpenAI-compatible managed serving for compressed transformers

Drop-in inference API at api.sipsalabs.com/v1 serving all 22 shipped architectures including Hermes-3-Llama-3.1-405B. Same openai SDK, just change the base URL. Pro ($99/mo, 600 req/min) and Team ($499/mo, 6,000 req/min) tiers with reserved capacity. $5 free credit to start, no card required.

Status Live · publicly self-serve · Pro + Team tiers Endpoint api.sipsalabs.com/v1 SDK openai Python/Node · any OpenAI-compatible client

Get $5 free credit → Try live demo View pricing

Available · contract-only

Compression-as-a-Service

Bring your fine-tuned model. We return a verifiable artifact.

For teams running an in-house fine-tune or an internal foundation model on an architecture not yet in the public matrix. We compress to your spec and return a customer-side reproducible artifact — bit-identical, SHA-256-verifiable, served by the same runtime as your stock models. You keep the artifact. Mainstream public architectures integrated within 2 weeks of release; internal architectures under MNDA.

For In-house fine-tunes · internal foundation models · customer-specific architectures Pricing $5K — $50K per architecture Contract Lawyer-reviewed for engagements above $50K Delivery Customer-side reconstruction — the codec recipe stays at Sipsa Labs

See deployment paths → Scope a job

Research · Q3 2026 target

Verifier as a Service

Third-party reconstruction + perplexity verification

A managed audit surface for regulated buyers who need an independent attestation that a compressed model is bit-identical to its source and preserves task quality within a published tolerance. Productizes the uc verify contract and the held-out PPL eval harness into a signed, dated report. Designed for SR-11-7, FDA SaMD, and internal model-governance reviews where the compressed and uncompressed artifacts must be treated as the same model under audit.

For Regulated buyers · ML platform governance teams · procurement Status Research · design partners welcome Target Q3 2026 limited beta · pricing TBD

Become a design partner Security posture

/ What's next

More products, same posture.

The roadmap is research-led.

Sipsa Labs is a multi-product portfolio. Beyond the cards above, we're prototyping additional research-stage tooling spanning training-time, inference-time, and audit surfaces. Each future product will follow the same rule the current ones do: self-serve before sales motion, public benchmark before public claim, signed manifest before "trust us".

No codenames in marketing copy. No dates we can't hit. When something is ready to use, it gets a card on this page.

Read our research → · Early-signal request →