Two shipped products, one contract-only service, one research-stage product on a public roadmap. Each card lists status, who it's for, and the next step.
Shipping · flagship
UltraCompress
Lossless 5-bit transformer compression + OpenAI-compatible inference
5× smaller weights at the same task quality, with bit-identical reconstruction you can verify on your own hardware. Headline records: Hermes-3-Llama-3.1-405B at 1.0066× perplexity ratio on a single 32 GB consumer GPU; Mistral-7B-v0.3 at 1.00548× — tightest dense 7B-class 5-bit number published. 22 architectures shipped, 14 PPL-verified end-to-end. Same OpenAI SDK, just change the base URL.
Status Live on PyPI v0.6.9 · self-serve API · $5 free credit, no card
Install pip install ultracompress
Stack PyTorch · CUDA · safetensors · HF Hub
Verify uc verify reproduces the SHA-256 contract on your hardware
Shipping · managed API
Sipsa Inference
OpenAI-compatible managed serving for compressed transformers
Drop-in inference API at api.sipsalabs.com/v1 serving all 22 shipped architectures including Hermes-3-Llama-3.1-405B. Same openai SDK, just change the base URL. Pro ($99/mo, 600 req/min) and Team ($499/mo, 6,000 req/min) tiers with reserved capacity. $5 free credit to start, no card required.
Status Live · publicly self-serve · Pro + Team tiers
Endpoint api.sipsalabs.com/v1
SDK openai Python/Node · any OpenAI-compatible client
Available · contract-only
Compression-as-a-Service
Bring your fine-tuned model. We return a verifiable artifact.
For teams running an in-house fine-tune or an internal foundation model on an architecture not yet in the public matrix. We compress to your spec and return a customer-side reproducible artifact — bit-identical, SHA-256-verifiable, served by the same runtime as your stock models. You keep the artifact. Mainstream public architectures integrated within 2 weeks of release; internal architectures under MNDA.
For In-house fine-tunes · internal foundation models · customer-specific architectures
Pricing $5K — $50K per architecture
Contract Lawyer-reviewed for engagements above $50K
Delivery Customer-side reconstruction — the codec recipe stays at Sipsa Labs
Research · Q3 2026 target
Verifier as a Service
Third-party reconstruction + perplexity verification
A managed audit surface for regulated buyers who need an independent attestation that a compressed model is bit-identical to its source and preserves task quality within a published tolerance. Productizes the uc verify contract and the held-out PPL eval harness into a signed, dated report. Designed for SR-11-7, FDA SaMD, and internal model-governance reviews where the compressed and uncompressed artifacts must be treated as the same model under audit.
For Regulated buyers · ML platform governance teams · procurement
Status Research · design partners welcome
Target Q3 2026 limited beta · pricing TBD