Blog

UltraCompress engineering notes, lab discoveries, and shipping milestones. We tell you which numbers we have measured, when, with what conditions — and we do not tell you the others.

2026-05-11

Sipsa Inference is live.

OpenAI-compatible inference API at api.sipsalabs.com/v1 serving 22 lossless 5-bit transformer architectures including Hermes-3-Llama-3.1-405B at 1.0066x PPL ratio. Drop-in replacement for the official openai SDK. First $5 of usage on us.

2026-05-09

We searched for our competition. The 5-bit band was empty.

A live HuggingFace Hub query for lossless 5-bit transformer compression returned zero competing artifacts. Twenty-two architectures, three new sub-1.005x records this week, SHA-256 verifiable bit-identical reconstruction.

2026-05-09

Hermes-3-405B compressed lossless at 5 bits.

Largest dense transformer artifact published to HuggingFace at 5-bit lossless. Compressed end-to-end on dual RTX 5090s in 13 hours of wall clock. 251 GB pack on disk. Verified perplexity ratio 1.0066x.

2026-05-08

Eighteen architectures, one pack format.

UltraCompress 5-bit lossless transformer compression validated across 18 architectures from 0.6B to 405B parameters — dense, mixture-of-experts, and state-space (Mamba). One pack format. Bit-identical reconstruction.