Sipsa Labs · Research surface · 23 verified records · every open-pack number reproducible

Research.

Near-lossless compression at the limits: methods, results, and a reproduction harness for every open-pack number. Weight reconstruction is SHA-256-verifiable on every openly downloadable pack, and PPL ratios can be regenerated with the reproduction harness, which is available to evaluators on request along with full per-architecture provenance. We publish only results that reproduce; we don't cherry-pick.

/ Verified PPL records

23 verified records: 22 PPL-verified end-to-end + 1 ViT cosine-verified.

22 PPL-verified architectures with held-out perplexity ratios reproduced end-to-end against the bf16 baseline (17 dense + 4 MoE + 1 SSM), plus 1 ViT cosine-verified (DINOv2-Large, CLS cosine). A summary of headline rows is below. See the full benchmarks page →

| Model | Params | PPL ratio | Verification / note |
| Hermes-3-Llama-3.1-405B | 405B | 1.0066× | Single 32 GB consumer GPU |
| Qwen3-1.7B-Base | 1.7B | 1.0040× | Tightest dense-decoder record |
| Mixtral-8x7B | 47B MoE | 1.00368× | End-to-end PPL |
| Mistral-7B-v0.3 | 7B | 1.00548× | Tightest dense 7B-class 5-bit ratio we currently publish |
| Qwen3-0.6B | 0.6B | 1.0069× | Local PPL eval |
| OLMo-2-0425-1B-Base | 1B | 1.0073× | End-to-end PPL |
| SmolLM2-1.7B-Instruct | 1.7B | 1.0075× | End-to-end PPL |
| Llama-3.1-8B | 8B | 1.0125× | Architecture-specific floor |

Held-out PPL on FineWeb-edu. Baseline = bf16 reference weights. Full verified matrix →
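
To make the PPL ratio column concrete, here is a minimal sketch of how such a ratio can be computed. It assumes the ratio is the compressed model's perplexity divided by the bf16 baseline's perplexity, and that perplexity is the exponential of the mean per-token negative log-likelihood. The function names and the toy loss values are illustrative only; they are not taken from the harness or from any published run.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


def ppl_ratio(baseline_nlls, compressed_nlls):
    """Compressed-model perplexity divided by baseline perplexity (assumed definition)."""
    return perplexity(compressed_nlls) / perplexity(baseline_nlls)


# Toy per-token losses (made-up numbers, not from any published record).
baseline = [2.00, 2.10, 1.90]
compressed = [2.01, 2.11, 1.91]
print(f"{ppl_ratio(baseline, compressed):.4f}x")
```

A ratio of exactly 1 would mean the compressed model scores the held-out text as well as the baseline; the further above 1, the larger the quality gap.
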
/ Reproducible, not cherry-picked

Verify every number yourself.

A result only means something if you can regenerate it. Weight reconstruction is SHA-256-verifiable on every openly downloadable pack, and PPL ratios are reproducible with the harness (available to evaluators on request, along with full per-architecture provenance). We publish only results that reproduce on demand, not just the runs that looked good.

Reproduce it, don't trust it.

uc verify confirms pack structure and download integrity against the public SHA-256 manifest; no GPU is required. The PPL reproduction harness reruns the baseline-vs-compressed perplexity comparison on the same held-out FineWeb-edu tail that produced every record on this page, on a single consumer GPU. It uses seed 42, deterministic evaluation, and a fixed n per row: the same harness for every architecture, no hand-tuned hero runs.
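
The integrity half of that check can be illustrated with standard-library hashing. This is a sketch of the general technique, not the uc verify implementation: the manifest layout (a JSON object mapping relative file paths to hex SHA-256 digests) and the function names are assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large shards never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pack(pack_dir, manifest_path):
    """Return (file, problem) pairs; an empty list means every digest matched.

    Assumes a hypothetical manifest of the form {"relative/path": "<hex sha256>"}.
    """
    manifest = json.loads(Path(manifest_path).read_text())
    problems = []
    for rel_path, expected in manifest.items():
        target = Path(pack_dir) / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(target) != expected.lower():
            problems.append((rel_path, "digest mismatch"))
    return problems
```

Because the check only reads and hashes bytes on disk, it needs no GPU, which matches the claim above.
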

If your reproduction disagrees with the published matrix by more than the eval tolerance, that's a bug report we want — email founder@sipsalabs.com for the harness. The point is that you never have to take our word for it.
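
Deciding whether a local run "disagrees" comes down to comparing two ratios against a tolerance. The sketch below shows that comparison; the tolerance value and the reproduced number are placeholders chosen for illustration, since this page does not state the eval tolerance.

```python
def within_tolerance(published_ratio, reproduced_ratio, tolerance):
    """True when the reproduced PPL ratio is within +/- tolerance of the published one."""
    return abs(reproduced_ratio - published_ratio) <= tolerance


published = 1.0125   # Llama-3.1-8B ratio from the table on this page
reproduced = 1.0127  # hypothetical local result, not a real measurement
tolerance = 0.0005   # placeholder; use the tolerance supplied with the harness

if within_tolerance(published, reproduced, tolerance):
    print("reproduction agrees with the published ratio")
else:
    print("outside tolerance: worth a bug report")
```
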

Full verified matrix & reproduction harness →

/ Architecture-specific floors

Not all models compress the same way.

Some architectures hit a floor that no in-substrate knob-tuning can break. We publish the floor as the floor — marked as such — rather than chasing a smaller number with a perturbation we can't justify.

The floor is the result.

For Llama-3.1-8B the floor sits at 1.0125×; for Mistral-7B-v0.3 the production substrate ships at 1.00548×. Both numbers are what the production runtime returns today: honest reports of where the current substrate stops, not where we wish it stopped. The packs behind both are SHA-256-verifiable on your own machine.

Marking a floor-bound model as floor-bound is the same discipline behind every other figure on this page.

Verify these numbers yourself →

/ Citations welcome

If our results were useful to your work.

Academic and industry citation is welcome. The BibTeX entry is below; contact founder@sipsalabs.com about collaboration, data access, or coordinating a preprint.

@software{sipsalabs_ultracompress_2026,
  author       = {{Sipsa Labs}},
  title        = {UltraCompress: Near-Lossless 5-bit Transformer Compression with SHA-256 Verifiable, Reproducible Reconstruction},
  year         = {2026},
  url          = {https://sipsalabs.com/ultracompress},
  organization = {Sipsa Labs},
  note         = {23 verified architectures (22 PPL-verified + 1 ViT cosine-verified); public verifier via {\tt pip install ultracompress}}
}

Headline records and the reproduction harness are at /inference. Source code, SHA-256 manifests, and the verification CLI are at github.com/sipsalabs/ultracompress. All public artifacts are mirrored at huggingface.co/SipsaLabs.