Sipsa Labs / Leaderboard
Near-lossless 5-bit transformer benchmark leaderboard

UC vs AWQ vs GPTQ vs EXL3. Side by side.

Every UltraCompress (UC) row is SHA-256-verified and independently reproducible via pip install ultracompress && uc verify. Competitor numbers are taken from published HuggingFace model cards and peer-reviewed papers. Rows are sorted by PPL ratio in ascending order; a lower ratio means the compressed model is more faithful to the original.

22 verified architectures · updated 2026-05-27 · methodology below

20/20: UC top positions
1.001x: best UC ratio (Phi-3.5-MoE)
~3.2x: average gap vs best competitor
SHA-256: every UC row verifiable
# | Model | Params | Method | BPW | PPL Ratio | Drift | Verified | Source
How to read PPL ratio. The PPL ratio is the compressed model's perplexity divided by the perplexity of the bf16 original. A ratio of 1.0000 means the two perplexities are identical; 1.005 means perplexity is 0.5% higher. Ratios below ~1.01 are generally described as "lossless" in the literature, while ratios above 1.05 typically indicate measurable quality loss. All 22 verified UltraCompress records fall below 1.013; the tightest is 1.00129x on Phi-3.5-MoE.
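As a rough illustration of the arithmetic behind the PPL ratio, the sketch below uses the standard definition of perplexity (the exponential of the mean per-token negative log-likelihood) and divides the compressed model's value by the baseline's. The function names and the numeric values are placeholders chosen for this example; they are not UltraCompress APIs or measured results, and the actual UC evaluation protocol may differ in details such as dataset and context length.

```python
import math

def perplexity(token_nlls):
    """Standard perplexity: exp of the mean negative log-likelihood (in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))

def ppl_ratio(ppl_compressed, ppl_baseline):
    """Compressed perplexity divided by baseline perplexity (1.0 = identical)."""
    return ppl_compressed / ppl_baseline

def degradation_pct(ratio):
    """Percentage increase in perplexity implied by a PPL ratio."""
    return (ratio - 1.0) * 100.0

# Placeholder perplexities, chosen only to reproduce the 1.005 example.
baseline_ppl = 6.000
compressed_ppl = 6.030
ratio = ppl_ratio(compressed_ppl, baseline_ppl)
print(f"ratio={ratio:.4f} degradation={degradation_pct(ratio):.2f}%")
```

Because the ratio is relative to each model's own baseline, it can be compared across architectures whose absolute perplexities differ widely.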

Methodology & sources.

UltraCompress numbers are pulled directly from the public auditor registry (docs/benchmarks.json). Every row is independently verifiable: pip install ultracompress && uc verify <hf_pack>.
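To give a sense of what a SHA-256 manifest check involves in general, here is a minimal sketch using only Python's standard library. It is not the implementation of uc verify, and the manifest layout (a mapping from relative file path to hex digest) is an assumption made for this example; the real pack format is not described on this page.

```python
import hashlib
from pathlib import Path

def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check_manifest(root, manifest):
    """Return the relative paths that are missing or whose digest differs.

    `manifest` is a hypothetical {relative_path: hex_sha256} mapping.
    An empty list means every listed file matched.
    """
    root = Path(root)
    mismatched = []
    for rel_path, expected in manifest.items():
        path = root / rel_path
        if not path.is_file() or sha256_file(path) != expected.lower():
            mismatched.append(rel_path)
    return mismatched
```

Calling check_manifest(pack_dir, manifest) on an unmodified download should return an empty list under these assumptions; any changed or missing file shows up by name.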

UC eval protocol:

Competitor sources:

AWQ, GPTQ, and EXL3 numbers are taken from published HuggingFace model cards (TheBloke, turboderp, Qwen team) and the original method papers. Where multiple published results exist for the same architecture at similar bit-widths, we use the best published number. All sources are linked per-row.
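The selection rule described above (keep the best published number per architecture and method, then rank by PPL ratio) could be sketched as follows. The Row type, the model names, and the ratios are hypothetical placeholders rather than entries from the leaderboard, and the sketch assumes the rows have already been filtered to similar bit-widths.

```python
from typing import NamedTuple

class Row(NamedTuple):
    model: str
    method: str
    ppl_ratio: float

def best_per_model_method(rows):
    """Keep the lowest PPL ratio per (model, method), sorted ascending."""
    best = {}
    for row in rows:
        key = (row.model, row.method)
        if key not in best or row.ppl_ratio < best[key].ppl_ratio:
            best[key] = row
    return sorted(best.values(), key=lambda r: r.ppl_ratio)

# Placeholder rows: two published AWQ results for the same model, one GPTQ.
published = [
    Row("model-a", "AWQ", 1.040),
    Row("model-a", "AWQ", 1.030),
    Row("model-a", "GPTQ", 1.050),
]
for row in best_per_model_method(published):
    print(row.model, row.method, row.ppl_ratio)
```

Taking the minimum per method gives each competitor its most favorable published result, which is the conservative choice when comparing against them.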

Fair comparison note. AWQ and GPTQ operate at 4-bit (variable group sizes); EXL3 uses mixed-rate trellis quantization. UltraCompress operates at 5 bits per weight with a fundamentally different contract (reproducible, cryptographically verifiable reconstruction). The comparison is valid on the metric that matters to production buyers: "how much quality do I lose at this memory footprint?" At 5 bpw, UC shows lower degradation than competitors at 4 bpw, at a modestly higher memory cost.
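The memory side of this trade-off is simple arithmetic, sketched below. The 7-billion-parameter count is an arbitrary example rather than a leaderboard entry, and the calculation uses nominal bits per weight only: it ignores per-group scales and zero points, embeddings kept at higher precision, and runtime buffers, so real footprints will differ somewhat.

```python
def weight_bytes(n_params, bits_per_weight):
    """Nominal storage for the weights alone, in bytes."""
    return n_params * bits_per_weight / 8

def relative_cost(bpw_a, bpw_b):
    """How many times more weight memory bpw_a needs than bpw_b."""
    return bpw_a / bpw_b

n_params = 7_000_000_000  # arbitrary example size
for label, bpw in (("bf16", 16), ("5 bpw", 5), ("4 bpw", 4)):
    gb = weight_bytes(n_params, bpw) / 1e9
    print(f"{label:>6}: {gb:.2f} GB")

# Nominal extra memory of 5 bpw relative to 4 bpw.
print(f"5 bpw vs 4 bpw: {relative_cost(5, 4):.2f}x")
```

On nominal figures, 5 bpw needs 1.25 times the weight memory of 4 bpw; group-wise metadata in 4-bit formats narrows that gap slightly in practice.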

Source links per competitor method:

Method | Primary source | Bit-width | Verification primitive
------ | -------------- | --------- | ----------------------
UC | docs/benchmarks.json | 5 bpw | SHA-256 manifest + uc verify
AWQ | Lin et al. 2023 (arXiv:2306.00978) | 4 bpw (g128) | None (no manifest, no verification primitive)
GPTQ | Frantar et al. 2022 (arXiv:2210.17323) | 4 bpw (g128) | None (no manifest, no verification primitive)
EXL3 | turboderp-org/exllamav3 | ~4-5 bpw (trellis) | None (no manifest, no verification primitive)

Run your own benchmark.

Every UC row is independently verifiable. Install, verify, benchmark. If your numbers disagree, that's a bug report we want.