Sipsa Labs / Products / UltraCompress / Verify
Independent quality verification

Verify our claims yourself.

We publish PPL ratios. We ship the tool to verify them. We document failures alongside wins.

No "trust me" — every published number is reproducible end-to-end on your hardware. Both contracts (shape + quality) ship today in v0.6.9; a productized third-party audit service is on the Q3 2026 roadmap.

/ self-serve · honest about gaps · everything traces to source data

SHAPE CONTRACT · uc verify, shipping today
QUALITY CONTRACT · uc bench-ppl, v0.6.9 LIVE
AUDIT SERVICE · productized Q3 2026
NEGATIVE RESULTS · 18+ documented refutations
/ section 01

The verification stack.

Three independent layers, each answering a different question. The first two ship in the public package; the third is on the roadmap as a paid service. If a claim doesn't survive all three, we don't ship it as a customer-facing number.

/ Layer 01 Live today

uc verify <pack>

Question it answers

"Did I get the same artifact the trainer measured?" Reads the SHA-256 manifest bundled inside the pack and confirms every layer reconstructs to a bit-identical match of the bytes the trainer wrote. If a single byte drifts, it fails loudly.

Ships in: pip install ultracompress
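What the shape contract checks can be sketched in a few lines. A minimal illustration, assuming a hypothetical manifest.json mapping relative paths to SHA-256 hex digests; the real pack layout and manifest format are internal to `uc verify` and may differ:

```python
import hashlib
import json
import pathlib

def verify_pack(pack_dir: str) -> bool:
    """Recompute SHA-256 digests and compare against the pack's manifest.

    Illustrative only: assumes a manifest.json of {relative_path: hex_digest};
    the actual uc verify manifest format is not documented here.
    """
    root = pathlib.Path(pack_dir)
    manifest = json.loads((root / "manifest.json").read_text())
    for rel_path, expected in manifest.items():
        digest = hashlib.sha256((root / rel_path).read_bytes()).hexdigest()
        if digest != expected:
            # a single drifted byte produces a completely different digest
            print(f"VERIFY: FAIL: {rel_path} does not match the trainer's bytes")
            return False
    print("VERIFY: PASS")
    return True
```

The "fails loudly" property falls out of the hash: any one-byte change flips the digest, so there is no partial credit.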
/ Layer 02 v0.6.9 LIVE

uc bench-ppl --against-bf16

Question it answers

"Does the published 1.0066× ratio actually reproduce on my hardware?" Pulls the bf16 baseline from HuggingFace, runs the eval on the FineWeb-edu held-out tail (seq_len=1024, seed=42) on your GPU, computes the PPL ratio locally, and prints whether it matches the published value.

Closes the verifier-quality gap surfaced during a recent cofounder evaluation.
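The ratio itself is plain arithmetic: perplexity is exp of the mean per-token negative log-likelihood, and the published number is compressed-model PPL over bf16 PPL on the same held-out tokens. A sketch with placeholder NLL sums (the eval loop that produces them, and `uc bench-ppl` internals, are omitted):

```python
import math

def ppl_from_nll(nll_sum: float, token_count: int) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(nll_sum / token_count)

def ppl_ratio(compressed_nll: float, baseline_nll: float, tokens: int) -> float:
    """Ratio > 1.0 means the compressed model is (slightly) worse than bf16."""
    return ppl_from_nll(compressed_nll, tokens) / ppl_from_nll(baseline_nll, tokens)

# illustrative NLL sums over the same held-out tokens, not measured values
ratio = ppl_ratio(compressed_nll=69_300.0, baseline_nll=69_000.0, tokens=10_000)
print(f"{ratio:.4f}x")  # exp(0.03) ≈ 1.0305
```

Because both models are scored on identical tokens, the token count cancels and the ratio reduces to exp of the mean NLL difference, which is why small drifts show up even at the fourth decimal place.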
/ Layer 03 Roadmap Q3 2026

Audit & Verification Service

Question it answers

"Can a third party reproduce these numbers under a contract?" Productized service tier where Sipsa runs an end-to-end re-eval against a customer-supplied baseline and signs the resulting receipt. Targets regulated buyers who need an external attestation, not just self-serve reproducibility.

Part of the Compression-as-a-Service catalog.
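The shape of such a receipt is easy to sketch. Below is a stand-in using HMAC-SHA256 from the standard library; a real audit service would presumably sign with an asymmetric key (e.g. Ed25519) so customers can verify without holding the secret. None of this is the actual service API:

```python
import hashlib
import hmac
import json

def sign_receipt(eval_result: dict, secret_key: bytes) -> dict:
    """Attach an HMAC-SHA256 tag to a re-eval result.

    Hypothetical sketch: HMAC stands in for a real signature scheme.
    """
    payload = json.dumps(eval_result, sort_keys=True).encode()
    tag = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return {"payload": eval_result, "signature": tag}

def verify_receipt(receipt: dict, secret_key: bytes) -> bool:
    """Recompute the tag over the payload; any tampering breaks the match."""
    payload = json.dumps(receipt["payload"], sort_keys=True).encode()
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])
```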
Why two contracts, not one?

"Lossless" is a load-bearing word. uc verify proves the customer's bytes equal the trainer's bytes. uc bench-ppl proves those bytes behave the same way on a held-out evaluation. Either one alone is incomplete. v0.6.7 shipped the first contract; v0.6.9 ships the second. We're saying that out loud here because the gap was real and we'd rather close it in public than gloss over it.

Reproduce a published row in 4 commands

Today (shape contract):

# 1. install
pip install ultracompress

# 2. pull a published artifact
hf download SipsaLabs/qwen3-1.7b-base-uc-v3-bpw5 --local-dir ./pack

# 3. verify SHA-256 reconstruction is bit-identical
uc verify ./pack
# → VERIFY: PASS — bit-identical reconstruction guaranteed.

Now live (quality contract, v0.6.9):

# 4. reproduce the published PPL ratio on your hardware
uc bench-ppl ./pack --against-bf16
# → published: 1.00401×  measured: 1.0041×  match: PASS

Every published number in BENCHMARKS_2026_05_10.json traces to a source eval JSON in scripts/overlay/artifacts/ in the public repo. We don't ship round numbers without the underlying receipts.
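A diligence reader could script that traceability check. A sketch assuming each published row carries a `source` filename and a `ppl_ratio` field; the actual schema of BENCHMARKS_2026_05_10.json and the eval JSONs may differ:

```python
import json
import pathlib

def trace_benchmarks(benchmarks_path: str, artifacts_dir: str) -> list[str]:
    """Report published rows whose source eval JSON is missing or disagrees.

    Hypothetical schema: a list of rows, each with 'source' (receipt
    filename) and 'ppl_ratio' (the published, unrounded number).
    """
    rows = json.loads(pathlib.Path(benchmarks_path).read_text())
    problems = []
    for row in rows:
        receipt = pathlib.Path(artifacts_dir) / row["source"]
        if not receipt.exists():
            problems.append(f"{row['source']}: missing receipt")
            continue
        measured = json.loads(receipt.read_text())["ppl_ratio"]
        if measured != row["ppl_ratio"]:
            # exact comparison on purpose: published numbers are not rounded
            problems.append(f"{row['source']}: {measured} != {row['ppl_ratio']}")
    return problems  # empty list means every row traces cleanly
```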

/ section 02

The honest negative side.

Most projects publish the wins and quietly archive the failures. We catalogue both at the same level of detail, and we read the ratio as a feature: a research lab that hides nothing should expect its refutation count to exceed its publication count.

/ Honest negative results · 2026-05-08 → 2026-05-14

18+ documented refutations from a single research arc.

From "AWQ-style channel pre-scaling regressed 13× the uniform baseline" to "the universal cure for the dense PPL floor doesn't exist at our current configuration," every closed hypothesis is published with the catalysing experiment, the measured number, and the reason we stopped pursuing it. Researchers comparing 5-bit codecs should treat the document as the audit trail.

Read the full catalogue →

The README's "What doesn't work yet" section is the same idea, applied to capabilities a reader might assume work because the rest of the package does. Both surfaces stay fresh.

/ section 03

What customers should ask us.

If you're evaluating a 5-bit lossless codec for production, these are the right questions to bring to the call. Our answers are the same in writing as they are on a sales call.

/ section 04 · for investors

For VCs specifically.

We know the verifier-quality gap is the #1 trust objection in technical diligence. Here's the timeline to close it — written in the present and near-future tense, not deferred to some aspirational quarter down the road.

/ Verification roadmap

Three contracts, one credibility surface.

The objection is real and we're closing it on a public schedule. Each step below is either shipped or has a hard near-term ship date attached, not a vague "later."

  • v0.6.7 · Shape contract. uc verify ships today. SHA-256 manifest match between the customer's artifact and the trainer's measured artifact.
  • v0.6.9 · Quality contract. uc bench-ppl --against-bf16 is live today. Reproduces the published PPL ratio on the customer's hardware against a HuggingFace-pulled bf16 baseline.
  • Q3 2026 · Productized audit service. Third-party-runnable verification under a paid contract, with a signed receipt. Roadmap item in the Compression-as-a-Service catalog.

The pattern is the same on every other surface: ship the verifier alongside the claim, document what didn't make it, refuse to round the numbers. The verification stack is the product around the codec, not a footnote underneath it.

Want to verify a specific row?

Solo founder, US business hours, 4–8h response time. Reproduce a published benchmark, request a Phase 0 POC for an architecture not yet in the matrix, or ask anything diligence-grade about the verification stack.

/ press: press@sipsalabs.com · security: security@sipsalabs.com · HuggingFace org: huggingface.co/SipsaLabs