Bit-identical reproducibility. SHA-256 verifiable. SOC 2 / SR-11-7 / FDA / DoD-ready architecture.
Deploy compressed Hermes-3-405B, Mixtral-8x22B, Llama-3.1-70B, and Qwen3-235B-A22B on your hardware — under your security boundary, under your audit log.
Pick the lane that matches your security posture and procurement model. Every path delivers the same compressed substrate — same SHA-256 manifest, same OpenAI-compatible endpoint, same verifier.
Path 1
On-Prem MSA
Install the compressed substrate inside your firewall. Runs in your VPC, your bare-metal cluster, or your private cloud. SHA-256 attestation per artifact. BUSL-1.1 source license plus a commercial deployment grant for your workload, vendor, and subsidiary scope.
For DoD, classified, and regulated bond environments. Offline reconstruction from signed manifests — the runtime never reaches the public internet. FedRAMP-ready architecture. Delivered via signed media or via your authorized cross-domain solution.
For: defense primes, intelligence community, ITAR-controlled programs, classified research labs.
Bring your fine-tuned model. We compress it to your spec and return a customer-side reproducible artifact — bit-identical, SHA-256-verifiable, served by the same runtime as your stock models. You keep the artifact. Lawyer-reviewed contract for engagements above $50K.
For: in-house fine-tunes, internal foundation models, customer-specific architectures under MNDA.
We're explicit about what's live, what's a target, and what the architecture supports today. No carrying-on, no bullet-point inflation. Compliance certifications below describe Sipsa Labs' attestation status — the deployment architecture is designed to fit inside your controls regardless.
Control
Status
What this means for you
SHA-256 bit-identical reconstruction
Live today
Every artifact ships with a signed manifest. Your security team runs uc verify and confirms byte-for-byte equivalence to the source weights independently.
SR-11-7 model risk management
Architecture supports today
Bit-identical reconstruction means the model under risk-management review is the same model in production — no separate "compressed-variant" governance lane required. Self-attestation provided.
FDA SaMD architecture fit
Architecture supports today
For Software-as-a-Medical-Device pipelines, the deterministic reconstruction contract preserves model identity through the validation pipeline. Phase II clinical-trial proposals require a separate IRB engagement.
SOC 2 Type 1
Q3 2026 target
Audit firm engaged. Type 1 expected in Q3 2026; Type 2 follows once the observation window completes.
HIPAA-managed BAA
Q3 2026 target
Required for the managed inference path. Customers running the on-prem deploy keep PHI inside their own boundary today — no PHI ever reaches Sipsa Labs infrastructure.
FedRAMP
2027 target
Air-gapped deploy is the bridge in the meantime — the runtime fits inside an existing FedRAMP boundary your IL4/IL5 environment already operates.
Compliance status is reviewed quarterly. We will not claim a certification we don't hold — if your procurement requires a specific control on a specific date, email founder@sipsalabs.com for the current attestation letter and roadmap.
/ What you get
Same SDK. Same code path. Half the GPUs.
Enterprise engagements include the same substrate that ships on PyPI — plus the contractual envelope, deployment support, and architecture coverage your platform team needs.
Same OpenAI SDK, same code path — swap OPENAI_BASE_URL and your existing inference code keeps working unchanged.
Customer-side reconstruction — the codec recipe never leaves Sipsa Labs; weights reconstruct on your hardware. No recipe leakage to your security review.
22 architectures shipped — 14 fully PPL-verified end-to-end, 6 in active eval, 2 in compression. Full verified matrix →
Hermes-3-Llama-3.1-405B at 1.0066× perplexity ratio on a single 32 GB consumer GPU class. Mixtral-8x7B at 1.00368×. Mistral-7B at 1.00548×.
Direct line to founder — one email, 24-hour reply on enterprise inquiries: founder@sipsalabs.com.
Public verifier — pip install ultracompress && uc verify reproduces the SHA-256 contract on your hardware. Your security team audits the codec independently of Sipsa Labs.
Custom architecture support — mainstream public architectures integrated within 2 weeks of release. Internal architectures under MNDA.
Reference architecture — sample deployment topology, capacity planning sheet, and security review packet on request.
/ The pitch in 3 bullets
For your CFO and your CTO.
Three things make this different from every other "model compression" pitch you've seen. Each is verifiable on your hardware before you sign anything.
5×
Smaller weights at the same task quality
5× fewer GPUs for the same throughput, or the same GPUs serve 5× the load. Quality is verified per architecture — published PPL ratio between 1.0026× and 1.0200× against the bf16 baseline.
=
Bit-identical reconstruction
No need to re-validate the model in your audit pipeline — it's the same model, byte-for-byte, after dequantization. SR-11-7, FDA SaMD, and internal model-governance reviews carry through without a parallel "compressed-variant" lane.
✓
Public verifier
Your security team audits the codec independently with pip install ultracompress && uc verify. The verification path is open and reproducible — you don't have to take our word for any claim on this page.
/ Get started
Talk to the founder.
We don't have a CRM funnel yet — one email, the founder reads it. Include your use case, expected scale (GPU count and token throughput), and your security boundary (on-prem, VPC, air-gapped). 24-hour reply on enterprise inquiries.
Reference customer needed for procurement? We will line up a direct conversation between your team and a current design partner under mutual NDA — mention “reference request” in the subject line.