Cofounder, CTO-class.
Sipsa Labs is hiring its first cofounder. Solo founder seeking the second person. Equity partner. Below-market cash until first funding round closes. Decision in 14 days.
What Sipsa Labs is
Sipsa Labs, Inc. is an experimental deep-tech and software company: a multi-product research lab in the Bell Labs / Edison spirit. Delaware C-corp, incorporated May 2026. We invent and ship across the full breadth of tech and software: deep research, runtime systems, novel substrates, infrastructure, hardware-adjacent stacks, and software products that don't fit anywhere else yet.
UltraCompress (lossless 5-bit transformer compression) is our first publicly shipped flagship product. Two product surfaces are live today:
- Open-source library: `pip install ultracompress` (v0.6.2 on PyPI). 22 architectures verified end-to-end, 0.6B → 405B parameters. SHA-256-verifiable bit-identical reconstruction. BUSL-1.1 with an Additional Use Grant: free for sub-$1M-ARR companies, research, and individuals.
- OpenAI-compatible inference API: `api.sipsalabs.com/v1` is a drop-in `OPENAI_BASE_URL` swap; the official `openai` SDK works unchanged. Backed by dual RTX 5090s over Cloudflare Tunnel.
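The shipped container format is not public, so as a purely illustrative sketch (nothing here is UltraCompress's actual layout), here is what "SHA-256-verifiable bit-identical reconstruction" means for 5-bit codes: pack, unpack, and compare digests.

```python
import hashlib

def pack_5bit(values: list[int]) -> bytes:
    """Pack a sequence of 5-bit codes (0..31) into a byte stream."""
    bits, nbits, out = 0, 0, bytearray()
    for v in values:
        assert 0 <= v < 32
        bits = (bits << 5) | v
        nbits += 5
        while nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
    if nbits:  # pad the final partial byte with zero bits
        out.append((bits << (8 - nbits)) & 0xFF)
    return bytes(out)

def unpack_5bit(data: bytes, count: int) -> list[int]:
    """Recover exactly `count` 5-bit codes from the packed stream."""
    bits, nbits, out = 0, 0, []
    for byte in data:
        bits = (bits << 8) | byte
        nbits += 8
        while nbits >= 5 and len(out) < count:
            nbits -= 5
            out.append((bits >> nbits) & 0x1F)
    return out

codes = [i % 32 for i in range(1000)]
packed = pack_5bit(codes)          # 5000 bits -> 625 bytes
restored = unpack_5bit(packed, len(codes))
# Bit-identical round trip: digests of original and restored codes match.
assert hashlib.sha256(bytes(codes)).digest() == hashlib.sha256(bytes(restored)).digest()
```

The real library hashes reconstructed weight tensors rather than a toy code list, but the verification contract is the same: identical bytes in, identical digest out.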
This week's verified PPL ratio records (5 bpw vs bf16)
| Model | PPL ratio | Note |
|---|---|---|
| Mixtral-8x7B (47B MoE) | 1.00368× | tightest MoE result |
| Qwen3-14B | 1.00403× | new this week |
| Qwen3-8B | 1.00440× | production-tier |
| Mistral-7B-v0.3 | 1.00548× | 9.16× tighter than prior public best |
| Phi-3-mini-4k-instruct | 1.00624× | cross-arch confirm |
| Hermes-3-Llama-3.1-405B | 1.0066× | first 405B-class lossless 5-bit on a single 32 GB consumer GPU |
Two USPTO provisional patents filed April 25, 2026. Supplement filed May 9, 2026. Continuations through 2027. YC S26 application: In Review (decision June 5, 2026).
What you'd do as cofounder
You own the systems half of the company. Specifically:
- Production CUDA kernel work for the streaming compression/decompression hot path. The reference implementation today is Python plus `huggingface_hub`. The production deployment path is fused dequant + matmul primitives that need to ship as compiled kernels integrated with vLLM, TensorRT-LLM, and llama.cpp.
- Inference engine integration: vLLM PagedAttention plus our streaming weight format, TensorRT-LLM kernel registration, llama.cpp GGUF format extension. PTX-level work, memory-access-pattern debugging, and benchmarking against AWQ-in-vLLM at the warp-divergence level.
- Cross-platform binary distribution. Windows / Linux / macOS. ARM Mac (Apple Silicon Metal). NVIDIA + AMD + Intel GPU coverage where it matters.
- Customer-facing solutions engineering for the IaaS segment. Together AI / Fireworks / Lambda Labs / CoreWeave engagements: sit with their inference team, instrument their stack, prove the per-token cost reduction, hand off to deployment.
- Hiring the next 2-3 systems engineers. The first $4-6M funds three systems engineers reporting to you. You define the role, source the candidates, run the interviews.
- Co-driving research direction across the broader Sipsa product roadmap. UltraCompress is the first product, not the whole company.
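To make the fused dequant + matmul work concrete: a NumPy sketch of the reference semantics the CUDA kernel has to reproduce. The per-row scale/zero layout here is an assumption for illustration (real formats typically use group-wise parameters), not the shipped format.

```python
import numpy as np

def dequant_matmul_reference(codes, scales, zeros, x):
    """Reference semantics (not the shipped kernel) for a fused
    5-bit dequant + matmul: w = (codes - zero) * scale per output
    row, then y = x @ w.T. A production CUDA kernel fuses the
    dequant into the matmul inner loop so full-precision weights
    never round-trip through HBM."""
    w = (codes.astype(np.float32) - zeros[:, None]) * scales[:, None]
    return x.astype(np.float32) @ w.T

rng = np.random.default_rng(0)
codes = rng.integers(0, 32, size=(8, 16))            # [out_features, in_features] 5-bit codes
scales = rng.random(8).astype(np.float32) + 0.1       # per-row scale (assumed grouping)
zeros = rng.integers(0, 32, size=8).astype(np.float32)
x = rng.standard_normal((4, 16)).astype(np.float32)
y = dequant_matmul_reference(codes, scales, zeros, x)
assert y.shape == (4, 8)
```

The kernel-level question is exactly the one above at warp granularity: which thread unpacks which 5-bit lane, and how the dequantized tile feeds the tensor-core MMA without a staging pass through global memory.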
What you look like
Required
- CUDA kernel experience in production. Not "I called PyTorch ops once." We mean: you've shipped at least one CUDA kernel to a production codebase that real users depend on. Examples: llama.cpp contributor with merged kernels, vLLM team alum, TensorRT-LLM eng, FlashAttention contributor, similar.
- At least one open-source project with real users. GitHub stars in the hundreds-to-thousands range. PyPI downloads in the thousands-per-month range. Or comparable evidence that you ship things people use.
- Cross-platform binary distribution experience. You've shipped wheels with `.so`, `.dll`, and `.dylib` files in the same release. You know the Windows-specific gotchas (cp1252 console encoding, MSVC vs MinGW, path separators) by heart.
- Comfort with pre-funding ambiguity. This is not a Series B company with a 50-person org chart. It is two people for the next 6-12 months. The role expands as we hire. If you need org structure to function, this is not the right role.
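One of those Windows gotchas in miniature. This helper is illustrative, not from our codebase: it renders text for a console whose code page can't represent it, instead of letting `print()` die with `UnicodeEncodeError` on a cp1252 terminal.

```python
def console_safe(text: str, encoding: str) -> str:
    """Render text for a console whose code page may not cover it.
    Characters the encoding can't represent become backslash escapes
    instead of raising UnicodeEncodeError (the classic cp1252 crash
    on a default Windows console)."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

# On a cp1252 console, the arrow and delta would crash a plain print():
print(console_safe("bf16 → 5-bit, Δppl 0.4%", "cp1252"))
# prints: bf16 \u2192 5-bit, \u0394ppl 0.4%
```

If a candidate knows why `errors="backslashreplace"` beats `errors="replace"` in build logs (the escape preserves the original code point for debugging), that's the level of platform fluency we mean.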
Preferred but not required
- Inference engine team experience (vLLM, TensorRT-LLM, llama.cpp, MLC-LLM, ExLlamaV3).
- Quantization-specific knowledge (AWQ, GPTQ, HQQ, QTIP, EXL3 internals).
- KV-cache compression, LLMLingua-style prompt compression, or related runtime-efficiency research.
- Edge / embedded deployment experience (Jetson, ARM Mali, Apple Silicon, mobile).
Compensation — honest framing
Equity: Meaningful single-founder-level equity. Sip retains majority and primary control until at least Series B. You get a partner-level stake that vests over 4 years with a 1-year cliff. Specific number negotiated at offer; calibrated to your background and market comp.
Cash: Below-market until the first funding round closes. Sip's burn rate to date is roughly $5K total over 6 months; we operate in capital-discipline mode until the YC decision on June 5 or first revenue, whichever comes first. Between offer and round close, expect deferred salary that converts to cash post-round, plus the minimum needed for living expenses if your runway is tight.
Post-round (Series A close): Market cofounder cash compensation for a research-led infrastructure company. The investor pitch memo targets $4-6M at $20-30M post-money. Cofounder-level salary post-close is in the $200K-280K range plus equity refresh.
Honest: if you need market cash from day one, this is not the right role. If you can sustain 3-6 months of below-market cash in pursuit of a $1B+ outcome, this is the right role.
How the conversation goes
- First call (45 min): Sip explains the company in detail, walks through the LAB-NOTEBOOK, shows live API calls against `api.sipsalabs.com`, and walks through the patent stack at the public-claim level. You ask hard questions.
- Second call (90 min): We share a screen and code together for an hour on a real Sipsa research question (e.g., "what's the minimum-viable CUDA kernel for the fused dequant + matmul primitive on Hopper?"). We see how we work together under technical pressure.
- Reference checks: Both directions. You talk to anyone Sip has worked with. Sip talks to anyone you've worked with.
- Offer (or polite decline) within 14 days of first call. No drawn-out interview process. Either we're aligned or we're not.
To apply
Send to founder@sipsalabs.com:
- Subject line: "Cofounder — [your full name]"
- GitHub profile (so Sip can read your code).
- One PR or merged commit you're proud of, with 2-3 sentences explaining what it does and why you made the design choices you did.
- Your honest framing on cash runway (so we can match expectations on deferred-salary-vs-immediate-cash structure pre-funding).
- One question you have for Sip — anything from "what's the deepest technical risk you see in the next 12 months" to "what would a typical week look like in month 3."
We'll respond within 72 hours, every time. If we're not the right fit, we'll say so directly and quickly. If we are, we'll get on the first video call within 7 days.
Apply via founder@sipsalabs.com →

Sipsa Labs, Inc. is a research-led infrastructure company. We're building the compression substrate for the next decade of AI compute, and that's only the first product. The first cofounder is the most important hire of the company's existence. We are taking that seriously.
— Missipssa Ounnar, Founder
founder@sipsalabs.com · sipsalabs.com · github.com/sipsalabs/ultracompress · api.sipsalabs.com