Hello World! The Blitzy C Compiler Has Arrived
Mar 26, 2026 • Sid Pardeshi • 10 min read

One Prompt. 229,983 LoC. 624 Hours of Autonomous Engineering Parallelized. 2,271 Tests. Here's How.
Last month, Anthropic achieved a milestone in agentic coding: 16 Claude agents, guided by a senior researcher over two weeks, built a Rust-based C compiler from scratch. The researcher logged over 2,000 interactive turns writing test cases, resolving merge conflicts, and devising debugging strategies when compilation stalled. Anthropic demonstrated what AI agents can accomplish with a skilled human at the helm.
Although a pioneering project, the output had limitations. The agents that architected Claude's C Compiler (CCC) could not handle a system of this scale, validate their own output, or self-correct when the design went sideways.
Our recent blog post details how Blitzy autonomously fixed every issue, backfilled missing C11 features, and shipped a six-tier optimization pipeline, turning CCC into a production-ready compiler. Blitzy's refinements showcased the platform's technical maturity and ability to understand and enhance existing codebases.
We are known for understanding and working with large codebases. What happens when Blitzy doesn't inherit someone else's architecture, training data artifacts, and untested claims but rather starts from an empty repository?
The answer is BCC: Blitzy's C Compiler.
2,271 passing tests. 229,983 lines of Rust. Four architecture backends. Zero external dependencies. SQLite, Redis, Lua, and zlib compile and run.
The Linux kernel boots on QEMU.
How Blitzy Works
The distinction between Blitzy's and Anthropic's output comes from a difference in architecture. Our platform does not place the burden on engineers to coordinate a swarm of agents.
Blitzy is an autonomous software development platform built to understand existing enterprise-scale applications and turn natural language prompts into 100,000+ lines of fully validated, end-to-end tested code.
The process starts when Blitzy translates your prompt into what we call an Agent Action Plan (AAP): a structured set of instructions that serves as the single source of truth for the entire build. From there, Blitzy's orchestration layer takes over, dynamically planning, spawning and coordinating thousands of specialized agents. Blitzy parallelizes execution of the AAP into workstreams covering architecture, code generation, implementation, testing, debugging, validation and integration.
For the BCC compiler, Blitzy coordinated over 3,600 agents across 127 files, managing dependencies between components, resolving conflicts between parallel code generation streams, and validating output at every stage.
How the Two Approaches Compare
Before diving into BCC, observe the differences in how these two projects defined and achieved "complete":
| | Claude's Rust-based C Compiler (CCC) | Blitzy's Rust-based C Compiler (BCC) |
|---|---|---|
| Number of Agents | 16 | 3,600 |
| Development Time | 2 weeks | 4.5 days to execute a 624-engineering-hour Agent Action Plan |
| Human Turns (instances where agent surfaced for human iteration) | ~2,000 | 2 (user submits prompt for Agent Action Plan generation and refined PR for Linux validation) |
| Agent to Agent Turns | 0 | 23,390 |
| Testing Suite | Researcher wrote tests and debugged compiler failures | Blitzy generated its own test suite and fixed all bugs |
Same challenge. Fundamentally different approaches.
What We Built
See our Pull Request and our Project Guide that recaps the work. Here are the numbers:
| Metric | Result |
|---|---|
| Total engineering hours | 624 (88% of project scope) |
| Lines of Rust code | 229,983 |
| Source files | 129 .rs files + 14 SIMD headers |
| External Rust crate dependencies | Zero |
| Tests passing | 2,271 (2,113 unit + 158 integration) |
| Test failures | Zero |
| Clippy warnings | Zero |
| Formatting diff | Zero |
| Target architectures | x86-64, i686, AArch64, RISC-V 64 |
| Optimization passes | 15 |
| GCC torture test pass rate | 98.8% (measured) |
BCC is a complete, self-contained C11 compiler. Every component was generated from scratch with zero external dependencies:
Frontend: full C11 preprocessor with paint-marker macro recursion protection, lexer with PUA-aware encoding for non-UTF-8 byte round-tripping, and recursive-descent parser covering all C11 syntax plus GCC extensions including statement expressions, typeof, computed gotos, case ranges, and inline assembly with AT&T syntax.
SSA-form IR: alloca-then-promote architecture matching LLVM's canonical SSA construction, with dominance tree computation via Lengauer-Tarjan, phi-node insertion, mem2reg promotion, and a clean lowering pipeline from AST to machine code.
15 optimization passes: constant folding, dead code elimination, CFG simplification, copy propagation, common subexpression elimination, global value numbering, loop-invariant code motion, sparse conditional constant propagation, aggressive dead code elimination, strength reduction, instruction combining, register coalescing, tail call optimization, peephole optimization, and a pass manager.
Four native code generators: x86-64 (System V AMD64 ABI), i686 (cdecl), AArch64 (AAPCS64), and RISC-V 64 (LP64D) with architecture-specific instruction selection and linear scan register allocation.
Integrated assembler and linker: encodes instructions directly to machine code and links ELF32/ELF64 binaries with full relocation, dynamic linking via PLT/GOT, shared library support, and CRT0/_start injection. No external toolchain invocations. Anthropic's blog post admitted their assembler and linker were "still somewhat buggy" and the demo used GCC's. BCC's standalone toolchain has worked since day one and has been validated through 2,271 tests and five runtime-verified projects.
DWARF v4 debug information: .debug_info, .debug_abbrev, .debug_line, and .debug_str sections verified by readelf across 21 dedicated checkpoint tests. An independent review of CCC reported "missing DWARF data, broken frame pointers, no function symbols."
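To make the optimization-pass item above more concrete, here is a minimal sketch of two passes driven by a small pass manager. It is an illustration under stated assumptions, not BCC's implementation: the `Expr` type, the `fold` and `simplify` functions, and `run_to_fixed_point` are all hypothetical names invented for this example, and the expression tree stands in for a real SSA-form IR.

```rust
/// A tiny expression IR invented for this sketch (not BCC's actual IR).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Const(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Constant folding: evaluate operations whose operands are both known
/// at compile time, working bottom-up. Wrapping arithmetic avoids panics
/// on overflow inside the compiler itself.
fn fold(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

/// Algebraic simplification, a much-reduced stand-in for instruction
/// combining: x + 0 => x and x * 1 => x.
fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (simplify(*l), simplify(*r)) {
            (Expr::Const(0), x) | (x, Expr::Const(0)) => x,
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (simplify(*l), simplify(*r)) {
            (Expr::Const(1), x) | (x, Expr::Const(1)) => x,
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

/// A minimal pass manager: run every pass in order and repeat until
/// a full round leaves the expression unchanged.
fn run_to_fixed_point(passes: &[fn(Expr) -> Expr], mut e: Expr) -> Expr {
    loop {
        let before = e.clone();
        for pass in passes {
            e = pass(e);
        }
        if e == before {
            return e;
        }
    }
}
```

Running both passes over `(2 + 3) * x + 0` first folds `2 + 3` into `5` and then drops the `+ 0`, leaving `5 * x`. A production pipeline would operate on SSA values and basic blocks rather than a tree, but the fixed-point loop captures the general idea of passes enabling one another.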
Architecture Targets
Blitzy targeted the same four architectures as CCC, but the engineering philosophies diverge beneath the surface. Anthropic's compiler ships with optional GCC fallbacks for assembling and linking, silently ignores unrecognized flags, and reports itself as GCC 14.2.0 for compatibility. These are pragmatic choices, but they are also concessions that acknowledge incomplete coverage.
BCC makes no such concessions: it is a single-binary, fully self-contained toolchain. Its assembler encodes every instruction, and its linker produces every ELF binary.
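As a rough illustration of what "encoding instructions directly" involves, the sketch below turns two x86-64 instructions into raw bytes. The `Insn` enum and `encode` function are hypothetical names for this example and do not reflect BCC's internal design; the opcodes themselves (`B8+rd` for `mov r32, imm32` and `C3` for a near `ret`) come from the x86 instruction set.

```rust
/// A hypothetical two-instruction subset of x86-64, for illustration only.
enum Insn {
    /// mov r32, imm32, where reg is 0..=7 (eax, ecx, edx, ebx, esp, ebp, esi, edi).
    MovImm32 { reg: u8, imm: u32 },
    /// Near return.
    Ret,
}

/// Encode a sequence of instructions into machine code bytes.
fn encode(insns: &[Insn]) -> Vec<u8> {
    let mut out = Vec::new();
    for insn in insns {
        match insn {
            Insn::MovImm32 { reg, imm } => {
                // r8d..r15d would need a REX prefix, which this sketch omits.
                assert!(*reg < 8, "register out of range for this sketch");
                out.push(0xB8 + *reg); // opcode B8+rd
                out.extend_from_slice(&imm.to_le_bytes()); // little-endian immediate
            }
            Insn::Ret => out.push(0xC3),
        }
    }
    out
}
```

Encoding `mov eax, 42` followed by `ret` yields the six bytes `B8 2A 00 00 00 C3`. A real assembler also has to handle prefixes, ModRM/SIB bytes, and relocations, which is where most of the complexity lives.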
Blitzy agents generated 229,983 lines of production Rust code using nothing but Rust's built-ins: no serialization libraries, argument parsers, or regex crates.
Security and Hardening
Production compilers need to emit correct and safe code. BCC implements three x86-64 security mitigations that CCC does not offer:
Retpoline thunks for Spectre v2 mitigation: function pointer calls route through __x86_indirect_thunk_* rather than targeting the pointer directly, preventing branch target injection attacks.
Intel CET/IBT with endbr64 emission: forward-edge control-flow integrity that ensures indirect branches land only at legitimate targets.
Stack guard page probing: for frames exceeding 4,096 bytes, BCC probes each page before adjusting the stack pointer, preventing stack clash attacks.
The safeguards are verified by 16 dedicated checkpoint tests that inspect the actual machine code output. CCC's README, blog post, and design documents make no mention of any security mitigation features.
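Two of the mitigations above reduce to simple rules that can be sketched in a few lines. The code below is a hedged illustration, not BCC's code generator: `probe_offsets` and `entry_bytes` are hypothetical helpers, and the choice to probe once per full page (leaving any sub-page tail unprobed) is an assumption made for this example. The `endbr64` byte sequence is the standard encoding.

```rust
/// Page size assumed by the probing rule described above.
const PAGE_SIZE: u64 = 4096;

/// Machine encoding of the `endbr64` instruction.
const ENDBR64: [u8; 4] = [0xF3, 0x0F, 0x1E, 0xFA];

/// Offsets, in bytes below the incoming stack pointer, that a prologue
/// would touch before performing the full stack pointer adjustment.
/// Frames of at most one page need no probes. Assumption: one probe per
/// full page, with any remaining sub-page tail left unprobed.
fn probe_offsets(frame_size: u64) -> Vec<u64> {
    if frame_size <= PAGE_SIZE {
        return Vec::new();
    }
    (1..=frame_size / PAGE_SIZE)
        .map(|page| page * PAGE_SIZE)
        .collect()
}

/// Bytes emitted at a function entry that may be reached by an indirect
/// branch: `endbr64` when IBT is enabled, nothing otherwise.
fn entry_bytes(ibt_enabled: bool) -> Vec<u8> {
    if ibt_enabled {
        ENDBR64.to_vec()
    } else {
        Vec::new()
    }
}
```

For a 10,000-byte frame, this sketch would probe at offsets 4,096 and 8,192, so a guard page cannot be skipped over by a single large adjustment.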
Correctness: Where It Counts
After Anthropic released CCC, two independent analyses exposed systematic correctness failures. A chibicc fork maintainer filed 20 specific bug patterns with Godbolt reproducers. John Regehr's fuzzing campaign found 11 additional miscompilation classes. Counting the 18 chibicc patterns applicable to BCC alongside the 11 Regehr classes, all 29 of these bugs remain unfixed in CCC's upstream repository.
BCC was designed from the start to avoid these failure modes. All 18 applicable chibicc-pattern bugs were systematically addressed and fixed with regression tests. All 11 Regehr fuzzing bug classes were verified correct. Beyond inherited errors, BCC's own Csmith and YARPGen fuzzing campaign discovered and fixed 4 additional issues that only surfaced under randomized testing.
The distinction matters. CCC was optimized to pass its own test suite. BCC was validated against external, adversarial test harnesses designed by domain experts to find exactly the kind of problems that test-suite-driven development misses.
Stress Testing Against Real-World Code
The real test of any C compiler is not Hello World (which BCC does compile) but whether it can handle the production systems developers actually ship. BCC was validated against major open source projects with full compile, link, and runtime verification:
| Project | Compile | Link | Runtime | Notes |
|---|---|---|---|---|
| SQLite 3.45.0 | ✓ | ✓ | ✓ | .selftest passes, CRUD works. 2 bugs fixed. |
| Redis 7.2.4 | ✓ | ✓ | ✓ | 93/93 files. SET/GET/INCR/LPUSH/HSET. 4 bugs fixed. |
| Lua 5.4.7 | ✓ | ✓ | ✓ | 33/33 files. Coroutines, pcall, math. 3 bugs fixed. |
| QuickJS 2024 | ✓ | ✓ | ✓ | 26/27 tests pass. 14 bugs fixed. |
| zlib 1.3.1 | ✓ | ✓ | ✓ | 15/15 files. Round-trip compress/decompress. |
| Linux Kernel 6.9 | ✓ | ✓ | ✓ | 456/476 files (95.8%). Boots to USERSPACE_OK. |
| PostgreSQL 16.2 | ✓ | | | 342+ .o files, zero compile errors. |
| DOOM | ✓ | | | 81/85 core files. 0 actual errors, 4 timeouts. |
| FFmpeg | ✓ | | | 33/37 core lib files. |
| coreutils 9.4 | ✓ | | | 111/129 files. echo, cat, sort, head, tail. |
The Linux kernel result deserves emphasis.
BCC compiled 456 out of 476 kernel source files (95.8%) in a hybrid build, produced a linked vmlinux with BCC-compiled code, and booted it on QEMU RISC-V 64 to userspace. The BCC-compiled ctype.o module ran inside the kernel, and the verified boot sequence ended in USERSPACE_OK. CCC's README claims kernel compilation, but Anthropic's blog post does not document a successful boot.
Real-world compilation is unforgiving.
When Blitzy compiled SQLite, it surfaced a stack alignment issue and a static initializer address bug. Redis yielded four distinct defects. Lua turned up three. QuickJS, fourteen. In Anthropic's workflow, these are exactly the problems a human researcher would eventually diagnose (hours or days later) and then build workarounds for.
The Blitzy agents found, understood, and resolved every one autonomously, each paired with a dedicated regression test, long before any human reviewer had a chance to look.
How BCC Compares to CCC
We built BCC because we believed our platform could produce a more production-ready compiler than CCC. Here is where the two stand across 22 measurable dimensions:
| Category | BCC | CCC | Edge |
|---|---|---|---|
| Hello World out-of-box | Works immediately | Fails (first GitHub issue) | BCC |
| Standalone assembler | All 4 arch, tested | Blog: "still somewhat buggy" | BCC |
| Standalone linker | All 4 arch, tested | README claims; blog contradicts | BCC |
| CRT0/startup injection | Implemented | Not documented | BCC |
| Multi-object linking | Tested and working | Not independently verified | BCC |
| chibicc bug patterns | 0 of 18 remaining | 18 of 18 still present | BCC |
| Regehr fuzzing bugs | 0 of 11 remaining | 11 of 11 unfixed upstream | BCC |
| Security: retpoline | Implemented, tested | Not documented | BCC |
| Security: CET/IBT | Implemented, tested | Not documented | BCC |
| Security: stack probing | Implemented, tested | Not documented | BCC |
| Atomic type tracking | Storage/representation level | Not documented | BCC |
| DWARF debug info | Verified, 21 tests | "Missing DWARF data" | BCC |
| Clippy warnings | Zero | Not verified | BCC |
| Code formatting | Zero diff | Not verified | BCC |
| Non-UTF-8 fidelity (PUA) | Implemented | Not supported | BCC |
| GCC torture pass rate | 98.8% (measured) | ~99% (self-reported) | CCC |
| Projects w/ runtime tests | 5 verified | 15+ claimed | CCC |
| Total projects compiled | 11 tested | 150+ claimed | CCC |
| Optimization passes | 15 | 15 | Tie |
| SIMD intrinsic headers | Full suite | Full suite | Tie |
| Target architectures | 4 | 4 | Tie |
| Zero external Rust deps | Yes | Yes | Tie |
Final score: BCC leads in 15 categories, CCC leads in 3, with 4 ties.
CCC's advantages are real. A higher GCC torture test pass rate and broader project claims are meaningful. But CCC's own README warns that "the docs may be wrong and make claims that are false," and Anthropic's blog post acknowledged the assembler and linker were "still somewhat buggy" even as the README claimed they worked by default. BCC's claims are backed by committed test results, not documentation that disclaims its own accuracy.
Remaining Work
Transparency about limitations is as important as demonstrating capabilities.
With 88% of the project scope completed autonomously (624 of 710 hours), the remaining work is well understood. Twenty kernel source files still require GCC extensions that BCC has not yet implemented. Head-to-head performance benchmarking against the 5x GCC wall-clock ceiling has not been run. Stretch targets, including a full FFmpeg build, PostgreSQL linking, and coreutils completion, remain open.
The GCC torture test suite pass rate for BCC stands at 98.8% for non-skipped tests with 1,584 passed out of 1,602. CCC claims ~99%, which may represent a stronger result, though it remains unclear whether that figure is calculated on the same basis.
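The percentages quoted in this post follow directly from the raw counts. The small helper below, a hypothetical `percent` function written for this illustration, reproduces the arithmetic for the torture-test pass rate, the kernel file coverage, and the completed share of project hours.

```rust
/// Percentage of `passed` out of `total`, as a floating-point value.
fn percent(passed: u32, total: u32) -> f64 {
    100.0 * f64::from(passed) / f64::from(total)
}

/// Counts taken from the article: (label, numerator, denominator).
const REPORTED: [(&str, u32, u32); 3] = [
    ("GCC torture tests passed", 1584, 1602), // about 98.88%
    ("Kernel files compiled", 456, 476),      // about 95.80%
    ("Project hours completed", 624, 710),    // about 87.89%
];

/// Render each reported ratio as a line of text with two decimals.
fn summary() -> Vec<String> {
    REPORTED
        .iter()
        .map(|(label, n, d)| format!("{label}: {n}/{d} = {:.2}%", percent(*n, *d)))
        .collect()
}
```

Whether CCC's self-reported ~99% is computed on the same basis (for example, excluding skipped tests) determines whether the two torture-test figures are directly comparable.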
These percentages highlight how BCC's and CCC's main priorities differ. BCC's focus was to build a production-grade C compiler, not to replicate GCC. Had rebuilding GCC been the goal, we would expect Blitzy to deliver 100% feature parity.
The bottom line: BCC's test suite passes all 2,271 executed tests, with zero failures. The remaining gaps are well identified, well scoped, and require no architectural changes. Blitzy's core approach proves that any outstanding tasks would simply be incremental engineering work.
A Leap Forward in Autonomous Software Development
A C compiler is not the end goal. Blitzy's primary objective was to illuminate what is possible with autonomous software development.
AI coding assistants require users to be architects, context engineers, QA, and project managers simultaneously. Blitzy gives engineers time back to translate customer needs into technical specifications and design.
The compiler proves that integrating Blitzy in large-scale engineering initiatives pays off, buying back time and bandwidth to enable development at record speed.
Projects once gated by engineering time and resources become achievable.
The Blitzy platform makes the once unattainable now viable.

