
Learn Zig Series (#18b) - Addendum: Async Returns in Zig 0.16

What will I learn
- You will learn why Zig is bringing async back -- and why it looks nothing like before;
- the new std.Io interface and how it mirrors the std.mem.Allocator pattern;
- what "function coloring" is and how Zig 0.16 defeats it;
- how io.async() and Future.await() work across different backends;
- how io_uring on Linux reduces syscalls by 20x compared to poll-based I/O;
- fibers (stackful coroutines) and why they matter;
- the distinction between asynchrony and concurrency in Zig's model.
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- An installed Zig 0.15+ distribution (download from ziglang.org);
- Familiarity with Episode #18 (poll/epoll, manual event loops);
- The ambition to learn Zig programming.
Difficulty
- Intermediate
Curriculum (of the Learn Zig Series):
- Zig Programming Tutorial - ep001 - Intro
- Learn Zig Series (#2) - Hello Zig, Variables and Types
- Learn Zig Series (#3) - Functions and Control Flow
- Learn Zig Series (#4) - Error Handling (Zig's Best Feature)
- Learn Zig Series (#5) - Arrays, Slices, and Strings
- Learn Zig Series (#6) - Structs, Enums, and Tagged Unions
- Learn Zig Series (#7) - Memory Management and Allocators
- Learn Zig Series (#8) - Pointers and Memory Layout
- Learn Zig Series (#9) - Comptime (Zig's Superpower)
- Learn Zig Series (#10) - Project Structure, Modules, and File I/O
- Learn Zig Series (#11) - Mini Project: Building a Step Sequencer
- Learn Zig Series (#12) - Testing and Test-Driven Development
- Learn Zig Series (#13) - Interfaces via Type Erasure
- Learn Zig Series (#14) - Generics with Comptime Parameters
- Learn Zig Series (#15) - The Build System (build.zig)
- Learn Zig Series (#16) - Sentinel-Terminated Types and C Strings
- Learn Zig Series (#17) - Packed Structs and Bit Manipulation
- Learn Zig Series (#18) - Async Concepts and Event Loops
- Learn Zig Series (#18b) - Addendum: Async Returns in Zig 0.16 (this post)
This is a special addendum to episode #18. I wrote that episode covering how Zig removed async/await in 0.11 and how we use raw poll() and epoll instead. We built event loops by hand. That was all correct -- and it still is.
But something big has happened since. Async is coming back. Zig 0.16 brings a completely redesigned async I/O model, and it's too important to leave out of this series. So here we are with an addendum ;-)
This is not a regular episode with exercises and solutions. Think of it as a "breaking news" companion to ep18 -- same topic, new developments. The exercises from ep18 still stand (solutions next episode), and everything we covered there about raw I/O multiplexing is still the foundation that std.Io is built on top of.
Here we go!
Why bring async back?
Andrew Kelley (Zig's creator) put it bluntly: "the stuff that I did with Async/Await before, it never felt finished, it never felt like it was good enough. I feel like there is a path towards realizing my vision with this new thing."
The old async (0.5 through 0.10) used stackless coroutines and accounted for roughly a third of the compiler's complexity while serving maybe 5% of use cases. That's why it got ripped out -- we covered the full story in ep18. But the NEED for async I/O didn't go away. Network servers, file operations, anything that touches the outside world benefits from it. The question was never "should Zig have async?" but "how do we do it without the mess?"
The answer is std.Io.
The function coloring problem
Before we get into the Zig solution, I need to explain the problem it solves, because this is one of those things that seems abstract until you've been bitten by it in a real codebase. Bob Nystrom wrote a famous piece called "What Color is Your Function?" about a disease infecting almost every language with async support.
In Python (and if you've been following my Python series you've seen this pattern):
# "Blue" function (sync)
def get_data():
return db.query("SELECT * FROM users")
# "Red" function (async) -- different color!
async def get_data():
return await db.query("SELECT * FROM users")
A sync function cannot call an async function without becoming async itself. Once ANY function in your call chain is async, EVERY function above it must also be async. The color spreads upward like a virus. You end up maintaining two versions of everything -- sync and async. Or you give up and make everything async (which is what most Python projects end up doing after a certain size).
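You can watch the color spread in plain Python. In this sketch (fetch and report are invented names for the example), the sync caller gets back an unawaited coroutine object instead of a value -- and the only cure is to turn the caller async too, all the way up the chain:

```python
import asyncio
import inspect

async def fetch():           # "red": can only be driven by an event loop
    await asyncio.sleep(0)   # stand-in for real I/O
    return 42

def report():                # "blue": a plain sync function
    return fetch()           # this does NOT run fetch's body!

coro = report()
print(inspect.iscoroutine(coro))  # True -- a coroutine object, not 42
coro.close()                      # silence the "never awaited" warning

async def report_async():    # the virus spreads: report had to turn red
    return await fetch()

print(asyncio.run(report_async()))  # 42
```

That `coro.close()` line is the tell: the sync world can hold a red function's coroutine, but it can't extract a result from it without an event loop.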
JavaScript, Python, Rust, C#, Dart, Kotlin, Swift -- they all have this problem. The async keyword is viral. In Rust it's especially painful because you get crate fragmentation: tokio-specific crates that don't work with async-std, and vice versa. The async runtime choice infects your entire dependency tree. If you've ever tried to mix tokio with smol or async-std, you know what I'm talking about.
Zig 0.16 defeats this. Completely. No function coloring, period.
std.Io -- the Allocator pattern for I/O
If you've been following this series you know Zig's allocator pattern (we covered it in episode #7). Functions that allocate memory receive an Allocator parameter. The function doesn't know (or care) whether the allocator is a page allocator, an arena, a general purpose allocator, or a fixed buffer. The caller decides.
fn processData(allocator: std.mem.Allocator, input: []const u8) ![]u8 {
// ... the caller decides HOW memory is allocated
const copy = try allocator.dupe(u8, input);
return copy;
}
std.Io applies the exact same pattern to I/O. Functions that do I/O receive an Io parameter:
fn saveData(io: std.Io, data: []const u8) !void {
const file = try std.Io.Dir.cwd().createFile(io, "save.txt", .{});
defer file.close(io);
try file.writeAll(io, data);
}
The function has no idea whether io uses blocking syscalls, a thread pool, io_uring, or Grand Central Dispatch on macOS. It doesn't care. The caller picks:
pub fn main() !void {
// Option 1: blocking I/O (simplest, zero overhead, like C)
var io = std.Io.blocking();
// Option 2: thread pool
// var io = std.Io.threaded(.{ .thread_count = 4 });
// Option 3: io_uring (Linux, high performance)
// var io = std.Io.io_uring(.{});
try saveData(io, "hello world");
}
One function. Zero changes. Three execution models. That's how you defeat function coloring -- not with clever syntax, but with a uniform interface. The same philosophy Zig applies to memory allocation, applied to I/O. And just like how you can write a library that works with any allocator, you can now write a library that works with any I/O backend. No tokio-specific crates. No "this library requires asyncio." Just std.Io.
Having said that, this is a LOT more elegant than what I was expecting when the core team first announced they were revisiting async. I figured we'd get something like Go's goroutines or maybe a lightweight threading model. But the interface-based approach is pure Zig philosophy -- give the caller control, don't bake assumptions into the library ;-)
async and await -- but not keywords
Here's where it gets interesting. You CAN express concurrency through the Io interface using io.async() and Future.await():
fn saveData(io: std.Io, data: []const u8) !void {
// Start two independent file writes
var a_future = io.async(saveFile, .{ io, data, "saveA.txt" });
var b_future = io.async(saveFile, .{ io, data, "saveB.txt" });
// Wait for both results
try a_future.await(io);
try b_future.await(io);
}
With blocking I/O, io.async() just calls the function immediately and returns a completed future. Sequential, zero overhead. With a threaded or io_uring backend, both writes happen in parallel. Same code, different behavior -- the application author decides, not the library author.
Notice these aren't language keywords. There's no async fn declaration. There's no function coloring because saveData is just a normal function that takes an Io parameter. You could call it from any other normal function. The "async-ness" is a property of the runtime (the Io backend), not the function signature. This is the critical insight that separates Zig's approach from every other language.
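The mechanics are easier to see in a toy model. The Python sketch below (Future, BlockingIo, ThreadedIo are invented names, not the real std.Io API) shows the core trick: one backend-agnostic function, and the caller's choice of "Io" decides whether async_() runs eagerly or on a thread:

```python
import threading

class Future:
    """Minimal future: set once, await blocks until set."""
    def __init__(self):
        self._done = threading.Event()
        self._result = None
    def set(self, value):
        self._result = value
        self._done.set()
    def await_(self):
        self._done.wait()
        return self._result

class BlockingIo:
    """'Blocking backend': run immediately, return a completed future."""
    def async_(self, fn, *args):
        fut = Future()
        fut.set(fn(*args))
        return fut

class ThreadedIo:
    """'Threaded backend': run on a worker thread."""
    def async_(self, fn, *args):
        fut = Future()
        threading.Thread(target=lambda: fut.set(fn(*args))).start()
        return fut

def save_data(io, data):
    # "Library" code: has no idea which backend it's running on.
    a = io.async_(len, data)
    b = io.async_(str.upper, data)
    return a.await_(), b.await_()

print(save_data(BlockingIo(), "hello"))   # (5, 'HELLO')
print(save_data(ThreadedIo(), "hello"))   # (5, 'HELLO')
```

Same library function, two execution models, zero changes to its source -- the dependency-injection shape that std.Io applies to real I/O.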
Cancellation follows the defer pattern we know and love from ep4:
fn processWithTimeout(io: std.Io) !Result {
var task = io.async(doExpensiveWork, .{io});
defer task.cancel(io) catch {};
// If loadConfig fails, deferred cancel runs automatically
const config = try loadConfig(io);
return try task.await(io);
}
Both cancel() and await() are idempotent -- calling either multiple times is safe. This eliminates double-free and use-after-free bugs on async tasks. The same philosophy Zig applies to memory management (explicit lifetime, explicit cleanup, defer for automatic scope-based cleanup) applied to concurrent tasks.
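As a rough illustration of that idempotency contract (this Task class is invented for the example, not Zig's actual future type), a future that settles exactly once makes repeated cancel/await calls harmless:

```python
class Task:
    """Toy future with idempotent cancel: only the first settle acts."""
    def __init__(self, value):
        self._value = value
        self._settled = False

    def await_(self):
        self._settled = True      # first settle wins
        return self._value

    def cancel(self):
        if not self._settled:     # later calls are no-ops
            self._settled = True
            self._value = None

t = Task(42)
print(t.await_())  # 42
t.cancel()         # already settled: harmless no-op
t.cancel()         # still harmless -- no double-free analogue
```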
Asynchrony is not concurrency
Most languages treat these as synonyms. Zig separates them explicitly, and the distinction matters more than you might think.
Asynchrony: "I'm starting this, I'll collect the result later." The operations MIGHT run in parallel, depending on the backend. With blocking I/O they run sequentially. With io_uring they run in parallel. Your code doesn't care either way.
Concurrency: "These MUST run simultaneously." If the backend can't provide that, it panics rather than silently doing the wrong thing.
// Asynchrony -- works with ANY backend, even blocking
var a = io.async(saveFile, .{ io, data, "a.txt" });
var b = io.async(saveFile, .{ io, data, "b.txt" });
// Concurrency -- REQUIRES a parallel backend, panics on blocking
var server = io.asyncConcurrent(acceptLoop, .{io});
var handler = io.asyncConcurrent(handleLoop, .{io});
This is explicit. No surprises. If your server code requires concurrency, the type system (well, the runtime contract) makes that visible. If someone tries to run it with blocking I/O, they find out immediately -- not through a subtle bug where the server handles one client at a time and nobody notices until production.
I really like this distinction. In every other language I've used, "async" implies "concurrent" and you discover the edge cases the hard way. Zig's asyncConcurrent is basically a contract that says "this NEEDS parallelism to function correctly, don't run it on a backend that can't provide it." That's the kind of explicitness that prevents 3 AM debugging sessions ;-)
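Here's a small Python sketch of why that contract matters (ping and pong are invented for the example). The two tasks hand messages to each other, so each needs the other to be running at the same time:

```python
import queue
import threading

def ping(inbox, outbox):
    outbox.put("ping")
    return inbox.get()        # blocks until the peer answers

def pong(inbox, outbox):
    msg = inbox.get()
    outbox.put(msg + "-pong")

a, b = queue.Queue(), queue.Queue()
t = threading.Thread(target=pong, args=(b, a))
t.start()                     # a truly concurrent "backend"
result = ping(a, b)
t.join()
print(result)                 # ping-pong

# Run ping() to completion BEFORE pong() (a blocking "backend") and
# ping() waits forever on inbox.get(). This pairing needs real
# concurrency -- exactly the requirement asyncConcurrent encodes.
```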
io_uring -- 20x fewer syscalls
Remember from ep18: poll() scans every fd on every call. epoll improved this to O(1) readiness checks. But both are readiness-based -- the kernel tells you a socket is ready, then you make a second syscall to actually read or write. Two round trips between userspace and kernel for every I/O operation.
io_uring (Linux, you want kernel 6.11+ for the full feature set) is completion-based. You submit a batch of I/O requests to a ring buffer shared with the kernel. The kernel executes them and puts results in another ring buffer. One submission, one completion drain, no matter how many operations.
Loris Cro (from the Zig core team) measured: 33 syscalls with io_uring vs 677 with poll-based I/O. Same workload, same results. 20x fewer transitions between userspace and kernel.
Through std.Io, you get this for free. Your code calls file.writeAll(io, data). The io_uring backend batches it into the submission ring. You never touch the io_uring API directly. And if you're on macOS or Windows where io_uring doesn't exist, the same code uses whatever the best backend is for that platform (kqueue on macOS, IOCP on Windows). No #ifdef soup. No platform-specific codepaths in your application.
The practical difference between 33 and 677 syscalls isn't just academic -- every syscall involves a context switch between user mode and kernel mode, which flushes CPU caches (or at least parts of them), causes TLB invalidations, and consumes thousands of CPU cycles. At scale (thousands of connections, millions of operations per second), 20x fewer syscalls translates directly into lower latency and higher throughput. This is why io_uring has been adopted by every high-performance Linux server project in the last few years.
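A back-of-envelope model shows where a factor like that comes from (the counts below are illustrative, not Loris Cro's measurements): readiness-based I/O pays roughly two syscalls per operation, while a completion ring pays roughly two per batch:

```python
def readiness_model(ops):
    """poll/epoll style: ask 'is it ready?', then do the work."""
    syscalls = 0
    for _ in ops:
        syscalls += 1   # poll(): readiness check
        syscalls += 1   # read()/write(): the actual I/O
    return syscalls

def completion_model(ops, batch=64):
    """io_uring style: submit a whole batch, drain completions."""
    syscalls = 0
    for _ in range(0, len(ops), batch):
        syscalls += 1   # submit the batch to the ring
        syscalls += 1   # drain the completion ring
    return syscalls

ops = list(range(320))
print(readiness_model(ops))   # 640
print(completion_model(ops))  # 10
```

The ratio scales with batch size: the more operations the kernel can accept per ring submission, the fewer user/kernel transitions per unit of work.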
Fibers -- green threads without coloring
The io_uring backend uses fibers (also called stackful coroutines, or green threads -- same concept, different names depending on who you ask). A fiber has its own stack and instruction pointer but runs cooperatively on an OS thread. Think of it like std.Thread but much lighter weight -- creating a fiber is almost free compared to creating a real OS thread.
When a fiber calls file.writeAll(io, data) and the I/O isn't ready, the fiber suspends transparently. The OS thread picks up another fiber. When the kernel completes the I/O (via the io_uring completion ring), the original fiber resumes exactly where it left off.
From the fiber's perspective: blocking call. From the OS thread's perspective: non-blocking multiplexing. No async keyword. No colored functions. Just regular function calls that happen to cooperate behind the scenes.
This is fundamentally different from the old Zig async (0.5-0.10) which used stackless coroutines. Stackless coroutines transform each async function into a state machine at compile time -- which is why they added so much compiler complexity. Stackful coroutines (fibers) don't need compiler transformations at all. The stack switching is a runtime operation, invisible to the compiler. Less complexity, fewer bugs, more maintainable.
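Python generators make the stackless idea concrete: CPython rewrites the function below into a resumable state machine, and every suspension point must be spelled out (yield) in that very frame -- a helper it calls cannot suspend on its behalf. That visibility requirement is what made old Zig async's compile-time transformation so costly; fibers sidestep it by switching whole stacks at runtime.

```python
def session():
    # CPython compiles this into a state machine: each yield is a
    # numbered resume point, just like a stackless coroutine.
    yield "connect"
    yield "read"
    yield "close"

g = session()
print(next(g))   # connect
print(next(g))   # read
print(next(g))   # close

# A plain helper function called from session() could NOT yield for
# it -- suspension is only possible in the transformed frame itself.
```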
The first prototype runs on x86_64 Linux. Other architectures and macOS (via Grand Central Dispatch) are planned. WebAssembly, where stack-switching isn't possible (the WASM spec doesn't allow direct stack manipulation), will use stackless coroutines as a fallback -- still through the same std.Io interface. That's the beauty of the interface approach: different backends can use completely different implementation strategies, and your application code doesn't change.
The four backends
| Backend | How it works | Best for | Platform |
|---|---|---|---|
| Blocking | Direct syscalls, single thread | CLI tools, scripts, tests | Everywhere |
| Threaded | Blocking syscalls on a thread pool | Mixed CPU + I/O workloads | Everywhere |
| Green threads | Fibers + io_uring/kqueue/GCD | High-connection servers | Linux 6.11+, macOS later |
| Stackless | Compiler-transformed state machines | WASM, constrained envs | Everywhere (future) |
Same library code works with all four. The application picks. No library fragmentation -- unlike Rust where you have tokio-specific and async-std-specific crates that don't interoperate. This is a genuinely novel approach and I'm not aware of any other systems language that does it this way.
Before/after: the echo server from ep18
In episode #18 we built an echo server with raw poll() -- about 80 lines of manual fd management, revents checking, swap-on-remove logic. Here's the same server with std.Io:
fn handleClient(io: std.Io, client: std.Io.Socket) !void {
defer client.close(io);
var buf: [4096]u8 = undefined;
while (true) {
const n = client.read(io, &buf) catch return;
if (n == 0) return;
try client.writeAll(io, buf[0..n]);
}
}
fn acceptLoop(io: std.Io, server: std.Io.Socket) !void {
while (true) {
const client = try server.accept(io);
_ = io.async(handleClient, .{ io, client });
}
}
pub fn main() !void {
var io = std.Io.io_uring(.{});
const server = try std.Io.Socket.listen(io, .{ .port = 8080 });
defer server.close(io);
try acceptLoop(io, server);
}
No poll set. No fd array. No revents. Each io.async(handleClient, ...) spawns a fiber on io_uring, a thread on the threaded backend, or runs inline on blocking. The code reads like a simple blocking server -- but it handles thousands of concurrent connections. Swap io_uring for blocking and you have a test harness that's easy to reason about. Swap it for threaded and you have a portable cross-platform server. One codebase.
Compare this to the 80-line poll()-based echo server from ep18 with the manual fd array management, the swap-on-remove trick, the revents bitmask checking... You can see why this matters. The std.Io version is not just shorter -- it's fundamentally simpler to reason about, because the concurrency model is separated from the application logic. And you STILL understand what's happening underneath, because we built the raw version first.
That's actually the pedagogical point of doing ep18 before this addendum. If you only learned the std.Io interface, you'd be using it as a black box. Having built the poll-based event loop by hand, you know what every line of this code is doing under the hood. When something goes wrong in production (and it will, it always does), you'll know where to look.
A note on the std.Io API surface
One thing I want to call attention to is how the std.Io API mirrors patterns we've seen throughout this series. The io parameter that gets threaded through function calls works the same way as the allocator parameter from ep7. The defer client.close(io) is the same scope-based cleanup pattern from ep4. The error handling with try and catch return is standard Zig error handling from ep4. Futures with await look like optionals that you unwrap.
This is deliberate. Zig's async isn't a new language -- it's the same language, with one more interface type. You don't need to learn new syntax, new keywords, new control flow. You just pass one more parameter. That's incredibly refreshing after watching other languages bolt on async as essentially a parallel language within a language (looking at you, JavaScript Promises, Python asyncio, Rust Pin/Unpin...).
Status and timeline
The Zig 0.16.0 milestone on GitHub is essentially complete -- nearly all issues resolved. The std.Io interface is available in nightly builds right now.
- Threaded backend: production-ready
- io_uring/green threads: experimental, working, being optimized
- Stackless coroutines: design phase, future release
If you're on 0.15.x (like we are in this series), the practical advice: keep using std.posix.poll() and std.Thread for production code. Start experimenting with std.Io on nightly builds if you're curious. When 0.16 drops, you'll be ready -- and everything from ep18 will still make you a better debugger of async issues, because you understand what's happening under the hood.
The poll/epoll code from ep18 isn't going away or becoming obsolete. Those are the syscalls that std.Io calls internally. Knowing how they work is like knowing how TCP works even though you use HTTP libraries -- it makes you a better engineer when things go wrong.
Before you close this tab...
- Zig is bringing async back in 0.16 -- completely redesigned, nothing like the old 0.5-0.10 version that got ripped out
- std.Io does for I/O what std.mem.Allocator does for memory: one interface, multiple backends, the caller decides
- Function coloring is defeated. No async keyword, no viral signatures, no library fragmentation. A function that takes std.Io works with blocking, threaded, io_uring, or any future backend
- io_uring reduces syscalls 20x (33 vs 677 measured by Loris Cro) -- completion-based instead of readiness-based
- Fibers (stackful coroutines) give you synchronous-looking code with async execution -- no manual state machines, no compiler transformation complexity
- Asynchrony vs concurrency: io.async() works on any backend (may or may not parallelize). io.asyncConcurrent() REQUIRES a parallel backend and panics if it can't deliver. Explicit contracts, no surprises.
- Everything from ep18 still applies. poll(), epoll, raw event loops -- they're the foundation std.Io is built on. Knowing the low level makes you better at using the high level.
- 0.16 is imminent. Threaded backend is ready. io_uring backend is experimental. Try nightly builds today.
That's it for this addendum. Regular episodes resume next time, and we'll have the solutions to ep18's exercises waiting for you there ;-)