Design Rationale
This page explains the key design decisions in Corosio and the tradeoffs they represent. Understanding these decisions helps users make informed choices about when to use Corosio and how to extend it.
Coroutine-First Design
Corosio is designed from the ground up for coroutines. Unlike frameworks that adapt callback-based operations for coroutines, every I/O operation in Corosio returns an awaitable.
Why Not Callbacks?
Traditional callback-based frameworks like Boost.Asio use templates extensively:
// Callback-based: templates everywhere
template<
    class Protocol, class Executor,
    class MutableBufferSequence, class Handler>
void async_read(
    basic_socket<Protocol, Executor>& s,
    MutableBufferSequence const& buffers,
    Handler&& handler);
This creates several problems:
- N×M template instantiations for N operations × M executor/handler combinations
- Binary size growth that can reach megabytes
- Compile times measured in minutes for moderate codebases
- Nested move-construction overhead at runtime
The Coroutine Alternative
Corosio’s coroutine-first interfaces use uniform, template-free types:
// Coroutine-first: uniform types
capy::task<void> read_data(corosio::socket& s, buffer buf);
This approach provides several benefits:
- Clean public interfaces: No templates, no allocators, just task
- Hidden platform types: I/O state lives in translation units
- Fast compilation: Type erasure at boundaries
- ABI stability: Platform-specific types never appear in headers
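For illustration, a call site composes these operations with plain co_await. The member name read_some below is a hypothetical example of the surface API, not a verbatim Corosio signature.
// Sketch only: read_some is a hypothetical member used to show the shape
// of a coroutine-first call site.
capy::task<std::size_t> read_twice(corosio::socket& s, buffer buf)
{
    auto n1 = co_await s.read_some(buf);  // suspends; resumes when the read completes
    auto n2 = co_await s.read_some(buf);  // every operation is awaited the same way
    co_return n1 + n2;
}
No handler type, executor template parameter, or allocator appears anywhere in the signature.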
Affine Awaitable Protocol
The central innovation in Corosio is the affine awaitable protocol, which propagates executor affinity through coroutine chains without embedding executor types in public interfaces.
The Lost Context Problem
Consider this scenario:
capy::task<void> ui_handler()
{
auto data = co_await fetch(); // Completes on network thread
update_ui(data); // Where are we now?
}
When fetch() completes, the coroutine might resume on a different thread
than expected. This is the scheduler affinity problem.
The Solution: Forward Propagation
Corosio solves this by passing the dispatcher forward through await_suspend:
template<capy::dispatcher Dispatcher>
auto await_suspend(std::coroutine_handle<> h, Dispatcher const& d)
{
// Store dispatcher, start I/O
// When complete, resume via: d(h)
}
The dispatcher flows from caller to callee through await_transform, not
through backward queries. When I/O completes, the awaitable resumes the
coroutine through the stored dispatcher, guaranteeing it runs on the
correct executor.
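As a rough sketch of the mechanism (not Corosio source: the std::function member stands in for the type-erased executor pointer the real operation state holds, and the platform I/O start is elided), an awaitable captures both the handle and the dispatcher at suspension and resumes only through that dispatcher:
// Illustrative sketch of an affine awaitable. The dispatcher forwarded into
// await_suspend is remembered and later used for resumption, so the coroutine
// always continues on the caller's executor.
struct affine_read_awaitable
{
    std::coroutine_handle<> h_;
    std::function<void(std::coroutine_handle<>)> resume_;  // stand-in for the stored dispatcher
    std::size_t n_ = 0;

    bool await_ready() const noexcept { return false; }

    template<capy::dispatcher Dispatcher>
    void await_suspend(std::coroutine_handle<> h, Dispatcher const& d)
    {
        h_ = h;
        resume_ = [&d](std::coroutine_handle<> c) { d(c); };  // keep the forward-propagated dispatcher
        // ...start the platform I/O here...
    }

    void complete(std::size_t n)  // invoked by the I/O backend, possibly on another thread
    {
        n_ = n;
        resume_(h_);              // resume through the dispatcher: correct executor guaranteed
    }

    std::size_t await_resume() const noexcept { return n_; }
};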
Type Erasure Strategy
Corosio uses type erasure strategically to balance performance against API simplicity.
Where Type Erasure Happens
| Component | Erasure | Rationale |
|---|---|---|
| Executor at call site | None | Full type preserved for inlining |
| Executor in coroutine chain | Pointer to any_executor | Single pointer, no templates |
| Buffer sequences | Type-erased | One implementation, not N×M |
| Platform I/O state | Preallocated in socket | Hidden from headers entirely |
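A minimal sketch of the coroutine-chain row, using hypothetical names in place of the actual any_executor: the concrete executor type exists only where the erasing wrapper is constructed, and everything downstream carries a single pointer.
// Hypothetical single-pointer erasure; not the actual any_executor.
struct erased_executor
{
    virtual void resume(std::coroutine_handle<> h) const = 0;  // run h on this executor
protected:
    ~erased_executor() = default;
};

template<class Executor>
struct erased_executor_impl final : erased_executor
{
    Executor ex;  // concrete type known only here, still available for inlining

    explicit erased_executor_impl(Executor e) : ex(e) {}

    void resume(std::coroutine_handle<> h) const override
    {
        ex.dispatch(h);  // assumes a const dispatch member on the concrete executor
    }
};

// Downstream, every awaitable and coroutine frame holds only
//     erased_executor const* ex_;
// one pointer, no template parameters in any public signature.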
The Encapsulation Tradeoff
We pay one pointer indirection per I/O operation for translation unit hiding. This addresses the template tax while keeping overhead negligible compared to actual I/O latency:
- Network RTT: 100,000+ ns
- Disk access: 10,000+ ns
- Dispatch overhead: 4–60 ns (depth dependent)
The indirection cost (~1-2 ns) is invisible in I/O-bound workloads.
Platform I/O Hiding
Platform-specific types (OVERLAPPED, io_uring_sqe, file descriptors)
do not appear in public headers.
How It Works
Each socket preallocates its operation state:
struct socket
{
struct state : work
{
any_coro h_;
any_executor const* ex_;
// OVERLAPPED, HANDLE, etc. — hidden here
};
std::unique_ptr<state> op_; // Allocated once
};
The state structure:
- Inherits from work, enabling intrusive queuing
- Stores the coroutine handle and executor reference for completion
- Contains platform-specific members invisible to callers
- Is allocated once at socket construction, not per-operation
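To show how an operation might use that block (a sketch assuming the dispatcher has already been erased to any_executor at this layer; start_platform_read is a hypothetical helper), a read fills in the preallocated state rather than allocating anything:
// Illustrative only: reusing the socket's preallocated state for a read.
struct read_awaitable
{
    socket& s_;
    buffer buf_;

    bool await_ready() const noexcept { return false; }

    void await_suspend(std::coroutine_handle<> h, any_executor const& ex)
    {
        auto& st = *s_.op_;             // state allocated once at socket construction
        st.h_ = h;                      // who to resume on completion
        st.ex_ = &ex;                   // and on which executor
        start_platform_read(st, buf_);  // OVERLAPPED / io_uring details stay in the .cpp
    }

    std::size_t await_resume() const noexcept;  // bytes transferred, filled in by the completion path
};
Because op_ already exists, the hot path performs no allocation and the header pulls in no platform types.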
Comparison with Frame Embedding
An alternative approach embeds operation state in the coroutine frame:
// Hypothetical: types exposed, state in frame
template<class Socket>
task<size_t> async_read(Socket& s, buffer buf)
{
    typename Socket::read_op op{s, buf}; // In frame
    co_return co_await op;
}
This eliminates indirection but exposes platform types in headers. We chose encapsulation for the following reasons:
- ABI stability across library versions
- Fast compilation with minimal header parsing
- Single implementation per operation (not N×M)
- Clean refactoring by changing one translation unit
Executor Model
Corosio uses the term executor rather than scheduler deliberately.
Why Not Scheduler?
In std::execution, schedulers are designed for heterogeneous computing:
selecting GPU vs CPU algorithms, managing completion domains, dispatching
to hardware accelerators.
Networking has different needs:
- Strand serialization for ordering guarantees
- I/O completion contexts (IOCP, epoll, io_uring)
- Thread affinity to ensure handlers run on correct threads
By using "executor," we signal that this is a distinct concept tailored to networking’s requirements.
Executor Operations
An executor supports three operations:
| Operation | Behavior |
|---|---|
| dispatch | Run inline if allowed, else queue. Use when crossing context boundaries. |
| post | Always queue. Use when guaranteed asynchrony is required. |
| defer | Queue as continuation (optimization hint). |
For coroutines, symmetric transfer is preferred when caller and callee share the same executor. The compiler generates tail calls between frames with zero executor involvement.
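As a generic sketch of that technique (the standard final-awaiter pattern, not capy::task's actual code, and assuming a promise with a continuation handle member): when the child coroutine finishes, its final awaiter returns the parent's handle and the compiler turns the resumption into a tail call.
#include <coroutine>

// Generic sketch of symmetric transfer in a task-like promise.
struct final_awaiter
{
    bool await_ready() const noexcept { return false; }

    template<class Promise>
    std::coroutine_handle<> await_suspend(std::coroutine_handle<Promise> h) noexcept
    {
        // Same-executor case: tail-call straight into the waiting coroutine,
        // with no queueing and no executor involvement.
        if (auto cont = h.promise().continuation)
            return cont;
        return std::noop_coroutine();  // nothing is waiting: stop here
    }

    void await_resume() const noexcept {}
};
When caller and callee are on different executors, resumption instead goes through the stored dispatcher as described above.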
Allocation Strategy
With proper recycling, both callbacks and coroutines achieve zero steady-state allocations.
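Coroutine frames normally imply one heap allocation per call, so the steady-state claim rests on recycling those frames. One common way to get there, shown as a generic sketch rather than Corosio's implementation, is for the promise type to return frames to a small thread-local cache:
#include <cstddef>
#include <new>

// Generic sketch: a promise base whose operator new/delete recycle coroutine
// frames of the same size through a tiny thread-local cache. A task's
// promise_type would inherit from this.
struct recycling_promise_base
{
    static constexpr std::size_t slots = 4;

    struct cache
    {
        void*       ptr[slots]  = {};
        std::size_t size[slots] = {};
    };

    static cache& local() { thread_local cache c; return c; }

    static void* operator new(std::size_t n)
    {
        auto& c = local();
        for (std::size_t i = 0; i < slots; ++i)
            if (c.ptr[i] && c.size[i] == n)
            {
                void* p = c.ptr[i];
                c.ptr[i] = nullptr;
                return p;              // steady state: reuse, no allocation
            }
        return ::operator new(n);      // cold path: allocate the first frames
    }

    static void operator delete(void* p, std::size_t n)
    {
        auto& c = local();
        for (std::size_t i = 0; i < slots; ++i)
            if (!c.ptr[i])
            {
                c.ptr[i]  = p;         // park the frame for the next call
                c.size[i] = n;
                return;
            }
        ::operator delete(p);          // cache full: release normally
    }
};
A callback design reaches the same steady state by caching its handler and operation objects, which is why the comparison comes out even.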
Comparison with std::execution
Corosio diverges significantly from std::execution (P2300).
Different Design Drivers
| Aspect | std::execution | Corosio |
|---|---|---|
| Primary use case | GPU/parallel algorithms | Networking/I/O |
| Context flow | Backward queries | Forward propagation |
| Algorithm customization | Domain transforms | Not needed (one impl per platform) |
| Type exposure | Sender/receiver types in public interfaces | Hidden in translation units |
When to Use Corosio
Corosio is well-suited for projects where:
- Coroutines are the primary programming model
- Public APIs must hide implementation details
- Compile time and binary size matter
- ABI stability is required across library boundaries
- Clean, simple interfaces are prioritized
Consider alternatives for projects where:
- You need callback-based APIs for C compatibility
- Maximum performance with zero abstraction is required
- You’re already invested in the Asio ecosystem
References
- Affine Awaitables — Protocol details