Fibers with a grain of salt

Do you prefer to listen?

I've created an audio version of this post. You can listen on YouTube or by subscribing to my podcast feed on Apple Podcasts, Stitcher or Spotify

So I was going to write an in-depth blogpost about using fibers in PHP 8.1. We were going to start with a basic example to explain them from the ground up. The idea was to send async HTTP requests and process them in parallel using fibers.

But playing around with them, I learned that the RFC wasn't kidding when it said "The Fiber API is not expected to be used directly in application-level code. Fibers provide a basic, low-level flow-control API to create higher-level abstractions that are then used in application code".

So instead of going down this path and make things way too complicated, we'll discuss what fibers are conceptually, why they are barely usable in application code, and how you can make use of async PHP after all.

First, a little bit of theory.


Imagine you want to send three HTTP requests and process their combined result. The synchronous way of doing so is by sending the first one, waiting for the response, then sending the second one, waiting, etc.

Let's represent such a program flow with as easy a chart as possible. You need to read this chart from the top down, and time progresses the further down you go. Each colour represents one HTTP request. The coloured pieces of each request represent PHP code actually running, where the CPU on your server is doing work, the transparent blocks represent waiting times: the request needs to be sent over the wire, the other server needs to process it and send it back. It's only when the response arrives that we can work again.

This is a synchronous execution flow: send, wait, process, repeat.

In the world of parallel processing, we send the request but don't wait. Then we send the next request, followed by another. Only then do we wait for all requests. And while waiting we periodically check whether one of our requests is already finished. If that's the case we can process it immediately.

You can see how such an approach reduces execution time because we're using the waiting time more optimally.

Fibers are a new mechanism in PHP 8.1 that allow you to manage those parallel execution paths more efficiently. It was already possible by using generators and yield, but fibers are a significant improvement, since they are specifically designed for this use case.

You would create one fiber for each request, and pause the fiber after the request is sent. After you've created all three fibers, you'd loop over them, and resume them one by one. By doing so, the fiber checks whether the request is already finished, if not it pauses again, otherwise it can process the response and eventually finish.

You see, fibers are a mechanism to start, pause and resume the execution flow of an isolated part of your program. Fibers are also called "green threads": threads that actually live in the same process. Those threads aren't managed by the operating system, but rather the runtime — the PHP runtime in our case. They are a cost efficient way of managing some forms of parallel programming.

But note how they don't add anything truly asynchronous: all fibers live in the same PHP process, and only one can run at a time. It's the main process that loops over them and checks them while waiting, and that loop is often called the "event loop".

The difficult part about parallelism isn't about how you loop over fibers or generators or whatever mechanism you want to use; it's about being able to start an operation, hand it over to an external service and only check the result when you want to, in a non-blocking way.

See, in the previous examples, we assumed that we could just send off a request and check its response later when we want to, but that actually isn't as easy as it sounds.

That's right: most of PHP's functions that deal with I/O don't have this non-blocking functionality built-in. In fact, there's only a handful of functions that do, and using them is quite cumbersome.

There's the example of sockets, which can be set to be non-blocking, like so:

[$read, $write] = stream_socket_pair(
    STREAM_PF_UNIX,
    STREAM_SOCK_STREAM,
    STREAM_IPPROTO_IP
);
 
stream_set_blocking($read, false);
stream_set_blocking($write, false);

By using stream_socket_pair(), two sockets are created that can be used for bidirectional communication. And as you can see, they can be set to be non-blocking using stream_set_blocking().

Say we'd want to implement our example, sending three requests. We could use sockets to do so, but we'd need to implement the HTTP protocol ourselves on top of it. That's exactly what nox7 did, a user who shared a small proof of concept on Reddit to show how to send HTTP GET requests using fibers and sockets. Do you really want to be concerned with doing so in your application code?

The answer, for me at least, is "no". Which is exactly what the RFC warned about; I'm not mad about that. Instead, we're encouraged to use one of the existing async frameworks: Amp or ReactPHP.

With ReactPHP, for example, we could write something like this:

$loop = React\EventLoop\Factory::create();

$browser = new Clue\React\Buzz\Browser($loop);

$promises = [
    $browser->get('https://example.com/1'),
    $browser->get('https://example.com/2'),
    $browser->get('https://example.com/3'),
];

$responses = Block\awaitAll($promises, $loop);

That's a better example compared to manually creating socket connections. And that's what the RFC means: application developers shouldn't need to worry about fibers, it's an implementation detail for frameworks like Amp or ReactPHP.

That brings us to the question though: what are the benefits of fibers compared to what we already could do with generators? Well the RFC explains it this way:

Unlike stack-less Generators, each Fiber has its own call stack, allowing them to be paused within deeply nested function calls. A function declaring an interruption point (i.e., calling Fiber::suspend()) need not change its return type, unlike a function using yield which must return a Generator instance.

Fibers can be suspended in any function call, including those called from within the PHP VM, such as functions provided to array_map or methods called by foreach on an Iterator object.

It's clear that fibers are a significant improvement, both syntax-wise and in flexibility. But they are nothing yet compared to, for example, Go, with its "goroutines".

There's still lots of functionality missing for async PHP to become mainstream without the overhead of frameworks, and fibers are a good step in the right direction, but we're not there yet.

Noticed a tpyo? You can submit a PR to fix it. If you want to stay up to date about what's happening on this blog, you can subscribe to my mailing list: send an email to brendt@stitcher.io, and I'll add you to the list.

So there's that. There actually isn't much to say about fibers if you're not a maintainer of either Amp, ReactPHP or a smaller async PHP framework. Maybe even more frameworks or libraries will start incorporating them?

Meanwhile, there's also Swoole — a PHP extension that actually modifies several core functions to be non-blocking. Swoole itself is a Chinese project and often not very well documented when it comes to English, but recently Laravel announced first party-integration with it. Maybe this is the better strategy when it comes to moving PHP towards a more async model: optionally integrate Swoole or other extensions with frameworks like Laravel and Symfony?

It'll be interesting to see what the future will bring!