Async LLM calls using PHP Fibers or Swoole coroutines
For a feature that needs to call the OpenAI API three times in parallel (different prompts, independent results), I want to fan out the calls and wait for all to complete.
What is the right approach in PHP? Each call currently takes 3-4 seconds, so running them sequentially would take 9-12 seconds total.
If you are on Swoole or Hyperf, Hyperf's parallel() helper or Co\run() with multiple coroutines is the cleanest approach. Total wall time becomes max(call1, call2, call3) instead of the sum.
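A minimal sketch of the Swoole variant, assuming the Swoole extension is loaded; callOpenAi() is a hypothetical wrapper around whatever HTTP call you make today:

```php
<?php
// Sketch: fan out three OpenAI calls in Swoole coroutines.
// Assumes ext-swoole; callOpenAi() is a placeholder for your client call.

use Swoole\Coroutine\WaitGroup;
use Swoole\Runtime;

Runtime::enableCoroutine(); // hook curl/streams so blocking I/O yields the coroutine

\Swoole\Coroutine\run(function () {
    $prompts = ['prompt one', 'prompt two', 'prompt three'];
    $results = [];
    $wg = new WaitGroup();

    foreach ($prompts as $i => $prompt) {
        $wg->add();
        go(function () use ($i, $prompt, &$results, $wg) {
            $results[$i] = callOpenAi($prompt); // runs concurrently with the others
            $wg->done();
        });
    }

    $wg->wait(); // total wall time ≈ the slowest call, not the sum
    ksort($results);
    // ... use $results
});
```

Runtime::enableCoroutine() is what makes the otherwise-blocking curl calls cooperative, so you don't have to rewrite the HTTP layer.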
On plain FPM without an async framework, curl_multi_exec handles parallel HTTP requests natively. openai-php/client does not expose concurrent requests directly, but you can drop down to Guzzle and fire the requests yourself, either with promises (GuzzleHttp\Promise\Utils::all() over several postAsync() calls) or with a request pool.
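The curl_multi approach needs nothing beyond ext-curl. A sketch, where $requests maps a key to a curl-options array (for real OpenAI calls you would set CURLOPT_URL, CURLOPT_POST, CURLOPT_POSTFIELDS, and the auth header per entry):

```php
<?php
// Plain-PHP fanout with curl_multi: all transfers run concurrently,
// results come back keyed like the input array.

function fetchAll(array $requests): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($requests as $key => $options) {
        $ch = curl_init();
        // Caller's options win; RETURNTRANSFER is added if absent.
        curl_setopt_array($ch, $options + [CURLOPT_RETURNTRANSFER => true]);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive every transfer until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh); // sleep until a socket is ready
        }
    } while ($running && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}
```

Error handling is elided; in practice check curl_errno() per handle before trusting the body.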
The Guzzle Pool abstraction is straightforward: you give it an iterator of requests (or callables returning promises) and it runs them concurrently up to your concurrency limit. It worked well for our 5-call fanout.
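A sketch of the Pool approach, assuming guzzlehttp/guzzle is installed via Composer; the model name and endpoint path are illustrative:

```php
<?php
// Sketch: three concurrent OpenAI chat requests via Guzzle Pool::batch().
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

$client  = new Client(['base_uri' => 'https://api.openai.com/']);
$prompts = ['first prompt', 'second prompt', 'third prompt'];

// Generator of PSR-7 requests; Pool consumes it lazily.
$requests = function () use ($prompts) {
    foreach ($prompts as $prompt) {
        yield new Request(
            'POST',
            '/v1/chat/completions',
            [
                'Authorization' => 'Bearer ' . getenv('OPENAI_API_KEY'),
                'Content-Type'  => 'application/json',
            ],
            json_encode([
                'model'    => 'gpt-4o-mini', // illustrative model name
                'messages' => [['role' => 'user', 'content' => $prompt]],
            ])
        );
    }
};

// Runs all requests concurrently (up to 'concurrency') and returns an
// array of Response objects (or exceptions) in input order.
$responses = Pool::batch($client, $requests(), ['concurrency' => 3]);
```

Pool::batch() puts a RequestException in the result slot for any failed call, so check each element's type before reading the body.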
The ReactPHP event-loop approach also works, but it adds dependencies. For a one-off fanout, the Guzzle pool is simpler; for a system that does this constantly, an async framework is the better long-term choice.
With Fibers you can hand-roll a scheduler: create one Fiber per call, resume them round-robin, and have each suspend at the point where it would block on the curl read. But honestly, the Guzzle pool already does this for you.
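For illustration, here is a minimal hand-rolled scheduler (PHP >= 8.1). The demo tasks call Fiber::suspend() where a real implementation would wait on a socket; the scheduler resumes every unfinished fiber in turn until all have terminated:

```php
<?php
// Minimal round-robin Fiber scheduler sketch. Each task suspends where
// it would otherwise block; runAll() keeps resuming until all finish.

function runAll(array $tasks): array
{
    $fibers = array_map(fn (callable $t) => new Fiber($t), $tasks);
    foreach ($fibers as $f) {
        $f->start(); // runs until the first suspend (or completion)
    }

    $results = [];
    while ($fibers) {
        foreach ($fibers as $i => $f) {
            if ($f->isTerminated()) {
                $results[$i] = $f->getReturn();
                unset($fibers[$i]);
            } else {
                $f->resume(); // wake the task; it suspends again if still waiting
            }
        }
    }
    ksort($results);
    return $results;
}

// Demo: two tasks that each suspend once, standing in for an I/O wait.
$order = [];
$results = runAll([
    function () use (&$order) { $order[] = 'a1'; Fiber::suspend(); $order[] = 'a2'; return 'A'; },
    function () use (&$order) { $order[] = 'b1'; Fiber::suspend(); $order[] = 'b2'; return 'B'; },
]);
// $order is ['a1', 'b1', 'a2', 'b2']: the two tasks interleave.
```

Note this busy-loops between resumes; a real scheduler would park on curl_multi_select() instead, which is exactly the machinery the Guzzle pool wraps for you.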