dmitry_kv 15 May 2026 21:11

Processing a CSV with 500k rows? Loading it into an array first will hit memory limits. Generators let you process one row at a time with constant memory usage regardless of file size.

Here is a complete pipeline pattern you can run in the sandbox (using an in-memory string instead of an actual file):

PHP
<?php
// Simulate a large CSV as a string
$csv = "id,name,score,active\n";
for ($i = 1; $i <= 1000; $i++) {
    $csv .= "$i,User$i," . rand(1, 100) . "," . rand(0, 1) . "\n";
}

// Generator 1: yield lines from source
function readLines(string $content): Generator
{
    $lines = explode("\n", trim($content));
    foreach ($lines as $line) {
        if ($line !== '') {
            yield $line;
        }
    }
}

// Generator 2: parse CSV lines into arrays
function parseCsv(Generator $lines): Generator
{
    $headers = null;
    foreach ($lines as $line) {
        $row = str_getcsv($line);
        if ($headers === null) {
            $headers = $row;
            continue;
        }
        yield array_combine($headers, $row);
    }
}

// Generator 3: filter rows
function filterActive(Generator $rows): Generator
{
    foreach ($rows as $row) {
        if ($row['active'] === '1') {
            yield $row;
        }
    }
}

// Generator 4: transform
 

The entire pipeline processes one row at a time. Peak memory stays flat no matter how large the input gets, as long as the source is read incrementally (for example with fgets()) rather than loaded whole.
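
With a real file, the source stage would read from a stream instead of calling explode() on a string. A minimal sketch, assuming a hypothetical readLinesFromStream() helper (not part of the post above) and using php://temp as a stand-in for an actual file so it still runs in the sandbox:

```php
<?php
// Hypothetical source stage: yields one line at a time from an open stream,
// so only the current line is held in memory.
function readLinesFromStream($handle): Generator
{
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\r\n"); // strip the trailing newline
        if ($line !== '') {
            yield $line;
        }
    }
}

// php://temp stands in for fopen('/path/to/file.csv', 'r')
$handle = fopen('php://temp', 'w+');
fwrite($handle, "id,name\n1,User1\n2,User2\n");
rewind($handle);

foreach (readLinesFromStream($handle) as $line) {
    echo $line, "\n";
}
fclose($handle);
```

The generator it returns can be handed to a stage like parseCsv() in place of readLines($csv).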

Replies (2)
lukaszkrzyz 15 May 2026 21:37

Great pattern. You can generalize the pipeline construction to avoid the deeply nested call syntax:

PHP
<?php
function pipeline(mixed $source, callable ...$stages): Generator
{
    // Feed the output of each stage into the next one
    $current = $source;
    foreach ($stages as $stage) {
        $current = $stage($current);
    }
    return $current;
}

function lines(string $s): Generator
{
    foreach (explode("\n", trim($s)) as $l) {
        // Compare against '' so a line containing only "0" is not dropped
        if ($l !== '') {
            yield $l;
        }
    }
}

function csvRows(Generator $g): Generator
{
    $h = null;
    foreach ($g as $l) {
        $r = str_getcsv($l);
        if ($h === null) { // the first line holds the headers
            $h = $r;
            continue;
        }
        yield array_combine($h, $r);
    }
}

$data = "name,val\nalice,10\nbob,20\ncarol,30";

// Then instead of mapScore(filterActive(parseCsv(readLines($csv))))
// you write:
$result = pipeline(
    lines($data),
    fn($g) => csvRows($g),
    // A closure that contains yield returns a Generator when called
    function ($g) {
        foreach ($g as $r) {
            if ((int) $r['val'] > 10) {
                yield $r;
            }
        }
    }
);

foreach ($result as $row) {
    echo $row['name'] . ": " . $row['val'] . "\n";
}
// Prints:
// bob: 20
// carol: 30
mykolap 16 May 2026 00:24

One thing to keep in mind: generators are not rewindable. Once you consume a generator you cannot iterate it again. If you need to pass the same data through multiple consumers, collect it into an array first or wrap it in a class that re-creates the generator on each iteration.
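
A rough sketch of the wrapper idea, assuming a hypothetical RewindableGenerator class (the name and shape are mine, not from any library). It stores a factory callable and builds a fresh generator each time iteration starts:

```php
<?php
// Hypothetical wrapper: each foreach calls getIterator(), which invokes
// the factory again and therefore starts from a brand-new generator.
final class RewindableGenerator implements IteratorAggregate
{
    /** @var callable */
    private $factory;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    public function getIterator(): Generator
    {
        return ($this->factory)();
    }
}

$numbers = new RewindableGenerator(function (): Generator {
    yield 1;
    yield 2;
    yield 3;
});

// Both loops see all three values
foreach ($numbers as $n) {
    echo $n, ' ';
}
foreach ($numbers as $n) {
    echo $n, ' ';
}
```

Note that the underlying work (reading, parsing) is repeated on every pass, so this trades CPU and I/O for memory; collecting into an array is the opposite trade.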

Also, exception handling inside generators can be tricky. If an exception is thrown inside a generator body, the generator is closed and the exception propagates to the foreach call site; the generator cannot be resumed afterwards. Make sure your pipeline handles errors at the consumer level with try/catch around the foreach, not inside individual generators.
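
To illustrate the consumer-level handling, here is a small sketch with a hypothetical parseScores() stage that throws on bad input. The names and sample data are made up for the example:

```php
<?php
// Hypothetical stage: throws as soon as it meets a non-numeric value
function parseScores(iterable $lines): Generator
{
    foreach ($lines as $line) {
        if (!is_numeric($line)) {
            throw new InvalidArgumentException("Bad score: $line");
        }
        yield (int) $line;
    }
}

$scores = parseScores(['10', '20', 'oops', '40']);
$seen = [];

try {
    // The exception thrown inside the generator surfaces here
    foreach ($scores as $score) {
        $seen[] = $score;
    }
} catch (InvalidArgumentException $e) {
    echo 'Stopped: ', $e->getMessage(), "\n";
}

echo 'Processed before failure: ', implode(', ', $seen), "\n";
var_dump($scores->valid()); // bool(false): the generator is finished
```

Rows yielded before the failure have already been processed, and the value after the bad one ('40') is never reached, so decide up front whether a partial result is acceptable.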
