artem_ml 16 Jun 2025 17:42

We have an existing PHP infrastructure. The ML pipeline needs data preprocessing (normalization, tokenization, feature extraction). Making a case to the team for keeping it in PHP rather than spinning up a Python service.

Looking for arguments either way and practical experience.

Replies (5)
alex_petrov 16 Jun 2025 18:09

For preprocessing that is mostly string manipulation, JSON parsing, and basic math: PHP is fine. For anything involving matrix operations, convolutions, or libraries that have no PHP equivalent: Python is the right tool. Do not fight the ecosystem.
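
A minimal sketch of the "PHP is fine" end of that spectrum: tokenization, bag-of-words counts, and min-max normalization in plain PHP. The function names are hypothetical, and it assumes valid UTF-8 input where ASCII-only lowercasing is acceptable.

```php
<?php
// Hypothetical helpers for illustration; not taken from any library.

/** Lowercase (ASCII only), replace punctuation with spaces, split on whitespace. */
function tokenize(string $text): array
{
    $clean = preg_replace('/[^\p{L}\p{N}\s]+/u', ' ', strtolower($text)) ?? '';
    return preg_split('/\s+/', trim($clean), -1, PREG_SPLIT_NO_EMPTY);
}

/** Token => occurrence count, in order of first appearance. */
function bagOfWords(string $text): array
{
    return array_count_values(tokenize($text));
}

/** Scale numeric values into [0, 1]; a constant column maps to all zeros. */
function minMaxNormalize(array $values): array
{
    if ($values === []) {
        return [];
    }
    $min = min($values);
    $range = max($values) - $min;
    return array_map(
        fn($v) => $range == 0 ? 0.0 : (float) ($v - $min) / $range,
        $values
    );
}
```

Nothing here needs an extension beyond PCRE, which is the kind of work the standard library handles comfortably.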

dmitry_kv 16 Jun 2025 18:34

The question is whether your preprocessing is tightly coupled to the rest of the PHP app. If it just takes DB records and outputs feature vectors, a Python script that reads from the same DB is clean and not a big operational addition.
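
One rough way to test the coupling: see whether the step can be written as a pure function from a DB row to a feature vector, with no calls back into the app. A sketch, assuming a hypothetical row shape (`id`, `title`, `body`, `price`) that is not from the original post:

```php
<?php
// Hypothetical row shape: ['id' => int, 'title' => string, 'body' => string, 'price' => float].

/** Turn one DB row into a fixed-length numeric feature vector. */
function rowToFeatures(array $row): array
{
    return [
        (float) strlen($row['title']),          // title length in bytes
        (float) str_word_count($row['body']),   // rough word count
        log1p((float) $row['price']),           // log(1 + price) to compress large values
    ];
}

/** Map rows (e.g. from a PDO fetch loop) to id => feature vector. */
function extractFeatures(iterable $rows): array
{
    $features = [];
    foreach ($rows as $row) {
        $features[$row['id']] = rowToFeatures($row);
    }
    return $features;
}
```

If the real step fits this shape, it depends only on the rows it is given, so it can live on either side of the language boundary.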

vova 16 Jun 2025 19:39

PHP FFI can call into native C libraries directly, and into Python extensions only indirectly, but it is fragile and hard to debug. It is not worth it unless you have a very specific bottleneck you cannot solve otherwise.
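
For a sense of what that looks like, here is a minimal sketch that calls `abs` from the C standard library through FFI. It assumes a glibc Linux system (`libc.so.6`) with the FFI extension enabled; `nativeAbs` is a hypothetical wrapper name, and it falls back to PHP's own `abs` when the native call cannot be set up.

```php
<?php
// Assumption: glibc Linux, where the C library is loadable as "libc.so.6".

function nativeAbs(int $n): int
{
    if (!extension_loaded('ffi')) {
        return abs($n); // FFI extension missing: stay in PHP
    }
    try {
        // The C declaration must match the real symbol exactly.
        $libc = FFI::cdef('int abs(int j);', 'libc.so.6');
        return $libc->abs($n);
    } catch (Throwable $e) {
        // Covers load-time problems: library not found, FFI disabled by ini, parse errors.
        return abs($n);
    }
}
```

The `try`/`catch` only covers setup failures. A declaration that does not match the real C signature is not reported as an exception and can crash the process, which is a large part of why this is hard to debug.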

artem_ml 16 Jun 2025 20:19

We keep ingestion and chunking in PHP (DB access, file reading, text splitting). Embedding generation goes to the OpenAI API. Vector storage/retrieval uses the Qdrant client. No Python anywhere in the pipeline. Whether that works for you depends on whether you use external APIs or run your own models.
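
A rough sketch of the PHP side of such a pipeline: word-based chunking with overlap, plus building the JSON body for an embeddings request. The function names, chunk sizes, and model name are assumptions for illustration, not our actual code, and no HTTP call is made here.

```php
<?php
/** Split text into chunks of $size words, each sharing $overlap words with the previous one. */
function chunkWords(string $text, int $size = 200, int $overlap = 20): array
{
    if ($size <= 0 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Need size > overlap >= 0');
    }
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $total = count($words);
    $step = $size - $overlap;
    $chunks = [];
    for ($i = 0; $i < $total; $i += $step) {
        $chunks[] = implode(' ', array_slice($words, $i, $size));
        if ($i + $size >= $total) {
            break; // this window already reached the end of the text
        }
    }
    return $chunks;
}

/** JSON body with "model" and "input" fields, as a hosted embeddings endpoint expects. */
function embeddingRequestBody(array $chunks, string $model = 'text-embedding-3-small'): string
{
    return json_encode(['model' => $model, 'input' => $chunks], JSON_THROW_ON_ERROR);
}
```

For example, `chunkWords('a b c d e', 3, 1)` gives `['a b c', 'c d e']`. Sending the body and writing the returned vectors to Qdrant is plain HTTP from there.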

katedev 16 Jun 2025 22:17

If you eventually need to run local models for cost or privacy reasons, Python becomes harder to avoid. Starting with hosted APIs means you can stay in PHP longer before hitting that wall.
