katedev · 10 Jun 2025 23:42

We have 15 different prompts used across our application, and they change frequently as we tune them. We have no good process for tracking changes, testing regressions, or rolling back a bad prompt change.

How do teams manage prompt engineering over time?

Replies (7)
artem_ml · 10 Jun 2025 23:52

Store prompts in the DB with a version column, not in code. Each prompt has a name, version, content, and model configuration. The application loads the active version at runtime. Rollback is a DB update.
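A minimal sketch of this layout, using SQLite and hypothetical table/column names (`prompts` with `name`, `version`, `content`, `model_config`, `is_active`):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE prompts (
        name         TEXT NOT NULL,
        version      INTEGER NOT NULL,
        content      TEXT NOT NULL,
        model_config TEXT NOT NULL,   -- JSON: model name, temperature, etc.
        is_active    INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (name, version)
    )
""")

def load_active_prompt(name):
    """The application loads the active version at runtime."""
    version, content, config = conn.execute(
        "SELECT version, content, model_config FROM prompts "
        "WHERE name = ? AND is_active = 1", (name,)
    ).fetchone()
    return version, content, json.loads(config)

def rollback(name, version):
    """Rollback is just a DB update: re-activate an older version."""
    with conn:
        conn.execute("UPDATE prompts SET is_active = 0 WHERE name = ?", (name,))
        conn.execute(
            "UPDATE prompts SET is_active = 1 WHERE name = ? AND version = ?",
            (name, version),
        )

# Seed two versions with v2 active, then roll back to v1.
conn.execute("INSERT INTO prompts VALUES ('summarize', 1, "
             "'Summarize: {{document}}', '{\"model\": \"gpt-4o\"}', 0)")
conn.execute("INSERT INTO prompts VALUES ('summarize', 2, "
             "'Briefly summarize: {{document}}', '{\"model\": \"gpt-4o\"}', 1)")
rollback("summarize", 1)
```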

0
alex_petrov · 11 Jun 2025 00:58

For regression testing: collect a test set of (input, expected output) pairs. After changing a prompt, run the test set and compare outputs with the previous version. Manual review or LLM-as-judge scoring.
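A rough harness for this, with a hypothetical `run_prompt` stubbed in place of the real model call so the structure is runnable:

```python
def run_prompt(prompt_template, text):
    # Stand-in for the real model call.
    return prompt_template.replace("{{input}}", text).upper()

# Test set of (input, expected output) pairs, collected from real traffic.
test_set = [
    ("refund policy", "SUMMARIZE: REFUND POLICY"),
    ("shipping times", "SUMMARIZE: SHIPPING TIMES"),
]

def regression_report(prompt_template, cases):
    """Run the test set and collect cases whose output diverges.

    Divergences go to manual review or LLM-as-judge scoring; exact string
    comparison is only a first-pass filter.
    """
    failures = []
    for text, expected in cases:
        got = run_prompt(prompt_template, text)
        if got != expected:
            failures.append((text, expected, got))
    return failures

failures = regression_report("Summarize: {{input}}", test_set)
```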

0
dmitry_kv · 11 Jun 2025 02:47

We keep prompts in version-controlled YAML files in the repo. Deployment includes a migrate step that upserts prompt versions to the DB. Best of both worlds: git history and runtime DB lookup.
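A sketch of that migrate step, assuming the YAML has already been parsed (e.g. with PyYAML) into dicts like the literal below, and a prompts table keyed on (name, version):

```python
import sqlite3

# Hypothetical result of parsing prompts/*.yaml at deploy time.
parsed_prompts = [
    {"name": "summarize", "version": 3, "content": "Summarize: {{document}}"},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prompts (name TEXT, version INTEGER, content TEXT, "
    "PRIMARY KEY (name, version))"
)

def migrate(prompts):
    """Upsert each versioned prompt; re-running the migration is a no-op."""
    with conn:
        for p in prompts:
            conn.execute(
                "INSERT INTO prompts (name, version, content) "
                "VALUES (?, ?, ?) "
                "ON CONFLICT (name, version) DO UPDATE "
                "SET content = excluded.content",
                (p["name"], p["version"], p["content"]),
            )

migrate(parsed_prompts)
migrate(parsed_prompts)  # idempotent: safe to run on every deploy
```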

0
vova · 11 Jun 2025 03:51

LangSmith and similar tools do prompt versioning as a service. Overkill for 15 prompts, useful if you have hundreds across many models.

0
artem_ml · 11 Jun 2025 04:12

Shadow testing: run both old and new prompt in parallel, log both outputs, compare offline. Costs double per request during the test period but gives you real traffic results before switching.
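The request path looks roughly like this (hypothetical names, model call stubbed):

```python
shadow_log = []  # in practice, a log table or analytics pipeline

def run_prompt(prompt_template, text):
    # Stand-in for the real model call.
    return prompt_template.replace("{{input}}", text)

def handle_request(text, old_prompt, new_prompt):
    old_out = run_prompt(old_prompt, text)  # what the user actually sees
    new_out = run_prompt(new_prompt, text)  # shadow call: doubles cost
    shadow_log.append({
        "input": text,
        "old": old_out,
        "new": new_out,
        "diverged": old_out != new_out,  # flag for offline comparison
    })
    return old_out

handle_request("refund policy",
               "Summarize: {{input}}",
               "Briefly summarize: {{input}}")
```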

0
katedev · 11 Jun 2025 05:30

How do you handle prompts that have PHP variables interpolated? Template syntax in the stored prompt?

0
artem_ml · 11 Jun 2025 06:52

Simple Mustache-style placeholders: {{user_name}}, {{document}}. Before sending, do a str_replace pass. Avoid complex templating engines in prompts; the indirection makes them harder to read and test.
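The substitution pass is a few lines; in PHP it would be a str_replace loop, sketched here in Python. Unreplaced placeholders are deliberately left in place so a missing variable is obvious in logs:

```python
def render(prompt_template, variables):
    """Replace each {{name}} placeholder with its value."""
    out = prompt_template
    for name, value in variables.items():
        out = out.replace("{{" + name + "}}", str(value))
    return out

rendered = render(
    "Hello {{user_name}}, summarize this: {{document}}",
    {"user_name": "Kate", "document": "Q2 report"},
)
```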

0