Agentic upgrades — when automation decides
How we reinvented our automated project upgrades with AI agents: from a linear pipeline to an adaptive system.
The problem with classic upgrade automation
Automated updates are nothing new. For years we have relied on CI/CD pipelines, dependency scanners and Infrastructure as Code to keep systems current. And yet one problem has persisted: automation tends to end exactly where the real decisions begin.
Typical upgrade setups today look like this:
- Tools detect new versions (e.g. dependencies, base images).
- Pull requests are created automatically.
- Tests pass (or they don't).
- A human decides whether to merge.
This works, but only up to a point. As soon as things get complex, friction appears: breaking changes have to be interpreted, changelogs are unstructured or incomplete, migration steps are not deterministic, and security fixes compete with stability requirements. The result: automation creates work instead of taking it over.
Our approach: agentic upgrades
We no longer model upgrade processes as a pipeline, but as a system of cooperating AI agents. Each agent takes on not just a task, but a role with responsibility.
1. Analysis agent
- Assesses new versions semantically (not just version numbers).
- Extracts relevant changes from changelogs.
- Classifies risks (breaking / minor / security).
2. Migration agent
- Derives concrete code or configuration changes.
- Produces structured migration plans.
- Adapts IaC, Dockerfiles or application code in a targeted way.
3. Validation agent
- Runs tests in a context-aware manner.
- Detects not only failures, but also unusual behaviour.
- Assesses result quality, not just exit codes.
4. Decision agent
- Aggregates all signals.
- Reaches a well-founded merge or rollback decision.
- Documents the decision in a traceable way.
The difference: from static pipeline to adaptive system
The central difference does not lie in the technology — it lies in the paradigm.
Before: Linear pipeline. Rigid rules. Humans as the bottleneck.
Today: Dynamic interaction between agents. Context-based decisions. Humans as supervisors, not operators.
That fundamentally changes the quality of automation.
What has improved in concrete terms
Fewer manual interventions. Upgrades now run to completion fully automatically far more often, even for more complex changes.
Better risk assessment. Agents evaluate changes more granularly than simple version bumps or SemVer rules.
Faster security responses. Critical updates are prioritised, classified and often brought into production within a very short time.
Traceability. Every decision is documented — why something was merged, why an upgrade was rejected, which risks were recognised.
Technical implementation (simplified)
Our agents typically operate within existing DevSecOps structures:
- Integration into CI/CD (e.g. GitOps workflows).
- Use of repository context (code, IaC, history).
- Access to security data sources (CVEs, advisories).
- State-based communication between agents.
Importantly: the agents are not isolated, but share context and build on one another.
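One way to picture this shared context: instead of handing artifacts down a pipeline, all agents read from and append to a single context object assembled from the repository and from security data sources. A minimal sketch under that assumption (the gathering functions are stubs; a real system would scan the repo and query CVE feeds):

```python
def gather_repo_context(repo_path: str) -> dict:
    # Stub: a real agent would scan code, IaC files and git history.
    return {"dockerfiles": ["Dockerfile"], "iac": ["main.tf"], "history": []}

def gather_security_context(package: str) -> dict:
    # Stub: a real agent would query CVE feeds and vendor advisories.
    return {"advisories": [], "max_cvss": 0.0}

def build_context(repo_path: str, package: str) -> dict:
    """One context object that every agent reads from and builds on."""
    return {
        "repo": gather_repo_context(repo_path),
        "security": gather_security_context(package),
        "signals": {},  # each agent records its findings here
    }
```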
Not a replacement for engineering — a multiplier
A common misconception is that AI replaces engineering. Our experience is different: good agents amplify good engineering — and fail at bad engineering. Without clean tests, clear architecture and reproducible builds, even the best agents achieve little.
Conclusion
With agentic upgrades we have taken a decisive step: from “automation that prepares” to “automation that decides”. The result is more stable systems, faster response times and noticeably less operational load. And perhaps most importantly: we can again concentrate on what really matters — building meaningful systems instead of managing updates.
Time to rethink your own upgrade pipeline?
If your routine upgrades create too much manual work, or your security patches take too long, we will take a 30-minute look at it with you. Concretely, on your own setup. Free of charge, no obligation.
Frequently asked questions
What clients ask us most often about agentic upgrades — answered openly.
Do we need an LLM vendor contract for this?
Not necessarily. The agents work in a model-vendor-agnostic way — OpenAI, Anthropic, Mistral or self-hosted. We design the architecture so you can switch provider at any time without rebuilding the pipeline.
How do you stop an agent merging the wrong thing?
Every decision passes through a validation step with tests, static analysis and policy-based gates. For critical changes (breaking, high-impact security, missing test coverage) the decision agent escalates to a human. We do not promise 100 % autonomy — we promise less routine work.
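The escalation rule described here can be sketched as a simple policy function. The thresholds and field names below are assumptions for illustration, not our actual policy:

```python
def requires_human_review(risk: str, test_coverage: float, max_cvss: float) -> bool:
    """Return True when a change must be escalated instead of auto-merged."""
    if risk == "breaking":
        return True   # breaking changes always go to a human
    if max_cvss >= 9.0:
        return True   # critical security impact: a human signs off
    if test_coverage < 0.6:
        return True   # too little coverage to trust automated validation
    return False
```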
Does this work with legacy code as well?
Only to a limited extent. Without clean tests, clear architecture and reproducible builds the approach yields little. Agentic upgrades are often preceded by a consolidation phase: stabilising tests, making builds reproducible, untangling dependencies.
What happens if an agent documents a bad decision?
That is precisely the point: because every decision is documented, it is also reviewable. We regularly audit the decisions our agents make. When patterns emerge that do not fit, we adjust the policies. This is closer to engineering discipline than to magic.