Your AI Incident Tools Are Missing a Key Layer

TL;DR: PagerDuty's Chief AI Officer warns that while AI accelerates code delivery, it also increases incidents. Most current AI tools for incident response lack a critical layer of operational context, making them less effective.
Key facts
- Category: Infrastructure
- Impact: High
- Source: The New Stack
Full summary
While AI helps ship code faster, it also causes more incidents. PagerDuty's CAIO says current AI response tools are missing a critical layer.
Artificial intelligence is enabling engineering teams to ship software faster than ever before, but that velocity has a significant side effect. According to PagerDuty's Chief AI Officer, roughly 70% of all production incidents stem from changes to live systems, so a higher rate of code deployment naturally leads to more frequent and more complex incidents. Traditional incident response processes were designed for a slower, more predictable era of software development and are becoming inadequate as the pace accelerates. The volume and speed of AI-driven development demand a fundamental shift in how organizations prepare for and manage system failures, and teams that fail to adapt their playbooks face growing risk.
In response, many organizations are turning to AIOps and other automated tools to help manage this complexity. However, the PagerDuty executive argues that most of these solutions are missing a critical component: a deep understanding of the organization's unique operational context. These tools can analyze logs and metrics but often lack the “socio-technical” awareness of who owns which service, what the on-call rotation is, or what was learned from a similar incident six months ago. This missing layer of human and organizational knowledge means that AI tools can generate a lot of noise, escalate issues to the wrong people, or fail to identify the true root cause. Without this context, they remain powerful but blunt instruments, unable to provide the nuanced, targeted support that human responders need during a high-stakes outage.
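To make the missing layer concrete, here is a minimal Python sketch of attaching operational context to a raw alert before anyone is paged. It is only an illustration of the idea described above, not PagerDuty's design or any vendor's API: `OperationalContext`, `EnrichedAlert`, `enrich_alert`, and the sample service, team, and responder names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OperationalContext:
    """Hypothetical store of organizational knowledge (all fields are assumptions)."""
    owners: dict[str, str]    # service name -> owning team
    on_call: dict[str, str]   # team -> responder currently on call
    past_incidents: dict[str, list[str]] = field(default_factory=dict)  # service -> learnings


@dataclass
class EnrichedAlert:
    service: str
    message: str
    owning_team: str | None
    responder: str | None
    related_learnings: list[str]


def enrich_alert(service: str, message: str, ctx: OperationalContext) -> EnrichedAlert:
    """Attach owner, on-call responder, and prior learnings to a raw alert."""
    team = ctx.owners.get(service)
    # Only look up a responder when the owning team is known.
    responder = ctx.on_call.get(team) if team is not None else None
    learnings = list(ctx.past_incidents.get(service, []))
    return EnrichedAlert(service, message, team, responder, learnings)


# Illustrative data only; none of these names come from the article.
ctx = OperationalContext(
    owners={"checkout-api": "payments"},
    on_call={"payments": "alice"},
    past_incidents={"checkout-api": ["Connection pool exhausted after a config change"]},
)
alert = enrich_alert("checkout-api", "p99 latency above threshold", ctx)
print(alert.owning_team, alert.responder, alert.related_learnings)
```

An alert for a service with no ownership record comes back with `owning_team` and `responder` set to `None`, which makes the gap visible instead of routing the page to a guess.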
This gap highlights the need for a more intelligent approach to AIOps. The future of effective incident management isn't just about more automation; it's about smarter automation that augments human expertise. For technology leaders, this means evaluating and implementing AI tools that don't just process machine data but also integrate with and learn from team structures, service ownership maps, and historical incident data. The goal is to create a collaborative system where AI provides context-aware insights, helping teams diagnose problems faster and coordinate responses more effectively. As development cycles continue to shrink, the ability to bridge the gap between machine-generated data and human operational wisdom will become the key differentiator between resilient and fragile systems.
Why it matters
The PagerDuty argument points to a critical gap in many AIOps strategies: a focus on pure automation that ignores human and operational context. For engineering leaders, it is a warning that simply buying AI tools won't solve the problem of increased incident rates; the tools must integrate deeply with team workflows and organizational knowledge.
Business impact
Companies relying on AI to accelerate software delivery may face diminishing returns if their incident response capabilities don't evolve. Increased downtime and slower resolutions can directly impact revenue, customer trust, and engineering team morale. This insight pushes businesses to evaluate AIOps tools not just on their technical features, but on their ability to understand and adapt to the company's unique operational environment.
Primary source: The New Stack