How Hud's runtime sensor cut triage time from 3 hours to 10 minutes
💻 Technology

9 min read
VentureBeat

Engineering teams are generating more code with AI agents than ever before. But they're hitting a wall when that code reaches production. The problem isn't necessarily the AI-generated code itself. It's that traditional monitoring tools generally struggle to provide the granular, function-level data AI agents need to understand how code actually behaves in complex production environments.

Without that context, agents can't detect issues or generate fixes that account for production reality. It's a challenge that startup Hud is looking to help solve with the launch of its runtime code sensor on Wednesday. The company's eponymous sensor runs alongside production code, automatically tracking how every function behaves, giving developers a heads-up on what's actually occurring in deployment. "Every software team building at scale faces the same fundamental challenge: building high-quality products that work well in the real world," Roee Adler, CEO and founder of Hud, told VentureBeat in an exclusive interview.

"In the new era of AI-accelerated development, not knowing how code behaves in production becomes an even bigger part of that challenge." What software developers are struggling with The pain points that developers are facing are fairly consistent across engineering organizations.

Moshik Eilon, group tech lead at Monday.com, oversees 130 engineers and describes a familiar frustration with traditional monitoring tools. "When you get an alert, you usually end up checking an endpoint that has an error rate or high latency, and you want to drill down to see the downstream dependencies," Eilon told VentureBeat. "A lot of times it's the actual application, and then it's a black box. You just get 80% downstream latency on the application."

The next step typically involves manual detective work across multiple tools. Check the logs. Correlate timestamps. Try to reconstruct what the application was doing.

For novel issues deep in a large codebase, teams often lack the exact data they need. Daniel Marashlian, CTO and co-founder at Drata, saw his engineers spending hours on what he referred to as an "investigation tax." "They were mapping a generic alert to a specific code owner, then digging through logs to reconstruct the state of the application," Marashlian told VentureBeat.

"We wanted to eliminate that so our team could focus entirely on the fix rather than the discovery." Drata's architecture compounds the challenge. The company integrates with numerous external services to deliver automated compliance, which creates sophisticated investigations when issues arise.

Engineers trace behavior across a very large codebase spanning risk, compliance, integrations, and reporting modules. Marashlian identified three specific problems that drove Drata toward investing in runtime sensors. The first issue was the cost of context switching. "Our data was scattered, so our engineers had to act as human bridges between disconnected tools," he said.

The second issue, he noted, was alert fatigue. "When you have a complex distributed system, general alert channels become a constant stream of background noise, what our team describes as a 'ding, ding, ding' effect that eventually gets ignored," Marashlian said.

The third key driver was a need to integrate with the company's AI strategy. "An AI agent can write code, but it cannot fix a production bug if it can't see the runtime variables or the root cause," Marashlian said.

Why traditional APMs can't solve the problem easily

Enterprises have long relied on a class of tools and services known as application performance monitoring (APM). With the current pace of agentic AI development and modern development workflows, neither Monday.com nor Drata could get the required visibility from existing APM tools. "If I would want to get this information from Datadog or from Coralogix, I would just have to ingest tons of logs or tons of spans, and I would pay a lot of money," Eilon said. Eilon noted that Monday.com used very low sampling rates because of cost constraints.

That meant they often missed the exact data needed to debug issues. Traditional application performance monitoring tools also require prediction, which is a problem because sometimes a developer just doesn't know what they don't know. "Traditional observability requires you to anticipate what you'll need to debug," Marashlian said. "But when a novel issue surfaces, especially deep within a large, complex codebase, you're often missing the exact data you need." Drata evaluated several solutions in the AI site reliability engineering and automated incident response categories and didn't find what was needed.

"Most tools we evaluated were excellent at managing the incident process, routing tickets, summarizing Slack threads, or correlating graphs," he said. "But they often stopped short of the code itself. They could tell us 'Service A is down,' but they couldn't tell us why specifically." Another common capability in some tools including error monitors like Sentry is the ability to capture exceptions. The challenge, according to Adler, is that being made aware of exceptions is nice, but that doesn't connect them to business impact or provide the execution context AI agents need to propose fixes.

How runtime sensors work differently

Runtime sensors push intelligence to the edge, where code executes. Hud's sensor runs as an SDK that integrates with a single line of code. It sees every function execution but only sends lightweight aggregate data unless something goes wrong. When errors or slowdowns occur, the sensor automatically gathers deep forensic data, including HTTP parameters, database queries and responses, and full execution context.
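As a rough illustration of this pattern (not Hud's actual SDK, whose internals the article doesn't show), the sketch below wraps functions in a decorator that records lightweight aggregate timings on every call and escalates to capturing fuller context only when a call fails or runs unusually slowly. The decorator name, threshold, and captured fields are all assumptions.

```python
import functools
import time
import traceback
from collections import defaultdict

# Hypothetical in-process store of lightweight per-function aggregates.
aggregates = defaultdict(lambda: {"calls": 0, "total_ms": 0.0, "errors": 0})
SLOW_THRESHOLD_MS = 500  # assumed cutoff for escalating to deep capture


def capture_forensics(name, args, kwargs, detail):
    # A real sensor would ship structured context (HTTP params, DB queries,
    # execution state) to a backend; printing keeps the sketch self-contained.
    print({"function": name, "args": repr(args), "kwargs": repr(kwargs), "detail": detail})


def runtime_sensor(func):
    """Record cheap aggregates on every call; capture deep context on errors or slow calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            aggregates[func.__qualname__]["errors"] += 1
            capture_forensics(func.__qualname__, args, kwargs, traceback.format_exc())
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            stats = aggregates[func.__qualname__]
            stats["calls"] += 1
            stats["total_ms"] += elapsed_ms
            if elapsed_ms > SLOW_THRESHOLD_MS:
                capture_forensics(func.__qualname__, args, kwargs, f"slow call: {elapsed_ms:.0f} ms")
    return wrapper
```

Decorating a function with @runtime_sensor stands in for the "single line of code" integration point; a production sensor would instrument functions automatically rather than one decorator at a time.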

The system establishes performance baselines within a day and can alert both on dramatic slowdowns and on outliers that percentile-based monitoring misses. "Now we just get all of this information for all of the functions regardless of what level they are, even for underlying packages," Eilon said. "Sometimes you might have an issue that is very deep, and we still see it pretty fast."

The platform delivers data through four channels:

- A web application for centralized monitoring and analysis
- IDE extensions for VS Code, JetBrains and Cursor that surface production metrics directly where code is written
- An MCP server that feeds structured data to AI coding agents (a sketch of this appears below)
- An alerting system that identifies issues without manual configuration

The MCP server integration is critical for AI-assisted development.
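To make the MCP channel concrete, here is a minimal sketch of a server exposing function-level runtime metrics to a coding agent, built with the official MCP Python SDK's FastMCP helper. The tool name, metric fields, and data are hypothetical; Hud's actual MCP server and schema are not documented in the article.

```python
# Hypothetical MCP server exposing function-level runtime metrics to an AI agent.
# Assumes the official MCP Python SDK is installed (pip install mcp); the tool
# name and metric schema are invented for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("runtime-metrics")

# Stand-in for data a runtime sensor would aggregate from production.
FAKE_METRICS = {
    "checkout.apply_discount": {
        "p50_ms": 12.4,
        "p99_ms": 310.0,
        "error_rate": 0.004,
        "change_since_last_deploy": "+30% latency",
    },
}

@mcp.tool()
def get_function_metrics(function_name: str) -> dict:
    """Return latency, error rate, and deploy-over-deploy change for a function."""
    return FAKE_METRICS.get(function_name, {"error": "no data for this function"})

if __name__ == "__main__":
    mcp.run()
```

A coding agent configured against such a server could answer "why is this endpoint slow?" by calling the tool and reasoning over the structured response instead of parsing raw logs.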

Monday.com engineers now query production behavior directly within Cursor.

"I can just ask Cursor a question: Hey, why is this endpoint slow?" Eilon said. "When it uses the Hud MCP, I get all of the granular metrics, and this function is 30% slower since this deployment. Then I can also find the root cause." This changes the incident response workflow. Instead of starting in Datadog and drilling down through layers, engineers start by asking an AI agent to diagnose the issue.

The agent has immediate access to function-level production data.

From voodoo incidents to minutes-long fixes

The shift from theoretical capability to practical impact becomes clear in how engineering teams actually use runtime sensors. What used to take hours or days of detective work now resolves in minutes. "I'm used to having these voodoo incidents where there is a CPU spike and you don't know where it came from," Eilon said.

"A few years ago, I had such an incident and I had to build my own tool that takes the CPU profile and the memory dump. Now I just have all of the function data and I've seen engineers just solve it so fast." At Drata, the quantified impact is dramatic. The company built an internal /triage command that support engineers run within their AI assistants to instantly identify root causes. Manual triage work dropped from approximately 3 hours per day to under 10 minutes.

Mean time to resolution improved by approximately 70%. The team also generates a daily "Heads Up" report of quick-win errors. Because the root cause is already captured, developers can fix these issues in minutes. Support engineers now perform forensic diagnosis that previously required a senior developer.

Ticket throughput increased without expanding the L2 team.

Where this technology fits

Runtime sensors occupy a distinct space from traditional APMs, which excel at service-level monitoring but struggle to deliver granular, function-level data cost-effectively. They also differ from error monitors, which capture exceptions without business context. The technical requirements for supporting AI coding agents differ from those of human-facing observability.

Agents need structured, function-level data they can reason over. They can't parse and correlate raw logs the way humans do. Traditional observability also assumes you can predict what you'll need to debug and instrument accordingly. That approach breaks down with AI-generated code where engineers may not deeply understand every function.
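To illustrate the difference in concrete terms, the snippet below contrasts a raw log line a human might grep for with a structured, function-level record that an agent can reason over directly; the schema, field names, and values are invented for illustration.

```python
# A raw log line requires parsing and cross-tool correlation before it is useful.
raw_log = '2024-06-03T14:22:11Z ERROR checkout worker-7 "discount calc failed" dur=1843ms'

# A structured function-level record (hypothetical schema) already carries the
# context an agent needs: the function, its baseline, and the captured query.
structured_record = {
    "function": "checkout.apply_discount",
    "deployment": "2024-06-03-rc2",
    "duration_ms": 1843,
    "baseline_p99_ms": 310,
    "error": "discount calc failed",
    "captured_query": "SELECT id, rate FROM discounts WHERE code = :code",
    "callers": ["checkout.create_order"],
}
```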

"I think we're entering a new age of AI-generated code and this puzzle, this jigsaw puzzle of a new stack emerging," Adler said. "I just don't think that the cloud computing observability stack is going to fit neatly into how the future looks like." What this means for enterprises For organizations already using AI coding assistants like GitHub Copilot or Cursor, runtime intelligence provides a safety layer for production deployments.

The technology enables what Monday.com calls "agentic investigation" rather than manual tool-hopping. The broader implication relates to trust.

"With AI-generated code, we are getting much more AI-generated code, and engineers start not knowing all of the code," Eilon said. Runtime sensors bridge that knowledge gap by providing production context directly in the IDE where code is written. For enterprises looking to scale AI code generation beyond pilots, runtime intelligence addresses a fundamental problem. AI agents generate code based on assumptions about system behavior.

Production environments are complex and surprising. Function-level behavioral data captured automatically from production gives agents the context they need to generate reliable code at scale. Organizations should evaluate whether their existing observability stack can cost-effectively provide the granularity AI agents require. If achieving function-level visibility requires dramatically increasing ingestion costs or manual instrumentation, runtime sensors may offer a more sustainable architecture for AI-accelerated development workflows already emerging across the industry.

Tags

#AI
