We solved the "write code" problem with conversational AI. The "understand systems" problem is still stuck in polyglot queries and manually constructed narratives.
At Resolve AI, we are closing that gap by making debugging conversational, something we like to call "Vibe Debugging." It collapses the entire hypothesis -> evidence -> validation loop into a conversation.
Look at this flow as an example: https://resolve.navattic.com/uuh07j2
Along the way, we realized that logs are the most valuable evidence in Vibe Debugging, yet the most challenging to navigate.
In conversational debugging, logs carry the highest value because they contain the ground truth. Metrics tell you what happened (latency increased), traces show where (a bottleneck in service X), but logs explain why (the connection pool was exhausted). They're how engineers leave debugging breadcrumbs and where the actual failure reasons live. To work through the example above manually, you would have to translate your question into backend-specific queries, correlate evidence across services, and assemble the timeline yourself, exactly the problems we walk through below.
Building AI agents for log investigation isn't just "ChatGPT for logs." Logs are fundamentally unstructured. Unlike metrics (time series) or traces (structured events), logs are free-form text with infinite variety. Every service logs differently. This creates a paradox: the most valuable debugging information is trapped in the least structured format.
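To make "every service logs differently" concrete, here are three invented lines (hypothetical services and field names, for illustration only) describing the same failed request:

```python
# Hypothetical examples: the same failed request recorded by three services,
# each with its own format and field names.
raw_logs = [
    # Service A: structured JSON with its own field names
    '{"ts": "2024-05-01T14:23:15Z", "severity": "ERROR", "svc": "payment", "msg": "connection timeout to auth-db"}',
    # Service B: logfmt-style key=value pairs
    'time=2024-05-01T14:23:15Z level=error app=payments message="auth-db timeout after 5000ms"',
    # Service C: free-form text from a framework logger
    "[2024-05-01 14:23:15] payment-service ERROR -- Timed out waiting for auth-db connection",
]
```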
Making logs conversational requires solving problems that traditional log analysis tools sidestep entirely.
You need to translate intent into queries: every log investigation starts with a human question, "Why did checkout break?", but to answer it you have to become a translator:
- Datadog: `service:payment-service status:error @timestamp:[now-1h TO now]`
- Loki: `{app="payment"} |= "error" | json | __error__=""`
- Splunk: `index=production source=payment earliest=-1h@h | search level=ERROR`

You're fighting semantic differences. Is it `service.name` or `app` or `component`? Does "error" mean log level, HTTP status, or exception presence?
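A minimal sketch of that translation step, not a real Resolve AI component: the same intent, "errors from the payment service in the last hour," rendered per backend. The field mappings (`service:`, `app=`, `source=`) are assumptions about how one particular deployment names things.

```python
# Minimal sketch, not a real Resolve AI component: translating one debugging
# intent into three backend-specific queries. The field mappings are assumed.

INTENT = {"service": "payment", "level": "error", "window": "1h"}


def to_datadog(i: dict) -> str:
    # This deployment happens to tag the workload "payment-service", not "payment".
    return f'service:{i["service"]}-service status:{i["level"]} @timestamp:[now-{i["window"]} TO now]'


def to_loki(i: dict) -> str:
    # Loki selects the stream by its `app` label, then filters the line text.
    return f'{{app="{i["service"]}"}} |= "{i["level"]}" | json | __error__=""'


def to_splunk(i: dict) -> str:
    # Splunk scopes by index/source and filters on an extracted level field.
    return f'index=production source={i["service"]} earliest=-{i["window"]}@h | search level={i["level"].upper()}'


for translate in (to_datadog, to_loki, to_splunk):
    print(translate(INTENT))
```

Even this toy version has to hard-code per-backend naming conventions; an agent has to learn those mappings for your stack.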
Querying logs is not as simple as dumping them into an LLM's context. To understand the scale of this challenge, consider that even Text2SQL (converting natural language to database queries) remains largely unsolved despite years of research: the best GPT-4 systems today achieve only ~60% accuracy¹ on structured database queries with well-defined schemas. Agentic systems that add iteration and error correction on top can require up to 10 attempts² per query and still struggle with complex joins.
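That iteration cost shows up naturally in an agent loop: generate a candidate query, run it, and feed errors or empty results back for another attempt. A minimal sketch, where `generate_query` and `run_query` are hypothetical stand-ins for the model call and the log backend:

```python
# Minimal sketch of an iterative query-refinement loop, in the spirit of the
# agentic Text2SQL systems cited above. `generate_query` and `run_query` are
# hypothetical stand-ins, not real Resolve AI APIs.

MAX_ATTEMPTS = 10  # the ceiling reported for agentic Text2SQL refinement


def investigate(question: str, generate_query, run_query) -> list:
    feedback = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        query = generate_query(question, feedback=feedback)
        try:
            results = run_query(query)
        except Exception as err:  # syntax error, unknown field, timeout, ...
            feedback = f"attempt {attempt} failed: {err}"
            continue
        if results:
            return results
        # Empty results are also a signal: maybe the wrong field or time window.
        feedback = f"attempt {attempt} returned nothing for query: {query}"
    return []
```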
Real incidents span services and time. When you ask "Why did checkout break?", you are correlating error patterns across multiple services and building temporal causality. Consider this debugging scenario:
```
14:23:15 payment-service: ERROR Connection timeout to auth-db
14:23:15 auth-service: INFO Processing token validation
14:23:16 payment-service: WARN Retrying connection to auth-db
14:23:17 database-pool: ERROR Max connections reached (100/100)
14:23:18 payment-service: ERROR Transaction failed: unable to validate auth
```
You immediately see the story: the auth service is overwhelming the database, causing payment failures. But extracting that narrative requires correlating events across services, ordering them in time, and linking the pool exhaustion to the downstream failure, as in the sketch below.
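A toy version of that correlation step, assuming the tidy format of the lines above (real log pipelines face far messier input):

```python
# Sketch: turn the raw lines above into an ordered, cross-service timeline and
# surface earlier errors as candidate causes. The parsing matches the tidy
# example format; real pipelines are far messier.
from datetime import datetime

LINES = [
    "14:23:15 payment-service: ERROR Connection timeout to auth-db",
    "14:23:15 auth-service: INFO Processing token validation",
    "14:23:16 payment-service: WARN Retrying connection to auth-db",
    "14:23:17 database-pool: ERROR Max connections reached (100/100)",
    "14:23:18 payment-service: ERROR Transaction failed: unable to validate auth",
]


def parse(line: str) -> dict:
    ts, service, rest = line.split(" ", 2)
    level, message = rest.split(" ", 1)
    return {
        "time": datetime.strptime(ts, "%H:%M:%S").time(),
        "service": service.rstrip(":"),
        "level": level,
        "message": message,
    }


events = sorted((parse(line) for line in LINES), key=lambda e: e["time"])

# Naive correlation: every ERROR that precedes the final failure is a candidate
# cause worth surfacing -- here, the auth-db timeout and the pool exhaustion.
failure = next(e for e in reversed(events) if "Transaction failed" in e["message"])
for e in events:
    if e["level"] == "ERROR" and e["time"] < failure["time"]:
        print(f'{e["time"]} {e["service"]}: {e["message"]}')
```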
At Resolve AI, we are building a multi-agent system that provides an interface to your production systems. The goal is to reimagine how we, as engineers, interact with our complex production systems.
![][image1]
Our multi-agent system is built from four kinds of agents:
- Knowledge agents learn how your production systems operate. As part of their objective, they combine structured metrics, semi-structured traces, unstructured logs, and even visual dashboards into a coherent understanding.
- Reasoning agents learn your specific system's patterns, not just general debugging heuristics. When you ask a question, they pursue multiple hypotheses and correlate evidence from the right data sources – showing the evidence trail so you can validate conclusions and learn from investigations.
- Action agents can understand and operate your stack based on the conversation you are having with Resolve AI. They can interpret Grafana dashboards, follow your conventions to generate code suggestions, Git commits, or PRs in Cursor.
- Learning agents constantly learn from every interaction, investigation, and outcome to build institutional memory. Resolve AI is designed to continuously evolve as you use it. The sketch below shows one way these roles could fit together.
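A purely conceptual sketch, not Resolve AI's actual architecture, of how four such agent roles could be composed into a single conversational turn:

```python
# Conceptual sketch only -- not Resolve AI's implementation. It shows one way
# the four agent roles described above could be composed for a single turn.
from typing import Protocol


class Agent(Protocol):
    def handle(self, question: str, context: dict) -> dict: ...


class Orchestrator:
    """Routes a question through knowledge, reasoning, action, and learning
    agents, accumulating shared context along the way."""

    def __init__(self, knowledge: Agent, reasoning: Agent,
                 action: Agent, learning: Agent) -> None:
        self.stages = [knowledge, reasoning, action, learning]

    def ask(self, question: str) -> dict:
        context: dict = {"question": question}
        for agent in self.stages:
            # Each agent enriches the shared context: system understanding,
            # hypotheses and evidence, proposed actions, and finally the
            # lessons to retain for future investigations.
            context.update(agent.handle(question, context))
        return context
```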
If building multi-agent systems that can reason about distributed infrastructure at scale excites you, we'd love to talk. We're looking for engineers who want to define what the next generation of agentic AI looks like for software engineering.
Resolve AI is the agentic AI company for software engineering, founded by the co-creators of OpenTelemetry. We combine deep expertise in building developer tools and observability with state-of-the-art agentic AI; our mission is to increase engineering velocity by transforming the way engineers build, deploy, and maintain real-world software systems.
With Resolve AI, customers like Datastax, Tubi, and Rappi have increased engineering velocity and systems reliability by putting machines on-call for humans and letting engineers just code. Interested in learning more about our agentic AI approach to production systems? Say hello.
¹ DIN-SQL + GPT-4 achieves 60.0% exact set match accuracy on the Spider benchmark (Yale Spider Leaderboard, 2024). While execution accuracy can reach 85%+, exact match—getting the SQL query precisely right—remains challenging even for state-of-the-art models on structured database queries.
² Agentic Text2SQL systems can require "up to 10 iterations to refine a SQL query" with multi-stage workflows involving schema linking, candidate generation, self-correction, and evaluation (Hexgen-Text2SQL research, 2024). This iterative approach increases computational complexity while still struggling with accuracy on complex queries.