🚀 Agentic AI: Making OpenSearch Observability Accessible to All 🤖
Hey everyone! 👋 Kyle here from AWS, and I’m incredibly excited to dive into a project that’s changing how we think about observability within the OpenSearch ecosystem. We’re talking about agentic AI and how it can make complex data analysis accessible to users of all skill levels. Let’s explore how we’re tackling PromQL complexity and working toward a future where data insights are truly democratized.
💡 The Problem: PromQL and the Need for Democratization
Let’s be honest: PromQL can be intimidating. 🤯 For many organizations using OpenSearch, a project with a community of over 3,000 contributors, the power of Prometheus metrics is often locked behind a steep learning curve. We heard directly from the audience that nearly half of them are responsible for building dashboards to explain metrics to non-technical stakeholders. That’s a huge responsibility, and the learning curve is a significant barrier to using the wealth of data OpenSearch collects, from logs and traces to the vector embeddings that power AI applications. Translating business questions into effective PromQL queries is hard, and it keeps many users from fully understanding their systems.
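To make that gap concrete, here is a minimal sketch of what answering a single business question can involve. The metric name `http_requests_total`, the `service` and `status` labels, and the `localhost:9090` address are assumptions for illustration; only the `/api/v1/query` instant-query endpoint comes from the Prometheus HTTP API.

```python
from urllib.parse import urlencode


def error_ratio_query(service: str, window: str = "5m") -> str:
    """Build a PromQL expression for the share of requests that returned 5xx.

    Assumes a counter named http_requests_total with `service` and `status`
    labels (hypothetical names; adjust to your own metrics).
    """
    errors = f'sum(rate(http_requests_total{{service="{service}",status=~"5.."}}[{window}]))'
    total = f'sum(rate(http_requests_total{{service="{service}"}}[{window}]))'
    return f"{errors} / {total}"


def instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# The question a stakeholder asks versus the query someone has to write.
question = "What share of checkout requests failed in the last five minutes?"
promql = error_ratio_query("checkout")
url = instant_query_url("http://localhost:9090", promql)

print(question)
print(promql)
print(url)
```

Even this small example needs knowledge of counters, `rate()`, range vectors, and label matchers, which is exactly the expertise a dashboard author cannot assume their stakeholders have.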
🛠️ OpenSearch: A Powerful Foundation
OpenSearch itself is a robust, distributed search and analytics engine. It’s built on four core components:
- OpenSearch Core: The heart of the search engine.
- Data Prepper: Handles the ingestion of data from various sources.
- Vector Engine: Enables powerful AI features like semantic search.
- OpenSearch Dashboards: Our open-source data visualization platform. It currently offers querying, visualization, and alerting, and connects to external sources via Glue Data Catalog.
📈 Native Prometheus Support: A Major Step Forward
We’ve reached a significant milestone: the backend plugin implementation for Prometheus support is now merged! 🎉 The front-end experience is still under development and slated for a future release. In the meantime, we’re already demonstrating how to visualize and explore Prometheus metrics without relying on plugins, thanks to agentic AI.
✨ Agentic AI: Guided Exploration with Intelligent Agents
This is where things get really interesting. We’re introducing an experimental approach to observability driven by AI agents: think of them as intelligent assistants that can proactively investigate data. We’re using tools like the Amazon Q CLI and Anthropic’s Claude 4 to explore natural-language-to-PromQL translation. These agents are given limited access to telemetry data and guided through investigations, mimicking the workflow of an on-call operator.
Crucially, we’re using knowledge bases, collections of specialized documentation and runbooks, to give the LLM the context it needs to avoid “hallucinations” (making things up!) and deliver accurate results. We saw a stark contrast between querying with OpenSearch’s Piped Processing Language (PPL) and querying Prometheus: the LLM struggled with PPL when it had no prior knowledge of the language, which highlights the importance of a robust knowledge base.
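As a rough illustration of the knowledge-base idea, the sketch below picks the most relevant notes for a question and places them in the prompt ahead of it. Everything here is an assumption made for the example: the three notes, the naive word-overlap ranking, and the prompt wording are stand-ins, not how the agents in the talk retrieve context, and no model is actually called.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    title: str
    text: str


# Hypothetical knowledge base: hand-written notes standing in for real
# documentation and runbooks.
KNOWLEDGE_BASE = [
    Snippet("PPL basics", "PPL queries start with source=<index> and chain commands with the pipe character."),
    Snippet("PPL stats", "Use stats to aggregate in PPL, for example: stats count() by status."),
    Snippet("PromQL rate", "Use rate() on counters over a range vector such as [5m]."),
]


def tokenize(text):
    """Lowercase words with surrounding punctuation removed; drop very short words."""
    words = (raw.strip(".,:;()?").lower() for raw in text.split())
    return {w for w in words if len(w) >= 3}


def retrieve(question, k=2):
    """Rank snippets by naive word overlap with the question and keep the top k."""
    q = tokenize(question)
    scored = [(len(q & tokenize(s.title + " " + s.text)), s) for s in KNOWLEDGE_BASE]
    scored = [(score, s) for score, s in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


def build_prompt(question):
    """Put the retrieved notes in front of the question so the model can lean on them."""
    context = "\n".join(f"- {s.title}: {s.text}" for s in retrieve(question))
    return (
        "Answer using only the reference notes below.\n"
        f"Reference notes:\n{context}\n\n"
        f"Question: {question}"
    )


question = "Write a PPL query that counts log lines by status"
prompt = build_prompt(question)
print(prompt)
```

A real setup would more likely use embedding-based search over a much larger corpus, but the shape is the same: the model sees the relevant syntax notes before it sees the question.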
🤖 Key Features & Technologies Powering Agentic AI
Let’s break down the key components:
- Query Generation: Using the Amazon Q CLI and Claude 4, we’re working on translating natural language questions into PromQL queries.
- Automatic Data Exploration: Agents autonomously investigate data sources, guided by user prompts.
- Agent-User Interaction Protocol (AG-UI): A new standard for connecting agent backends to front-end UIs, ensuring a consistent experience.
- Tool Discovery & State Sharing: AG-UI facilitates the discovery of relevant tools and the sharing of investigation state, streamlining the process.
- Plug-in Agent Support: We’re enabling the use of self-hosted agents for enhanced security and compliance, alongside third-party options.
- HolmesGPT Integration: We’ve integrated HolmesGPT, an open-source observability agent, directly into OpenSearch Dashboards, showcasing its immediate capabilities.
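To show how tool discovery, limited access, and shared investigation state can fit together, here is a small sketch. It does not follow the actual protocol schema: the event dictionaries, the class names, and the `query_metrics` tool are all hypothetical, and the metrics tool returns canned data instead of querying anything.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[dict], dict]


@dataclass
class Investigation:
    """State shared between the agent backend and the UI (hypothetical shape)."""
    question: str
    findings: list = field(default_factory=list)


class AgentBackend:
    def __init__(self, tools):
        # Only tools registered here can ever be called: this is the allowlist.
        self._tools = {t.name: t for t in tools}

    def discover_tools(self):
        """What a front end could list before starting an investigation."""
        return [{"name": t.name, "description": t.description} for t in self._tools.values()]

    def call(self, state, tool_name, args):
        """Run one allowed tool and yield illustrative events for the UI."""
        if tool_name not in self._tools:
            yield {"type": "error", "message": f"unknown tool: {tool_name}"}
            return
        yield {"type": "tool_call", "tool": tool_name, "args": args}
        result = self._tools[tool_name].run(args)
        state.findings.append({"tool": tool_name, "result": result})
        yield {"type": "state_update", "findings": len(state.findings)}


def fake_metrics_query(args):
    # Stand-in for a real metrics query; always returns the same canned value.
    return {"query": args["query"], "value": 0.02}


backend = AgentBackend([Tool("query_metrics", "Run a PromQL instant query", fake_metrics_query)])
state = Investigation("Why is checkout slow?")

events = list(backend.call(state, "query_metrics", {"query": "up"}))
events += list(backend.call(state, "delete_index", {}))  # not registered, so it is rejected

for event in events:
    print(json.dumps(event))
```

Because the UI only ever sees events and a shared state object, the same front end could in principle sit on top of a self-hosted agent or a third-party one.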
🌐 Looking Ahead: A Collaborative Journey
We’re actively seeking community feedback and direction. Our future roadmap includes integrating with alerting systems (like Alertmanager) and ticketing systems to automate incident response. The ultimate goal is to empower everyone, regardless of their PromQL expertise, to unlock the full potential of their observability data.
🎯 Quantifiable Impact & Future Roadmap
- Community Size: Over 3,000 contributors, a testament to the OpenSearch community’s passion.
- Future Focus: Continued development of the front-end metrics experience, integration with Prometheus MCP servers, and expansion of agentic AI capabilities.
We believe this approach represents a significant step towards democratizing observability and unlocking the power of data for everyone. Thank you! 🙏