🚀 Level Up Your Anomaly Detection: A Smarter Approach 🤖
Hey everyone! 👋 I’m Jorge, and I’m thrilled to dive into a new strategy for anomaly detection – one that’s designed to be more robust and, frankly, smarter. Let’s explore how we’re moving beyond simple threshold-based alerts to a system that truly understands the behavior of your data. 💡
🧱 The Core Idea: Bands Around Your Metrics
The fundamental concept is simple: we create bands around your key metrics. Think of it like a safety net – we’re looking for deviations from the expected behavior. When a metric drifts outside these bands, we flag it as a potential anomaly. 🎯
📏 How It Works: A Step-by-Step Breakdown
Here’s the breakdown of the algorithm, which has evolved significantly from last year’s version:
- Median Calculation (Daily): We start by calculating the median of your metric over a 24-hour window. This is our central point – a robust measure less affected by outliers. 💾
- Median Absolute Deviation (MAD): Next, we calculate the MAD – the median of the absolute distances between the metric and its median. This gives us a robust measure of spread – how far the metric typically strays from its central value. We’re working on an experimental function for this, which should be available soon! 📡
- Band Creation: Finally, we add a constant (currently set to 2) to the MAD to get the band width, then offset the median by that amount in both directions. This creates the boundaries of our bands – a range above and below the median line. 🛠️ (A minimal Python sketch of these three steps follows the list.)
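To make the three steps concrete, here’s a small pandas sketch. Everything in it is illustrative – the names (`metric`, `compute_bands`), the use of a time-indexed series, and the exact way the constant enters the band width are my assumptions, not a spec of the production code:

```python
import pandas as pd

C = 2  # the constant added to the MAD (see step 3 above)

def compute_bands(metric: pd.Series) -> pd.DataFrame:
    """Robust bands around `metric`, which is assumed to have a DatetimeIndex."""
    # Step 1 – robust center: median over a rolling 24-hour window.
    center = metric.rolling("24h").median()
    # Step 2 – MAD: median of the absolute deviations from that center.
    # (Some pipelines use the *mean* absolute deviation instead.)
    mad = (metric - center).abs().rolling("24h").median()
    # Step 3 – band width: MAD plus the constant, as described above.
    # (A multiplicative variant, C * mad, is also common.)
    width = mad + C
    return pd.DataFrame({"center": center,
                         "upper": center + width,
                         "lower": center - width})
```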
⏳ Introducing the Concept of “Level” – Smoothing for Robustness
This is where things get really interesting. We’ve added a concept called “level,” which is calculated over a 1-hour window. 🤯 This acts as a smoothed representation of your metric, providing a more stable baseline for detecting anomalies. It’s like applying a gentle filter to reduce noise and improve accuracy.
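Here’s how that might look in code. The exact smoother isn’t pinned down above, so I’m assuming a rolling 1-hour median (a rolling mean would behave similarly but is less spike-resistant):

```python
def compute_level(metric: pd.Series) -> pd.Series:
    # Smoothed baseline: a short spike barely moves a 1-hour median,
    # while a sustained shift pulls it along.
    return metric.rolling("1h").median()
```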
💥 Comparing the New Approach to Last Year’s “Adaptive” Algorithm
Let’s see how this new approach stacks up against our previous algorithm, which we’ve now dubbed “Adaptive.”
- Short Spikes: The Adaptive algorithm immediately triggered an anomaly on short spikes, and the spike then caused the bands to widen dramatically. The widened bands suppress follow-up alerts – fewer false positives afterwards – but they can also mask genuine anomalies, leading to missed detections.
- Sustained Spikes: Conversely, the Adaptive algorithm was slow to flag sustained spikes, sometimes detecting them too late to be useful. ⏳
- The New Approach: The new algorithm ignores short spikes entirely, thanks to the smoothing effect of the “level.” It only triggers an anomaly when a sustained spike pushes the level outside the bands (see the sketch just below). This strikes a better balance between sensitivity and precision. 🎯
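Putting the pieces together – reusing the two helpers sketched above – the detection rule might look like this. The key point is that it’s the smoothed level, not the raw metric, that gets compared against the bands:

```python
def detect_anomalies(metric: pd.Series) -> pd.Series:
    bands = compute_bands(metric)
    level = compute_level(metric)
    # Anomalous only while the smoothed level sits outside the bands.
    return (level > bands["upper"]) | (level < bands["lower"])
```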
📊 Simulation Results: A Clear Difference
Let’s look at a simulated scenario (a toy reproduction in code follows the list):
- Short Spike: The Adaptive algorithm immediately flagged it, while the new algorithm completely ignored it.
- Sustained Spike: The Adaptive algorithm triggered an alert late, while the new algorithm detected it promptly and sustained the alert until the spike subsided. 🦾
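You can reproduce the qualitative behavior of the new algorithm with a toy dataset. The data and parameters here are my own (the original simulation isn’t published in this post), building on the sketches above:

```python
import numpy as np

# Three days of 1-minute samples around a baseline of 10.
idx = pd.date_range("2024-01-01", periods=3 * 24 * 60, freq="1min")
rng = np.random.default_rng(0)
metric = pd.Series(10.0 + rng.normal(0, 0.2, len(idx)), index=idx)

metric.iloc[2000:2005] += 8   # short spike: 5 minutes
metric.iloc[3000:3180] += 8   # sustained spike: 3 hours

flags = detect_anomalies(metric)
print(flags.iloc[2000:2010].any())  # False: the level ignores the short spike
print(flags.iloc[3000:3180].any())  # True: the sustained spike moves the level
```

Under these settings the 1-hour median shrugs off the 5-minute spike entirely, while the 3-hour spike drags the level past the upper band within about half an hour and keeps the alert firing until the spike subsides.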
🤔 Tradeoffs and Considerations
It’s important to remember that there’s no one-size-fits-all solution. The best approach depends on the specific characteristics of your data and your tolerance for false positives versus missed anomalies.
- Sensitivity: If you need to catch every anomaly, you might prefer the Adaptive algorithm (though be prepared for more false alarms).
- Precision: If you prioritize minimizing false positives, the new approach with the “level” is likely a better choice.
🎉 Conclusion: A Smarter Way to Detect Anomalies
This new anomaly detection strategy offers a significant improvement over previous methods. By incorporating the concept of “level” and focusing on sustained deviations, we’ve created a system that’s more robust, accurate, and adaptable to a wider range of metrics. 🌟
I hope this has been insightful! Let me know your thoughts in the comments below. 👇