Scale or Fail as Spotify's Growth Exposed the Abstraction Paradox | Stuart Clark | Conf42 SRE 2026

Presenters: Stuart Clark · Source: Conf42 SRE 2026

Scale or Fail: How Spotify Solved the Abstraction Paradox 🚀

In the high-stakes world of software engineering, we often treat abstraction as our greatest ally. We build layers to hide complexity, simplify workflows, and help our teams move faster. But what happens when those very abstractions become your biggest enemy during a 3:00 a.m. critical incident? Stuart Clark, Senior Developer Advocate at Spotify, recently shared a compelling story about how Spotify nearly fell into the abstraction trap and how they engineered their way out of it. This isn’t just a story about code; it’s about scaling operational knowledge across thousands of engineers. ...

March 19, 2026 · 4 min

SRE-Ready NDC Shopping: Caching at Scale Without Pricing Drift | Mukul Kumar Gaur | Conf42 SRE 2026

Presenters: Mukul Kumar Gaur · Source: Conf42 SRE 2026

Hello everyone, and a huge thank you for joining this insightful session with Mukul Kumar Gaur! Today, we’re diving deep into a pressing challenge many airlines and Online Booking Platforms (OBPs) face: how to scale NDC shopping traffic while keeping prices accurate and systems reliable. Get ready to discover how intelligent caching is revolutionizing airline distribution!

The NDC Challenge: Navigating a Skyrocketing Demand 💥

The world of airline distribution is evolving rapidly, moving towards modern retailing and dynamic offers. While this brings incredible flexibility, it also creates a massive headache for infrastructure. Mukul Kumar Gaur highlights a dramatic increase in shopping requests, not just in volume but also in complexity. Each request now demands significantly more computation than ever before, involving intricate calculations for fares, taxes, inventory, availability, and merchandising rules. ...

March 19, 2026 · 5 min

Strengthening Regulated Production Systems | Shruthi Sepuri | Conf42 SRE 2026

Presenters: Shruthi Sepuri · Source: Conf42 SRE 2026

Beyond Uptime: Why Your “Healthy” System Might Still Be Failing the Business 🚀

In the world of regulated enterprise platforms, we often obsess over a single version of the truth: the dashboard. If the lights are green, we breathe a sigh of relief. But what if those green lights are lying? Shruthi Sepuri, an expert in enterprise systems testing and reliability, argues that for systems making high-stakes business decisions, technical health is no longer the gold standard. A system can have 99.99% uptime and still be a catastrophic failure if the decisions it automates are wrong. ...

March 19, 2026 · 4 min

AI-Assisted Incident Response Using LLMs and MCP | Makarand Gujarathi | Conf42 SRE 2026

Presenters: Makarand Gujarathi · Source: Conf42 SRE 2026

From Chaos to Clarity: Revolutionizing Incident Response with AI and MCP 🚀

In the world of modern software, the transition from monoliths to microservices has brought unparalleled scale but also a massive headache for on-call engineers. If you have ever been paged at 2:00 a.m., you know the drill: a single user request might touch five to ten microservices, each with its own database, cache, and deployment pipeline. When things break, the noise is deafening. ...

March 19, 2026 · 4 min

AI-Governed Lakehouse Ingestion with Flink on Kubernetes | Jyothish Sreedharan | Conf42 SRE 2026

Presenters: Jyothish Sreedharan · Source: Conf42 SRE 2026

Building Resilient Data Pipelines: AI Governance Meets Stream Processing 🚀

In today’s data-driven world, organizations thrive on massive data pipelines that pull information from a multitude of sources: databases, APIs, event streams, and logs. However, these pipelines often falter due to evolving data structures (schema drift), subtle changes in meaning (semantic inconsistencies), and the sheer complexity of operations. Enter Jyothish Sreedharan, who presents an architecture that combines stream processing, semantic intelligence, and cloud-native infrastructure to create ingestion systems that are not just self-adapting and resilient, but production-ready. ...

March 19, 2026 · 7 min