Only 50% of companies monitor their ML systems. Building observability for AI is not simple: it goes beyond 200 OK pings. In this episode, Sylvain Kalache sits down with Conor Brondsdon (Galileo) to unpack why observability, monitoring, and human feedback are the missing links to make large language models (LLMs) reliable in production. Conor dives into the shift from traditional test-driven development to evaluation-driven development, where metrics like context adherence, completeness, and ac...
Frontline Reliability: Protecting User Journeys with SLOs with Shery Brauner (Razor, ex-Zalando)
Humans of Reliability
31 minutes
2 months ago
What does it really take to move from firefighting incidents to building reliability at scale? In this episode of Humans of Reliability, Shery Brauner (Razor, ex-Zalando) shares her journey from frontend and backend engineering to leading site reliability practices. She explains why protecting the user journey is the key to effective incident management, how SLOs cut through noisy alerts, and why observability must come first. Shery also talks about practical steps teams can take to ad...