All content for YAAP (Yet Another AI Podcast) is the property of AI21 and is served directly from their servers
with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Building Enterprise RAG: Lessons from 2+ Years of Production Deployments
YAAP (Yet Another AI Podcast)
37 minutes
4 months ago
<p>Building production AI systems is hard, especially when you're pioneering entirely new categories. In this episode, Yuval speaks with Guy Becker, Group Product Manager at AI21, to trace the evolution from task-specific models to agent planning and orchestration systems. Guy shares hard-won lessons from building some of the first RAG-as-a-service offerings at a time when there were no handbooks to follow.</p><p><strong>Key Topics:</strong></p><ol><li><strong>Task-specific models vs. general LLMs</strong>: Why focused, smaller models with pre- and post-processing beat general-purpose LLMs for business use cases.</li><li><strong>Building RAG before it was cool</strong>: Creating one of the first RAG-as-a-service platforms in early 2023 without any established patterns.</li><li><strong>The one-size-fits-all problem</strong>: Why chunking strategies, embedding models, and retrieval parameters need customization per use case.</li><li><strong>From SaaS to on-prem</strong>: Scaling deployment models for enterprise customers with sensitive data.</li><li><strong>When RAG breaks down</strong>: Multi-hop queries, metadata filtering, and why semantic search isn't always enough.</li><li><strong>Multi-agent orchestration</strong>: How AI21 Maestro uses automated planning to break complex queries into parallelizable subtasks.</li><li><strong>Production lessons</strong>: Evaluation strategies, quality guarantees, and building explainable AI systems for enterprise.</li></ol>