
This academic paper from Meta Platforms introduces **AdLlama**, a large language model (LLM) designed to improve generative advertising text on Facebook. The core innovation is **Reinforcement Learning with Performance Feedback (RLPF)**, a post-training method that uses historical ad performance data, specifically click-through rate (CTR), as the reward signal for fine-tuning the LLM. Unlike traditional methods that rely on human preferences, RLPF optimizes for measurable real-world outcomes. In a large-scale A/B test involving nearly 35,000 advertisers, AdLlama significantly improved advertiser-level CTR by 6.7% and increased the number of ad variations created, showing the tangible economic impact of this reinforcement learning approach.
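To make the idea of CTR as a reward signal more concrete, here is a minimal Python sketch of one way historical performance data could be turned into training signal: compute each variation's CTR and pair up variations of the same ad so that the higher-CTR text is preferred. The pairing scheme, the `AdVariation` record, the `preference_pairs` helper, and the `min_impressions` threshold are all illustrative assumptions, not details stated in the summary above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class AdVariation:
    """Hypothetical record of one ad text variation and its logged performance."""
    ad_id: str
    text: str
    impressions: int
    clicks: int


def ctr(v: AdVariation) -> float:
    """Click-through rate: clicks / impressions (0.0 if the ad was never shown)."""
    return v.clicks / v.impressions if v.impressions else 0.0


def preference_pairs(variations, min_impressions=100):
    """Build (preferred_text, rejected_text) pairs among variations of the same ad.

    Assumption: variations with fewer than `min_impressions` impressions are
    dropped because their CTR estimate is too noisy to compare.
    """
    by_ad = {}
    for v in variations:
        if v.impressions >= min_impressions:
            by_ad.setdefault(v.ad_id, []).append(v)

    pairs = []
    for group in by_ad.values():
        for a, b in combinations(group, 2):
            if ctr(a) == ctr(b):
                continue  # a tie carries no preference information
            winner, loser = (a, b) if ctr(a) > ctr(b) else (b, a)
            pairs.append((winner.text, loser.text))
    return pairs


if __name__ == "__main__":
    logged = [
        AdVariation("ad1", "Buy now", 1000, 30),
        AdVariation("ad1", "Shop today", 1000, 45),
        AdVariation("ad2", "Low data", 10, 5),
    ]
    for preferred, rejected in preference_pairs(logged):
        print(f"preferred={preferred!r} rejected={rejected!r}")
```

Pairs like these could feed a reward model that scores candidate ad text, which a reinforcement learning step would then optimize against. Comparing only variations of the same ad is a design choice in this sketch, meant to hold the product and audience roughly constant.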