Safety, Evaluation, and Reasoning

Today in arXiv AI

Scot Bearss

7 episodes

3 days ago

Today in arXiv AI is your daily deep dive into the cutting edge of artificial intelligence. Every morning, we unpack the latest breakthroughs in LLM architectures, agentic AI, multimodal models, scaling strategies, safety research, and more, mixing expert analysis, lively debate, and real‑world use cases. Whether you’re an AI practitioner, a tech leader, or just curious about what’s next, we break down complex papers (and what they mean for you) into a fast‑paced, two‑host conversation you’ll actually enjoy. I am an independent creator and not affiliated with arXiv. Sources are linked in the episode descriptions.

Technology

All content for Today in arXiv AI is the property of Scot Bearss and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.

54 minutes 59 seconds

3 months ago

Generated with Google NotebookLM.

In this episode, we dive into the cutting edge of Large Language Models (LLMs): their promise, their pitfalls, and the novel techniques reshaping how they’re used in the wild.

We unpack a wide spectrum of advancements:

Privacy Risks in Fine-Tuning: How LoRA-adapted models are vulnerable to Membership Inference Attacks, and what defenses like dropout and differential privacy can do about it.
Auto-Grading Physics Exams: Meet AlphaPhysics, a hybrid system that uses LLMs, computer algebra, and term rewriting to accurately grade even complex equations.
LLMs as Critics: Explore CLEAR, a tool that uses LLMs themselves to identify recurring reasoning failures in math and retrieval-augmented generation (RAG).
Reasoning in Arabic Tables: Enter AraTable, a benchmark for Arabic tabular understanding, with an Assisted Self-Deliberation (ASD) framework pushing multilingual evaluation forward.
Smarter Code Reviews: Discover how symbolic reasoning enhances LLMs' ability to flag subtle code defects beyond current semantic techniques.
Interpretability Research: Learn how Sparse Autoencoders and wMPPC are being used to analyze how language and vision models share core internal concepts.
AI Safety at Scale: Dive into SafeWork-R1, a safety-aligned model trained using the SafeLadder framework and multiple ethical verifiers to steer behavior.
Domain-Specific Data Synthesis: See how AQuilt generates instruction-tuning data for legal and medical domains through embedded logic and self-inspection, cutting cost while raising relevance.

If you're tracking where AI is going next, this episode is your briefing on the research shaping the next generation of intelligent systems.

Sources:

https://arxiv.org/pdf/2507.18584v1.pdf

https://arxiv.org/pdf/2507.18576v1.pdf

https://arxiv.org/pdf/2507.18512v1.pdf

https://arxiv.org/pdf/2507.18476v1.pdf

https://arxiv.org/pdf/2507.18442v1.pdf

https://arxiv.org/pdf/2507.18392v1.pdf

https://arxiv.org/pdf/2507.18391v1.pdf

https://arxiv.org/pdf/2507.18337v1.pdf

https://arxiv.org/pdf/2507.18302v1.pdf