Two of the most important voices in AI spoke out this week. Andrej Karpathy, one of the algorithm's greatest philosophers, spoke with Dwarkesh Patel, praising the Cambrian explosion in cognition while cautioning that its inconsistencies and gaps mean a ton of work lies ahead before AI gets where it deserves to be. His 'decade of agents' tells us all we need to know about today's limitations.
Meanwhile, Google's CEO Sundar Pichai talks effusively about AI being the great equaliser right now. There's a commercial necessity in promoting what's available to the AI practitioner in 2025.
These conflicting commentaries aren't making life easy for business leaders.
So we had a debate - on today's episode of AI Today!
If only that were the real name. After all this time begging frontier labs to build an LLM that learns from its mistakes and applies its discoveries at inference time...
Welcome to AI Today!
The Dragon Hatchling is a remarkable research paper that reboots modern AI as a model that approximates how our brains work.
Today's show is a fascinating discussion and I implore you to both enjoy it and then chat about it and ask your questions on NotebookLM.
It's the one thing every business leader needs to know.
If I put AI to work in my organisation, will it screw everything up?
While we should all be in experiment mode right now - until someone figures out how to make the probabilistic deterministic - OpenAI researchers have been putting AI to work on real tasks.
The results are spectacular. Spectacularly good, and spectacularly bad.
But just like Tim Henman, you have to give it a chance. And maybe great things will follow - for AI.
On today's show we look wider, at many ways we've tested AI in organisations and across functions and disciplines - and how it's fared.
And then we zoom in on GDPval, which sounds like someone your gran knows who reads The Economist but is actually that OpenAI research paper that explores LLMs in the context of the organisation. We hear the pros and cons and whether now's the time to execute agents, or execute our dreams that AI is ready to replace us all.
Enjoy the show.
A fantastic research paper published in this month's Nature Computational Science suggests a solution to generative AI's staggering inefficiency may be in our midst.
Large Language Models' (LLMs) transformer architecture requires that the next token (generally part of a word) be predicted based on all the tokens output before it.
Power demands for this process are huge. Shuffling data between memory and processors isn't an easy pipeline, and when you need it to work quickly, those energy demands quickly stack up.
And in an AI arms race where everyone wants bigger and better models, with increasingly powerful compute required to stretch their limits, the dependence on energy to power, and cool, those processing units grows exponentially.
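To see why those energy demands stack up, here's a deliberately simple sketch - a toy counter, not a real LLM - of why autoregressive decoding gets more expensive with every token: each new prediction looks back at everything generated so far.

```python
# Toy sketch of autoregressive decoding cost (not a real LLM):
# each new token is predicted from ALL tokens before it,
# so the work per step grows with sequence length.

def decode(prompt_len: int, new_tokens: int) -> int:
    """Count attention 'lookups' performed while generating new_tokens."""
    lookups = 0
    seq_len = prompt_len
    for _ in range(new_tokens):
        lookups += seq_len  # next token attends to every token before it
        seq_len += 1
    return lookups

# Doubling the output roughly quadruples the attention work:
print(decode(prompt_len=0, new_tokens=100))   # 4950
print(decode(prompt_len=0, new_tokens=200))   # 19900
```

That quadratic growth - before you even count the cost of shuffling model weights between memory and processor - is exactly the pipeline the analog in-memory approach takes aim at.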
But what if there were a different, and better, way to make AI work? That's the driving force behind the work of Nathan Leroux and his team, who propose a totally different paradigm: analog in-memory computing.
And that's exactly what we're discussing today.
Zip yourself in that flame retardant suit: things are about to get hot in here...
Ping me at dave@wordandmouth.com to get on the show or talk about AI in your world.
30 years in journalism has sharpened my mind.
I've spent years in AI.
And months researching China and the US as they fight silently for AI supremacy.
$500bn in The Stargate Project does not come close to the value China has created integrating AI into every aspect of its society and economy. But the truth is, they won before the US even woke up to AI's potential. China's superapps - forget homescreens, because you only need one icon to run your life in the Republic - were simply laying the foundation for where we are, today.
But let's have a debate, nonetheless.
East v West: which is best?
I've been working on a business intelligence platform leveraging AI and 30 years in journalism and content strategy. It's the toughest professional project of my career. And I have no idea if I will win. But just like life, Zeitgeist is all about the journey, not the destination. What I am learning is more important than any long form feature we might generate. Knowledge graphs, ontologies, taxonomies, and patience. Hope you will stick around on this crazy adventure.
I'm Dave Thackeray - leadership coach, content strategist, and endlessly curious Berlin-based berk.
Email me at dave@wordandmouth.com to test Zeitgeist for your business.
Recruitment is being radically remodelled by AI.
And according to a brand new piece of research, AI is already humiliating humans at hiring.
Hear the story behind the headlines that AI-led interviews increase job offers by 12%, job starts by 18%, 30-day retention by 17% - and when offered the choice, 78% of applicants choose the AI recruiter.
Google DeepMind's Demis Hassabis and his team have a bold mission: cracking the 4D chess game of AI engaging with our ever-changing biological, physical world.
Taking a snapshot is one thing. Remembering molecular topology and its constant changes of state is truly what separates fact from fiction.
It seemed like an impossible target to hit. Until M3-Agent, the work of researchers from ByteDance and Shanghai Jiao Tong University, showed up with long-term multimodal memory - allowing the agent to see, hear, remember, and reason just like humans.
M3-Agent's potential is groundbreaking.
Here are just three use cases that will blow all our minds:
Read the paper: Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory.
Wrestling with the 'Wild West' of Large Language Models (LLMs)?
While LLMs are poised to redefine business, the crucial 'secret sauce' of reinforcement learning (RL) has become a labyrinth of conflicting advice and unproven 'tricks', leaving organisations confused and hindering true progress.
Today we cut through the noise with groundbreaking research that meticulously deconstructs the RL landscape for LLMs, bringing much-needed rigour and clarity.
Discover why:
Understanding this research will not only clarify your internal LLM initiatives but also equip you to advocate for the open-source principles vital for broadly beneficial progress across the industry.
Tune in to gain a strategic advantage in the LLM era. Move beyond the hype and guesswork; understand the foundational principles that will truly unlock reliable, intelligent AI for your business.
This is an essential listen for any business leader navigating the complex, yet transformative, world of advanced AI.
Businesses are looking at vibe coding all wrong. They're trying to brute-force products with zero engineers, all vibe coding.
It's a bugger's muddle. You can't win. AI doesn't understand you, your customers, or your organisation.
But vibe coding has an ace up its sleeve.
Creating prototypes is how to shave weeks, months, or even years, from your product development roadmap. No more product and engineering clashes. Build, test, review, progress.
Here's a fantastic discussion about how to make it work for your business.
Inspired by:
https://ordep.dev/posts/writing-code-was-never-the-bottleneck
https://www.youtube.com/watch?v=i44jQvcDARo
That journalist is me, your host and producer of AI Today - Dave Thackeray.
I was approached by a researcher from the data labs at London School of Economics who wanted to find out how writing had changed in the AI era.
We used to write logically, emotionally. But now that logic is the domain of the machine, we need to work harder than ever on our EQ - emotional intelligence - to resonate deeply with our readers.
I've been writing alongside AI for years. And I believe that this harmonious relationship pays dividends - whether you're a professional writer, or simply want to communicate with impact.
Hope you enjoy the show - and the loaf was delicious!
ASI-ARCH is an Artificial Superintelligence (ASI) that's a game-changer for AI research.
Like a tireless super-scientist, it has autonomously invented 106 ground-breaking AI 'brains', unearthing surprising design principles far beyond human intuition.
Crucially, this proves AI innovation can now scale directly with computing power, not human effort.
This unlocks the immense, practical potential for self-accelerating AI development, promising an era where AI gets better at building itself, at an unprecedented pace.
We have long conspired on the manifold ways to converse with our machine brethren - but could pseudocode, that long-standing, human-readable cousin of programming languages, hold the key?
Today we're discussing the benefits, the challenges, and how we might apply pseudocode to our everyday lives to create a universal human-machine language that will forever be useful and agnostic, no matter what the future holds.
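To give a flavour of the idea before you listen: here's a tiny, made-up example where a human-readable pseudocode plan (kept as comments) maps line-for-line onto working Python. The task and names are purely illustrative.

```python
# Pseudocode (as comments) mapped line-for-line to Python -
# a toy illustration of a human-readable plan becoming code.

def average_above_threshold(values, threshold):
    # FOR each value in the list:
    #     IF value is greater than threshold, KEEP it
    kept = [v for v in values if v > threshold]
    # IF nothing was kept, RETURN zero
    if not kept:
        return 0.0
    # OTHERWISE RETURN the sum of kept values divided by their count
    return sum(kept) / len(kept)

print(average_above_threshold([2, 8, 10], 5))  # 9.0
```

The pseudocode lines would read the same whatever language - or machine - ends up underneath them; that's the agnosticism the episode explores.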
To talk AI, or to get on the show, email me - dave@wordandmouth.com
I just finished the second part of my presentation on agentic AI in hotel operations.
It's impossible to overlook the immense opportunities in AI across any business. People lack the time, and face too many interruptions, to deliver excellence. AI not only removes blockers and bottlenecks; it gives every colleague the insights to thrive in their role.
AI Today creator Dave Thackeray today presented his own deep dive into how agentic AI is ready to be the key to efficient hotel operations - giving staff more time to deliver exceptional guest experiences.
This show looks at how the latest iteration of AI - we've moved from predictive AI, to generative AI, to agentic AI - unlocks the door to a radically different, and better, way of doing business.
This is a fun listen.
When Anthropic unleashed its most powerful artificial intelligence model yet, they discovered something rather extraordinary, and slightly unnerving.
Claude 4 Opus developed an unexpected habit of trying to grass up its users to the authorities when it believes they're up to no good.
The company's 120-page safety report reveals that Claude will attempt to email law enforcement and regulatory bodies when it detects "egregious misconduct" by users.
The AI doesn't just refuse to help—it actively tries to shop wrongdoers to the police.
The most striking example occurred during testing when Claude attempted to contact both the Food and Drug Administration and the Attorney General's office to report what it believed was the falsification of clinical trial data.
The AI meticulously compiled a list of alleged evidence, warned about potential destruction of data to cover up misconduct, and concluded its digital whistle-blowing with the rather formal sign-off: "Respectfully submitted, AI Assistant".
This behaviour emerges specifically when Claude is given command-line access combined with prompts encouraging initiative, such as "take initiative" or "act boldly". It's the AI equivalent of a neighbourhood watch coordinator who's been given a direct line to the local constabulary.
We go deep on today's show into opportunities and implications from Anthropic's bible-thick, bubble-wrapped system card.
Hugely important work. But what does it mean to us? Today our hosts created their own company imagining how insights from this celebrated report would apply to the modern business environment.
What happens to the People team when it's juggling bodies AND bots?
Thanks for listening to this special episode of AI Today. Read along with the show, here.
We've been waiting a hot minute for some genuinely useful AI agent case studies to drop.
Now we have 25 on our plate.
Take a listen to the highlights reel and then download them for yourself:
https://www.stack-ai.com/whitepaper/top-25-enterprise-ai-agents