As we close out Season 2 and our emphasis on LLMs, we had the distinct privilege of chatting with Dr. Elizabeth Garrison. She is one of the few people in the world with domain expertise spanning behavior analysis (BCBA) and artificial intelligence (PhD).
In this episode, we reflect on the state of AI research and industry work before and after the release of ChatGPT, the shift in academic AI research once the transformer architecture became broadly available, and the differences between academia and industry in both behavior science and AI.
"Bubbles" are an economic phenomenon characterized by a rapid increase in asset prices that far exceed the asset's underlying fundamental value, driven by speculative buying and herd behavior rather than intrinsic worth.
In this episode, Jake and David ask, "Are we in an AI bubble?" And, if so, what might this mean for both individuals and organizations as they navigate the current AI strategic landscape?
In this episode, Jake and David discuss the burgeoning area of research looking at how interacting with LLMs affects our skills and abilities, for better and for worse. As with most things in life, the effects are not black-and-white. And, we discuss strategies and tactics we can all use to try to get the benefits without the drawbacks.
Conversations around AI ethics often focus on a suite of incredibly important topics such as data security and privacy, model bias, model transparency, and explainability. However, each time we use large AI models (e.g., diffusion models, LLMs), we reinforce a host of additional, potentially unethical practices required to build and maintain these systems.
In this episode, Jake and David discuss some of these unsavory topics, such as human labor costs and environmental impact. Although it's a bit of a downer, it's crucial for each of us to acknowledge how our behavior impacts the larger ecosystem and recognize our role in perpetuating these practices.
"Explainable AI", aka XAI, refers to a suite of techniques to help AI system developers and AI system users understand why inputs to the system resulted in the observed outputs.
Industries such as healthcare, education, and finance require that any system using mathematical models or algorithms to influence the lives of others be transparent and explainable.
In this episode, Jake and David review what XAI is, classical techniques in XAI, and the burgeoning area of XAI techniques specific to LLM-driven systems.
Prompt engineering involves a lot more than simply getting smarter about how you structure the prompts you type into an LLM's browser interface.
Furthermore, a growing body of peer-reviewed research provides us with best practices to improve the accuracy and reliability of LLM outputs for the specific tasks we build systems around.
In this episode, Jake and David review evidence-based best practices for prompt engineering and, importantly, highlight what proper prompt engineering actually requires, and why most of us likely cannot call ourselves prompt engineers.
Lots of people like to talk about the importance of prompts, context, and what is sent to an LLM. Few discuss an even more important aspect of an LLM-driven system: evaluating its outputs.
In this episode, we discuss traditional and modern metrics used to evaluate LLM outputs. And, we review the common frameworks for obtaining that feedback.
Though evals are a lot of work (and easy to do poorly), those building (or buying) LLM-driven systems should be transparent about their process and the current state of their eval framework.
Jake and David chat about best practices and considerations for those building and using AI systems that leverage LLMs.
Jake and David chat about types of GenAI, and specifically how LLMs work—from input text or audio through the output you read.
Jake and I chat about current hot topics in the LLM space and what we would (and would not) trust an LLM with.
Jake and I chat about a forthcoming book chapter, "Welcome to the Era of Experience," by David Silver and Richard Sutton (link below). This—naturally—led to other topics surfacing, such as companies staffed entirely by AI agents (which turned out about as well as that sounds); superintelligence (we might be legally required to reference it during the 2025 AI hype cycle); and how practical systems built on these ideas would even be architected (we both came in with different ideas here, which was fun). Happy listening.
Links to things mentioned:
In this episode, we chat with Dr. Beth Garrison about her journey in behavior analysis, what led her to pursue a PhD in artificial intelligence, and her thoughts on where this is all headed.
Links to the papers Dr. Garrison references:
In this episode, David talks about the dataset he's been collecting on his own daily behavior over the last 15 years, and how behavior science + data science let him do neat things with it.
Jake talks about his backyard science project, in which he used computer vision to detect squirrels.
In this episode, we dive into some basics around unsupervised machine learning and how behavior analysts might use it in their work.
Episode 010: Guest Chat with Zach Morford
Episode 009: What does it take to go end-to-end with an AI application? Part III - The Deployment Lifecycle
In this episode, we discuss the end-to-end pipeline for taking a model from creation through deployment.
In this episode, we talk about the many components of data engineering and the parallel work data scientists take on as data moves from its original collection source to being ready for modeling.
In this episode, we discuss common barriers and solutions for bridging the research-to-practice gap in behavioral data science. We also talk about many of the ways data science and AI research differ from behavior science research in how quickly practitioners can integrate findings into practice.