In this episode, I chat with Nick Tierney, a statistician, data scientist, and creator of the {naniar} R package for exploring missing data. Nick reflects on his journey from a psychology undergrad to a PhD in statistics, and how open-source tools—and a deeply curious mindset—shaped his path.
We discuss Nick’s early struggles with R, the importance of community, and his evolution into package development and consulting. He also walks through the why and how of testing R packages, making the case for better, more reliable code.
Resources mentioned:
Books: (1) The R Book by Michael Crawley, and (2) R Packages by Hadley Wickham and Jenny Bryan
Connect with Nick on Bluesky and GitHub
In this episode, I sit down with Garrick Aden-Buie, a senior software engineer at Posit (formerly RStudio), to explore his latest project: brand.yml. This tool aims to simplify and unify branding across various data science outputs, including Quarto documents, Shiny apps, and more.
Garrick shares the inspiration behind brand.yml, stemming from his experiences in creating custom theming solutions at different organizations. Recognizing the repetitive nature of this task, he developed brand.yml to serve as a single source of truth for styling, enabling consistent application of brand guidelines across multiple platforms.
Connect with Garrick on LinkedIn, Bluesky and Mastodon. Visit his Website to learn more.
Subscribe to our newsletter: https://rfortherestofus.com/newsletter
In this episode, I sit down with Robert Smith, co-founder of Dark Peak Analytics, to discuss his journey from academia to advising the UK government during the COVID-19 pandemic. Rob shares his experience building data models that directly influenced public health policy and decision-making at the highest levels. We also dive into his work with Parkrun, a community-driven initiative promoting equitable access to physical activity, and how he used R to analyze participation patterns and improve accessibility.
Rob's insights into health economics, simulation modeling, and the challenges of transitioning industries provide valuable lessons for data professionals.
Connect with Robert on:
LinkedIn: Robert Smith
In this episode, I chat with Simon Couch, a software engineer at Posit, about his work on AI-powered R packages. Simon maintains open-source statistical software and explores how AI can streamline coding workflows. We discuss his journey from statistics and sociology to AI-driven tooling, his development of the {pal} and {gander} packages, and why AI won’t replace coding anytime soon but can definitely make it easier.
Important resources mentioned:
Connect with Simon on:
LinkedIn: Simon P. Couch
Bluesky: @simonpcouch.com
Mastodon: @simonpcouch@fosstodon.org
X: @simonpcouch
In this episode, I chat with Terence Teo, Professor at Seton Hall University and expert in creating stunning 3D maps using the {rayshader} package in R. Terence discusses his journey into data visualization, shares insights from his academic background in political science, and describes his creative process for making maps and how he balances artistic flair with technical rigor. The discussion dives into geospatial data, the intricacies of the {rayshader} and {rayrender} packages, and the value of experimentation in visual storytelling.
In this episode, I chat with Alex Gold, who leads solution engineering and support teams at Posit. We talk about his new book ‘DevOps for Data Science’, and explore why DevOps principles are crucial for data scientists, even if you aren't working in hardcore production environments. Alex shares insights on how to think about putting your data science projects into "production"—whether it's a report for colleagues or a full-scale deployment. We discuss practical tips like using {renv} to manage package versions and ensure reproducibility, and Alex gives a live demo on setting it up in an R project. Tune in for key takeaways on making your data science work more robust, shareable, and ready for production.
In this episode, I talk with Cédric Vidonne, an Information Management Officer at UNHCR with over 15 years of experience in humanitarian data visualization. Cédric shares his journey from GIS and cartography to creating impactful infographics and embracing the R programming language. He discusses how R has improved data efficiency within UNHCR, where it helps produce faster, reproducible data products essential for crisis response. Cédric also highlights the growth of the UNHCR R community and introduces the organization's data visualization guidelines platform, which includes a custom ggplot theme that streamlines UNHCR branding. Additionally, Cédric explores the advantages of Quarto for developing UNHCR-branded templates, enabling team members to create consistent, polished reports with ease.
In this episode, I talk with Christine Parker, the Senior GIS Analyst on the Community Broadband Networks team at the Institute for Local Self-Reliance. Christine shares how she used R to clean, combine, and summarize data for a dashboard tracking enrollment in the Affordable Connectivity Program (ACP), a COVID-era initiative to help people access affordable internet. The dashboard gained wide attention: it was shared in advocacy circles, referenced in Congress, and discussed with the White House. Christine highlights R's value for performing repeatable data tasks, particularly with regularly updated datasets, and its advantages compared to manual Excel processes.
In this episode, I chat with Crystal Lewis about data management and her recently published book titled ‘Data Management in Large-Scale Education Research’. Crystal, a freelance research data management consultant, shares insights on good planning and systematic implementation of practices that are key to effective data management. She discusses the importance of automated data validation, and outlines a structured approach to data cleaning. Additionally, Crystal reflects on her experience writing an open-source book with Bookdown and navigating the publishing process.
Learn more about Crystal Lewis by visiting her website and connect with her on X (@Cghlewis), LinkedIn, GitHub, and Fosstodon.
In this episode, I speak with Miles McBain, a data scientist and R package developer from Brisbane, Australia, about patterns and anti-patterns in data analysis reuse. Miles shares his journey from a generalist software developer to a data science specialist, his passion for R, and the evolution of his coding practices. We delve into the intricacies of code reuse in data analysis, discussing common pitfalls to avoid, the benefits of creating reusable code packages, the process of breaking down large codebases, and how teams can evolve their coding practices to enhance efficiency and maintainability.
In this episode, I speak with Meghan Harris, a data scientist at the Prostate Cancer Clinical Trials Consortium at Memorial Sloan Kettering Cancer Center. Meghan is one of the people who make generative art in R. She talks about why she enjoys it and how it has helped her improve her R skills in other areas.
Important resources mentioned:
Art From Code Workshop by Danielle Navarro
A talk Meghan did in September 2023 making a case for generative art
Ijeamaka's blog, which Meghan referenced and which helped her learn to make generative art with R
Connect with Meghan on LinkedIn, X, and Mastodon
In this episode, I speak with Cara Thompson about color, delving into several aspects of its use in data visualization. Cara is a UK-based data visualization consultant with over 15 years of experience in transforming data insights into clear, compelling visual stories. We explore how she finds inspiration for selecting colors, her reasons for not simply using organizations' brand colors in her visualizations, and the importance of dedicating time to thoughtfully consider color choices in this context.
In this episode, I talk with Nicola Rennie about making data viz for mobile devices. Nicola is a lecturer in health data science based within the Center for Health Informatics, Computing, and Statistics at Lancaster University in the UK.
She recounts her initial encounter with R and how she got deeper into data visualization in R as a means of creative expression. Amidst the plethora of programming languages available, Nicola sheds light on why she chose R specifically for data visualization. Additionally, she offers valuable advice for people wanting to get started with data visualization using R.
In this episode, I’m joined by Will Landau, a statistician and software developer currently working with Eli Lilly and Company. Will specializes in Bayesian methods, high-performance computing, and reproducible workflows. He is the creator of the {targets} R package, a pipeline tool for reproducible computation in statistics and data science. The package became part of rOpenSci in early 2021.
Will talks about his journey into R and using it for open source projects. He gives a detailed account of {targets}: its origin and how it works as a reproducible analysis pipeline tool.
In this episode, I talk with Ahmadou Dicko, a statistician based in Senegal working with the United Nations High Commissioner for Refugees (UNHCR). Ahmadou shares insights on utilizing data-driven approaches to address development obstacles, especially within humanitarian settings. He explores the packages and strategies his team has developed in R for data management, analysis, and communication. Among these is {robotoolbox}, an extensive R package designed for accessing and handling KoboToolbox data in a tidy format.
In this episode, I speak with Chris Knox, who is currently the Head of Data Journalism at the New Zealand Herald. Prior to that, he worked at the New Zealand Ministry of Health, where he led an analytics team focused on New Zealand's COVID response.
During our conversation, Chris highlights why he considers R the optimal tool for data analysis and reporting, especially when dealing with frequently changing data sources and parameters. He also emphasizes the benefits of using R in a collaborative environment: because R code is reproducible, junior analysts can be integrated quickly into the analytics and reporting process and take on significant responsibilities.
In this episode, Travis Gerke and Garrick Aden-Buie join me to demystify the process behind developing custom packages in R. Travis is the Director of Data Science at the Prostate Cancer Clinical Trials Consortium (PCCTC), and Garrick is a Data Science Educator and R developer at RStudio.
During the discussion, Travis and Garrick highlight the numerous benefits of having a custom package, including easier access to data, automated and documented functions, and enhanced learning opportunities for R users seeking to upskill. They also delve into their own experiences working together at Moffitt Cancer Center, discussing how their set of custom R packages helped alleviate data reporting pain points within the organization.
In this episode, I speak with Kyle Walker, Associate Professor of Geography and Director of the Center for Urban Studies at Texas Christian University. Kyle has developed several packages, but the one we talk about in this chat is tidycensus, which allows R users to retrieve Census and ACS data as tidyverse-ready data frames. Kyle had a rough start with R programming and didn’t want anything to do with it for three years. What made him come back to R and become one of its renowned champions? We chat about that as well.
In this episode, I speak with Meghan Harris, a data integration specialist at the Primary Care Research Institute, University at Buffalo. There, she brings together data from multiple sources to create insights that benefit people affected by opioid use disorder. Meghan talks about how she uses R to pull data directly from Google Sheets, and highlights the advantages of this workflow over working from a manually downloaded copy of a Google Sheets file.
Fun fact: When Meghan is not creating data pipelines, she makes cool art using R and posts it on her Twitter timeline.
In this episode, I chat with Matt Herman about building websites in R. Matt shares lessons from his experience building a self-updating COVID-19 tracking site for Westchester County.
Matt is a Data Scientist at the Council of State Governments (CSG) Justice Center, where he focuses on research and policy analysis. Matt has created automated and reproducible workflows to generate outcome measures and performance indicators for several projects within the justice system.