Your Data Teacher Podcast
Your Data Teacher
7 episodes
5 days ago
A podcast about data science, machine learning, artificial intelligence, statistics and everything related to data. Home Page: https://www.yourdatateacher.com
Technology
All content for Your Data Teacher Podcast is the property of Your Data Teacher and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (7/7)
Episode 7 - A Python library to remove collinearity

Collinearity is a serious problem in machine learning. It increases the dimensionality of a dataset without adding information. That's why I've created a Python library that removes collinearity from a dataset. I talk about this library in this episode.

Article: https://www.yourdatateacher.com/2021/06/28/a-python-library-to-remove-collinearity/ 

PyPI package: https://pypi.org/project/collinearity/

GitHub repo: https://github.com/gianlucamalato/collinearity
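
To make the idea concrete, here is a minimal sketch of one common way to filter collinear features: keep a feature only if its absolute Pearson correlation with every feature already kept stays below a threshold. This is an illustration in plain Python, not the API or the algorithm of the collinearity package; the function names, the threshold of 0.9 and the sample data are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def drop_collinear(columns, threshold=0.9):
    """Greedy filter (hypothetical helper): scan the columns in order and
    keep one only if it is weakly correlated with all columns kept so far."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical data: height_in is height_cm in other units, so it adds
# a dimension without adding information.
data = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [31.0, 22.0, 45.0, 28.0, 39.0],
}
kept = drop_collinear(data)
print(kept)
```

The result depends on the order of the columns, since earlier features win over later ones. A real selection procedure may rank features by relevance to the target first.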

4 years ago
8 minutes 39 seconds

Episode 6 - Checking the distribution of your data using Q-Q plot

In this episode, I talk about the Q-Q plot and how to use it to check whether a dataset follows a particular distribution. Instead of relying on more complex hypothesis tests such as the Kolmogorov-Smirnov test, we can use this simple plot to check whether a dataset follows a particular distribution or whether two datasets were generated from the same distribution.

Link to the article: https://www.yourdatateacher.com/2021/06/16/how-to-use-q-q-plot-for-checking-the-distribution-of-our-data/
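
As a rough sketch of what sits behind a normal Q-Q plot, the code below pairs each sorted observation with the quantile a fitted normal distribution would put at the same position. The function name, the plotting positions (i + 0.5) / n and the random sample are assumptions made for illustration. Drawing the pairs with any plotting tool gives the Q-Q plot: the closer the points lie to the diagonal, the better the sample matches the reference distribution.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_qq_points(sample):
    """Pair each sorted observation with the matching normal quantile."""
    n = len(sample)
    # Reference normal distribution fitted to the sample itself.
    reference = NormalDist(mean(sample), stdev(sample))
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1).
    return [
        (reference.inv_cdf((i + 0.5) / n), observed)
        for i, observed in enumerate(sorted(sample))
    ]

# Hypothetical data drawn from a normal distribution.
rng = random.Random(42)
sample = [rng.gauss(10.0, 2.0) for _ in range(200)]

points = normal_qq_points(sample)
# Largest distance from the diagonal, as a quick numeric summary.
largest_gap = max(abs(theoretical - observed) for theoretical, observed in points)
print(f"largest gap from the diagonal: {largest_gap:.3f}")
```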

4 years ago
7 minutes 28 seconds

Episode 5 - Tuning the threshold in binary classification tasks

In this episode, I talk about tuning the threshold in binary classification tasks. The usual threshold is 0.5, but it's worth optimizing it so that the model fits our needs. I cover optimizing the threshold using the ROC curve and maximizing balanced accuracy.

Link to the article: https://www.yourdatateacher.com/2021/06/14/are-you-still-using-0-5-as-a-threshold/
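
A minimal sketch of the balanced-accuracy approach, assuming the model outputs a score per record and that both classes are present: try every distinct score as a threshold and keep the one with the highest balanced accuracy. The function names, labels and scores below are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the true positive rate and the true negative rate.
    Assumes both classes appear in y_true."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2

def predict(scores, threshold):
    """Label a record positive when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]

def best_threshold(y_true, scores):
    """Scan every distinct score as a candidate threshold."""
    best_t, best_ba = None, -1.0
    for t in sorted(set(scores)):
        ba = balanced_accuracy(y_true, predict(scores, t))
        if ba > best_ba:
            best_t, best_ba = t, ba
    return best_t, best_ba

# Hypothetical imbalanced data: 3 positives out of 10 records.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]

default_ba = balanced_accuracy(y_true, predict(scores, 0.5))
tuned_t, tuned_ba = best_threshold(y_true, scores)
print(f"0.5 -> {default_ba:.3f}, {tuned_t} -> {tuned_ba:.3f}")
```

In practice the threshold should be tuned on a validation set rather than on the data used for the final evaluation.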

4 years ago
7 minutes 45 seconds

Episode 4 - Ensemble models. Bagging and boosting

In this episode, I talk about ensemble models, particularly bagging and boosting. Bagging is very useful for reducing variance, while boosting is used for reducing bias. The most common bagging algorithm is Random Forest; the most common boosting algorithm is Gradient Boosting, whose most common implementations are XGBoost, LightGBM and CatBoost.

Home Page: https://www.yourdatateacher.com
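
To illustrate the bagging half only, here is a toy sketch of bootstrap aggregating: several one-split regression stumps are trained on bootstrap resamples and their predictions are averaged. It is not Random Forest and it does not cover boosting; the helper names, the number of models and the data are assumptions.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression stump and return its predict function."""
    best = None
    # Every distinct x except the smallest is a candidate split point,
    # so both sides of the split are always non-empty.
    for split in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < split]
        right = [y for x, y in zip(xs, ys) if x >= split]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, split, left_mean, right_mean)
    if best is None:
        # All x values are identical: predict the overall mean.
        overall = mean(ys)
        return lambda x: overall
    _, split, left_mean, right_mean = best
    return lambda x: left_mean if x < split else right_mean

def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train stumps on bootstrap resamples and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n records with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)

# Hypothetical data: a step from 0.0 to 1.0 at x = 5.
xs = [float(i) for i in range(10)]
ys = [0.0] * 5 + [1.0] * 5
predict = fit_bagging(xs, ys)
print(predict(0.0), predict(9.0))
```

Each stump sees a slightly different resample, so averaging smooths out the quirks of any single one. That averaging is where the variance reduction comes from.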

4 years ago
11 minutes 55 seconds

Episode 3 - Precision, recall, accuracy. How to choose?

In this episode, I talk about accuracy, precision and recall. We focus on what they are and when to use each of them in machine learning projects.

Link to the article: https://www.yourdatateacher.com/2021/06/07/precision-recall-accuracy-how-to-choose/
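
The three metrics can be sketched from the confusion-matrix counts as follows. The function name and the toy labels are hypothetical; the example uses an imbalanced dataset to show how accuracy can look good while recall is poor.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        # Share of all records that were classified correctly.
        "accuracy": (tp + tn) / len(y_true),
        # Share of predicted positives that are really positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of real positives that the model found.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical imbalanced data: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here the model misses half of the positives, yet accuracy is still 0.9 because the negatives dominate.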

4 years ago
11 minutes 55 seconds

Episode 2 - How to explain neural networks using SHAP

Today we're going to talk about how to explain neural networks. Neural networks are black boxes that hide the way they model and represent data, which makes them very difficult to explain. A very powerful approach is SHAP. With this method, we can calculate the impact of each feature on a given model's output regardless of the type of model we're using, which makes it very useful for black boxes like neural networks.

Home page: https://www.yourdatateacher.com

Link to the article: https://www.yourdatateacher.com/2021/05/17/how-to-explain-neural-networks-using-shap/
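
SHAP is built on Shapley values. The sketch below computes them exactly for a tiny model by enumerating every coalition of features, treating the model as a black box that is only ever called. It does not use the shap library or its API; the toy model, the all-zero baseline and the function names are assumptions, and exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for one prediction.
    Features outside a coalition are replaced by their baseline value."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions

def toy_model(features):
    """Hypothetical black box: one additive term plus one interaction."""
    return 2.0 * features[0] + features[1] * features[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions always add up to the difference between the prediction for x and the prediction for the baseline, which is what makes them readable as a per-feature breakdown.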

4 years ago
6 minutes 54 seconds

Episode 1 - How accurate is your accuracy?

Today we're going to talk about the standard error of a proportion. In data science, it's very important to calculate the standard error of every estimate, both to see whether finite-size effects are reducing the precision too much and to compare two different measurement results with each other.

Home page: https://www.yourdatateacher.com

Link to the article: https://www.yourdatateacher.com/2021/05/31/how-accurate-is-your-accuracy/
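
As a small worked sketch, the standard error of a proportion p measured on n independent records is sqrt(p * (1 - p) / n). The code below applies it to an accuracy value and to a comparison between two models. The function names, the accuracies, the test-set sizes and the 1.96 multiplier (roughly a 95% level under a normal approximation) are assumptions for illustration.

```python
import math

def proportion_standard_error(p_hat, n):
    """Standard error of a proportion estimated from n independent records."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie between 0 and 1")
    return math.sqrt(p_hat * (1.0 - p_hat) / n)

def differ_significantly(p1, n1, p2, n2, z=1.96):
    """True when the gap between two proportions exceeds z combined
    standard errors (normal approximation, independent samples)."""
    combined = math.sqrt(
        proportion_standard_error(p1, n1) ** 2
        + proportion_standard_error(p2, n2) ** 2
    )
    return abs(p1 - p2) > z * combined

# Hypothetical case: 90% accuracy measured on only 100 test records.
se_small = proportion_standard_error(0.90, 100)
print(f"accuracy = 0.90 +/- {se_small:.3f}")

# The same two-point gap between models is noise on 100 records
# but a real difference on 10,000 records.
print(differ_significantly(0.90, 100, 0.92, 100))
print(differ_significantly(0.90, 10_000, 0.92, 10_000))
```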

4 years ago
6 minutes 40 seconds