In this episode of Beneficial Intelligence, I discuss biased data. Machine learning depends on large data sets, and unless you take care, ML algorithms will perpetuate any bias in the data they learn from. The famous ImageNet database contains 14 million labeled images. However, 6% of these have the wrong label. The labels are provided by humans who are paid very little per image, so they work very fast. Unfortunately, as Nobel Prize winner Daniel Kahneman has shown, when humans work fast,...