A common concern surrounding AI applications is that they'll be just as racist, sexist or bigoted as the people they're replacing. Or perhaps even worse.
In response, you see business leaders, politicians and social advocates demanding that data experts create machine learning applications that are free of bias. (See: SAP's How AI Can End Bias, or VentureBeat's Bias in AI is spreading and it's time to fix the problem). There's this notion that the whole point of getting machines to think for us is that they're wiser, making decisions on pure facts (data points). It's all just a matter of designing the application properly!
An AI system “learns” from data. If the data itself is biased, so is the model. In theory, we can force the system to not learn specific things from data, or we can make sure the data contains no possible bias. But both approaches force us to manually specify every possible way the system could be biased. That’s not feasible in reality.
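A toy sketch can make this concrete. The data below is entirely made up for illustration: hypothetical historical hiring records in which the protected attribute was never even recorded, but a proxy feature ("neighborhood") correlates with it, and past decisions penalized one group. A model trained on those records faithfully reproduces the bias.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical historical hiring records: (neighborhood, qualified, hired).
# The protected attribute itself is never stored, but "neighborhood"
# correlates with it, and past decisions were biased against "south".
def make_record():
    hood = random.choice(["north", "south"])
    qualified = random.random() < 0.5
    base = 0.8 if qualified else 0.2
    penalty = 0.4 if hood == "south" else 0.0   # the historical bias
    hired = random.random() < max(base - penalty, 0.0)
    return hood, qualified, hired

data = [make_record() for _ in range(10_000)]

# "Train" the simplest possible model: hire whenever the historical
# hire rate for that (neighborhood, qualified) cell exceeds 50%.
counts = defaultdict(lambda: [0, 0])          # cell -> [hired, total]
for hood, qualified, hired in data:
    cell = counts[(hood, qualified)]
    cell[0] += hired
    cell[1] += 1

def model(hood, qualified):
    hired, total = counts[(hood, qualified)]
    return hired / total > 0.5

# Two equally qualified candidates, different neighborhoods:
print(model("north", True))   # True  -- historical hire rate near 0.8
print(model("south", True))   # False -- historical hire rate near 0.4
```

Nobody told the model to discriminate; it simply learned the pattern the data contained. Dropping the protected column from the data (as here) didn't help, because the proxy carried the signal anyway.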
But let’s suppose for a moment that we do our very best to eliminate as much bias as possible from the method and the data. Then, surely the model will be bias-free?
Well, still no.
In machine learning theory, if you could mathematically prove a model has no bias of any kind, the model would be useless. ML models are trained on the very principle of differentiating between classes - or, well, let's call it what it is: discriminating between classes. Of course, some discernments are harmful biases (discerning based on gender or race), while others are just discernments (discerning based on loan repayment history) that help machine learning models (and humans, for that matter) make decisions. Biases, in and of themselves, are not all harmful.
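A minimal sketch of that point, using made-up loan numbers: a model stripped of every discerning signal can only predict the same outcome for everyone, so its accuracy collapses to the base rate, while a model that discerns on a legitimate signal (repayment history) actually earns its keep.

```python
# Toy loan data: (missed_payments, defaulted). The counts are purely
# illustrative: 75% of borrowers repay, and missed payments predict default.
loans = [(0, False)] * 70 + [(3, True)] * 20 + [(0, True)] * 5 + [(3, False)] * 5

def no_bias_model(missed_payments):
    # Forbidden from discerning on anything: same answer for everyone,
    # which can at best match the majority class.
    return False

def history_model(missed_payments):
    # Discerns on repayment history -- a "bias" we actually want.
    return missed_payments > 0

def accuracy(model):
    return sum(model(m) == d for m, d in loans) / len(loans)

print(accuracy(no_bias_model))   # 0.75 -- just the base rate
print(accuracy(history_model))   # 0.9  -- useful discernment
```

The bias-free model isn't fair; it's blind. The question is never whether the model discriminates between cases, but on which signals it is allowed to do so.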
So no, you can’t eliminate bias from machine learning. You actually need bias. But you can take care to pick the right bias.