The issue of bias in AI has become a focal point in recent discussions, both in academia and among practitioners and policymakers. I observe a lot of confusion and diffusion in those discussions. At the risk of seeming patronizing, my advice is to engage only once you understand the specific jargon being used, and particularly how it’s used in this context. Misunderstandings create confusion and blur the path forward.
Here is a negative, yet typical example:
In artificial intelligence (AI)-based predictive models, bias – defined as unfair systematic error – is a growing source of concern1.
This post tries to direct those important discussions to the right avenues, providing some clarifications, examples for common pitfalls, and some qualified advice from experts in the field on how to approach this topic. If nothing else, I hope you find this piece thought-provoking.
In modern statistics – AI being a subclass here – a biased prediction (for example) means that the prediction systematically deviates from the true unknown value you wish to predict. As an easy example, say you want to estimate the average weight in the population but you are using only women in your sample. Women don’t represent the whole population (only about 50% of it), so if you intend to estimate the average weight for both men and women, your estimate will be biased downwards, since women weigh less than men on average. If you aim to estimate the average weight only for women, then your estimate is unbiased. Is bias always a bad thing? No. In modern statistics we often, and purposely, introduce bias into many types of machine learning algorithms as a way to stabilize volatile estimates2.
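To make the weight example concrete, here is a minimal sketch in Python (with made-up population numbers) showing that the very same sample mean is biased for one target (the whole population) and unbiased for another (women only):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population: a 50-50 mix of women and men, women lighter on average.
n = 100_000
women = rng.normal(loc=70, scale=10, size=n // 2)  # assumed mean 70 kg
men = rng.normal(loc=85, scale=12, size=n // 2)    # assumed mean 85 kg
population = np.concatenate([women, men])

# A sample drawn only from women.
sample = rng.choice(women, size=500, replace=False)

print(f"True mean, whole population: {population.mean():.1f} kg")
print(f"Women-only sample mean:      {sample.mean():.1f} kg")  # biased for the population mean
print(f"True mean, women only:       {women.mean():.1f} kg")   # the same sample mean is unbiased here
```

The point of the sketch: bias is a property of an estimator relative to the quantity you are trying to estimate, not a property of the number itself, and certainly not a moral judgment.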
Now, continuing with our easy example, say you used your data properly, i.e. your sample is indeed a 50-50 mix of men and women, but, for whatever reason, you don’t like the result: you disagree with the estimated weight, thinking it should be higher or lower. Does this make your model biased? Of course not! So what if you think the results should be different? You may even refuse to use the model unless it’s tailored to provide results that YOU are comfortable with. But that has nothing, nothing to do with model bias, and nothing to do with the way the model is estimated. To begin with, we choose to use AI models precisely because they are able to capture the patterns present in the data; so if we dislike the results, it means we take issue with the data; and of course we do. To quote Gordon Gekko: “human beings… you gotta give ’em a break. We’re all mixed bags.” So what we are really talking about is model results we are happy, or unhappy, to use. The data (cliché alert) is what it is.
Yes, we would like people to be blind to gender and race; that is work in progress. In the meantime, we want to build models that are useful. Useful, in particular, in that they are aligned with the values and preferences of their creators and users.
So we agree, I hope, that bias – in the statistical sense – has nothing to do with any of this. The discussion should be cast in terms of usability and alignment. Simple? Sure. Easy? Far from it.
Who is to determine what is fair or not fair? Who is to determine the preferences? Consider this loaded topic: policymakers think the tax rate should be x%; is that fair? Ask different people and you will get different answers; make of it what you will.
So what should we do?
Now that we have our concepts lined up clearly: to create useful AI models we are comfortable using, we need, as with all things, to start at the start:
- Who is the target group? Is it a firm? A department within a firm? Is it the general audience? The whole of society?
- The answer to the previous item determines who is to set the preferences and values the users would feel comfortable with, and, by extension, which data should be fed to the model. It is recommended3 to make this process as participatory as possible. That said, it’s good to remember Arrow’s Impossibility Theorem: no voting scheme can aggregate individual preference rankings into a single consistent group ranking while also satisfying a few basic fairness conditions (see the toy sketch after this list). So while the process should involve the users and stakeholders, it should not be a democratic ranking of values. There is no such animal as a “fully aligned AI model”.
- Perhaps the most important point I wish to make is this: if we want AI models to serve society effectively, we had best help create more data that reflects our societal values. Ideally we would do that by increasing our spending on proper education for our current and future “data creators” (us and our kids), but let’s be less naïve.
- More pragmatic ground to walk on is data-augmentation techniques. We can generate synthetic data, augmenting our existing data with data that is fictitious but embodies the values we want our model to endorse. I recently encountered this approach applied by Amazon Science in the context of time-series forecasting4. To a large measure, it is simply down-weighting outliers (say, toxic data) and up-weighting the center of the distribution, which statisticians have been doing for centuries; the difference is that we now need to do it at scale (a small sketch of the idea follows below).
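Back to the Arrow point above: the trouble with a purely democratic ranking of values is easy to reproduce. The following toy sketch (hypothetical stakeholder groups and hypothetical values, chosen only for illustration) shows pairwise majority voting producing a cycle, so no single consistent group ranking exists:

```python
from itertools import combinations

# Hypothetical stakeholder groups, each ranking three candidate values to prioritize.
# Format: (group size, ranking from most to least preferred).
ballots = [
    (1, ["privacy", "accuracy", "transparency"]),
    (1, ["accuracy", "transparency", "privacy"]),
    (1, ["transparency", "privacy", "accuracy"]),
]

def majority_prefers(a, b):
    """True if a majority of voters rank value a above value b."""
    votes_for_a = sum(size for size, ranking in ballots if ranking.index(a) < ranking.index(b))
    total = sum(size for size, _ in ballots)
    return votes_for_a > total / 2

for a, b in combinations(["privacy", "accuracy", "transparency"], 2):
    winner, loser = (a, b) if majority_prefers(a, b) else (b, a)
    print(f"A majority prefers {winner} over {loser}")

# The three pairwise results form a cycle (privacy beats accuracy, accuracy beats
# transparency, transparency beats privacy), so there is no consistent
# "democratically aggregated" ordering of values to align the model with.
```

This is only the Condorcet-paradox flavour of the problem, not Arrow’s theorem itself, but it illustrates why stakeholder involvement cannot simply be a vote.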
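And on the down-weighting point in the last item, here is a minimal sketch of the general principle (my own illustration, not the Chronos recipe): give observations far from the centre of the distribution low weights, then use those weights both for a weighted estimate and for resampling an augmented dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: mostly well-behaved observations plus a few extreme ("toxic") ones.
data = np.concatenate([rng.normal(0, 1, 1000), rng.normal(12, 1, 20)])

# Down-weight observations far from the median; a simple robust weighting scheme
# in the spirit of what statisticians have long done, not any particular paper's method.
center = np.median(data)
scale = np.median(np.abs(data - center))            # median absolute deviation
distance = np.abs(data - center) / (scale + 1e-9)
weights = 1.0 / (1.0 + distance**2)                 # outliers get weights near zero
weights /= weights.sum()

print(f"Plain mean:     {data.mean():.2f}")                          # pulled up by the outliers
print(f"Weighted mean:  {np.average(data, weights=weights):.2f}")

# The "augmentation" view: resample with these weights to build a synthetic training
# set that reflects the centre of the distribution rather than the outliers.
augmented = rng.choice(data, size=5000, replace=True, p=weights)
print(f"Augmented mean: {augmented.mean():.2f}")
```

The same logic carries over to text or time series: decide what counts as the centre you want the model to learn, and generate or re-weight data accordingly.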
Stop all discussions around biases in AI systems (no such thing). The focus should be on how to create more data that is aligned with the users’ human values, and how to eliminate the data that is not. The AI model has nothing to do with it.
Footnotes and References
1 From Rising to the challenge of bias in health care AI
2 For example, ridge regression is often used instead of the usual OLS because it accepts some bias in the estimates in exchange for lower variance
3 Training language models to follow instructions with human feedback
4 Chronos: Learning the language of time series