When it comes to AI (Artificial Intelligence), the focus is usually on large datasets, which make it possible to train models. But there’s a nagging issue: bias. What looks like a robust dataset may in fact be highly skewed along lines of race, wealth and gender.
Then what can be done? Well, to help answer this question, I reached out to Dr. Rebecca Parsons, who is the Chief Technology Officer of ThoughtWorks, a global technology company with over 6,000 employees in 14 countries. She has a strong background in both the business and academic worlds of AI.
So here’s a look at what she has to say about bias:
Can you share some real-life examples of bias in AI systems and explain how it gets there?
It’s a common misconception that the developers who introduce bias into AI systems are prejudiced or acting out of malice; in reality, bias is more often unintentional and unconscious. AI systems and algorithms are created by people with their own experiences, backgrounds and blind spots, which can unfortunately lead to fundamentally biased systems. The problem is compounded by the fact that the teams responsible for developing, training and deploying AI systems are largely not representative of society at large. According to a recent research report from NYU, women comprise only 10% of AI research staff at Google, and only 2.5% of Google’s workforce is black. This lack of representation leads to biased datasets and, ultimately, to algorithms that are far more likely to perpetuate systemic biases.
One example that demonstrates this point well is voice assistants like Siri or Alexa, which are trained on huge databases of recorded speech dominated by white, upper-middle-class American voices, making it challenging for the technology to understand commands from people outside that group. Additionally, studies have shown that algorithms trained on historically biased data have significant error rates for communities of color, especially in over-predicting the likelihood that convicted criminals will reoffend, which can have serious implications for the justice system.
How do you detect bias in AI and guard against it?
The best way to detect bias in AI is to cross-check the algorithm you are using for patterns you did not intend. Correlation does not always mean causation, and it is important to identify patterns that are not relevant so you can amend your dataset. One way to test for this is to check whether any group is under- or overrepresented in your data. If you detect a bias in your testing, you must correct it by adding data that fills in that underrepresentation.
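To make that concrete, here is a minimal sketch of such a representation check. It assumes the data sits in a pandas DataFrame with a demographic column; the column name, reference proportions and tolerance below are hypothetical choices for illustration.

```python
# Minimal sketch: flag under- or overrepresented groups in a dataset.
# The column name "group", the reference shares and the tolerance are
# illustrative assumptions, not values from the interview.
import pandas as pd

def representation_report(df, column, reference, tolerance=0.05):
    """Compare each group's share of the data with a reference share and
    flag groups that fall outside the tolerance band."""
    observed = df[column].value_counts(normalize=True)
    rows = []
    for group, expected in reference.items():
        share = float(observed.get(group, 0.0))
        if share < expected - tolerance:
            flag = "UNDER-represented"
        elif share > expected + tolerance:
            flag = "OVER-represented"
        else:
            flag = "ok"
        rows.append({"group": group, "expected": expected,
                     "observed": round(share, 3), "flag": flag})
    return pd.DataFrame(rows)

# Made-up example: a training set heavily skewed toward one group.
df = pd.DataFrame({"group": ["A"] * 800 + ["B"] * 150 + ["C"] * 50})
print(representation_report(df, "group", {"A": 0.60, "B": 0.25, "C": 0.15}))
```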
The Algorithmic Justice League has also done interesting work on how to correct biased algorithms. They ran tests on facial recognition programs to see if they could accurately determine race and gender. Interestingly, lighter-skinned males were almost always correctly identified, while 35% of darker-skinned females were misidentified. The reason? One of the most widely used facial-recognition training datasets was estimated to be more than 75% male and more than 80% white. Because the training data gave the systems far more definitive, distinguishing examples of white males than of any other group, they were much better at correctly identifying those individuals than anyone else. In this instance, the fix was quite easy: programmers added more faces to the training data and the results quickly improved.
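The audit behind findings like these is essentially a disaggregated evaluation: instead of reporting one overall accuracy number, the error rate is computed separately for each demographic subgroup. Below is a minimal sketch of that idea; the column names and predictions are invented for illustration.

```python
# Minimal sketch: break a classifier's error rate out by subgroup instead of
# relying on a single overall accuracy figure. All data here is made up.
import pandas as pd

def error_rate_by_group(results, group_col, y_true, y_pred):
    """Share of misclassified examples within each subgroup."""
    errors = results[y_true] != results[y_pred]
    return errors.groupby(results[group_col]).mean().sort_values(ascending=False)

# Invented predictions: overall accuracy is 90%, which hides a 35% error rate
# for the smallest subgroup.
results = pd.DataFrame({
    "subgroup":  ["lighter_male"] * 60 + ["darker_female"] * 20 + ["other"] * 20,
    "label":     ["m"] * 60 + ["f"] * 20 + ["x"] * 20,
    "predicted": ["m"] * 59 + ["x"] * 1 + ["f"] * 13 + ["m"] * 7 + ["x"] * 18 + ["m"] * 2,
})
print(error_rate_by_group(results, "subgroup", "label", "predicted"))
```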
While AI systems can get quite a lot right, humans are the only ones who can look back at a set of decisions and determine whether there are any gaps in the datasets or oversights that led to a mistake. This exact issue was documented in a study where a hospital was using machine learning to predict the risk of death from pneumonia. The algorithm came to the conclusion that patients with asthma were less likely to die from pneumonia than patients without asthma. Based on this data, hospitals could decide that it was less critical to hospitalize patients with both pneumonia and asthma, since those patients appeared to have a higher likelihood of recovery. However, the algorithm overlooked another important insight: patients with asthma typically receive faster and more intensive care than other patients, which is why their pneumonia mortality rate is lower. Had the hospital blindly trusted the algorithm, it might have incorrectly assumed that it is less critical to hospitalize asthmatics, when in reality they require even more intensive care.
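For illustration only, the sketch below builds a synthetic dataset with that kind of hidden confounder (intensive care that asthmatics receive more often) and shows how a model can end up treating asthma as "protective". None of the numbers come from the study; they are invented to reproduce the shape of the problem.

```python
# Minimal sketch on synthetic data: a model trained on confounded historical
# records learns a negative ("protective") weight for asthma, because the real
# driver of lower mortality is the extra care asthmatics received.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
asthma = rng.binomial(1, 0.2, n)
# Confounder the model never sees: asthmatics get intensive care more often.
intensive_care = rng.binomial(1, np.where(asthma == 1, 0.9, 0.3))
# Mortality is driven down by intensive care, not by asthma itself.
died = rng.binomial(1, 0.15 - 0.10 * intensive_care)

model = LogisticRegression().fit(asthma.reshape(-1, 1), died)
print("Learned asthma coefficient:", model.coef_[0][0])  # typically negative

# A negative coefficient makes asthma look protective. Catching why requires a
# human to notice the missing variable (the care those patients received)
# before anyone acts on the model's recommendation.
```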
What could be the consequences to the AI industry if bias is not dealt with properly?
As detailed in the asthma example, if biases in AI are not properly identified, the difference can quite literally be life and death. The use of AI in areas like criminal justice can also have devastating consequences if left unchecked. Another less-discussed consequence is the potential for more regulation and lawsuits surrounding the AI industry. Real conversations must be had around who is liable if something goes terribly wrong. For instance, is it the doctor who relies on the AI system that made the decision resulting in a patient’s death, or the hospital that employs the doctor? Is it the AI programmer who created the algorithm, or the company that employs the programmer?
Additionally, the “witness” in many of these incidents cannot even be cross-examined since it’s often the algorithm itself. And to make things even more complicated, many in the industry are taking the position that algorithms are intellectual property, therefore limiting the court’s ability to question programmers or attempt to reverse-engineer the program to find out what went wrong in the first place. These are all important discussions that must be had as AI continues to transform the world we live in.
If we allow this incredible technology to continue to advance but fail to address questions around biases, our society will undoubtedly face a variety of serious moral, legal, practical and social consequences. It’s important we act now to mitigate the spread of biased or inaccurate technologies.