Model Monitoring: Regulatory Bias Detection, Explainability, and Drift Tracking
Nearly all industries that use computer systems are shifting en masse from deterministic, rule-based methodologies to data-driven approaches to problem solving. Whenever one uses data-driven methodologies, and artificial intelligence in general, one must track how the models and algorithms behave, to make sure that they do not act in ways that violate human norms, whether they be social norms, government-regulated norms, or simply norms of human intelligence. The general field of model monitoring is growing as an intellectual field of study, as an engineering discipline, and as a fruitful domain for startup businesses. Yet model monitoring is a complex field with many different components, many different purposes, and many different constituencies. This article attempts to parse those different pieces and explain them.
Model monitoring can be broken into (at least) three areas of focus: model explainability and bias detection for regulatory purposes; model explainability for consumers and users of data-driven models; and model tracking for data scientists to track drift in model characteristics, drift in training data as that data set evolves and grows, and drift in real-world data as the world being modeled changes over time. These three areas overlap, and companies addressing model monitoring will presumably attempt to accomplish parts or all of each area. However, depending on the focus of the customer or the entrepreneur, different approaches will be most effective at accomplishing these goals.
Model Monitoring Part 1: Model Explainability and Bias Detection for Regulatory Purposes
Regulations have traditionally governed the way humans and human-operated machines behave, establishing rules that represented acceptable behavior. As artificial intelligence replaces human supervision of machines, regulatory bodies are enacting rules that govern how those computer systems can operate. In particular, regulations are starting to take shape to prevent prohibited forms of bias in artificial intelligence systems: racial bias, gender bias, religious bias, and all other kinds of biases that are prohibited by law.
There are a small number of ways to identify and avoid these biases. If a system makes enough decisions or predictions about the world using its models, and if those decisions or predictions are available to the public, then one can examine those predictions and analyze whether they exhibit bias.
Consider an artificial intelligence-based application used to recommend making bank loans to customers. The machine learning algorithms behind the application are trained on historical data available to the bank and the application developer, and the resulting models are not intentionally designed to be biased. However, it might turn out, due to skews in the training data or subconscious biases among the algorithm developers, that the models are more likely to recommend or reject a loan application based on a protected class category, like race, religion, or gender. Banks need to detect these kinds of biases before deploying algorithms in the real world, and they need tools that will uncover any kind of bias, whether it would cause them to run afoul of regulatory rules or the court of public opinion. Avoiding bias in these models will keep the bank out of trouble; showing the consumer world that it provably avoids biases of these types would be a strong marketing tool.
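To make this concrete, a minimal sketch of such a check might compare the model's approval rates across groups and compute their ratio, a common first-pass fairness metric. The data, column names, and threshold below are hypothetical illustrations, not a prescribed implementation.

```python
import pandas as pd

def selection_rates(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.Series:
    """Approval rate per group, e.g. the fraction of applicants recommended for a loan."""
    return df.groupby(group_col)[outcome_col].mean()

def disparate_impact_ratio(rates: pd.Series) -> float:
    """Ratio of the lowest group approval rate to the highest.
    Values well below ~0.8 are often treated as a red flag (the 'four-fifths rule')."""
    return rates.min() / rates.max()

# Hypothetical data: one row per loan application with the model's recommendation.
decisions = pd.DataFrame({
    "race":     ["A", "A", "B", "B", "B", "A", "B", "A"],
    "approved": [1,    1,   0,   1,   0,   1,   0,   1],
})

rates = selection_rates(decisions, group_col="race", outcome_col="approved")
print(rates)
print(f"Disparate impact ratio: {disparate_impact_ratio(rates):.2f}")
```

In practice a check like this would be run over the full log of model recommendations and on held-out test data, and complemented with statistical significance tests before drawing conclusions.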
A commonly cited example of the dangerous application of artificial intelligence and data-driven models in society involves hiring technology. More and more companies are relying on artificial intelligence-based applications to streamline their processing of job applications and improve the efficiency of the full spectrum of human resources responsibilities. A significant criticism of these applications is the perception that they might be disenfranchising protected classes. Applications based on facial recognition, video processing, analysis of speech, and natural language processing of interview answers are only as good, and as unbiased, as their training data. If training data is collected primarily from a particular category of candidates, or if the training data is annotated and analyzed by a biased group of judges, the artificial intelligence applications built by machine learning from that data set are very likely to contain inadvertent (and illegal) biases. Model monitoring for these biases is essential to avoid violating anti-bias laws, as well as to avoid doing societal harm.
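One hedged way to surface this kind of training-data skew is simply to compare the representation of each protected group in the training set against an assumed benchmark for the real candidate pool. All numbers and column names below are invented for illustration.

```python
import pandas as pd

# Hypothetical training set for a hiring model, and assumed shares of each group
# in the real applicant pool (both made up for this sketch).
train = pd.DataFrame({"gender": ["M"] * 70 + ["F"] * 30})
benchmark = {"M": 0.5, "F": 0.5}

# Observed share of each group in the training data.
observed = train["gender"].value_counts(normalize=True)

for group, expected_share in benchmark.items():
    share = observed.get(group, 0.0)
    gap = share - expected_share
    print(f"{group}: {share:.0%} of training data vs {expected_share:.0%} expected ({gap:+.0%} gap)")
```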
For another, less obvious example, consider a news reader application that uses a data-driven artificial intelligence algorithm to decide which news articles to recommend to news consumers. This seems like an innocent enough application. Where can there be biases? But let's say you study which articles are recommended to which users. What if it turns out that certain articles, say about economics and finance, are more likely to be recommended to male users, or to users of a certain ethnicity or religion? Or perhaps female users are more likely to be shown articles about social trends and entertainment news? One can examine the recommendations and evaluate whether they exhibit these biases, steering certain kinds of articles to users based on gender, race, or sexual orientation. Such behaviors may or may not be illegal, but they would certainly be damaging to the image and function of the company deploying them. Model monitoring for biases in applications like this is valuable even if there is no regulatory enforcement in place.
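A sketch of how one might audit such a recommender, assuming access to a log of recommendations tagged with user demographics (the log, column names, and topics below are hypothetical): tabulate which topics are shown to which groups and test whether the two are statistically independent.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical recommendation log: one row per (user, recommended article).
recs = pd.DataFrame({
    "user_gender":   ["M", "M", "F", "F", "M", "F", "M", "F"],
    "article_topic": ["finance", "finance", "entertainment", "entertainment",
                      "finance", "finance", "entertainment", "entertainment"],
})

# Contingency table: how often each topic is recommended to each group.
table = pd.crosstab(recs["user_gender"], recs["article_topic"])

# Chi-square test of independence between user group and recommended topic.
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi-square p-value: {p_value:.3f} (small values suggest the topic mix depends on gender)")
```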
In all of these examples, if bias is detected, it is critical to understand what the source of the bias is. Is there bias in the underlying training data, or in some features of the training data, that causes the models to exhibit biased behavior? Is there a feature of the design of the algorithm deploying the model that leads to the bias? Is the test data, the set of candidates the algorithm has been applied to, somehow contributing to the biased decision-making? Or can it be determined that the perception of bias is simply that, a perception, and that the algorithm is in fact behaving in an unbiased way according to valid statistical validation methodologies?
All of these questions can be answered if the models deployed are designed to be “explainable”: that is, if the models are built so one can identify how much weight a particular feature of the model, or a feature of the training data, carried in a particular decision. For instance, in models based on multivariate regressions, each feature of the data is assigned a weight indicating how much it contributes to the overall prediction. In that case, it is quite easy to identify the features that led to a decision. Similarly, in probabilistic rule-based model systems, it is usually easy to see where features contribute to rules, and which rules played significant roles in the overall outcome. In a neural-network-based model (otherwise known as deep learning), the weightings are non-linear and the structure of the model is hidden from the user, so such attribution is more difficult. Nonetheless, there are ways of looking inside neural network models to see whether features play a significant role in decision-making, even if one cannot determine a precise weight for each feature's role.
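As a rough illustration of that contrast, the sketch below (using scikit-learn and synthetic data; all feature names and values are made up) reads per-feature weights directly off a logistic regression, then uses permutation importance to estimate feature influence for a small neural network whose internal weights are not directly interpretable.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
feature_names = ["income", "debt_ratio", "years_employed"]  # hypothetical features
X = rng.normal(size=(500, 3))
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# A linear model: its coefficients directly expose how much each feature pushes the decision.
linear = LogisticRegression().fit(X, y)
for name, coef in zip(feature_names, linear.coef_[0]):
    print(f"{name:>15}: weight {coef:+.2f}")

# A neural network: no per-feature weights to read off, but permutation importance estimates
# each feature's contribution by shuffling it and measuring the resulting drop in accuracy.
nn = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0).fit(X, y)
result = permutation_importance(nn, X, y, n_repeats=10, random_state=0)
for name, importance in zip(feature_names, result.importances_mean):
    print(f"{name:>15}: importance {importance:.3f}")
```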
Other modeling systems do not have this “explainability” feature. Many neural networks extract features from data automatically, using machine learning to determine which features matter, and many of these systems obfuscate the relationship between human-identifiable features, like gender, race, or religion, and the final decision. NLP-based and vision-based systems extract information from unstructured data in ways that do not allow the features they extract to be traced back to bias-related features of the training data. In those cases, bias detection requires a deeper analysis, e.g., modifying the training or test data in systematic ways to see whether the bias in the predictions disappears or changes in nature.
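One simple form of such a perturbation analysis is a counterfactual check: flip a protected attribute in the test records and measure how often the model's decision changes. The helper below is a hypothetical sketch; the model object and column name are assumptions, and a low flip rate does not rule out bias leaking in through proxy features.

```python
import numpy as np
import pandas as pd

def counterfactual_flip_rate(model, X: pd.DataFrame, protected_col: str) -> float:
    """Flip a binary protected attribute for every test record and measure how often
    the model's prediction changes. A rate near zero suggests decisions do not hinge
    directly on that attribute (though correlated proxy features can still carry bias)."""
    X_flipped = X.copy()
    X_flipped[protected_col] = 1 - X_flipped[protected_col]
    original = model.predict(X)
    flipped = model.predict(X_flipped)
    return float(np.mean(original != flipped))

# Usage sketch (hypothetical model and test set):
# rate = counterfactual_flip_rate(loan_model, X_test, protected_col="gender")
# print(f"{rate:.1%} of decisions change when the protected attribute is flipped")
```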
To sum up, regulatory concerns about bias and model explainability can lead users of artificial intelligence and data-driven decision systems either to limit the kinds of models they are willing to deploy, in order to maximize the explainability of the decisions those systems make, or to adopt tools that can get inside the black box in some other way to expose the sources of bias and help users avoid it.