“But Why?” Understanding Explainable Artificial Intelligence
Opaque algorithms get to score and choose in many areas using their own inscrutable logic. To whom are these algorithms held accountable? And what is being done to make them explainable?
By Tim Miller
DOI: 10.1145/3313107
Imagine the following. You finish as one of the top students in your class and are looking for that all-important first step in your career. You see an exciting graduate role at a great organization, which matches your own skills, experience, and characteristics. Excited, you tailor your curriculum vitae to the role, and a couple of days before the deadline, you submit it. Just 30 seconds later, you see an email from the company. You open it and it says: “Thank you for your application to our graduate program. Unfortunately, you have not been successful in this round. We hope you will consider us for future roles.”
Puzzled, you wonder how they could have made a decision so quickly. Some digging on the Internet shows that the company uses an automated algorithm to filter out most applicants before they are ever seen by a human. “No problem,” you think, “I’ll just find out why I could not pass the filter, update my application, and re-apply in the next round.” However, your request to the company is met with the response: “We use advanced machine learning algorithms to make these decisions. These algorithms do not offer any reasons for their output. It could be your work experience, but it could simply be the terminology used in your application. We really cannot tell.” You have missed out on your dream job and have no idea why, nor what you could have done differently.
MODERN DECISION MAKING
This scenario may sound somewhat dystopian, but it is precisely what many recruitment organizations do right now. Automated algorithms are being used to assess and classify applicants’ potential, while only the highest-ranked applications make it into human hands. Typically, these filters
are machine-learning algorithms. This means their decision processes are not written by a person but are learned from data. In the training process, a number of important features are selected, such as degree name, institute name, grade-point average, and work experience, but also the terminology and phrasing used in the application. Then, taking a huge stack of prior applications and the final decisions on these applications (hire or not hire), a mathematical model is automatically derived, using a machine learning algorithm, that predicts the likelihood of a person being hired. It does this by discovering patterns in the underlying data. For example, certain institutes produce more suitable graduates than others, and so do certain styles of writing. This model is then used to predict future applicants’ chances of being hired, filtering out those with a low rating.
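As a rough sketch of the kind of training pipeline described above (not the system used by any particular recruiter), the following Python snippet fits a classifier on a handful of invented past applications; the column names, data, and choice of scikit-learn components are all assumptions made for illustration.

```python
# A minimal sketch, assuming scikit-learn and pandas are available.
# The features and past decisions below are invented.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

past = pd.DataFrame({
    "institute":  ["Uni A", "Uni B", "Uni A", "Uni C"],
    "gpa":        [3.8, 3.1, 3.6, 2.9],
    "experience": [1, 0, 2, 0],                    # years of work experience
    "cover_text": ["led a team of volunteers", "assisted with filing",
                   "managed a product launch", "helped in the shop"],
    "hired":      [1, 0, 1, 0],                    # past human decisions
})

# Categorical, numeric, and free-text features are all turned into numbers.
features = ColumnTransformer([
    ("institute", OneHotEncoder(handle_unknown="ignore"), ["institute"]),
    ("wording", TfidfVectorizer(), "cover_text"),
], remainder="passthrough")                        # gpa, experience pass through

model = Pipeline([("features", features),
                  ("classifier", LogisticRegression(max_iter=1000))])
model.fit(past.drop(columns="hired"), past["hired"])

# New applicants are ranked by their predicted probability of being hired;
# only the highest-scoring ones ever reach a human recruiter.
new_applicant = past.drop(columns="hired").iloc[[0]]
print(model.predict_proba(new_applicant)[:, 1])
```

A real system would use far more data and features, but the shape of the pipeline is the same: past human decisions in, a scoring function out.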
These are what are sometimes called “black-box” algorithms. This means that, for recruiters and applicants alike, some input (the features for a particular individual) is fed into the algorithm, and then some output appears. There is no indication of what is happening inside.
These algorithms are not limited to recruitment. They are used to make sensitive and important decisions, such as estimating the probability of a prisoner re-offending if released, estimating the risks of children being neglected by their parents, and estimating the likelihood of someone defaulting on a bank loan; as well as in much more mundane tasks, such as voice-based interaction with smartphones and recommending movies on streaming services based on what you have watched previously.
Worryingly, many of these models have shown bias against certain groups of people, precisely because the data used to train them were also biased. The most salient example is Amazon’s recruiting tool, which was trained on past applications and learnt to prefer male applicants over female applicants. However, gender was not one of the features used. Instead, the algorithm discovered a pattern in which applications that used more “masculine language” were
more likely to be hired. Ultimately, Amazon downgraded the use of the tool, using it only as a recommendation to a human recruiter. This does nothing to solve the problem, however, because it is the recruiters who fed the machine-learning algorithm the biased data in the first place.
ANSWERING “WHY?” FOR TRUSTED AND ETHICAL ARTIFICIAL INTELLIGENCE
Cases such as Amazon’s are far from isolated. Recent books such as O’Neil’s Weapons of Math Destruction and Automating Inequality show the impact that poor automated decision-making algorithms have on real people.
One step toward improving people’s trust in these algorithms, and ultimately toward producing ethical artificial intelligence, is to produce artificial intelligence that can explain why it made a decision.
The field of explainable artificial intelligence (XAI) aims to address this problem. Given some output from an algorithm, an explainable algorithm can provide justifications or reasons why this output was reached. This can range from simply noting which inputs were the most important, to providing some details of the inner workings of the model, to informing people what they would need to do differently to get a different output, such as an interview.
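To make these three kinds of output concrete, here is a toy sketch built around an invented two-feature linear “screening model” (the weights, threshold, and feature names are made up): it reports which inputs mattered most, shows the model’s inner workings, and computes a simple “what would need to change” counterfactual.

```python
# A toy illustration only; the "model" is an invented linear scoring rule.
import numpy as np

weights = np.array([1.5, 0.8])     # invented weights for gpa and experience
threshold = 7.0                    # invented score needed for an interview
applicant = np.array([3.2, 1.0])   # gpa, years of experience

score = float(weights @ applicant)

# 1. Which inputs mattered most for this decision?
print(dict(zip(["gpa", "experience"], weights * applicant)))

# 2. The inner workings: here the whole model is just the weight vector.
print("weights:", weights, "threshold:", threshold)

# 3. A counterfactual: how much more experience would have flipped the
#    decision, holding everything else fixed?
extra_years = (threshold - score) / weights[1]
print(f"score {score:.1f} < {threshold}; about {extra_years:.1f} more "
      f"years of experience would have led to an interview")
```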
DO WE NEED EXPLAINABLE ARTIFICIAL INTELLIGENCE?
There are some people in artificial intelligence who reject the need for XAI. For example, Geoffrey Hinton, a co-developer of the backpropagation algorithm widely used in deep learning and a highly respected AI researcher, was recently quoted [1] as saying that asking algorithms to explain their decisions or beliefs would be “a complete disaster.” He continued, “People can’t explain how they work, for most of the things they do. When you hire somebody, the decision is based on all sorts of things you can quantify, and then all sorts of gut feelings. People have no idea how they do that. If you ask them to explain their decision, you are forcing them to make up a story.”
Hinton is referring to the concept of post-hoc explanations: human explanations are constructed after our reasoning, and these explanations can be inaccurate. His claim appears to be that because neural networks are a metaphor for the human brain and are hard to understand, explaining their decisions would require neural network architectures to “make up stories” too. These would be incorrect or incomplete and therefore “a complete disaster.” Hinton and others argue that, in order to trust these systems, we should instead regulate them based on their outputs; that is, based on their performance over many tasks.
I think this reasoning is entirely incorrect.
First, it is contradictory. Hinton’s own words are an explanation of how he reached his opinion on XAI. Is he making up a story about this? I imagine he would claim it is based on careful reasoning. But in reality, his words are abstract summaries about the neurons in his brain firing in a particular way that nobody understands. The ability to produce and communicate such summaries to others is a strength of the human brain. Philosopher Daniel Dennett claims consciousness itself is simply our brain creating an “edited digest” of our brain’s inner workings, for precisely the purpose of communicating our thoughts and intentions (including explanations) to others. There is no theoretical barrier I know of that would prevent similar digests for machine learning models.
Second, these arguments against explainability are typically made on the grounds of regulation. But what about ethics? Is it ethical to make important decisions about individuals without being able to explain those decisions? For example, in some U.S. states, parole judges use algorithms that predict the likelihood of a prisoner re-offending if released from jail, influencing parole decisions. Is it ethical to keep someone in jail on the basis of a decision from a black box, without them knowing why or knowing what they need to do before their next hearing? Similarly, is it ethical to reject an individual’s application because a black box with 95 percent accuracy says so, with no way for the applicant to find out how to improve? Many people will claim it is not. Hinton and many AI experts argue we should merely run experiments and see if algorithms are biased or safe, effectively ignoring the small percentage of wrong decisions. These types of arguments seem to be born from the privilege of white middle-class males such as myself, who are unlikely to be adversely affected by such decisions.
Third, at the individual level, there is also an issue of trust. Would you trust an algorithm that advises you to have invasive surgery if it could not give any reasons why? Would a doctor trust the advice of this algorithm? With the Watson Health project, IBM is finding out that earning the trust of medical professionals is a difficult barrier to break. People do not develop trust based only on statistics; they develop trust based on their own interactions and experiences, and on those of the people around them. This is a mistake that the artificial intelligence community continues to make, to our detriment.
XAI: THE CHALLENGES
Explainability is not a new challenge in artificial intelligence. The first research on XAI dates back to the mid-1980s, with approaches to explaining rule-based expert systems. However, the modern techniques employed in artificial intelligence, in particular machine-learning techniques that use deep neural networks, have recently led to an explosion of interest in the topic. Further, these techniques present new challenges to explainability that were not present in the expert systems of the 1980s.
Challenge 1: Opaqueness. Modern AI models, in particular deep neural networks, are not particularly easy to understand. If we consider an algorithm written by a software engineer, we could give that software engineer a sample input and they would typically be able to tell you what the output would be. With most techniques in artificial intelligence, this is not the case. For example, given a logistic regression model or a heuristic search algorithm, the output of the algorithm is harder to predict. In the logistic regression case, this is because the regression equation was not derived manually but from the underlying data, and it is not in a simple if/then rule format that facilitates easy understanding. For heuristic search, it is because the algorithm derives a small ‘program’ on the fly from its inputs and goal, and it will find one of many possible solutions to the task. Models such as these are still somewhat understandable, to the point where, given the actual output, the engineer could probably trace it back quite easily and see why it was made.
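As a small illustration of “somewhat understandable,” the sketch below (synthetic data; scikit-learn assumed) fits a logistic regression and then traces one prediction back through the learned coefficients.

```python
# A minimal sketch: the fitted equation is not an if/then rule, but a
# given output can still be traced back through the coefficients.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # three invented features
y = (X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0]
logit = model.intercept_[0] + model.coef_[0] @ x
print("per-feature contributions:", model.coef_[0] * x)
print("traced probability:", 1 / (1 + np.exp(-logit)))
print("model probability: ", model.predict_proba([x])[0, 1])
```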
Deep neural networks, on the other hand, are highly opaque. Not only will the engineer have a hard time predicting the output, but they will have a hard time tracing the reasoning back and debugging. This is because the power of deep neural networks comes from the so-called hidden layers, where the learning algorithms find important correlations between variables and “store” them in nodes that have no meaningful labels attached to them, as shown in Figure 1. I believe that, rather than being driven by ethics and trust, most XAI is driven by AI researchers simply wanting to understand their own models better to improve them.
Figure 1. An artificial neural network (image by Glosser.ca, CC BY-SA 3.0).
Figure 2. Saliency map highlighting important pixels of an image of a moka pot (original image on left, saliency map on the right).
Figure 3. Causal model.
Researchers are attempting to overcome opaqueness in three main ways:
1. Importance weighting. Many techniques reverse-engineer which parts of an input were “important” for a decision; for example, by highlighting which pixels in an image were important in recognizing a particular object. This is illustrated by the saliency map for an image of a moka pot presented in Figure 2.
2. Interpretable models. Other techniques aim to extract or learn a less opaque (more interpretable) model; for example, given a particular decision from a deep neural network, produce a decision tree or linear equation that approximates that decision and is easier to understand (a rough sketch of this idea follows this list).
3. Discard deep neural network models. More recently, some machine learning researchers advocate the idea of using deep neural networks to discover important features in the hidden layers, debugging the neural network to find what those hidden features mean, and then discarding the deep neural network model, just using the discovered features to learn a more interpretable model, such as random forests or linear regression.
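As a rough sketch of the second approach (and not a faithful reproduction of any specific published method), the snippet below trains a small decision tree to mimic the predictions of an opaque model; the data is synthetic, and a gradient-boosted ensemble stands in for the black box.

```python
# A minimal surrogate-model sketch, assuming scikit-learn is available.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# The surrogate is trained on the black box's *outputs*, not the true labels,
# so it approximates the black box's decision logic in a readable form.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print("agreement with black box:", surrogate.score(X, black_box.predict(X)))
print(export_text(surrogate,
                  feature_names=[f"feature_{i}" for i in range(5)]))
```

The printed tree is readable, but its fidelity to the black box is only approximate, which is part of why such explanations mainly help experts.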
In my view, none of these is the solution to XAI; they offer little insight to anyone other than experts in the field. However, they will be useful for those experts and, importantly, could be used to form the basis of explainability for end users.
Challenge 2: Causality. Causal- ity between two events indicates one event was the result of the other; for example, smoking causes lung cancer. Correlation is a statistical measure that indicates a relationship between two variables; for example, smoking is highly correlated with alcoholism. However, smoking does not cause al- coholism and alcoholism does not cause smoking. There is some con- founding variable, such as lifestyle, that causes both, so they are positively correlated. Figure 3 shows a causal model of this example.
Machine-learning algorithms excel at finding correlations between things.
For example, age, gender, and previous purchases correlate with future purchases, as do the types of volunteer work people do with their likelihood of being hired at a particular organization. However, using statistics to find such relationships does not uncover causes.
This leads to issues in explainability, because explanations that refer to causes, known as causal explanations, are easier for people to understand. If I asked an algorithm why it predicted that a particular person had a high chance of lung cancer, the response “Because they are an alcoholic” would be unsatisfying. It is the reason why the prediction was made, but it is not why they may have lung cancer. A better explanation would be “Because people who drink a lot tend to lead a lifestyle in which they smoke a lot, and smoking causes cancer.”
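The following small simulation, with invented effect sizes, mirrors the causal structure of Figure 3: a hidden “lifestyle” variable drives both smoking and drinking, only smoking drives lung cancer, and yet drinking still correlates with cancer in the resulting data.

```python
# A minimal sketch of confounding; all effect sizes are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

lifestyle = rng.normal(size=n)                               # hidden confounder
smoking   = (lifestyle + rng.normal(size=n) > 1.0).astype(float)
drinking  = (lifestyle + rng.normal(size=n) > 1.0).astype(float)
cancer    = (2.0 * smoking + rng.normal(size=n) > 2.5).astype(float)

# Drinking "predicts" cancer even though it causes nothing.
print("corr(drinking, cancer):", np.corrcoef(drinking, cancer)[0, 1])

# Holding the true cause (smoking) fixed, the association largely vanishes.
for s in (0.0, 1.0):
    mask = smoking == s
    print(f"corr(drinking, cancer | smoking={s:.0f}):",
          np.corrcoef(drinking[mask], cancer[mask])[0, 1])
```

A model trained on such data would happily use drinking as a predictor, which is exactly why its “explanation” would feel wrong to a person.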
There are two ways to get around this issue:
1. Learn causal models. Instead of learning about correlations, machine-learning models can learn about causes. This would be ideal but has major challenges. First, we often do not have access to confounding variables such as “lifestyle,” meaning that finding the causes can be challenging. Second, machine learning is built on the field of statistics, and for decades statisticians have ignored causality, meaning that the foundations of learning causes are only in their infancy.
2. Exploiting human strengths. Humans are hard-wired to extract causes from a series of events. For example, if we see a cartoon character push over another character, we identify the push as the cause. However, this is not literally true: the illustrator “caused” the fall. If people were not hard-wired to extract causes from events, we would not be able to watch cartoons, or even television or movies. XAI can exploit this by giving users enough information to see the correlations and let them determine the causes (perhaps incorrectly) themselves, the same way we do with cartoons.
Challenge 3: Human-centeredness.
The final challenge for XAI is that it is ultimately humans with little
knowledge of AI who will need to understand decisions.
As discussed earlier, I believe most XAI is driven by AI experts’ desire to better understand, debug, and improve their own models. In my view, this has led to a situation in which “explainable” AI is giving explanations for other experts, rather than for non-experts. Such systems will not offer job seekers or prisoners up for parole sufficient insight into why they did not get the decision that they hoped for.
To achieve truly explainable AI, we will need to attack both the technical and the human challenges. The starting point for this is to determine what users (and others affected by decisions) would like to understand about AI-based decisions, and what society thinks are the ethically important questions that need to be answered. This is in contrast to the current orthodoxy, in which AI experts think about their complex models, determine what they think are the important parts of the model that need to be explained, and then hope that these satisfy users.
Recently, I published an article that surveyed more than 200 papers from philosophy, cognitive psychology/science, and social psychology on what an explanation is and how people generate, select, present, and evaluate explanations [2]. The key finding is that human-to-human explanations are context sensitive, which means they do not just provide causes for decisions; rather, explanations differ based on the particular question asked and the people to whom the explanation is presented. In particular, explanations are social processes; when explainers give explanations, explainees argue or ask follow-up questions.
While this finding may seem obvious, research in XAI is only now starting to frame the problem like this. Most prior research treats an explanation as a set of statements, not a process, simply highlighting important causes while providing no chance of follow-up if the explainee is not satisfied. While some of the prior research can form part of an ongoing interaction between AI and humans, much more research is needed to determine what questions people want to ask, how to elicit these questions, how to answer them, how to present the answers, and how to determine whether someone understands the explanation.
DISCUSSION
As AI becomes more prevalent in our world, it will continue to make important decisions that have real impact on people’s lives. Ethical concerns and lack of trust in these technologies will continue to limit their adoption. In my view, XAI will be one piece of the solution. Ultimately, I believe this is a multi-disciplinary problem that will need to combine computer science, social science, and human-computer interaction.
XAI is not a panacea for all ethical concerns and problems of distrust of artificial intelligence. However,