Ethics
COMP90042
Natural Language Processing
Lecture 22
Semester 1 2021 Week 11 Jey Han Lau
COPYRIGHT 2021, THE UNIVERSITY OF MELBOURNE
What is Ethics?
• How we ought to live (Socrates)
• What is the right thing to do? Why?
Why Should We Care?
• AI technology is increasingly being deployed in real-world applications
• It has real and tangible impact on people
• Whose responsibility is it when things go wrong?
Why Is Ethics Hard?
• Often there is no objective truth, unlike in the sciences
• A new philosophy student may ask whether fundamental ethical theories such as utilitarianism are right
• But a new physics student is unlikely to question the laws of thermodynamics
• In examining a problem, we need to think from different perspectives to justify our reasons
Learning Outcomes
• Think more about the application you build
‣ Not just its performance
‣ Its social context
‣ Its impact on other people
‣ Unintended harms
• Be a socially responsible scientist or engineer
Outline
• Arguments against ethical checks in NLP
• Core NLP ethics concepts
• Group discussion
Arguments Against Ethical Checks in NLP
Should We Censor Science?
• A common argument when ethical checks or processes are introduced:
‣ Should there be limits to scientific research? Is it right to censor research?
• Ethical procedures are common in other fields: medicine, biology, psychology, anthropology, etc.
Should We Censor Science?
• In the past, ethical procedures weren't common in computer science
• But this doesn't mean it shouldn't change
• Technology is increasingly being integrated into society; the research we do nowadays is more likely to be deployed than research from 20 years ago
H5N1
• Ron Fouchier, a Dutch virologist, discovered how to make bird flu potentially more harmful in 2011
• The Dutch government objected to publishing the research
• Raised a lot of discussion and concern
• National policies enacted
Isn’t Transparency Always Better?
• Is it always better to publish sensitive research publicly?
• Argument: it is worse if such research is done underground
• If the goal is to raise awareness, scientific publication isn’t the only way
‣ Could work with media to raise awareness
‣ Doesn’t require exposing the technique
AI vs. Cybersecurity
• Exposing a vulnerability publicly is desirable in cybersecurity applications
‣ Easy for the developer to fix the problem
• But the same logic doesn’t always apply to AI
‣ Not easy to fix once the technology is out
Core NLP Ethics Concepts
Bias
• Most ethics research in NLP focuses on this aspect
• A biased model is one that performs unfavourably against certain groups of users
‣ typically based on demographic features such as gender or ethnicity
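
One way to make "performs unfavourably against certain groups" concrete is to compare a model's accuracy per group. The sketch below is a minimal illustration, not a method from the lecture: the function names, the group labels "A" and "B", and the toy predictions are all hypothetical, and accuracy gap is only one of several possible measures.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: accuracy} for (group, gold_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, gold, pred in records:
        total[group] += 1
        correct[group] += int(gold == pred)
    return {g: correct[g] / total[g] for g in total}

def accuracy_gap(records):
    """Largest difference in accuracy between any two groups."""
    scores = accuracy_by_group(records)
    return max(scores.values()) - min(scores.values())

# Hypothetical toy predictions: (group, gold label, predicted label).
toy = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 1),
]

print(accuracy_by_group(toy))  # per-group accuracy
print(accuracy_gap(toy))       # a large gap suggests one group is served worse
```

In this toy data the model is right on every group "A" example but on only half of the group "B" examples, which is the kind of disparity the definition above describes.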
Bias
• Bias isn’t necessarily bad
‣ Guides the model to make informed decisions in the absence of more information
‣ Truly unbiased system = system that makes random decisions
‣ Bad when it overwhelms evidence
• Bias can arise from data, annotations, representations, models, or research design
Bias in Word Embeddings
• Word Analogy (lecture 10):
‣ v(man) – v(woman) ≈ v(king) – v(queen)
• But!
‣ v(man) – v(woman) ≈ v(programmer) – v(homemaker)
‣ v(father) – v(mother) ≈ v(doctor) – v(nurse)
‣ Word embeddings reflect and amplify gender stereotypes in society
‣ Lots of work done to reduce bias in word embeddings
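
The analogy arithmetic above can be sketched with cosine similarity. The 3-dimensional vectors below are hand-made for illustration only (real embeddings are learned and have hundreds of dimensions), and `gender_score` is a hypothetical helper, not a standard bias metric or a debiasing method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def subtract(u, v):
    """Element-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]

# Hypothetical toy vectors; dimension 0 loosely plays the role of a gender axis.
v = {
    "man":        [1.0, 0.2, 0.1],
    "woman":      [-1.0, 0.2, 0.1],
    "programmer": [0.6, 0.9, 0.3],
    "homemaker":  [-0.7, 0.8, 0.4],
    "table":      [0.0, 0.1, 0.9],
}

# Direction from "woman" to "man" in the toy space.
gender_direction = subtract(v["man"], v["woman"])

def gender_score(word):
    """Positive: leans towards 'man'; negative: leans towards 'woman'."""
    return cosine(v[word], gender_direction)

# How closely the occupation difference lines up with the gender difference.
analogy_fit = cosine(gender_direction, subtract(v["programmer"], v["homemaker"]))

for word in ("programmer", "homemaker", "table"):
    print(word, round(gender_score(word), 3))
print("analogy fit:", round(analogy_fit, 3))
```

In this toy space the occupation words lean in opposite directions along the gender axis while a neutral word such as "table" does not, which mirrors the stereotyped analogies listed above.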
Dual Use
• Every technology has a primary use, and unintended secondary consequences
‣ nuclear power, knives, electricity
‣ could be abused for things they were not originally designed to do
• Since we do not know how people will use a technology, we need to be aware of this duality
OpenAI GPT-2
• OpenAI developed GPT-2, a large language model trained on massive web data (lecture 1 demo)
• Kickstarted the pretrained model paradigm in NLP
‣ Fine-tune pretrained models on downstream tasks (BERT lecture 11)
• GPT-2 also has amazing generation capability
‣ Can be easily fine-tuned to generate fake news, create propaganda
OpenAI GPT-2
• Pretrained GPT-2 models were released in stages over 9 months, starting with smaller models
• OpenAI collaborated with various organisations to study the social implications of very large language models over this time
• OpenAI’s effort is commendable, but this is voluntary
• Further raises questions about self-regulation
Privacy
• Often conflated with anonymity
• Privacy means nobody knows I am doing something
• Anonymity means everyone knows what I am doing, but not that it is me
GDPR
• Regulation on data privacy in the EU
• Also addresses transfer of personal data
• Aims to give individuals control over their personal data
• Organisations that process EU citizens’ personal data are subject to it
• Organisations need to anonymise data so that people cannot be identified
• But we have technology to de-anonymise author attributes
AOL Search Data Leak
• In 2006, AOL released anonymised search logs of users
• The logs contained sufficient information to de-anonymise individuals
‣ Through cross-referencing with phonebook listings, an individual was identified
• Lawsuit filed against AOL
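
Why removing names may not be enough can be sketched with a simple uniqueness check over the remaining attributes. This borrows the idea of k-anonymity, which is not mentioned in the lecture; the field names and records below are hypothetical and do not come from the AOL data.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def k_anonymity(records, quasi_identifiers):
    """Smallest group size; k = 1 means at least one record is unique."""
    return min(group_sizes(records, quasi_identifiers).values())

# Hypothetical "anonymised" records: names removed, other attributes kept.
records = [
    {"user": "u1", "town": "Smalltown", "age_band": "60-69"},
    {"user": "u2", "town": "Bigcity", "age_band": "20-29"},
    {"user": "u3", "town": "Bigcity", "age_band": "20-29"},
]

sizes = group_sizes(records, ["town", "age_band"])
k = k_anonymity(records, ["town", "age_band"])
print(sizes)
print("k =", k)
```

A record whose attribute combination is unique (k = 1) is the kind that an outside source, such as a phonebook, could plausibly be matched against.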
Group discussion
Prompts
• Primary use: does it promote harm or social good?
• Bias?
• Dual use concerns?
• Privacy concerns? What sorts of data does it use?
• Other questions to consider:
‣ Can it be weaponised against populations (e.g. facial recognition, location tracking)?
‣ Does it fit people into simple categories (e.g. gender and sexual orientation)?
‣ Does it create alternate sets of reality (e.g. fake news)?
Automatic Prison Term Prediction
• A model that predicts the prison sentence of an individual based on court documents
Automatic CV Processing
• A model that processes CVs/resumes for a job to automatically filter candidates for interview
Language Community Classification
• A text classification tool that distinguishes LGBTQ from heterosexual language
• Motivation: to understand how language used in the LGBTQ community differs from that of the heterosexual community
Take Away
• Think about the applications you build
• Be open-minded: ask questions, discuss with others
• NLP tasks aren’t always just technical problems
• Remember that the applications we build could change someone else’s life
• We should strive to be socially responsible engineers/scientists
Readings (Optional)
• The Elements of Moral Philosophy by James and Stuart Rachels