
COPYRIGHT 2021, THE UNIVERSITY OF MELBOURNE

COMP90042
Natural Language Processing

Lecture 22
Semester 1 2021 Week 11

Jey Han Lau

Ethics

COMP90042 L22


What is Ethics?

• What is the right thing to do?

• Why?

How we ought to live — Socrates


Why Should We Care?

• AI technology is increasingly being deployed in
real-world applications

• It has a real and tangible impact on people

• Whose responsibility is it when things go wrong?


Why Is Ethics Hard?

• Often no objective truth, unlike the sciences

• A new philosophy student may ask whether
fundamental ethical theories such as utilitarianism
are right

• But a new physics student is unlikely to question
the laws of thermodynamics

• In examining a problem, we need to think from
different perspectives to justify our reasoning


Learning Outcomes

• Think more about the application you build

‣ Not just its performance

‣ Its social context

‣ Its impact on other people

‣ Unintended harms

• Be a socially responsible scientist or engineer


Outline

• Arguments against ethical checks in NLP

• Core NLP ethics concepts

• Group discussion


Arguments Against

Ethical Checks in NLP


Should We Censor Science?

• A common argument when ethical checks or
processes are introduced:

‣ Should there be limits to scientific research? Is
it right to censor research?

• Ethical procedures are common in other fields:
medicine, biology, psychology, anthropology, etc.


Should We Censor Science?

• In the past, this wasn’t common in computer science

• But that doesn’t mean it shouldn’t change

• Technology is increasingly integrated into
society; the research we do nowadays is more
likely to be deployed than it was 20 years ago


H5N1

• Ron Fouchier, a Dutch virologist, discovered how
to make bird flu potentially more harmful in 2011

• Dutch government objected to publishing the
research

• Raised a lot of discussions and concerns

• National policies were enacted in response


Isn’t Transparency Always Better?

• Is it always better to publish sensitive research
publicly?

• Argument: it is worse if such research is driven
underground

• If the goal is to raise awareness, scientific
publication isn’t the only way

‣ Could work with media to raise awareness

‣ Doesn’t require exposing the technique


AI vs. Cybersecurity

• Exposing vulnerabilities publicly is desirable in
cybersecurity applications

‣ Makes it easy for developers to fix the problem

• But the same logic doesn’t always apply for AI

‣ Not easy to fix, once the technology is out


Core NLP Ethics Concepts


Bias

• Most ethics research in NLP focuses on this aspect

• A biased model is one that performs unfavourably
against certain groups of users

‣ typically based on demographic features such
as gender or ethnicity


Bias

• Bias isn’t necessarily bad

‣ It can guide the model to make informed decisions
in the absence of more information

‣ A truly unbiased system is one that makes
random decisions

‣ Bias is bad when it overwhelms the evidence

• Bias can arise from data, annotations,
representations, models, or research design


Bias in Word Embeddings
• Word Analogy (lecture 10):

‣ v(man) – v(woman) ≈ v(king) – v(queen)

• But!

‣ v(man) – v(woman) ≈ v(programmer) – v(homemaker)

‣ v(father) – v(mother) ≈ v(doctor) – v(nurse)

‣ Word embeddings reflect and amplify gender
stereotypes in society

‣ Lots of work done to reduce bias in word embeddings
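The analogy arithmetic above can be sketched with toy vectors. Everything below is hypothetical: the 3-dimensional vectors and the tiny vocabulary are made up purely to mimic the gendered offsets described, whereas real embeddings such as word2vec or GloVe are learned from corpora and have hundreds of dimensions.

```python
import numpy as np

# Toy 3-d word vectors; values are invented for illustration only.
VECTORS = {
    "man":        np.array([0.9, 0.1, 0.2]),
    "woman":      np.array([0.1, 0.9, 0.2]),
    "programmer": np.array([0.8, 0.2, 0.9]),
    "homemaker":  np.array([0.0, 1.0, 0.9]),
    "father":     np.array([0.9, 0.1, 0.5]),
    "mother":     np.array([0.1, 0.9, 0.5]),
    "doctor":     np.array([0.85, 0.15, 0.95]),
    "nurse":      np.array([0.05, 0.95, 0.95]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, vectors=VECTORS):
    """Solve 'a is to b as c is to ?' by finding the word whose vector
    is closest to v(b) - v(a) + v(c), excluding the query words."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

# The gendered offset drags occupation words along stereotyped lines:
print(analogy("man", "programmer", "woman"))   # homemaker
print(analogy("father", "doctor", "mother"))   # nurse
```

With embeddings trained on real web text, the same nearest-neighbour query surfaces the stereotyped completions, which is exactly the problem the debiasing work above targets.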


Dual Use

• Every technology has a primary use, and
unintended secondary consequences

‣ nuclear power, knives, electricity

‣ could be abused for purposes they were not
originally designed for

• Since we do not know how people will use it, we
need to be aware of this duality


OpenAI GPT-2

• OpenAI developed GPT-2, a large language model
trained on massive web data (lecture 1 demo)

• Kickstarted the pretrained model paradigm in NLP

‣ Fine-tune pretrained models on downstream
tasks (BERT lecture 11)

• GPT-2 also has amazing generation capability

‣ Can be easily fine-tuned to generate fake news,
create propaganda


OpenAI GPT-2

• Pretrained GPT-2 models released in stages over
9 months, starting with smaller models

• Collaborated with various organisations to study
social implications of very large language models
over this time

• OpenAI’s effort is commendable, but it was
voluntary

• This raises further questions about whether
self-regulation is enough


Privacy

• Often conflated with anonymity

• Privacy means nobody knows I am doing
something

• Anonymity means everyone knows what I am
doing, but not that it is me


GDPR

• Regulation on data privacy in EU

• Also addresses transfer of personal data

• Aims to give individuals control over their personal data

• Organisations that process EU citizens’ personal data
are subject to it

• Organisations need to anonymise data so that people
cannot be identified

• But we have technology that can identify authors and
their attributes from anonymised text


AOL Search Data Leak

• In 2006, AOL released anonymised search logs of
users

• The logs contained sufficient information to
re-identify individuals

‣ Through cross-referencing with phonebook
listings, an individual was identified

• A lawsuit was filed against AOL
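A minimal sketch of this kind of linkage attack, using entirely made-up log entries and phonebook records: the search log carries no names, yet joining query content against publicly available side information is enough to put a name to a pseudonymous user ID.

```python
# Hypothetical "anonymised" search logs: names are removed, but each
# user keeps a stable pseudonymous ID across all of their queries.
logs = [
    {"user_id": 991234, "query": "landscapers in springfield"},
    {"user_id": 991234, "query": "numb fingers remedies"},
    {"user_id": 885511, "query": "cheap flights to paris"},
]

# Publicly available side information (a made-up phonebook).
phonebook = [
    {"name": "A. Example", "town": "springfield"},
    {"name": "B. Sample",  "town": "shelbyville"},
]

def reidentify(logs, phonebook):
    """Link pseudonymous user IDs to names by matching location
    mentions in queries against phonebook listings (toy linkage attack)."""
    matches = {}
    for entry in logs:
        for person in phonebook:
            if person["town"] in entry["query"]:
                matches[entry["user_id"]] = person["name"]
    return matches

print(reidentify(logs, phonebook))  # {991234: 'A. Example'}
```

The real AOL case worked on the same principle: individually innocuous queries, aggregated under one ID, jointly narrowed the user down to a single person.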


Group discussion


Prompts
• Primary use: does it promote harm or social good?

• Bias?

• Dual use concerns?

• Privacy concerns? What sorts of data does it use?

• Other questions to consider:

‣ Can it be weaponised against populations (e.g. facial
recognition, location tracking)?

‣ Does it fit people into simple categories (e.g. gender and
sexual orientation)?

‣ Does it create alternate sets of reality (e.g. fake news)?


Automatic Prison Term Prediction

• A model that predicts the prison sentence of an
individual based on court documents


Automatic CV Processing

• A model that processes CV/resumes for a job to
automatically filter candidates for interview


Language Community Classification

• A text classification tool that distinguishes LGBTQ
from heterosexual language

• Motivation: to understand how language used in
the LGBTQ community differs from that used in
the heterosexual community


Take Away

• Think about the applications you build

• Be open-minded: ask questions, discuss with
others

• NLP tasks aren’t always just technical problems

• Remember that the application we build could
change someone else’s life

• We should strive to be socially responsible
engineers and scientists


Readings (Optional)

• The Elements of Moral Philosophy by James and
Stuart Rachels