Algorithmic Fairness
Motivation: Algorithms influence our lives in many ways
• Machine-learning-based systems have been used to automate complex decisions, e.g. for:
• Selecting job applicants
• Recidivism prediction and predictive policing
• Credit scoring and loans
• Facial recognition
• Search and recommendations
• Machine Translation
• … and many other critical applications (involving humans)
• Unfortunately it has been repeatedly shown that these systems are (often significantly) biased
Biased algorithms influence our lives in many ways
• Selecting job applicants
• XING ranks less qualified male candidates higher than more qualified
female candidates (Lahoti et al. 2018)
• Recidivism prediction and predictive policing
• COMPAS: among defendants who did not re-offend, 23.5% of white vs. 44.9% of Black defendants were labeled high-risk (false positives);
among those who did re-offend, 47.7% of white vs. 28.0% of Black defendants were labeled low-risk (false negatives) (ProPublica analysis)
• Facial recognition
• Commercial software has much lower accuracy on darker-skinned females
(Buolamwini and Gebru, 2018)
• Search and recommendations
• Search queries for African-American names more likely to return ads
suggestive of an arrest (Sweeney, 2019)
• Bias found in word embeddings
• man − woman = surgeon − nurse (Bolukbasi et al., 2016)
What causes the bias?
• Tainted training data: Any ML system maintains (and amplifies) the existing bias in the data caused by human bias, e.g. hiring decisions made by a (biased) manager used as labels, historic and systematic biases in the data collection process, etc.
• Skewed sample: Initial predictions influence future observations, e.g. regions with an initially high crime rate get more police attention (and thus higher recorded crime in the future); a form of selection bias
• Proxies: Even if we exclude legally protected features (e.g. race, gender, sexuality) other features may be highly correlated with these
• Sample size disparity: Models will tend to fit the larger groups first (possibly) trading off accuracy for minority groups
• Limited features: Features may be less informative or reliably collected for minority groups
Why Fairness is Hard
• How to define fairness?
• How can we formulate it so it can be considered in ML systems?
• Two distinct notions from the law (Barocas and Selbst, 2016):
• Disparate treatment: decisions are (partly) based on the subject’s
sensitive attribute
• Disparate impact: disproportionately hurt (or benefit) people with
certain sensitive attribute values
• Currently, no consensus on the mathematical formulations of fairness
An illustrating example
• We are a bank trying to fairly decide who should get a loan
• i.e. predict which people will likely pay us back
• We have two groups: Blue and Orange (the sensitive attribute)
• This is where discrimination could occur
Figure: Simulating loan thresholds, research.google.com/bigpicture
Definitions of Fairness
• How can we test if our (loan repay) classifier is fair?
• The notions of Group fairness aim to treat all groups equally
• e.g. We can require that the same percentage of Blue and Orange
receive loans
• or Require equal false positive/negative rates, e.g.
P (no loan | would repay, Blue) = P (no loan | would repay, Orange)
• Individual notions of fairness (treat similar examples similarly) also exist but won’t be covered in this lecture
• Counterfactual fairness uses tools from causal inference
• Same decision in the actual world and in a counterfactual world where the individual belongs to a different group
Setup – Group Fairness
Consider binary classification with single sensitive attribute for simplicity:
• X ∈ Rd : features of an individual (e.g. credit history)
• A ∈ {a, b, . . . }: sensitive feature (gender, race, etc.)
• R = r(X, A) ∈ {0, 1}: binary predictor (e.g. whether to grant a loan or not) which makes a decision
• often obtained by thresholding a score r(X, A) ∈ [0, 1], e.g. from a NN classifier
• Y ∈ {0, 1}: the target variable representing the ground truth
• Assume (X , A, Y ) ∼ D are generated from an underlying distribution
• X , A, Y and R are thus random variables
• Notation: P_a{R} = P{R | A = a}
Naive Approach: Fairness through Unawareness
• We should not include the sensitive attribute as a feature in the training data
• R = r(X) instead of R = r(X, A) (see the sketch below)
Pros/Cons:
• Intuitive, easy to use and implement
• Consistent with avoiding disparate treatment, which has legal support (e.g. the "General Equal Treatment Act" in Germany)
• However, there can be many highly correlated features (e.g. neighborhood) that are proxies of the sensitive attribute (e.g. race)
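To make the idea concrete, a minimal sketch (assuming a pandas DataFrame `df` and a sensitive column name; both are illustrative, not from the lecture):

```python
import pandas as pd

def unaware_features(df: pd.DataFrame, sensitive: str) -> pd.DataFrame:
    """Fairness through unawareness: simply drop the sensitive column before training."""
    return df.drop(columns=[sensitive])

# note: proxy features such as "neighborhood" stay in df and may still encode the sensitive attribute
```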
First Criterion: Independence
• Require: R independent of A, denoted R ⊥ A
• Also called Demographic Parity, Statistical Parity, Group Fairness,
Darlington criterion (4)
• In case of binary classification, for all groups a, b: P_a{R = 1} = P_b{R = 1}
• In our example, this means that the acceptance rates of the applicants from the two groups must be equal, i.e. same percentage of applications receive loans
• Approximate versions:
P_a{R = 1} / P_b{R = 1} ≥ 1 − ε    or    |P_a{R = 1} − P_b{R = 1}| ≤ ε
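As a concrete check of the exact and approximate versions, a minimal sketch on toy data (the array names `r_pred` and `group` are illustrative, not from the lecture):

```python
import numpy as np

def demographic_parity(r_pred, group, a, b, eps=0.05):
    """Compare acceptance rates P_a{R=1} and P_b{R=1} between two groups."""
    p_a = np.mean(r_pred[group == a])   # acceptance rate in group a
    p_b = np.mean(r_pred[group == b])   # acceptance rate in group b
    diff_ok = abs(p_a - p_b) <= eps                       # difference version
    ratio_ok = min(p_a, p_b) / max(p_a, p_b) >= 1 - eps   # ratio version (cf. four-fifths rule)
    return p_a, p_b, diff_ok, ratio_ok

# toy usage: group "Blue" is accepted more often than "Orange"
rng = np.random.default_rng(0)
group = rng.choice(["Blue", "Orange"], size=1000)
r_pred = rng.binomial(1, np.where(group == "Blue", 0.7, 0.5))
print(demographic_parity(r_pred, group, "Blue", "Orange"))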
How to achieve Independence?
• Post-processing
• Adjust a learned classifier so that its decisions are uncorrelated with the sensitive attribute
• Training time constraint
• Include the exact/approximate constraints in the optimization (a penalty-based sketch follows below)
• Pre-processing: e.g. via representation learning (next slide)
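One illustrative realization of the training-time option is to add a soft demographic-parity penalty to the usual loss; a minimal PyTorch sketch under that assumption (the model, variable names, and weight `lam` are made up, not the lecture's prescription):

```python
import torch
import torch.nn.functional as F

def dp_penalty(scores, group):
    """Soft demographic-parity gap |E[score | A=0] - E[score | A=1]| (assumes both groups appear in the batch)."""
    return (scores[group == 0].mean() - scores[group == 1].mean()).abs()

def fair_loss(model, x, y, group, lam=1.0):
    """Standard classification loss plus a weighted parity penalty; lam trades accuracy against parity."""
    logits = model(x).squeeze(-1)
    scores = torch.sigmoid(logits)          # P(R=1 | x), a differentiable relaxation of the hard decision R
    task = F.binary_cross_entropy_with_logits(logits, y.float())
    return task + lam * dp_penalty(scores, group)
```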
Representation learning approach
• Map (X, A) to a representation Z (e.g. dimensionality reduction)
• Train the predictor on the representation: R = r(Z)
• How to learn a fair representation Z ?
• e.g. optimize for max I(X; Z) and min I(A; Z), where I is some measure of information (e.g. mutual information)
• e.g. Fair PCA, Fair VAE (an adversarial sketch follows the figure below)
Figure: The Variational Fair Autoencoder (Louizos et al., 2016)
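One common practical proxy for minimizing I(A; Z) is adversarial: an auxiliary head tries to recover A from Z while the encoder is trained to fool it. A rough PyTorch sketch of that idea (architecture, dimensions, and names are made up; this is not the Variational Fair Autoencoder itself):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# toy dimensions; real feature/latent sizes depend on the data
enc = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 4))   # X -> Z
clf = nn.Linear(4, 1)    # Z -> prediction of Y
adv = nn.Linear(4, 1)    # Z -> guess of A (the adversary)

opt_main = torch.optim.Adam(list(enc.parameters()) + list(clf.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adv.parameters(), lr=1e-3)

def train_step(x, y, a, lam=1.0):
    """One alternating update: the adversary learns to recover A from Z,
    the encoder learns to predict Y while hiding A (a crude proxy for min I(A; Z))."""
    z = enc(x)
    # 1) adversary update (encoder frozen via detach)
    adv_loss = F.binary_cross_entropy_with_logits(adv(z.detach()).squeeze(-1), a.float())
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()
    # 2) encoder + classifier update: good prediction of Y, bad prediction of A
    task_loss = F.binary_cross_entropy_with_logits(clf(z).squeeze(-1), y.float())
    hide_loss = F.binary_cross_entropy_with_logits(adv(z).squeeze(-1), a.float())
    opt_main.zero_grad(); (task_loss - lam * hide_loss).backward(); opt_main.step()
```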
Pros/Cons of Independence
• Legal support: the "four-fifths rule" prescribes that a selection rate for any disadvantaged group that is less than four-fifths of that for the group with the highest rate must be justified
• What if 83% of Blue is likely to repay, but only 43% of Orange is?
• Then Independence is too strong
• Rules out perfect predictor R = Y when base rates are different
• Laziness: We can trivially satisfy the criterion if we give loan to qualified people from one group and random people from the other
• Can even establish a negative track record for the second group
Second Criterion: Separation
• Require: the prediction R and A to be independent conditional on the target Y, denoted R ⊥ A | Y
• Also called Equalized Odds, Conditional procedure accuracy, Avoiding disparate mistreatment
• In case of binary classification, for all groups a, b:
P_a(R = 1 | Y = 1) = P_b(R = 1 | Y = 1)    (equal true positive rates, TP)
P_a(R = 1 | Y = 0) = P_b(R = 1 | Y = 0)    (equal false positive rates, FP)
• Equality of Opportunity is a commonly used relaxation
• Only match the TP rate: P_a(R = 1 | Y = 1) = P_b(R = 1 | Y = 1)
• In our example, this means we should give loans to an equal proportion of the individuals in each group who would in reality repay
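These per-group rates are straightforward to estimate from held-out data; a minimal sketch (array names are illustrative):

```python
import numpy as np

def group_rates(r_pred, y_true, group, g):
    """True/false positive rates of the binary predictor within group g."""
    m = group == g
    tpr = np.mean(r_pred[m & (y_true == 1)])   # P_g(R=1 | Y=1)
    fpr = np.mean(r_pred[m & (y_true == 0)])   # P_g(R=1 | Y=0)
    return tpr, fpr

# Separation: TPR and FPR should match across groups; Equality of Opportunity: only the TPRs
# e.g. compare group_rates(r_pred, y_true, group, "Blue") with group_rates(r_pred, y_true, group, "Orange")
```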
Achieving Separation
• Area under the ROC (Receiver Operating Characteristic) curve
• Each point on the solid curve(s) is realized by thresholding the predicted score at some value, i.e. predict I(r(X, A) > t)
• Pick a classifier that minimizes the given cost (e.g. maximizes profit)
Figure: Intersection of area under the curves (https://fairmlbook.org/)
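As a rough sketch of the post-processing idea (a simplification; the exact procedure of Hardt et al., 2016 may also randomize between thresholds), one can choose a separate threshold per group so that both groups reach the same true positive rate:

```python
import numpy as np

def threshold_for_tpr(scores, y_true, target_tpr):
    """Smallest per-group threshold t such that accepting score >= t reaches the target TPR
    (assumes the group contains at least one positive example)."""
    pos = np.sort(scores[y_true == 1])[::-1]          # scores of the truly positive individuals, descending
    k = max(int(np.ceil(target_tpr * len(pos))), 1)   # number of true positives that must be accepted
    return pos[k - 1]

# Equality of Opportunity by post-processing: a separate threshold per group, same target TPR
# t_blue = threshold_for_tpr(scores[group == "Blue"], y[group == "Blue"], 0.8)
# t_orange = threshold_for_tpr(scores[group == "Orange"], y[group == "Orange"], 0.8)
```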
Pros/Cons of Separation
• Optimal predictor not ruled out: R = Y is allowed
• Penalizes laziness: it provides incentive to reduce errors uniformly in
all groups
• It may not help closing the gap between two groups
• Granting more loans to the group that is more likely to repay now
makes the groups more likely to have better living conditions and thus even more likely to repay in the future, thus widening the gap
Third Criterion: Sufficiency
• Require the target Y and A to be independent conditional on the prediction (or score) R, denoted Y ⊥ A | R
• Also called Cleary model, Conditional use accuracy, Calibration within groups
• In case of binary classification, for all groups a, b and all r ∈ [0, 1]: P_a(Y = 1 | R = r) = P_b(Y = 1 | R = r)
• In our example, the score used to determine if a candidate would repay should reflect the candidate’s real/actual capability of repaying
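Sufficiency can be checked empirically by binning the score and comparing the two groups within each bin; a minimal sketch (array names are illustrative):

```python
import numpy as np

def sufficiency_gap(score, y_true, group, a, b, n_bins=10):
    """Largest gap between P_a(Y=1 | R in bin) and P_b(Y=1 | R in bin) over score bins."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    gaps = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (score >= lo) & (score < hi)
        ma, mb = in_bin & (group == a), in_bin & (group == b)
        if ma.any() and mb.any():                          # only compare bins populated by both groups
            gaps.append(abs(y_true[ma].mean() - y_true[mb].mean()))
    return max(gaps) if gaps else 0.0
```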
Achieving Sufficiency
• In general, a classifier R is calibrated if for all r ∈ [0, 1]: P(Y = 1 | R = r) = r
• Of all instances assigned score value r, an r fraction should be positive
• Calibration within each group a implies sufficiency: P_a(Y = 1 | R = r) = r
• Apply standard calibration techniques to each group (if necessary)
• e.g. given an uncalibrated score, treat it as a single feature and fit a one-variable regression model against Y (as in Platt scaling)
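A minimal sketch of the per-group calibration idea, using a single-feature logistic regression from scikit-learn as the one-variable model (array names are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def calibrate_per_group(scores, y_true, group):
    """Fit one single-feature logistic regression per group, mapping the raw score to P(Y=1 | R)."""
    calibrators = {}
    for g in np.unique(group):
        m = group == g
        lr = LogisticRegression()
        lr.fit(scores[m].reshape(-1, 1), y_true[m])        # the raw score is the only feature
        calibrators[g] = lr
    return calibrators

def calibrated_score(calibrators, score, g):
    """Recalibrated score for one individual from group g."""
    return calibrators[g].predict_proba(np.array([[score]]))[0, 1]
```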
Pros/Cons of Sufficiency
• Satisfied by the Bayes optimal classifier r(x, a) = E[Y | X = x, A = a]
• For predicting Y we do not need to see A once we have R
• Equal chance of success (Y = 1) given acceptance (R = 1)
• As before, it may not help close the gap between the groups
Fairness Summary: A growing list of fairness criteria
General theme: Require some invariance w.r.t. the sensitive attribute
• Independence: R ⊥ A
• Separation: R ⊥ A | Y
• Equality of Opportunity: R ⊥ A | Y = 1
• Sufficiency: Y ⊥ A | R
• Conditional statistical parity
• Predictive equality
• Predictive parity
• … and many many more
Many of these definitions are (provably) incompatible, i.e. they are mutually exclusive except in degenerate cases
Visualizing the trade-offs: research.google.com/bigpicture
Comparing different criteria
• Profit for a TP and cost for a FP
• The cost of FP is typically much greater than the profit for TP
Figure: Different thresholds induced by different criteria (Hardt et al., 2016)
Adversarial Examples
Adversarial Examples are deliberate perturbations of the data designed to achieve a specific malicious goal (e.g. cause a misclassification)
Figure: The panda is classified as a gibbon by the NN, Goodfellow et al., 2014
Adversarial Examples
Figure: Adversarial glasses fool Facial Recognition systems into classifying the wearer as someone else, Sharif et al., 2016
Figure: ML systems classify the adversarially modified Stop sign as a Speed Limit sign, Eykholt et al., 2018
Adversarial Examples
• Many other recent studies show that most ML models are vulnerable to adversarial examples
• At a high level, this is because ML models do not really generalize
• If the distribution of the test data is even slightly different from the
distribution of the training data, they can fail miserably
• How can we create, detect and defend against Adversarial Examples?
• Especially important to quantify this risk if we are in a safety-critical
application, e.g. self-driving cars
• Certifiable robustness provides mathematical guarantees
• Nature as an adversary: Even if there is no adversary in our use-case, we should quantify robustness to worst-case noise
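To make "deliberate perturbation" concrete, a minimal PyTorch sketch of the fast gradient sign method from Goodfellow et al., 2014 (the model, labels, and eps are placeholders; assumes inputs scaled to [0, 1]):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.01):
    """One-step fast gradient sign attack: x + eps * sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()                              # gradient of the loss w.r.t. the input pixels
    x_adv = x_adv + eps * x_adv.grad.sign()      # small step that maximally increases the loss
    return x_adv.clamp(0.0, 1.0).detach()        # keep pixels in the valid [0, 1] range
```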
Summary
• Decisions based on data are not always accurate, reliable, or fair
• Differential Privacy allows us to compute arbitrary queries on (sensitive) data with provable guarantees on information leakage
• There are no absolute privacy guarantees, your neighbor’s habits are correlated with your habits
• Algorithmic Fairness criteria require (and enforce) some invariances w.r.t. sensitive attributes
• Algorithmic Fairness ≠ Actual Fairness; social/legal/political effort is also needed
• Without a model of long-term impact it is difficult to foresee the effect of a fairness criterion implemented as a constraint
• Accuracy, Fairness, Privacy, Adversarial Robustness, Explainability and other aspects are non-trivially related
• Algorithmic solutions are only a (small) part of the puzzle
Reading material
Main reading
• "The Algorithmic Foundations of Differential Privacy" by Dwork and Roth
[ch. 2, 3.1-3.5],
https://www.cis.upenn.edu/~aaroth/privacybook.html
• "Fairness and Machine Learning" by Barocas, Hardt, and Narayanan [ch. 1, 2], https://fairmlbook.org/
Part of the slides adapted from a CSC 2515 lecture and a Differential Privacy Tutorial.