CS 499/599 HW3
Homework 3: data poisoning attacks and defenses

Homework Overview

The learning objective of this homework is for you to perform data poisoning attacks on machine learning models (some of the attacks will require the neural networks trained in Homework 1). You will also test the effectiveness of simple defenses against the poisoning attacks you will implement. You can start this homework from the codebase you wrote in Homework 1.

Initial Setup

We will use two datasets: MNIST-1/7 and CIFAR-10 [link]. You can construct the MNIST-1/7 dataset from MNIST. Please search for some examples on Google about how to subsample classes from a dataset [Here is an example in PyTorch].
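
A minimal sketch of the subsampling step, assuming the dataset is any iterable of (image, label) pairs, which is how torchvision datasets behave. The function name subsample_classes and the relabeling of 1 to 0 and 7 to 1 are illustrative choices, not part of the assignment.

```python
def subsample_classes(dataset, classes=(1, 7)):
    """Keep only samples whose label is in `classes`, relabeled to 0..len(classes)-1."""
    remap = {label: idx for idx, label in enumerate(classes)}
    # int(y) accepts plain ints as well as 0-d tensors / NumPy scalars.
    return [(x, remap[int(y)]) for x, y in dataset if int(y) in remap]


# Toy stand-in for MNIST: strings instead of image tensors.
toy = [("img_a", 1), ("img_b", 3), ("img_c", 7), ("img_d", 1)]
mnist17 = subsample_classes(toy)
```

Relabeling to {0, 1} keeps the labels compatible with a binary logistic regression; skip the remapping if your model expects the original digit labels.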

Here, we consider two models: logistic regression [an example in PyTorch] for MNIST-1/7 and ResNet18 for CIFAR-10 [Link].

Recommended Code Structure

You will write three scripts: poison_craft.py, poison.py, and poison_remove.py. The rest are the same as in Homework 1.
– [New] poison_craft.py : a Python script to craft poisoning samples.
– [New] poison.py : a Python script for training a model on a contaminated dataset.
– [New] poison_remove.py: a Python script for removing suspicious samples from a contaminated dataset.

Task I: Poisoning Attack against Logistic Regression Models
Let’s start with poisoning attacks against logistic regression models on MNIST-1/7. Here, we will conduct an indiscriminate poisoning attack. Your job is to construct contaminated training sets that degrade the accuracy of any model trained on them.

We will use a simple poisoning scheme called random label-flipping. It constructs a contaminated training set by randomly flipping the labels of X% of the samples in the original training set. For example, you can select 10% of the MNIST-1/7 training samples (~1.7k) and flip their labels from 1 to 7 (or vice versa).

Your job is to construct four contaminated training sets containing 5%, 10%, 25%, and 50% poisons, respectively. You will train five logistic regression models: one on each of the four contaminated training sets and one on the clean MNIST-1/7 dataset. Please measure how much accuracy degradation each attack causes relative to the model trained on the clean data.

Here, you need to implement the following function in poison_craft.py.

def craft_random_lflip(train_set, ratio):
– train_set: an instance of the clean training dataset
– ratio : the fraction of samples whose labels will be flipped (a number between 0 and 1)
// You can add more arguments if required

This function constructs a training set in which a fraction ratio of the samples are poisons. The train_set is an instance of the clean training set, and ratio is a number between 0 and 1 (e.g., 0.1 for 10% poisons). Note that this is only one example of how to write a function for crafting poisoned training sets. Please feel free to use your own function if that is more convenient.
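
One possible shape for this function, as a sketch only. It assumes train_set is a sequence of (sample, label) pairs with binary labels 0/1; the seed argument and the returned list of flipped indices are additions for reproducibility and later bookkeeping, not requirements of the assignment.

```python
import random


def craft_random_lflip(train_set, ratio, seed=0):
    """Return a copy of `train_set` with the labels of a `ratio` fraction flipped.

    Assumes (sample, label) pairs with binary labels 0/1. `seed` is an extra
    argument (not in the assignment) so that the poisoned set is reproducible.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    rng = random.Random(seed)
    n_flip = int(round(ratio * len(train_set)))
    # Sample without replacement so exactly n_flip distinct labels are flipped.
    flip_idx = set(rng.sample(range(len(train_set)), n_flip))
    poisoned = [
        (x, 1 - y) if i in flip_idx else (x, y)
        for i, (x, y) in enumerate(train_set)
    ]
    return poisoned, sorted(flip_idx)


# Toy data: 20 samples with alternating labels, 25% of which get flipped.
clean = [(f"img_{i}", i % 2) for i in range(20)]
poisoned, flipped = craft_random_lflip(clean, ratio=0.25)
```

Keeping the flipped indices makes it easy to count later how many true poisons a defense removed.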

You will also write the training script in poison.py. This script will be mostly the same as `train.py`; the only difference is that you load the contaminated training set instead of the clean training data. Once the data is loaded, the rest is the same.

Task II: Poisoning Attacks on Deep Neural Networks
Now, let’s turn our attention to attacking neural networks. As I explained in the lecture, deep neural networks are less susceptible to indiscriminate poisoning attacks, i.e., it’s hard to degrade their accuracy unless we inject many poisons. We therefore focus on targeted poisoning attacks.

Your job here is to conduct the Poison Frogs! [Link] attack on ResNet18 trained on CIFAR-10. You can refer to the author’s code [TensorFlow] or the community implementations [PyTorch]. Be careful if you use the community code; there is a chance that it implements the attack incorrectly.

Instructions

We conduct this attack between two classes in CIFAR-10: frogs and dogs. In particular, we aim to make a frog sample misclassified as a dog. We will use the ResNet18 trained in Homework 1. Please follow the instructions below to cause this misclassification.

Choose 5 frog images (targets) from the CIFAR-10 test set.
Choose 100 dog images (base images) from the CIFAR-10 test set. (You will use them to craft poisons.)
Use the 100 base images to craft 100 poisons for each target. Please use your ResNet18 to extract features (see the details below).
Construct 6 contaminated training sets for each target by injecting {1, 5, 10, 25, 50, 100} poisons into the original training data.
Finetune only the last layer of your ResNet18 for 10 epochs on each contaminated training set. Check whether the finetuned model misclassifies each target (frog) as a dog. If it does, the attack is successful; otherwise, the attack fails.
In total, you will have 30 contaminated training sets (= 6 different sets x 5 targets).
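
The 30 experiments above can be organized as a simple loop. The sketch below is only a bookkeeping skeleton: attack_succeeds is a hypothetical callable that you would implement yourself (finetune the last layer on the clean data plus the given poisons, then check whether the target is classified as a dog), and using the first k of the 100 poisons for each budget is an assumption, since the assignment does not say which k poisons to inject.

```python
def count_successes(targets, poisons_per_target, attack_succeeds,
                    budgets=(1, 5, 10, 25, 50, 100)):
    """For each poison budget k, count the targets on which the attack succeeds.

    `attack_succeeds(target, poisons)` is a hypothetical callable: it should
    finetune on clean data + `poisons` and return True if `target` is then
    classified as a dog.
    """
    successes = {k: 0 for k in budgets}
    for target, poisons in zip(targets, poisons_per_target):
        for k in budgets:
            if attack_succeeds(target, poisons[:k]):  # inject the first k poisons
                successes[k] += 1
    return successes


# Dummy stand-in: pretend the attack works once at least 25 poisons are injected.
targets = [f"frog_{i}" for i in range(5)]
poisons_per_target = [[f"poison_{i}_{j}" for j in range(100)] for i in range(5)]
table = count_successes(targets, poisons_per_target,
                        lambda target, poisons: len(poisons) >= 25)
```

The returned dictionary maps each poison budget to the number of successful attacks over the 5 targets, which is the content of the table requested in the write-up.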

Implementation

Here, you need to implement the following function in poison_craft.py.

def craft_clabel_poisons(model, target, bases, niter, lr, beta, …):
– model : a pre-trained ResNet18
– target: a target sample (a frog)
– bases : a set of base samples (dogs)
– niter : number of optimization iterations
– lr : learning rate for your optimization
– beta : hyper-parameter (refer to the paper)
// You can add more arguments if required

This function crafts clean-label poisons. It takes a model (ResNet18) to extract features for a single target and 100 base samples. It also takes optimization hyper-parameters such as niter, lr, beta, etc. Once the function sufficiently optimizes your poisons, it will return 100 poisons crafted from the bases. Please refer to the author’s code, the community implementations, and the original study for reference.
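
For reference, the optimization described in the Poison Frogs! paper can be summarized as follows. Here f is the feature extractor (the network up to its penultimate layer), t is the target, b is a base image, and λ plays the role of the lr argument; the update is repeated niter times starting from the base. This is a summary for orientation, so please check the paper for the exact formulation and the recommended choice of beta.

```latex
% Objective for one poison p crafted from base b against target t
p = \operatorname*{arg\,min}_{x} \; \lVert f(x) - f(t) \rVert_2^2 + \beta \, \lVert x - b \rVert_2^2

% One iteration of forward-backward splitting, starting from x_0 = b
\hat{x}_i = x_{i-1} - \lambda \, \nabla_x \lVert f(x_{i-1}) - f(t) \rVert_2^2   % forward (gradient) step
x_i = \frac{\hat{x}_i + \lambda \beta \, b}{1 + \beta \lambda}                   % backward (proximal) step
```

The first term pulls the poison toward the target in feature space, while the second keeps it visually close to the base image, which is what lets the poison keep its clean "dog" label.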

You will also modify the training script in poison.py. This script will be mostly the same as train.py; the only difference is that you load the contaminated training set instead of the clean training data. Once the data is loaded, the rest is almost the same.

[Extra +3 pts]: Defeat Data Poisoning Attacks
In the lecture, we learned two simple defense mechanisms against data poisoning attacks: (1) RONI [Paper] and (2) Data sanitization [Link]. Here, we implement those defenses and use them against the two data poisoning attacks (random label-flipping and clean-label poisoning).

Subtask I: RONI against Random Label-flipping

Let’s start with RONI. You will use an MNIST-1/7 training set containing 20% poisons (i.e., 20% of the samples have flipped labels).

First, subsample any 20% of the samples from the MNIST-1/7 training set. You will use this subset (D_v) to remove poisons from the training data.
Next, let’s split the contaminated training set. Divide the contaminated training data (D_tr) into multiple sets (D_tr_i, where i in [1, 170]), each containing 100 training samples (this process creates approximately 170 sets).
First, train a logistic regression model on D_tr_1, compute its accuracy on D_v, and save it.
Then, iteratively (for i = 2, …, 170), train your model on D_tr_1 + … + D_tr_i. At each step, compare the i-th model’s accuracy on D_v with the (i-1)-th model’s. If the accuracy drops by more than X% (a hyper-parameter of your choice), remove D_tr_i from the training set and continue.
Use at least two values of X and check how many poisons you removed in each case. You also need to measure your model’s accuracy after removing the suspicious samples (i.e., you will examine the effectiveness of the RONI defense).
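
The steps above can be sketched as a short loop. This is a skeleton under stated assumptions: train_and_eval is a hypothetical callable that trains a fresh model on the given samples and returns its accuracy (between 0 and 1) on a held-out set such as D_v, and after a chunk is rejected the reference accuracy stays at that of the last accepted model.

```python
def roni_filter(chunks, train_and_eval, threshold):
    """Greedy RONI-style filtering over an ordered list of training chunks.

    `train_and_eval(samples)` is a hypothetical callable that trains a fresh
    model on `samples` and returns its held-out accuracy in [0, 1].
    A chunk is rejected if adding it lowers accuracy by more than `threshold`.
    """
    kept = list(chunks[0])               # D_tr_1 is the starting set
    prev_acc = train_and_eval(kept)
    removed, history = [], [prev_acc]
    for i, chunk in enumerate(chunks[1:], start=2):
        acc = train_and_eval(kept + list(chunk))
        if prev_acc - acc > threshold:
            removed.append(i)            # drop D_tr_i, keep the previous model
        else:
            kept.extend(chunk)
            prev_acc = acc
        history.append(prev_acc)
    return kept, removed, history


# Toy check: the second field marks a flipped label, and the dummy
# "accuracy" is simply the fraction of unflipped samples in the set.
clean, bad = ("x", 0), ("x", 1)
chunks = [[clean] * 4, [clean] * 4, [bad] * 4, [clean] * 4]
toy_eval = lambda s: 1.0 - sum(flag for _, flag in s) / len(s)
kept, removed, history = roni_filter(chunks, toy_eval, threshold=0.05)
```

The history list holds the accuracy of the accepted model after each iteration, which is what the "# iterations vs. accuracy" plot in the write-up needs.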

Subtask II: Data Sanitization against Clean-label Poisoning

Let’s move on and defeat clean-label poisoning (Poison Frogs!). Please choose any successful attack (i.e., choose a target and 100 poisons).

We will use ResNet18 fine-tuned on the contaminated training set. Let’s first compute features for all the training samples with the model.
Using the features (for the 50k original training samples + the 100 poisons you added), we will detect suspicious samples and remove them from the training set. Please identify the outlier samples by running this UMAP example [link] on the collected features.
Let’s sanitize the training set by removing the outliers. Please try two or three removal budgets (100, 200, or 300 samples) and compose a sanitized training set for each.
Finetune your original ResNet18 on each sanitized training set and check whether the poisoning attack is still successful.
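
To make the "score every sample, drop the top-k outliers" step concrete, here is a small sketch. It is not the UMAP procedure linked above: it swaps in a simpler outlier score, the Euclidean distance of each feature vector to its own class centroid, purely for illustration. It uses plain Python lists, so for the real 50k feature vectors you would want to vectorize it with NumPy or PyTorch. The function name remove_outliers is hypothetical.

```python
import math


def remove_outliers(features, labels, n_remove):
    """Return sorted indices of samples kept after dropping the `n_remove`
    samples farthest from their own class centroid in feature space."""
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    # Per-class centroid: mean of each feature dimension.
    centroids = {
        y: [sum(col) / len(idxs) for col in zip(*(features[i] for i in idxs))]
        for y, idxs in by_class.items()
    }
    scores = [math.dist(f, centroids[y]) for f, y in zip(features, labels)]
    ranked = sorted(range(len(features)), key=scores.__getitem__, reverse=True)
    dropped = set(ranked[:n_remove])
    return [i for i in range(len(features)) if i not in dropped]


# Toy features: sample 3 carries label 0 but sits far from the other label-0 points.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [1.0, 1.0], [1.1, 1.0]]
labels = [0, 0, 0, 0, 1, 1]
kept_idx = remove_outliers(features, labels, n_remove=1)
```

Whatever outlier score you use, the removal budgets (100, 200, or 300) plug in as n_remove, and the kept indices define the sanitized training set.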

Submission Instructions
Use Canvas to submit your homework. You need to make a single compressed file (.tar.gz or .zip) that contains your code and a write-up as a PDF file. Please do not include datasets and models in your submission. Put your write-up under the reports folder. Your PDF write-up should contain the following things:

Task I
Your plot: { the ratio of poisons in the training set } vs. { classification accuracy } on the test set.
Your analysis: write down 2-3 sentences explaining why you see those results.
Task II
Your table: 2 rows (the upper one is for the number of poisons and the lower one is for the number of successful attacks over 5 targets).
Your analysis: write down 2-3 sentences explaining why you see those results.
[Extra +3 pts]
Subtask I
Your plot: { # iterations } vs. { the accuracy of your model } on the test set.
Your analysis: write down 2-3 sentences explaining why you see those results.
Subtask II
Your analysis: write down 2-3 sentences explaining whether you successfully mitigated clean-label poisoning. (If possible) analyze whether the defense becomes more successful when you remove more suspicious samples.
