CS542 – Class Challenge – fine-grained classification of plants:¶
Our class challenge consists of two tasks built around one image recognition problem: the dataset contains about 1K categories of plants but only about 250,000 images. The two tasks are:
1. Image classification. Imagine we have cataloged all the plants we care to identify; now we just need to create a classifier for them! Use your skills from the supervised learning sections of this course to address this problem.
2. Semi-Supervised/Few-Shot Learning. Unfortunately, we missed some important plants we want to classify! We do have some images we think contain these plants, but we have only a few labels. Our new goal is to develop an AI model that can learn from just these labeled examples.
Each student must submit a model on both tasks. Students in the top 3 on each task will get 5% extra credit on this assignment.
This notebook is associated with the first task (image classification).
Dataset¶
The dataset is available on SCC at “/projectnb2/cs542-bap/classChallenge/data”. You can find the Python version of this notebook there as well, or you can run “jupyter nbconvert --to script baselineModel_task1.ipynb”, which outputs “baselineModel_task1.py”. You should then be able to run it on SCC by typing “python baselineModel_task1.py”.
Please don’t try to change or delete the dataset.
Evaluation:¶
You will compete with each other on performance on the dedicated test set. The performance measure is the top-5 error: if the true class is among your 5 most likely predictions for an image, the error for that image is 0; otherwise it is 1. Your goal is therefore an error of 0. This notebook outputs top-5 accuracy, which is 1 minus the top-5 error.
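To make the metric concrete, here is a minimal sketch of top-5 error in plain Python. The function name `top_k_error` and the toy scores are illustrative assumptions; this is not the official evaluation code.

```python
def top_k_error(scores, true_labels, k=5):
    """Fraction of examples whose true class is NOT among the k highest scores.

    scores: list of per-class score lists, one list per example.
    true_labels: list of integer class indices, one per example.
    """
    errors = 0
    for row, label in zip(scores, true_labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            errors += 1
    return errors / len(true_labels)


# Toy example with 6 classes (values are made up for illustration).
scores = [
    [0.1, 0.3, 0.2, 0.05, 0.25, 0.1],  # class 3 has the lowest score
    [0.5, 0.1, 0.1, 0.1, 0.1, 0.1],    # class 0 has the highest score
]
error = top_k_error(scores, [3, 0])    # first example misses, second hits
print("top-5 error:", error)
print("top-5 accuracy:", 1 - error)
```

The order of the five predictions does not matter for this metric; only membership of the true class in the top 5 is checked.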
Baseline:¶
The following code is a baseline that you can use and improve on to build your model for this task.
Suggestion¶
One simple suggestion is to take a model pretrained on ImageNet and fine-tune it on this data, similar to this link. You should also likely train for more than 2 epochs.
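As a rough illustration of that suggestion, the sketch below puts a new classification head on a frozen ImageNet backbone. The choice of MobileNetV2, the helper name `build_finetune_model`, and the use of `tf.keras.layers.Rescaling` (which assumes a reasonably recent TensorFlow) are assumptions made for this example; the notebook does not prescribe a particular architecture.

```python
import tensorflow as tf


def build_finetune_model(num_classes, weights="imagenet", image_size=224):
    """Frozen pretrained backbone plus a new, trainable classification head."""
    backbone = tf.keras.applications.MobileNetV2(
        include_top=False,   # drop the original 1000-way ImageNet classifier
        weights=weights,     # "imagenet" downloads pretrained weights
        input_shape=(image_size, image_size, 3),
        pooling="avg",       # global average pooling -> one feature vector
    )
    backbone.trainable = False  # train only the new head at first

    inputs = tf.keras.Input(shape=(image_size, image_size, 3))
    # MobileNetV2 expects pixels in [-1, 1]; inputs are assumed to be in [0, 255].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = backbone(x, training=False)  # keep batch-norm statistics fixed
    outputs = tf.keras.layers.Dense(num_classes)(x)  # raw logits
    return tf.keras.Model(inputs, outputs)
```

Because the head outputs logits, such a model can be compiled with the same `SparseCategoricalCrossentropy(from_logits=True)` loss used by the baseline. A common second stage is to set `backbone.trainable = True` and continue training with a smaller learning rate.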
Import TensorFlow and other libraries¶
In [ ]:
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
Explore the dataset¶
In [ ]:
import pathlib
data_dir = '/projectnb2/cs542-bap/class_challenge/'
image_dir = os.path.join(data_dir, 'images')
image_dir = pathlib.Path(image_dir)
image_count = len(list(image_dir.glob('*.jpg')))
print("Total number of images = ", image_count)
Total number of images = 780
Here are some images¶
In [ ]:
PIL.Image.open(os.path.join(image_dir, '100.jpg'))
Out[ ]:

Create a dataset¶
In [ ]:
train_ds = tf.data.TextLineDataset(os.path.join(data_dir, 'train.txt'))
val_ds = tf.data.TextLineDataset(os.path.join(data_dir, 'val.txt'))
test_ds = tf.data.TextLineDataset(os.path.join(data_dir, 'test.txt'))
with open(os.path.join(data_dir, 'classes.txt'), 'r') as f:
    class_names = [c.strip() for c in f.readlines()]
num_classes = len(class_names)
Write a short function that converts a file path to an (img, label) pair:¶
In [ ]:
def decode_img(img, crop_size=224):
    img = tf.io.read_file(img)
    # convert the compressed string to a 3D uint8 tensor
    img = tf.image.decode_jpeg(img, channels=3)
    # resize the image to the desired size
    return tf.image.resize(img, [crop_size, crop_size])

def get_label(label):
    # find the position of the matching class name
    one_hot = tf.where(tf.equal(label, class_names))
    # integer-encode the label
    return tf.reduce_min(one_hot)

def process_path(file_path):
    # each line should have two whitespace-separated parts
    file_path = tf.strings.split(file_path)
    # the second part is the class name, which get_label maps to an integer index
    label = get_label(file_path[1])
    # load the raw data from the file
    img = decode_img(tf.strings.join([data_dir, 'images/', file_path[0], '.jpg']))
    return img, label

def process_path_test(file_path):
    # load the raw data from the file
    img = decode_img(tf.strings.join([data_dir, 'images/', file_path, '.jpg']))
    return img, file_path
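The path and label handling in `process_path` can be mirrored in plain Python, which makes it easier to check by hand. The line format (an image id followed by a class name) is inferred from the code above, and the helper name `parse_line`, the class names, and the directory are hypothetical values used only for illustration.

```python
def parse_line(line, class_names, data_dir):
    """Plain-Python mirror of process_path, without the image decoding."""
    image_id, class_name = line.split()
    label = class_names.index(class_name)  # integer-encode the label
    # The pieces are concatenated without a separator, as in tf.strings.join,
    # so data_dir must end with a slash.
    image_path = data_dir + "images/" + image_id + ".jpg"
    return image_path, label


# Hypothetical values for illustration only.
example_classes = ["daisy", "oak", "rose"]
print(parse_line("100 oak", example_classes, "/data/"))
# ('/data/images/100.jpg', 1)
```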
Finish setting up data¶
In [ ]:
batch_size = 32
In [ ]:
# Set `num_parallel_calls` so multiple images are loaded/processed in parallel.
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_ds = train_ds.map(process_path, num_parallel_calls=AUTOTUNE)
val_ds = val_ds.map(process_path, num_parallel_calls=AUTOTUNE)
test_ds = test_ds.map(process_path_test, num_parallel_calls=AUTOTUNE)
In [ ]:
for image, label in train_ds.take(1):
    print("Image shape: ", image.numpy().shape)
    print("Label: ", label.numpy())
Image shape: (180, 180, 3)
Label: 1
Data loader hyper-parameters for performance!¶
In [ ]:
def configure_for_performance(ds):
    ds = ds.cache()
    ds = ds.shuffle(buffer_size=1000)
    ds = ds.batch(batch_size)
    ds = ds.prefetch(buffer_size=AUTOTUNE)
    return ds
train_ds = configure_for_performance(train_ds)
val_ds = configure_for_performance(val_ds)
test_ds = configure_for_performance(test_ds)
Here are some resized images ready to use!¶
In [ ]:
image_batch, label_batch = next(iter(train_ds))
plt.figure(figsize=(10, 10))
for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(image_batch[i].numpy().astype("uint8"))
    label = label_batch[i]
    plt.title(class_names[label])
    plt.axis("off")

A simple CNN model!¶
In [ ]:
model = tf.keras.Sequential([
    layers.experimental.preprocessing.Rescaling(1./255),
    layers.Conv2D(64, 3),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3),
    layers.MaxPooling2D(),
    layers.Conv2D(256, 3),
    layers.MaxPooling2D(),
    layers.Conv2D(256, 3),
    layers.MaxPooling2D(),
    layers.Conv2D(512, 3),
    layers.Flatten(),
    layers.Dense(1024),
    layers.Dense(num_classes)
])
The usual loss function¶
In [ ]:
opt = keras.optimizers.Adam(learning_rate=0.0001)
model.compile(
    optimizer=opt,
    loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy', tf.keras.metrics.SparseTopKCategoricalAccuracy(k=5)])
Training¶
In [ ]:
model.fit(train_ds,validation_data=val_ds,epochs=2,shuffle=True)
Epoch 1/2
18/18 [==============================] – 76s 4s/step – loss: 2.4045 – accuracy: 0.1484 – sparse_top_k_categorical_accuracy: 0.5440 – val_loss: 2.3086 – val_accuracy: 0.1709 – val_sparse_top_k_categorical_accuracy: 0.6496
Epoch 2/2
18/18 [==============================] – 75s 4s/step – loss: 2.1331 – accuracy: 0.1502 – sparse_top_k_categorical_accuracy: 0.6300 – val_loss: 2.0969 – val_accuracy: 0.1538 – val_sparse_top_k_categorical_accuracy: 0.6496
Out[ ]:
Output submission csv for Kaggle¶
In [ ]:
with open('submission_task1_supervised.csv', 'w') as f:
    f.write('id,predicted\n')
    for image_batch, image_names in test_ds:
        # run the model once per batch
        batch_predictions = model.predict(image_batch)
        for image_name, prediction in zip(image_names.numpy(), batch_predictions):
            # indices of the 5 highest-scoring classes (in no particular order)
            inds = np.argpartition(prediction, -5)[-5:]
            line = str(int(image_name)) + ',' + ' '.join([class_names[i] for i in inds])
            f.write(line + '\n')
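The submission loop relies on `np.argpartition` to pick the five highest-scoring classes. The small sketch below, using made-up scores, shows what that call guarantees and how to order the five indices by score if that is ever needed; for top-5 error the order is irrelevant.

```python
import numpy as np

logits = np.array([0.1, 2.0, -1.0, 3.5, 0.7, 1.2, 0.0])  # made-up scores

# argpartition guarantees only that the last 5 indices point at the 5 largest
# entries; their relative order is unspecified.
top5 = np.argpartition(logits, -5)[-5:]

# Sort those 5 indices by score, best first, when order matters.
top5_sorted = top5[np.argsort(logits[top5])[::-1]]
print(top5_sorted)  # [3 1 5 4 0]
```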
Note
An absolute path is recommended here. For example, replace “submission_task1_supervised.csv” with “/projectnb2/cs542-bap/[your directory name]/submission_task1_supervised.csv”.
You can also request better resources by specifying the GPU type, for example “qsub -l gpus=1 -l gpu_type=P100 [your file name].qsub”. This helps avoid GPU problems such as running out of memory.