09a_Object_recognition_with_CIFAR-10_and_CIFAR-100
CIFAR-10 and CIFAR-100 datasets¶
CIFAR-10 and CIFAR-100 are a pair of image classification datasets collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. They are labelled subsets of the much larger 80 million tiny images dataset. They are a common benchmark task for image classification – a list of current accuracy benchmarks for both datasets is maintained by Rodrigo Benenson here.
As the name suggests, CIFAR-10 has images in 10 classes:
airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck
with 6000 images per class for an overall dataset size of 60000. Each image has three (RGB) colour channels and pixel dimension 32×32, corresponding to a total dimension per input image of 3×32×32=3072. For each colour channel the input values have been normalised to the range [0, 1].
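To view a flattened 3072-dimensional input as an image, it needs to be reshaped back into three 32×32 colour planes. The following is a minimal sketch using random stand-in data rather than the real dataset. It assumes each flattened input is stored channel-first (channel, row, column). That ordering is an assumption, not something stated above, so check it against the data provider before relying on it.

```python
import numpy as np

# Stand-in for a batch of flattened inputs with values in [0, 1];
# real inputs would come from one of the data providers.
rng = np.random.RandomState(0)
flat_batch = rng.uniform(0., 1., size=(5, 3 * 32 * 32))

# ASSUMPTION: each row is laid out channel-first, i.e. (channel, row, column).
images_chw = flat_batch.reshape((-1, 3, 32, 32))

# Move the channel axis last to get (row, column, channel), the layout
# expected by matplotlib's imshow for RGB images.
images_hwc = images_chw.transpose((0, 2, 3, 1))

print(images_hwc.shape)  # (5, 32, 32, 3)
```

If the resulting images look scrambled when plotted, the channel ordering assumption is the first thing to revisit.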
CIFAR-100 has images of identical dimensions to CIFAR-10 but rather than 10 classes they are instead split across 100 fine-grained classes (and 20 coarser ‘superclasses’ comprising multiple finer classes):
Superclass: Classes
aquatic mammals: beaver, dolphin, otter, seal, whale
fish: aquarium fish, flatfish, ray, shark, trout
flowers: orchids, poppies, roses, sunflowers, tulips
food containers: bottles, bowls, cans, cups, plates
fruit and vegetables: apples, mushrooms, oranges, pears, sweet peppers
household electrical devices: clock, computer keyboard, lamp, telephone, television
household furniture: bed, chair, couch, table, wardrobe
insects: bee, beetle, butterfly, caterpillar, cockroach
large carnivores: bear, leopard, lion, tiger, wolf
large man-made outdoor things: bridge, castle, house, road, skyscraper
large natural outdoor scenes: cloud, forest, mountain, plain, sea
large omnivores and herbivores: camel, cattle, chimpanzee, elephant, kangaroo
medium-sized mammals: fox, porcupine, possum, raccoon, skunk
non-insect invertebrates: crab, lobster, snail, spider, worm
people: baby, boy, girl, man, woman
reptiles: crocodile, dinosaur, lizard, snake, turtle
small mammals: hamster, mouse, rabbit, shrew, squirrel
trees: maple, oak, palm, pine, willow
vehicles 1: bicycle, bus, motorcycle, pickup truck, train
vehicles 2: lawn-mower, rocket, streetcar, tank, tractor
Each class has 600 examples in it, giving an overall dataset size of 60000 i.e. the same as CIFAR-10.
Both CIFAR-10 and CIFAR-100 have standard splits into 50000 training examples and 10000 test examples. For CIFAR-100, as there is an optional Kaggle competition (see below) scored on predictions on the test set, we have used a non-standard assignment of examples to the training and test sets, and have provided only the inputs (and not the target labels) for the 10000 examples chosen for the test set.
For CIFAR-10 the 10000 test set examples have labels provided: to avoid any accidental over-fitting to the test set you should only use these for the final evaluation of your model(s). If you repeatedly evaluate models on the test set during model development it is easy to end up indirectly fitting to the test labels – for those who have not already read it see this excellent cautionary note from the MLPR notes by Iain Murray.
For both CIFAR-10 and CIFAR-100, the remaining 50000 non-test examples have been split into a 40000 example training dataset and a 10000 example validation dataset, each with target labels provided. If you wish to use a more complex scheme such as k-fold cross-validation, you may want to combine these two portions of the dataset and define your own functions for separating out a validation set.
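As one way of defining your own validation splits, the sketch below combines two arrays standing in for the training and validation inputs and then generates k-fold train/validation index pairs. The helper name `k_fold_indices` and the small zero-filled arrays are illustrative assumptions and are not part of the mlp package.

```python
import numpy as np

def k_fold_indices(num_examples, num_folds, seed=0):
    """Yield (train_idx, valid_idx) index arrays for each of num_folds folds."""
    rng = np.random.RandomState(seed)
    # Shuffle once so that every fold is a random subset of the data.
    perm = rng.permutation(num_examples)
    folds = np.array_split(perm, num_folds)
    for k in range(num_folds):
        valid_idx = folds[k]
        train_idx = np.concatenate(
            [folds[j] for j in range(num_folds) if j != k])
        yield train_idx, valid_idx

# Small stand-ins for the provided training and validation inputs.
train_inputs = np.zeros((40, 8))
valid_inputs = np.ones((10, 8))
all_inputs = np.concatenate([train_inputs, valid_inputs])

splits = list(k_fold_indices(all_inputs.shape[0], num_folds=5))
for train_idx, valid_idx in splits:
    print(len(train_idx), len(valid_idx))
```

The same index arrays should be used to select the corresponding rows of the target array so that inputs and targets stay aligned.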
Data provider classes for both CIFAR-10 and CIFAR-100 are available in the mlp.data_providers module. Both have similar behaviour to the MNISTDataProvider used extensively last semester. A which_set argument can be used to specify whether to return a data provider for the training dataset (which_set='train') or validation dataset (which_set='valid').
The CIFAR-100 data provider also takes an optional use_coarse_targets argument in its constructor. By default this is set to False and the targets returned by the data provider correspond to 1-of-K encoded binary vectors for the 100 fine-grained object classes. If use_coarse_targets=True, the data provider will instead return 1-of-K encoded binary vector targets for the 20 coarse-grained superclasses associated with each input.
Both data provider classes provide a label_map attribute which is a list of strings which are the class labels corresponding to the integer targets (i.e. prior to conversion to a 1-of-K encoded binary vector).
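The relationship between integer targets, 1-of-K encoded vectors and a label map can be illustrated with plain NumPy. This is a sketch only: the helper `to_one_of_k` is hypothetical, and the `label_map` list below is written out by hand on the assumption that it follows the CIFAR-10 class order listed earlier, rather than being read from a data provider.

```python
import numpy as np

def to_one_of_k(int_targets, num_classes):
    """Convert integer class targets to 1-of-K encoded binary vectors."""
    one_of_k = np.zeros((int_targets.shape[0], num_classes))
    one_of_k[np.arange(int_targets.shape[0]), int_targets] = 1.
    return one_of_k

# ASSUMPTION: class order matches the CIFAR-10 list given above.
label_map = ['airplane', 'automobile', 'bird', 'cat', 'deer',
             'dog', 'frog', 'horse', 'ship', 'truck']

int_targets = np.array([3, 0, 9])
targets = to_one_of_k(int_targets, len(label_map))

# An argmax along the last axis recovers the integer targets,
# which can then be looked up in the label map.
recovered = targets.argmax(-1)
names = [label_map[i] for i in recovered]
print(names)  # ['cat', 'airplane', 'truck']
```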
Accessing the CIFAR-10 and CIFAR-100 data¶
Before using the data provider objects you will need to make sure the data files are accessible to the mlp package by existing under the directory specified by the MLP_DATA_DIR path.
The data is available as compressed NumPy .npz files
cifar-10-train.npz 235MB
cifar-10-valid.npz 59MB
cifar-10-test-inputs.npz 59MB
cifar-10-test-targets.npz 10KB
cifar-100-train.npz 235MB
cifar-100-valid.npz 59MB
cifar-100-test-inputs.npz 59MB
in the AFS directory /afs/inf.ed.ac.uk/group/teaching/mlp/data.
If you are working on DICE, one option is to redefine your MLP_DATA_DIR to point directly to the shared AFS data directory by editing the env_vars.sh start-up file for your environment. This avoids using up your DICE quota by storing the data files in your homespace, but may involve slower initial loading of the data when initialising the data providers if many people are trying to access the same files at once. The environment variable can be redefined by running
gedit ~/miniconda2/envs/mlp/etc/conda/activate.d/env_vars.sh
in a terminal window (assuming you installed miniconda2 to your home directory), and changing the line
export MLP_DATA_DIR=$HOME/mlpractical/data
to
export MLP_DATA_DIR="/afs/inf.ed.ac.uk/group/teaching/mlp/data"
and then saving and closing the editor. You will need to reload the mlp environment using source activate mlp and restart the Jupyter notebook server in the reloaded environment for the new environment variable definition to be available.
For those working on DICE who have sufficient quota remaining, or those using their own machine, an alternative option is to copy the data files into your local mlpractical/data directory (or wherever your MLP_DATA_DIR environment variable currently points to if different).
Assuming your local mlpractical repository is in your home directory you should be able to copy the required files on DICE by running
cp /afs/inf.ed.ac.uk/group/teaching/mlp/data/cifar*.npz ~/mlpractical/data
On a non-DICE machine, you will need to either set up local access to AFS, use a remote file transfer client like scp or you can alternatively download the files using the iFile web interface here (requires DICE credentials).
As some of the files are quite large you may wish to copy only those you are using currently (e.g. only the files for one of the two tasks) to your local filespace to avoid filling up your quota. The cifar-100-test-inputs.npz file will only be needed by those intending to enter the associated optional Kaggle competition.
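Before constructing the data providers it can be useful to confirm that MLP_DATA_DIR is set and see which of the files listed above are present. The sketch below does this with the standard library only. The helper `missing_data_files` is a hypothetical name introduced here and is not part of the mlp package.

```python
import os

# File names as listed above.
CIFAR_FILES = [
    'cifar-10-train.npz',
    'cifar-10-valid.npz',
    'cifar-10-test-inputs.npz',
    'cifar-10-test-targets.npz',
    'cifar-100-train.npz',
    'cifar-100-valid.npz',
    'cifar-100-test-inputs.npz',
]

def missing_data_files(data_dir, filenames=CIFAR_FILES):
    """Return the subset of filenames that do not exist under data_dir."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(data_dir, name))]

data_dir = os.environ.get('MLP_DATA_DIR')
if data_dir is None:
    print('MLP_DATA_DIR is not set in this environment.')
else:
    print('Missing files:', missing_data_files(data_dir))
```

If you have deliberately copied only the files for one task, the files for the other task will be reported as missing, which is expected.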
Example two-layer classifier models¶
Below example code is given for creating instances of the CIFAR-10 and CIFAR-100 data provider objects and using them to train simple two-layer feedforward network models with rectified linear activations in TensorFlow. You may wish to use this code as a starting point for your own experiments.
In [ ]:
import os
import tensorflow as tf
import numpy as np
from mlp.data_providers import CIFAR10DataProvider, CIFAR100DataProvider
import matplotlib.pyplot as plt
%matplotlib inline
CIFAR-10¶
In [ ]:
train_data = CIFAR10DataProvider('train', batch_size=50)
valid_data = CIFAR10DataProvider('valid', batch_size=50)
In [ ]:
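def fully_connected_layer(inputs, input_dim, output_dim, nonlinearity=tf.nn.relu):
    # Initialise weights with a scale that depends on the layer fan-in and fan-out.
    # The name is passed by keyword: the second positional argument of
    # tf.Variable is `trainable`, not `name`.
    weights = tf.Variable(
        tf.truncated_normal(
            [input_dim, output_dim], stddev=2. / (input_dim + output_dim)**0.5),
        name='weights')
    biases = tf.Variable(tf.zeros([output_dim]), name='biases')
    outputs = nonlinearity(tf.matmul(inputs, weights) + biases)
    return outputs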
In [ ]:
inputs = tf.placeholder(tf.float32, [None, train_data.inputs.shape[1]], 'inputs')
targets = tf.placeholder(tf.float32, [None, train_data.num_classes], 'targets')
num_hidden = 200
with tf.name_scope('fc-layer-1'):
    hidden_1 = fully_connected_layer(inputs, train_data.inputs.shape[1], num_hidden)
with tf.name_scope('output-layer'):
    outputs = fully_connected_layer(hidden_1, num_hidden, train_data.num_classes, tf.identity)
with tf.name_scope('error'):
    # Keyword arguments avoid ambiguity in the logits/labels argument order.
    error = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=outputs, labels=targets))
with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(tf.argmax(outputs, 1), tf.argmax(targets, 1)),
        tf.float32))
with tf.name_scope('train'):
    train_step = tf.train.AdamOptimizer().minimize(error)
init = tf.global_variables_initializer()
In [ ]:
with tf.Session() as sess:
    sess.run(init)
    for e in range(10):
        running_error = 0.
        running_accuracy = 0.
        for input_batch, target_batch in train_data:
            _, batch_error, batch_acc = sess.run(
                [train_step, error, accuracy],
                feed_dict={inputs: input_batch, targets: target_batch})
            running_error += batch_error
            running_accuracy += batch_acc
        running_error /= train_data.num_batches
        running_accuracy /= train_data.num_batches
        print('End of epoch {0:02d}: err(train)={1:.2f} acc(train)={2:.2f}'
              .format(e + 1, running_error, running_accuracy))
        if (e + 1) % 5 == 0:
            valid_error = 0.
            valid_accuracy = 0.
            for input_batch, target_batch in valid_data:
                batch_error, batch_acc = sess.run(
                    [error, accuracy],
                    feed_dict={inputs: input_batch, targets: target_batch})
                valid_error += batch_error
                valid_accuracy += batch_acc
            valid_error /= valid_data.num_batches
            valid_accuracy /= valid_data.num_batches
            print('                 err(valid)={0:.2f} acc(valid)={1:.2f}'
                  .format(valid_error, valid_accuracy))
CIFAR-100¶
In [ ]:
train_data = CIFAR100DataProvider('train', batch_size=50)
valid_data = CIFAR100DataProvider('valid', batch_size=50)
In [ ]:
tf.reset_default_graph()
inputs = tf.placeholder(tf.float32, [None, train_data.inputs.shape[1]], 'inputs')
targets = tf.placeholder(tf.float32, [None, train_data.num_classes], 'targets')
num_hidden = 200
with tf.name_scope('fc-layer-1'):
    hidden_1 = fully_connected_layer(inputs, train_data.inputs.shape[1], num_hidden)
with tf.name_scope('output-layer'):
    outputs = fully_connected_layer(hidden_1, num_hidden, train_data.num_classes, tf.identity)
with tf.name_scope('error'):
    # Keyword arguments avoid ambiguity in the logits/labels argument order.
    error = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=outputs, labels=targets))
with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(tf.argmax(outputs, 1), tf.argmax(targets, 1)),
        tf.float32))
with tf.name_scope('train'):
    train_step = tf.train.AdamOptimizer().minimize(error)
init = tf.global_variables_initializer()
In [ ]:
# The session is deliberately left open so that it can be reused below
# to compute predictions on the test inputs.
sess = tf.Session()
sess.run(init)
for e in range(10):
    running_error = 0.
    running_accuracy = 0.
    for input_batch, target_batch in train_data:
        _, batch_error, batch_acc = sess.run(
            [train_step, error, accuracy],
            feed_dict={inputs: input_batch, targets: target_batch})
        running_error += batch_error
        running_accuracy += batch_acc
    running_error /= train_data.num_batches
    running_accuracy /= train_data.num_batches
    print('End of epoch {0:02d}: err(train)={1:.2f} acc(train)={2:.2f}'
          .format(e + 1, running_error, running_accuracy))
    if (e + 1) % 5 == 0:
        valid_error = 0.
        valid_accuracy = 0.
        for input_batch, target_batch in valid_data:
            batch_error, batch_acc = sess.run(
                [error, accuracy],
                feed_dict={inputs: input_batch, targets: target_batch})
            valid_error += batch_error
            valid_accuracy += batch_acc
        valid_error /= valid_data.num_batches
        valid_accuracy /= valid_data.num_batches
        print('                 err(valid)={0:.2f} acc(valid)={1:.2f}'
              .format(valid_error, valid_accuracy))
Predicting test data classes and creating a Kaggle submission file¶
An optional Kaggle in Class competition (see email for the invite link; you will need to sign up with an ed.ac.uk email address to be able to enter) is being run on the CIFAR-100 (fine-grained) classification task. Scores for the competition are the proportion of test set inputs (for which no class labels are provided) whose class is correctly predicted. Half of the 10000 test inputs are used to calculate a public leaderboard score, which will be visible while the competition is in progress, and the other half are used to compute the private leaderboard score, which will only be unveiled at the end of the competition. Each entrant can make up to two submissions of predictions each day during the competition.
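The scoring described above amounts to a classification accuracy computed separately on two halves of the test set. The sketch below illustrates this on a tiny made-up example. Which test inputs fall in the public and private halves is not stated, so the first-half/second-half split used here is purely hypothetical.

```python
import numpy as np

def classification_accuracy(pred_classes, true_classes):
    """Proportion of predictions that match the true integer class labels."""
    return np.mean(pred_classes == true_classes)

# Made-up labels and predictions for eight test inputs.
true_classes = np.array([0, 1, 2, 3, 4, 5, 6, 7])
pred_classes = np.array([0, 1, 2, 0, 4, 5, 0, 0])

# HYPOTHETICAL split: first half public, second half private.
half = true_classes.shape[0] // 2
public_score = classification_accuracy(pred_classes[:half], true_classes[:half])
private_score = classification_accuracy(pred_classes[half:], true_classes[half:])

print(public_score, private_score)  # 0.75 0.5
```

Because the two scores are computed on different examples, a model tuned heavily against the public leaderboard may score noticeably differently on the private one.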
The code and helper function below illustrate how to use the predicted outputs of the TensorFlow network model we just trained to create a submission file which can be uploaded to Kaggle. The required format of the submission file is a .csv (comma-separated values) file with two columns: the first is the integer index of the test input in the array in the provided data file (i.e. first row 0, second row 1 and so on) and the second is the corresponding predicted class label as an integer between 0 and 99 inclusive. The predictions must be preceded by a header line, as in the following example:
Id,Class
0,81
1,35
2,12
…
Integer class label predictions can be computed from the class probability outputs of the model by performing an argmax operation along the last dimension.
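As a small illustration of the argmax step and the resulting file layout, the sketch below uses a made-up array of class probabilities for three inputs and three classes (rather than 10000 inputs and 100 classes) and prints the lines a submission built from it would contain.

```python
import numpy as np

# Made-up class probabilities: one row per input, one column per class.
probs = np.array([[0.10, 0.70, 0.20],
                  [0.50, 0.25, 0.25],
                  [0.20, 0.20, 0.60]])

# Index of the most probable class for each input (argmax over the last axis).
pred_classes = probs.argmax(-1)

# Header line followed by one 'index,class' line per input.
lines = ['Id,Class'] + ['{0},{1}'.format(i, c)
                        for i, c in enumerate(pred_classes)]
print('\n'.join(lines))
```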
In [ ]:
test_inputs = np.load(os.path.join(os.environ['MLP_DATA_DIR'], 'cifar-100-test-inputs.npz'))['inputs']
test_predictions = sess.run(tf.nn.softmax(outputs), feed_dict={inputs: test_inputs})
In [ ]:
def create_kaggle_submission_file(predictions, output_file, overwrite=False):
    # Check predictions are class probabilities for all 10000 test inputs.
    if predictions.shape != (10000, 100):
        raise ValueError('predictions should be an array of shape (10000, 100).')
    if not (np.all(predictions >= 0.) and
            np.all(predictions <= 1.)):
        raise ValueError('predictions should be an array of probabilities in [0, 1].')
    if not np.allclose(predictions.sum(-1), 1):
        raise ValueError('predictions rows should sum to one.')
    if os.path.exists(output_file) and not overwrite:
        raise ValueError('File already exists at {0}'.format(output_file))
    # Convert probabilities to integer class labels and pair with row indices.
    pred_classes = predictions.argmax(-1)
    ids = np.arange(pred_classes.shape[0])
    np.savetxt(output_file, np.column_stack([ids, pred_classes]), fmt='%d',
               delimiter=',', header='Id,Class', comments='')
In [ ]:
create_kaggle_submission_file(test_predictions, 'cifar-100-example-network-submission.csv', True)
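Before uploading, it may be worth reading the file back and checking it against the format described above. The following is a sketch of such a check using only the standard library. The function `check_submission_file` is a hypothetical helper and not part of the mlp package or of Kaggle's tooling, and the demonstration uses a three-row file written to a temporary directory.

```python
import csv
import os
import tempfile

def check_submission_file(path, num_rows=10000, num_classes=100):
    """Return True if the file matches the expected format, else raise ValueError."""
    with open(path) as f:
        rows = [row for row in csv.reader(f) if row]
    if not rows or rows[0] != ['Id', 'Class']:
        raise ValueError('First line should be the header Id,Class.')
    if len(rows) - 1 != num_rows:
        raise ValueError('Expected {0} prediction rows, found {1}.'
                         .format(num_rows, len(rows) - 1))
    for index, row in enumerate(rows[1:]):
        # Each row should be 'index,class' with indices counting up from zero.
        if len(row) != 2 or int(row[0]) != index:
            raise ValueError('Row {0} should start with Id {0}.'.format(index))
        if not 0 <= int(row[1]) < num_classes:
            raise ValueError('Class on row {0} is out of range.'.format(index))
    return True

# Small self-contained demonstration with a three-row file.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo-submission.csv')
with open(demo_path, 'w') as f:
    f.write('Id,Class\n0,81\n1,35\n2,12\n')
print(check_submission_file(demo_path, num_rows=3))  # True
```

For the file created above, the default arguments apply, e.g. check_submission_file('cifar-100-example-network-submission.csv').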