
CIFAR-10 Image Classifier

Hindavi Churi

This project is about image classification on the CIFAR-10 dataset using a CNN model. We make modifications to the CNN layers to see the effect of the changes on accuracy. Many factors contribute to the accuracy, and each plays a different role. In this project we observe the changes in accuracy by experimenting with these factors.


Data

We use the CIFAR-10 dataset. This dataset consists of 10 classes, and each class consists of 6,000 images of size 32x32. The ten classes are as follows:

airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck


We now start with the existing code to understand the functioning of the classifier.


The code starts by importing the modules that we are going to use for the project. Here we have used the PyTorch framework, which is widely recognized for how smoothly it handles deep learning applications.


We define a batch size, which sets how many images are processed together when training and testing. The CIFAR-10 dataset is then loaded into training and testing sets. Before loading the data, we normalize it: normalization is an essential way to preprocess the data, done by subtracting the mean and then dividing by the standard deviation for each channel.
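A minimal sketch of this step, assuming the setup of the standard PyTorch CIFAR-10 tutorial that this post starts from; the batch size of 4 and the per-channel mean/std of 0.5 are assumptions, not confirmed by the post:

```python
import torch
import torchvision
import torchvision.transforms as transforms

batch_size = 4  # assumed; the post experiments with this value later

# Normalize each RGB channel: (pixel - mean) / std.
# Mean and std of 0.5 map pixel values from [0, 1] to [-1, 1].
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)

classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')
```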


We then randomly select a batch from the training set and see how it looks along with its labels.
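One way this batch preview might be done (a sketch; the un-normalization step assumes the mean/std of 0.5 used above):

```python
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5                               # undo the [-1, 1] normalization
    plt.imshow(np.transpose(img.numpy(), (1, 2, 0)))  # CHW -> HWC for matplotlib
    plt.show()

# Grab one random training batch and show it with its labels.
images, labels = next(iter(trainloader))
imshow(torchvision.utils.make_grid(images))
print(' '.join(classes[labels[j]] for j in range(batch_size)))
```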



CNN model:


We define a class called ‘Net’ where we build our model. A basic CNN model consists of convolutional, activation, pooling, and fully connected layers, and that is exactly how we have designed our model.


The basic pattern goes like: (convolution - activation - pooling) x multiple times, followed by the fully connected layers.


We start with a convolutional layer whose input size is 3, for the three RGB channels, and define our own output size. This output size becomes the input size for the next convolutional layer, if any. In between we have a pooling layer, which reduces the spatial size of the output from the first convolutional layer. Pooling can be done by various methods such as max pooling, average pooling, etc. Here we have used max pooling.


Next, we have defined 3 fully connected layers. Also, we have used the ReLU activation function after each convolutional layer. The main idea of an activation function is to bring non-linearity into the network, which in turn lets it capture the distinguishing features of each image.


Note: the final output size from the last fully connected layer should be the number of classes in our dataset. Here, we define 10.
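A minimal sketch of such a network, along the lines of the standard PyTorch tutorial model; the channel counts (6, 16) and hidden sizes (120, 84) are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)   # 3 RGB channels in, 6 maps out, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)    # max pooling halves the spatial size
        self.conv2 = nn.Conv2d(6, 16, 5)  # previous output size (6) is the input size
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 16 maps of 5x5 remain after two conv+pool stages
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)      # final output size = number of classes

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # convolution - activation - pooling
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)               # flatten all dims except the batch dim
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

net = Net()
```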


Loss function and optimizer:

For this project we use the CrossEntropyLoss function to evaluate the loss for our model.


The CrossEntropyLoss function calculates the loss for each image by measuring the difference between the predicted class scores and the actual label.

The optimizer is used to update the weights using the gradients of the loss that we just calculated, scaled by the learning rate.
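Sketched below with the SGD optimizer that the original tutorial code uses; the learning rate and momentum values are assumptions:

```python
import torch.optim as optim

criterion = nn.CrossEntropyLoss()                                # compares scores to labels
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)  # updates the weights
```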


Training the model:


We define the number of epochs, which says how many times we run through every batch of the training set. At the start of each batch we zero out the optimizer's gradients. We then run the forward pass, which computes the predictions and the loss, and the backward pass, which computes the gradients of the loss with respect to the weights; the optimizer step then updates the weights. Finally, we track the loss over each epoch.
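A sketch of this loop under the same assumptions as above; the epoch count of 2 and the print interval are assumptions:

```python
num_epochs = 2  # assumed starting value; the post varies this later

for epoch in range(num_epochs):
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(trainloader):
        optimizer.zero_grad()              # nullify gradients from the previous batch

        outputs = net(inputs)              # forward pass: compute predictions
        loss = criterion(outputs, labels)  # loss between predictions and labels
        loss.backward()                    # backward pass: compute gradients
        optimizer.step()                   # update weights using the gradients

        running_loss += loss.item()
        if i % 2000 == 1999:               # report average loss every 2000 batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')
```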


The output from the training model is as follows,

The loss for each epoch is calculated.


Saving the model:

We save the model we trained, because training on a big dataset takes time and we do not want to lose the learned weights if the PC breaks down mid-run.
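Saving only requires the model's state dictionary; the file name here is an assumption:

```python
PATH = './cifar_net.pth'            # assumed file name
torch.save(net.state_dict(), PATH)  # persist the learned weights to disk
```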


Testing the model:

We again randomly select a batch for testing.

The image batch is displayed along with its ground truth.


Now we load the model that we saved. Then we feed the randomly selected batch of images to the model.




We now predict the class of the selected batch against the trained model.
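A sketch of the load-and-predict step, reusing the names assumed above:

```python
# Reload the trained weights into a fresh network.
net = Net()
net.load_state_dict(torch.load(PATH))

# Take one random test batch and show its ground truth.
images, labels = next(iter(testloader))
print('Ground truth:', ' '.join(classes[labels[j]] for j in range(batch_size)))

# Feed the batch to the model; the highest-scoring class is the prediction.
outputs = net(images)
_, predicted = torch.max(outputs, 1)
print('Predicted:   ', ' '.join(classes[predicted[j]] for j in range(batch_size)))
```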

As you can see, the class of each image from the selected batch is predicted.


Accuracy:

Lastly, we calculate the accuracy for the model.


We find the evaluated accuracy to be 56%.


We also calculate the average accuracy for each class.
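Both numbers can be computed in one pass over the test set; a sketch, assuming the loader and class names defined earlier:

```python
correct = 0
total = 0
class_correct = [0] * 10
class_total = [0] * 10

with torch.no_grad():                        # no gradients needed for evaluation
    for images, labels in testloader:
        outputs = net(images)
        _, predicted = torch.max(outputs, 1) # predicted class per image
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        for t, p in zip(labels, predicted):  # tally per-class hits
            class_correct[t.item()] += int(t == p)
            class_total[t.item()] += 1

print(f'Overall accuracy: {100 * correct / total:.0f} %')
for i, cls in enumerate(classes):
    print(f'Accuracy of {cls}: {100 * class_correct[i] / class_total[i]:.0f} %')
```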




Though we successfully trained and tested the CIFAR-10 classifier, there are many things that can be looked into to improve the accuracy of the model.


My Contributions:


Modifications in the CNN layers:

We change the number of convolutional layers in the model. We introduce 6 more convolutional layers into the network, grouping them in pairs and applying batch normalization after each group. Batch normalization functions much like data normalization, but on the activations inside the network. Also, padding is used so that the size of the image does not change through the convolutions.


After each group a pooling function is applied. The number of fully connected layers remains the same.
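A hypothetical sketch of this deeper architecture; the exact channel counts and kernel sizes are assumptions, chosen so that 3x3 convolutions with padding of 1 keep the spatial size unchanged within a group:

```python
class DeeperNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Group 1: two convs + batch norm, 32x32 -> (pool) -> 16x16
        self.conv1a = nn.Conv2d(3, 32, 3, padding=1)
        self.conv1b = nn.Conv2d(32, 32, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        # Group 2: 16x16 -> (pool) -> 8x8
        self.conv2a = nn.Conv2d(32, 64, 3, padding=1)
        self.conv2b = nn.Conv2d(64, 64, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        # Group 3: 8x8 -> (pool) -> 4x4
        self.conv3a = nn.Conv2d(64, 128, 3, padding=1)
        self.conv3b = nn.Conv2d(128, 128, 3, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        self.pool = nn.MaxPool2d(2, 2)
        # Fully connected part keeps the same three layers as before
        self.fc1 = nn.Linear(128 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(self.bn1(F.relu(self.conv1b(F.relu(self.conv1a(x))))))
        x = self.pool(self.bn2(F.relu(self.conv2b(F.relu(self.conv2a(x))))))
        x = self.pool(self.bn3(F.relu(self.conv3b(F.relu(self.conv3a(x))))))
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
```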



Increasing the number of convolutional layers up to a certain point may increase the accuracy. Beyond that point the training accuracy may keep improving, but only because the model is overfitting. Hence, choosing a proper number of convolutional layers is necessary, depending on the type of problem we are dealing with.


Here are the results of the modifications,




For each class the average accuracy obtained is as follows,


Changing number of epochs:

Increasing the number of epochs simply means that we run through every batch of the training set a greater number of times.


While changing the number of epochs, we should keep in mind that making it extremely large can result in overfitting the data, and keeping it very low can result in underfitting. Hence, an appropriate number of epochs should be selected to get the best fit for your model.


Overfitting is recognized by comparing the training and test losses: the training loss keeps decreasing, but the test loss stops improving and starts to increase significantly. So be careful: even though the model's loss on the training data decreases, its performance on unseen data gets worse.


Here are the results from the changes that we made to the number of epochs. We set the number of epochs to 4. The results are as follows,




Changing the batch size:

Modifying the batch size can have a significant effect on the accuracy of the model. Depending on the complexity of the data, different batch sizes give different accuracies.

Increasing the batch size is suitable when the data is complex and will give more accurate results, whereas for simple and small data it is advisable to keep the batch size small.

Here we increase the batch size to 5. The results for this modification are as follows,




Adding a dropout function to the CNN model:

The basic functionality of dropout is to randomly zero out a fraction of the activations during training, which keeps the network from relying too heavily on any single feature. It can be applied to any part of the model: with the fully connected layers, or after each convolutional layer.

The addition of a dropout function can help overcome the overfitting issue.
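A hypothetical sketch of the fully connected variant, extending the Net class sketched earlier; the dropout probability of 0.5 is an assumption. For the convolutional variant, nn.Dropout2d after each convolutional layer would be the usual choice:

```python
class NetWithDropout(Net):
    def __init__(self, p=0.5):        # p = dropout probability (assumed)
        super().__init__()
        self.dropout = nn.Dropout(p)  # randomly zeroes activations during training

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)
        x = self.dropout(F.relu(self.fc1(x)))  # drop before the next fc layer
        x = self.dropout(F.relu(self.fc2(x)))
        return self.fc3(x)
```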


Here we apply the dropout function with the fully connected layers, and the results are as follows,


Here are the results with the dropout function applied after every convolutional layer.



Challenges faced:

There were many challenges. One of them was understanding the concept of input size and output size in the CNN model: a minute mismatch in the sizes would stop the training process. These problems were solved by carefully tracing the input and output sizes and by studying the concepts from the references.

Another issue was understanding the factors that may affect the accuracy of the model, and where to apply each of them appropriately.



Conclusion:

From all the experiments that we have seen above, we can say that a CNN model is very useful for classifying images. Proper use of all the factors contributing to the model's accuracy should be taken care of, and appropriate changes or modifications to these factors can result in a more accurate model. Hence, we have experimented with and built the CIFAR-10 image classifier successfully.



Code:




References:

https://medium.com

https://datascience.stackexchange.com
