This is first Project as an intro to deep learning.Deep Learning has been quite around a bit and is solving really complex problems.Deep learning has enabled to achieve a sate of art performances oer various tasks like human face recognition,object Detection
and several NLP tasks. This projects involves deep learning for recognition of handwritten digits . Later we shall also see how to deploy this model as an web app .so that it could be used in a real life scenario.
Now lets us begin by importing libraries of Python .We will be analyzing the data present and the features their types and the target variable
Now lets load the dataset
Lets see how many training and testing samples we have got how many features we have
Int64Index: 70000 entries, 0 to 27999 Columns: 785 entries, label to pixel99 dtypes: float64(1), int64(784) memory usage: 419.8 MB
Now lets check wether we have any missing values or not
label 28000 pixel0 0 pixel1 0 pixel10 0 pixel100 0 pixel101 0 pixel102 0 pixel103 0 pixel104 0 pixel105 0 pixel106 0 pixel107 0 pixel108 0 pixel109 0 pixel11 0 pixel110 0 pixel111 0 pixel112 0 pixel113 0 pixel114 0 pixel115 0 pixel116 0 pixel117 0 pixel118 0 pixel119 0 pixel12 0 pixel120 0 pixel121 0 pixel122 0 pixel123 0 ... pixel784 0
We don't have any missing values in the train dataset.as we can see we have columns as label and pixel values from 1 to 784.Basically the MNIST dataset that we are using for classification is having 28x28 images which have been flattened to vector of pixels.
Now lets visualize some of the digits in the Data.As we have seen that the data provided to us are in format of
[sample,784 pixels]
.But in order to visualize those images ,we need to reshape them as 28*28 images . Lets see how its done
Now as we have visualized some of the data in the dataset.Lets gets ready for training our classifier on but lets wait.we need to pre-process our data so that our model is easy to train and results are much better .
we need to normalize our images because of varying intensities of pixel values.We know that pixel value ranges from
0-255
.and images which we are dealing with are greyscaled images i.e most of them would e having pixel values as 0 or 255.Normalization basically compresses these varying intensities of pixel values .
Now lets split our train test data and ensure reproducibility by seeding.As data provided to us is already split into train and test .Just put a random seed and start training the Model
Convolutional Neural Networks are very similar to ordinary Neural Networks.Each neuron receives some inputs, performs a dot product and optionally follows it with a non-linearity.
We could also use a traditional Neural Network like the feed forward neural network . But just imagine the situation.
< the first layer of our Network would be a input layer consisting of our feature vector(input vectors).In our case feature
vectors are the raw image pixels.28x28 pixels i.e 784 raw pixel values .and if we would be having a colored image of RGB channel this would result in 2352
length feature vector.And our network would be having multiple hidden layers ,this even worsens the scenario.
Convolutional Neural Networks on the other hand exploits spatial relationship between the pixel values.Unlike
We woluld be using Keras and Tensorflow for Building Convolutional Neural Networks.
For keras some preprocessing is required for feeding our data into model.we need to put our input images into a a particular format.images we have are currently in form [height,width,channel]
but keras need it input in a format
[no of samples,height,width,channels]
We would be evaluating our model's perfomance on a multiclass classification metric. i.e cross entropy with logits keras has this evaluation metric predefined.but it requires labels to one-hot-encoded
. So lets convert our labels
into one hot encoded vectors.
________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d_1 (Conv2D) (None, 28, 28, 64) 640 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 14, 14, 64) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 14, 14, 32) 18464 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 7, 7, 32) 0 _________________________________________________________________ conv2d_3 (Conv2D) (None, 7, 7, 16) 4624 _________________________________________________________________ max_pooling2d_3 (MaxPooling2 (None, 3, 3, 16) 0 _________________________________________________________________ dropout_1 (Dropout) (None, 3, 3, 16) 0 _________________________________________________________________ flatten_1 (Flatten) (None, 144) 0 _________________________________________________________________ dense_1 (Dense) (None, 128) 18560 _________________________________________________________________ dropout_2 (Dropout) (None, 128) 0 _________________________________________________________________ dense_2 (Dense) (None, 10) 1290 ================================================================= Total params: 43,578 Trainable params: 43,578 Non-trainable params: 0 _________________________________________________________________
Since we have now defined our model , we will just need to define the loss function evaluation metrics and epochs.
Train on 33600 samples, validate on 8400 samples Epoch 1/10 - 65s - loss: 0.6216 - acc: 0.7924 - val_loss: 0.1098 - val_acc: 0.9671 Epoch 2/10 - 67s - loss: 0.2327 - acc: 0.9252 - val_loss: 0.0763 - val_acc: 0.9777 Epoch 3/10 - 65s - loss: 0.1782 - acc: 0.9452 - val_loss: 0.0641 - val_acc: 0.9804 Epoch 4/10 - 66s - loss: 0.1494 - acc: 0.9550 - val_loss: 0.0524 - val_acc: 0.9837 Epoch 5/10 - 65s - loss: 0.1321 - acc: 0.9583 - val_loss: 0.0507 - val_acc: 0.9850 Epoch 6/10 - 66s - loss: 0.1202 - acc: 0.9632 - val_loss: 0.0513 - val_acc: 0.9846 Epoch 7/10 - 66s - loss: 0.1157 - acc: 0.9640 - val_loss: 0.0455 - val_acc: 0.9876 Epoch 8/10 - 66s - loss: 0.1042 - acc: 0.9676 - val_loss: 0.0439 - val_acc: 0.9865 Epoch 9/10 - 67s - loss: 0.0995 - acc: 0.9693 - val_loss: 0.0404 - val_acc: 0.9882 Epoch 10/10 - 65s - loss: 0.0908 - acc: 0.9713 - val_loss: 0.0383 - val_acc: 0.9886
we have trained our model on 10 epochs.20 % data has been set aside for validation.
The above graph clearly indicates how out model performs both on the test set and validation set.we have used 20% of our training data as our validation set
Clearly our model performs well on the train set with accuracy of about 97.13%
and on our validation set
98.86%
>.Overfitting occurs when accuracy on the train data is higher than the validation dataset or the test dataset
Now we will check our predictions on the testing dataset .lets verify by classifying one image from the dataset as well as plotting it.
But before predicting our output we need to transform our test data into format data and do some preprocessing i.e normalizing pixel intensities.
Clearly the image shows that the image is that of number 3.Lets verify it by predicting it
array([[1.35892244e-11, 6.63674093e-09, 1.89812010e-06, 9.99997854e-01, 5.45060850e-13, 7.55617506e-08, 2.06700043e-12, 1.03831745e-07, 1.64455578e-08, 2.63139732e-09]], dtype=float32)
In the last layer of Convolutional Neural Networks ,we have used a softmax classifier.It output probabilities of various classes i.e Ten probabilities corresponding to digits 0-9
.If you observe carefully the index 3 of the array
has the probability 0.99 which is the number 3 class of our dataset.
Check here for the MNIST Recognition App