Blood Cell Classification using Deep Learning
Credit: AITS Cainvas Community
Photo by Dana Pavlichko for Happy Cog on Dribbble
The diagnosis of blood-based diseases often involves identifying and characterizing patient blood samples, so automated methods to detect and classify blood cell subtypes have important medical applications. In this notebook we build a Convolutional Neural Network that classifies the different types of white blood cells from images. The notebook uses the BCCD dataset, which contains around 12,500 images in the training set; 59 images from the dataset are held out for testing.
Importing the dataset¶
!wget -N "https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/Blood_cell_image_dataset.zip"
!unzip -qo Blood_cell_image_dataset.zip
!rm Blood_cell_image_dataset.zip
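As a quick sanity check on the counts quoted above, a minimal sketch like the following lists the number of images per class, assuming the archive extracts to Blood_cell_image_dataset/images/ with TRAIN/ and TEST/ subfolders as used in the cells below.
# Count images per class in the extracted dataset (directory layout assumed from the cells below)
import os

for split in ["TRAIN", "TEST"]:
    split_dir = os.path.join("Blood_cell_image_dataset/images", split)
    for cls in sorted(os.listdir(split_dir)):
        cls_dir = os.path.join(split_dir, cls)
        if os.path.isdir(cls_dir):
            print(split, cls, len(os.listdir(cls_dir)))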
Importing the necessary libraries¶
#Importing necessary libraries
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPool2D
from tensorflow.keras.layers import ZeroPadding2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
import cv2
Displaying the data¶
We can see in the following cell that the images are RGB (3-channel). OpenCV loads them in BGR order, so we convert to RGB before displaying a sample eosinophil.
img = cv2.imread("Blood_cell_image_dataset/images/TRAIN/EOSINOPHIL/_0_1169.jpeg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB )
plt.title("Eosinophil")
plt.imshow(img)
img_width = 64
img_height = 64
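For a broader visual check, the sketch below plots one sample per white blood cell type. It assumes each TRAIN class folder contains .jpeg files (as the sample path above suggests) and simply picks the first file found in each.
# Show one sample image per class (first .jpeg found in each TRAIN class folder -- extension is an assumption)
import glob

classes = ["EOSINOPHIL", "LYMPHOCYTE", "MONOCYTE", "NEUTROPHIL"]
plt.figure(figsize=(12, 3))
for i, cls in enumerate(classes):
    sample_path = sorted(glob.glob("Blood_cell_image_dataset/images/TRAIN/" + cls + "/*.jpeg"))[0]
    sample = cv2.cvtColor(cv2.imread(sample_path), cv2.COLOR_BGR2RGB)
    plt.subplot(1, 4, i + 1)
    plt.imshow(sample)
    plt.title(cls)
    plt.axis("off")
plt.show()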
Preparing the data¶
In the subsequent cells we use ImageDataGenerator from Keras to load the images along with their labels for training the neural network, rescaling pixel values to [0, 1] and reserving 20% of the training folder as a validation split.
datagen = ImageDataGenerator(rescale = 1/255.0, validation_split = 0.2)
train_data_generator = datagen.flow_from_directory(directory="Blood_cell_image_dataset/images/TRAIN/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16, shuffle=True ,subset = "training")
validation_data_generator = datagen.flow_from_directory(directory="Blood_cell_image_dataset/images/TRAIN/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16, shuffle=True, subset = "validation")
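The generators infer the class order from the alphabetically sorted folder names. Printing class_indices confirms the mapping that the cell_dict used later for decoding relies on.
# Confirm the folder-name -> index mapping used by the generators
print(train_data_generator.class_indices)
# Expected: {'EOSINOPHIL': 0, 'LYMPHOCYTE': 1, 'MONOCYTE': 2, 'NEUTROPHIL': 3}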
The labels¶
The labels are one-hot encoded here, as our data is categorical in nature; the next cell displays one batch of encoded labels.
train_data_generator.next()[1]
The CNN Model¶
model = Sequential()
model.add(Conv2D(32, (3,3), input_shape=(64,64,3), activation="relu"))
model.add(MaxPool2D(2,2))
model.add(Conv2D(32, (3,3), activation="relu"))
model.add(MaxPool2D(2,2))
model.add(Conv2D(16, (3,3), activation="relu"))
model.add(MaxPool2D(2,2))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dense(4, activation="softmax"))
model.summary()
model.compile(optimizer="Adam", loss="categorical_crossentropy", metrics=['accuracy'])
my_callback = [tf.keras.callbacks.EarlyStopping(monitor = 'val_loss', patience = 5, restore_best_weights = True)]
history=model.fit(train_data_generator, steps_per_epoch=len(train_data_generator), epochs=100, validation_data=validation_data_generator, validation_steps = len(validation_data_generator), callbacks=my_callback)
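Since the EarlyStopping callback restores the best weights and may stop well before 100 epochs, it is worth checking how long training actually ran.
# EarlyStopping may halt well before 100 epochs; check how many actually ran
print("Epochs run:", len(history.history["loss"]))
print("Best val_loss:", min(history.history["val_loss"]))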
Preparing the test data¶
In the following cells we prepare the unseen test data and evaluate the model's performance on it.
datagen_test = ImageDataGenerator(rescale = 1/255.0)
test_data_generator = datagen_test.flow_from_directory(directory="Blood_cell_image_dataset/images/TEST/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16)
model.evaluate(test_data_generator)
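Beyond overall accuracy, a per-class breakdown is often informative. The sketch below assumes scikit-learn is available in the environment; it builds a non-shuffled test generator so that predictions line up with the true labels, then prints a classification report and confusion matrix.
# Per-class metrics on the test set (non-shuffled generator so labels align with predictions)
from sklearn.metrics import classification_report, confusion_matrix

eval_generator = datagen_test.flow_from_directory(directory="Blood_cell_image_dataset/images/TEST/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16, shuffle=False)
pred_probs = model.predict(eval_generator, steps=len(eval_generator))
pred_classes = np.argmax(pred_probs, axis=1)
print(classification_report(eval_generator.classes, pred_classes, target_names=list(eval_generator.class_indices.keys())))
print(confusion_matrix(eval_generator.classes, pred_classes))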
Model accuracy and loss trends¶
Let's visualize the accuracy and loss trends throughout the training process.
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
Visualizing the predictions of the model on unseen data¶
# Getting the predicted classes from the one-hot encoded predicted outputs
x, y = test_data_generator.next()
pred_array = []
for i in range(10):
    img = x[i]
    img = img.reshape(-1, 64, 64, 3)
    pred_val = model.predict(img)
    max_idx = np.argmax(pred_val)
    pred_array.append(max_idx)
#Making the Output meaningful using named classes
cell_dict = {0:"EOSINOPHIL", 1:"LYMPHOCYTE", 2:"MONOCYTE", 3:"NEUTROPHIL"}
predictions = {}
actual_val = {}
k = 0
for arr in y[:10]:
    actual_val[k] = cell_dict[np.argmax(arr)]
    k += 1
k = 0
for pred in pred_array:
    predictions[k] = cell_dict[pred]
    k += 1
print("ACTUAL:", actual_val)
print("PREDICTIONS:", predictions)
plt.figure(figsize = (20,20))
for i in range(10):
    plt.subplot(5, 5, i + 1)
    plt.imshow(x[i])
    plt.title('Original: {}, Predicted: {}'.format(actual_val[i], predictions[i]))
    plt.axis('Off')
plt.tight_layout()
plt.show()
model.save("blood_cell_classification.h5")
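As a quick check that the saved file reloads correctly, the sketch below loads the .h5 model and classifies a single image, assuming the sample path used earlier is still present.
# Reload the saved model and classify one image end-to-end
loaded_model = tf.keras.models.load_model("blood_cell_classification.h5")
sample = cv2.imread("Blood_cell_image_dataset/images/TRAIN/EOSINOPHIL/_0_1169.jpeg")
sample = cv2.cvtColor(sample, cv2.COLOR_BGR2RGB)
sample = cv2.resize(sample, (img_width, img_height)) / 255.0
probs = loaded_model.predict(sample.reshape(1, img_width, img_height, 3))
print(cell_dict[int(np.argmax(probs))], probs)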
DeepCC¶
!deepCC blood_cell_classification.h5