Blood Cell Classification using Deep Learning
Credit: AITS Cainvas Community
Photo by Dana Pavlichko for Happy Cog on Dribbble
The diagnosis of blood-based diseases often involves identifying and characterizing patient blood samples, so automated methods to detect and classify blood cell subtypes have important medical applications. In this notebook we build a Convolutional Neural Network that classifies the different types of white blood cells from images. The notebook uses the BCCD dataset, which contains around 12,500 images in the training set; 59 images from the dataset are held out for testing.
Importing the dataset¶
!wget -N "https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/Blood_cell_image_dataset.zip"
!unzip -qo Blood_cell_image_dataset.zip
!rm Blood_cell_image_dataset.zip
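As a quick sanity check on the counts quoted above, a minimal sketch like the following lists the number of images per class, assuming the archive extracts to Blood_cell_image_dataset/images/ with TRAIN/ and TEST/ subfolders as used in the cells below.
# Count images per class in the extracted dataset (directory layout assumed from the cells below)
import os

for split in ["TRAIN", "TEST"]:
    split_dir = os.path.join("Blood_cell_image_dataset/images", split)
    for cls in sorted(os.listdir(split_dir)):
        cls_dir = os.path.join(split_dir, cls)
        if os.path.isdir(cls_dir):
            print(split, cls, len(os.listdir(cls_dir)))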
Importing the necessary libraries¶
#Importing necessary libraries
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPool2D
from tensorflow.keras.layers import ZeroPadding2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
import cv2
Displaying the data¶
We can see in the following cell that the images are RGB (3-channel). OpenCV loads them in BGR order, so we convert to RGB before displaying a sample eosinophil.
img = cv2.imread("Blood_cell_image_dataset/images/TRAIN/EOSINOPHIL/_0_1169.jpeg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB )
plt.title("Eosinophil")
plt.imshow(img)
img_width = 64
img_height = 64
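For a broader visual check, the sketch below plots one sample per white blood cell type. It assumes each TRAIN class folder contains .jpeg files (as the sample path above suggests) and simply picks the first file found in each.
# Show one sample image per class (first .jpeg found in each TRAIN class folder -- extension is an assumption)
import glob

classes = ["EOSINOPHIL", "LYMPHOCYTE", "MONOCYTE", "NEUTROPHIL"]
plt.figure(figsize=(12, 3))
for i, cls in enumerate(classes):
    sample_path = sorted(glob.glob("Blood_cell_image_dataset/images/TRAIN/" + cls + "/*.jpeg"))[0]
    sample = cv2.cvtColor(cv2.imread(sample_path), cv2.COLOR_BGR2RGB)
    plt.subplot(1, 4, i + 1)
    plt.imshow(sample)
    plt.title(cls)
    plt.axis("off")
plt.show()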
Preparing the data¶
In the subsequent cells we use ImageDataGenerator from Keras to load the images along with their labels for training the neural network, rescaling pixel values to [0, 1] and reserving 20% of the training folder as a validation split.
datagen = ImageDataGenerator(rescale = 1/255.0, validation_split = 0.2)
train_data_generator = datagen.flow_from_directory(directory="Blood_cell_image_dataset/images/TRAIN/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16, shuffle=True ,subset = "training")
validation_data_generator = datagen.flow_from_directory(directory="Blood_cell_image_dataset/images/TRAIN/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16, shuffle=True, subset = "validation")
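The generators infer the class order from the alphabetically sorted folder names. Printing class_indices confirms the mapping that the cell_dict used later for decoding relies on.
# Confirm the folder-name -> index mapping used by the generators
print(train_data_generator.class_indices)
# Expected: {'EOSINOPHIL': 0, 'LYMPHOCYTE': 1, 'MONOCYTE': 2, 'NEUTROPHIL': 3}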
The labels¶
The labels are one-hot encoded here, as our data is categorical in nature; the next cell displays one batch of encoded labels.
train_data_generator.next()[1]
The CNN Model¶
model = Sequential()
model.add(Conv2D(32, (3,3), input_shape=(64,64,3), activation="relu"))
model.add(MaxPool2D(2,2))
model.add(Conv2D(32, (3,3), activation="relu"))
model.add(MaxPool2D(2,2))
model.add(Conv2D(16, (3,3), activation="relu"))
model.add(MaxPool2D(2,2))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dense(4, activation="softmax"))
model.summary()
model.compile(optimizer="Adam", loss="categorical_crossentropy", metrics=['accuracy'])
my_callback = [tf.keras.callbacks.EarlyStopping(monitor = 'val_loss', patience = 5, restore_best_weights = True)]
history=model.fit(train_data_generator, steps_per_epoch=len(train_data_generator), epochs=100, validation_data=validation_data_generator, validation_steps = len(validation_data_generator), callbacks=my_callback)
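Since the EarlyStopping callback restores the best weights and may stop well before 100 epochs, it is worth checking how long training actually ran.
# EarlyStopping may halt well before 100 epochs; check how many actually ran
print("Epochs run:", len(history.history["loss"]))
print("Best val_loss:", min(history.history["val_loss"]))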
Preparing the test data¶
In the following cells we prepare the unseen test data and evaluate the model's performance on it.
datagen_test = ImageDataGenerator(rescale = 1/255.0)
test_data_generator = datagen_test.flow_from_directory(directory="Blood_cell_image_dataset/images/TEST/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16)
model.evaluate(test_data_generator)
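Beyond overall accuracy, a per-class breakdown is often informative. The sketch below assumes scikit-learn is available in the environment; it builds a non-shuffled test generator so that predictions line up with the true labels, then prints a classification report and confusion matrix.
# Per-class metrics on the test set (non-shuffled generator so labels align with predictions)
from sklearn.metrics import classification_report, confusion_matrix

eval_generator = datagen_test.flow_from_directory(directory="Blood_cell_image_dataset/images/TEST/", target_size = (img_width, img_height), color_mode="rgb", class_mode="categorical", batch_size = 16, shuffle=False)
pred_probs = model.predict(eval_generator, steps=len(eval_generator))
pred_classes = np.argmax(pred_probs, axis=1)
print(classification_report(eval_generator.classes, pred_classes, target_names=list(eval_generator.class_indices.keys())))
print(confusion_matrix(eval_generator.classes, pred_classes))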
Model accuracy and loss trends¶
Let's visualize the accuracy and loss trends throughout the training process.
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
Visualizing the predictions of the model on unseen data¶
# Getting the predicted classes from the one-hot encoded predicted outputs
x, y = test_data_generator.next()
pred_array = []
for i in range(10):
    img = x[i]
    img = img.reshape(-1, 64, 64, 3)
    pred_val = model.predict(img)
    max_idx = np.argmax(pred_val)
    pred_array.append(max_idx)
#Making the Output meaningful using named classes
cell_dict = {0:"EOSINOPHIL", 1:"LYMPHOCYTE", 2:"MONOCYTE", 3:"NEUTROPHIL"}
predictions = {}
actual_val = {}
k = 0
for arr in y[:10]:
    actual_val[k] = cell_dict[np.argmax(arr)]
    k += 1
k = 0
for pred in pred_array:
    predictions[k] = cell_dict[pred]
    k += 1
print("ACTUAL:", actual_val)
print("PREDICTIONS:", predictions)
plt.figure(figsize = (20,20))
for i in range(10):
    plt.subplot(5, 5, i + 1)
    plt.imshow(x[i])
    plt.title('Original: {}, Predicted: {}'.format(actual_val[i], predictions[i]))
    plt.axis('Off')
plt.tight_layout()
plt.show()
model.save("blood_cell_classification.h5")
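As a quick check that the saved file reloads correctly, the sketch below loads the .h5 model and classifies a single image, assuming the sample path used earlier is still present.
# Reload the saved model and classify one image end-to-end
loaded_model = tf.keras.models.load_model("blood_cell_classification.h5")
sample = cv2.imread("Blood_cell_image_dataset/images/TRAIN/EOSINOPHIL/_0_1169.jpeg")
sample = cv2.cvtColor(sample, cv2.COLOR_BGR2RGB)
sample = cv2.resize(sample, (img_width, img_height)) / 255.0
probs = loaded_model.predict(sample.reshape(1, img_width, img_height, 3))
print(cell_dict[int(np.argmax(probs))], probs)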
DeepCC¶
!deepCC blood_cell_classification.h5