NOTE: This use case is not intended for resource-constrained devices.
Malaria Parasite Detection
Credit: AITS Cainvas Community
Photo by Kurzgesagt — In a Nutshell on YouTube
Malaria is a life-threatening disease caused by parasites that are transmitted to people through the bites of infected female Anopheles mosquitoes. This notebook uses a convolutional neural network to classify cell images from thin blood smears as parasitized or uninfected.
This notebook uses highly processed images from the Malaria Dataset from the National Library of Medicine.
Each color image is converted to a 50x50 grayscale image, which reduces the size of the dataset from ~350 MB to 40 MB.
Importing the dataset
In [1]:
!wget -N "https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/malaria-dataset-processed.zip"
!unzip -qo malaria-dataset-processed.zip
!rm malaria-dataset-processed.zip
In [2]:
#Importing necessary libraries
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPool2D
from tensorflow.keras.layers import ZeroPadding2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
import cv2
In [3]:
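# OpenCV loads images in BGR order; convert to RGB before plotting with matplotlib
img = cv2.imread("cell_images/train/Parasitized/C33P1thinF_IMG_20150619_114756a_cell_181.png")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.title("Infected cell")
plt.imshow(img)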
Out[3]:
In [4]:
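# OpenCV loads images in BGR order; convert to RGB before plotting with matplotlib
img = cv2.imread("cell_images/train/Uninfected/C1_thinF_IMG_20150604_104722_cell_79.png")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.title("Uninfected cell")
plt.imshow(img)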
Out[4]:
In [5]:
image_size = (50, 50)
Preparing the data
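The following cells use ImageDataGenerator from Keras to load the images along with their labels for training the neural network. Pixel values are rescaled to the range [0, 1], and a quarter of the images in the training directory are held out for validation (validation_split = 0.25).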
In [6]:
datagen = ImageDataGenerator(rescale = 1/255.0, validation_split = 0.25)
In [7]:
train_data_generator = datagen.flow_from_directory(directory="cell_images/train", target_size = image_size, class_mode="binary", batch_size = 16, subset = "training")
In [8]:
validation_data_generator = datagen.flow_from_directory(directory="cell_images/train", target_size = image_size, class_mode="binary", batch_size = 16, subset = "validation")
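The two generators above read the same directory but see disjoint files. A minimal sketch of how such a split can work, assuming the files of each class are listed in a fixed order and the first 25% go to validation; the helper `split_files` and the file names are hypothetical, not part of Keras:

```python
def split_files(files, validation_split=0.25):
    """Return (training, validation) slices of one class's file list."""
    cut = int(validation_split * len(files))  # number of validation files
    return files[cut:], files[:cut]

# Hypothetical file names standing in for one class folder
files = [f"cell_{i}.png" for i in range(10)]
train_files, val_files = split_files(files)
print(len(train_files), len(val_files))  # 8 2
```

Because the split is by position rather than by random draw, calling it twice with the same directory yields the same two subsets, which is why both generators can share one `datagen` object.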
flow_from_directory assigns labels in alphanumeric order of the class folder names, so label 0 means the cell is Parasitized and label 1 means it is Uninfected.
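The label assignment can be reproduced by sorting the class folder names. A small sketch, with the folder list written out by hand rather than read from disk:

```python
# Class folders under cell_images/train, in arbitrary order
class_dirs = ["Uninfected", "Parasitized"]

# Sort alphanumerically and number the classes from 0
class_indices = {name: index for index, name in enumerate(sorted(class_dirs))}
print(class_indices)  # {'Parasitized': 0, 'Uninfected': 1}
```

On the real generator, the same mapping should be available as `train_data_generator.class_indices`.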
In [9]:
train_data_generator.labels
Out[9]:
The Model
In [10]:
model = Sequential()
model.add(Conv2D(16, (3,3), input_shape=(*image_size, 3), activation="relu"))
model.add(MaxPool2D(2,2))
# model.add(Dropout(0.2))
model.add(Conv2D(32, (3,3), activation="relu"))
model.add(MaxPool2D(2,2))
# model.add(Dropout(0.3))
model.add(Conv2D(16, (3,3), activation="relu"))
model.add(MaxPool2D(2,2))
model.add(Flatten())
model.add(Dense(32, activation="relu"))
# model.add(Dense(64, activation="relu"))
# model.add(Dropout(0.5))
model.add(Dense(1, activation="sigmoid"))
In [11]:
model.summary()
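The shapes and parameter counts reported by the summary can be checked by hand. The sketch below assumes the Keras defaults used in the model cell (stride 1 and 'valid' padding for Conv2D, non-overlapping 2x2 pooling); the helper names `conv` and `pool` are made up for this illustration:

```python
def conv(size, channels_in, filters, kernel=3):
    """'valid' convolution with stride 1: returns (output size, trainable params)."""
    params = kernel * kernel * channels_in * filters + filters  # weights + biases
    return size - kernel + 1, params

def pool(size, window=2):
    """Non-overlapping max pooling: floor division by the window size."""
    return size // window

size, channels, total = 50, 3, 0
for filters in (16, 32, 16):          # the three Conv2D + MaxPool2D stages
    size, params = conv(size, channels, filters)
    total += params
    size = pool(size)
    channels = filters

flat = size * size * channels         # Flatten: 4 * 4 * 16 = 256
total += flat * 32 + 32               # Dense(32)
total += 32 * 1 + 1                   # Dense(1)
print(size, flat, total)              # 4 256 17969
```

The spatial size goes 50 -> 48 -> 24 -> 22 -> 11 -> 9 -> 4, so the dense layers see only 256 features.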
In [12]:
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=['accuracy'])
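For reference, the loss chosen above penalizes confident wrong predictions heavily. A minimal pure-Python sketch of binary cross-entropy for a single sample; the clipping constant `eps` is an assumption added here to avoid log(0) and is not taken from the notebook:

```python
import math

def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one sample: y_true is 0 or 1, p is the predicted probability of class 1."""
    p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

print(round(binary_crossentropy(1, 0.9), 4))  # 0.1054
print(round(binary_crossentropy(1, 0.1), 4))  # 2.3026
```

With the label mapping used in this notebook, `y_true = 1` corresponds to an Uninfected cell.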
In [13]:
cb = [tf.keras.callbacks.EarlyStopping(monitor = 'val_loss', patience = 5, restore_best_weights = True)]
history = model.fit(train_data_generator,
                    steps_per_epoch=len(train_data_generator),
                    epochs=50,
                    validation_data=validation_data_generator,
                    validation_steps=len(validation_data_generator),
                    callbacks=cb)
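The callback means training may end well before epoch 50. A simplified sketch of the patience rule, assuming a plain "strictly lower val_loss counts as improvement" comparison; the function `early_stopping` and the loss values are illustrative only:

```python
def early_stopping(val_losses, patience=5):
    """Return (epoch at which training stops, epoch with the best val_loss)."""
    best, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0  # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:                     # no improvement for `patience` epochs
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch           # ran out of epochs first

# Made-up validation losses: best value at epoch 2, then five epochs without improvement
losses = [0.50, 0.40, 0.30, 0.35, 0.36, 0.34, 0.33, 0.31, 0.50]
print(early_stopping(losses))  # (7, 2)
```

With restore_best_weights = True, the model would end up with the weights from the best epoch (epoch 2 in this toy run) rather than those from the final epoch.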
In [14]:
# Use the dedicated test generator (no validation_split) so that every test image is evaluated
datagen_test = ImageDataGenerator(rescale = 1/255.0)
test_data_generator = datagen_test.flow_from_directory(directory="cell_images/valid", target_size = image_size, class_mode="binary", batch_size = 16)
In [15]:
test_data_generator.labels
Out[15]:
In [16]:
model.evaluate(test_data_generator)
Out[16]:
In [17]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
In [18]:
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
In [19]:
# Take one batch of test images and labels
x, y = next(test_data_generator)
pred_array = []
for i in range(10):
    img = x[i]
    img = img.reshape(-1, 50, 50, 3)
    pred_val = model.predict(img)
    # Threshold the sigmoid output at 0.5 to get a class label
    if pred_val > 0.5:
        pred_val = 1
    else:
        pred_val = 0
    pred_array.append(pred_val)
print("Predicted Values:", pred_array)
print("Actual Values:", y[:10])
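The per-image loop can also be written as a single vectorized threshold over a whole batch. A sketch using NumPy, where `probs` is a made-up stand-in for the output of `model.predict(x)` and `labels` for `y`:

```python
import numpy as np

# Stand-in for model.predict(x): one sigmoid output per image, shape (batch, 1)
probs = np.array([[0.91], [0.12], [0.50], [0.73]])
labels = np.array([1.0, 0.0, 1.0, 1.0])

# Same rule as the loop: strictly greater than 0.5 maps to class 1
preds = (probs.ravel() > 0.5).astype(int)
accuracy = (preds == labels).mean()
print(preds.tolist(), accuracy)  # [1, 0, 0, 1] 0.75
```

An output of exactly 0.5 falls into class 0 here, matching the `if pred_val > 0.5` test in the cell above.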
Visualizing the predictions of the trained model on unseen data
In [20]:
plt.figure(figsize = (10,10))
for i in range(10):
    plt.subplot(5,5,i+1)
    plt.imshow(x[i])
    plt.title('Original: {}, Predicted: {}'.format(y[i], pred_array[i]))
    plt.axis('off')
plt.subplots_adjust(left=1.5, right=3, top=1.2)
plt.show()
In [21]:
model.save("malaria_parasite_detection.h5")
deepCC
In [22]:
!deepCC "malaria_parasite_detection.h5"