NOTE: This use case is not intended for resource-constrained devices.
Detecting Covid19 using lung CT scans¶
Credit: AITS Cainvas Community
Photo by Cloudy gif
Using lung CT scans to predict whether a person has COVID-19.
Deep learning models have proven useful and efficient in the medical field for processing scans, X-rays and other medical data and extracting useful information from them.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers
import os
import random
Dataset¶
In [2]:
!wget -N https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/COVID_CT_SCAN.zip
!unzip -qo COVID_CT_SCAN.zip
!rm COVID_CT_SCAN.zip
The dataset contains the following:
* CT_COVID - This folder has images corresponding to positive COVID-19 cases.
* CT_NonCOVID - This folder has images corresponding to negative COVID-19 cases.
* An xlsx file - Contains the metadata of the images.
In [3]:
data_dir = 'COVID_CT_SCAN'
print("Number of samples")
for f in os.listdir(data_dir):
    # Count files only inside the class sub-folders (this skips the xlsx file)
    if os.path.isdir(os.path.join(data_dir, f)):
        print(f, " : ", len(os.listdir(os.path.join(data_dir, f))))
It's an almost balanced dataset.
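One way to make "almost balanced" concrete is to look at each class's share of the total. The following is a minimal sketch: the helper `class_shares` is hypothetical, and the counts passed to it are placeholders rather than the dataset's real numbers, which are printed by the cell above.

```python
def class_shares(counts):
    """Return each class's fraction of the total number of samples."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples found")
    return {name: n / total for name, n in counts.items()}


# Hypothetical counts for illustration only; substitute the numbers printed above.
example_counts = {"CT_COVID": 45, "CT_NonCOVID": 55}
shares = class_shares(example_counts)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Shares close to 50% each suggest that plain accuracy is a reasonable metric; a strongly skewed split would call for other metrics or class weighting.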
In [4]:
batch_size = 64
print("Training set")
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=113,
    batch_size=batch_size)
print("Validation set")
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=113,
    batch_size=batch_size)
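With `validation_split=0.2`, roughly one image in five is held out for validation, and passing the same `seed` to both calls keeps the two subsets from overlapping. The arithmetic can be sketched as below. `split_sizes` is a hypothetical helper, the sample count of 1000 is a placeholder, and the rounding (validation size truncated to an integer) is an assumption about how Keras divides the files.

```python
import math


def split_sizes(n_samples, validation_split=0.2, batch_size=64):
    """Estimate sample and batch counts for a train/validation split."""
    # Assumption: the validation size is truncated to an integer.
    n_val = int(validation_split * n_samples)
    n_train = n_samples - n_val
    return {
        "train_samples": n_train,
        "val_samples": n_val,
        # The last batch may be smaller than batch_size, hence the ceiling.
        "train_batches": math.ceil(n_train / batch_size),
        "val_batches": math.ceil(n_val / batch_size),
    }


# Placeholder sample count; use the total reported by the cell above.
sizes = split_sizes(1000)
print(sizes)
```

The "Found ... files" messages printed by the cell above give the actual counts to compare against.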
Looking into the classes
In [5]:
class_names = train_ds.class_names
print(class_names)
Visualization¶
In [6]:
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
In [7]:
print("Shape of one training batch")
for image_batch, labels_batch in train_ds:
    print("Input: ", image_batch.shape)
    print("Labels: ", labels_batch.shape)
    break
In [8]:
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
In [9]:
# Normalizing the pixel values
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))
Model¶
In [10]:
model = tf.keras.models.Sequential([
    layers.Conv2D(16, 3, padding='same', activation='relu', input_shape=(256, 256, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])
model.summary()
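The size of this network can be worked out by hand, which helps explain why the use case is not aimed at resource-constrained devices. The sketch below applies the usual parameter formulas for convolutional and dense layers to the architecture defined above; `conv2d_params` and `dense_params` are hypothetical helper names, and the result is meant to be compared against `model.summary()` rather than taken as given.

```python
def conv2d_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(n_in, n_out):
    """Weights plus biases of a Dense layer."""
    return n_in * n_out + n_out


size, channels = 256, 3  # input_shape = (256, 256, 3)
total = 0
for filters in (16, 32, 64):
    total += conv2d_params(3, channels, filters)
    channels = filters
    size //= 2  # padding='same' keeps the size; each MaxPooling2D() halves it

flat = size * size * channels  # length of the vector produced by Flatten()
units_in = flat
for units in (128, 64, 1):
    total += dense_params(units_in, units)
    units_in = units

print("Flattened features:", flat)
print("Total parameters:", total)
```

Under these formulas almost all of the parameters sit in the first Dense layer, because it connects every flattened feature to 128 units.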
In [11]:
epochs=16
history = model.fit(train_ds, validation_data=val_ds, epochs=epochs)
In [12]:
model.evaluate(val_ds)
There is a gap between the training and validation accuracy. This high variance can be reduced by training with a larger dataset, which should also improve validation accuracy.
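The gap can be tracked per epoch from the `history.history` dictionary returned by `model.fit`. The sketch below is a rough illustration: `accuracy_gap` is a hypothetical helper, and the numbers in `example_history` are made up to show the shape of the data, not results from this notebook.

```python
def accuracy_gap(history):
    """Per-epoch difference between training and validation accuracy."""
    return [t - v for t, v in zip(history["accuracy"], history["val_accuracy"])]


# Made-up values for illustration only; pass history.history in the notebook.
example_history = {
    "accuracy": [0.60, 0.75, 0.90],
    "val_accuracy": [0.58, 0.68, 0.72],
}
gaps = accuracy_gap(example_history)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: gap = {gap:.2f}")
```

A gap that keeps widening while validation accuracy stalls is the usual sign that the model is fitting the training set more closely than it generalizes.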
Plotting the metrics¶
In [13]:
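def plot(history, variable, variable2):
    # Plot a training metric and its validation counterpart for each epoch
    plt.plot(range(len(history[variable])), history[variable], label=variable)
    plt.plot(range(len(history[variable2])), history[variable2], label=variable2)
    plt.title(variable)
    plt.xlabel("epoch")
    plt.legend()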
In [14]:
plot(history.history, "accuracy", 'val_accuracy')
In [15]:
plot(history.history, "loss", "val_loss")
Prediction¶
In [16]:
# pick a random sample from the first validation batch
for i in val_ds.as_numpy_iterator():
    img, label = i
    x = random.randint(0, len(img) - 1)    # index within the actual batch size
    plt.axis('off')   # remove axes
    plt.imshow(img[x])    # shape from (64, 256, 256, 3) --> (256, 256, 3)
    output = model.predict(np.expand_dims(img[x], 0))    # getting output; input shape (256, 256, 3) --> (1, 256, 256, 3)
    pred = int(output[0][0] > 0.5)    # sigmoid output above 0.5 --> class index 1
    print("Predicted: ", class_names[pred])    # Picking the label from class_names based on the model output
    print("True: ", class_names[label[x]], "( ", output[0][0], " --> ", pred, " )")
    break
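Because the model ends in a single sigmoid unit, its output is best read as the score for class index 1, and the 0.5 threshold picks between the two entries of `class_names`. The sketch below isolates that step. `decode_prediction` is a hypothetical helper, and `example_class_names` assumes the folders are listed in alphanumeric order, which should be checked against the list printed earlier.

```python
def decode_prediction(prob, class_names, threshold=0.5):
    """Map a sigmoid output to a class name using a fixed threshold."""
    index = int(prob > threshold)  # 0 at or below the threshold, 1 above it
    return class_names[index]


# Assumed ordering; confirm it against the printed class_names.
example_class_names = ["CT_COVID", "CT_NonCOVID"]
for prob in (0.1, 0.9):
    print(prob, "-->", decode_prediction(prob, example_class_names))
```

Under that ordering, an output near 1 means CT_NonCOVID and an output near 0 means CT_COVID, so a high score here does not indicate a positive case.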
deepC¶
In [20]:
model.save('lungct.h5')
#!deepCC lungct.h5