NOTE: This use case is not intended for resource-constrained devices.
Fingerprint Pattern Classification
Credit: AITS Cainvas Community
Photo by Manu Designer on Dribbble
A fingerprint, being unique to each person, can be divided into distinct pattern types. In this notebook, we identify real fingerprint patterns and classify them with a convolutional neural network (CNN).
Let's get started!¶
Importing the necessary libraries¶
In [1]:
import os
import random
import cv2
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras import layers
Getting Data¶
In [2]:
!wget -N https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/dataset_HFu5SVU.zip
!unzip -qo dataset_HFu5SVU.zip
dir = 'dataset'
Data Analysis¶
The data is in the format:¶
- name.png (For example, f0038_02.png)
- name.txt (For example, f0038_02.txt)
A sample of the content in the text file is given:
- Gender: M
- Class: T
- History: f0038_02.pct TL a2652.pct
There are 5 different classes namely:
- Arch (A)
- Left Loop (L)
- Right Loop (R)
- Tented Arch (T)
- Whorl (W)
Reading the text file and saving the required information to a csv file¶
In [3]:
labels = []
img_names = []
img_paths = []
gender = []
for file in os.listdir(dir):
    if file.endswith('.txt'):
        with open(os.path.join(dir, file), 'r') as t:
            content = t.readlines()
            gender.append(content[0].rsplit(' ')[1][0])
            img_name = content[2].rsplit(' ')[1][:-4] + '.png'
            img_paths.append(os.path.join(dir, img_name))
            img_names.append(img_name)
            labels.append(content[1].rsplit(' ')[1][0])
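As a sanity check, the slicing logic above can be traced on the sample lines shown earlier (values taken from the f0038_02.txt example in the Data Analysis section):

```python
# Simulated readlines() output for the sample text file shown above
content = [
    "Gender: M\n",
    "Class: T\n",
    "History: f0038_02.pct TL a2652.pct\n",
]

gender = content[0].rsplit(' ')[1][0]               # first char of "M\n" -> "M"
label = content[1].rsplit(' ')[1][0]                # first char of "T\n" -> "T"
img_name = content[2].rsplit(' ')[1][:-4] + '.png'  # drop ".pct", append ".png"

print(gender, label, img_name)  # M T f0038_02.png
```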
In [4]:
df = pd.DataFrame()
df['IMAGE PATH'] = img_paths
df['IMAGE NAME'] = img_names
df['LABEL'] = labels
df['GENDER'] = gender
In [5]:
df.head()
Out[5]:
Checking the data for any imbalance¶
In [6]:
fig, axes = plt.subplots(1, 2, figsize=(15, 5))
sns.countplot(ax=axes[0], data = df, x = 'LABEL')
sns.countplot(ax=axes[1], data = df, x = 'LABEL', hue = 'GENDER')
Out[6]:
In [7]:
df['LABEL'].value_counts()
Out[7]:
From the plots above, it is evident that the gender category is heavily imbalanced. We won't train the model on that category, so we can drop it. The labels themselves are well balanced, so we will continue to use them without any changes.
In [8]:
df.drop(columns = 'GENDER',inplace=True)
df.head()
Out[8]:
Mapping the classes to an integer¶
In [9]:
classes = list(np.unique(labels))
print(classes)
map_classes = dict(zip(classes, [t for t in range(len(classes))]))
print(map_classes)
df['MAPPED LABELS'] = [map_classes[i] for i in df['LABEL']]
df = df.sample(frac = 1)
df.to_csv('dataset.csv')
df.head()
Out[9]:
Plotting one image from each of the different classes¶
In [11]:
dim = len(classes)
fig, axes = plt.subplots(1, dim)
fig.subplots_adjust(0, 0, 2, 2)
for idx, i in enumerate(classes):
    dum = df[df['LABEL'] == i]
    random_num = random.choice(dum.index)
    label = df.loc[random_num]['LABEL']
    axes[idx].imshow(cv2.imread(df.loc[random_num]['IMAGE PATH']))
    axes[idx].set_title("CLASS: " + label + "\n" + "LABEL: " + str(map_classes[label]))
    axes[idx].axis('off')
Checking if the images are grayscale¶
In [12]:
random_number = random.randint(0, len(df) - 1)
img_path = df.loc[random_number]['IMAGE PATH']
gray_img = cv2.imread(img_path,0)
color_img = cv2.imread(img_path)
resized_img = cv2.resize(cv2.imread(img_path,0), (128,128)) #Resized Grayscale image
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
axes[0].set_title('SINGLE CHANNEL\n'+ str(gray_img.shape))
axes[0].imshow(gray_img, cmap='gray')
axes[1].set_title('THREE CHANNELS\n'+ str(color_img.shape))
axes[1].imshow(color_img)
axes[2].set_title('RESIZED IMAGE\n'+ str(resized_img.shape))
axes[2].imshow(resized_img, cmap='gray')
Out[12]:
The images that will be used are of size (128, 128). Now we can proceed to building the model.
Model and Inference¶
Data Preparation¶
In [13]:
X_data = df['IMAGE PATH']
y_data = df['MAPPED LABELS']
X_train, X_test, y_train, y_test = train_test_split(X_data, y_data, shuffle=True, test_size=0.01,stratify=y_data)
#Creating numpy arrays of images
X = []
y = []
for i in X_train:
    # Resize to the model's 128x128 input size
    X.append(cv2.resize(cv2.imread(i), (128, 128)))
for i in y_train:
    y.append(i)
X = np.array(X)
y = np.array(y)
# Converting the labels vector to one-hot format
y = keras.utils.to_categorical(y, 5)
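`keras.utils.to_categorical` simply builds a one-hot matrix: row i is all zeros except for a 1 at column y[i]. The same result can be sketched with NumPy alone, which shows exactly what the model's softmax output is compared against:

```python
import numpy as np

# Example mapped labels (hypothetical values, e.g. A=0, R=2, W=4)
y_int = np.array([0, 2, 4])

# Indexing an identity matrix by the label vector yields the one-hot rows,
# equivalent to keras.utils.to_categorical(y_int, 5)
one_hot = np.eye(5)[y_int]

print(one_hot.shape)  # (3, 5)
```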
In [14]:
print(f"Total number of Images: {len(X_data)}")
print(f"Number of Training Images: {len(X_train)}")
print(f"Number of Test Images: {len(X_test)}") # Saving a small number of images for model testing
print(f"Shape of Images: {X[0].shape}") #Printing the shape of Images
Model Architecture¶
In [15]:
model = keras.Sequential(
    [
        layers.Conv2D(32, input_shape=(128, 128, 3), padding="same", kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(32, kernel_size=(3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(128, kernel_size=(3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(5, activation="softmax", kernel_regularizer='l1_l2'),
    ]
)
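Each "same"-padded Conv2D keeps the spatial size, and each 2×2 MaxPooling2D halves it, so the six pooling layers shrink the 128×128 input down to 2×2 before the Flatten layer. A quick sketch of that arithmetic:

```python
# Six 2x2 poolings: 128 -> 64 -> 32 -> 16 -> 8 -> 4 -> 2
size = 128
for _ in range(6):
    size //= 2
print(size)  # 2

# The last conv layer has 128 filters, so Flatten yields 2 * 2 * 128 units
flatten_units = size * size * 128
print(flatten_units)  # 512
```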
Checking the model parameters¶
In [16]:
model.summary()
Let the training begin!¶
In [17]:
#Hyperparameters
LOSS = 'categorical_crossentropy'
OPTIMIZER = 'adam'
BATCH_SIZE = 64
EPOCHS = 20
In [ ]:
model.compile(loss=LOSS, optimizer=OPTIMIZER, metrics=['accuracy'])
history=model.fit(x=X, y=y, batch_size=BATCH_SIZE, epochs=EPOCHS, validation_split=0.2)
Plotting Loss and Accuracy graphs¶
In [ ]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['train', 'val'], loc='center right')
plt.show()
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'val'], loc='upper right')
plt.show()
Preparing data for testing¶
In [ ]:
test_X = []
for i in X_test:
    im = cv2.imread(i)
    im = cv2.resize(im, (128, 128))  # match the model's input size
    im = np.reshape(im, (1, 128, 128, 3))
    test_X.append(im)
test_X = np.array(test_X)
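Because each image is reshaped to (1, 128, 128, 3) before being appended, every element of test_X already carries the batch axis that model.predict expects, and squeeze(0) removes it again for plotting. A small NumPy sketch of that round trip, with a zero array standing in for a real image:

```python
import numpy as np

img = np.zeros((128, 128, 3), dtype=np.uint8)  # stand-in for a loaded image
batched = np.reshape(img, (1, 128, 128, 3))    # add a batch axis for predict()
restored = batched.squeeze(0)                  # drop it again for imshow()

print(batched.shape, restored.shape)  # (1, 128, 128, 3) (128, 128, 3)
```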
Plotting the predictions for the test images¶
In [ ]:
fig, axes = plt.subplots(5, 5)
fig.subplots_adjust(0, 0, 3, 3)
for i in range(5):
    for j in range(5):
        num = random.randint(0, len(test_X) - 1)
        display_image = test_X[num].squeeze(0)
        image = test_X[num]
        predicted_prob = model.predict(image)
        predicted_class = np.argmax(predicted_prob)
        ground_truth = classes[y_test.iloc[num]]
        axes[i, j].imshow(display_image)
        if classes[predicted_class] != ground_truth:
            t = 'PREDICTED {} \n GROUND TRUTH [{}]'.format(classes[predicted_class], ground_truth)
            axes[i, j].set_title(t, fontdict={'color': 'darkred'})
        else:
            t = '[CORRECT] {}'.format(classes[predicted_class])
            axes[i, j].set_title(t)
        axes[i, j].axis('off')
Saving the model¶
In [ ]:
#Saving the model
model.save('fingerprint.h5')
DeepCC¶
In [ ]:
!deepCC fingerprint.h5