Hand Sign Recognition Application

Credit: AITS Cainvas Community

Photo by Kristen Brittain on Dribbble

IoT Use Case

This model could be useful in smartwatches, where different hand signs can trigger different actions. For example, 1R could pause a song and 2R could increase the volume. The same idea extends beyond smartwatches to other IoT-enabled devices.
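
As a rough illustration of this idea, the sketch below maps predicted labels to device actions with a lookup table. The action names and the `handle_sign` helper are hypothetical and not part of the notebook; only the labels 1R and 2R and their actions follow the example above.

```python
# Hypothetical mapping from a predicted hand-sign label to a device action.
# Only the '1R' and '2R' entries follow the example in the text.
SIGN_ACTIONS = {
    "1R": "pause_song",
    "2R": "volume_up",
}

def handle_sign(label):
    """Return the action for a predicted label, or 'ignore' if it is unmapped."""
    return SIGN_ACTIONS.get(label, "ignore")

print(handle_sign("1R"))
print(handle_sign("4L"))
```

A plain dictionary keeps the label-to-action mapping easy to change on the device without retraining the model.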

In [1]:
!wget https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/fingers.zip
--2021-09-07 07:02:06--  https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/fingers.zip
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.156.19
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.156.19|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 189639195 (181M) [application/x-zip-compressed]
Saving to: ‘fingers.zip’

fingers.zip         100%[===================>] 180.85M  89.1MB/s    in 2.0s    

2021-09-07 07:02:08 (89.1 MB/s) - ‘fingers.zip’ saved [189639195/189639195]

In [2]:
!unzip -qo fingers.zip

Import Libraries

In [3]:
import tensorflow as tf
from tensorflow.keras.models import Sequential,Model,load_model
from tensorflow.keras.layers import Dense, Dropout, Flatten, BatchNormalization, Conv2D, MaxPooling2D, Activation, GlobalMaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import regularizers

from skimage import io, transform

import os, glob
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# plot_confusion_matrix was removed in scikit-learn 1.2 and is not used below
from sklearn.metrics import confusion_matrix
import seaborn as sns

Data Preprocessing

Load all images

In [4]:
train_img_list = glob.glob("fingers/train/*.png")
test_img_list = glob.glob("fingers/test/*.png")
In [5]:
len(train_img_list),len(test_img_list)
Out[5]:
(18000, 3600)

Convert images into NumPy array

In [6]:
def import_data():
    train_img_data = []
    test_img_data = []
    train_label_data = []
    test_label_data = []
    
    for img in train_img_list:
        img_read = io.imread(img)
        img_read = transform.resize(img_read, (64,64), mode = 'constant')
        train_img_data.append(img_read)
        train_label_data.append(img[-6:-4])
    
    for img in test_img_list:
        img_read = io.imread(img)
        img_read = transform.resize(img_read, (64,64), mode = 'constant')
        test_img_data.append(img_read)
        test_label_data.append(img[-6:-4])
        
    return np.array(train_img_data), np.array(test_img_data), np.array(train_label_data), np.array(test_label_data)
In [7]:
X_train, X_test, y_train, y_test = import_data()
In [8]:
X_train.shape, X_test.shape
Out[8]:
((18000, 64, 64), (3600, 64, 64))
In [9]:
y_train.shape, y_test.shape
Out[9]:
((18000,), (3600,))
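
The labels come from the slice `img[-6:-4]`, which takes the two characters just before the `.png` extension. Below is a minimal sketch of that slicing, using a made-up file name (the real file names in the dataset may have a different prefix) and a hypothetical helper `label_from_path`:

```python
import os

def label_from_path(path):
    """Take the two characters before the 4-character '.png' extension."""
    return path[-6:-4]

# Hypothetical path; only the '_3L.png' ending matters for the slice.
example = os.path.join("fingers", "train", "example_3L.png")
label = label_from_path(example)

# First character is the finger count, second is the hand (L or R).
finger_count, hand = int(label[0]), label[1]
print(label, finger_count, hand)
```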

Display sample images

In [10]:
for i in range(4):
    io.imshow(X_train[i])
    print(y_train[i])
    plt.show()
0L
3L
5L
3R
In [11]:
# Add a channel axis so each image has the shape the model expects: (64, 64, 1)

X_train = X_train.reshape(X_train.shape[0], 64, 64, 1)
X_test = X_test.reshape(X_test.shape[0], 64, 64, 1)
In [12]:
print(X_train.shape,X_test.shape)
(18000, 64, 64, 1) (3600, 64, 64, 1)
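
The reshape adds a trailing channel axis of size 1, because the first `Conv2D` layer is declared with `input_shape=(64,64,1)`. The sketch below shows the same step on a small zero-filled array; the batch size of 5 is arbitrary, and `np.expand_dims` is shown only as an equivalent alternative.

```python
import numpy as np

# Small stand-in for the image array: 5 grayscale images of 64x64 pixels.
batch = np.zeros((5, 64, 64))

# Both forms add a trailing channel axis without changing the pixel data.
reshaped = batch.reshape(batch.shape[0], 64, 64, 1)
expanded = np.expand_dims(batch, axis=-1)

print(reshaped.shape, expanded.shape)
```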

One-hot encode output variable

In [13]:
label_to_int={
    '0R' : 0,
    '1R' : 1,
    '2R' : 2,
    '3R' : 3,
    '4R' : 4,
    '5R' : 5,
    '0L' : 6,
    '1L' : 7,
    '2L' : 8,
    '3L' : 9,
    '4L' : 10,
    '5L' : 11
}
In [14]:
names = ['0R','1R','2R','3R','4R','5R','0L','1L','2L','3L','4L','5L']
In [15]:
temp = []
for label in y_train:
    temp.append(label_to_int[label])
y_train = temp.copy()

temp = []
for label in y_test:
    temp.append(label_to_int[label])
y_test = temp.copy()
In [16]:
y_train = tf.keras.utils.to_categorical(y_train, num_classes = 12)
y_test = tf.keras.utils.to_categorical(y_test, num_classes = 12)
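
To make the encoding concrete, the sketch below reproduces the effect of `to_categorical` with plain NumPy for three labels. It is an illustration only and repeats a subset of the label mapping so that it runs on its own.

```python
import numpy as np

# Subset of the notebook's label-to-integer mapping, repeated to keep this runnable.
label_subset = {'0R': 0, '5R': 5, '0L': 6}
labels = ['0R', '0L', '5R']

indices = np.array([label_subset[l] for l in labels])

# Row i of the 12x12 identity matrix is the one-hot vector for class i.
one_hot = np.eye(12)[indices]

print(one_hot.shape)
print(one_hot.argmax(axis=1))
```

Each row contains a single 1 at the class index, so `argmax` recovers the integer label.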

Define Model and Train

In [17]:
weight_decay = 1e-4

num_classes = 12

Define Model

In [18]:
model = Sequential()

model.add(Conv2D(64, (4,4), padding='same', kernel_regularizer=regularizers.l2(weight_decay), input_shape=(64,64,1)))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Dropout(0.2))
 
model.add(Conv2D(128, (4,4), padding='same', kernel_regularizer=regularizers.l2(weight_decay)))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.3))

model.add(GlobalMaxPooling2D())

model.add(Dense(128, activation="linear"))
model.add(Activation('elu'))
model.add(Dense(num_classes, activation='softmax'))
In [19]:
model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(0.0003), metrics=['accuracy'])
 
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 64, 64, 64)        1088      
_________________________________________________________________
activation (Activation)      (None, 64, 64, 64)        0         
_________________________________________________________________
batch_normalization (BatchNo (None, 64, 64, 64)        256       
_________________________________________________________________
dropout (Dropout)            (None, 64, 64, 64)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 64, 64, 128)       131200    
_________________________________________________________________
activation_1 (Activation)    (None, 64, 64, 128)       0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 64, 64, 128)       512       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 32, 32, 128)       0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 32, 32, 128)       0         
_________________________________________________________________
global_max_pooling2d (Global (None, 128)               0         
_________________________________________________________________
dense (Dense)                (None, 128)               16512     
_________________________________________________________________
activation_2 (Activation)    (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 12)                1548      
=================================================================
Total params: 151,116
Trainable params: 150,732
Non-trainable params: 384
_________________________________________________________________
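
The parameter counts in the summary can be checked by hand. The sketch below recomputes them from the layer definitions; the helper functions are written for this illustration and are not Keras APIs.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One weight per (kernel_h, kernel_w, in_channel) for each filter, plus one bias per filter.
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    # One weight per input-unit pair, plus one bias per unit.
    return inputs * units + units

def batchnorm_params(channels):
    # gamma and beta are trainable; moving mean and moving variance are not.
    return 4 * channels

counts = [
    conv2d_params(4, 4, 1, 64),     # conv2d
    batchnorm_params(64),           # batch_normalization
    conv2d_params(4, 4, 64, 128),   # conv2d_1
    batchnorm_params(128),          # batch_normalization_1
    dense_params(128, 128),         # dense
    dense_params(128, 12),          # dense_1
]

total = sum(counts)
# Two non-trainable statistics per channel in each BatchNormalization layer.
non_trainable = 2 * 64 + 2 * 128
trainable = total - non_trainable

print(counts)
print(total, trainable, non_trainable)
```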

Train the model

In [20]:
history = model.fit(x = X_train,y = y_train, batch_size=64, validation_data = (X_test,y_test), epochs = 5)
Epoch 1/5
282/282 [==============================] - 20s 70ms/step - loss: 0.8668 - accuracy: 0.7838 - val_loss: 1.9706 - val_accuracy: 0.3142
Epoch 2/5
282/282 [==============================] - 20s 70ms/step - loss: 0.1051 - accuracy: 0.9798 - val_loss: 0.4685 - val_accuracy: 0.9317
Epoch 3/5
282/282 [==============================] - 20s 71ms/step - loss: 0.0569 - accuracy: 0.9908 - val_loss: 0.0699 - val_accuracy: 0.9950
Epoch 4/5
282/282 [==============================] - 20s 72ms/step - loss: 0.0383 - accuracy: 0.9943 - val_loss: 0.0528 - val_accuracy: 0.9942
Epoch 5/5
282/282 [==============================] - 20s 72ms/step - loss: 0.0315 - accuracy: 0.9952 - val_loss: 0.0285 - val_accuracy: 0.9994
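
The 282 steps shown per epoch follow from the training set size and batch size. A quick check of that arithmetic, using the 18000 training images and `batch_size=64` from above:

```python
import math

num_train = 18000
batch_size = 64

# The last, partial batch still counts as a step.
steps_per_epoch = math.ceil(num_train / batch_size)
last_batch_size = num_train - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch_size)
```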

Model Evaluation

Accuracy Plot

In [21]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

Loss Plot

In [22]:
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
In [23]:
model.save('model.h5')
In [24]:
model = load_model('model.h5')

Predictions

In [25]:
y_test_predict = model.predict(X_test)
In [26]:
y_test_predict = np.argmax(y_test_predict,axis=1)
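
`model.predict` returns one row of 12 softmax probabilities per image, and `np.argmax` along axis 1 picks the most likely class index. The sketch below shows that step on two made-up probability rows and maps the indices back to label names.

```python
import numpy as np

names = ['0R','1R','2R','3R','4R','5R','0L','1L','2L','3L','4L','5L']

# Made-up probabilities for two samples (not real model output).
probs = np.zeros((2, 12))
probs[0, 3], probs[0, 2] = 0.9, 0.1    # most likely class index 3
probs[1, 11], probs[1, 6] = 0.7, 0.3   # most likely class index 11

pred_idx = np.argmax(probs, axis=1)
pred_names = [names[i] for i in pred_idx]

print(pred_idx, pred_names)
```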

Display Predictions

In [27]:
fig = plt.figure(figsize=(18, 18))
for i, idx in enumerate(np.random.choice(X_test.shape[0], size=9, replace=False)):
    ax = fig.add_subplot(3, 3, i + 1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(X_test[idx]), cmap='gray')
    pred_idx = y_test_predict[idx]
    true_idx = np.argmax(y_test[idx])
    ax.set_title("{} ({})".format(names[pred_idx], names[true_idx]),color=("#4876ff" if pred_idx == true_idx else "darkred"))
In [28]:
y_test_orig = []
In [29]:
for i in y_test:
    y_test_orig.append(np.argmax(i)) 
y_test_orig = np.array(y_test_orig)

Plot Confusion Matrix

In [30]:
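# scikit-learn expects (y_true, y_pred): rows are true labels, columns are predictions
cnf = confusion_matrix(y_test_orig, y_test_predict)

# Label rows and columns with the class names instead of 1..12
df_cnf = pd.DataFrame(cnf, index=names, columns=names)
fig, ax = plt.subplots(figsize=(10,10)) 
sns.set(font_scale = 1)
sns.heatmap(df_cnf, annot=True, fmt='d', linewidths=.5, ax=ax)
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
plt.title("Confusion Matrix")
plt.show()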
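
Beyond the heatmap, a confusion matrix also gives overall accuracy and per-class recall directly. The sketch below uses a toy 3-class matrix with made-up counts (not the notebook's results) and assumes rows are true labels and columns are predicted labels, which is the layout produced by `confusion_matrix(y_true, y_pred)`.

```python
import numpy as np

# Toy confusion matrix with made-up counts: rows = true class, columns = predicted class.
cm = np.array([
    [5, 1, 0],
    [0, 4, 0],
    [1, 0, 9],
])

# Correct predictions sit on the diagonal.
accuracy = np.trace(cm) / cm.sum()

# Recall per class: correct predictions divided by the number of true samples in that class.
recall = np.diag(cm) / cm.sum(axis=1)

print(accuracy)
print(recall)
```

The same two lines can be applied to the 12x12 matrix from the notebook to see which signs, if any, are misclassified most often.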