
Download the dataset

In [1]:
!wget -N "https://cainvas-static.s3.amazonaws.com/media/user_data/dark/Human_Activity_Data.zip"
!unzip -o Human_Activity_Data.zip -d data
!rm Human_Activity_Data.zip
--2020-08-31 07:09:10--  https://cainvas-static.s3.amazonaws.com/media/user_data/dark/Human_Activity_Data.zip
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.62.48
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.62.48|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 25693584 (25M) [application/zip]
Saving to: ‘Human_Activity_Data.zip’

Human_Activity_Data 100%[===================>]  24.50M  90.6MB/s    in 0.3s    

2020-08-31 07:09:11 (90.6 MB/s) - ‘Human_Activity_Data.zip’ saved [25693584/25693584]

Archive:  Human_Activity_Data.zip
  inflating: data/test.csv           
  inflating: data/train.csv          

Introduction

In this notebook, we use a human activity recognition dataset containing acceleration and angular velocity measurements in all three spatial dimensions (X, Y, Z) to train a model that predicts which of six activities a subject was performing.

To start with, let's do some exploratory analysis to understand the measurements and how they relate to the activities.

In [2]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sb
%matplotlib inline
In [3]:
model_path = 'Human_activity_model.h5'
In [4]:
# load data
train = pd.read_csv("data/train.csv")
test = pd.read_csv("data/test.csv")
print('Train Data', train.shape,'\n', train.columns)
print('\nTest Data', test.shape)
Train Data (7352, 563) 
 Index(['tBodyAcc-mean()-X', 'tBodyAcc-mean()-Y', 'tBodyAcc-mean()-Z',
       'tBodyAcc-std()-X', 'tBodyAcc-std()-Y', 'tBodyAcc-std()-Z',
       'tBodyAcc-mad()-X', 'tBodyAcc-mad()-Y', 'tBodyAcc-mad()-Z',
       'tBodyAcc-max()-X',
       ...
       'fBodyBodyGyroJerkMag-kurtosis()', 'angle(tBodyAccMean,gravity)',
       'angle(tBodyAccJerkMean),gravityMean)',
       'angle(tBodyGyroMean,gravityMean)',
       'angle(tBodyGyroJerkMean,gravityMean)', 'angle(X,gravityMean)',
       'angle(Y,gravityMean)', 'angle(Z,gravityMean)', 'subject', 'Activity'],
      dtype='object', length=563)

Test Data (2947, 563)

Exploratory Analysis

The training data has 7352 observations of 563 variables. The first few columns hold the mean and standard deviation of body acceleration along the three spatial axes (X, Y, Z). The last two columns, "subject" and "Activity", identify the subject the observation was taken from and the activity being performed. Let's see which activities are recorded in this data.

In [5]:
print('Train labels', train['Activity'].unique(), '\nTest Labels', test['Activity'].unique())
Train labels ['STANDING' 'SITTING' 'LAYING' 'WALKING' 'WALKING_DOWNSTAIRS'
 'WALKING_UPSTAIRS'] 
Test Labels ['STANDING' 'SITTING' 'LAYING' 'WALKING' 'WALKING_DOWNSTAIRS'
 'WALKING_UPSTAIRS']
In [6]:
labels = ['STANDING', 'SITTING', 'LAYING', 'WALKING', 'WALKING_DOWNSTAIRS', 'WALKING_UPSTAIRS']

We have 6 activities: 3 passive (laying, standing and sitting) and 3 active (walking, walking_downstairs, walking_upstairs), all of which involve walking. Each observation in the dataset represents one of the six activities and is described by 561 feature variables. Our goal is to train a model that predicts the activity from these 561 features.

Let's check how many observations are recorded for each subject.

In [7]:
pd.crosstab(train.subject, train.Activity)
Out[7]:
Activity  LAYING  SITTING  STANDING  WALKING  WALKING_DOWNSTAIRS  WALKING_UPSTAIRS
subject
1             50       47        53       95                  49                53
3             62       52        61       58                  49                59
5             52       44        56       56                  47                47
6             57       55        57       57                  48                51
7             52       48        53       57                  47                51
8             54       46        54       48                  38                41
11            57       53        47       59                  46                54
14            51       54        60       59                  45                54
15            72       59        53       54                  42                48
16            70       69        78       51                  47                51
17            71       64        78       61                  46                48
19            83       73        73       52                  39                40
21            90       85        89       52                  45                47
22            72       62        63       46                  36                42
23            72       68        68       59                  54                51
25            73       65        74       74                  58                65
26            76       78        74       59                  50                55
27            74       70        80       57                  44                51
28            80       72        79       54                  46                51
29            69       60        65       53                  48                49
30            70       62        59       65                  62                65
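
The per-subject table can also be totalled to get a rough picture of class balance. The sketch below copies the counts from the table above into a plain dictionary and sums them per activity. The names `counts`, `activities` and `per_activity` are illustrative and not part of the notebook.

```python
# Column order of the crosstab above.
activities = ["LAYING", "SITTING", "STANDING", "WALKING",
              "WALKING_DOWNSTAIRS", "WALKING_UPSTAIRS"]

# subject id -> observations per activity, copied from the table above.
counts = {
    1: [50, 47, 53, 95, 49, 53],   3: [62, 52, 61, 58, 49, 59],
    5: [52, 44, 56, 56, 47, 47],   6: [57, 55, 57, 57, 48, 51],
    7: [52, 48, 53, 57, 47, 51],   8: [54, 46, 54, 48, 38, 41],
    11: [57, 53, 47, 59, 46, 54],  14: [51, 54, 60, 59, 45, 54],
    15: [72, 59, 53, 54, 42, 48],  16: [70, 69, 78, 51, 47, 51],
    17: [71, 64, 78, 61, 46, 48],  19: [83, 73, 73, 52, 39, 40],
    21: [90, 85, 89, 52, 45, 47],  22: [72, 62, 63, 46, 36, 42],
    23: [72, 68, 68, 59, 54, 51],  25: [73, 65, 74, 74, 58, 65],
    26: [76, 78, 74, 59, 50, 55],  27: [74, 70, 80, 57, 44, 51],
    28: [80, 72, 79, 54, 46, 51],  29: [69, 60, 65, 53, 48, 49],
    30: [70, 62, 59, 65, 62, 65],
}

# Sum each column over all subjects.
per_activity = {
    name: sum(row[i] for row in counts.values())
    for i, name in enumerate(activities)
}
total = sum(per_activity.values())

print(len(counts), "subjects,", total, "observations")
for name, n in per_activity.items():
    print(f"{name:20s} {n:5d}  ({n / total:.1%})")
```

The grand total should equal the 7352 rows reported for the train set, which is a quick consistency check on the table.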

Training

In this section, we train a model in TensorFlow on the train set and predict activities on the test set. The test set also serves as the validation data that drives early stopping and checkpointing during training.

In [8]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
In [9]:
# separate the 561 feature columns from the labels (last column, 'Activity')
num_labels = 6
train_x = np.asarray(train.iloc[:,:-2])
train_y = np.asarray(train.iloc[:,562])
act = np.unique(train_y)  # sorted activity names; position = class index
for i in np.arange(num_labels):
    np.put(train_y, np.where(train_y==act[i]), i)  # replace each name with its index
train_y = np.eye(num_labels)[train_y.astype('int')] # one-hot encoding

test_x = np.asarray(test.iloc[:,:-2])
test_y = np.asarray(test.iloc[:,562])
for i in np.arange(num_labels):
    np.put(test_y, np.where(test_y==act[i]), i)
test_y = np.eye(num_labels)[test_y.astype('int')]

# shuffle the data; re-seeding before each call keeps features and labels aligned
seed = 456
np.random.seed(seed)
np.random.shuffle(train_x)
np.random.seed(seed)
np.random.shuffle(train_y)
np.random.seed(seed)
np.random.shuffle(test_x)
np.random.seed(seed)
np.random.shuffle(test_y)
In [10]:
train_x.shape, train_y.shape
Out[10]:
((7352, 561), (7352, 6))
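
The label encoding and shuffling above can also be written more compactly. The following is a minimal sketch on made-up toy arrays, not the notebook's method: it assumes `np.unique(..., return_inverse=True)` for the name-to-index mapping and a single shared permutation instead of re-seeding before each shuffle. The names `names`, `features` and `perm` are hypothetical.

```python
import numpy as np

# Toy stand-ins for the label column and feature matrix (illustrative values only).
names = np.array(["WALKING", "LAYING", "SITTING", "LAYING"])
features = np.arange(8, dtype=float).reshape(4, 2)

# np.unique returns the sorted class names; return_inverse gives, for each row,
# the index of its class in that sorted list.
classes, idx = np.unique(names, return_inverse=True)
one_hot = np.eye(len(classes))[idx]

# A single permutation indexes both arrays, so every feature row
# stays paired with its own label.
rng = np.random.default_rng(456)
perm = rng.permutation(len(features))
features_shuffled = features[perm]
one_hot_shuffled = one_hot[perm]

print(classes)
print(one_hot_shuffled)
```

Note that the class index is the position of the activity name in the sorted list, so any list used to turn indices back into names has to follow that same sorted order.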

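We use a small fully connected network: one hidden dense layer with 30 ReLU units, followed by a dense output layer with a softmax activation over the six activities.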

In [11]:
model = Sequential()

model.add(Dense(30, activation='relu', input_shape=(train_x.shape[1],)))
model.add(Dense(train_y.shape[1], activation='softmax'))

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
In [12]:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 30)                16860     
_________________________________________________________________
dense_1 (Dense)              (None, 6)                 186       
=================================================================
Total params: 17,046
Trainable params: 17,046
Non-trainable params: 0
_________________________________________________________________
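
To make the summary concrete, here is a rough NumPy sketch of what this two-layer network computes and where the parameter counts come from. The weights are random stand-ins rather than the trained values, and `forward` is a hypothetical helper, not a Keras function.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 561, 30, 6

# Randomly initialised stand-ins for the trained weights (illustrative only).
W1, b1 = rng.normal(size=(n_features, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_classes)), np.zeros(n_classes)

def forward(x):
    """Dense(30, relu) followed by Dense(6, softmax) on a batch of rows."""
    h = np.maximum(x @ W1 + b1, 0.0)          # hidden layer with ReLU
    z = h @ W2 + b2                           # output layer before softmax
    z = z - z.max(axis=1, keepdims=True)      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)   # each row sums to 1

probs = forward(rng.normal(size=(4, n_features)))

# Each dense layer has one weight per input-output pair plus one bias per unit.
n_params = [W1.size + b1.size, W2.size + b2.size]
print(probs.shape, n_params, sum(n_params))
```

The two counts, 561 * 30 + 30 and 30 * 6 + 6, match the 16,860 and 186 parameters listed in the summary.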
In [13]:
EPOCHS = 25

callbacks = [
    # stop once val_loss has not improved for 10 consecutive epochs
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=0, mode='min'),
    # keep only the weights with the lowest val_loss seen so far
    tf.keras.callbacks.ModelCheckpoint(model_path, monitor='val_loss', save_best_only=True, mode='min', verbose=0),
]

history = model.fit(train_x,
                    train_y,
                    epochs=EPOCHS,
                    validation_data=(test_x, test_y),
                    callbacks=callbacks)
Epoch 1/25
230/230 [==============================] - 0s 2ms/step - loss: 0.5720 - accuracy: 0.8017 - val_loss: 0.3189 - val_accuracy: 0.8880
Epoch 2/25
230/230 [==============================] - 0s 1ms/step - loss: 0.2234 - accuracy: 0.9266 - val_loss: 0.2748 - val_accuracy: 0.8989
Epoch 3/25
230/230 [==============================] - 0s 1ms/step - loss: 0.1580 - accuracy: 0.9464 - val_loss: 0.1858 - val_accuracy: 0.9376
Epoch 4/25
230/230 [==============================] - 0s 1ms/step - loss: 0.1192 - accuracy: 0.9604 - val_loss: 0.2058 - val_accuracy: 0.9097
Epoch 5/25
230/230 [==============================] - 0s 1ms/step - loss: 0.1051 - accuracy: 0.9649 - val_loss: 0.1663 - val_accuracy: 0.9365
Epoch 6/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0929 - accuracy: 0.9667 - val_loss: 0.1667 - val_accuracy: 0.9382
Epoch 7/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0785 - accuracy: 0.9718 - val_loss: 0.1511 - val_accuracy: 0.9430
Epoch 8/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0738 - accuracy: 0.9739 - val_loss: 0.1596 - val_accuracy: 0.9369
Epoch 9/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0685 - accuracy: 0.9789 - val_loss: 0.1427 - val_accuracy: 0.9460
Epoch 10/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0641 - accuracy: 0.9785 - val_loss: 0.1469 - val_accuracy: 0.9454
Epoch 11/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0634 - accuracy: 0.9776 - val_loss: 0.1778 - val_accuracy: 0.9304
Epoch 12/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0588 - accuracy: 0.9795 - val_loss: 0.2339 - val_accuracy: 0.9125
Epoch 13/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0577 - accuracy: 0.9795 - val_loss: 0.1646 - val_accuracy: 0.9440
Epoch 14/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0530 - accuracy: 0.9811 - val_loss: 0.1729 - val_accuracy: 0.9365
Epoch 15/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0511 - accuracy: 0.9819 - val_loss: 0.1750 - val_accuracy: 0.9372
Epoch 16/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0485 - accuracy: 0.9815 - val_loss: 0.1933 - val_accuracy: 0.9325
Epoch 17/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0524 - accuracy: 0.9807 - val_loss: 0.1518 - val_accuracy: 0.9413
Epoch 18/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0543 - accuracy: 0.9792 - val_loss: 0.1582 - val_accuracy: 0.9423
Epoch 19/25
230/230 [==============================] - 0s 1ms/step - loss: 0.0506 - accuracy: 0.9796 - val_loss: 0.1620 - val_accuracy: 0.9427
In [14]:
model.evaluate(
    test_x, test_y
)
93/93 [==============================] - 0s 705us/step - loss: 0.1620 - accuracy: 0.9427
Out[14]:
[0.1619744598865509, 0.9426535367965698]
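
Training was configured for 25 epochs but the log ends at epoch 19. The sketch below replays the patience rule on the `val_loss` values from the log to show why. It is a simplified re-implementation for illustration, not Keras code, and `replay_early_stopping` is a hypothetical helper.

```python
# val_loss per epoch, copied from the training log above.
val_losses = [0.3189, 0.2748, 0.1858, 0.2058, 0.1663, 0.1667, 0.1511,
              0.1596, 0.1427, 0.1469, 0.1778, 0.2339, 0.1646, 0.1729,
              0.1750, 0.1933, 0.1518, 0.1582, 0.1620]

def replay_early_stopping(losses, patience=10, min_delta=0.0):
    """Return (stop_epoch, best_epoch, best_loss); stop_epoch is None if never triggered."""
    best, best_epoch, wait = float("inf"), None, 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            # improvement: remember it and reset the patience counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch, best
    return None, best_epoch, best

stop_epoch, best_epoch, best_loss = replay_early_stopping(val_losses)
print(stop_epoch, best_epoch, best_loss)
```

The best validation loss occurs at epoch 9, and ten epochs without improvement follow it, so training stops at epoch 19. Because `EarlyStopping` is used here without `restore_best_weights=True`, the in-memory model evaluated in cell 14 carries the epoch-19 weights (loss 0.1620), while the file at `model_path` holds the epoch-9 checkpoint; `tf.keras.models.load_model(model_path)` would load the latter.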

Loss and Accuracy graph of the model

In [15]:
import matplotlib.pyplot as plt

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
In [16]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

Make predictions

In [17]:
y_preds = model.predict(test_x)
In [18]:
y_preds.shape , test_y.shape
Out[18]:
((2947, 6), (2947, 6))
In [19]:
y_preds_argmax = np.argmax(y_preds, axis=1)
y_test = np.argmax(test_y, axis=1)
In [20]:
y_preds_argmax.shape, y_test.shape
Out[20]:
((2947,), (2947,))
In [21]:
for a, b in zip(y_preds_argmax[:15], y_test[:15]):
    # class indices follow the sorted order in `act`, not the order of `labels`
    print("Predicted: {}, True: {}".format(act[a], act[b]))
Predicted: WALKING, True: WALKING
Predicted: WALKING_DOWNSTAIRS, True: WALKING_DOWNSTAIRS
Predicted: WALKING_DOWNSTAIRS, True: WALKING_DOWNSTAIRS
Predicted: WALKING_DOWNSTAIRS, True: WALKING_DOWNSTAIRS
Predicted: STANDING, True: STANDING
Predicted: SITTING, True: SITTING
Predicted: WALKING, True: WALKING_UPSTAIRS
Predicted: WALKING_DOWNSTAIRS, True: WALKING_DOWNSTAIRS
Predicted: SITTING, True: SITTING
Predicted: STANDING, True: STANDING
Predicted: WALKING, True: WALKING
Predicted: LAYING, True: LAYING
Predicted: WALKING, True: WALKING
Predicted: LAYING, True: LAYING
Predicted: WALKING, True: WALKING
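
Beyond spot-checking a few rows, the integer class indices can be summarised in a confusion matrix. This is a minimal NumPy sketch on made-up toy labels, not results from the model above; `y_true_toy` and `y_pred_toy` are hypothetical stand-ins for `y_test` and `y_preds_argmax`.

```python
import numpy as np

# Toy class indices (illustrative only).
y_true_toy = np.array([0, 1, 2, 2, 1, 0])
y_pred_toy = np.array([0, 2, 2, 2, 1, 0])
num_classes = 3

# Rows are true classes, columns are predicted classes.
confusion = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(confusion, (y_true_toy, y_pred_toy), 1)

# Correct predictions sit on the diagonal.
accuracy = np.trace(confusion) / confusion.sum()
print(confusion)
print(accuracy)
```

Off-diagonal cells show which activities get mistaken for which, something a single accuracy figure hides.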

Compile using deepC

In [22]:
!deepCC Human_activity_model.h5
reading [keras model] from 'Human_activity_model.h5'
Saved 'Human_activity_model.onnx'
reading onnx model from file  Human_activity_model.onnx
Model info:
  ir_vesion :  3 
  doc       : 
WARN (ONNX): terminal (input/output) dense_input's shape is less than 1.
             changing it to 1.
WARN (ONNX): terminal (input/output) dense_1's shape is less than 1.
             changing it to 1.
WARN (GRAPH): found operator node with the same name (dense_1) as io node.
running DNNC graph sanity check ... passed.
Writing C++ file  Human_activity_model_deepC/Human_activity_model.cpp
INFO (ONNX): model files are ready in dir Human_activity_model_deepC
g++ -O3 -I. -I/opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/include -isystem /opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/packages/eigen-eigen-323c052e1731 Human_activity_model_deepC/Human_activity_model.cpp -o Human_activity_model_deepC/Human_activity_model.exe
Model executable  Human_activity_model_deepC/Human_activity_model.exe

Single sample prediction

In [23]:
sample_data = test_x[103]
np.savetxt('sample.data', sample_data.flatten())
In [24]:
!./Human_activity_model_deepC/Human_activity_model.exe sample.data
reading file sample.data.
writing file dense_1.out.
In [25]:
nn_out = np.loadtxt('dense_1.out')

print("True label: ", y_test[103])
print("Model Prediction: ", y_preds_argmax[103])
print("DeepC compiled model prediction: ", np.argmax(nn_out))
True label:  5
Model Prediction:  5
DeepC compiled model prediction:  5
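
The notebook and the compiled executable exchange data through plain-text files written and read with NumPy. The sketch below shows that round trip in isolation, using a temporary directory and made-up vectors; it does not call the deepC executable, and `sample` and `scores` are hypothetical values.

```python
import os
import tempfile

import numpy as np

# Made-up input vector and output scores (illustrative only).
sample = np.array([0.25, -1.5, 3.0])
scores = np.array([0.01, 0.02, 0.90, 0.07])

with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "sample.data")
    out_path = os.path.join(tmp, "scores.out")

    # savetxt writes one value per line for a 1-D array.
    np.savetxt(in_path, sample.flatten())
    restored = np.loadtxt(in_path)

    # Stand-in for the file the executable would write.
    np.savetxt(out_path, scores)
    predicted = int(np.argmax(np.loadtxt(out_path)))

print(restored, predicted)
```

Comparing `np.argmax` of the loaded scores with the Keras prediction, as in the cell above, checks that both agree on the winning class for that sample.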