Human Activity Recognition
Credits: Pradeep Babburi, AITS Cainvas Community
Dataset: Human Activity Recognition with Smartphones
Photo by Irfan Ahmed Khan on Hackernoon
Download the dataset
!wget -N "https://cainvas-static.s3.amazonaws.com/media/user_data/dark/Human_Activity_Data.zip"
!unzip -o Human_Activity_Data.zip -d data
!rm Human_Activity_Data.zip
Introduction
This notebook uses a human activity recognition dataset containing records of acceleration and angular velocity measurements along all three spatial axes (X, Y, Z). We train a model on these records to predict which of six activities was being performed.
To start, let's do some exploratory analysis to understand the measurements and how they relate to the activities.
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sb
%matplotlib inline
model_path = 'Human_activity_model.h5'
# load data
train = pd.read_csv("data/train.csv")
test = pd.read_csv("data/test.csv")
print('Train Data', train.shape,'\n', train.columns)
print('\nTest Data', test.shape)
Exploratory Analysis
The training data has 7352 observations of 563 variables. The first few columns hold the means and standard deviations of body acceleration along the three spatial axes (X, Y, Z). The last two columns, "subject" and "Activity", identify the subject the observation was taken from and the corresponding activity. Let's see which activities are recorded in this data.
print('Train labels', train['Activity'].unique(), '\nTest Labels', test['Activity'].unique())
# Alphabetical order, so that labels[i] matches the integer code i that np.unique assigns in the Training section
labels = ['LAYING', 'SITTING', 'STANDING', 'WALKING', 'WALKING_DOWNSTAIRS', 'WALKING_UPSTAIRS']
We have 6 activities: 3 passive (laying, standing and sitting) and 3 active (walking, walking_downstairs, walking_upstairs), all of which involve walking. Each observation in the dataset represents one of the six activities, described by 561 feature variables. Our goal is to train a model that predicts the activity from these 561 features.
Let's check how many observations are recorded for each subject.
pd.crosstab(train.subject, train.Activity)
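The cross-tabulation can also be summed along either axis to get totals per activity (class balance) or per subject. The sketch below uses a small made-up frame, `toy`, as a stand-in for `train`; only the column names `subject` and `Activity` come from the notebook.

```python
import pandas as pd

# Toy stand-in for the training frame; the values are made up for illustration.
toy = pd.DataFrame({
    "subject":  [1, 1, 1, 3, 3, 5],
    "Activity": ["WALKING", "SITTING", "WALKING", "LAYING", "WALKING", "SITTING"],
})

# Rows are subjects, columns are activities, cells are observation counts.
counts = pd.crosstab(toy["subject"], toy["Activity"])

# Column sums: number of observations per activity (class balance).
per_activity = counts.sum(axis=0)

# Row sums: number of observations per subject.
per_subject = counts.sum(axis=1)

print(counts)
print(per_activity.to_dict())
print(per_subject.to_dict())
```

On the real data, a strongly uneven `per_activity` would suggest looking at per-class metrics rather than accuracy alone.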
Training
In this section, we are going to train a model in TensorFlow using the train set and predict the activity using the test set.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# load train and test data
num_labels = 6
train_x = np.asarray(train.iloc[:,:-2])
train_y = np.asarray(train.iloc[:,562])
act = np.unique(train_y)
for i in np.arange(num_labels):
    np.put(train_y, np.where(train_y==act[i]), i)
train_y = np.eye(num_labels)[train_y.astype('int')] # one-hot encoding
test_x = np.asarray(test.iloc[:,:-2])
test_y = np.asarray(test.iloc[:,562])
for i in np.arange(num_labels):
    np.put(test_y, np.where(test_y==act[i]), i)
test_y = np.eye(num_labels)[test_y.astype('int')]
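# shuffle the data; re-seeding before each call applies the same permutation
# to arrays of equal length, so features and labels stay aligned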
seed = 456
np.random.seed(seed)
np.random.shuffle(train_x)
np.random.seed(seed)
np.random.shuffle(train_y)
np.random.seed(seed)
np.random.shuffle(test_x)
np.random.seed(seed)
np.random.shuffle(test_y)
train_x.shape, train_y.shape
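The label preparation above boils down to three steps: map class names to integer codes, one-hot encode the codes, and shuffle features and labels with the same permutation. Here is a minimal sketch of those steps on made-up data. The names `raw_y`, `x`, `classes` and `perm` are illustrative, and using `return_inverse` with a single permutation index is an alternative formulation, not the notebook's own code.

```python
import numpy as np

# Toy labels and features; made up for illustration.
raw_y = np.array(["SITTING", "LAYING", "WALKING", "LAYING"])
x = np.arange(8, dtype=float).reshape(4, 2)

# np.unique returns the sorted class names and, with return_inverse=True,
# the integer index of each label within that sorted list.
classes, y_idx = np.unique(raw_y, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i.
y_onehot = np.eye(len(classes))[y_idx]

# One permutation applied to both arrays keeps rows aligned.
rng = np.random.default_rng(456)
perm = rng.permutation(len(x))
x_shuf, y_shuf = x[perm], y_onehot[perm]

print(classes)   # sorted class names; position = integer code
print(y_idx)     # integer code of each original label
print(y_onehot)
```

Because `np.unique` sorts the names, any list used later to turn indices back into names has to follow the same alphabetical order.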
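We use two dense layers: a hidden layer of 30 units with ReLU activation, followed by an output layer with one unit per activity and a softmax activation.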
model = Sequential()
model.add(Dense(30, activation='relu', input_shape=(train_x.shape[1],)))
model.add(Dense(train_y.shape[1], activation='softmax'))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
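As a rough cross-check on the summary, the parameter count of each dense layer can be worked out by hand: weights (inputs times units) plus one bias per unit. The sketch below assumes 561 input features and 6 output classes, as stated in the text. `dense_params` is a helper introduced here for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

n_features = 561                        # feature columns, as stated in the text
hidden = dense_params(n_features, 30)   # 561*30 + 30
output = dense_params(30, 6)            # 30*6 + 6
total = hidden + output

print(hidden, output, total)  # 16860 186 17046
```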
EPOCHS = 25
history = model.fit(
    train_x,
    train_y,
    epochs=EPOCHS,
    validation_data=(test_x, test_y),
    callbacks=[
        tf.keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=0, mode='min'),
        tf.keras.callbacks.ModelCheckpoint(model_path, monitor='val_loss', save_best_only=True, mode='min', verbose=0),
    ],
)
model.evaluate(test_x, test_y)
Loss and Accuracy graph of the model
import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.show()
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
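Besides reading the curves by eye, the epoch with the lowest validation loss can be located directly. That is the epoch the `ModelCheckpoint` callback would have saved, since it monitors `val_loss` with `save_best_only=True`. The sketch below uses a hypothetical list of losses in place of `history.history['val_loss']`.

```python
# Hypothetical per-epoch validation losses, standing in for
# history.history['val_loss']; the numbers are made up.
val_loss = [0.62, 0.41, 0.35, 0.37, 0.33, 0.36]

# Index of the smallest loss (0-based), reported as a 1-based epoch number.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Epochs elapsed since the best one; EarlyStopping compares this to `patience`.
epochs_since_best = len(val_loss) - 1 - best_epoch

print(f"lowest val_loss {val_loss[best_epoch]:.2f} at epoch {best_epoch + 1}")
print(f"epochs since best: {epochs_since_best}")
```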
Make predictions
y_preds = model.predict(test_x)
y_preds.shape , test_y.shape
y_preds_argmax = np.argmax(y_preds, axis=1)
y_test = np.argmax(test_y, axis=1)
y_preds_argmax.shape, y_test.shape
for a, b in zip(y_preds_argmax[:15], y_test[:15]):
    print("Predicted: {}, True: {}".format(labels[a], labels[b]))
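Printing a handful of predictions gives only a spot check. A confusion matrix summarizes every sample at once. The following is a minimal NumPy sketch on made-up class indices. In the notebook, `y_test` and `y_preds_argmax` would take the place of `true_idx` and `pred_idx`, with 6 classes instead of 3.

```python
import numpy as np

# Made-up class indices for illustration.
true_idx = np.array([0, 0, 1, 1, 2, 2, 2])
pred_idx = np.array([0, 1, 1, 1, 2, 2, 0])
n_classes = 3

# cm[i, j] counts samples whose true class is i and predicted class is j.
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (true_idx, pred_idx), 1)

# Recall per class: correct predictions divided by the row total.
recall = cm.diagonal() / cm.sum(axis=1)

# Overall accuracy: correct predictions divided by all samples.
accuracy = cm.trace() / cm.sum()

print(cm)
print(recall)
print(accuracy)
```

Off-diagonal cells show which pairs of activities the model confuses with each other.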
Compile using deepC
!deepCC Human_activity_model.h5
Single sample prediction
sample_data = test_x[103]
np.savetxt('sample.data', sample_data.flatten())
!./Human_activity_model_deepC/Human_activity_model.exe sample.data
nn_out = np.loadtxt('dense_1.out')
print("True label: ", y_test[103])
print("Model Prediction: ", y_preds_argmax[103])
print("DeepC compiled model prediction: ", np.argmax(nn_out))
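Comparing only the argmax shows whether the two models agree on the class. Comparing the full output vectors also shows how closely the compiled model reproduces the original probabilities. The sketch below uses made-up vectors: `keras_out` stands in for `y_preds[103]` and `compiled_out` for the values loaded from the deepC output file. The tolerance of `1e-5` is an arbitrary assumption.

```python
import numpy as np

# Made-up probability vectors for illustration.
keras_out = np.array([0.01, 0.02, 0.90, 0.03, 0.02, 0.02])
compiled_out = np.array([0.0100001, 0.0199999, 0.9000002, 0.03, 0.02, 0.0199998])

# Do both outputs pick the same class?
same_class = np.argmax(keras_out) == np.argmax(compiled_out)

# Largest element-wise deviation between the two outputs.
max_abs_diff = np.max(np.abs(keras_out - compiled_out))

# Element-wise agreement within the chosen tolerance.
close = np.allclose(keras_out, compiled_out, atol=1e-5)

print(same_class, max_abs_diff, close)
```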