
Heart Disease Prediction using Neural Networks

Credit: AITS Cainvas Community

Photo by Diana Pasternak on Dribbble

This project focuses on predicting heart disease using neural networks. Based on attributes such as blood pressure, cholesterol level, heart rate, and other characteristics, patients will be classified according to varying degrees of coronary artery disease. The project uses a dataset of 303 patients distributed by the UCI Machine Learning Repository.

We will be using some common Python libraries, such as pandas, numpy, and matplotlib. Furthermore, for the modeling side of this project, we will be using scikit-learn and Keras.

Importing the Dataset

This dataset contains patient data concerning heart disease diagnosis, collected at several locations around the world. There are 76 attributes, including age, sex, resting blood pressure, cholesterol level, echocardiogram data, exercise habits, and many others. To date, all published studies using this data have focused on a subset of 14 attributes, so we will do the same. More specifically, we will use the data collected at the Cleveland Clinic Foundation.
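For reference, the 14 attributes are summarized in the sketch below. The descriptions follow the standard UCI Cleveland documentation; the exact value encodings in this particular CSV may differ from the original study.

# feature glossary (descriptions per the UCI Cleveland documentation;
# encodings in this CSV may differ from the original study)
FEATURES = {
    'age':      'age in years',
    'sex':      'sex (1 = male, 0 = female)',
    'cp':       'chest pain type',
    'trestbps': 'resting blood pressure (mm Hg on admission)',
    'chol':     'serum cholesterol (mg/dl)',
    'fbs':      'fasting blood sugar > 120 mg/dl (1 = true, 0 = false)',
    'restecg':  'resting electrocardiographic results',
    'thalach':  'maximum heart rate achieved',
    'exang':    'exercise-induced angina (1 = yes, 0 = no)',
    'oldpeak':  'ST depression induced by exercise relative to rest',
    'slope':    'slope of the peak exercise ST segment',
    'ca':       'number of major vessels colored by fluoroscopy',
    'thal':     'thalassemia test result',
    'target':   'diagnosis of heart disease (the label to predict)',
}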

In [1]:
!wget https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/heart.csv
--2021-07-14 04:53:55--  https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/heart.csv
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.160.47
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.160.47|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11328 (11K) [text/csv]
Saving to: ‘heart.csv.3’

heart.csv.3         100%[===================>]  11.06K  --.-KB/s    in 0s      

2021-07-14 04:53:55 (206 MB/s) - ‘heart.csv.3’ saved [11328/11328]

Importing necessary libraries

In [2]:
import sys
import pandas as pd
import numpy as np
import sklearn
import matplotlib
from tensorflow import keras

print('Python: {}'.format(sys.version))
print('Pandas: {}'.format(pd.__version__))
print('Numpy: {}'.format(np.__version__))
print('Sklearn: {}'.format(sklearn.__version__))
print('Matplotlib: {}'.format(matplotlib.__version__))
print('Keras: {}'.format(keras.__version__))
Python: 3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0]
Pandas: 1.1.4
Numpy: 1.18.5
Sklearn: 0.23.2
Matplotlib: 3.3.3
Keras: 2.4.0
In [3]:
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix
import seaborn as sns

Now we read the dataset into a pandas DataFrame.

In [4]:
# read the csv

cleveland = pd.read_csv('heart.csv')
In [5]:
# print the shape of the DataFrame, so we can see how many examples we have

print('Shape of DataFrame: {}'.format(cleveland.shape))
print(cleveland.loc[1])
Shape of DataFrame: (303, 14)
age          37.0
sex           1.0
cp            2.0
trestbps    130.0
chol        250.0
fbs           0.0
restecg       1.0
thalach     187.0
exang         0.0
oldpeak       3.5
slope         0.0
ca            0.0
thal          2.0
target        1.0
Name: 1, dtype: float64
In [6]:
# print the last twenty or so data points

cleveland.loc[280:]
Out[6]:
age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
280 42 1 0 136 315 0 1 125 1 1.8 1 0 1 0
281 52 1 0 128 204 1 1 156 1 1.0 1 0 0 0
282 59 1 2 126 218 1 1 134 0 2.2 1 1 1 0
283 40 1 0 152 223 0 1 181 0 0.0 2 0 3 0
284 61 1 0 140 207 0 0 138 1 1.9 2 1 3 0
285 46 1 0 140 311 0 1 120 1 1.8 1 2 3 0
286 59 1 3 134 204 0 1 162 0 0.8 2 2 2 0
287 57 1 1 154 232 0 0 164 0 0.0 2 1 2 0
288 57 1 0 110 335 0 1 143 1 3.0 1 1 3 0
289 55 0 0 128 205 0 2 130 1 2.0 1 1 3 0
290 61 1 0 148 203 0 1 161 0 0.0 2 1 3 0
291 58 1 0 114 318 0 2 140 0 4.4 0 3 1 0
292 58 0 0 170 225 1 0 146 1 2.8 1 2 1 0
293 67 1 2 152 212 0 0 150 0 0.8 1 0 3 0
294 44 1 0 120 169 0 1 144 1 2.8 0 0 1 0
295 63 1 0 140 187 0 0 144 1 4.0 2 2 3 0
296 63 0 0 124 197 0 1 136 1 0.0 1 0 2 0
297 59 1 0 164 176 1 0 90 0 1.0 1 2 1 0
298 57 0 0 140 241 0 1 123 1 0.2 1 0 3 0
299 45 1 3 110 264 0 1 132 0 1.2 1 0 3 0
300 68 1 0 144 193 1 1 141 0 3.4 1 2 3 0
301 57 1 0 130 131 0 1 115 1 1.2 1 1 3 0
302 57 0 1 130 236 0 0 174 0 0.0 1 1 2 0
In [7]:
# remove missing data (indicated with a "?")

data = cleveland[~cleveland.isin(['?'])]
data.loc[280:]
Out[7]:
age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
280 42 1 0 136 315 0 1 125 1 1.8 1 0 1 0
281 52 1 0 128 204 1 1 156 1 1.0 1 0 0 0
282 59 1 2 126 218 1 1 134 0 2.2 1 1 1 0
283 40 1 0 152 223 0 1 181 0 0.0 2 0 3 0
284 61 1 0 140 207 0 0 138 1 1.9 2 1 3 0
285 46 1 0 140 311 0 1 120 1 1.8 1 2 3 0
286 59 1 3 134 204 0 1 162 0 0.8 2 2 2 0
287 57 1 1 154 232 0 0 164 0 0.0 2 1 2 0
288 57 1 0 110 335 0 1 143 1 3.0 1 1 3 0
289 55 0 0 128 205 0 2 130 1 2.0 1 1 3 0
290 61 1 0 148 203 0 1 161 0 0.0 2 1 3 0
291 58 1 0 114 318 0 2 140 0 4.4 0 3 1 0
292 58 0 0 170 225 1 0 146 1 2.8 1 2 1 0
293 67 1 2 152 212 0 0 150 0 0.8 1 0 3 0
294 44 1 0 120 169 0 1 144 1 2.8 0 0 1 0
295 63 1 0 140 187 0 0 144 1 4.0 2 2 3 0
296 63 0 0 124 197 0 1 136 1 0.0 1 0 2 0
297 59 1 0 164 176 1 0 90 0 1.0 1 2 1 0
298 57 0 0 140 241 0 1 123 1 0.2 1 0 3 0
299 45 1 3 110 264 0 1 132 0 1.2 1 0 3 0
300 68 1 0 144 193 1 1 141 0 3.4 1 2 3 0
301 57 1 0 130 131 0 1 115 1 1.2 1 1 3 0
302 57 0 1 130 236 0 0 174 0 0.0 1 1 2 0
In [8]:
# drop rows with NaN values from DataFrame

data = data.dropna(axis=0)
data.loc[280:]
Out[8]:
age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
280 42 1 0 136 315 0 1 125 1 1.8 1 0 1 0
281 52 1 0 128 204 1 1 156 1 1.0 1 0 0 0
282 59 1 2 126 218 1 1 134 0 2.2 1 1 1 0
283 40 1 0 152 223 0 1 181 0 0.0 2 0 3 0
284 61 1 0 140 207 0 0 138 1 1.9 2 1 3 0
285 46 1 0 140 311 0 1 120 1 1.8 1 2 3 0
286 59 1 3 134 204 0 1 162 0 0.8 2 2 2 0
287 57 1 1 154 232 0 0 164 0 0.0 2 1 2 0
288 57 1 0 110 335 0 1 143 1 3.0 1 1 3 0
289 55 0 0 128 205 0 2 130 1 2.0 1 1 3 0
290 61 1 0 148 203 0 1 161 0 0.0 2 1 3 0
291 58 1 0 114 318 0 2 140 0 4.4 0 3 1 0
292 58 0 0 170 225 1 0 146 1 2.8 1 2 1 0
293 67 1 2 152 212 0 0 150 0 0.8 1 0 3 0
294 44 1 0 120 169 0 1 144 1 2.8 0 0 1 0
295 63 1 0 140 187 0 0 144 1 4.0 2 2 3 0
296 63 0 0 124 197 0 1 136 1 0.0 1 0 2 0
297 59 1 0 164 176 1 0 90 0 1.0 1 2 1 0
298 57 0 0 140 241 0 1 123 1 0.2 1 0 3 0
299 45 1 3 110 264 0 1 132 0 1.2 1 0 3 0
300 68 1 0 144 193 1 1 141 0 3.4 1 2 3 0
301 57 1 0 130 131 0 1 115 1 1.2 1 1 3 0
302 57 0 1 130 236 0 0 174 0 0.0 1 1 2 0
In [9]:
# print the shape and data type of the dataframe

print(data.shape)
print(data.dtypes)
(303, 14)
age           int64
sex           int64
cp            int64
trestbps      int64
chol          int64
fbs           int64
restecg       int64
thalach       int64
exang         int64
oldpeak     float64
slope         int64
ca            int64
thal          int64
target        int64
dtype: object
In [10]:
# transform data to numeric to enable further analysis

data = data.apply(pd.to_numeric)
data.dtypes
Out[10]:
age           int64
sex           int64
cp            int64
trestbps      int64
chol          int64
fbs           int64
restecg       int64
thalach       int64
exang         int64
oldpeak     float64
slope         int64
ca            int64
thal          int64
target        int64
dtype: object
In [11]:
# print data characteristics, using pandas' built-in describe() function

data.describe()
Out[11]:
age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
count 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000 303.000000
mean 54.366337 0.683168 0.966997 131.623762 246.264026 0.148515 0.528053 149.646865 0.326733 1.039604 1.399340 0.729373 2.313531 0.544554
std 9.082101 0.466011 1.032052 17.538143 51.830751 0.356198 0.525860 22.905161 0.469794 1.161075 0.616226 1.022606 0.612277 0.498835
min 29.000000 0.000000 0.000000 94.000000 126.000000 0.000000 0.000000 71.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25% 47.500000 0.000000 0.000000 120.000000 211.000000 0.000000 0.000000 133.500000 0.000000 0.000000 1.000000 0.000000 2.000000 0.000000
50% 55.000000 1.000000 1.000000 130.000000 240.000000 0.000000 1.000000 153.000000 0.000000 0.800000 1.000000 0.000000 2.000000 1.000000
75% 61.000000 1.000000 2.000000 140.000000 274.500000 0.000000 1.000000 166.000000 1.000000 1.600000 2.000000 1.000000 3.000000 1.000000
max 77.000000 1.000000 3.000000 200.000000 564.000000 1.000000 2.000000 202.000000 1.000000 6.200000 2.000000 4.000000 3.000000 1.000000
In [12]:
# plot histograms for each variable

data.hist(figsize = (12, 12))
plt.show()
In [13]:
pd.crosstab(data.age,data.target).plot(kind="bar",figsize=(20,6))
plt.title('Heart Disease Frequency for Ages')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.show()
In [14]:
plt.figure(figsize=(10,10))
sns.heatmap(data.corr(),annot=True,fmt='.1f')
plt.show()
In [15]:
# mean maximum heart rate (thalach) achieved at each age
age_unique = sorted(data.age.unique())
mean_thalach = data.groupby('age')['thalach'].mean().values

plt.figure(figsize=(10,5))
sns.pointplot(x=age_unique,y=mean_thalach,color='red',alpha=0.8)
plt.xlabel('Age',fontsize = 15,color='blue')
plt.xticks(rotation=45)
plt.ylabel('Thalach',fontsize = 15,color='blue')
plt.title('Age vs Thalach',fontsize = 15,color='blue')
plt.grid()
plt.show()

Create Training and Testing Datasets

Now that we have preprocessed the data appropriately, we can split it into training and testing datasets. We will use scikit-learn's train_test_split() function to generate a training dataset (80 percent of the total data) and a testing dataset (20 percent).

In [16]:
X = np.array(data.drop(['target'], axis=1))
y = np.array(data['target'])
In [17]:
X[0]
Out[17]:
array([ 63. ,   1. ,   3. , 145. , 233. ,   1. ,   0. , 150. ,   0. ,
         2.3,   0. ,   0. ,   1. ])
In [18]:
# standardize each feature to zero mean and unit variance
mean = X.mean(axis=0)
X -= mean
std = X.std(axis=0)
X /= std
In [19]:
X[0]
Out[19]:
array([ 0.9521966 ,  0.68100522,  1.97312292,  0.76395577, -0.25633371,
        2.394438  , -1.00583187,  0.01544279, -0.69663055,  1.08733806,
       -2.27457861, -0.71442887, -2.14887271])
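As a quick sanity check, every standardized feature column should now have mean approximately 0 and standard deviation approximately 1. A minimal sketch, run after the cell above:

# verify the standardization: per-column mean ~0 and std ~1
print(np.allclose(X.mean(axis=0), 0))  # expect True
print(np.allclose(X.std(axis=0), 1))   # expect True

Note that the mean and std here are computed over the full dataset before the train/test split; a stricter pipeline would fit these statistics on the training split only, to avoid leaking test-set statistics.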
In [20]:
# split the data into training and testing sets

from sklearn import model_selection

X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, stratify=y, random_state=42, test_size = 0.2)
In [21]:
# convert the data to categorical labels

from tensorflow.keras.utils import to_categorical

Y_train = to_categorical(y_train, num_classes=None)
Y_test = to_categorical(y_test, num_classes=None)
print (Y_train.shape)
print (Y_train[:10])
(242, 2)
[[0. 1.]
 [1. 0.]
 [1. 0.]
 [1. 0.]
 [0. 1.]
 [0. 1.]
 [0. 1.]
 [0. 1.]
 [1. 0.]
 [0. 1.]]
In [22]:
X_train[0]
Out[22]:
array([ 1.61392956, -1.46841752,  1.97312292,  0.47839125, -0.14038081,
       -0.41763453,  0.89896224,  0.05917329, -0.69663055,  0.65599028,
        0.97635214,  1.24459328, -0.51292188])

Building and Training the Neural Network

Now that we have our data fully processed and split into training and testing datasets, we can begin building a neural network to solve this classification problem. Using Keras, we will define a simple fully connected network with two hidden layers. Since this is a categorical classification problem, we will use a softmax activation function in the final layer of our network and a categorical_crossentropy loss during the training phase.

In [23]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import regularizers

# define a function to build the keras model

def create_model():
    
    # create model
    
    model = Sequential()
    model.add(Dense(16, input_dim=13, kernel_initializer='normal', kernel_regularizer=regularizers.l2(0.001), activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(8, kernel_initializer='normal', kernel_regularizer=regularizers.l2(0.001), activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(2, activation='softmax'))
    
    # compile model with the RMSprop optimizer (matching the training log below)
    
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model

model = create_model()

print(model.summary())
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 16)                224       
_________________________________________________________________
dropout (Dropout)            (None, 16)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 8)                 136       
_________________________________________________________________
dropout_1 (Dropout)          (None, 8)                 0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 18        
=================================================================
Total params: 378
Trainable params: 378
Non-trainable params: 0
_________________________________________________________________
None
In [24]:
# fit the model to the training data

history=model.fit(X_train, Y_train, validation_data=(X_test, Y_test),epochs=50, batch_size=10)
Epoch 1/50
25/25 [==============================] - 0s 8ms/step - loss: 0.6823 - accuracy: 0.6570 - val_loss: 0.6658 - val_accuracy: 0.7705
Epoch 2/50
25/25 [==============================] - 0s 3ms/step - loss: 0.6450 - accuracy: 0.7521 - val_loss: 0.6306 - val_accuracy: 0.7705
Epoch 3/50
25/25 [==============================] - 0s 3ms/step - loss: 0.6055 - accuracy: 0.7934 - val_loss: 0.5843 - val_accuracy: 0.7869
Epoch 4/50
25/25 [==============================] - 0s 2ms/step - loss: 0.5481 - accuracy: 0.8140 - val_loss: 0.5379 - val_accuracy: 0.7869
Epoch 5/50
25/25 [==============================] - 0s 3ms/step - loss: 0.4947 - accuracy: 0.8140 - val_loss: 0.4990 - val_accuracy: 0.8033
Epoch 6/50
25/25 [==============================] - 0s 3ms/step - loss: 0.4626 - accuracy: 0.8388 - val_loss: 0.4702 - val_accuracy: 0.8033
Epoch 7/50
25/25 [==============================] - 0s 3ms/step - loss: 0.4308 - accuracy: 0.8306 - val_loss: 0.4470 - val_accuracy: 0.8033
Epoch 8/50
25/25 [==============================] - 0s 3ms/step - loss: 0.4323 - accuracy: 0.8388 - val_loss: 0.4330 - val_accuracy: 0.8033
Epoch 9/50
25/25 [==============================] - 0s 2ms/step - loss: 0.4081 - accuracy: 0.8512 - val_loss: 0.4192 - val_accuracy: 0.8197
Epoch 10/50
25/25 [==============================] - 0s 3ms/step - loss: 0.4033 - accuracy: 0.8554 - val_loss: 0.4147 - val_accuracy: 0.8197
Epoch 11/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3623 - accuracy: 0.8636 - val_loss: 0.4081 - val_accuracy: 0.8197
Epoch 12/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3940 - accuracy: 0.8554 - val_loss: 0.4047 - val_accuracy: 0.8197
Epoch 13/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3672 - accuracy: 0.8636 - val_loss: 0.4038 - val_accuracy: 0.8033
Epoch 14/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3872 - accuracy: 0.8471 - val_loss: 0.4011 - val_accuracy: 0.8197
Epoch 15/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3930 - accuracy: 0.8554 - val_loss: 0.3994 - val_accuracy: 0.8033
Epoch 16/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3572 - accuracy: 0.8595 - val_loss: 0.4052 - val_accuracy: 0.7869
Epoch 17/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3584 - accuracy: 0.8595 - val_loss: 0.4075 - val_accuracy: 0.7869
Epoch 18/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3783 - accuracy: 0.8636 - val_loss: 0.4082 - val_accuracy: 0.7869
Epoch 19/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3733 - accuracy: 0.8719 - val_loss: 0.4067 - val_accuracy: 0.7869
Epoch 20/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3833 - accuracy: 0.8554 - val_loss: 0.4094 - val_accuracy: 0.7869
Epoch 21/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3705 - accuracy: 0.8636 - val_loss: 0.4093 - val_accuracy: 0.7869
Epoch 22/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3508 - accuracy: 0.8678 - val_loss: 0.4098 - val_accuracy: 0.7869
Epoch 23/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3417 - accuracy: 0.8719 - val_loss: 0.4102 - val_accuracy: 0.7869
Epoch 24/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3431 - accuracy: 0.8719 - val_loss: 0.4129 - val_accuracy: 0.7869
Epoch 25/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3640 - accuracy: 0.8719 - val_loss: 0.4114 - val_accuracy: 0.7869
Epoch 26/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3322 - accuracy: 0.8554 - val_loss: 0.4104 - val_accuracy: 0.7869
Epoch 27/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3614 - accuracy: 0.8760 - val_loss: 0.4102 - val_accuracy: 0.7869
Epoch 28/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3421 - accuracy: 0.8719 - val_loss: 0.4111 - val_accuracy: 0.8033
Epoch 29/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3218 - accuracy: 0.8843 - val_loss: 0.4107 - val_accuracy: 0.8033
Epoch 30/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3494 - accuracy: 0.8843 - val_loss: 0.4118 - val_accuracy: 0.8033
Epoch 31/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3505 - accuracy: 0.8843 - val_loss: 0.4144 - val_accuracy: 0.8033
Epoch 32/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3310 - accuracy: 0.8884 - val_loss: 0.4128 - val_accuracy: 0.8033
Epoch 33/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3519 - accuracy: 0.8678 - val_loss: 0.4130 - val_accuracy: 0.8033
Epoch 34/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3495 - accuracy: 0.8636 - val_loss: 0.4188 - val_accuracy: 0.8033
Epoch 35/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3533 - accuracy: 0.8678 - val_loss: 0.4201 - val_accuracy: 0.8033
Epoch 36/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3122 - accuracy: 0.8843 - val_loss: 0.4198 - val_accuracy: 0.8033
Epoch 37/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3132 - accuracy: 0.8802 - val_loss: 0.4202 - val_accuracy: 0.8033
Epoch 38/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3290 - accuracy: 0.8719 - val_loss: 0.4219 - val_accuracy: 0.8033
Epoch 39/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3343 - accuracy: 0.8884 - val_loss: 0.4246 - val_accuracy: 0.8033
Epoch 40/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3282 - accuracy: 0.8802 - val_loss: 0.4263 - val_accuracy: 0.8033
Epoch 41/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3470 - accuracy: 0.8884 - val_loss: 0.4256 - val_accuracy: 0.7869
Epoch 42/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3317 - accuracy: 0.8884 - val_loss: 0.4288 - val_accuracy: 0.7869
Epoch 43/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3299 - accuracy: 0.8884 - val_loss: 0.4261 - val_accuracy: 0.8033
Epoch 44/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3106 - accuracy: 0.8843 - val_loss: 0.4287 - val_accuracy: 0.8033
Epoch 45/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3197 - accuracy: 0.8843 - val_loss: 0.4319 - val_accuracy: 0.7869
Epoch 46/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3309 - accuracy: 0.8760 - val_loss: 0.4385 - val_accuracy: 0.7869
Epoch 47/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3058 - accuracy: 0.9008 - val_loss: 0.4393 - val_accuracy: 0.7869
Epoch 48/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3089 - accuracy: 0.8884 - val_loss: 0.4384 - val_accuracy: 0.7869
Epoch 49/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3314 - accuracy: 0.8884 - val_loss: 0.4359 - val_accuracy: 0.7869
Epoch 50/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3469 - accuracy: 0.8884 - val_loss: 0.4384 - val_accuracy: 0.8033
In [25]:
import matplotlib.pyplot as plt
%matplotlib inline

# Model accuracy

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'])
plt.show()
In [26]:
# Model Loss

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'])
plt.show()

Improving Results - A Binary Classification Problem

Although we achieved promising results, we still have a fairly large error. In the original Cleveland data this is partly because it is very difficult to distinguish between the different severity levels of heart disease (classes 1 - 4). Let's recast the task explicitly as a binary classification problem - heart disease or no heart disease. (Note that the target column in this CSV is already 0/1, as the describe() output above shows, so the main change below is a single sigmoid output in place of the two-unit softmax.)

In [27]:
# convert into binary classification problem - heart disease or no heart disease

Y_train_binary = y_train.copy()
Y_test_binary = y_test.copy()

Y_train_binary[Y_train_binary > 0] = 1
Y_test_binary[Y_test_binary > 0] = 1

print(Y_train_binary[:20])
[1 0 0 0 1 1 1 1 0 1 0 1 0 0 0 1 0 0 1 1]
In [28]:
# define a new keras model for binary classification

def create_binary_model():
    
    # create model
    
    model = Sequential()
    model.add(Dense(16, input_dim=13, kernel_initializer='normal',  kernel_regularizer=regularizers.l2(0.001),activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(8, kernel_initializer='normal',  kernel_regularizer=regularizers.l2(0.001),activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(1, activation='sigmoid'))
    
    # compile model with the RMSprop optimizer (matching the training log below)
    
    model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model

binary_model = create_binary_model()

print(binary_model.summary())
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_3 (Dense)              (None, 16)                224       
_________________________________________________________________
dropout_2 (Dropout)          (None, 16)                0         
_________________________________________________________________
dense_4 (Dense)              (None, 8)                 136       
_________________________________________________________________
dropout_3 (Dropout)          (None, 8)                 0         
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 9         
=================================================================
Total params: 369
Trainable params: 369
Non-trainable params: 0
_________________________________________________________________
None
In [29]:
# fit the binary model on the training data

history=binary_model.fit(X_train, Y_train_binary, validation_data=(X_test, Y_test_binary), epochs=50, batch_size=10)
Epoch 1/50
25/25 [==============================] - 0s 6ms/step - loss: 0.6797 - accuracy: 0.6612 - val_loss: 0.6718 - val_accuracy: 0.7049
Epoch 2/50
25/25 [==============================] - 0s 2ms/step - loss: 0.6490 - accuracy: 0.7438 - val_loss: 0.6396 - val_accuracy: 0.7377
Epoch 3/50
25/25 [==============================] - 0s 3ms/step - loss: 0.6103 - accuracy: 0.7810 - val_loss: 0.6013 - val_accuracy: 0.7705
Epoch 4/50
25/25 [==============================] - 0s 3ms/step - loss: 0.5562 - accuracy: 0.7934 - val_loss: 0.5569 - val_accuracy: 0.7705
Epoch 5/50
25/25 [==============================] - 0s 3ms/step - loss: 0.5224 - accuracy: 0.7934 - val_loss: 0.5226 - val_accuracy: 0.7869
Epoch 6/50
25/25 [==============================] - 0s 2ms/step - loss: 0.4833 - accuracy: 0.8264 - val_loss: 0.4906 - val_accuracy: 0.8197
Epoch 7/50
25/25 [==============================] - 0s 3ms/step - loss: 0.4715 - accuracy: 0.8264 - val_loss: 0.4691 - val_accuracy: 0.8197
Epoch 8/50
25/25 [==============================] - 0s 3ms/step - loss: 0.4318 - accuracy: 0.8388 - val_loss: 0.4480 - val_accuracy: 0.8361
Epoch 9/50
25/25 [==============================] - 0s 2ms/step - loss: 0.4200 - accuracy: 0.8264 - val_loss: 0.4331 - val_accuracy: 0.8197
Epoch 10/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3959 - accuracy: 0.8388 - val_loss: 0.4222 - val_accuracy: 0.8033
Epoch 11/50
25/25 [==============================] - 0s 2ms/step - loss: 0.4107 - accuracy: 0.8388 - val_loss: 0.4138 - val_accuracy: 0.8033
Epoch 12/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3847 - accuracy: 0.8471 - val_loss: 0.4106 - val_accuracy: 0.7869
Epoch 13/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3790 - accuracy: 0.8512 - val_loss: 0.4069 - val_accuracy: 0.7869
Epoch 14/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3450 - accuracy: 0.8636 - val_loss: 0.4038 - val_accuracy: 0.7869
Epoch 15/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3579 - accuracy: 0.8388 - val_loss: 0.4049 - val_accuracy: 0.7869
Epoch 16/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3651 - accuracy: 0.8512 - val_loss: 0.4031 - val_accuracy: 0.7869
Epoch 17/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3875 - accuracy: 0.8719 - val_loss: 0.4026 - val_accuracy: 0.7869
Epoch 18/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3640 - accuracy: 0.8595 - val_loss: 0.4028 - val_accuracy: 0.7869
Epoch 19/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3770 - accuracy: 0.8595 - val_loss: 0.4047 - val_accuracy: 0.7869
Epoch 20/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3592 - accuracy: 0.8554 - val_loss: 0.4013 - val_accuracy: 0.8033
Epoch 21/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3658 - accuracy: 0.8554 - val_loss: 0.4011 - val_accuracy: 0.8033
Epoch 22/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3487 - accuracy: 0.8512 - val_loss: 0.3994 - val_accuracy: 0.8033
Epoch 23/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3306 - accuracy: 0.8678 - val_loss: 0.4011 - val_accuracy: 0.8033
Epoch 24/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3547 - accuracy: 0.8636 - val_loss: 0.4006 - val_accuracy: 0.8033
Epoch 25/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3519 - accuracy: 0.8802 - val_loss: 0.4033 - val_accuracy: 0.8033
Epoch 26/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3418 - accuracy: 0.8719 - val_loss: 0.4061 - val_accuracy: 0.8033
Epoch 27/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3311 - accuracy: 0.8802 - val_loss: 0.4011 - val_accuracy: 0.8033
Epoch 28/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3539 - accuracy: 0.8802 - val_loss: 0.4067 - val_accuracy: 0.8033
Epoch 29/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3372 - accuracy: 0.8719 - val_loss: 0.4059 - val_accuracy: 0.8033
Epoch 30/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3208 - accuracy: 0.8760 - val_loss: 0.4091 - val_accuracy: 0.8197
Epoch 31/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3324 - accuracy: 0.8678 - val_loss: 0.4124 - val_accuracy: 0.7869
Epoch 32/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3471 - accuracy: 0.8678 - val_loss: 0.4091 - val_accuracy: 0.7869
Epoch 33/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3261 - accuracy: 0.8843 - val_loss: 0.4081 - val_accuracy: 0.7869
Epoch 34/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3492 - accuracy: 0.8719 - val_loss: 0.4107 - val_accuracy: 0.8197
Epoch 35/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3333 - accuracy: 0.8760 - val_loss: 0.4111 - val_accuracy: 0.8197
Epoch 36/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3153 - accuracy: 0.8678 - val_loss: 0.4099 - val_accuracy: 0.8361
Epoch 37/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3080 - accuracy: 0.8884 - val_loss: 0.4097 - val_accuracy: 0.8197
Epoch 38/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3342 - accuracy: 0.8719 - val_loss: 0.4126 - val_accuracy: 0.8361
Epoch 39/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3110 - accuracy: 0.8884 - val_loss: 0.4116 - val_accuracy: 0.8361
Epoch 40/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3196 - accuracy: 0.8884 - val_loss: 0.4099 - val_accuracy: 0.7869
Epoch 41/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3279 - accuracy: 0.8678 - val_loss: 0.4106 - val_accuracy: 0.8197
Epoch 42/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3066 - accuracy: 0.8802 - val_loss: 0.4119 - val_accuracy: 0.8197
Epoch 43/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3052 - accuracy: 0.8802 - val_loss: 0.4121 - val_accuracy: 0.8033
Epoch 44/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3320 - accuracy: 0.8760 - val_loss: 0.4109 - val_accuracy: 0.8197
Epoch 45/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3441 - accuracy: 0.8843 - val_loss: 0.4103 - val_accuracy: 0.8197
Epoch 46/50
25/25 [==============================] - 0s 2ms/step - loss: 0.2999 - accuracy: 0.8760 - val_loss: 0.4111 - val_accuracy: 0.8361
Epoch 47/50
25/25 [==============================] - 0s 2ms/step - loss: 0.3553 - accuracy: 0.8843 - val_loss: 0.4072 - val_accuracy: 0.8361
Epoch 48/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3328 - accuracy: 0.8884 - val_loss: 0.4070 - val_accuracy: 0.8361
Epoch 49/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3234 - accuracy: 0.8843 - val_loss: 0.4072 - val_accuracy: 0.8197
Epoch 50/50
25/25 [==============================] - 0s 3ms/step - loss: 0.3233 - accuracy: 0.9050 - val_loss: 0.4110 - val_accuracy: 0.8033
In [30]:
import matplotlib.pyplot as plt
%matplotlib inline

# Model accuracy

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'])
plt.show()
In [31]:
# Model Loss

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'])
plt.show()

Results and Metrics

The accuracy results we have been seeing are for the training data, but what about the testing dataset? If our models cannot generalize to data they weren't trained on, they won't provide any utility.

Let's test the performance of both our categorical model and our binary model. To do this, we will make predictions on the testing dataset and calculate performance metrics using scikit-learn.

In [32]:
# generate classification report using predictions for categorical model

from sklearn.metrics import classification_report, accuracy_score

categorical_pred = np.argmax(model.predict(X_test), axis=1)

print('Results for Categorical Model')
print(accuracy_score(y_test, categorical_pred))
print(classification_report(y_test, categorical_pred))
Results for Categorical Model
0.8032786885245902
              precision    recall  f1-score   support

           0       0.86      0.68      0.76        28
           1       0.77      0.91      0.83        33

    accuracy                           0.80        61
   macro avg       0.82      0.79      0.80        61
weighted avg       0.81      0.80      0.80        61

In [33]:
# generate classification report using predictions for binary model

binary_pred = np.round(binary_model.predict(X_test)).astype(int)

print('Results for Binary Model')
print(accuracy_score(Y_test_binary, binary_pred))
print(classification_report(Y_test_binary, binary_pred))
Results for Binary Model
0.8032786885245902
              precision    recall  f1-score   support

           0       0.86      0.68      0.76        28
           1       0.77      0.91      0.83        33

    accuracy                           0.80        61
   macro avg       0.82      0.79      0.80        61
weighted avg       0.81      0.80      0.80        61
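
A confusion matrix makes the precision and recall figures above more concrete. A minimal sketch using scikit-learn, reusing the variables already defined in this notebook:

from sklearn.metrics import confusion_matrix

# rows are true classes, columns are predicted classes
print(confusion_matrix(Y_test_binary, binary_pred))

Consistent with the recall values in the report above, this should come out to roughly [[19, 9], [3, 30]]: 19 of the 28 negative cases and 30 of the 33 positive cases are classified correctly.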

Now, we save our model.

In [34]:
model.save('heart_disease.h5')
In [35]:
from tensorflow.keras.models import load_model
In [36]:
m = load_model('heart_disease.h5')
In [37]:
m.predict_classes(X_test)
WARNING:tensorflow:From <ipython-input-37-b92a4831ffe9>:1: Sequential.predict_classes (from tensorflow.python.keras.engine.sequential) is deprecated and will be removed after 2021-01-01.
Instructions for updating:
Please use instead:
* `np.argmax(model.predict(x), axis=-1)`, if your model does multi-class classification (e.g. if it uses a `softmax` last-layer activation).
* `(model.predict(x) > 0.5).astype("int32")`, if your model does binary classification (e.g. if it uses a `sigmoid` last-layer activation).
Out[37]:
array([0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0,
       1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
       0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1])
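
As the deprecation warning suggests, and since this model ends in a softmax layer, the equivalent modern call is:

# replacement for the deprecated predict_classes(), per the warning above
np.argmax(m.predict(X_test), axis=-1)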

deepCC

In [1]:
!deepCC heart_disease.h5
[INFO]
Reading [keras model] 'heart_disease.h5'
[SUCCESS]
Saved 'heart_disease_deepC/heart_disease.onnx'
[INFO]
Reading [onnx model] 'heart_disease_deepC/heart_disease.onnx'
[INFO]
Model info:
  ir_vesion : 4
  doc       : 
[WARNING]
[ONNX]: terminal (input/output) dense_input's shape is less than 1. Changing it to 1.
[WARNING]
[ONNX]: terminal (input/output) dense_2's shape is less than 1. Changing it to 1.
WARN (GRAPH): found operator node with the same name (dense_2) as io node.
[INFO]
Running DNNC graph sanity check ...
[SUCCESS]
Passed sanity check.
[INFO]
Writing C++ file 'heart_disease_deepC/heart_disease.cpp'
[INFO]
deepSea model files are ready in 'heart_disease_deepC/' 
[RUNNING COMMAND]
g++ -std=c++11 -O3 -fno-rtti -fno-exceptions -I. -I/opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/include -isystem /opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/packages/eigen-eigen-323c052e1731 "heart_disease_deepC/heart_disease.cpp" -D_AITS_MAIN -o "heart_disease_deepC/heart_disease.exe"
[RUNNING COMMAND]
size "heart_disease_deepC/heart_disease.exe"
   text	   data	    bss	    dec	    hex	filename
 125341	   2984	    760	 129085	  1f83d	heart_disease_deepC/heart_disease.exe
[SUCCESS]
Saved model as executable "heart_disease_deepC/heart_disease.exe"
In [ ]: