
Online Shopper's Intention Prediction

Credit: AITS Cainvas Community

Photo by Karol Cichoń on Dribbble

Predict a customer's purchasing behaviour on online shopping websites, for KPI and marketing analysis.

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from keras import models, optimizers, losses, layers, callbacks
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import random
import warnings
warnings.filterwarnings("ignore")

The dataset

  1. C. Okan Sakar Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Bahcesehir University, 34349 Besiktas, Istanbul, Turkey
  2. Yomi Kastro Inveon Information Technologies Consultancy and Trade, 34335 Istanbul, Turkey

Sakar, C.O., Polat, S.O., Katircioglu, M. et al. Neural Comput & Applic (2018).

The dataset is a CSV file with 18 attributes (10 numerical and 8 categorical), of which 'Revenue' is the target column.

Administrative, Administrative Duration, Informational, Informational Duration, Product Related and Product Related Duration represent the number of pages of the respective category visited during the session and the total time spent on them.

The Bounce Rate (the percentage of visitors who enter the site on that page and leave without triggering any other request), Exit Rate (the percentage of sessions that ended on the page, relative to all views of that page) and Page Value (the average value of a web page that a user visited before completing an e-commerce transaction) features represent the metrics measured by "Google Analytics" for each page in the e-commerce site.
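
To make the first two metrics concrete, here is a purely illustrative sketch of how they could be derived from a raw page-view log. The page_views table, its columns and the toy values are all hypothetical; they are not part of this dataset, which ships with the rates precomputed.

import pandas as pd

# Hypothetical page-view log: one row per page view, in session order
page_views = pd.DataFrame({
    'session': [1, 1, 2, 3, 3, 3],
    'page':    ['A', 'B', 'A', 'B', 'A', 'B'],
})

entry = page_views.groupby('session')['page'].first()    # entry page per session
exit_ = page_views.groupby('session')['page'].last()     # exit page per session
single_page = page_views.groupby('session').size() == 1  # one-view sessions

# Bounce Rate: single-page sessions entering on a page / all sessions entering there
bounce_rate = (entry[single_page].value_counts() / entry.value_counts()).fillna(0)

# Exit Rate: sessions ending on a page / all views of that page
exit_rate = exit_.value_counts() / page_views['page'].value_counts()

print(bounce_rate)    # A: 0.50, B: 0.00
print(exit_rate)      # A: 0.33, B: 0.67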

The Special Day feature indicates the closeness of the site visiting time to a specific special day (e.g. Mother's Day or Valentine's Day), taking values between 0 and 1.

Other attributes such as operating system, browser, region, traffic type, visitor type, weekend, and month are also available.

In [2]:
df = pd.read_csv('https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/online_shoppers_intention.csv')
df
Out[2]:
Administrative Administrative_Duration Informational Informational_Duration ProductRelated ProductRelated_Duration BounceRates ExitRates PageValues SpecialDay Month OperatingSystems Browser Region TrafficType VisitorType Weekend Revenue
0 0.0 0.0 0.0 0.0 1.0 0.000000 0.200000 0.200000 0.000000 0.0 Feb 1 1 1 1 Returning_Visitor False False
1 0.0 0.0 0.0 0.0 2.0 64.000000 0.000000 0.100000 0.000000 0.0 Feb 2 2 1 2 Returning_Visitor False False
2 0.0 -1.0 0.0 -1.0 1.0 -1.000000 0.200000 0.200000 0.000000 0.0 Feb 4 1 9 3 Returning_Visitor False False
3 0.0 0.0 0.0 0.0 2.0 2.666667 0.050000 0.140000 0.000000 0.0 Feb 3 2 2 4 Returning_Visitor False False
4 0.0 0.0 0.0 0.0 10.0 627.500000 0.020000 0.050000 0.000000 0.0 Feb 3 3 1 4 Returning_Visitor True False
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
12325 3.0 145.0 0.0 0.0 53.0 1783.791667 0.007143 0.029031 12.241717 0.0 Dec 4 6 1 1 Returning_Visitor True False
12326 0.0 0.0 0.0 0.0 5.0 465.750000 0.000000 0.021333 0.000000 0.0 Nov 3 2 1 8 Returning_Visitor True False
12327 0.0 0.0 0.0 0.0 6.0 184.250000 0.083333 0.086667 0.000000 0.0 Nov 3 2 1 13 Returning_Visitor True False
12328 4.0 75.0 0.0 0.0 15.0 346.000000 0.000000 0.021053 0.000000 0.0 Nov 2 2 3 11 Returning_Visitor False False
12329 0.0 0.0 0.0 0.0 3.0 21.250000 0.000000 0.066667 0.000000 0.0 Nov 3 2 1 2 New_Visitor True False

12330 rows × 18 columns

Looking into the columns in the data frame

In [3]:
df.columns
Out[3]:
Index(['Administrative', 'Administrative_Duration', 'Informational',
       'Informational_Duration', 'ProductRelated', 'ProductRelated_Duration',
       'BounceRates', 'ExitRates', 'PageValues', 'SpecialDay', 'Month',
       'OperatingSystems', 'Browser', 'Region', 'TrafficType', 'VisitorType',
       'Weekend', 'Revenue'],
      dtype='object')

Defining the numeric columns for standardization later

In [4]:
numeric_columns = ['Administrative', 'Administrative_Duration', 'Informational',
       'Informational_Duration', 'ProductRelated', 'ProductRelated_Duration',
       'BounceRates', 'ExitRates', 'PageValues', 'SpecialDay']

Checking for NaN values

...and dropping the rows that contain them.

In [5]:
print(df.isna().sum())

df = df.dropna()
Administrative             14
Administrative_Duration    14
Informational              14
Informational_Duration     14
ProductRelated             14
ProductRelated_Duration    14
BounceRates                14
ExitRates                  14
PageValues                  0
SpecialDay                  0
Month                       0
OperatingSystems            0
Browser                     0
Region                      0
TrafficType                 0
VisitorType                 0
Weekend                     0
Revenue                     0
dtype: int64
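
Dropping is safe here, since only 14 of the 12,330 rows are affected. As an alternative sketch (not used below), the numeric gaps could instead be filled with column medians:

# Alternative to dropna (sketch): impute missing numeric values with the median
df[numeric_columns] = df[numeric_columns].fillna(df[numeric_columns].median())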

A peek into the class label distribution

In [6]:
df['Revenue'].value_counts()
Out[6]:
False    10408
True      1908
Name: Revenue, dtype: int64

It's not balanced (False outnumbers True roughly 5.5 : 1), but let us see how our model performs on this data.
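
If the imbalance turns out to hurt recall on the minority class, class weighting is one common remedy. A minimal sketch (not used in this notebook): compute 'balanced' weights with scikit-learn and pass them to model.fit via its class_weight argument.

from sklearn.utils.class_weight import compute_class_weight

classes = np.array([0, 1])
weights = compute_class_weight('balanced',
                               classes = classes,
                               y = df['Revenue'].astype('int64'))
class_weight = dict(zip(classes, weights))
print(class_weight)    # roughly {0: 0.59, 1: 3.23}

# later: model.fit(..., class_weight = class_weight)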

A peek into the values in the 'Month' column

In [7]:
df['Month'].value_counts()
Out[7]:
May     3363
Nov     2998
Mar     1894
Dec     1727
Oct      549
Sep      448
Aug      433
Jul      432
June     288
Feb      184
Name: Month, dtype: int64

Only 10 out of 12 months appear in the dataframe (note that 'June' is spelled out while the other months are abbreviated). The Month column therefore needs to be one-hot encoded against all 12 months, not just those present.

In [8]:
# Convert binary to int
df['Weekend'] = df['Weekend'].astype('int64')
df['Revenue'] = df['Revenue'].astype('int64')

# One hot encoding 
dummy_columns = ['OperatingSystems','Browser','Region','TrafficType','VisitorType']

for column in dummy_columns:
    df_dummies = pd.get_dummies(df[column], drop_first = True, prefix = column+"_")    
    df = pd.concat([df, df_dummies], axis = 1)
    
df = df.drop(columns = dummy_columns)

# Accounting for all months in the calendar
months = ['Jan','Feb','Mar','Apr','May','June','Jul','Aug','Sep','Oct','Nov','Dec']

for mx in months[1:]:    # skip the first month, mirroring drop_first = True above
    df[mx] = (df['Month'] == mx).astype('int64')

df = df.drop(columns = ['Month'])
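
As an aside, the manual loop above could be replaced by the same pd.get_dummies call used for the other columns, provided Month is first declared categorical over all 12 months; get_dummies then emits a column for every category, including the two unseen ones. A sketch, assuming it runs before Month is dropped:

# Alternative sketch: a categorical dtype makes get_dummies aware of all 12
# months, so the unseen ones ('Jan', 'Apr') still get (all-zero) columns
month_cat = pd.Categorical(df['Month'], categories = months)
month_dummies = pd.get_dummies(month_cat, prefix = 'Month', drop_first = True)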

Defining input and output columns

In [9]:
input_columns = df.columns.tolist()
input_columns.remove('Revenue')

output_columns = ['Revenue']

Train-val-test split based on 80-10-10 ratio

In [10]:
# Splitting into train, val and test set -- 80-10-10 split

# First, an 80-20 split
train_df, val_test_df = train_test_split(df, test_size = 0.2)

# Then split the 20% into half
val_df, test_df = train_test_split(val_test_df, test_size = 0.5)

print("Number of samples in...")
print("Training set: ", len(train_df))
print("Validation set: ", len(val_df))
print("Testing set: ", len(test_df))
Number of samples in...
Training set:  9852
Validation set:  1232
Testing set:  1232
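
Since the classes are imbalanced, a stratified split would keep the False/True ratio identical across the three sets. A sketch of the same 80-10-10 scheme using train_test_split's stratify parameter (random_state added here only for reproducibility):

# Stratified variant of the same 80-10-10 split (sketch)
train_df, val_test_df = train_test_split(df, test_size = 0.2,
                                         stratify = df['Revenue'], random_state = 42)
val_df, test_df = train_test_split(val_test_df, test_size = 0.5,
                                   stratify = val_test_df['Revenue'], random_state = 42)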

Standardizing the numeric column values

In [11]:
ss = StandardScaler()

# Fit on the training set only, so that validation/test statistics do not leak
train_df[numeric_columns] = ss.fit_transform(train_df[numeric_columns])
val_df[numeric_columns] = ss.transform(val_df[numeric_columns])
test_df[numeric_columns] = ss.transform(test_df[numeric_columns])
In [12]:
# Splitting into X (input) and y (output)

Xtrain, ytrain = np.array(train_df[input_columns]), np.array(train_df[output_columns])

Xval, yval = np.array(val_df[input_columns]), np.array(val_df[output_columns])

# Note: only Xtest is cast to float16; these are the samples later saved to file
# for the compiled deepC model
Xtest, ytest = np.array(test_df[input_columns]).astype('float16'), np.array(test_df[output_columns])

The model

In [13]:
model = models.Sequential([
    layers.Dense(16, activation = 'relu', input_shape = Xtrain[0].shape),
    layers.Dense(8, activation = 'relu'),
    layers.Dense(1, activation = 'sigmoid')
])

cb = callbacks.EarlyStopping(patience = 5, restore_best_weights = True)    # stop if val_loss does not improve for 5 epochs, then restore the best weights
In [14]:
model.compile(optimizer = optimizers.Adam(0.0001), loss = losses.BinaryCrossentropy(), metrics = ['accuracy'])

history = model.fit(Xtrain, ytrain, validation_data = (Xval, yval), epochs = 256, callbacks = cb)
Epoch 1/256
308/308 [==============================] - 1s 2ms/step - loss: 0.6496 - accuracy: 0.6569 - val_loss: 0.5617 - val_accuracy: 0.8271
Epoch 2/256
308/308 [==============================] - 0s 2ms/step - loss: 0.4670 - accuracy: 0.8650 - val_loss: 0.4285 - val_accuracy: 0.8482
Epoch 3/256
308/308 [==============================] - 0s 2ms/step - loss: 0.3722 - accuracy: 0.8680 - val_loss: 0.3737 - val_accuracy: 0.8490
Epoch 4/256
308/308 [==============================] - 0s 2ms/step - loss: 0.3342 - accuracy: 0.8711 - val_loss: 0.3519 - val_accuracy: 0.8523
Epoch 5/256
308/308 [==============================] - 0s 2ms/step - loss: 0.3160 - accuracy: 0.8761 - val_loss: 0.3382 - val_accuracy: 0.8555
Epoch 6/256
308/308 [==============================] - 0s 2ms/step - loss: 0.3033 - accuracy: 0.8797 - val_loss: 0.3281 - val_accuracy: 0.8580
Epoch 7/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2936 - accuracy: 0.8839 - val_loss: 0.3207 - val_accuracy: 0.8636
Epoch 8/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2863 - accuracy: 0.8851 - val_loss: 0.3150 - val_accuracy: 0.8677
Epoch 9/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2806 - accuracy: 0.8891 - val_loss: 0.3107 - val_accuracy: 0.8693
Epoch 10/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2761 - accuracy: 0.8902 - val_loss: 0.3069 - val_accuracy: 0.8742
Epoch 11/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2724 - accuracy: 0.8935 - val_loss: 0.3038 - val_accuracy: 0.8782
Epoch 12/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2693 - accuracy: 0.8926 - val_loss: 0.3010 - val_accuracy: 0.8807
Epoch 13/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2665 - accuracy: 0.8931 - val_loss: 0.2986 - val_accuracy: 0.8823
Epoch 14/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2640 - accuracy: 0.8929 - val_loss: 0.2963 - val_accuracy: 0.8823
Epoch 15/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2618 - accuracy: 0.8934 - val_loss: 0.2945 - val_accuracy: 0.8831
Epoch 16/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2599 - accuracy: 0.8948 - val_loss: 0.2929 - val_accuracy: 0.8839
Epoch 17/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2581 - accuracy: 0.8957 - val_loss: 0.2914 - val_accuracy: 0.8823
Epoch 18/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2564 - accuracy: 0.8971 - val_loss: 0.2900 - val_accuracy: 0.8799
Epoch 19/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2548 - accuracy: 0.8968 - val_loss: 0.2889 - val_accuracy: 0.8799
Epoch 20/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2533 - accuracy: 0.8970 - val_loss: 0.2878 - val_accuracy: 0.8774
Epoch 21/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2520 - accuracy: 0.8973 - val_loss: 0.2869 - val_accuracy: 0.8750
Epoch 22/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2507 - accuracy: 0.8967 - val_loss: 0.2860 - val_accuracy: 0.8758
Epoch 23/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2495 - accuracy: 0.8976 - val_loss: 0.2850 - val_accuracy: 0.8766
Epoch 24/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2484 - accuracy: 0.8981 - val_loss: 0.2843 - val_accuracy: 0.8774
Epoch 25/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2473 - accuracy: 0.8981 - val_loss: 0.2836 - val_accuracy: 0.8791
Epoch 26/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2462 - accuracy: 0.8981 - val_loss: 0.2830 - val_accuracy: 0.8774
Epoch 27/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2453 - accuracy: 0.8990 - val_loss: 0.2823 - val_accuracy: 0.8774
Epoch 28/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2443 - accuracy: 0.8986 - val_loss: 0.2816 - val_accuracy: 0.8782
Epoch 29/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2435 - accuracy: 0.8982 - val_loss: 0.2810 - val_accuracy: 0.8766
Epoch 30/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2425 - accuracy: 0.8989 - val_loss: 0.2801 - val_accuracy: 0.8782
Epoch 31/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2418 - accuracy: 0.8990 - val_loss: 0.2796 - val_accuracy: 0.8782
Epoch 32/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2410 - accuracy: 0.8999 - val_loss: 0.2791 - val_accuracy: 0.8799
Epoch 33/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2402 - accuracy: 0.9002 - val_loss: 0.2786 - val_accuracy: 0.8799
Epoch 34/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2395 - accuracy: 0.9002 - val_loss: 0.2782 - val_accuracy: 0.8774
Epoch 35/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2389 - accuracy: 0.9004 - val_loss: 0.2780 - val_accuracy: 0.8782
Epoch 36/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2383 - accuracy: 0.9006 - val_loss: 0.2772 - val_accuracy: 0.8799
Epoch 37/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2376 - accuracy: 0.9014 - val_loss: 0.2769 - val_accuracy: 0.8782
Epoch 38/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2370 - accuracy: 0.9022 - val_loss: 0.2766 - val_accuracy: 0.8782
Epoch 39/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2364 - accuracy: 0.9008 - val_loss: 0.2764 - val_accuracy: 0.8791
Epoch 40/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2359 - accuracy: 0.9016 - val_loss: 0.2761 - val_accuracy: 0.8782
Epoch 41/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2353 - accuracy: 0.9013 - val_loss: 0.2757 - val_accuracy: 0.8807
Epoch 42/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2348 - accuracy: 0.9015 - val_loss: 0.2755 - val_accuracy: 0.8774
Epoch 43/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2343 - accuracy: 0.9022 - val_loss: 0.2751 - val_accuracy: 0.8774
Epoch 44/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2338 - accuracy: 0.9021 - val_loss: 0.2745 - val_accuracy: 0.8791
Epoch 45/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2333 - accuracy: 0.9021 - val_loss: 0.2745 - val_accuracy: 0.8782
Epoch 46/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2329 - accuracy: 0.9015 - val_loss: 0.2739 - val_accuracy: 0.8782
Epoch 47/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2323 - accuracy: 0.9021 - val_loss: 0.2736 - val_accuracy: 0.8774
Epoch 48/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2319 - accuracy: 0.9029 - val_loss: 0.2735 - val_accuracy: 0.8774
Epoch 49/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2315 - accuracy: 0.9027 - val_loss: 0.2731 - val_accuracy: 0.8774
Epoch 50/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2312 - accuracy: 0.9029 - val_loss: 0.2728 - val_accuracy: 0.8782
Epoch 51/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2307 - accuracy: 0.9031 - val_loss: 0.2726 - val_accuracy: 0.8782
Epoch 52/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2303 - accuracy: 0.9033 - val_loss: 0.2723 - val_accuracy: 0.8791
Epoch 53/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2299 - accuracy: 0.9030 - val_loss: 0.2720 - val_accuracy: 0.8791
Epoch 54/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2295 - accuracy: 0.9028 - val_loss: 0.2720 - val_accuracy: 0.8799
Epoch 55/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2291 - accuracy: 0.9034 - val_loss: 0.2717 - val_accuracy: 0.8799
Epoch 56/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2288 - accuracy: 0.9035 - val_loss: 0.2717 - val_accuracy: 0.8782
Epoch 57/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2284 - accuracy: 0.9037 - val_loss: 0.2714 - val_accuracy: 0.8791
Epoch 58/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2281 - accuracy: 0.9041 - val_loss: 0.2712 - val_accuracy: 0.8782
Epoch 59/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2277 - accuracy: 0.9043 - val_loss: 0.2710 - val_accuracy: 0.8807
Epoch 60/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2275 - accuracy: 0.9038 - val_loss: 0.2708 - val_accuracy: 0.8782
Epoch 61/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2272 - accuracy: 0.9049 - val_loss: 0.2708 - val_accuracy: 0.8782
Epoch 62/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2268 - accuracy: 0.9048 - val_loss: 0.2710 - val_accuracy: 0.8774
Epoch 63/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2265 - accuracy: 0.9044 - val_loss: 0.2705 - val_accuracy: 0.8799
Epoch 64/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2262 - accuracy: 0.9053 - val_loss: 0.2703 - val_accuracy: 0.8791
Epoch 65/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2259 - accuracy: 0.9053 - val_loss: 0.2704 - val_accuracy: 0.8758
Epoch 66/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2256 - accuracy: 0.9054 - val_loss: 0.2702 - val_accuracy: 0.8774
Epoch 67/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2253 - accuracy: 0.9057 - val_loss: 0.2701 - val_accuracy: 0.8791
Epoch 68/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2250 - accuracy: 0.9054 - val_loss: 0.2700 - val_accuracy: 0.8782
Epoch 69/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2247 - accuracy: 0.9055 - val_loss: 0.2699 - val_accuracy: 0.8774
Epoch 70/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2244 - accuracy: 0.9057 - val_loss: 0.2696 - val_accuracy: 0.8791
Epoch 71/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2243 - accuracy: 0.9058 - val_loss: 0.2695 - val_accuracy: 0.8766
Epoch 72/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2239 - accuracy: 0.9055 - val_loss: 0.2697 - val_accuracy: 0.8750
Epoch 73/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2237 - accuracy: 0.9059 - val_loss: 0.2694 - val_accuracy: 0.8766
Epoch 74/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2233 - accuracy: 0.9056 - val_loss: 0.2697 - val_accuracy: 0.8758
Epoch 75/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2233 - accuracy: 0.9061 - val_loss: 0.2692 - val_accuracy: 0.8766
Epoch 76/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2230 - accuracy: 0.9060 - val_loss: 0.2690 - val_accuracy: 0.8782
Epoch 77/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2227 - accuracy: 0.9064 - val_loss: 0.2691 - val_accuracy: 0.8774
Epoch 78/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2225 - accuracy: 0.9062 - val_loss: 0.2690 - val_accuracy: 0.8774
Epoch 79/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2222 - accuracy: 0.9062 - val_loss: 0.2691 - val_accuracy: 0.8774
Epoch 80/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2220 - accuracy: 0.9059 - val_loss: 0.2691 - val_accuracy: 0.8766
Epoch 81/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2217 - accuracy: 0.9070 - val_loss: 0.2689 - val_accuracy: 0.8782
Epoch 82/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2216 - accuracy: 0.9068 - val_loss: 0.2689 - val_accuracy: 0.8774
Epoch 83/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2213 - accuracy: 0.9068 - val_loss: 0.2689 - val_accuracy: 0.8774
Epoch 84/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2210 - accuracy: 0.9067 - val_loss: 0.2692 - val_accuracy: 0.8774
Epoch 85/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2209 - accuracy: 0.9071 - val_loss: 0.2689 - val_accuracy: 0.8791
Epoch 86/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2206 - accuracy: 0.9064 - val_loss: 0.2690 - val_accuracy: 0.8774
Epoch 87/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2204 - accuracy: 0.9066 - val_loss: 0.2688 - val_accuracy: 0.8807
Epoch 88/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2201 - accuracy: 0.9071 - val_loss: 0.2683 - val_accuracy: 0.8807
Epoch 89/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2199 - accuracy: 0.9066 - val_loss: 0.2682 - val_accuracy: 0.8799
Epoch 90/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2197 - accuracy: 0.9069 - val_loss: 0.2686 - val_accuracy: 0.8799
Epoch 91/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2194 - accuracy: 0.9075 - val_loss: 0.2684 - val_accuracy: 0.8799
Epoch 92/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2192 - accuracy: 0.9081 - val_loss: 0.2686 - val_accuracy: 0.8791
Epoch 93/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2191 - accuracy: 0.9075 - val_loss: 0.2686 - val_accuracy: 0.8799
Epoch 94/256
308/308 [==============================] - 0s 2ms/step - loss: 0.2188 - accuracy: 0.9082 - val_loss: 0.2690 - val_accuracy: 0.8774
In [15]:
model.evaluate(Xtest, ytest)
39/39 [==============================] - 0s 1000us/step - loss: 0.2418 - accuracy: 0.8912
Out[15]:
[0.2418023943901062, 0.8912337422370911]
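
Given the class imbalance, accuracy alone can flatter the model (always predicting False would already score about 84%). A quick sketch using scikit-learn's classification_report to see per-class precision and recall:

from sklearn.metrics import classification_report

ypred = (model.predict(Xtest) > 0.5).astype('int64')
print(classification_report(ytest, ypred, target_names = ['False', 'True']))
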
In [16]:
cm = confusion_matrix(ytest, (model.predict(Xtest)>0.5).astype('int64'))
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]    # normalize each row (true class) to proportions

fig = plt.figure(figsize = (5, 5))
ax = fig.add_subplot(111)

for i in range(cm.shape[1]):
    for j in range(cm.shape[0]):
        if cm[i,j] > 0.8:
            clr = "white"
        else:
            clr = "black"
        ax.text(j, i, format(cm[i, j], '.2f'), horizontalalignment="center", color=clr)

_ = ax.imshow(cm, cmap=plt.cm.Blues)
ax.set_xticks(range(2))
ax.set_yticks(range(2))
ax.set_xticklabels(['False', 'True'], rotation = 90)    # confusion_matrix orders labels ascending: 0 (False), then 1 (True)
ax.set_yticklabels(['False', 'True'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()

Plotting the metrics

In [17]:
def plot(history, variable, variable2):
    plt.plot(range(len(history[variable])), history[variable])
    plt.plot(range(len(history[variable2])), history[variable2])
    plt.legend([variable, variable2])
    plt.title(variable)
    plt.xlabel('epochs')
    plt.show()
In [18]:
plot(history.history, "loss", "val_loss")
In [19]:
plot(history.history, "accuracy", "val_accuracy")

Prediction

In [20]:
# pick a random sample from the test set
x = random.randint(0, len(Xtest) - 1)

output = model.predict(Xtest[x].reshape(1, -1))[0][0]
pred = (output>0.5).astype('int64')

print("Predicted: ", bool(pred), "(", output, "-->", pred, ")")   

print("True: ", bool(ytest[x]))
Predicted:  False ( 0.0036888116 --> 0 )
True:  False

deepC

In [21]:
model.save('online_shopper.h5')

!deepCC online_shopper.h5
[INFO]
Reading [keras model] 'online_shopper.h5'
[SUCCESS]
Saved 'online_shopper_deepC/online_shopper.onnx'
[INFO]
Reading [onnx model] 'online_shopper_deepC/online_shopper.onnx'
[INFO]
Model info:
  ir_vesion : 4
  doc       : 
[WARNING]
[ONNX]: terminal (input/output) dense_input's shape is less than 1. Changing it to 1.
[WARNING]
[ONNX]: terminal (input/output) dense_2's shape is less than 1. Changing it to 1.
WARN (GRAPH): found operator node with the same name (dense_2) as io node.
[INFO]
Running DNNC graph sanity check ...
[SUCCESS]
Passed sanity check.
[INFO]
Writing C++ file 'online_shopper_deepC/online_shopper.cpp'
[INFO]
deepSea model files are ready in 'online_shopper_deepC/' 
[RUNNING COMMAND]
g++ -std=c++11 -O3 -fno-rtti -fno-exceptions -I. -I/opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/include -isystem /opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/packages/eigen-eigen-323c052e1731 "online_shopper_deepC/online_shopper.cpp" -D_AITS_MAIN -o "online_shopper_deepC/online_shopper.exe"
[RUNNING COMMAND]
size "online_shopper_deepC/online_shopper.exe"
   text	   data	    bss	    dec	    hex	filename
 122279	   2568	    760	 125607	  1eaa7	online_shopper_deepC/online_shopper.exe
[SUCCESS]
Saved model as executable "online_shopper_deepC/online_shopper.exe"
In [22]:
x = random.randint(0, len(Xtest) - 1)

np.savetxt('sample.data', Xtest[x])    # xth sample into text file

# run exe with input
!online_shopper_deepC/online_shopper.exe sample.data

output = model.predict(Xtest[x].reshape(1, -1))[0][0]
predm = (output>0.5).astype('int64')

# read the deepC model's output
nn_out = np.loadtxt('deepSea_result_1.out')

pred = (nn_out>0.5).astype('int64')
print("Predicted (deepC): ", bool(pred), "(", nn_out, "-->", pred, ")")   
print("Predicted (model): ", bool(predm), "(", output, "-->", predm, ")")   

print("True: ", bool(ytest[x]))
writing file deepSea_result_1.out.
Predicted (deepC):  True ( 1.0 --> 1 )
Predicted (model):  False ( 0.112443924 --> 0 )
True:  False
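
Note that the two predictions disagree on this sample: the compiled model outputs 1.0 while the Keras model outputs 0.11. A rough way to quantify how often that happens is to repeat the save/run/load cycle above over several random samples. This sketch reuses the exact mechanics shown in the cell (it assumes each run rewrites deepSea_result_1.out as above; the 50-sample count is arbitrary):

import os

n, agree = 50, 0
for _ in range(n):
    x = random.randint(0, len(Xtest) - 1)
    np.savetxt('sample.data', Xtest[x])                       # write one sample
    os.system('online_shopper_deepC/online_shopper.exe sample.data')
    deepc_pred = int(np.loadtxt('deepSea_result_1.out') > 0.5)
    keras_pred = int(model.predict(Xtest[x].reshape(1, -1))[0][0] > 0.5)
    agree += (deepc_pred == keras_pred)

print("Agreement on", n, "random test samples:", agree / n)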