
Predicting Churn for Bank Customers

Credit: AITS Cainvas Community

Photo by Vic López on Dribbble

In [1]:
!wget -N "https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/project.zip"
!unzip -o project.zip 
!rm project.zip
--2021-07-05 12:04:18--  https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/project.zip
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.158.19
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.158.19|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 268000 (262K) [application/x-zip-compressed]
Saving to: ‘project.zip’

project.zip         100%[===================>] 261.72K  --.-KB/s    in 0.002s  

2021-07-05 12:04:18 (159 MB/s) - ‘project.zip’ saved [268000/268000]

Archive:  project.zip
  inflating: Churn Model.csv         

Importing Libraries

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
In [3]:
churn=pd.read_csv("Churn Model.csv") 
In [4]:
churn
Out[4]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 1 15634602 Hargrave 619 France Female 42 2 0.00 1 1 1 101348.88 1
1 2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
2 3 15619304 Onio 502 France Female 42 8 159660.80 3 1 0 113931.57 1
3 4 15701354 Boni 699 France Female 39 1 0.00 2 0 0 93826.63 0
4 5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1 1 79084.10 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9995 9996 15606229 Obijiaku 771 France Male 39 5 0.00 2 1 0 96270.64 0
9996 9997 15569892 Johnstone 516 France Male 35 10 57369.61 1 1 1 101699.77 0
9997 9998 15584532 Liu 709 France Female 36 7 0.00 1 0 1 42085.58 1
9998 9999 15682355 Sabbatini 772 Germany Male 42 3 75075.31 2 1 0 92888.52 1
9999 10000 15628319 Walker 792 France Female 28 4 130142.79 1 1 0 38190.78 0

10000 rows × 14 columns

In [5]:
churn.columns
Out[5]:
Index(['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography',
       'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard',
       'IsActiveMember', 'EstimatedSalary', 'Exited'],
      dtype='object')

Checking Null values

In [6]:
churn.isnull().sum()  # count the null values in each column
Out[6]:
RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

Creating heat map

In [7]:
plt.figure(figsize=(14,14))
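# Correlate numeric columns only; Surname, Geography and Gender are still strings at this point
sns.heatmap(churn.select_dtypes(include="number").corr(), annot=True, cmap="coolwarm")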
Out[7]:
<AxesSubplot:>

Replacing Categorical with Numerical

In [8]:
# Encode Gender as 0 (Female) / 1 (Male); assigning back avoids chained in-place replacement
churn['Gender'] = churn['Gender'].map({'Female': 0, 'Male': 1})
In [9]:
churn
Out[9]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 1 15634602 Hargrave 619 France 0 42 2 0.00 1 1 1 101348.88 1
1 2 15647311 Hill 608 Spain 0 41 1 83807.86 1 0 1 112542.58 0
2 3 15619304 Onio 502 France 0 42 8 159660.80 3 1 0 113931.57 1
3 4 15701354 Boni 699 France 0 39 1 0.00 2 0 0 93826.63 0
4 5 15737888 Mitchell 850 Spain 0 43 2 125510.82 1 1 1 79084.10 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9995 9996 15606229 Obijiaku 771 France 1 39 5 0.00 2 1 0 96270.64 0
9996 9997 15569892 Johnstone 516 France 1 35 10 57369.61 1 1 1 101699.77 0
9997 9998 15584532 Liu 709 France 0 36 7 0.00 1 0 1 42085.58 1
9998 9999 15682355 Sabbatini 772 Germany 1 42 3 75075.31 2 1 0 92888.52 1
9999 10000 15628319 Walker 792 France 0 28 4 130142.79 1 1 0 38190.78 0

10000 rows × 14 columns

In [10]:
X = churn.iloc[:, 3:13].values
y = churn.iloc[:, 13].values
In [11]:
print(X)
[[619 'France' 0 ... 1 1 101348.88]
 [608 'Spain' 0 ... 0 1 112542.58]
 [502 'France' 0 ... 1 0 113931.57]
 ...
 [709 'France' 0 ... 0 1 42085.58]
 [772 'Germany' 1 ... 1 0 92888.52]
 [792 'France' 0 ... 1 0 38190.78]]
In [12]:
print(y)
[1 0 1 ... 1 1 0]
In [13]:
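from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
# Column 2 of X is Gender, which is already 0/1 at this point, so this step leaves the values unchanged
X[:, 2] = le.fit_transform(X[:, 2])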
In [14]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
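The column order produced by this encoding step matters later, when a single customer is scored by hand. Below is a minimal pure-Python sketch of what one-hot encoding column 1 with `remainder='passthrough'` does to one row. It assumes the three countries seen in the data and alphabetical category order; `one_hot_first` is a hypothetical helper, not part of scikit-learn.

```python
# Hypothetical helper that mimics one-hot encoding column 1 and passing the rest through.
CATEGORIES = sorted(["France", "Spain", "Germany"])  # categories end up in sorted order


def one_hot_first(row, column=1, categories=CATEGORIES):
    """Return the one-hot columns first, followed by the untouched remaining columns."""
    encoded = [1.0 if row[column] == c else 0.0 for c in categories]
    remainder = [v for i, v in enumerate(row) if i != column]
    return encoded + remainder


# Second row of the dataset (CreditScore ... EstimatedSalary), with Gender already encoded
row = [608, "Spain", 0, 41, 1, 83807.86, 1, 0, 1, 112542.58]
encoded_row = one_hot_first(row)
print(encoded_row)
```

The 10 original feature columns become 12: three country indicators (France, Germany, Spain) followed by the other nine features in their original order.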
In [15]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

Feature Scaling

In [16]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
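The scaler is fitted on the training split only and then reused on the test split, so no test statistics leak into training. Below is a minimal standard-library sketch of the same idea for a single column. The credit scores are taken from the table shown earlier, and `fit_scaler` and `transform` are hypothetical helpers, not scikit-learn functions.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn the mean and population standard deviation from training values."""
    return mean(values), pstdev(values)


def transform(values, mu, sigma):
    """Standardise values using statistics learned elsewhere."""
    return [(v - mu) / sigma for v in values]


train_scores = [619, 608, 502, 699, 850]  # illustrative "training" credit scores
test_scores = [771, 516]                  # illustrative "test" credit scores

mu, sigma = fit_scaler(train_scores)                # fit on the training data only
train_scaled = transform(train_scores, mu, sigma)
test_scaled = transform(test_scores, mu, sigma)     # reuse the training statistics
print(train_scaled)
print(test_scaled)
```

After scaling, the training column has mean 0 and unit variance; the test column generally does not, which is expected.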
In [17]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
In [18]:
model = Sequential()
In [19]:
model.add(Dense(units = 6, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 12, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 24, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 36, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 48, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 54, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 60, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 1, kernel_initializer = 'glorot_uniform', activation = 'sigmoid'))
In [30]:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 6)                 78        
_________________________________________________________________
dense_1 (Dense)              (None, 12)                84        
_________________________________________________________________
dense_2 (Dense)              (None, 24)                312       
_________________________________________________________________
dense_3 (Dense)              (None, 36)                900       
_________________________________________________________________
dense_4 (Dense)              (None, 48)                1776      
_________________________________________________________________
dense_5 (Dense)              (None, 54)                2646      
_________________________________________________________________
dense_6 (Dense)              (None, 60)                3300      
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 61        
=================================================================
Total params: 9,157
Trainable params: 9,157
Non-trainable params: 0
_________________________________________________________________
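The parameter counts in the summary can be reproduced by hand: a Dense layer has inputs × units weights plus one bias per unit. Below is a minimal sketch, assuming 12 input features (3 one-hot country columns plus the 9 remaining columns); `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


layer_units = [6, 12, 24, 36, 48, 54, 60, 1]  # units of each Dense layer, in order
n_inputs = 12  # assumption: 3 one-hot country columns + 9 other features

counts = []
for units in layer_units:
    counts.append(dense_params(n_inputs, units))
    n_inputs = units  # the output of this layer feeds the next one

print(counts)
print(sum(counts))
```

The per-layer values and the total match the `Param #` column and the `Total params` line of the summary.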
In [24]:
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
In [26]:
y
Out[26]:
array([1, 0, 1, ..., 1, 1, 0])
In [27]:
X
Out[27]:
array([[1.0, 0.0, 0.0, ..., 1, 1, 101348.88],
       [0.0, 0.0, 1.0, ..., 0, 1, 112542.58],
       [1.0, 0.0, 0.0, ..., 1, 0, 113931.57],
       ...,
       [1.0, 0.0, 0.0, ..., 0, 1, 42085.58],
       [0.0, 1.0, 0.0, ..., 1, 0, 92888.52],
       [1.0, 0.0, 0.0, ..., 1, 0, 38190.78]], dtype=object)
In [28]:
churn.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  int64  
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(10), object(2)
memory usage: 1.1+ MB
In [29]:
history= model.fit(X_train,y_train, epochs = 25,validation_data=(X_test, y_test))
Epoch 1/25
250/250 [==============================] - 1s 3ms/step - loss: 0.5060 - accuracy: 0.7915 - val_loss: 0.5140 - val_accuracy: 0.7780
Epoch 2/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4623 - accuracy: 0.8016 - val_loss: 0.4587 - val_accuracy: 0.8095
Epoch 3/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4456 - accuracy: 0.8111 - val_loss: 0.4473 - val_accuracy: 0.8100
Epoch 4/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4359 - accuracy: 0.8146 - val_loss: 0.4329 - val_accuracy: 0.8085
Epoch 5/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4290 - accuracy: 0.8167 - val_loss: 0.4311 - val_accuracy: 0.8130
Epoch 6/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4232 - accuracy: 0.8216 - val_loss: 0.4232 - val_accuracy: 0.8185
Epoch 7/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4158 - accuracy: 0.8245 - val_loss: 0.4142 - val_accuracy: 0.8240
Epoch 8/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4085 - accuracy: 0.8219 - val_loss: 0.4080 - val_accuracy: 0.8270
Epoch 9/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3994 - accuracy: 0.8270 - val_loss: 0.3966 - val_accuracy: 0.8305
Epoch 10/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3938 - accuracy: 0.8305 - val_loss: 0.3852 - val_accuracy: 0.8310
Epoch 11/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3801 - accuracy: 0.8389 - val_loss: 0.3747 - val_accuracy: 0.8425
Epoch 12/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3706 - accuracy: 0.8414 - val_loss: 0.3720 - val_accuracy: 0.8480
Epoch 13/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3659 - accuracy: 0.8445 - val_loss: 0.3629 - val_accuracy: 0.8525
Epoch 14/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3582 - accuracy: 0.8495 - val_loss: 0.3494 - val_accuracy: 0.8585
Epoch 15/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3532 - accuracy: 0.8528 - val_loss: 0.3494 - val_accuracy: 0.8570
Epoch 16/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3497 - accuracy: 0.8574 - val_loss: 0.3454 - val_accuracy: 0.8580
Epoch 17/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3491 - accuracy: 0.8570 - val_loss: 0.3505 - val_accuracy: 0.8540
Epoch 18/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3459 - accuracy: 0.8555 - val_loss: 0.3452 - val_accuracy: 0.8590
Epoch 19/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3425 - accuracy: 0.8596 - val_loss: 0.3540 - val_accuracy: 0.8510
Epoch 20/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3410 - accuracy: 0.8618 - val_loss: 0.3452 - val_accuracy: 0.8595
Epoch 21/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3393 - accuracy: 0.8615 - val_loss: 0.3505 - val_accuracy: 0.8565
Epoch 22/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3407 - accuracy: 0.8625 - val_loss: 0.3435 - val_accuracy: 0.8570
Epoch 23/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3407 - accuracy: 0.8608 - val_loss: 0.3405 - val_accuracy: 0.8585
Epoch 24/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3379 - accuracy: 0.8621 - val_loss: 0.3460 - val_accuracy: 0.8545
Epoch 25/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3366 - accuracy: 0.8625 - val_loss: 0.3437 - val_accuracy: 0.8580
In [31]:
model.evaluate(X_test,y_test)
63/63 [==============================] - 0s 1ms/step - loss: 0.3437 - accuracy: 0.8580
Out[31]:
[0.3437216281890869, 0.8579999804496765]
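The `250/250` and `63/63` counters in the logs are batches per pass, not rows. Below is a small sketch of where they come from, assuming the Keras default batch size of 32 (no `batch_size` is passed to `fit` or `evaluate` here) and the 80/20 split of the 10,000 rows.

```python
import math

BATCH_SIZE = 32      # assumption: Keras default when batch_size is not given
n_rows = 10000       # rows in the dataset
n_test = n_rows // 5 # test_size = 0.2
n_train = n_rows - n_test

train_steps = math.ceil(n_train / BATCH_SIZE)  # batches per training epoch
test_steps = math.ceil(n_test / BATCH_SIZE)    # batches per evaluation pass
print(train_steps, test_steps)
```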

Graph Plotting

In [32]:
plt.plot(history.history['accuracy'])
plt.title('Train model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.show()
In [33]:
plt.plot(history.history['loss'])
plt.title('Train Model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()
In [34]:
plt.plot(history.history['val_accuracy'])
plt.title('Val model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.show()
In [35]:
plt.plot(history.history['val_loss'])
plt.title('Val model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()
In [36]:
loss_train = history.history['loss']
loss_val = history.history['val_loss']
plt.plot(loss_train, 'g', label='Training loss')
plt.plot(loss_val, 'b', label='validation loss')
plt.title('Training and Validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
In [37]:
accuracy_train = history.history['accuracy']
accuracy_val = history.history['val_accuracy']
plt.plot(accuracy_train, 'g', label='Training accuracy')
plt.plot(accuracy_val, 'b', label='Validation accuracy')
plt.title('Training and Validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
In [38]:
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5)  # label a customer as churning when the predicted probability exceeds 0.5
In [39]:
from sklearn.metrics import confusion_matrix, accuracy_score
a=confusion_matrix(y_test, y_pred)
In [40]:
print(a)
[[1530   65]
 [ 219  186]]
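Accuracy alone hides how the errors are distributed, so it helps to read precision and recall off the matrix. Below is a minimal sketch using the four counts printed above, assuming the scikit-learn layout (rows are true labels, columns are predicted labels, class 1 = exited).

```python
# Confusion matrix printed above: [[TN, FP], [FN, TP]]
cm = [[1530, 65], [219, 186]]
(tn, fp), (fn, tp) = cm

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the customers flagged as churners, how many actually left
recall = tp / (tp + fn)      # of the customers who left, how many were flagged
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The accuracy agrees with the value reported by `model.evaluate`, while the recall shows that fewer than half of the customers who actually left are caught at the 0.5 threshold.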
In [41]:
m = confusion_matrix(y_test, y_pred)
a= pd.DataFrame(m, index=[0, 1], columns=[0, 1])
figure = plt.figure(figsize=(10, 10))
sns.heatmap(a, annot=True)
Out[41]:
<AxesSubplot:>

Prediction

In [42]:
country = input("Please enter the country of the customer:")
credit = input("Please enter the credit score of the customer:")
gender= input("Please enter the gender of the customer:")
age= input("Please enter the age of the customer:")
tenure = input("Please enter the tenure of the customer:")
balance = input("Please enter the balance of the customer:")
products = input("Please enter the number of products of the customer:")
card = input("Please enter if customer has a credit card:")
member = input("Please enter if customer is an active member:")
salary = input("Please enter the salary of the customer:")
In [43]:
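will_leave = model.predict(sc.transform([[1, 0, 0, 699, 0, 39, 1, 0, 2, 0, 0, 93826.63]])) > 0.5
print(will_leave)
print("------------This customer will leave the bank------------" if will_leave[0][0]
      else "------------This customer will not leave the bank------------")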
[[False]]
------------This customer will not leave the bank------------
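The values collected with `input()` in cell 42 are strings and are not passed to the model; cell 43 scores a hard-coded row instead. Below is a minimal sketch of how the raw answers could be turned into a row with the column order the scaler and model expect. `build_feature_row` is a hypothetical helper, and it assumes the gender is typed as `Female` or `Male` and the credit card and active member answers as `0` or `1`.

```python
COUNTRIES = ["France", "Germany", "Spain"]  # alphabetical, matching the one-hot column order


def build_feature_row(country, credit, gender, age, tenure, balance,
                      products, card, member, salary):
    """Convert raw string answers into the 12-value row used for prediction."""
    one_hot = [1.0 if country == c else 0.0 for c in COUNTRIES]
    if sum(one_hot) != 1.0:
        raise ValueError(f"unknown country: {country!r}")
    gender_code = {"Female": 0, "Male": 1}[gender]  # same encoding as the Gender column
    return one_hot + [int(credit), gender_code, int(age), int(tenure), float(balance),
                      int(products), int(card), int(member), float(salary)]


# Same customer as the hard-coded example in cell 43
row = build_feature_row("France", "699", "Female", "39", "1", "0", "2", "0", "0", "93826.63")
print(row)
# The row could then be scored with: model.predict(sc.transform([row])) > 0.5
```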

Predicting which customers will leave the bank

In [44]:
y_pred[0:100]   
Out[44]:
array([[False],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [ True],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False]])
Among the first 100 customers of the test set, the first one predicted to leave the bank is the 6th (index 5), followed by the 10th, 15th and 16th, and so on.
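The positions of the predicted churners can be read off programmatically rather than by eye. Below is a minimal sketch over the first 16 flags of the output above, copied into a plain list; with the real array, the same comprehension could be run over `y_pred[:100].ravel()`.

```python
# First 16 predictions from the output above (True = predicted to leave)
first_16 = [False] * 5 + [True] + [False] * 3 + [True] + [False] * 4 + [True, True]

# 1-based positions of the customers predicted to leave
churn_positions = [i + 1 for i, flag in enumerate(first_16) if flag]
print(churn_positions)
```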
In [45]:
model.save("churn_model_new.h5")

deepCC

In [46]:
!deepCC churn_model_new.h5
[INFO]
Reading [keras model] 'churn_model_new.h5'
[SUCCESS]
Saved 'churn_model_new_deepC/churn_model_new.onnx'
[INFO]
Reading [onnx model] 'churn_model_new_deepC/churn_model_new.onnx'
[INFO]
Model info:
  ir_vesion : 4
  doc       : 
[WARNING]
[ONNX]: terminal (input/output) dense_input's shape is less than 1. Changing it to 1.
[WARNING]
[ONNX]: terminal (input/output) dense_7's shape is less than 1. Changing it to 1.
WARN (GRAPH): found operator node with the same name (dense_7) as io node.
[INFO]
Running DNNC graph sanity check ...
[SUCCESS]
Passed sanity check.
[INFO]
Writing C++ file 'churn_model_new_deepC/churn_model_new.cpp'
[INFO]
deepSea model files are ready in 'churn_model_new_deepC/' 
[RUNNING COMMAND]
g++ -std=c++11 -O3 -fno-rtti -fno-exceptions -I. -I/opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/include -isystem /opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/packages/eigen-eigen-323c052e1731 "churn_model_new_deepC/churn_model_new.cpp" -D_AITS_MAIN -o "churn_model_new_deepC/churn_model_new.exe"
[RUNNING COMMAND]
size "churn_model_new_deepC/churn_model_new.exe"
   text	   data	    bss	    dec	    hex	filename
 162419	   2968	    760	 166147	  28903	churn_model_new_deepC/churn_model_new.exe
[SUCCESS]
Saved model as executable "churn_model_new_deepC/churn_model_new.exe"
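The `size` report above lists the executable's sections in bytes: `text` (code and read-only data), `data` (initialised variables) and `bss` (zero-initialised variables), with `dec` and `hex` giving their sum in decimal and hexadecimal. A small sketch that checks the figures from the log:

```python
# Section sizes in bytes, copied from the `size` output above
text, data, bss = 162419, 2968, 760

total = text + data + bss
print(total)               # decimal total, the `dec` column
print(format(total, "x"))  # hexadecimal total, the `hex` column
print(round(total / 1024, 1), "KiB")
```

By these figures the compiled model occupies roughly 162 KiB in total.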