Predicting Churn for Bank Customers

!wget -N "https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/project.zip"
!unzip -o project.zip 
!rm project.zip

--2021-07-05 12:04:18--  https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/project.zip
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.158.19
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.158.19|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 268000 (262K) [application/x-zip-compressed]
Saving to: ‘project.zip’

project.zip         100%[===================>] 261.72K  --.-KB/s    in 0.002s  

2021-07-05 12:04:18 (159 MB/s) - ‘project.zip’ saved [268000/268000]

Archive:  project.zip
  inflating: Churn Model.csv

Importing Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

churn=pd.read_csv("Churn Model.csv")

churn

churn.columns

Index(['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography',
       'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard',
       'IsActiveMember', 'EstimatedSalary', 'Exited'],
      dtype='object')

Checking Null values

churn.isnull().sum()  #to check any null values

RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

Creating heat map

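plt.figure(figsize=(14,14))
# Correlate the numeric columns only; Surname, Geography and Gender are still text at this point
sns.heatmap(churn.select_dtypes(include="number").corr(), annot=True, cmap="coolwarm")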

<AxesSubplot:>

Replacing Categorical with Numerical

# Assign the result back instead of calling replace(..., inplace=True) on a column selection
churn['Gender'] = churn['Gender'].map({'Female': 0, 'Male': 1})

churn

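# Features: CreditScore ... EstimatedSalary (RowNumber, CustomerId and Surname are dropped)
X = churn.iloc[:, 3:13].values
# Target: Exited
y = churn.iloc[:, 13].values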

print(X)

[[619 'France' 0 ... 1 1 101348.88]
 [608 'Spain' 0 ... 0 1 112542.58]
 [502 'France' 0 ... 1 0 113931.57]
 ...
 [709 'France' 0 ... 0 1 42085.58]
 [772 'Germany' 1 ... 1 0 92888.52]
 [792 'France' 0 ... 1 0 38190.78]]

print(y)

[1 0 1 ... 1 1 0]

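from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
# Column 2 is Gender, which is already 0/1 after the replacement above, so this leaves it unchanged
X[:, 2] = le.fit_transform(X[:, 2])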

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
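To make the layout of X after this step concrete, here is a minimal sketch on a three-row toy array. The toy rows and the names toy, toy_ct and toy_encoded are assumptions for illustration and are not part of the notebook. It suggests that the one-hot columns come first, with categories in sorted order, followed by the passthrough columns.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy rows in the order [CreditScore, Geography, Gender]; illustrative values only
toy = np.array([[619, 'France', 0],
                [608, 'Spain', 0],
                [772, 'Germany', 1]], dtype=object)

# Same transformer configuration as in the notebook, fitted on the toy data
toy_ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])],
                           remainder='passthrough')
toy_encoded = np.array(toy_ct.fit_transform(toy))

# The learned categories are sorted: France, Germany, Spain
print(toy_ct.named_transformers_['encoder'].categories_[0])
# Each row is [France, Germany, Spain, CreditScore, Gender]
print(toy_encoded)
```

Under this layout, the first three columns of X are the Geography indicators, which is why a French customer starts with 1, 0, 0.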

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

Feature Scaling

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
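The scaler is fitted on the training split only and then reused on the test split. A minimal sketch of what that means, using made-up two-column toy data (toy_train, toy_test and toy_sc are hypothetical names, not from the notebook):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: two features, three training rows and one test row (illustrative only)
toy_train = np.array([[600.0, 30.0], [700.0, 40.0], [800.0, 50.0]])
toy_test = np.array([[700.0, 60.0]])

toy_sc = StandardScaler()
toy_train_scaled = toy_sc.fit_transform(toy_train)  # learns mean and std from the training rows
toy_test_scaled = toy_sc.transform(toy_test)        # reuses the training mean and std

# Equivalent manual computation: subtract the training mean, divide by the training std (ddof=0)
manual = (toy_test - toy_train.mean(axis=0)) / toy_train.std(axis=0)
print(toy_test_scaled)
print(manual)
```

Both printed arrays should agree, which shows that the test rows are scaled with statistics taken from the training rows rather than from the test rows themselves.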

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential()

# Declare the input shape (12 features after encoding) so that model.summary() works before training
model.add(Input(shape=(X_train.shape[1],)))
model.add(Dense(units = 6, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 12, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 24, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 36, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 48, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 54, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 60, kernel_initializer = 'he_uniform', activation = 'relu'))
model.add(Dense(units = 1, kernel_initializer = 'glorot_uniform', activation = 'sigmoid'))

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 6)                 78        
_________________________________________________________________
dense_1 (Dense)              (None, 12)                84        
_________________________________________________________________
dense_2 (Dense)              (None, 24)                312       
_________________________________________________________________
dense_3 (Dense)              (None, 36)                900       
_________________________________________________________________
dense_4 (Dense)              (None, 48)                1776      
_________________________________________________________________
dense_5 (Dense)              (None, 54)                2646      
_________________________________________________________________
dense_6 (Dense)              (None, 60)                3300      
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 61        
=================================================================
Total params: 9,157
Trainable params: 9,157
Non-trainable params: 0
_________________________________________________________________
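The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The sketch below recomputes them, assuming 12 input features (3 one-hot Geography columns plus the 9 remaining features); dense_params is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_units = [6, 12, 24, 36, 48, 54, 60, 1]  # units per Dense layer, as in the model above
n_inputs = 12  # assumed number of features after one-hot encoding

counts = []
for units in layer_units:
    counts.append(dense_params(n_inputs, units))
    n_inputs = units  # the next layer takes this layer's output as its input

print(counts)       # per-layer parameter counts
print(sum(counts))  # total number of parameters
```

The per-layer values reproduce the Param # column (78, 84, 312, 900, 1776, 2646, 3300, 61) and the total of 9,157.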

model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

y

array([1, 0, 1, ..., 1, 1, 0])

X

array([[1.0, 0.0, 0.0, ..., 1, 1, 101348.88],
       [0.0, 0.0, 1.0, ..., 0, 1, 112542.58],
       [1.0, 0.0, 0.0, ..., 1, 0, 113931.57],
       ...,
       [1.0, 0.0, 0.0, ..., 0, 1, 42085.58],
       [0.0, 1.0, 0.0, ..., 1, 0, 92888.52],
       [1.0, 0.0, 0.0, ..., 1, 0, 38190.78]], dtype=object)

churn.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  int64  
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(10), object(2)
memory usage: 1.1+ MB

history= model.fit(X_train,y_train, epochs = 25,validation_data=(X_test, y_test))

Epoch 1/25
250/250 [==============================] - 1s 3ms/step - loss: 0.5060 - accuracy: 0.7915 - val_loss: 0.5140 - val_accuracy: 0.7780
Epoch 2/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4623 - accuracy: 0.8016 - val_loss: 0.4587 - val_accuracy: 0.8095
Epoch 3/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4456 - accuracy: 0.8111 - val_loss: 0.4473 - val_accuracy: 0.8100
Epoch 4/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4359 - accuracy: 0.8146 - val_loss: 0.4329 - val_accuracy: 0.8085
Epoch 5/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4290 - accuracy: 0.8167 - val_loss: 0.4311 - val_accuracy: 0.8130
Epoch 6/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4232 - accuracy: 0.8216 - val_loss: 0.4232 - val_accuracy: 0.8185
Epoch 7/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4158 - accuracy: 0.8245 - val_loss: 0.4142 - val_accuracy: 0.8240
Epoch 8/25
250/250 [==============================] - 0s 2ms/step - loss: 0.4085 - accuracy: 0.8219 - val_loss: 0.4080 - val_accuracy: 0.8270
Epoch 9/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3994 - accuracy: 0.8270 - val_loss: 0.3966 - val_accuracy: 0.8305
Epoch 10/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3938 - accuracy: 0.8305 - val_loss: 0.3852 - val_accuracy: 0.8310
Epoch 11/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3801 - accuracy: 0.8389 - val_loss: 0.3747 - val_accuracy: 0.8425
Epoch 12/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3706 - accuracy: 0.8414 - val_loss: 0.3720 - val_accuracy: 0.8480
Epoch 13/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3659 - accuracy: 0.8445 - val_loss: 0.3629 - val_accuracy: 0.8525
Epoch 14/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3582 - accuracy: 0.8495 - val_loss: 0.3494 - val_accuracy: 0.8585
Epoch 15/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3532 - accuracy: 0.8528 - val_loss: 0.3494 - val_accuracy: 0.8570
Epoch 16/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3497 - accuracy: 0.8574 - val_loss: 0.3454 - val_accuracy: 0.8580
Epoch 17/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3491 - accuracy: 0.8570 - val_loss: 0.3505 - val_accuracy: 0.8540
Epoch 18/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3459 - accuracy: 0.8555 - val_loss: 0.3452 - val_accuracy: 0.8590
Epoch 19/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3425 - accuracy: 0.8596 - val_loss: 0.3540 - val_accuracy: 0.8510
Epoch 20/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3410 - accuracy: 0.8618 - val_loss: 0.3452 - val_accuracy: 0.8595
Epoch 21/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3393 - accuracy: 0.8615 - val_loss: 0.3505 - val_accuracy: 0.8565
Epoch 22/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3407 - accuracy: 0.8625 - val_loss: 0.3435 - val_accuracy: 0.8570
Epoch 23/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3407 - accuracy: 0.8608 - val_loss: 0.3405 - val_accuracy: 0.8585
Epoch 24/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3379 - accuracy: 0.8621 - val_loss: 0.3460 - val_accuracy: 0.8545
Epoch 25/25
250/250 [==============================] - 0s 2ms/step - loss: 0.3366 - accuracy: 0.8625 - val_loss: 0.3437 - val_accuracy: 0.8580

model.evaluate(X_test,y_test)

63/63 [==============================] - 0s 1ms/step - loss: 0.3437 - accuracy: 0.8580

[0.3437216281890869, 0.8579999804496765]

Graph Plotting

plt.plot(history.history['accuracy'])
plt.title('Train Model Accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.show()

plt.plot(history.history['loss'])
plt.title('Train Model Loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()

plt.plot(history.history['val_accuracy'])
plt.title('Val Model Accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.show()

plt.plot(history.history['val_loss'])
plt.title('Val Model Loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()

loss_train = history.history['loss']
loss_val = history.history['val_loss']
plt.plot(loss_train, 'g', label='Training loss')
plt.plot(loss_val, 'b', label='validation loss')
plt.title('Training and Validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

accuracy_train = history.history['accuracy']
accuracy_val = history.history['val_accuracy']
plt.plot(accuracy_train, 'g', label='Training accuracy')
plt.plot(accuracy_val, 'b', label='Validation accuracy')
plt.title('Training and Validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

y_pred = model.predict(X_test)
y_pred = (y_pred>0.5) #Threshold value

from sklearn.metrics import confusion_matrix, accuracy_score
a=confusion_matrix(y_test, y_pred)

print(a)

[[1530   65]
 [ 219  186]]
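The confusion matrix carries more information than the accuracy alone. As a rough sketch, the usual metrics can be derived from the four printed counts, assuming scikit-learn's convention that rows are true labels and columns are predicted labels (the variable names below are illustrative):

```python
import numpy as np

# Counts copied from the printed confusion matrix
cm = np.array([[1530, 65],
               [219, 186]])

# ravel() on a 2x2 matrix yields: true negatives, false positives, false negatives, true positives
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()  # share of all test customers classified correctly
precision = tp / (tp + fp)       # share of predicted churners who actually churned
recall = tp / (tp + fn)          # share of actual churners the model caught

print(round(accuracy, 3), round(precision, 3), round(recall, 3))
```

The accuracy matches the 0.858 reported by model.evaluate, while the recall of roughly 0.46 indicates that 219 of the 405 actual churners in the test set are missed at the 0.5 threshold.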

m = confusion_matrix(y_test, y_pred)
a = pd.DataFrame(m, index=[0, 1], columns=[0, 1])
figure = plt.figure(figsize=(10, 10))
# fmt='d' shows the counts as integers instead of scientific notation
sns.heatmap(a, annot=True, fmt='d')

<AxesSubplot:>

Prediction

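country = input("Please enter the country of the customer (France/Germany/Spain):")
credit = int(input("Please enter the credit score of the customer:"))
gender = input("Please enter the gender of the customer (Female/Male):")
age = int(input("Please enter the age of the customer:"))
tenure = int(input("Please enter the tenure of the customer:"))
balance = float(input("Please enter the balance of the customer:"))
products = int(input("Please enter the number of products of the customer:"))
card = int(input("Please enter if customer has a credit card (1/0):"))
member = int(input("Please enter if customer is an active member (1/0):"))
salary = float(input("Please enter the salary of the customer:"))

# Build one row in the same column order as X before encoding (Female=0, Male=1)
customer = np.array([[credit, country, 0 if gender == 'Female' else 1, age, tenure,
                      balance, products, card, member, salary]], dtype=object)
# Example: France, 699, Female, 39, 1, 0, 2, 0, 0, 93826.63 encodes to
# [1, 0, 0, 699, 0, 39, 1, 0, 2, 0, 0, 93826.63]
will_leave = model.predict(sc.transform(ct.transform(customer))) > 0.5
print(will_leave)
if will_leave[0, 0]:
    print("------------This customer will leave the bank------------")
else:
    print("------------This customer will not leave the bank------------")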

[[False]]
------------This customer will not leave the bank------------

Predicted which customers will leave the bank

y_pred[0:100]

array([[False],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [ True],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [ True],
       [False],
       [False],
       [False],
       [False],
       [False],
       [False]])

Among the first 100 test-set predictions shown above, the first customer predicted to leave the bank is the 6th one (index 5), followed by the 10th, 15th and 16th.
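Rather than reading positions off the printed array, they can be computed. A minimal sketch, using only the first 16 predictions copied from the output above (preds and positions are illustrative names; in the notebook the same call could be applied to y_pred):

```python
import numpy as np

# First 16 predictions from the output above, in order
preds = np.array([[False]] * 5 + [[True]] + [[False]] * 3 + [[True]]
                 + [[False]] * 4 + [[True]] * 2)

# flatnonzero returns the 0-based indices of the True entries; add 1 for 1-based positions
positions = np.flatnonzero(preds.ravel()) + 1
print(positions)
```

For these 16 values the positions are 6, 10, 15 and 16, so the 6th test customer is the first one predicted to leave.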

model.save("churn_model_new.h5")

deepCC

!deepCC churn_model_new.h5

[INFO]
Reading [keras model] 'churn_model_new.h5'
[SUCCESS]
Saved 'churn_model_new_deepC/churn_model_new.onnx'
[INFO]
Reading [onnx model] 'churn_model_new_deepC/churn_model_new.onnx'
[INFO]
Model info:
  ir_vesion : 4
  doc       : 
[WARNING]
[ONNX]: terminal (input/output) dense_input's shape is less than 1. Changing it to 1.
[WARNING]
[ONNX]: terminal (input/output) dense_7's shape is less than 1. Changing it to 1.
WARN (GRAPH): found operator node with the same name (dense_7) as io node.
[INFO]
Running DNNC graph sanity check ...
[SUCCESS]
Passed sanity check.
[INFO]
Writing C++ file 'churn_model_new_deepC/churn_model_new.cpp'
[INFO]
deepSea model files are ready in 'churn_model_new_deepC/' 
[RUNNING COMMAND]
g++ -std=c++11 -O3 -fno-rtti -fno-exceptions -I. -I/opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/include -isystem /opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/packages/eigen-eigen-323c052e1731 "churn_model_new_deepC/churn_model_new.cpp" -D_AITS_MAIN -o "churn_model_new_deepC/churn_model_new.exe"
[RUNNING COMMAND]
size "churn_model_new_deepC/churn_model_new.exe"
   text	   data	    bss	    dec	    hex	filename
 162419	   2968	    760	 166147	  28903	churn_model_new_deepC/churn_model_new.exe
[SUCCESS]
Saved model as executable "churn_model_new_deepC/churn_model_new.exe"

	RowNumber	CustomerId	Surname	CreditScore	Geography	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	Exited
0	1	15634602	Hargrave	619	France	Female	42	2	0.00	1	1	1	101348.88	1
1	2	15647311	Hill	608	Spain	Female	41	1	83807.86	1	0	1	112542.58	0
2	3	15619304	Onio	502	France	Female	42	8	159660.80	3	1	0	113931.57	1
3	4	15701354	Boni	699	France	Female	39	1	0.00	2	0	0	93826.63	0
4	5	15737888	Mitchell	850	Spain	Female	43	2	125510.82	1	1	1	79084.10	0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
9995	9996	15606229	Obijiaku	771	France	Male	39	5	0.00	2	1	0	96270.64	0
9996	9997	15569892	Johnstone	516	France	Male	35	10	57369.61	1	1	1	101699.77	0
9997	9998	15584532	Liu	709	France	Female	36	7	0.00	1	0	1	42085.58	1
9998	9999	15682355	Sabbatini	772	Germany	Male	42	3	75075.31	2	1	0	92888.52	1
9999	10000	15628319	Walker	792	France	Female	28	4	130142.79	1	1	0	38190.78	0
