Cainvas

Sonar data - Rocks or Mines?

Credit: AITS Cainvas Community

Photo by Dirk Rowe on Dribbble

Sonar (sound navigation and ranging) is a technique based on the principle of reflection of ultrasonic sound waves. These waves propagate through water and are reflected back when they hit the ocean bed or any object obstructing their path.

Sonar has been widely used in submarine navigation, communication with or detection of objects on or under the water surface (such as other vessels), hazard identification, etc.

Neural networks can be used to automate the process of identifying the objects off which the waves bounced.

In [1]:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
from keras.optimizers import Adam, SGD
from keras.callbacks import EarlyStopping
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import confusion_matrix
import random
import matplotlib.pyplot as plt

The dataset

This dataset was used in Gorman, R. P., and Sejnowski, T. J. (1988). “Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets” in Neural Networks, Vol. 1, pp. 75-89.

The CSV file contains sonar signals bounced off a metal cylinder (mine - M) and a roughly cylindrical rock (rock - R) at various angles and under various conditions. Each row holds 60 numeric attributes (the energy within a particular frequency band) followed by the class label.

In [2]:
df = pd.read_csv('https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/sonar.all-data.csv', header = None)
df
Out[2]:
0 1 2 3 4 5 6 7 8 9 ... 51 52 53 54 55 56 57 58 59 60
0 0.0200 0.0371 0.0428 0.0207 0.0954 0.0986 0.1539 0.1601 0.3109 0.2111 ... 0.0027 0.0065 0.0159 0.0072 0.0167 0.0180 0.0084 0.0090 0.0032 R
1 0.0453 0.0523 0.0843 0.0689 0.1183 0.2583 0.2156 0.3481 0.3337 0.2872 ... 0.0084 0.0089 0.0048 0.0094 0.0191 0.0140 0.0049 0.0052 0.0044 R
2 0.0262 0.0582 0.1099 0.1083 0.0974 0.2280 0.2431 0.3771 0.5598 0.6194 ... 0.0232 0.0166 0.0095 0.0180 0.0244 0.0316 0.0164 0.0095 0.0078 R
3 0.0100 0.0171 0.0623 0.0205 0.0205 0.0368 0.1098 0.1276 0.0598 0.1264 ... 0.0121 0.0036 0.0150 0.0085 0.0073 0.0050 0.0044 0.0040 0.0117 R
4 0.0762 0.0666 0.0481 0.0394 0.0590 0.0649 0.1209 0.2467 0.3564 0.4459 ... 0.0031 0.0054 0.0105 0.0110 0.0015 0.0072 0.0048 0.0107 0.0094 R
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
203 0.0187 0.0346 0.0168 0.0177 0.0393 0.1630 0.2028 0.1694 0.2328 0.2684 ... 0.0116 0.0098 0.0199 0.0033 0.0101 0.0065 0.0115 0.0193 0.0157 M
204 0.0323 0.0101 0.0298 0.0564 0.0760 0.0958 0.0990 0.1018 0.1030 0.2154 ... 0.0061 0.0093 0.0135 0.0063 0.0063 0.0034 0.0032 0.0062 0.0067 M
205 0.0522 0.0437 0.0180 0.0292 0.0351 0.1171 0.1257 0.1178 0.1258 0.2529 ... 0.0160 0.0029 0.0051 0.0062 0.0089 0.0140 0.0138 0.0077 0.0031 M
206 0.0303 0.0353 0.0490 0.0608 0.0167 0.1354 0.1465 0.1123 0.1945 0.2354 ... 0.0086 0.0046 0.0126 0.0036 0.0035 0.0034 0.0079 0.0036 0.0048 M
207 0.0260 0.0363 0.0136 0.0272 0.0214 0.0338 0.0655 0.1400 0.1843 0.2354 ... 0.0146 0.0129 0.0047 0.0039 0.0061 0.0040 0.0036 0.0061 0.0115 M

208 rows × 61 columns

In [3]:
# The spread of labels in the dataframe

df[60].value_counts()
Out[3]:
M    111
R     97
Name: 60, dtype: int64

This is a fairly balanced dataset.
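For a quick visual check, the label counts can also be plotted; a minimal sketch using the already-imported matplotlib (not part of the original notebook output):

# bar plot of the number of samples per class (M vs R)
df[60].value_counts().plot(kind = 'bar', title = 'Class distribution')
plt.show()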

Preprocessing

Categorical features

The class attribute uses R and M to denote the two classes. These have to be converted into numeric values (here, M is mapped to 1 and R to 0).

In [4]:
df[60] = (df[60] == 'M').astype('int')
df
Out[4]:
0 1 2 3 4 5 6 7 8 9 ... 51 52 53 54 55 56 57 58 59 60
0 0.0200 0.0371 0.0428 0.0207 0.0954 0.0986 0.1539 0.1601 0.3109 0.2111 ... 0.0027 0.0065 0.0159 0.0072 0.0167 0.0180 0.0084 0.0090 0.0032 0
1 0.0453 0.0523 0.0843 0.0689 0.1183 0.2583 0.2156 0.3481 0.3337 0.2872 ... 0.0084 0.0089 0.0048 0.0094 0.0191 0.0140 0.0049 0.0052 0.0044 0
2 0.0262 0.0582 0.1099 0.1083 0.0974 0.2280 0.2431 0.3771 0.5598 0.6194 ... 0.0232 0.0166 0.0095 0.0180 0.0244 0.0316 0.0164 0.0095 0.0078 0
3 0.0100 0.0171 0.0623 0.0205 0.0205 0.0368 0.1098 0.1276 0.0598 0.1264 ... 0.0121 0.0036 0.0150 0.0085 0.0073 0.0050 0.0044 0.0040 0.0117 0
4 0.0762 0.0666 0.0481 0.0394 0.0590 0.0649 0.1209 0.2467 0.3564 0.4459 ... 0.0031 0.0054 0.0105 0.0110 0.0015 0.0072 0.0048 0.0107 0.0094 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
203 0.0187 0.0346 0.0168 0.0177 0.0393 0.1630 0.2028 0.1694 0.2328 0.2684 ... 0.0116 0.0098 0.0199 0.0033 0.0101 0.0065 0.0115 0.0193 0.0157 1
204 0.0323 0.0101 0.0298 0.0564 0.0760 0.0958 0.0990 0.1018 0.1030 0.2154 ... 0.0061 0.0093 0.0135 0.0063 0.0063 0.0034 0.0032 0.0062 0.0067 1
205 0.0522 0.0437 0.0180 0.0292 0.0351 0.1171 0.1257 0.1178 0.1258 0.2529 ... 0.0160 0.0029 0.0051 0.0062 0.0089 0.0140 0.0138 0.0077 0.0031 1
206 0.0303 0.0353 0.0490 0.0608 0.0167 0.1354 0.1465 0.1123 0.1945 0.2354 ... 0.0086 0.0046 0.0126 0.0036 0.0035 0.0034 0.0079 0.0036 0.0048 1
207 0.0260 0.0363 0.0136 0.0272 0.0214 0.0338 0.0655 0.1400 0.1843 0.2354 ... 0.0146 0.0129 0.0047 0.0039 0.0061 0.0040 0.0036 0.0061 0.0115 1

208 rows × 61 columns

In [5]:
# Storing the class names corresponding to the label indices (0 -> Rocks, 1 -> Mines) for reference later.

class_names = ['Rocks', 'Mines']

Balancing dataset

Even though the difference is only 14 samples, it is significant relative to the 208 samples available, so the dataset is balanced by upsampling the minority class.

In [6]:
# separating into 2 dataframes, one for each class 

df0 = df[df[60] == 0]
df1 = df[df[60] == 1]
In [7]:
print("Number of samples in:")
print("Class label 0 - ", len(df0))
print("Class label 1 - ", len(df1))

# Upsampling 

df0 = df0.sample(len(df1), replace = True)    # replace = True enables sampling with replacement (upsampling)

print('\nAfter resampling - ')

print("Number of samples in:")
print("Class label 0 - ", len(df0))
print("Class label 1 - ", len(df1))
Number of samples in:
Class label 0 -  97
Class label 1 -  111

After resampling - 
Number of samples in:
Class label 0 -  111
Class label 1 -  111
In [8]:
# concatenate to form a single dataframe
# (DataFrame.append is deprecated in newer pandas; pd.concat([df1, df0]) is equivalent)

df = df1.append(df0)

print('Total number of samples - ', len(df))
Total number of samples -  222
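As an aside, the same upsampling can be written with sklearn's resample utility; a minimal sketch, assuming df0 and df1 are the per-class dataframes before resampling (the random_state here is illustrative):

from sklearn.utils import resample

# upsample the minority class to the size of the majority class, then recombine
df0_upsampled = resample(df0, replace = True, n_samples = len(df1), random_state = 2)
df_balanced = pd.concat([df1, df0_upsampled])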
In [9]:
# defining the input and output columns to separate the dataset in the later cells.

input_columns = list(df.columns[:-1]) 
output_columns = [df.columns[-1]]

print("Number of input columns: ", len(input_columns))
#print("Input columns: ", ', '.join(input_columns))

print("Number of output columns: ", len(output_columns))
#print("Output columns: ", ', '.join(output_columns))
Number of input columns:  60
Number of output columns:  1

Train - val split

In [10]:
# Splitting into train and val set -- 90-10 split

train_df, val_df = train_test_split(df, test_size = 0.1, random_state = 2)

print("Number of samples in...")
print("Training set: ", len(train_df))
print("Validation set: ", len(val_df))
Number of samples in...
Training set:  199
Validation set:  23
In [11]:
# Looking into the spread of values in the train and val sets

print("Training - ")
print(train_df[60].value_counts())

print("\nValidation - ")
print(val_df[60].value_counts())
Training - 
1    100
0     99
Name: 60, dtype: int64

Validation - 
0    12
1    11
Name: 60, dtype: int64
In [12]:
# Splitting into X (input) and y (output)

Xtrain, ytrain = np.array(train_df[input_columns]), np.array(train_df[output_columns])

Xval, yval = np.array(val_df[input_columns]), np.array(val_df[output_columns])

Standardization

In [13]:
df.describe()
Out[13]:
0 1 2 3 4 5 6 7 8 9 ... 51 52 53 54 55 56 57 58 59 60
count 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 ... 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.000000 222.00000
mean 0.028412 0.035416 0.041039 0.052488 0.075295 0.107186 0.122463 0.132999 0.176829 0.204480 ... 0.013050 0.010248 0.010601 0.009070 0.008181 0.008009 0.007409 0.007484 0.006115 0.50000
std 0.022135 0.031228 0.036280 0.047117 0.057561 0.060715 0.061606 0.090838 0.119279 0.136543 ... 0.009367 0.007131 0.007366 0.007014 0.005718 0.005732 0.006147 0.005580 0.004703 0.50113
min 0.001500 0.000600 0.001500 0.006100 0.006700 0.010200 0.013000 0.005500 0.011700 0.011300 ... 0.000800 0.000500 0.001000 0.000600 0.000400 0.000700 0.000300 0.000100 0.000600 0.00000
25% 0.013500 0.014600 0.017100 0.023200 0.038400 0.068175 0.084150 0.073875 0.094925 0.103600 ... 0.007225 0.004800 0.004950 0.003900 0.004250 0.003700 0.003500 0.003900 0.003125 0.00000
50% 0.021600 0.027900 0.032500 0.039650 0.061200 0.092900 0.105600 0.111250 0.146250 0.177300 ... 0.010800 0.008150 0.008800 0.007250 0.006800 0.006050 0.005550 0.006250 0.005300 0.50000
75% 0.035050 0.044550 0.053750 0.063150 0.098200 0.142050 0.153900 0.168400 0.230200 0.269300 ... 0.016200 0.013600 0.014350 0.012100 0.010800 0.010875 0.009825 0.009000 0.007650 1.00000
max 0.137100 0.233900 0.305900 0.426400 0.401000 0.382300 0.372900 0.459000 0.682800 0.710600 ... 0.070900 0.039000 0.035200 0.044700 0.039400 0.035500 0.044000 0.036400 0.043900 1.00000

8 rows × 61 columns

The attributes span roughly the same range, but the small differences between columns shift their means, so the features are standardized to zero mean and unit variance before training.

In [14]:
# Using standard scaler to standardize them to values with mean = 0 and variance = 1.

standard_scaler = StandardScaler()

# Fit on training set alone
Xtrain = standard_scaler.fit_transform(Xtrain)

# Use it to transform val and test input
Xval = standard_scaler.transform(Xval)
#Xtest = standard_scaler.transform(Xtest)
In [15]:
pd.DataFrame(Xtrain).describe()
Out[15]:
0 1 2 3 4 5 6 7 8 9 ... 50 51 52 53 54 55 56 57 58 59
count 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 ... 1.990000e+02 1.990000e+02 1.990000e+02 199.000000 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02 1.990000e+02
mean 1.829915e-16 -6.136911e-18 -1.338962e-16 -3.570567e-17 8.926416e-17 3.213510e-16 3.280458e-16 -3.570567e-16 6.248491e-17 1.428227e-16 ... -3.570567e-16 -1.696019e-16 5.355850e-17 0.000000 -2.276236e-16 -4.909529e-17 1.807599e-16 -1.071170e-16 -2.677925e-17 2.298552e-16
std 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 ... 1.002522e+00 1.002522e+00 1.002522e+00 1.002522 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00 1.002522e+00
min -1.240020e+00 -1.106128e+00 -1.131181e+00 -9.818601e-01 -1.170349e+00 -1.585533e+00 -1.815580e+00 -1.430497e+00 -1.400825e+00 -1.368609e+00 ... -1.285838e+00 -1.295750e+00 -1.358924e+00 -1.247771 -1.232853e+00 -1.301903e+00 -1.308985e+00 -1.174772e+00 -1.359696e+00 -1.186238e+00
25% -6.854429e-01 -6.886172e-01 -6.533979e-01 -6.222756e-01 -6.654813e-01 -6.624383e-01 -6.410592e-01 -6.819884e-01 -7.021816e-01 -7.547718e-01 ... -6.827978e-01 -6.418061e-01 -7.874527e-01 -0.796976 -7.519414e-01 -6.767385e-01 -7.758174e-01 -6.461203e-01 -5.967326e-01 -6.241377e-01
50% -2.775306e-01 -2.263156e-01 -2.501817e-01 -2.668361e-01 -2.365967e-01 -2.663528e-01 -2.435037e-01 -2.382414e-01 -2.217331e-01 -1.974295e-01 ... -1.551373e-01 -2.410019e-01 -2.724230e-01 -0.236896 -2.564568e-01 -2.252304e-01 -3.492831e-01 -2.991928e-01 -1.789194e-01 -2.077669e-01
75% 3.091299e-01 2.711784e-01 3.836411e-01 3.103645e-01 3.890007e-01 5.531631e-01 5.087175e-01 3.913769e-01 4.437614e-01 4.609515e-01 ... 3.808987e-01 3.338356e-01 4.048764e-01 0.521261 4.430508e-01 3.825689e-01 5.126718e-01 4.194427e-01 2.661426e-01 2.918780e-01
max 4.974914e+00 6.316291e+00 7.275600e+00 7.728998e+00 5.487493e+00 4.524791e+00 4.121359e+00 3.632125e+00 4.231122e+00 3.638418e+00 ... 7.123227e+00 6.098032e+00 4.073581e+00 3.355810 5.193873e+00 5.435986e+00 4.875763e+00 6.044624e+00 5.234487e+00 7.828189e+00

8 rows × 60 columns

The means are (almost) 0.
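A quick numeric check that the standardization behaved as expected; a minimal sketch (not part of the original output):

# column means should be ~0 and (population) standard deviations ~1 after StandardScaler
print(np.allclose(Xtrain.mean(axis = 0), 0, atol = 1e-10))
print(np.allclose(Xtrain.std(axis = 0), 1))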

The model

In [16]:
model = Sequential([
    Dense(64, activation = 'relu', input_shape = Xtrain[0].shape),
    Dense(32, activation = 'relu'),
    Dense(16, activation = 'relu'),
    Dense(1, activation = 'sigmoid')
])

# stop training when val_loss has not improved for 5 epochs and restore the best weights seen
cb = [EarlyStopping(monitor = 'val_loss', patience = 5, restore_best_weights = True)]
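The layer shapes and parameter counts can be inspected before training with Keras's built-in summary (output omitted here):

model.summary()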
In [17]:
model.compile(optimizer=Adam(0.01), loss='binary_crossentropy', metrics=['accuracy'])

history1 = model.fit(Xtrain, ytrain, validation_data = (Xval, yval), epochs=16, callbacks = cb)
Epoch 1/16
7/7 [==============================] - 0s 24ms/step - loss: 0.5833 - accuracy: 0.6231 - val_loss: 0.4189 - val_accuracy: 0.8696
Epoch 2/16
7/7 [==============================] - 0s 3ms/step - loss: 0.2529 - accuracy: 0.9246 - val_loss: 0.2899 - val_accuracy: 0.8696
Epoch 3/16
7/7 [==============================] - 0s 3ms/step - loss: 0.1194 - accuracy: 0.9598 - val_loss: 0.1452 - val_accuracy: 0.9130
Epoch 4/16
7/7 [==============================] - 0s 3ms/step - loss: 0.0540 - accuracy: 0.9849 - val_loss: 0.1530 - val_accuracy: 0.9565
Epoch 5/16
7/7 [==============================] - 0s 4ms/step - loss: 0.0305 - accuracy: 0.9899 - val_loss: 0.1429 - val_accuracy: 0.9565
Epoch 6/16
7/7 [==============================] - 0s 3ms/step - loss: 0.0126 - accuracy: 1.0000 - val_loss: 0.1531 - val_accuracy: 0.9565
Epoch 7/16
7/7 [==============================] - 0s 3ms/step - loss: 0.0025 - accuracy: 1.0000 - val_loss: 0.1639 - val_accuracy: 0.9130
Epoch 8/16
7/7 [==============================] - 0s 3ms/step - loss: 8.3167e-04 - accuracy: 1.0000 - val_loss: 0.1869 - val_accuracy: 0.9130
Epoch 9/16
7/7 [==============================] - 0s 3ms/step - loss: 4.1088e-04 - accuracy: 1.0000 - val_loss: 0.2094 - val_accuracy: 0.9130
Epoch 10/16
7/7 [==============================] - 0s 4ms/step - loss: 2.7497e-04 - accuracy: 1.0000 - val_loss: 0.2253 - val_accuracy: 0.9130
The model is then trained further with a lower learning rate (0.001) to fine-tune the weights.

In [18]:
model.compile(optimizer=Adam(0.001), loss='binary_crossentropy', metrics=['accuracy'])

history2 = model.fit(Xtrain, ytrain, validation_data = (Xval, yval), epochs=16, callbacks = cb)
Epoch 1/16
7/7 [==============================] - 0s 19ms/step - loss: 0.0091 - accuracy: 1.0000 - val_loss: 0.1388 - val_accuracy: 0.9565
Epoch 2/16
7/7 [==============================] - 0s 3ms/step - loss: 0.0044 - accuracy: 1.0000 - val_loss: 0.1381 - val_accuracy: 0.9130
Epoch 3/16
7/7 [==============================] - 0s 3ms/step - loss: 0.0029 - accuracy: 1.0000 - val_loss: 0.1475 - val_accuracy: 0.9130
Epoch 4/16
7/7 [==============================] - 0s 3ms/step - loss: 0.0019 - accuracy: 1.0000 - val_loss: 0.1561 - val_accuracy: 0.9130
Epoch 5/16
7/7 [==============================] - 0s 3ms/step - loss: 0.0013 - accuracy: 1.0000 - val_loss: 0.1674 - val_accuracy: 0.9130
Epoch 6/16
7/7 [==============================] - 0s 3ms/step - loss: 9.8171e-04 - accuracy: 1.0000 - val_loss: 0.1739 - val_accuracy: 0.9130
Epoch 7/16
7/7 [==============================] - 0s 4ms/step - loss: 7.6978e-04 - accuracy: 1.0000 - val_loss: 0.1789 - val_accuracy: 0.9130
In [19]:
model.evaluate(Xval, yval)
1/1 [==============================] - 0s 1ms/step - loss: 0.1381 - accuracy: 0.9130
Out[19]:
[0.138086199760437, 0.9130434989929199]
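Since confusion_matrix was imported from sklearn above, it can give a quick look at how the validation predictions break down; a minimal sketch (not part of the original notebook output), thresholding the sigmoid outputs at 0.5:

# predicted probabilities -> hard labels, then a 2x2 confusion matrix (rows: true class, columns: predicted class)
yval_pred = (model.predict(Xval) > 0.5).astype('int').ravel()
print(confusion_matrix(yval.ravel(), yval_pred))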

Plotting the metrics

In [20]:
def plot(history1, history2, variable1, variable2):
    # combining the metric values from both training runs (without mutating the history dicts)
    var1_history = history1[variable1] + history2[variable1]
    var2_history = history1[variable2] + history2[variable2]

    # plotting both curves against the epoch index
    plt.plot(range(len(var1_history)), var1_history)
    plt.plot(range(len(var2_history)), var2_history)
    plt.legend([variable1, variable2])
    plt.title(variable1)
In [21]:
plot(history1.history, history2.history, "accuracy", 'val_accuracy')
In [22]:
plot(history1.history, history2.history, "loss", 'val_loss')

Prediction

In [23]:
# pick a random sample from the validation set
x = random.randint(0, len(Xval) - 1)

output_true = np.array(yval)[x][0]
print("True: ", class_names[output_true])

output = model.predict(Xval[x].reshape(1, -1))[0][0]
pred = int(output > 0.5)    # threshold the sigmoid output at 0.5
print("Predicted: ", class_names[pred], "(", output, "-->", pred, ")")    # picking the label from class_names based on the model output
True:  Mines
Predicted:  Mines ( 0.9999995 --> 1 )

deepC

deepCC compiles the saved Keras model (via ONNX) into a standalone C++ executable, so the classifier can run on edge devices without a Python runtime.

In [24]:
model.save('sonar.h5')

!deepCC sonar.h5
reading [keras model] from 'sonar.h5'
Saved 'sonar.onnx'
reading onnx model from file  sonar.onnx
Model info:
  ir_vesion :  4 
  doc       : 
WARN (ONNX): terminal (input/output) dense_input's shape is less than 1.
             changing it to 1.
WARN (ONNX): terminal (input/output) dense_3's shape is less than 1.
             changing it to 1.
WARN (GRAPH): found operator node with the same name (dense_3) as io node.
running DNNC graph sanity check ... passed.
Writing C++ file  sonar_deepC/sonar.cpp
INFO (ONNX): model files are ready in dir sonar_deepC
g++ -std=c++11 -O3 -I. -I/opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/include -isystem /opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/packages/eigen-eigen-323c052e1731 sonar_deepC/sonar.cpp -o sonar_deepC/sonar.exe
Model executable  sonar_deepC/sonar.exe
In [25]:
# pick a random sample from the validation set
x = random.randint(0, len(Xval) - 1)

output_true = np.array(yval)[x][0]
print("True: ", class_names[output_true])

np.savetxt('sample.data', Xval[x])

# run exe with input
!sonar_deepC/sonar.exe sample.data

# show predicted output
nn_out = np.loadtxt('dense_3.out')

pred = int(nn_out > 0.5)    # threshold the output at 0.5
print("Predicted: ", class_names[pred], "(", nn_out, "-->", pred, ")")    # picking the label from class_names based on the model output
True:  Rocks
reading file sample.data.
writing file dense_3.out.
Predicted:  Rocks ( 2.99045e-06 --> 0 )
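As a final sanity check, the compiled model's output can be compared against the original Keras model on the same sample; a minimal sketch, assuming x and nn_out from the cell above are still in scope:

# the deepC executable and the Keras model should produce (nearly) the same value for the same input
keras_out = model.predict(Xval[x].reshape(1, -1))[0][0]
print("Keras: ", keras_out, "  deepC: ", nn_out)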