Garbage Classification

Separating recyclable waste materials is the basis for any recycling process.

In many cases, waste is not separated at the household level. A deep learning model can be used to separate the different components of garbage automatically and efficiently at recovery facilities or biological treatment systems.

This notebook classifies waste into 5 categories: glass, cardboard, paper, plastic and metal.

import numpy as np
import os
import tensorflow as tf
import tensorflow.keras
import matplotlib.pyplot as plt
from tensorflow.keras import layers, models, optimizers
import random
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from PIL import Image

Dataset

The dataset has 2 folders - train and test.

Each of these has one sub-folder per class. The sub-folder name is the class label, and every image inside it belongs to that class.

!wget https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/Garbage_classification.zip

!unzip -qo Garbage_classification.zip

# zip folder is not needed anymore
!rm Garbage_classification.zip

--2021-09-08 07:23:12--  https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/Garbage_classification.zip
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.64.92
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.64.92|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 42074220 (40M) [application/zip]
Saving to: ‘Garbage_classification.zip’

Garbage_classificat 100%[===================>]  40.12M  69.7MB/s    in 0.6s    

2021-09-08 07:23:13 (69.7 MB/s) - ‘Garbage_classification.zip’ saved [42074220/42074220]

# Loading the dataset

path = 'Garbage classification/'

batch = 32

# The train and test datasets
print("Train dataset")
train_ds = tf.keras.preprocessing.image_dataset_from_directory(path+'train', batch_size=batch)

print("Test dataset")
test_ds = tf.keras.preprocessing.image_dataset_from_directory(path+'test', batch_size=batch)

Train dataset
Found 1910 files belonging to 5 classes.
Test dataset
Found 480 files belonging to 5 classes.

# How many samples in each class

for t in ['train', 'test']:
    print('\n', t.upper())
    for x in os.listdir(path + t):
        print(x, ' - ', len(os.listdir(path + t + '/' + x)))

 TRAIN
glass  -  400
cardboard  -  322
plastic  -  385
paper  -  475
metal  -  328

 TEST
glass  -  101
cardboard  -  81
plastic  -  97
paper  -  119
metal  -  82
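The counts above can be turned into a few summary numbers. The sketch below copies the printed counts by hand (it does not read the folders) and computes the train/test split, the share of each class, and the accuracy of a trivial classifier that always predicts the largest test class, which is a useful floor when reading the model's accuracy later.

```python
# Counts copied from the listing above (train and test folders).
train_counts = {"glass": 400, "cardboard": 322, "plastic": 385, "paper": 475, "metal": 328}
test_counts = {"glass": 101, "cardboard": 81, "plastic": 97, "paper": 119, "metal": 82}

train_total = sum(train_counts.values())
test_total = sum(test_counts.values())

# Fraction of all images that sit in the train folder.
train_fraction = train_total / (train_total + test_total)

# Share of each class within the train folder.
train_share = {name: count / train_total for name, count in train_counts.items()}

# Accuracy of a trivial classifier that always predicts the largest test class.
majority_baseline = max(test_counts.values()) / test_total

print(train_total, test_total)
print(f"train fraction: {train_fraction:.3f}")
for name, share in sorted(train_share.items()):
    print(f"{name:10s} {share:.3f}")
print(f"majority-class baseline on test: {majority_baseline:.3f}")
```

The classes are mildly imbalanced (paper is the largest, cardboard the smallest), and the split is close to 80/20.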

# Looking into the class labels

class_names = train_ds.class_names

print("Train class names: ", train_ds.class_names)
print("Test class names: ", test_ds.class_names)

Train class names:  ['cardboard', 'glass', 'metal', 'paper', 'plastic']
Test class names:  ['cardboard', 'glass', 'metal', 'paper', 'plastic']
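The class names come out in alphabetical order, unlike the `os.listdir` order seen earlier. When no explicit `class_names` argument is passed, `image_dataset_from_directory` is documented to order the inferred classes alphanumerically, and the position in that list is the integer label. The sketch below mimics that rule with the standard library only; `infer_class_names` is a hypothetical helper written for illustration, not a TensorFlow function.

```python
import os
import tempfile

def infer_class_names(directory):
    """Return sub-folder names in sorted order; the list index is the integer label."""
    return sorted(
        entry for entry in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, entry))
    )

# Build a throwaway folder tree with the same sub-folder names as the dataset.
with tempfile.TemporaryDirectory() as root:
    for name in ["glass", "cardboard", "plastic", "paper", "metal"]:
        os.makedirs(os.path.join(root, name))
    names = infer_class_names(root)

# Map each class name to the integer label a model trained on this order would use.
label_of = {name: index for index, name in enumerate(names)}
print(names)
print(label_of)
```

Because the train and test folders contain the same sub-folder names, both datasets end up with the same name-to-label mapping, which is what the two printed lists confirm.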

# Looking into the shape of the batches and individual samples
# Set the input shape

print("Looking into the shape of images and labels in one batch\n")  

for image_batch, labels_batch in train_ds:
    input_shape = image_batch[0].shape
    print("Shape of images input for one batch: ", image_batch.shape)
    print("Shape of images labels for one batch: ", labels_batch.shape)
    break

Looking into the shape of images and labels in one batch

Shape of images input for one batch:  (32, 256, 256, 3)
Shape of images labels for one batch:  (32,)
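With a batch size of 32, the number of batches per pass over each dataset follows from the file counts reported when the datasets were loaded. A minimal sketch of that arithmetic, assuming the default behaviour of keeping the final partial batch:

```python
import math

batch_size = 32
train_files, test_files = 1910, 480   # counts reported when the datasets were loaded

# Number of batches needed to cover every file once.
train_steps = math.ceil(train_files / batch_size)
test_steps = math.ceil(test_files / batch_size)

# Size of the final batch in each dataset.
last_train_batch = train_files - (train_steps - 1) * batch_size
last_test_batch = test_files - (test_steps - 1) * batch_size

print(train_steps, last_train_batch)
print(test_steps, last_test_batch)
```

The train set does not divide evenly, so its last batch is smaller than 32; the test set divides exactly, so every test batch is full.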

Visualization

num_samples = 4    # the number of samples to be displayed in each class

for x in class_names:
    plt.figure(figsize=(10, 10))

    filenames = os.listdir(path + 'train/' + x)

    for i in range(num_samples):
        ax = plt.subplot(1, num_samples, i + 1)
        img = Image.open(path +'train/' + x + '/' + filenames[i])
        plt.imshow(img)
        plt.title(x)
        plt.axis("off")

Preprocessing the data

# Normalizing the pixel values

normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)

train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
test_ds = test_ds.map(lambda x, y: (normalization_layer(x), y))
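The `Rescaling` layer applies `output = input * scale + offset` to every pixel, with `offset` defaulting to 0. The NumPy sketch below reproduces that formula outside of Keras; `rescale` is a hypothetical stand-in for the layer, not part of any library. It also shows the alternative mapping to [-1, 1], which is the range the Keras documentation describes for Xception's own `preprocess_input`; the notebook uses [0, 1] instead.

```python
import numpy as np

def rescale(images, scale=1.0 / 255, offset=0.0):
    """Apply the same element-wise formula as a Rescaling layer."""
    return images.astype(np.float32) * scale + offset

# Three example pixel values: darkest, mid-grey, brightest.
pixels = np.array([[0, 127.5, 255]], dtype=np.float32)

unit_range = rescale(pixels)                                      # [0, 255] -> [0, 1]
signed_range = rescale(pixels, scale=1.0 / 127.5, offset=-1.0)    # [0, 255] -> [-1, 1]

print(unit_range)
print(signed_range)
```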

The model

base_model = tensorflow.keras.applications.Xception(weights='imagenet', input_shape=input_shape, include_top=False)    # False, do not include the classification layer of the model

base_model.trainable = False

inputs = tf.keras.Input(shape=input_shape)

x = base_model(inputs, training=False)
x = tensorflow.keras.layers.GlobalAveragePooling2D()(x)
outputs = tensorflow.keras.layers.Dense(len(class_names), activation = 'softmax')(x)    # Add own classification layer

model = tensorflow.keras.Model(inputs, outputs)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/xception/xception_weights_tf_dim_ordering_tf_kernels_notop.h5
83689472/83683744 [==============================] - 11s 0us/step

model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizers.Adam(0.001), metrics=['accuracy'])

history = model.fit(train_ds, validation_data =  test_ds, epochs=8)

Epoch 1/8
60/60 [==============================] - 16s 266ms/step - loss: 0.8261 - accuracy: 0.7319 - val_loss: 0.5806 - val_accuracy: 0.8188
Epoch 2/8
60/60 [==============================] - 15s 248ms/step - loss: 0.4498 - accuracy: 0.8550 - val_loss: 0.5134 - val_accuracy: 0.8333
Epoch 3/8
60/60 [==============================] - 15s 252ms/step - loss: 0.3698 - accuracy: 0.8801 - val_loss: 0.4712 - val_accuracy: 0.8542
Epoch 4/8
60/60 [==============================] - 15s 254ms/step - loss: 0.3188 - accuracy: 0.8995 - val_loss: 0.4558 - val_accuracy: 0.8583
Epoch 5/8
60/60 [==============================] - 15s 258ms/step - loss: 0.2881 - accuracy: 0.9120 - val_loss: 0.4438 - val_accuracy: 0.8562
Epoch 6/8
60/60 [==============================] - 16s 264ms/step - loss: 0.2570 - accuracy: 0.9204 - val_loss: 0.4361 - val_accuracy: 0.8604
Epoch 7/8
60/60 [==============================] - 16s 264ms/step - loss: 0.2355 - accuracy: 0.9298 - val_loss: 0.4290 - val_accuracy: 0.8625
Epoch 8/8
60/60 [==============================] - 17s 284ms/step - loss: 0.2170 - accuracy: 0.9398 - val_loss: 0.4290 - val_accuracy: 0.8667
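The gap between training and validation accuracy is easier to read as numbers. The sketch below copies the per-epoch values from the log above by hand and computes the gap and the best validation epoch. Note that `validation_data` is the test set here, so the validation figures are also the test figures.

```python
# Values copied from the training log above, one entry per epoch.
accuracy = [0.7319, 0.8550, 0.8801, 0.8995, 0.9120, 0.9204, 0.9298, 0.9398]
val_accuracy = [0.8188, 0.8333, 0.8542, 0.8583, 0.8562, 0.8604, 0.8625, 0.8667]

# Positive gap: the model fits the training data better than the held-out data.
gaps = [round(a - v, 4) for a, v in zip(accuracy, val_accuracy)]

# Epoch (1-based) with the highest validation accuracy.
best_epoch = max(range(len(val_accuracy)), key=val_accuracy.__getitem__) + 1

for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: gap {gap:+.4f}")
print("best validation epoch:", best_epoch)
```

The gap grows steadily after the first epoch while validation accuracy improves only slowly, a pattern that often suggests the classification head is starting to overfit.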

model.evaluate(test_ds)
model.summary()

15/15 [==============================] - 3s 192ms/step - loss: 0.4290 - accuracy: 0.8667
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 256, 256, 3)]     0         
_________________________________________________________________
xception (Functional)        (None, 8, 8, 2048)        20861480  
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 10245     
=================================================================
Total params: 20,871,725
Trainable params: 10,245
Non-trainable params: 20,861,480
_________________________________________________________________
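The shapes and parameter counts in the summary can be checked by hand. Global average pooling takes the mean over the two spatial axes, turning each 8 x 8 x 2048 feature map into a 2048-vector, and the dense layer then has one weight per (feature, class) pair plus one bias per class. A NumPy sketch, using random numbers as a stand-in for the Xception output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the Xception output of two images: (batch, height, width, channels).
features = rng.standard_normal((2, 8, 8, 2048)).astype(np.float32)

# Global average pooling: mean over the height and width axes.
pooled = features.mean(axis=(1, 2))
print(pooled.shape)

# Dense layer parameters: weights (features x classes) plus one bias per class.
num_classes = 5
dense_params = pooled.shape[1] * num_classes + num_classes

# Frozen base parameters, copied from the summary above.
base_params = 20_861_480
total_params = base_params + dense_params

print(dense_params, total_params)
```

Only the dense layer's parameters are trainable because `base_model.trainable` is set to `False`.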

Plotting the metrics

def plot(history, variable, variable1):
    # Plot a training metric and its validation counterpart against the epoch index
    plt.plot(range(len(history[variable])), history[variable])
    plt.plot(range(len(history[variable1])), history[variable1])
    plt.title(variable)
    plt.xlabel("epoch")
    plt.legend([variable, variable1])
    plt.show()

plot(history.history, "accuracy", "val_accuracy")

plot(history.history, "loss", "val_loss")

Prediction

x = random.randint(0, batch - 1)

for i in test_ds.as_numpy_iterator():
    img, label = i    
    plt.axis('off')   # remove axes
    plt.imshow(img[x])    # shape from (32, 256, 256, 3) --> (256, 256, 3)
    output = model.predict(np.expand_dims(img[x],0))    # getting output; input shape (256, 256, 3) --> (1, 256, 256, 3)
    pred = np.argmax(output[0])    # finding max
    print("Predicted: ", class_names[pred])    # Picking the label from class_names based on the model output
    print("True: ", class_names[label[x]])
    print("Probability: ", output[0][pred])
    break

Predicted:  cardboard
True:  cardboard
Probability:  0.9498115
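The decoding step in the cell above (softmax output, then `argmax`, then a lookup in `class_names`) can be reproduced without the model. The sketch below uses made-up scores purely for illustration; they are not outputs of the trained network.

```python
import numpy as np

class_names = ['cardboard', 'glass', 'metal', 'paper', 'plastic']

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical pre-softmax scores for one image, shape (1, 5).
logits = np.array([[4.0, 0.5, 0.2, 1.0, 0.1]])

probs = softmax(logits)             # each row sums to 1
pred = int(np.argmax(probs[0]))     # index of the most probable class

print("Predicted: ", class_names[pred])
print("Probability: ", float(probs[0][pred]))
```

The printed probability is the model's confidence in the chosen class only; the remaining probability mass is spread over the other four classes.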

deepC

model.save('garbage_classification.h5')

!deepCC garbage_classification.h5

[INFO]
Reading [keras model] 'garbage_classification.h5'
[SUCCESS]
Saved 'garbage_classification_deepC/garbage_classification.onnx'
[ERROR]
ONNX Model size exceeds threshold of 30MB.          
Current ONNX Model size is 79MB.          
Override with '--mem_override'

usage: deepCC [-h] [--output] [--format] [--verbose] [--profile ]
              [--app_tensors FILE] [--archive] [--bundle] [--debug]
              [--mem_override] [--optimize_peak_mem] [--init_net_model]
              [--input_data_type] [--input_shape] [--cc] [--cc_flags  [...]]
              [--board]
              input
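The reported model size is roughly what the parameter count predicts. The sketch below assumes each parameter is stored as a 32-bit float and that the size is reported in units of 1024 x 1024 bytes; both are assumptions, not something the deepCC output states.

```python
# Total parameter count, copied from the model summary.
params = 20_871_725
bytes_per_param = 4          # assumption: float32 weights

size_bytes = params * bytes_per_param
size_mib = size_bytes / 1024**2

threshold_mib = 30           # threshold quoted in the error message
print(f"estimated size: {size_mib:.1f} MiB")
print("exceeds threshold:", size_mib > threshold_mib)

# Largest float32 parameter count that would fit under the threshold.
max_params = threshold_mib * 1024**2 // bytes_per_param
print("parameters that fit in 30 MiB:", max_params)
```

Under these assumptions the frozen Xception base alone accounts for nearly all of the size, so staying under the threshold would require a much smaller backbone rather than a smaller classification head.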
