
Assessing the grade and quality of fruit

Credit: AITS Cainvas Community

Photo by ILLO on Dribbble

Fruit arrives in bulk at processing plants (for juice, jam, or any other fruit-based product) and varies in quality from fresh to almost rotten.

Sorting the fruit by quality keeps poor specimens from affecting the taste and quality of the final manufactured product.

For example, a rotten orange can spoil the taste of the entire juice batch.

The dataset used here has 1080 pomegranate images divided into 12 categories by grade and quality, with 90 images in each category.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers
import os
import tensorflow.keras
import random
from PIL import Image

Dataset

Fruits of three grades (G1, G2, G3) were collected. Each fruit was then imaged on alternate days over a period of eight days, giving four quality stages (Q1, Q2, Q3, Q4) for each grade. Repeating this process for all three grades gives a total of 12 classes, with four quality stages within each grade.

The dataset folder has 12 subfolders, each corresponding to one of the 12 classes. Each of these subfolders has 90 images.
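The 12 class labels follow from pairing each grade with each quality stage. A minimal sketch, assuming the `G<grade>_Q<quality>` folder naming that appears in the directory listing of this notebook (the variable names here are illustrative):

```python
from itertools import product

grades = ["G1", "G2", "G3"]              # three grades
qualities = ["Q1", "Q2", "Q3", "Q4"]     # four quality stages per grade

# One class per (grade, quality) pair
expected_classes = [f"{g}_{q}" for g, q in product(grades, qualities)]
images_per_class = 90
total_images = len(expected_classes) * images_per_class

print(expected_classes)
print(len(expected_classes), "classes,", total_images, "images")
```

Comparing `expected_classes` against the subfolder names is a quick way to confirm that the archive unpacked completely.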

Citation

[1] Kumar R, A., Rajpurohit, V. S., & Bidari, K. Y. (2019). Multi Class Grading and Quality Assessment of Pomegranate Fruits Based on Physical and Visual Parameters. International Journal of Fruit Science, 19(4), 372-396.

[2] Kumar R, A., Rajpurohit, V. S., & Jirage, B. J. (2018). Pomegranate Fruit Quality Assessment Using Machine Intelligence and Wavelet Features. Journal of Horticultural Research, 26(1), 53-60. doi: 10.2478/johr-2018-0006

In [2]:
!wget https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/Pomegranate.zip
!unzip -qo Pomegranate.zip
!rm Pomegranate.zip
--2021-09-08 05:54:17--  https://cainvas-static.s3.amazonaws.com/media/user_data/cainvas-admin/Pomegranate.zip
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.62.116
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.62.116|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 127534298 (122M) [application/zip]
Saving to: ‘Pomegranate.zip’

Pomegranate.zip     100%[===================>] 121.63M   105MB/s    in 1.2s    

2021-09-08 05:54:19 (105 MB/s) - ‘Pomegranate.zip’ saved [127534298/127534298]

In [3]:
data_dir = 'Pomegranate'

print("Number of samples")
for f in os.listdir(data_dir + '/'):
    if os.path.isdir(data_dir + '/' + f):
        print(f, " : ", len(os.listdir(data_dir + '/' + f +'/')))
Number of samples
G1_Q2  :  90
G1_Q3  :  90
G3_Q2  :  90
G3_Q1  :  90
G1_Q4  :  90
G2_Q4  :  90
G2_Q1  :  90
G3_Q4  :  90
G1_Q1  :  90
G2_Q3  :  90
G2_Q2  :  90
G3_Q3  :  90
In [4]:
# Splitting into train and validation dataset  - 80-20 split.

batch_size = 16

print("Training set")
train_ds = tf.keras.preprocessing.image_dataset_from_directory(data_dir, validation_split=0.2, subset="training", seed=113, batch_size=batch_size)  

print("Validation set")
val_ds = tf.keras.preprocessing.image_dataset_from_directory(data_dir, validation_split=0.2, subset="validation", seed=113, batch_size=batch_size)  
Training set
Found 1080 files belonging to 12 classes.
Using 864 files for training.
Validation set
Found 1080 files belonging to 12 classes.
Using 216 files for validation.
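The file counts reported above, and the step counts that show up later in the training logs, can be checked with a little arithmetic. This is only a sketch of the bookkeeping, not a reproduction of how Keras performs the split internally:

```python
import math

total_images = 1080
validation_split = 0.2
batch_size = 16

# Images held out for validation, and the remainder used for training
val_count = round(total_images * validation_split)
train_count = total_images - val_count

# A final partial batch still counts as one step
train_batches = math.ceil(train_count / batch_size)
val_batches = math.ceil(val_count / batch_size)

print(train_count, val_count)        # images per subset
print(train_batches, val_batches)    # steps per epoch for each subset
```

Note that 216 is not a multiple of 16, so the last validation batch holds only 8 images.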
In [5]:
# Looking into the class names

class_names = train_ds.class_names
print(class_names)
['G1_Q1', 'G1_Q2', 'G1_Q3', 'G1_Q4', 'G2_Q1', 'G2_Q2', 'G2_Q3', 'G2_Q4', 'G3_Q1', 'G3_Q2', 'G3_Q3', 'G3_Q4']

Visualisation

In [6]:
num_samples = 4    # the number of samples to be displayed in each class

for x in class_names:
    plt.figure(figsize=(10, 10))

    # Keep only image files; skip any spreadsheet (.xlsx) found in the class folder
    filenames = [f for f in os.listdir(data_dir + '/' + x) if not f.lower().endswith('.xlsx')]

    for i in range(num_samples):
        ax = plt.subplot(1, num_samples, i + 1)
        img = Image.open(data_dir + '/' + x + '/' + filenames[i])
        plt.imshow(img)
        plt.title(x)
        plt.axis("off")

Preprocessing

In [7]:
# Normalizing the pixel values - apply to both train and validation set

normalization_layer = tf.keras.Sequential(
    [
        layers.experimental.preprocessing.Rescaling(1./255)
    ])


train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))
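To make the effect of the rescaling concrete, here is a plain-Python sketch of the same arithmetic applied to a few hypothetical 8-bit pixel intensities (the values are made up for illustration):

```python
# Hypothetical 8-bit pixel intensities, for illustration only
raw_pixels = [0, 51, 128, 255]

scale = 1.0 / 255    # same factor as passed to the Rescaling layer
scaled_pixels = [p * scale for p in raw_pixels]

print(scaled_pixels)    # every value now lies in the range [0, 1]
```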

Model architecture and training

In [8]:
# Using transfer learning
base_model = tf.keras.applications.VGG16(weights='imagenet', input_shape=(256, 256, 3), include_top=False)    # False, do not include the classification layer of the model

base_model.trainable = False

inputs = tf.keras.Input(shape=(256, 256, 3))

x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(len(class_names), activation = 'softmax')(x)    # Add own classification layer

model = tf.keras.Model(inputs, outputs)
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58892288/58889256 [==============================] - 1s 0us/step
In [9]:
# training with a learning rate of 0.1
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.1), loss=tf.keras.losses.SparseCategoricalCrossentropy(), metrics=['accuracy'])

history1 = model.fit(train_ds, validation_data=val_ds, epochs=16)
Epoch 1/16
54/54 [==============================] - 9s 165ms/step - loss: 4.4554 - accuracy: 0.1528 - val_loss: 3.0274 - val_accuracy: 0.2037
Epoch 2/16
54/54 [==============================] - 8s 152ms/step - loss: 2.2799 - accuracy: 0.3553 - val_loss: 1.8581 - val_accuracy: 0.3519
Epoch 3/16
54/54 [==============================] - 8s 151ms/step - loss: 1.7842 - accuracy: 0.4514 - val_loss: 2.0416 - val_accuracy: 0.3981
Epoch 4/16
54/54 [==============================] - 8s 152ms/step - loss: 1.4377 - accuracy: 0.5451 - val_loss: 1.4827 - val_accuracy: 0.4861
Epoch 5/16
54/54 [==============================] - 8s 151ms/step - loss: 1.0777 - accuracy: 0.6539 - val_loss: 1.2759 - val_accuracy: 0.5556
Epoch 6/16
54/54 [==============================] - 8s 152ms/step - loss: 1.0802 - accuracy: 0.6273 - val_loss: 1.2676 - val_accuracy: 0.6019
Epoch 7/16
54/54 [==============================] - 8s 150ms/step - loss: 1.2641 - accuracy: 0.6331 - val_loss: 1.5635 - val_accuracy: 0.5093
Epoch 8/16
54/54 [==============================] - 8s 152ms/step - loss: 0.8909 - accuracy: 0.7130 - val_loss: 1.6158 - val_accuracy: 0.5833
Epoch 9/16
54/54 [==============================] - 8s 153ms/step - loss: 0.7694 - accuracy: 0.7685 - val_loss: 1.1179 - val_accuracy: 0.6528
Epoch 10/16
54/54 [==============================] - 8s 152ms/step - loss: 0.7312 - accuracy: 0.7662 - val_loss: 0.9712 - val_accuracy: 0.6528
Epoch 11/16
54/54 [==============================] - 8s 152ms/step - loss: 0.6221 - accuracy: 0.8148 - val_loss: 1.7353 - val_accuracy: 0.6250
Epoch 12/16
54/54 [==============================] - 8s 152ms/step - loss: 1.0276 - accuracy: 0.7222 - val_loss: 0.9078 - val_accuracy: 0.7130
Epoch 13/16
54/54 [==============================] - 8s 153ms/step - loss: 0.4796 - accuracy: 0.8333 - val_loss: 0.8929 - val_accuracy: 0.6991
Epoch 14/16
54/54 [==============================] - 8s 153ms/step - loss: 0.4518 - accuracy: 0.8391 - val_loss: 1.0129 - val_accuracy: 0.6528
Epoch 15/16
54/54 [==============================] - 8s 153ms/step - loss: 0.4290 - accuracy: 0.8576 - val_loss: 1.4687 - val_accuracy: 0.5880
Epoch 16/16
54/54 [==============================] - 8s 153ms/step - loss: 0.7023 - accuracy: 0.7743 - val_loss: 1.3117 - val_accuracy: 0.6019
In [10]:
model.summary()
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 256, 256, 3)]     0         
_________________________________________________________________
vgg16 (Functional)           (None, 8, 8, 512)         14714688  
_________________________________________________________________
global_average_pooling2d (Gl (None, 512)               0         
_________________________________________________________________
dense (Dense)                (None, 12)                6156      
=================================================================
Total params: 14,720,844
Trainable params: 6,156
Non-trainable params: 14,714,688
_________________________________________________________________
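The parameter counts in the summary can be reproduced by hand. Because the VGG16 base is frozen, only the final dense layer is trainable. A short sketch of the arithmetic, using the shapes printed in the summary:

```python
feature_channels = 512    # length of the vector after global average pooling
num_classes = 12          # one output per grade/quality class

dense_weights = feature_channels * num_classes    # one weight per (feature, class) pair
dense_biases = num_classes                        # one bias per class
trainable_params = dense_weights + dense_biases

vgg16_params = 14_714_688    # frozen base, as reported by model.summary()
total_params = vgg16_params + trainable_params

print(trainable_params, total_params)
```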
In [11]:
# training with a learning rate of 0.01
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01), loss=tf.keras.losses.SparseCategoricalCrossentropy(), metrics=['accuracy'])

history2 = model.fit(train_ds, validation_data=val_ds, epochs=16)
Epoch 1/16
54/54 [==============================] - 8s 155ms/step - loss: 0.2533 - accuracy: 0.9236 - val_loss: 0.6635 - val_accuracy: 0.7639
Epoch 2/16
54/54 [==============================] - 8s 153ms/step - loss: 0.1826 - accuracy: 0.9525 - val_loss: 0.5981 - val_accuracy: 0.7546
Epoch 3/16
54/54 [==============================] - 8s 157ms/step - loss: 0.1614 - accuracy: 0.9699 - val_loss: 0.5438 - val_accuracy: 0.8102
Epoch 4/16
54/54 [==============================] - 8s 154ms/step - loss: 0.1653 - accuracy: 0.9676 - val_loss: 0.5719 - val_accuracy: 0.8009
Epoch 5/16
54/54 [==============================] - 8s 154ms/step - loss: 0.1577 - accuracy: 0.9699 - val_loss: 0.5551 - val_accuracy: 0.7963
Epoch 6/16
54/54 [==============================] - 8s 154ms/step - loss: 0.1674 - accuracy: 0.9699 - val_loss: 0.6061 - val_accuracy: 0.7731
Epoch 7/16
54/54 [==============================] - 8s 155ms/step - loss: 0.1717 - accuracy: 0.9641 - val_loss: 0.5718 - val_accuracy: 0.7963
Epoch 8/16
54/54 [==============================] - 8s 155ms/step - loss: 0.1532 - accuracy: 0.9734 - val_loss: 0.5689 - val_accuracy: 0.8009
Epoch 9/16
54/54 [==============================] - 8s 155ms/step - loss: 0.1518 - accuracy: 0.9722 - val_loss: 0.5280 - val_accuracy: 0.8009
Epoch 10/16
54/54 [==============================] - 8s 155ms/step - loss: 0.1540 - accuracy: 0.9676 - val_loss: 0.5721 - val_accuracy: 0.7731
Epoch 11/16
54/54 [==============================] - 8s 155ms/step - loss: 0.1312 - accuracy: 0.9850 - val_loss: 0.4950 - val_accuracy: 0.8056
Epoch 12/16
54/54 [==============================] - 8s 155ms/step - loss: 0.1391 - accuracy: 0.9838 - val_loss: 0.5023 - val_accuracy: 0.8056
Epoch 13/16
54/54 [==============================] - 8s 156ms/step - loss: 0.1369 - accuracy: 0.9780 - val_loss: 0.5627 - val_accuracy: 0.7917
Epoch 14/16
54/54 [==============================] - 8s 157ms/step - loss: 0.1399 - accuracy: 0.9769 - val_loss: 0.6060 - val_accuracy: 0.7870
Epoch 15/16
54/54 [==============================] - 8s 156ms/step - loss: 0.1463 - accuracy: 0.9769 - val_loss: 0.5517 - val_accuracy: 0.8009
Epoch 16/16
54/54 [==============================] - 8s 156ms/step - loss: 0.1374 - accuracy: 0.9769 - val_loss: 0.5504 - val_accuracy: 0.7917
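The validation accuracy in this second run fluctuates rather than rising steadily, so the last epoch is not necessarily the best one. The sketch below copies the `val_accuracy` values from the log above and finds the epoch with the highest value. It is only an inspection aid; the notebook itself keeps the weights from the final epoch.

```python
# val_accuracy per epoch, copied from the second training log above
val_accuracy = [
    0.7639, 0.7546, 0.8102, 0.8009, 0.7963, 0.7731, 0.7963, 0.8009,
    0.8009, 0.7731, 0.8056, 0.8056, 0.7917, 0.7870, 0.8009, 0.7917,
]

# Index of the highest validation accuracy (first occurrence on ties)
best_index = max(range(len(val_accuracy)), key=val_accuracy.__getitem__)
best_epoch = best_index + 1                        # epochs are numbered from 1 in the log
gap_to_final = val_accuracy[best_index] - val_accuracy[-1]

print("Best epoch:", best_epoch)
print("Best val_accuracy:", val_accuracy[best_index])
print("Drop from best to final epoch:", round(gap_to_final, 4))
```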
In [12]:
output = model.evaluate(val_ds)
14/14 [==============================] - 1s 87ms/step - loss: 0.5504 - accuracy: 0.7917
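A single accuracy figure does not show which classes get confused with one another. One common way to look further is a confusion matrix. The helpers below (`confusion_matrix` and `per_class_accuracy`) are hypothetical names written for this sketch and are not part of the notebook; the labels are toy values, not results from this model.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

# Toy integer labels for illustration only
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(toy_true, toy_pred, num_classes=3)
acc = per_class_accuracy(cm)

for row in cm:
    print(row)
print(acc)
```

With the real model, the true and predicted labels would be collected by iterating over `val_ds` and taking the argmax of each prediction, with `num_classes` set to `len(class_names)`.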

Plotting the metrics

In [13]:
def plot(history1, history2, variable1, variable2):
    # combining metrics from both trainings
    # (concatenation builds new lists, so the original history dicts are not modified)
    var1_history = history1[variable1] + history2[variable1]
    var2_history = history1[variable2] + history2[variable2]

    # plotting them
    plt.plot(range(len(var1_history)), var1_history)
    plt.plot(range(len(var2_history)), var2_history)
    plt.legend([variable1, variable2])
    plt.title(variable1)
In [14]:
plot(history1.history, history2.history, "accuracy", 'val_accuracy')
In [15]:
plot(history1.history, history2.history, "loss", 'val_loss')

Prediction

In [17]:
model.save('pomegranate.h5')
model = tf.keras.models.load_model('pomegranate.h5')
In [18]:
# pick random test data sample from one batch
x = random.randint(0, batch_size - 1)

for i in val_ds.as_numpy_iterator():
    img, label = i    
    plt.axis('off')   # remove axes
    plt.imshow(img[x])    # shape from (16, 256, 256, 3) --> (256, 256, 3)
    output = model.predict(np.expand_dims(img[x],0))    # getting output; input shape (256, 256, 3) --> (1, 256, 256, 3)
    pred = np.argmax(output[0])    # finding max
    print("Predicted: ", class_names[pred])    # Picking the label from class_names based on the model output
    print("True: ", class_names[label[x]])
    print("Probability: ", output[0][pred])
    break
Predicted:  G2_Q2
True:  G1_Q2
Probability:  0.58616024
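In the sample output above, the predicted and true labels differ, but only in part. Because each class name encodes both a grade and a quality stage, a label can be split to see which part was wrong. The helpers `split_label` and `compare_labels` are hypothetical names introduced for this sketch:

```python
def split_label(label):
    """Split a class name such as 'G2_Q2' into its grade and quality parts."""
    grade, quality = label.split("_")
    return grade, quality

def compare_labels(predicted, true):
    """Report separately whether the grade and the quality stage match."""
    pred_grade, pred_quality = split_label(predicted)
    true_grade, true_quality = split_label(true)
    return {
        "grade_match": pred_grade == true_grade,
        "quality_match": pred_quality == true_quality,
    }

# Labels taken from the sample prediction above
result = compare_labels("G2_Q2", "G1_Q2")
print(result)
```

For this sample the quality stage is correct and only the grade is wrong. A single example says nothing about the model's overall behaviour, though.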