Pneumonia Detection Using CNN in Python


In this article, we will learn to build a classifier using a simple Convolutional Neural Network (CNN) that classifies a patient's chest X-ray images to detect whether the patient is Normal or affected by Pneumonia.

To follow along, work through the steps below in order.

Importing Libraries

The libraries we will be using are:

  • Pandas - pandas is a popular open-source data manipulation and analysis library for Python. It provides a data structure called a DataFrame, which is similar to a spreadsheet or a SQL table and allows for easy manipulation and analysis of data.
  • NumPy - NumPy is a popular open-source Python library for scientific computing, especially for working with numerical data. It provides tools for working with large, multi-dimensional arrays and matrices, and offers a wide range of mathematical functions for operating on these arrays.
  • Matplotlib - Matplotlib is a popular open-source data visualization library for Python. It provides a range of tools for creating high-quality visualizations of data, including line plots, scatter plots, bar plots, histograms, and more.
  • TensorFlow - TensorFlow is a popular open-source Python library for building and training machine learning models. It was developed by Google and is widely used in both academia and industry for a variety of applications, including image and speech recognition, natural language processing, and recommendation systems.

Python3

import matplotlib.pyplot as plt
import tensorflow as tf
import pandas as pd
import numpy as np

# Silence library warnings to keep the notebook output readable
import warnings
warnings.filterwarnings('ignore')

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.utils import image_dataset_from_directory
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img

import os
import matplotlib.image as mpimg

Importing Dataset 

To run the notebook on a local system, the dataset can be downloaded from [ https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia ]. The dataset comes as a zip file, so we unzip it by running the code below.

Python3

import zipfile

# Open the archive and extract everything into /content;
# the with-block closes the file automatically
with zipfile.ZipFile('/content/chest-xray-pneumonia.zip', 'r') as zip_ref:
    zip_ref.extractall('/content')
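As a minimal sketch of the same step, the extraction can be wrapped in a small helper that skips the work when the target folder is already populated. The function name extract_dataset and the tiny demo archive built in a temporary folder are illustrative assumptions, not part of the original notebook.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the names it contains.

    Hypothetical helper: extraction is skipped if dest_dir is already non-empty.
    """
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path, "r") as archive:  # closed automatically
        names = archive.namelist()
        if not (dest.exists() and any(dest.iterdir())):
            dest.mkdir(parents=True, exist_ok=True)
            archive.extractall(dest)
    return names


# Demo on a tiny throwaway archive (a stand-in for the real dataset zip).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("chest_xray/train/NORMAL/example.txt", "placeholder")
    extracted = extract_dataset(demo_zip, Path(tmp) / "data")
    extracted_ok = (Path(tmp) / "data" / extracted[0]).is_file()
    print(extracted, extracted_ok)
```

For the real dataset, the same helper would be called with the path of the downloaded zip file and the folder you want to extract into.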

Read the image dataset

In this section, we will visualize some of the images provided for each class to understand the data we are building the classifier on.

Let's load the training images

Python3

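# Path to the training folder after unzipping into /content
path = '/content/chest_xray/chest_xray/train'

# When running on Kaggle, use this path instead:
# path = '/kaggle/input/chest-xray-pneumonia/chest_xray/train'

# Each subfolder of the training folder is one class
classes = os.listdir(path)
print(classes)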

Output:

['PNEUMONIA', 'NORMAL']

This shows that we have two classes here, i.e. Normal and Pneumonia.

Python3

# Build the class folders by name rather than by list position,
# because os.listdir() does not guarantee any particular order
PNEUMONIA_dir = os.path.join(path, 'PNEUMONIA')
NORMAL_dir = os.path.join(path, 'NORMAL')

# File names of all images in each class
pneumonia_names = os.listdir(PNEUMONIA_dir)
normal_names = os.listdir(NORMAL_dir)

print('There are ', len(pneumonia_names),
      'images of pneumonia infected in training dataset')
print('There are ', len(normal_names), 'normal images in training dataset')

Output:

There are  3875 images of pneumonia infected in training dataset
There are  1341 normal images in training dataset
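The counting step can also be written as a small reusable function. The sketch below is only an illustration: count_images_per_class is a hypothetical helper, and the demo runs on a throwaway folder of placeholder files rather than on the real X-rays. The last lines compute each class's share of the 5216 training images from the counts printed above, which makes the imbalance explicit.

```python
import os
import tempfile


def count_images_per_class(root):
    """Return {class_name: number_of_files} for each subfolder of root."""
    counts = {}
    for name in sorted(os.listdir(root)):  # sorted: listdir order is arbitrary
        class_dir = os.path.join(root, name)
        if os.path.isdir(class_dir):
            counts[name] = len(os.listdir(class_dir))
    return counts


# Demo on a throwaway folder tree (placeholder files, not real X-rays).
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_files in [("PNEUMONIA", 3), ("NORMAL", 1)]:
        os.makedirs(os.path.join(tmp, class_name))
        for i in range(n_files):
            open(os.path.join(tmp, class_name, f"img_{i}.jpeg"), "w").close()
    demo_counts = count_images_per_class(tmp)

print(demo_counts)

# Class shares for the counts reported above (3875 pneumonia, 1341 normal).
reported = {"NORMAL": 1341, "PNEUMONIA": 3875}
total = sum(reported.values())
shares = {name: round(n / total, 3) for name, n in reported.items()}
print(total, shares)
```

Roughly three out of four training images belong to the pneumonia class, which is worth keeping in mind when reading the results later on.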

Plot the Pneumonia-infected Chest X-ray images

Python3

# Set up a 16 x 8 inch figure for a 2 x 4 grid of images
fig = plt.gcf()
fig.set_size_inches(16, 8)

# Show the eight images that end just before this index
pic_index = 210

pneumonia_images = [os.path.join(PNEUMONIA_dir, fname)
                    for fname in pneumonia_names[pic_index-8:pic_index]]

for i, img_path in enumerate(pneumonia_images):
    sp = plt.subplot(2, 4, i+1)
    sp.axis('off')  # hide the axes

    img = mpimg.imread(img_path)
    plt.imshow(img)

plt.show()

Output:

Pneumonia-infected Chest X-ray images

Plot the Normal Chest X-ray images

Python3

# Set up a 16 x 8 inch figure for a 2 x 4 grid of images
fig = plt.gcf()
fig.set_size_inches(16, 8)

# Show the eight images that end just before this index
pic_index = 210

normal_images = [os.path.join(NORMAL_dir, fname)
                 for fname in normal_names[pic_index-8:pic_index]]

for i, img_path in enumerate(normal_images):
    sp = plt.subplot(2, 4, i+1)
    sp.axis('off')  # hide the axes

    img = mpimg.imread(img_path)
    plt.imshow(img)

plt.show()

Output:

Normal Chest X-ray images

Data Preparation for Training

In this section, we will load the dataset as separate train, test and validation sets.

Python3

# Training set: labels are inferred from the subfolder names
Train = keras.utils.image_dataset_from_directory(
    directory='/content/chest_xray/chest_xray/train',
    labels="inferred",
    label_mode="categorical",
    batch_size=32,
    image_size=(256, 256))

# Test set
Test = keras.utils.image_dataset_from_directory(
    directory='/content/chest_xray/chest_xray/test',
    labels="inferred",
    label_mode="categorical",
    batch_size=32,
    image_size=(256, 256))

# Validation set
Validation = keras.utils.image_dataset_from_directory(
    directory='/content/chest_xray/chest_xray/val',
    labels="inferred",
    label_mode="categorical",
    batch_size=32,
    image_size=(256, 256))

Output:

Found 5216 files belonging to 2 classes.
Found 624 files belonging to 2 classes.
Found 16 files belonging to 2 classes.
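With labels="inferred", Keras assigns class indices in alphanumeric order of the subfolder names, and label_mode="categorical" turns each index into a one-hot vector. The plain-Python sketch below only mirrors that mapping to make it concrete; it is not Keras code, and the helper names build_label_map and one_hot are made up for illustration.

```python
def build_label_map(class_dirs):
    """Map class folder names to indices in alphanumeric order."""
    return {name: index for index, name in enumerate(sorted(class_dirs))}


def one_hot(index, num_classes):
    """Return a one-hot list such as [1.0, 0.0] for index 0 of 2."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


label_map = build_label_map(["PNEUMONIA", "NORMAL"])
normal_label = one_hot(label_map["NORMAL"], len(label_map))
pneumonia_label = one_hot(label_map["PNEUMONIA"], len(label_map))

print(label_map)        # {'NORMAL': 0, 'PNEUMONIA': 1}
print(normal_label)     # [1.0, 0.0]
print(pneumonia_label)  # [0.0, 1.0]
```

This is why, in the prediction code later in the article, the first output value is read as the Normal score and the second as the Pneumonia score.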

Model Architecture

The model architecture can be described as follows:

  • Input layer: Conv2D layer with 32 filters, 3×3 kernel size, ‘relu’ activation function, and input shape of (256, 256, 3)
  • MaxPooling2D layer with 2×2 pool size
  • Conv2D layer with 64 filters, 3×3 kernel size, ‘relu’ activation function
  • MaxPooling2D layer with 2×2 pool size
  • Conv2D layer with 64 filters, 3×3 kernel size, ‘relu’ activation function
  • MaxPooling2D layer with 2×2 pool size
  • Conv2D layer with 64 filters, 3×3 kernel size, ‘relu’ activation function
  • MaxPooling2D layer with 2×2 pool size
  • Flatten layer
  • Dense layer with 512 neurons, ‘relu’ activation function
  • BatchNormalization layer
  • Dense layer with 512 neurons, ‘relu’ activation function
  • Dropout layer with a rate of 0.1
  • BatchNormalization layer
  • Dense layer with 512 neurons, ‘relu’ activation function
  • Dropout layer with a rate of 0.2
  • BatchNormalization layer
  • Dense layer with 512 neurons, ‘relu’ activation function
  • Dropout layer with a rate of 0.2
  • BatchNormalization layer
  • Output layer: Dense layer with 2 neurons and ‘sigmoid’ activation function, representing the probabilities of the two classes (normal or pneumonia)

In summary, our model has:

  • Four convolutional layers, each followed by a MaxPooling layer.
  • One Flatten layer that receives and flattens the output of the last convolutional block.
  • Four fully connected (Dense) layers of 512 neurons each, which take the flattened output as input.
  • BatchNormalization layers to enable stable and fast training, and Dropout layers after the later Dense layers to reduce the chance of overfitting.
  • The final layer is the output layer, which uses the sigmoid activation function to classify the results into two classes (i.e., Normal or Pneumonia).
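To see how the image shrinks as it passes through the four convolution and pooling pairs, the spatial sizes can be traced by hand. The sketch below assumes the Keras defaults used in this article (no padding and stride 1 for Conv2D, and a stride equal to the pool size for MaxPooling2D); the helper names are illustrative.

```python
def conv_output(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


size = 256
trace = [size]
for _ in range(4):  # four Conv2D + MaxPooling2D pairs
    size = conv_output(size)
    trace.append(size)
    size = pool_output(size)
    trace.append(size)

flattened = size * size * 64  # 64 filters in the last convolution
print(trace)      # [256, 254, 127, 125, 62, 60, 30, 28, 14]
print(flattened)  # 12544
```

The final 14 × 14 × 64 feature map is what the Flatten layer turns into a vector of 12544 values.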

Python3

model = tf.keras.models.Sequential([
    # Convolutional feature extractor
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(256, 256, 3)),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(2, 2),

    # Fully connected classifier
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.1),
    layers.BatchNormalization(),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.2),
    layers.BatchNormalization(),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.2),
    layers.BatchNormalization(),
    layers.Dense(2, activation='sigmoid')
])

Print the summary of the model architecture by calling the model's summary() method:

Output:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 254, 254, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 127, 127, 32)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 125, 125, 64)      18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 62, 62, 64)       0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 60, 60, 64)        36928     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 30, 30, 64)       0         
 2D)                                                             
                                                                 
 conv2d_3 (Conv2D)           (None, 28, 28, 64)        36928     
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 14, 14, 64)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 12544)             0         
                                                                 
 dense (Dense)               (None, 512)               6423040   
                                                                 
 batch_normalization (BatchN  (None, 512)              2048      
 ormalization)                                                   
                                                                 
 dense_1 (Dense)             (None, 512)               262656    
                                                                 
 dropout (Dropout)           (None, 512)               0         
                                                                 
 batch_normalization_1 (Batc  (None, 512)              2048      
 hNormalization)                                                 
                                                                 
 dense_2 (Dense)             (None, 512)               262656    
                                                                 
 dropout_1 (Dropout)         (None, 512)               0         
                                                                 
 batch_normalization_2 (Batc  (None, 512)              2048      
 hNormalization)                                                 
                                                                 
 dense_3 (Dense)             (None, 512)               262656    
                                                                 
 dropout_2 (Dropout)         (None, 512)               0         
                                                                 
 batch_normalization_3 (Batc  (None, 512)              2048      
 hNormalization)                                                 
                                                                 
 dense_4 (Dense)             (None, 2)                 1026      
                                                                 
=================================================================
Total params: 7,313,474
Trainable params: 7,309,378
Non-trainable params: 4,096
_________________________________________________________________

The input image is first resized to 256 × 256, and the network then transforms it into the two output values used for the binary classification.
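The parameter counts in the summary can be reproduced with a little arithmetic. The sketch below uses the standard formulas (weights plus one bias per filter or unit, and four values per feature for batch normalization, of which the moving mean and variance are not trainable); the helper names are illustrative.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


def batchnorm_params(features):
    """Return (trainable, non_trainable): gamma/beta vs. moving mean/variance."""
    return 2 * features, 2 * features


trainable = (
    conv_params(3, 3, 32)
    + conv_params(3, 32, 64)
    + 2 * conv_params(3, 64, 64)
    + dense_params(14 * 14 * 64, 512)
    + 3 * dense_params(512, 512)
    + dense_params(512, 2)
)
non_trainable = 0
for _ in range(4):  # four BatchNormalization layers
    bn_trainable, bn_frozen = batchnorm_params(512)
    trainable += bn_trainable
    non_trainable += bn_frozen

print(trainable, non_trainable, trainable + non_trainable)
# 7309378 4096 7313474
```

Most of the parameters (6,423,040) sit in the first Dense layer, because it connects all 12544 flattened values to 512 neurons.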

Plot the model architecture:

Python3

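keras.utils.plot_model(
    model,
    # show the output shape of each layer
    show_shapes=True,
    # show the data type of each layer
    show_dtype=True,
    # show the activation function of each layer
    show_layer_activations=True
)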

Output:

Model architecture

Compile the Model:

Python3

model.compile(
    # loss function
    loss='binary_crossentropy',
    # optimizer
    optimizer='adam',
    # metric reported during training and evaluation
    metrics=['accuracy']
)
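To get a feel for what the binary_crossentropy loss measures, it can be computed by hand for a single sample with two sigmoid outputs and a one-hot label. This is a simplified sketch: the clipping constant eps and the example output values are assumptions chosen for illustration, and the library's internal numerics may differ in detail.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over the output units of one sample."""
    total = 0.0
    for target, prob in zip(y_true, y_pred):
        prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
        total += -(target * math.log(prob) + (1 - target) * math.log(1 - prob))
    return total / len(y_true)


# One-hot label for NORMAL and two hypothetical pairs of sigmoid outputs.
confident_right = binary_crossentropy([1.0, 0.0], [0.9, 0.2])
confident_wrong = binary_crossentropy([1.0, 0.0], [0.1, 0.8])
print(round(confident_right, 4), round(confident_wrong, 4))
```

The loss is small when the outputs agree with the label and grows quickly when the model is confidently wrong, which is what the optimizer tries to reduce during training.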

Train the model

Now we can train our model. Here we set epochs = 10, but you can perform hyperparameter tuning for better results.

Python3

history = model.fit(Train,
                    epochs=10,
                    validation_data=Validation)

Output:

Epoch 1/10
163/163 [==============================] - 59s 259ms/step - loss: 0.2657 - accuracy: 0.9128 - val_loss: 2.1434 - val_accuracy: 0.5625
Epoch 2/10
163/163 [==============================] - 34s 201ms/step - loss: 0.1493 - accuracy: 0.9505 - val_loss: 3.0297 - val_accuracy: 0.6250
Epoch 3/10
163/163 [==============================] - 34s 198ms/step - loss: 0.1107 - accuracy: 0.9626 - val_loss: 0.5933 - val_accuracy: 0.7500
Epoch 4/10
163/163 [==============================] - 33s 197ms/step - loss: 0.0992 - accuracy: 0.9640 - val_loss: 0.3691 - val_accuracy: 0.8125
Epoch 5/10
163/163 [==============================] - 34s 202ms/step - loss: 0.0968 - accuracy: 0.9651 - val_loss: 3.5919 - val_accuracy: 0.5000
Epoch 6/10
163/163 [==============================] - 34s 199ms/step - loss: 0.1012 - accuracy: 0.9653 - val_loss: 3.8678 - val_accuracy: 0.5000
Epoch 7/10
163/163 [==============================] - 34s 198ms/step - loss: 0.1026 - accuracy: 0.9613 - val_loss: 3.2006 - val_accuracy: 0.5625
Epoch 8/10
163/163 [==============================] - 35s 204ms/step - loss: 0.0785 - accuracy: 0.9701 - val_loss: 1.7824 - val_accuracy: 0.5000
Epoch 9/10
163/163 [==============================] - 34s 198ms/step - loss: 0.0717 - accuracy: 0.9745 - val_loss: 3.3485 - val_accuracy: 0.5625
Epoch 10/10
163/163 [==============================] - 35s 200ms/step - loss: 0.0699 - accuracy: 0.9770 - val_loss: 0.5788 - val_accuracy: 0.6250
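The 163/163 shown for every epoch is the number of batches per epoch. As a quick check, it follows from the dataset sizes and the batch size of 32 used when loading the data; the helper name steps_per_epoch is illustrative.

```python
import math


def steps_per_epoch(num_images, batch_size=32):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


splits = {"train": 5216, "test": 624, "val": 16}
steps = {name: steps_per_epoch(n) for name, n in splits.items()}
print(steps)  # {'train': 163, 'test': 20, 'val': 1}
```

The whole validation set fits into a single, half-empty batch, and the 20 steps for the test set match the 20/20 shown when the model is evaluated.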

Model Evaluation

Let's visualize the training and validation loss and accuracy for each epoch.

Python3

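# Collect the per-epoch metrics in a DataFrame
history_df = pd.DataFrame(history.history)

# Plot training vs. validation loss, then training vs. validation accuracy
history_df.loc[:, ['loss', 'val_loss']].plot()
history_df.loc[:, ['accuracy', 'val_accuracy']].plot()
plt.show()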

Output:

Losses per iteration

Accuracy per iteration

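Our model performs well on the training dataset but not on the validation dataset, so this is a case of overfitting. This may be due to the imbalanced dataset. Keep in mind that the validation set contains only 16 images, so its loss and accuracy fluctuate strongly from epoch to epoch.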
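Because the validation set has only 16 images, every validation accuracy in the training log is a multiple of 1/16. The short check below converts the logged values back into the number of correctly classified validation images; the list is copied from the log above.

```python
from fractions import Fraction

VAL_IMAGES = 16  # size of the validation split reported when loading the data

# val_accuracy values copied from the training log above
val_accuracy = [0.5625, 0.6250, 0.7500, 0.8125, 0.5000,
                0.5000, 0.5625, 0.5000, 0.5625, 0.6250]

correct_counts = [round(acc * VAL_IMAGES) for acc in val_accuracy]
step = Fraction(1, VAL_IMAGES)  # accuracy change caused by a single image

print(correct_counts)  # [9, 10, 12, 13, 8, 8, 9, 8, 9, 10]
print(float(step))     # 0.0625
```

A single image changes the validation accuracy by 6.25 percentage points, so the validation curve is a rough signal at best.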

Find the accuracy on the Test Dataset

Python3

# Evaluate the trained model on the held-out test set
loss, accuracy = model.evaluate(Test)
print('The accuracy of the model on test dataset is',
      np.round(accuracy*100))

Output:

20/20 [==============================] - 4s 130ms/step - loss: 0.4542 - accuracy: 0.8237
The accuracy of the model on test dataset is 82.0

Prediction

Let’s check the model on some individual test images.

Python3

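# Load a test image and resize it to the model's input size
test_image = tf.keras.utils.load_img(
    "/content/chest_xray/chest_xray/test/NORMAL/IM-0010-0001.jpeg",
    target_size=(256, 256))

# Display the image
plt.imshow(test_image)

# Convert to an array and add a batch dimension: (1, 256, 256, 3)
test_image = tf.keras.utils.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)

# Predict the class scores
result = model.predict(test_image)

# Scores in class order: [NORMAL, PNEUMONIA]
class_probabilities = result[0]

# Report the class with the higher score
if class_probabilities[0] > class_probabilities[1]:
    print("Normal")
else:
    print("Pneumonia")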

Output:

1/1 [==============================] - 0s 328ms/step
Pneumonia

Normal Chest X-ray

Python3

# Load a test image and resize it to the model's input size
test_image = tf.keras.utils.load_img(
    "/content/chest_xray/chest_xray/test/PNEUMONIA/person100_bacteria_478.jpeg",
    target_size=(256, 256))

# Display the image
plt.imshow(test_image)

# Convert to an array and add a batch dimension: (1, 256, 256, 3)
test_image = tf.keras.utils.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)

# Predict the class scores
result = model.predict(test_image)

# Scores in class order: [NORMAL, PNEUMONIA]
class_probabilities = result[0]

# Report the class with the higher score
if class_probabilities[0] > class_probabilities[1]:
    print("Normal")
else:
    print("Pneumonia")

Output:

1/1 [==============================] - 0s 328ms/step
Pneumonia
Pneumonia Infected Chest
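The if/else comparison used in both prediction cells can be written more generally by picking the highest-scoring output and looking up its name. This is a small sketch: the CLASS_NAMES order assumes the alphanumeric folder order used for inferred labels, and the score values are made-up examples rather than real model predictions.

```python
CLASS_NAMES = ["NORMAL", "PNEUMONIA"]  # assumed alphanumeric folder order


def decode_prediction(scores, class_names=CLASS_NAMES):
    """Return (label, score) for the highest-scoring output unit."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best], scores[best]


# Hypothetical model outputs for two images (not real predictions).
print(decode_prediction([0.12, 0.91]))  # ('PNEUMONIA', 0.91)
print(decode_prediction([0.80, 0.30]))  # ('NORMAL', 0.8)
```

In the notebook, the first row of the array returned by the model's predict call would take the place of the hard-coded lists.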

Conclusions:

Our model performs reasonably well, but the loss and accuracy curves per iteration show that it is overfitting. This may be due to the unbalanced dataset. By balancing the dataset so that it contains an equal number of normal and pneumonia images, we may get a better result.
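Besides collecting or resampling images, a common alternative is to weight the classes in the loss so that the rarer class counts more. This was not done in this article; the sketch below only shows how such weights could be derived from the training counts with the widely used total / (number of classes × class count) heuristic. Keras's fit method has a class_weight argument for weights of this kind, but whether it helps with this particular data pipeline would need to be tested.

```python
def balanced_class_weights(counts):
    """Weight each class by total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


train_counts = {"NORMAL": 1341, "PNEUMONIA": 3875}  # from the training folder
weights = balanced_class_weights(train_counts)
print({name: round(w, 3) for name, w in weights.items()})
```

With these weights each normal image counts roughly three times as much as a pneumonia image, so both classes contribute equally to the total loss.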
