Saturday 15 January 2022

Application of Convolutional Neural Networks in disease detection, such as pneumonia/COVID-19 detection through chest X-ray or heartbeat classification.


Objective :- Application of Convolutional Neural Networks in disease detection, such as pneumonia/COVID-19 detection through chest X-ray or heartbeat classification.

Abstract : CNNs are powerful image-processing artificial intelligence (AI) models that use deep learning to perform both generative and descriptive tasks, often using machine vision that includes image and video recognition, along with recommender systems and natural language processing (NLP).

A CNN uses a system much like a multilayer perceptron that has been designed for reduced processing requirements. The layers of a CNN consist of an input layer, an output layer, and hidden layers that include multiple convolutional layers, pooling layers, fully connected layers and normalization layers. The removal of limitations and the increase in efficiency for image processing result in a system that is far more effective and simpler to train, albeit limited to image processing and natural language processing.

a) Application of Convolutional Neural Networks in pneumonia detection through chest X-ray classification

Introduction:

Pneumonia is an inflammation of the lung parenchyma often caused by pathogenic microorganisms, physical and chemical factors, immunologic injury, and some pharmaceuticals. There are several popular pneumonia classification schemes: (1) based on pathogenesis, pneumonia is classified as infectious or non-infectious, in which infectious pneumonia is further classified into bacterial, viral, mycoplasmal, chlamydial pneumonia, and others, while non-infectious pneumonia is classified as immune-associated pneumonia, aspiration pneumonia caused by physical and chemical factors, and radiation pneumonia. (2) Based on where the infection is acquired, pneumonia is classified as CAP (community-acquired pneumonia), HAP (hospital-acquired pneumonia) and VAP (ventilator-associated pneumonia), among which CAP accounts for the larger part. Because of the different range of pathogens, HAP develops resistance to various antibiotics more easily, making treatment more difficult.

Related Work:

Several methods have been introduced in recent years for pneumonia detection from chest X-ray images, especially deep learning methods. Deep learning has been successfully applied to improve the performance of computer-aided diagnosis (CAD) technology, especially in the fields of medical imaging [5], image segmentation [6,7] and image reconstruction [8,9]. In 2017, Rajpurkar et al. [10] applied a deep learning network based on DenseNet-121 [11], a 121-layer CNN model, to accelerate the diagnosis of pneumonia.

Background:

In the past few decades, machine learning (ML) algorithms have gradually attracted researchers’ attention. This type of algorithm can take full advantage of the enormous computing power of computers in image processing through given algorithms or specified steps.

However, traditional ML methods in classification tasks need manually designed algorithms or manually specified feature extraction layers to classify images.

Proposed CNN Model

Figure 4 illustrates the architecture of our proposed model, which is applied to detect whether the input image shows pneumonia. Figure 5 displays our model, which contains a total of six layers; we employed 3 × 3 kernel convolution layers with 1 × 1 strides and the ReLU activation function. After each convolution layer, a max-pooling operation with a 2 × 2 stride was employed to retain the maximum of each sub-region, which is split according to the strides. In addition, we set several dropout layers that randomly drop units (set their activations to zero), aiming to improve the model performance. Then two fully-connected (dense) layers followed by a sigmoid function are used to take full advantage of the features extracted by the previous layers, outputting the probability of the patient suffering from pneumonia or not. As illustrated above, the input size is 224 × 224 × 1 and the output is y ∈ {0, 1}, where 0 denotes that the image does not show pneumonia, while 1 denotes that the image shows pneumonia.
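A minimal Keras sketch consistent with the description above (the exact filter counts, dropout rates and number of dense units are assumptions, since the text does not specify them):

```python
# Sketch of the described pneumonia classifier: 3x3 convs (stride 1, ReLU), 2x2 max pooling,
# dropout layers, two dense layers and a final sigmoid output. Assumed hyper-parameters.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_pneumonia_cnn(input_shape=(224, 224, 1)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), strides=(1, 1), activation="relu", padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.2),                      # randomly drops units during training
        layers.Conv2D(64, (3, 3), strides=(1, 1), activation="relu", padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.2),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),     # first dense layer
        layers.Dense(1, activation="sigmoid"),    # outputs P(pneumonia)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_pneumonia_cnn()
model.summary()
```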





      b) Application of Convolutional Neural Networks in COVID-19 detection through heartbeat classification:-

Introduction:-

CNN is used in pattern recognition with superior feature learning capabilities, making it a suitable model for image data. Indeed, CNN is a dominant DL architecture for image classification and can rival human accuracy in many tasks. A CNN uses hierarchical layers of tiled convolutional filters to mimic the effects of human receptive fields on feedforward processing in the early visual cortex, thereby exploiting the local spatial correlations present in images while developing robustness to natural transformations such as changes of viewpoint or scale. A CNN-based model generally requires a large set of training samples to achieve good generalization capability. Its basic structure is a sequence of Convolutional, Pooling and Fully Connected layers, possibly with other intermediary layers for normalization and/or dropout.

Network architecture:-

  1. Input layer.

The input layer basically depends on the dimensions of the images. In our network, all images must have the same dimensions and are presented as grayscale (single colour channel) images.

  2. Batch normalization layer.

Batch normalization converts the distribution of the inputs to a standard normal distribution with mean 0 and variance 1, avoiding the problem of gradient dispersion and accelerating the training process.

  3. Convolutional layer.

Convolutions are the main building blocks of a CNN. Filter kernels are slid over the image and for each position the dot product of the filter kernel and the part of the image covered by the kernel is taken. All kernels used in this layer are 3 × 3 pixels. The chosen activation function of convolutional layers is the rectified linear unit (ReLU), which is easy to train due to its piecewise linear and sparse characteristics.

  4. Max pooling layer.

Max pooling is a sub-sampling procedure that uses the maximum value of a window as the output. The size of such a window was chosen as 2 × 2 pixels.

  5. Fire layer.

A fire module is composed of a squeeze convolutional layer (which has only 1 × 1 filters) feeding into an expand layer that has a mix of 1 × 1 and 3 × 3 convolution filters. The use of a fire layer can reduce training time while still extracting data characteristics, in comparison with dense layers having the same number of parameters. The layer is represented in Fig 4, in which the input and output have the same dimensions.
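A minimal Keras sketch of such a fire module (SqueezeNet-style; the filter counts below are assumptions, not values taken from the paper):

```python
# SqueezeNet-style fire module: a 1x1 "squeeze" convolution feeding parallel
# 1x1 and 3x3 "expand" convolutions whose outputs are concatenated.
from tensorflow.keras import layers

def fire_module(x, squeeze_filters=16, expand_filters=64):
    s = layers.Conv2D(squeeze_filters, (1, 1), activation="relu", padding="same")(x)
    e1 = layers.Conv2D(expand_filters, (1, 1), activation="relu", padding="same")(s)
    e3 = layers.Conv2D(expand_filters, (3, 3), activation="relu", padding="same")(s)
    # "same" padding keeps the spatial dimensions of the input feature map
    return layers.Concatenate(axis=-1)([e1, e3])
```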

Proposed model:-

Owing to their self-learning capacity and superior prediction performance, LWL and SOM models achieve human-like precision in image description and prediction problems. Our framework aims mainly at providing distinguishing visual properties and a quick diagnostic system that can be used to classify new COVID-19 X-rays. This technique can also help clinicians choose a treatment plan depending on the type of infection and can support prompt decisions.


Related Work:-

Real-time reverse transcription-polymerase chain reaction (RT-PCR) is the primary technique currently in use for COVID-19 diagnosis. Chest radiographic images, such as CT images and X-rays, are critical for the early diagnosis and treatment of the condition. Because of the low sensitivity of RT-PCR (60–70%), symptoms can still be detected by analysing radiographic images of patients even when negative test results are obtained.


Conclusion:-

Within this context, the literature suggests that the diagnosis may be assisted by the use of data mining methods to classify pneumonia disease in chest X-rays. However, the issue is much more difficult when we look at chest images of patients suffering from pneumonia caused by multiple types of pathogens and attempt to forecast a particular form of pneumonia (COVID-19).

Classify the objects using deep learning techniques.


Objective:
Classify the objects using deep learning techniques.

Theory:
Image classification involves assigning a class label to an image, whereas object localization involves drawing a bounding box around one or more objects in an image. Object detection is more challenging and combines these two tasks and draws a bounding box around each object of interest in the image and assigns them a class label. Together, all of these problems are referred to as object recognition.

  • Image Classification: Predict the type or class of an object in an image.
    • Input: An image with a single object, such as a photograph.
    • Output: A class label (e.g. one or more integers that are mapped to class labels).
  • Object Localization: Locate the presence of objects in an image and indicate their location with a bounding box.
    • Input: An image with one or more objects, such as a photograph.
    • Output: One or more bounding boxes (e.g. defined by a point, width, and height).
  • Object Detection: Locate the presence of objects with a bounding box and types or classes of the located objects in an image.
    • Input: An image with one or more objects, such as a photograph.
    • Output: One or more bounding boxes (e.g. defined by a point, width, and height), and a class label for each bounding box.
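As a small, concrete illustration of the first task (image classification) with a pretrained network, a sketch using Keras' bundled ResNet50 might look like the following (the image file name is hypothetical):

```python
# Classify a single photograph with a pretrained ResNet50 (ImageNet class labels).
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

model = ResNet50(weights="imagenet")

img = image.load_img("dog.jpg", target_size=(224, 224))   # hypothetical input photograph
x = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])   # top-3 (class id, label, probability) tuples
```

Object localization and detection additionally return one or more bounding boxes per image, typically as (x, y, width, height) plus a class label and confidence score for each box.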




Conclusion:
Object detection can be used in many areas to reduce human effort and increase the efficiency of processes in various fields. Object detection, as well as deep learning, are areas that will bloom in the future, making their presence felt across numerous fields.

There is a lot of scope in these fields and also many opportunities for improvements.

Application of Multi-Layer Perceptron on a Classification Problem



Objective: Application of Multi-Layer Perceptron on a Classification Problem

Theory: A multi-layer perceptron (MLP) is a type of feed-forward neural network. It consists of three types of layers: the input layer, the output layer, and the hidden layer, as shown in the figure below. The input layer receives the input signal to be processed. The required task, such as prediction or classification, is performed by the output layer. An arbitrary number of hidden layers placed between the input and output layers are the true computational engine of the MLP. As in a feed-forward network, the data in an MLP flows in the forward direction from the input to the output layer. The neurons in the MLP are trained with the backpropagation learning algorithm. MLPs can approximate any continuous function and can solve problems that are not linearly separable. The major use cases of MLP are pattern classification, recognition, prediction, and approximation.

The computations taking place at every neuron in the output and hidden layers are as follows:

o(x) = G(b^(2) + W^(2) h(x))   …(1)

h(x) = Φ(x) = s(b^(1) + W^(1) x)   …(2)

with bias vectors b^(1), b^(2), weight matrices W^(1), W^(2), and activation functions G and s. The set of parameters to learn is θ = {W^(1), b^(1), W^(2), b^(2)}. Typical choices for s include the tanh function, tanh(a) = (e^a − e^−a)/(e^a + e^−a), or the logistic sigmoid function, sigmoid(a) = 1/(1 + e^−a).
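A direct NumPy transcription of equations (1) and (2), with tanh for s and the logistic sigmoid for G (the layer sizes and random weights are purely illustrative):

```python
# Forward pass of a one-hidden-layer MLP: h(x) = s(b1 + W1 x), o(x) = G(b2 + W2 h(x)).
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 1

W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)   # hidden-layer parameters
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)     # output-layer parameters

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x):
    h = np.tanh(b1 + W1 @ x)        # equation (2), s = tanh
    return sigmoid(b2 + W2 @ h)     # equation (1), G = logistic sigmoid

x = rng.normal(size=n_in)
print(mlp_forward(x))               # class probability for a binary task
```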


Perceptron for Binary Classification

With this discrete output, controlled by the activation function, the perceptron can be used as a binary classification model, defining a linear decision boundary. It finds the separating hyperplane that minimizes the distance between misclassified points and the decision boundary


To minimize this distance, Perceptron uses Stochastic Gradient Descent as the optimization function.

If the data is linearly separable, it is guaranteed that Stochastic Gradient Descent will converge in a finite number of steps.
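A minimal NumPy sketch of this idea, making stochastic updates on misclassified points for a linearly separable toy dataset (the learning rate and data are made up for illustration):

```python
# Classic perceptron trained with stochastic updates; converges when the classes are linearly separable.
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal([2, 2], 0.5, (50, 2)), rng.normal([-2, -2], 0.5, (50, 2))])
y = np.hstack([np.ones(50), np.zeros(50)])            # two separable classes

w, b, lr = np.zeros(2), 0.0, 0.1
for epoch in range(20):
    for i in rng.permutation(len(X)):                 # stochastic: one sample at a time
        pred = 1.0 if X[i] @ w + b > 0 else 0.0       # threshold activation
        err = y[i] - pred
        w += lr * err * X[i]                          # update only when the point is misclassified
        b += lr * err

acc = np.mean([(1.0 if x @ w + b > 0 else 0.0) == t for x, t in zip(X, y)])
print("training accuracy:", acc)
```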

The last piece that the Perceptron needs is the activation function, the function that determines whether the neuron will fire or not. Initial Perceptron models used the sigmoid function, and just by looking at its shape, it makes a lot of sense! The sigmoid function maps any real input to a value between 0 and 1 and encodes a non-linear function. The neuron can receive negative numbers as input, and it will still be able to produce an output between 0 and 1.

A Multilayer Perceptron has input and output layers, and one or more hidden layers with many neurons stacked together. And while in the Perceptron the neuron must have an activation function that imposes a threshold, like ReLU or sigmoid, neurons in a Multilayer Perceptron can use any arbitrary activation function.

Conclusion

Perceptron is a neural network with only one neuron, and can only understand linear relationships between the input and output data provided.

However, with Multilayer Perceptron, horizons are expanded and now this neural network can have many layers of neurons.

Solve the problem of recognizing humans from their faces using machine learning techniques.


Objective: Solve the problem of recognizing humans from their faces using machine learning techniques.

Theory:
Let us introduce a new benchmark data set of face images with variable makeup, hairstyles and occlusions, named BookClub artistic makeup data, and then examine the performance of the ANNs under different conditions. Makeup and other occlusions can be used not only to disguise a person's identity from the ANN algorithms, but also to spoof a wrong identification.

ANN Algorithm:
Artificial Neural Networks (ANNs) are capable of learning patterns of interest from data in the presence of variations. An Artificial Neural Network is a model in the field of artificial intelligence that attempts to mimic the network of neurons that makes up the human brain, so that computers can understand things and make decisions in a human-like manner. An artificial neural network is built by programming computers to behave simply like interconnected brain cells.


Artificial Neural Network primarily consists of three layers:

  • Input Layer
  • Hidden Layer
  • Output Layer







Procedure:

  1. The images used in this experiment are kept coloured, downsized, and compressed into JPEG format with dimensions of 48x48 pixels.
  2. The downsizing is done due to computational restrictions, to keep processing times reasonable. However, observations made on the small-size images are extendable to larger sizes.
  3. For the computational experiments, the ‘Keras’ library with the TensorFlow back-end was used.
  4. The ANN consists of four sequential groups of layers: Gaussian noise, convolution with ReLU activation functions, normalization, pooling and dropout layers.
  5. It is topped with fully connected layers, a softmax activation function in the last layer and a cross-entropy loss function. The "Adam" learning algorithm with a 0.001 learning-rate coefficient, mini-batch size 32 and 100 epochs is used. A sketch of such a network is shown below.
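A hedged Keras sketch of a network matching steps 3–5 (the filter counts, dropout rates, noise level and number of subject classes are assumptions not given in the text):

```python
# Four groups of (Gaussian noise, conv+ReLU, batch normalization, max pooling, dropout),
# topped with a dense softmax layer, trained with cross-entropy and Adam (lr = 0.001).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 21                                   # assumption: number of subjects in the dataset
model = models.Sequential([layers.Input(shape=(48, 48, 3))])
for filters in (32, 64, 128, 256):                 # four sequential groups of layers
    model.add(layers.GaussianNoise(0.05))
    model.add(layers.Conv2D(filters, (3, 3), activation="relu", padding="same"))
    model.add(layers.BatchNormalization())
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Dropout(0.25))
model.add(layers.Flatten())
model.add(layers.Dense(NUM_CLASSES, activation="softmax"))

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=32, epochs=100)   # per the stated training parameters
```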

Output:





Conclusion:
Despite the small size to which the images were scaled and the not very deep ANN, the mean accuracy of face recognition for the model trained on samples from all photo-sessions of all subjects is quite high, at 92%, rising to 99.9% in the best cases.

Solve the weather problem to predict the possibility of rain occurring under known parameters, e.g., temperature, humidity, wind flow, sunny or cloudy, etc., using Bayesian Learning.

 

Experiment-2

Objective: Solve the weather problem to predict the possibility of rain occurring under known parameters, e.g., temperature, humidity, wind flow, sunny or cloudy, etc., using Bayesian Learning.

Theory: The basic idea of Bayesian networks (BNs) is to reproduce the most important dependencies and independencies among a set of variables in a graphical form (a directed acyclic graph) which is easy to understand and interpret. Let us consider the subset of climatic stations shown in the graph in the Figure, where the variables (rainfall) are represented pictorially by a set of nodes, one node for each variable (for clarity of exposition, the set of nodes is denoted {y1, …, yn}). These nodes are connected by arrows, which represent a cause-and-effect relationship. That is, if there is an arrow from node yi to node yj, we say that yi is the cause of yj, or equivalently, yj is the effect of yi. Another popular terminology is to say that yi is a parent of yj or yj is a child of yi. For example, in the Figure, Amieva is a child of Gijon, and Proaza is a child of Gijon and Rioseco (the set of parents of a node yi is denoted by πi). Directed graphs provide a simple definition of independence (d-separation) based on the existence or not of certain paths between the variables.

The dependency/independency structure displayed by an acyclic directed graph can also be expressed in terms of the Joint Probability Distribution (JPD), factorized as a product of several conditional distributions as follows:

P(y1, y2, …, yn) = ∏_{i=1}^{n} P(yi | πi)

Therefore, the independencies from the graph are easily translated to the probabilistic model in a sound form. For instance, the JPD of a BN defined by the graph given in the Figure requires the specification of 100 conditional probability tables, one for each variable conditioned on its parents’ set. Hereafter we shall consider rainfall discretized into three different states (0 = “no rain”, 1 = “weak rain”, 2 = “heavy rain”), associated with the thresholds 0, 2, and 10 mm, respectively.
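As a minimal, concrete illustration of Bayesian learning for the rain-prediction objective (a naive Bayes simplification of the full station network described above; the toy observations below are made up):

```python
# Naive Bayes on a toy weather table: P(rain | outlook, humidity, wind) via Bayes' rule,
# assuming the observed attributes are conditionally independent given "rain".
from collections import Counter, defaultdict

# (outlook, humidity, wind) -> rain? (1 = rain, 0 = no rain) -- hypothetical observations
data = [("sunny", "high", "weak", 0), ("cloudy", "high", "strong", 1),
        ("cloudy", "normal", "weak", 1), ("sunny", "normal", "weak", 0),
        ("cloudy", "high", "weak", 1), ("sunny", "high", "strong", 1),
        ("sunny", "normal", "strong", 0), ("cloudy", "normal", "strong", 1)]

prior = Counter(r for *_, r in data)              # class counts
cond = defaultdict(Counter)                       # cond[(feature_index, class)][value] counts
for *features, r in data:
    for i, v in enumerate(features):
        cond[(i, r)][v] += 1

def p_rain(features):
    score = {}
    for r in (0, 1):
        p = prior[r] / len(data)
        for i, v in enumerate(features):
            p *= (cond[(i, r)][v] + 1) / (prior[r] + 2)   # Laplace smoothing (2 values per attribute)
        score[r] = p
    return score[1] / (score[0] + score[1])               # normalize to get P(rain | features)

print("P(rain | cloudy, high humidity, weak wind) =", round(p_rain(("cloudy", "high", "weak")), 3))
```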



Procedure:


Learning Bayesian Networks from Data-

In addition to the graph structure, a BN requires that we specify the conditional probability of each node given its parents. However, in many practical problems, we know neither the complete topology of the graph nor some of the required probabilities. For this reason, several methods have recently been introduced for learning the graphical structure (structure learning) and estimating probabilities (parametric learning) from data. A learning algorithm consists of two parts:

  1. A quality measure, which is used for computing the quality of the candidate BNs. This is a global measure, since it measures both the quality of the graphical structure and the quality of the estimated parameters.
  2. A search algorithm, which is used to efficiently search the space of possible BNs to find the one with the highest quality. Note that the number of possible networks is enormous, even for a small number of variables, and therefore the search space is huge.


Among the different quality measures proposed in the literature, the basic idea of Bayesian quality measures is to assign to every BN a quality value that is a function of the posterior probability distribution of the available data D = {y1, …, y100} (with the index t running daily from 1979 to 1993), given the BN (M, θ) with network structure M and the corresponding estimated probabilities θ. The posterior probability distribution p(M, θ | D) is calculated as follows:




Geiger and Heckerman consider multinomial networks and assume certain hypotheses about the prior distributions of the parameters, leading to the quality measure below.
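The measure itself appeared here only as a figure; a standard form of the Geiger–Heckerman Bayesian (Dirichlet) quality measure, reconstructed from the terms defined below, is:

Q(M, D) = log p(M) + Σ_{i=1..n} Σ_{k=1..si} [ log( Γ(ηik) / Γ(ηik + Nik) ) + Σ_{j=1..ri} log( Γ(ηijk + Nijk) / Γ(ηijk) ) ]

with ηik = Σ_j ηijk and Nik = Σ_j Nijk,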

where n is the number of variables, ri is the cardinality of the i-th variable, si is the number of realizations of the parent set πi, ηijk are the “a priori” Dirichlet hyper-parameters for the conditional distribution of node i, Nijk is the number of realizations in the database consistent with yi = j and πi = k, Nik is the number of realizations in the database consistent with πi = k, and Γ is the gamma function.

  1. Inference: Once a model describing the relationships among the set of variables has been selected, it can then be used to answer queries when evidence becomes available.
  2. Validation of the Bayesian Network Forecast Model: To check the quality of the BN in a simple case, we shall apply this methodology to a nowcasting problem. In this case we are given a forecast for a given subset of stations and we need to infer a prediction for the remaining stations in the network. To this aim, consider that we are given predictions in the five stations of the primary network. These predictions are plugged into the network as evidence, obtaining the probabilities for the remaining stations in the secondary network (a toy version of this inference step is sketched below).
  3. Connecting With Numerical Atmospheric Models: Since we are interested in rainfall forecasts, we shall use the gridded forecasts of total precipitation given by the operative ECMWF model (these values are obtained by adding both the convective and the large-scale precipitation outputs). The forecasts are obtained 24 hours ahead; therefore, they give a numeric estimation of the future precipitation pattern (one day ahead) on a coarse-grained resolution grid.
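To make the inference step concrete, here is a small, self-contained sketch on a made-up three-arc network with binary rain/no-rain states (the paper uses three states and many more stations); evidence is plugged into the factorized joint distribution and the result renormalized:

```python
# Exact inference by enumeration on a tiny Bayesian network:
# Gijon -> Amieva, Gijon -> Proaza, Rioseco -> Proaza (states: 0 = no rain, 1 = rain).
from itertools import product

p_gijon = {0: 0.7, 1: 0.3}
p_rioseco = {0: 0.8, 1: 0.2}
p_amieva = {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.3, 1: 0.7}}          # P(Amieva | Gijon)
p_proaza = {(0, 0): {0: 0.95, 1: 0.05}, (0, 1): {0: 0.6, 1: 0.4},    # P(Proaza | Gijon, Rioseco)
            (1, 0): {0: 0.5, 1: 0.5},  (1, 1): {0: 0.1, 1: 0.9}}

def joint(g, r, a, p):
    """Chain-rule factorization P(g, r, a, p) = P(g) P(r) P(a | g) P(p | g, r)."""
    return p_gijon[g] * p_rioseco[r] * p_amieva[(g,)][a] * p_proaza[(g, r)][p]

def posterior_proaza(evidence):
    """P(Proaza | evidence): sum the joint over all assignments consistent with the evidence."""
    scores = {0: 0.0, 1: 0.0}
    for g, r, a, p in product((0, 1), repeat=4):
        assign = {"Gijon": g, "Rioseco": r, "Amieva": a, "Proaza": p}
        if all(assign[k] == v for k, v in evidence.items()):
            scores[p] += joint(g, r, a, p)
    total = scores[0] + scores[1]
    return {state: s / total for state, s in scores.items()}

print(posterior_proaza({"Gijon": 1}))   # probability of rain at Proaza given rain observed at Gijon
```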

Output:

Bayesian network of precipitation grid points and local precipitation at the network of local stations.

Conclusion: We have used Bayesian network learning and shown its applicability to local weather forecasting and downscaling. The preliminary results presented how such models can be built and how they can be used for performing inference.

For a given network of cities, find an optimal path to reach from a given source city to any other destination city using an admissible heuristic.


Objective: For a given network of cities, find an optimal path to reach from a given source city to any other destination city using an admissible heuristic.

Theory:

Heuristics: The heuristic function h(n) tells A* an estimate of the minimum cost from any vertex n to the goal. It’s important to choose a good heuristic function.

The heuristic can be used to control A*’s behavior.

  • At one extreme, if h(n) is 0, then only g(n) plays a role, and A* turns into Dijkstra’s Algorithm, which is guaranteed to find a shortest path.
  • If h(n) is always lower than (or equal to) the cost of moving from n to the goal, then A* is guaranteed to find a shortest path. The lower h(n) is, the more nodes A* expands, making it slower.
  • If h(n) is exactly equal to the cost of moving from n to the goal, then A* will only follow the best path and never expand anything else, making it very fast. Although you can’t make this happen in all cases, you can make it exact in some special cases. It’s nice to know that given perfect information, A* will behave perfectly.
  • If h(n) is sometimes greater than the cost of moving from n to the goal, then A* is not guaranteed to find a shortest path, but it can run faster.
  • At the other extreme, if h(n) is very high relative to g(n), then only h(n) plays a role, and A* turns into Greedy Best-First-Search.

So we have an interesting situation in that we can decide what we want to get out of A*. With 100% accurate estimates, we’ll get shortest paths really quickly. If we’re too low, then we’ll continue to get shortest paths, but it’ll slow down. If we’re too high, then we give up shortest paths, but A* will run faster.

Procedure:

  1. Put the start node s on a list called OPEN of unexpanded nodes.
  2. If OPEN is empty, exit with failure; no solution exists.
  3. Remove from OPEN the first node n at which f is minimum (break ties arbitrarily), and place it on a list called CLOSED to be used for expanded nodes.
  4. If n is a goal node, exit successfully with the solution obtained by tracing the path along the pointers from the goal back to s.
  5. Otherwise, expand node n, generating all its successors with pointers back to n.
  6. For every successor n′ of n:
     a. Calculate f(n′).
     b. If n′ was neither on OPEN nor on CLOSED, add it to OPEN. Attach a pointer from n′ back to n. Assign the newly computed f(n′) to node n′.
     c. If n′ already resided on OPEN or CLOSED, compare the newly computed f(n′) with the value previously assigned to n′. If the old value is lower, discard the newly generated node. If the new value is lower, substitute it for the old (n′ now points back to n instead of to its previous predecessor). If the matching node n′ resides on CLOSED, move it back to OPEN.
  7. Go to step 2.
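A compact Python sketch of this procedure for a made-up city network, using the straight-line (Euclidean) distance between city coordinates as the admissible heuristic:

```python
# A* over a small, hypothetical city network; h(n) = straight-line distance to the goal,
# which never overestimates the true road distance, so it is admissible.
import heapq, math

coords = {"A": (0, 0), "B": (2, 1), "C": (4, 0), "D": (2, 3), "E": (5, 3)}   # made-up coordinates
roads = {"A": {"B": 2.4, "D": 3.8}, "B": {"A": 2.4, "C": 2.3, "D": 2.0},
         "C": {"B": 2.3, "E": 3.2}, "D": {"A": 3.8, "B": 2.0, "E": 3.0}, "E": {"C": 3.2, "D": 3.0}}

def h(city, goal):
    (x1, y1), (x2, y2) = coords[city], coords[goal]
    return math.hypot(x2 - x1, y2 - y1)

def a_star(start, goal):
    open_list = [(h(start, goal), 0.0, start, [start])]    # entries are (f, g, city, path)
    best_g = {start: 0.0}
    while open_list:
        f, g, city, path = heapq.heappop(open_list)        # node with minimum f
        if city == goal:
            return path, g
        for nxt, cost in roads[city].items():
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):         # keep only the cheaper route to nxt
                best_g[nxt] = g2
                heapq.heappush(open_list, (g2 + h(nxt, goal), g2, nxt, path + [nxt]))
    return None, float("inf")

print(a_star("A", "E"))   # optimal route from source city A to destination city E and its cost
```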
















Conclusion: When h is consistent, the f values of nodes expanded by A* are never decreasing. When A* selects n for expansion, it has already found the shortest path to it. When h is consistent, every node is expanded only once. Normally the heuristics we encounter are consistent:

– the number of misplaced tiles

– Manhattan distance

– straight-line distance









