A. Convolutional Neural Networks
Convolutional neural networks (CNNs) are a class of neural networks particularly suited for image analysis. They are widely used for image classification, recognition, and object detection [13]. A typical CNN architecture contains convolutional, pooling, and fully-connected layers. Relatively novel techniques such as batch normalization, dropout, and shortcut connections [14] can additionally be used to increase classification accuracy.
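As an illustration, the sketch below assembles these building blocks into a minimal Keras model (the framework, layer sizes, and the 7-class output are our assumptions for illustration, not the paper's exact configuration):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Minimal CNN combining the building blocks named above:
# convolution, pooling, batch normalization, dropout, fully-connected layers.
model = keras.Sequential([
    keras.Input(shape=(224, 224, 3)),                  # RGB input image
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    layers.BatchNormalization(),                       # stabilizes training
    layers.MaxPooling2D((2, 2)),                       # spatial down-sampling
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),                               # regularization
    layers.Dense(7, activation="softmax"),             # hypothetical 7 classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```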
B. Conv Net Architecture
For Convolutional Neural Networks, VGGNet is a well-documented and widely used architecture, now very popular due to its impressive performance on image data. Its best-performing configurations (with 16 and 19 weight layers) have been made publicly available. In this work, the VGG16 architecture was chosen, since it has been found easy to apply to different types of datasets and to generalize well to other data. During training, the input to our ConvNets is a fixed-size 224 × 224 RGB image. The only pre-processing we do is subtracting the mean RGB value, computed on the training set, from each pixel.
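The mean-subtraction step can be sketched as follows (a minimal NumPy illustration; the function name and array shapes are our assumptions):

```python
import numpy as np

def subtract_mean_rgb(train_images, test_images):
    """Subtract the per-channel mean RGB value, computed on the training set,
    from every pixel. Arrays are float, shape (N, 224, 224, 3)."""
    mean_rgb = train_images.mean(axis=(0, 1, 2))   # one mean per color channel
    # The same training-set statistics are applied to the test images.
    return train_images - mean_rgb, test_images - mean_rgb
```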
The image is passed through a stack of convolutional layers, where we use filters with a very small receptive field of 3 × 3 (the smallest size that captures the notion of left/right, up/down, and center). Only max pooling is used in VGG-16: the pooling kernel size is always 2 × 2 and the stride is always 2. Fully connected layers are implemented using convolution in VGG-16; their size is given in the format n1 × n2, where n1 is the size of the input tensor and n2 is the size of the output tensor.
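Following this description, the convolutional stack and fully connected head can be sketched in Keras as follows (the framework is our assumption; in practice the pretrained network can also be loaded directly via tensorflow.keras.applications.VGG16):

```python
from tensorflow.keras import layers, models

def vgg_block(x, filters, n_convs):
    """A VGG-style block: n_convs 3x3 convolutions followed by
    2x2 max pooling with stride 2."""
    for _ in range(n_convs):
        x = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    return layers.MaxPooling2D(pool_size=(2, 2), strides=2)(x)

inputs = layers.Input(shape=(224, 224, 3))
x = vgg_block(inputs, 64, 2)    # block 1
x = vgg_block(x, 128, 2)        # block 2
x = vgg_block(x, 256, 3)        # block 3
x = vgg_block(x, 512, 3)        # block 4
x = vgg_block(x, 512, 3)        # block 5
x = layers.Flatten()(x)                                # 7 x 7 x 512 -> 25088
x = layers.Dense(4096, activation="relu")(x)           # FC 25088 x 4096
x = layers.Dropout(0.5)(x)
x = layers.Dense(4096, activation="relu")(x)           # FC 4096 x 4096
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1000, activation="softmax")(x)  # FC 4096 x 1000 head
model = models.Model(inputs, outputs)
```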
Dropout is a technique to improve the generalization of deep learning methods. It sets the weights connected to a certain percentage of nodes in the network to 0 (VGG-16 sets this percentage to 0.5 in its two dropout layers) [15]. The input layer of the network expects a 224 × 224 pixel RGB image. All hidden layers use a ReLU (Rectified Linear Unit) as the activation function (nonlinearity operation), and spatial pooling is performed by max-pooling layers. The network ends with a classifier block consisting of three Fully Connected (FC) layers.

Fig. 1. VGG-16 Architecture

C. Support Vector Machine
Support vector machines (SVMs, also called support vector networks) are supervised learning models that analyze data for classification and regression analysis. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall. In addition to linear classification, SVMs can perform non-linear classification using the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.

D. Random Forest
Random Forest is a supervised learning algorithm. It builds multiple decision trees and merges them together to obtain a more accurate and stable prediction. Random Forest is a flexible, easy-to-use machine learning algorithm that produces a good result most of the time, even without hyper-parameter tuning. It is also one of the most commonly used algorithms, owing to its simplicity and the fact that it can be used for both classification and regression tasks. The forest it builds is an ensemble of decision trees, usually trained with the bagging method. The general idea of bagging is that a combination of learning models improves the overall result. Random Forest has nearly the same hyper-parameters as a decision tree or a bagging classifier, but it adds additional randomness to the model while growing the trees [7]: instead of searching for the most important feature when splitting a node, it searches for the best feature among a random subset of features. This results in greater diversity, which generally yields a better model. Therefore, in a random forest, only a random subset of the features is considered by the algorithm when splitting a node. A random forest is a collection of decision trees, but there are some differences: deep decision trees may suffer from overfitting, whereas random forest prevents overfitting most of the time by creating random subsets of the features and building smaller trees from these subsets. Random forests are thus a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the variance. A minimal usage sketch of both classifiers follows.
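For illustration, both classifiers can be driven through scikit-learn roughly as follows (scikit-learn and the synthetic placeholder features are our assumptions; in the actual pipeline the inputs would be image features and lesion labels):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for extracted image features and class labels.
X, y = make_classification(n_samples=500, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# SVM with an RBF kernel: non-linear classification via the kernel trick.
svm = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)

# Random forest: bagged decision trees, each split chosen from a random
# subset of the features.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print("SVM accuracy:", svm.score(X_test, y_test))
print("Random Forest accuracy:", forest.score(X_test, y_test))
```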
IV. PROPOSED METHODOLOGY
In this research, several methods have been investigated, from classical machine learning algorithms such as SVM and the tree-based Random Forest to a deep learning-based algorithm. The process of disease detection and classification is shown in the figure below.

A. Pre-Processing
In our work, we attempt to keep the pre-processing steps minimal to ensure better generalization ability when tested on other dermoscopic skin lesion datasets. We thus apply only some standard pre-processing steps. First, we normalize the pixel values of the images. Next, the images are resized to 224 × 224 pixels.

B. Data Augmentation
Data augmentation is a method used to avoid overfitting when training machine learning models. The goal of data augmentation is to increase the size of our dataset so that robust Convolutional Network models can be trained with limited or small amounts of data. This step is required to improve the performance of an image classification model. Some of the simplest augmentations are flipping, translations, rotation, scaling, color enhancement, isolating individual R, G, B color channels, and adding noise. Generating augmented input for a CNN using image analysis filters: the traditional input to a CNN architecture consists of whole images, or image patches, of a standard size, in RGB format. In this work, we augment the input of the CNN with the responses of a number of well-established filters that are frequently used for image feature extraction. We augment the training set by blurring the images, using Gaussian blur to reduce noise and make the images smoother. After that, we convert the RGB image to enhance the red color in the image and apply a separate layer to the image; segmentation is later performed on these images. This augmentation leads to an increase in training data.

Fig. 2. Flow Diagram of our model
Fig. 3. Data Augmentation

The results of this research have the potential to be used as a practical tool for diagnosis.

C. Image Segmentation
Image segmentation is an important area in image processing. It is the process of partitioning an image into different groups. There are many different methods, and k-means is one of the most popular. K-means clustering marks the different regions of the image with different colors and, where possible, creates boundaries separating the regions. The motivation behind image segmentation using k-means is that we try to assign a label to each pixel based on its RGB values. Color quantization is the process of reducing the number of colors in an image. Sometimes a device may have a constraint such that it can produce only a limited number of colors; in those cases, too, color quantization is performed. Here we use k-means clustering for color quantization. There are 3 features, namely R, G, and B, so we reshape the image to an array of size M × 3 (M is the number of pixels in the image). We also set a criteria value for k-means, which defines its termination criteria. We return the segmented output as well as the labeled result. The labeled result contains the labels 0 to N−1, where N is the number of partitions we choose; each label corresponds to one partition. After the clustering, we assign the centroid values (also R, G, B triples) to all pixels, so that the resulting image has the specified number of colors, and we reshape it back to the shape of the original image. This procedure is sketched below.
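The quantization procedure can be sketched with OpenCV's k-means (a minimal illustration; the function name and default cluster count are ours, not the paper's exact code):

```python
import cv2
import numpy as np

def quantize_colors(image, n_clusters=4):
    """Color quantization via k-means: reshape the image to an M x 3 array of
    pixel color features, cluster them, then repaint every pixel with its
    cluster centroid and reshape back to the original image shape."""
    pixels = image.reshape(-1, 3).astype(np.float32)        # M x 3 features
    # Termination criteria: at most 10 iterations or centroid shift < 1.0.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(pixels, n_clusters, None, criteria,
                                    10, cv2.KMEANS_RANDOM_CENTERS)
    quantized = centers[labels.flatten()].astype(np.uint8)  # centroid colors
    # Return the quantized image and the per-pixel label map (0 .. N-1).
    return quantized.reshape(image.shape), labels.reshape(image.shape[:2])
```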
D. Classification
For disease classification, we use a collection of recent machine learning models: SVM, Random Forest, and Convolutional neural networks. For the deep learning algorithm, we have chosen the VGG-16 Convolutional neural network architecture. In our system, we propose to combine segmentation, classification, and a Convolutional Neural Network. Since we have only a small amount of data to feed into the Convolutional neural network, we used data augmentation to increase the size of our training data, so that the model fits well on the validation data. This classification method proves to be efficient for most of the skin images.

V. RESULTS AND DISCUSSION