Neural networks, specifically Convolutional Neural Networks (CNNs), have proven highly effective for image segmentation and classification tasks in computer vision. These tasks have numerous practical applications, such as in medical imaging, autonomous driving, and surveillance. CNNs are capable of learning complex features directly from images and achieving outstanding performance across several datasets. In this work, we have utilized three different datasets to investigate the efficacy of various preprocessing and classification techniques in accurately segmenting and classifying different structures within MRI and natural images. We have utilized both sample gradient and Canny edge detection methods for pre-processing, and K-means clustering has been applied to segment the images. Image augmentation increases the size and diversity of the datasets used to train the models for image classification.
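The gradient pre-processing and K-means segmentation steps described above can be sketched in a few lines of numpy. This is an illustrative toy version (central-difference gradients and intensity-only K-means on a synthetic image), not the exact pipeline, datasets, or parameters used in the work:

```python
import numpy as np

def gradient_magnitude(img):
    # Simple gradient pre-processing: central differences, then magnitude
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def kmeans_segment(img, k=2, iters=20):
    # Cluster pixel intensities into k groups; return a label map
    pixels = img.reshape(-1, 1).astype(float)
    # Deterministic init: spread centers over the intensity range
    centers = np.linspace(pixels.min(), pixels.max(), k).reshape(k, 1)
    for _ in range(iters):
        labels = np.argmin(np.abs(pixels - centers.T), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean()
    return labels.reshape(img.shape)

# Toy image: dark background with one bright square "structure"
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
edges = gradient_magnitude(img)   # responds on the square's border only
seg = kmeans_segment(img, k=2)    # separates the square from the background
```

On real MRI or natural images the feature vector per pixel would typically include more than raw intensity (e.g. color channels or gradient responses), but the clustering step is the same.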
Evaluation of deep neural network architectures in the identification of bone... (TELKOMNIKA JOURNAL)
This document evaluates the performance of three deep neural network architectures - ResNet, DenseNet, and NASNet - in identifying bone fissures in radiological images. The networks were trained on a dataset of 1000 labeled images of fissured and seamless bones. NASNet achieved the best performance with 75% accuracy, outperforming ResNet and DenseNet. While all networks reduced classification errors, NASNet did so with the fewest parameters. The document concludes NASNet is the best solution for this bone fissure identification task.
This document outlines a project on brain tumor detection and diagnosis using convolutional neural networks. It discusses the objective of outlining current automatic segmentation techniques using CNNs. It then provides an introduction on the importance of accurate brain tumor segmentation for diagnosis and treatment. The remaining sections cover literature reviews on CNN segmentation methods, the overall architecture and working principles, applications and the future scope of this area of research.
Unveiling the Power of Convolutional Neural Networks in Image Processing (Enterprise Wired)
In this comprehensive guide, we'll explore the significance of convolutional neural networks, delve into their architecture and functioning, and highlight their transformative impact on image processing and beyond.
Accuracy study of image classification for reverse vending machine waste segr... (IJECEIAES)
This study aims to create a sorting system with high accuracy that can classify various beverage containers based on type and separate them accordingly. This reverse vending machine (RVM) provides an image classification method and allows for recycling three types of beverage containers: drink carton boxes, polyethylene terephthalate (PET) bottles, and aluminium cans. The image classification method used in this project is transfer learning with convolutional neural networks (CNNs). AlexNet, GoogLeNet, DenseNet201, InceptionResNetV2, InceptionV3, MobileNetV2, XceptionNet, ShuffleNet, ResNet 18, ResNet 50, and ResNet 101 are the neural networks used in this project. The project compares the F1-score and computational time among the eleven networks; both differ for each neural network. In this project, the AlexNet network gave the best F1-score, 97.50%, with the shortest computational time, 2229.235 s, among the eleven neural networks.
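The F1-score used to compare the eleven networks is the harmonic mean of precision and recall; from raw counts it can be computed as follows (the counts shown are hypothetical, not from the study):

```python
def f1_score(tp, fp, fn):
    # F1 = harmonic mean of precision and recall
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one container class:
# 90 correct detections, 10 false alarms, 5 misses
score = f1_score(tp=90, fp=10, fn=5)   # 12/13 ≈ 0.923
```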
IRJET- A Survey on Medical Image Interpretation for Predicting Pneumonia (IRJET Journal)
This document summarizes research on using machine learning and deep learning techniques to interpret medical images and predict pneumonia. It first discusses how medical image analysis is an active field for machine learning. It then reviews several related studies on using convolutional neural networks (CNNs) and transfer learning to classify chest x-rays and detect pneumonia. Specifically, it examines research on developing CNN models for pneumonia classification and using pre-trained CNN architectures like VGG16, VGG19, and ResNet with transfer learning. The document concludes that computer-aided diagnosis systems using deep learning can provide accurate predictions to assist radiologists in pneumonia diagnosis from chest x-rays.
Dilated Inception U-Net for Nuclei Segmentation in Multi-Organ Histology Images (IRJET Journal)
The document summarizes a study that used a Dilated Inception U-Net model for nuclei segmentation in histology images. Key points:
1. A Dilated Inception U-Net model was used to segment nuclei in histology images, which employs dilated convolutions to efficiently generate feature maps over a large input area.
2. The model was tested on the MoNuSeg dataset containing H&E stained images. Preprocessing included color normalization, data augmentation, and extracting 256x256 patches.
3. The Dilated Inception U-Net modifies the classic U-Net by replacing convolutional blocks with dilated inception blocks containing 1x1 and 3x3 filters with different dilation rates, allowing it to
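The dilated convolutions mentioned in point 1 widen the receptive field without adding parameters; a minimal 1-D numpy illustration of the effect (not the paper's implementation):

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    # "Valid" 1-D convolution with dilation-1 skipped samples between taps
    k = len(w)
    span = (k - 1) * dilation + 1   # receptive field of one output sample
    out = np.array([
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])
    return out, span

x = np.arange(10.0)
w = np.array([1.0, 1.0, 1.0])                   # 3-tap kernel
y1, span1 = dilated_conv1d(x, w, dilation=1)    # sees 3 consecutive samples
y2, span2 = dilated_conv1d(x, w, dilation=2)    # same 3 weights, spans 5 samples
```

Stacking blocks with increasing dilation rates, as the dilated inception blocks do, grows the receptive field quickly while keeping the parameter count fixed.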
This document provides an overview of medical image segmentation using deep learning techniques. It discusses several deep learning architectures used for medical image segmentation, including U-Net, V-Net, GoogleNet, and ResNet. U-Net uses a symmetric encoder-decoder structure with skip connections to efficiently segment biomedical images. V-Net directly processes 3D MRI volumes for prostate segmentation. GoogleNet and ResNet employ inception modules and residual connections, respectively, to reduce parameters and enable training of very deep networks for medical image analysis tasks. The document aims to classify medical image segmentation approaches, discuss challenges, and outline future research directions using deep learning.
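Encoder-decoder segmentation models like those above are commonly scored with overlap metrics such as the Dice coefficient (a standard choice in medical segmentation, though not named in the summary); a minimal sketch:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    # Dice coefficient: 2|A ∩ B| / (|A| + |B|) on binary masks
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((4, 4), int); a[:2] = 1    # predicted mask: top two rows
b = np.zeros((4, 4), int); b[1:3] = 1   # ground truth: middle two rows
# overlap is one row (4 px), |a| = |b| = 8, so Dice = 2*4 / 16 = 0.5
```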
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION (cscpconf)
This paper aims to provide insight into the transferability of deep CNN features to unsupervised problems. We study the impact of different pretrained CNN feature extractors on the problem of image-set clustering for object classification as well as fine-grained classification. We propose a rather straightforward pipeline combining deep-feature extraction using a CNN pretrained on ImageNet and a classic clustering algorithm to classify sets of images. This approach is compared to state-of-the-art algorithms in image clustering and provides better results. These results strengthen the belief that supervised training of deep CNNs on large datasets with a large variability of classes extracts better features than most carefully designed engineering approaches, even for unsupervised tasks. We also validate our approach on a robotic application, consisting of sorting and storing objects smartly based on clustering.
The document describes a study that used a convolutional neural network with a ConvNeXtLarge architecture to classify skin cancer images into benign and malignant classes. The CNN model was trained on a dataset of 3,297 skin cancer images from Kaggle. It achieved an AUC of 0.91 for classifying the images, demonstrating the ConvNeXtLarge architecture is effective for this task. The study aims to help early diagnosis and treatment of skin cancers.
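The reported AUC of 0.91 can be read as the probability that a randomly chosen malignant image receives a higher score than a randomly chosen benign one; a library-free sketch with hypothetical scores:

```python
def auc(pos_scores, neg_scores):
    # AUC = P(positive scored above negative); ties count half
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else (0.5 if p == n else 0.0)
    return wins / (len(pos_scores) * len(neg_scores))

malignant = [0.9, 0.8, 0.6]   # hypothetical classifier scores, not the study's
benign = [0.7, 0.3, 0.2]
score = auc(malignant, benign)   # 8 of 9 pairs ranked correctly ≈ 0.889
```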
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHM (IRJET Journal)
- The document discusses a study on detecting diseases in paddy/rice crops using machine learning algorithms such as convolutional neural networks (CNNs) and support vector machines (SVMs).
- A dataset of rice leaf images was created and a CNN model using transfer learning with MobileNet was developed and trained on the dataset to classify rice diseases.
- The proposed method aims to automatically classify rice disease images to help farmers more accurately identify diseases, as manual identification can be difficult and inaccurate. This could help improve treatment and support farmers.
Power of Convolutional Neural Networks in Modern AI (The Lifesciences Magazine)
Convolutional neural networks (CNNs) stand out as a ground-breaking technique with significant ramifications across multiple areas in the rapidly changing field of artificial intelligence (AI).
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE... (IRJET Journal)
This document presents research on using a convolutional neural network (CNN) model for the detection and classification of brain tumors from MRI images. The CNN model improves the accuracy of tumor detection and can serve as a useful tool for physicians. The researchers trained and tested several CNN architectures, including CNN, ResNet50, MobileNetV2, and VGG19 on an MRI brain image database. Their proposed model uses a modified Residual U-Net architecture with residual blocks and attention gates to better segment tumors and extract local features from MRI images. Evaluation results found their model achieved better accuracy than existing methods like U-Net and CNN for brain tumor segmentation tasks.
Overview of convolutional neural networks architectures for brain tumor segm... (IJECEIAES)
Due to the paramount importance of the medical field in people's lives, researchers and experts have exploited advances in computer techniques to solve many diagnostic and analytical medical problems. Brain tumor diagnosis is one of the most important computational problems that has been studied. The brain tumor is delineated by segmentation of brain images using many techniques based on magnetic resonance imaging (MRI). Brain tumor segmentation methods have been developed for a long time and are still evolving, but the current trend is to use deep convolutional neural networks (CNNs) due to their many breakthroughs, the unprecedented results they have achieved in various applications, and their capacity to learn a hierarchy of progressively more complicated characteristics from input without requiring manual feature extraction. Considering these results, we present this paper as a brief review of the main CNN architecture types used in brain tumor segmentation. Specifically, we focus on works that used the well-known brain tumor segmentation (BraTS) dataset.
This chapter introduces concepts related to artificial intelligence, machine learning, and deep learning. It specifically discusses convolutional neural networks and their application to building computer-aided diagnosis models for chest x-rays. Key factors for designing chest x-ray CAD models using CNNs are discussed, including network architecture, training types like transfer learning, and data augmentation. The chapter motivates the use of deep learning techniques for medical image analysis and diagnosis to help automate feature learning and improve over traditional CAD systems.
Image Forgery Detection Methods - A Review (IRJET Journal)
This document reviews various methods for detecting image forgery. It begins with an introduction to the topic, explaining the need for image forgery detection techniques due to the widespread manipulation of images online. It then categorizes common types of image manipulation and provides a literature review comparing the accuracy and citations of different detection techniques, such as CNN-based methods, transform-domain methods using DCT and DWT, and methods analyzing JPEG compression artifacts. The review finds that CNN-based methods generally achieve the highest accuracy, around 90-100%, but notes that transform-domain and JPEG-based methods can also reach reasonably high accuracy, ranging from 70-100% depending on the technique and testing parameters.
Image Captioning Generator using Deep Machine Learning (ijtsrd)
Technology's scope has evolved into one of the most powerful tools for human development in a variety of fields. AI and machine learning have become some of the most powerful tools for completing tasks quickly and accurately without the need for human intervention. This project demonstrates how deep machine learning can be used to create a caption or a sentence for a given picture. This can be used for visually impaired persons, as well as in automobiles for self-identification, and for various applications to verify quickly and easily. A Convolutional Neural Network (CNN) is used to describe the image, and a Long Short-Term Memory (LSTM) network is used to organize the right meaningful sentences in this model. The Flickr8k and Flickr30k datasets were used for training. Sreejith S P | Vijayakumar A, "Image Captioning Generator using Deep Machine Learning", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5, Issue-4, June 2021. URL: https://www.ijtsrd.com/papers/ijtsrd42344.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/42344/image-captioning-generator-using-deep-machine-learning/sreejith-s-p
Lung Cancer Detection using transfer learning (jagan477830)
Lung cancer is one of the deadliest cancers worldwide. However, early detection of lung cancer significantly improves the survival rate. Cancerous (malignant) and noncancerous (benign) pulmonary nodules are small growths of cells inside the lung. Detection of malignant lung nodules at an early stage is crucial for prognosis.
The document proposes a method for face recognition using deep learning and data augmentation. It cleans and pre-processes existing face datasets to remove noise and extracts faces. It then uses image processing techniques to add masks to the faces to create a new masked face dataset. An Inception Resnet-v1 model is trained on the new dataset. The method is applied to build a face recognition application for employee timekeeping that achieves high accuracy even when faces are masked.
A Survey of Convolutional Neural Network Architectures for Deep Learning via ... (ijtsrd)
Convolutional Neural Network (CNN) designs can successfully classify, predict and cluster in many artificial intelligence applications. In the health sector, intensive studies continue on disease classification. When the literature in this field is examined, it is seen that studies are concentrated in the health sector. Thanks to these studies, doctors can make an accurate diagnosis by examining radiological images more consistently, and can save time for other patient work by using CNNs. In this study, related current manuscripts in the health sector were examined. The contributions of these publications to the literature were explained and evaluated, and complementary and contradictory arguments of the presented perspectives were revealed. The current status of the studies and the directions in which future studies should evolve are stated, along with suggestions to guide future work. Ahmet Özcan | Mahmut Ünver | Atilla Ergüzen, "A Survey of Convolutional Neural Network Architectures for Deep Learning via Health Images", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6, Issue-2, February 2022. URL: https://www.ijtsrd.com/papers/ijtsrd49156.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/49156/a-survey-of-convolutional-neural-network-architectures-for-deep-learning-via-health-images/ahmet-özcan
A SYSTEMATIC STUDY OF DEEP LEARNING ARCHITECTURES FOR ANALYSIS OF GLAUCOMA AN... (ijaia)
This document provides a review of deep learning architectures that can be used to segment and classify ocular diseases like glaucoma and hypertensive retinopathy from fundus images. It discusses various deep learning models like U-Net, CNNs, FCNs and autoencoders. The review analyzes the performance of these models and their suitability for deployment on edge devices. It also provides an overview of works that have applied deep learning techniques for glaucoma and hypertensive retinopathy classification and segmentation using datasets like ORIGA and MESSIDOR. The review concludes with discussing future directions in applying deep learning models for medical image analysis.
Machine learning based augmented reality for improved learning application th... (IJECEIAES)
Detection of objects and their location in an image are important elements of current research in computer vision. In May 2020, Meta released its state-of-the-art object-detection model based on a transformer architecture, called detection transformer (DETR). There are several object-detection models such as region-based convolutional neural network (R-CNN), you only look once (YOLO) and single shot detectors (SSD), but none have used a transformer to accomplish this task. The models mentioned earlier use all sorts of hyperparameters and layers. However, the advantages of using a transformer make the architecture simple and easy to implement. In this paper, we determine the name of a chemical experiment in two steps: first, by building a DETR model trained on a customized dataset, and then by integrating it into an augmented reality mobile application. By detecting the objects used during the realization of an experiment, we can predict the name of the experiment using a multi-class classification approach. The combination of various computer vision techniques with augmented reality is indeed promising and offers a better user experience.
IRJET- Comparative Study of Different Techniques for Text as Well as Object D... (IRJET Journal)
This document discusses and compares different techniques for object and text detection from real-time images, including OCR, RCNN, Mask RCNN, Fast RCNN, and Faster RCNN algorithms. It finds that Mask RCNN, an extension of Faster RCNN, is generally the best algorithm for object detection in real-time images, as it outperforms other models in accuracy for tasks like object detection, segmentation, and captioning challenges. The document provides background on machine learning and neural networks approaches to image recognition and object detection.
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION (ijaia)
Most currently known methods treat the person re-identification task as a classification problem and commonly use neural networks. However, these methods use only high-level convolutional features to express the feature representation of pedestrians. Moreover, the current datasets for person re-identification are relatively small. Given the limited size of the training set, deep convolutional networks are difficult to train adequately, so it is worthwhile to introduce auxiliary datasets to help training. To solve this problem, this paper proposes a novel deep transfer learning method that combines a comparison model with a classification model and multi-level fusion of convolutional features on the basis of transfer learning. In a multi-layer convolutional network, the features of each layer are a dimensionality reduction of the previous layer's results, but the information in multi-level features is not only inclusive but also somewhat complementary. We can use the information gap between different layers of a convolutional neural network to extract a better feature expression. Finally, the proposed algorithm is fully tested on four datasets (VIPeR, CUHK01, GRID and PRID450S). The obtained re-identification results prove the effectiveness of the algorithm.
Image compression and reconstruction using a new approach by artificial neura... (Hưng Đặng)
This document describes a neural network approach to image compression and reconstruction. It discusses using a backpropagation neural network with three layers (input, hidden, output) to compress an image by representing it with fewer hidden units than input units, then reconstructing the image from the hidden unit values. It also covers preprocessing steps like converting images to YCbCr color space, downsampling chrominance, normalizing pixel values, and segmenting images into blocks for the neural network. The neural network weights are initially randomized and then trained using backpropagation to learn the image compression.
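The preprocessing steps described (RGB-to-YCbCr conversion and segmenting images into blocks) can be sketched as follows; the conversion uses the standard ITU-R BT.601 full-range coefficients, which may differ from the paper's exact constants:

```python
import numpy as np

def rgb_to_ycbcr(img):
    # ITU-R BT.601 full-range conversion; img is H x W x 3 with values 0..255
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y  =        0.299 * r +    0.587 * g +    0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g +      0.5 * b
    cr = 128 +      0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def to_blocks(channel, size=8):
    # Split an H x W channel into non-overlapping size x size blocks
    h, w = channel.shape
    return (channel[:h - h % size, :w - w % size]
            .reshape(h // size, size, w // size, size)
            .swapaxes(1, 2)
            .reshape(-1, size, size))

img = np.full((16, 16, 3), 255.0)          # all-white RGB image
ycc = rgb_to_ycbcr(img)                    # Y = 255, Cb = Cr = 128 for white
blocks = to_blocks(ycc[..., 0], size=8)    # four 8x8 luma blocks for the network
```

Each flattened block would then be normalized and fed to the network's input units.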
Brain tumor classification in magnetic resonance imaging images using convol... (IJECEIAES)
Deep learning (DL) is a subfield of artificial intelligence (AI) used in several sectors, such as cybersecurity, finance, marketing, automated vehicles, and medicine. Due to advances in computer performance, DL has become very successful. In recent years, it has processed large amounts of data and achieved good results, especially in image analysis tasks such as segmentation and classification. Manual evaluation of tumors based on medical images requires expensive human labor and can easily lead to misdiagnosis. Researchers are therefore interested in using DL algorithms for automatic tumor diagnosis; the convolutional neural network (CNN) is one such algorithm, well suited to medical image classification tasks. In this paper, we focus on the development of four sequential CNN models to classify brain tumors in magnetic resonance imaging (MRI) images. We followed two steps: first data preprocessing, and second automatic classification of the preprocessed images using a CNN. The experiments were conducted on a dataset of 3,000 MRI images divided into two classes: tumor and normal. We obtained a good accuracy of 98.27%, which outperforms other existing models.
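The basic building block of such sequential CNN models is a 2-D convolution followed by a nonlinearity; a minimal numpy forward pass, purely illustrative and not the paper's four architectures:

```python
import numpy as np

def conv2d(img, kernel):
    # "Valid" 2-D cross-correlation, as computed inside a CNN layer
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    # Standard nonlinearity applied after the convolution
    return np.maximum(x, 0)

img = np.zeros((6, 6))
img[:, 3:] = 1.0                        # image with a vertical edge
edge_kernel = np.array([[-1.0, 1.0]])   # fires on left-to-right increases
fmap = relu(conv2d(img, edge_kernel))   # activates only along the edge column
```

A trained model stacks many such filters, learning the kernel values from data rather than hand-coding them as here.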
AUTOMATIC FRUIT RECOGNITION BASED ON DCNN FOR COMMERCIAL SOURCE TRACE SYSTEM (ijcsa)
Automatic fruit recognition using machine vision is considered a challenging task due to similarities between various types of fruits and external environmental changes, e.g., lighting. In this paper, a fruit recognition algorithm based on a Deep Convolutional Neural Network (DCNN) is proposed. Most previous techniques have limitations because they were examined and evaluated on limited datasets and did not consider external environmental changes. Another major contribution of this paper is a fruit image database with 15 different categories comprising 44,406 images, collected over a period of 6 months under different real-world conditions, keeping in view the limitations of existing datasets. Images were used directly as input to the DCNN for training and recognition without extracting features; the DCNN learns optimal features from images through an adaptation process. The final decision is based on a fusion of all regional classifications using a probability mechanism. Experimental results show that the proposed approach can automatically recognize fruit with a high accuracy of 99% and can effectively meet real-world application requirements.
Home security is of paramount importance in today's world. As we rely more on technology, using it to make homes safer and easier to control from anywhere becomes essential for the occupants' safety. In this paper, we present a low-cost, AI-based home security system. The system has a user-friendly interface, allowing users to start model training and face detection with simple keyboard commands. Our goal is to introduce an innovative home security system using facial recognition technology. Unlike traditional systems, this system trains on and saves images of friends and family members, scans this folder to recognize familiar faces, and provides real-time monitoring. If an unfamiliar face is detected, it promptly sends an email alert, ensuring a proactive response to potential security threats.
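One common way to implement the familiar/unfamiliar decision is to compare a face embedding against the saved embeddings of friends and family using a distance threshold; the sketch below is a hypothetical illustration (the system's actual recognizer, embedding model, and threshold are not specified in the abstract):

```python
import numpy as np

def is_familiar(embedding, known_embeddings, threshold=0.6):
    # Familiar if the embedding is within `threshold` of any saved face
    dists = [np.linalg.norm(embedding - k) for k in known_embeddings]
    return min(dists) < threshold

# Hypothetical 2-D embeddings standing in for saved family members
known = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

visitor = np.array([0.9, 0.1])      # close to a saved face -> familiar
stranger = np.array([-1.0, -1.0])   # far from all saved faces -> send alert
```

In a real system the embeddings would come from a face recognition model and the threshold would be tuned on validation data; an email alert would be triggered whenever `is_familiar` returns False.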
In the era of data-driven warfare, the integration of big data and machine learning (ML) techniques has
become paramount for enhancing defence capabilities. This research report delves into the applications of
big data and ML in the defence sector, exploring their potential to revolutionize intelligence gathering,
strategic decision-making, and operational efficiency. By leveraging vast amounts of data and advanced
algorithms, these technologies offer unprecedented opportunities for threat detection, predictive analysis,
and optimized resource allocation. However, their adoption also raises critical concerns regarding data
privacy, ethical implications, and the potential for misuse. This report aims to provide a comprehensive
understanding of the current state of big data and ML in defence, while examining the challenges and
ethical considerations that must be addressed to ensure responsible and effective implementation.
Weitere ähnliche Inhalte
Ähnlich wie Image Segmentation and Classification using Neural Network
The document describes a study that used a convolutional neural network with a ConvNeXtLarge architecture to classify skin cancer images into benign and malignant classes. The CNN model was trained on a dataset of 3,297 skin cancer images from Kaggle. It achieved an AUC of 0.91 for classifying the images, demonstrating the ConvNeXtLarge architecture is effective for this task. The study aims to help early diagnosis and treatment of skin cancers.
PADDY CROP DISEASE DETECTION USING SVM AND CNN ALGORITHMIRJET Journal
- The document discusses a study on detecting diseases in paddy/rice crops using deep learning algorithms like convolutional neural networks (CNN) and support vector machines (SVM).
- A dataset of rice leaf images was created and a CNN model using transfer learning with MobileNet was developed and trained on the dataset to classify rice diseases.
- The proposed method aims to automatically classify rice disease images to help farmers more accurately identify diseases, as manual identification can be difficult and inaccurate. This could help improve treatment and support farmers.
Power of Convolutional Neural Networks in Modern AI | The Lifesciences MagazineThe Lifesciences Magazine
Convolutional neural networks (CNNs) stand out as a ground-breaking technique with significant ramifications across multiple areas in the rapidly changing field of artificial intelligence (AI).
DIRECTIONAL CLASSIFICATION OF BRAIN TUMOR IMAGES FROM MRI USING CNN-BASED DEE...IRJET Journal
This document presents research on using a convolutional neural network (CNN) model for the detection and classification of brain tumors from MRI images. The CNN model improves the accuracy of tumor detection and can serve as a useful tool for physicians. The researchers trained and tested several CNN architectures, including CNN, ResNet50, MobileNetV2, and VGG19 on an MRI brain image database. Their proposed model uses a modified Residual U-Net architecture with residual blocks and attention gates to better segment tumors and extract local features from MRI images. Evaluation results found their model achieved better accuracy than existing methods like U-Net and CNN for brain tumor segmentation tasks.
Overview of convolutional neural networks architectures for brain tumor segm...IJECEIAES
Due to the paramount importance of the medical field in people's lives, researchers and experts have exploited advancements in computer techniques to solve many diagnostic and analytical medical problems. Brain tumor diagnosis is one of the most important computational problems that has been studied and focused on. The brain tumor is determined by segmentation of brain images using many techniques based on magnetic resonance imaging (MRI). Brain tumor segmentation methods have been developed over a long period and are still evolving, but the current trend is to use deep convolutional neural networks (CNNs) due to their many breakthroughs, the unprecedented results they have achieved in various applications, and their capacity to learn a hierarchy of progressively complicated characteristics from input without requiring manual feature extraction. Considering these unprecedented results, we present this paper as a brief review of the main CNN architecture types used in brain tumor segmentation. Specifically, we focus on works that used the well-known brain tumor segmentation (BraTS) dataset.
This chapter introduces concepts related to artificial intelligence, machine learning, and deep learning. It specifically discusses convolutional neural networks and their application to building computer-aided diagnosis models for chest x-rays. Key factors for designing chest x-ray CAD models using CNNs are discussed, including network architecture, training types like transfer learning, and data augmentation. The chapter motivates the use of deep learning techniques for medical image analysis and diagnosis to help automate feature learning and improve over traditional CAD systems.
Image Forgery Detection Methods- A ReviewIRJET Journal
This document reviews various methods for detecting image forgery. It begins with an introduction to the topic, explaining the need for image forgery detection techniques due to the widespread manipulation of images online. It then categorizes common types of image manipulation and provides a literature review comparing the accuracy and citations of different detection techniques, such as CNN-based methods, transform-domain methods using DCT and DWT, and methods analyzing JPEG compression artifacts. The review finds that CNN-based methods generally achieve the highest accuracy, around 90-100%, but notes that transform-domain and JPEG-based methods can also achieve reasonably high accuracy, ranging from 70-100% depending on the technique and testing parameters.
Image Captioning Generator using Deep Machine Learningijtsrd
Technology's scope has evolved into one of the most powerful tools for human development in a variety of fields. AI and machine learning have become among the most powerful tools for completing tasks quickly and accurately without the need for human intervention. This project demonstrates how deep machine learning can be used to create a caption or a sentence for a given picture. This can be used for visually impaired persons, as well as in automobiles for self-identification, and in various applications for quick and easy verification. The Convolutional Neural Network (CNN) is used to describe the image, and the Long Short-Term Memory (LSTM) is used to organize the right meaningful sentences in this model. The Flickr8k and Flickr30k datasets were used for training. Sreejith S P | Vijayakumar A "Image Captioning Generator using Deep Machine Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4 , June 2021, URL: https://www.ijtsrd.com/papers/ijtsrd42344.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/42344/image-captioning-generator-using-deep-machine-learning/sreejith-s-p
Lung Cancer Detection using transfer learning.pptx.pdfjagan477830
Lung cancer is one of the deadliest cancers worldwide. However, the early detection of lung cancer significantly improves the survival rate. Cancerous (malignant) and noncancerous (benign) pulmonary nodules are small growths of cells inside the lung. Detection of malignant lung nodules at an early stage is crucial for the prognosis.
The document proposes a method for face recognition using deep learning and data augmentation. It cleans and pre-processes existing face datasets to remove noise and extracts faces. It then uses image processing techniques to add masks to the faces to create a new masked face dataset. An Inception Resnet-v1 model is trained on the new dataset. The method is applied to build a face recognition application for employee timekeeping that achieves high accuracy even when faces are masked.
A Survey of Convolutional Neural Network Architectures for Deep Learning via ...ijtsrd
Convolutional Neural Network (CNN) designs can successfully classify, predict and cluster in many artificial intelligence applications. In the health sector, intensive studies continue for disease classification. When the literature in this field is examined, it is seen that the studies are concentrated on the health sector. Thanks to these studies, doctors can make a more consistent and accurate diagnosis by examining radiological images. In addition, doctors can save time for other patient work by using CNNs. In this study, related current manuscripts in the health sector were examined. The contributions of these publications to the literature were explained and evaluated. Complementary and contradictory arguments of the presented perspectives were revealed. The current status of the studies and the direction in which future studies should evolve are stated, along with how they can make an important contribution to the literature. Suggestions have been made to guide future studies. Ahmet Özcan | Mahmut Ünver | Atilla Ergüzen "A Survey of Convolutional Neural Network Architectures for Deep Learning via Health Images" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-2 , February 2022, URL: https://www.ijtsrd.com/papers/ijtsrd49156.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/49156/a-survey-of-convolutional-neural-network-architectures-for-deep-learning-via-health-images/ahmet-özcan
A SYSTEMATIC STUDY OF DEEP LEARNING ARCHITECTURES FOR ANALYSIS OF GLAUCOMA AN...ijaia
This document provides a review of deep learning architectures that can be used to segment and classify ocular diseases like glaucoma and hypertensive retinopathy from fundus images. It discusses various deep learning models like U-Net, CNNs, FCNs and autoencoders. The review analyzes the performance of these models and their suitability for deployment on edge devices. It also provides an overview of works that have applied deep learning techniques for glaucoma and hypertensive retinopathy classification and segmentation using datasets like ORIGA and MESSIDOR. The review concludes with discussing future directions in applying deep learning models for medical image analysis.
Machine learning based augmented reality for improved learning application th...IJECEIAES
Detection of objects and their location in an image are important elements of current research in computer vision. In May 2020, Meta released its state-of-the-art object-detection model based on a transformer architecture, called detection transformer (DETR). There are several object-detection models such as region-based convolutional neural network (R-CNN), you only look once (YOLO) and single shot detectors (SSD), but none had used a transformer to accomplish this task. The models mentioned earlier use all sorts of hyperparameters and layers. However, the advantages of using a transformer make the architecture simple and easy to implement. In this paper, we determine the name of a chemical experiment in two steps: first, by building a DETR model trained on a customized dataset, and then by integrating it into an augmented reality mobile application. By detecting the objects used during the realization of an experiment, we can predict the name of the experiment using a multi-class classification approach. The combination of various computer vision techniques with augmented reality is indeed promising and offers a better user experience.
IRJET- Comparative Study of Different Techniques for Text as Well as Object D...IRJET Journal
This document discusses and compares different techniques for object and text detection from real-time images, including OCR, RCNN, Mask RCNN, Fast RCNN, and Faster RCNN algorithms. It finds that Mask RCNN, an extension of Faster RCNN, is generally the best algorithm for object detection in real-time images, as it outperforms other models in accuracy for tasks like object detection, segmentation, and captioning challenges. The document provides background on machine learning and neural networks approaches to image recognition and object detection.
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONijaia
Most of the currently known methods treat the person re-identification task as a classification problem and commonly use neural networks. However, these methods use only high-level convolutional features to express the feature representation of pedestrians. Moreover, the current datasets for person re-identification are relatively small. Under the limitation of the number of training samples, deep convolutional networks are difficult to train adequately. Therefore, it is very worthwhile to introduce auxiliary datasets to help training. To solve this problem, this paper proposes a novel method of deep transfer learning, combining the comparison model with the classification model and multi-level fusion of the convolutional features on the basis of transfer learning. In a multi-layer convolutional network, the features of each layer are a dimensionality reduction of the previous layer's results, but the information in multi-level features is not only inclusive but also complementary. We can use the information gap between different layers of a convolutional neural network to extract a better feature expression. Finally, the algorithm proposed in this paper is fully tested on four datasets (VIPeR, CUHK01, GRID and PRID450S). The obtained re-identification results prove the effectiveness of the algorithm.
Image compression and reconstruction using a new approach by artificial neura...Hưng Đặng
This document describes a neural network approach to image compression and reconstruction. It discusses using a backpropagation neural network with three layers (input, hidden, output) to compress an image by representing it with fewer hidden units than input units, then reconstructing the image from the hidden unit values. It also covers preprocessing steps like converting images to YCbCr color space, downsampling chrominance, normalizing pixel values, and segmenting images into blocks for the neural network. The neural network weights are initially randomized and then trained using backpropagation to learn the image compression.
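The three-layer compression scheme described above can be sketched in a few lines of numpy; the block size (8x8), number of hidden units, and learning rate here are illustrative assumptions, not the paper's actual settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 8x8 pixel blocks (64 inputs) squeezed through 16 hidden units.
n_in, n_hidden = 64, 16
W1 = rng.normal(0, 0.1, (n_in, n_hidden))   # encoder weights
W2 = rng.normal(0, 0.1, (n_hidden, n_in))   # decoder weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(blocks, lr=0.5):
    """One backpropagation step on a batch of normalized pixel blocks."""
    global W1, W2
    h = sigmoid(blocks @ W1)        # compressed hidden representation
    out = sigmoid(h @ W2)           # reconstructed block
    err = out - blocks
    d_out = err * out * (1 - out)   # gradient through the output sigmoid
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out / len(blocks)
    W1 -= lr * blocks.T @ d_h / len(blocks)
    return float((err ** 2).mean())

# Toy training data: low-rank "image blocks" with values in [0, 1],
# standing in for YCbCr blocks after the normalization step.
basis = rng.random((4, n_in))
coeff = rng.random((256, 4))
blocks = coeff @ basis
blocks /= blocks.max()

losses = [train_step(blocks) for _ in range(300)]
print(losses[0], "->", losses[-1])  # reconstruction error should fall
```

Compression here is the 64-to-16 bottleneck: storing the 16 hidden activations per block instead of 64 pixels, with the decoder weights shared across all blocks.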
Brain tumor classification in magnetic resonance imaging images using convol...IJECEIAES
Deep learning (DL) is a subfield of artificial intelligence (AI) used in several sectors, such as cybersecurity, finance, marketing, automated vehicles, and medicine. Due to the advancement of computer performance, DL has become very successful. In recent years, it has processed large amounts of data and achieved good results, especially in image analysis such as segmentation and classification. Manual evaluation of tumors, based on medical images, requires expensive human labor and can easily lead to misdiagnosis of tumors. Researchers are interested in using DL algorithms for automatic tumor diagnosis. The convolutional neural network (CNN) is one such algorithm. It is suitable for medical image classification tasks. In this paper, we focus on the development of four sequential CNN models to classify brain tumors in magnetic resonance imaging (MRI) images. We followed two steps: the first being data preprocessing and the second being automatic classification of preprocessed images using CNN. The experiments were conducted on a dataset of 3,000 MRI images, divided into two classes: tumor and normal. We obtained a good accuracy of 98.27%, which outperforms other existing models.
AUTOMATIC FRUIT RECOGNITION BASED ON DCNN FOR COMMERCIAL SOURCE TRACE SYSTEMijcsa
Automatic fruit recognition using machine vision is considered a challenging task due to similarities between various types of fruits and external environmental changes, e.g., lighting. In this paper, a fruit recognition algorithm based on a Deep Convolutional Neural Network (DCNN) is proposed. Most previous techniques have limitations because they were examined and evaluated on limited datasets, and they did not consider external environmental changes. Another major contribution of this paper is that we established a fruit image database of 15 different categories comprising 44,406 images, collected over a period of 6 months under different real-world conditions, keeping in view the limitations of existing datasets. Images were used directly as input to the DCNN for training and recognition without extracting features; instead, the DCNN learns optimal features from images through an adaptation process. The final decision was based on a fusion of all regional classifications using a probability mechanism. Experimental results show that the proposed approach can automatically recognize fruit with a high accuracy of 99% and can effectively meet real-world application requirements.
Similar to Image Segmentation and Classification using Neural Network (20)
Home security is of paramount importance in today's world, where we increasingly rely on technology to make homes safer and easier to control from anywhere, and where security is essential for the occupants' safety. In this paper, we present a low-cost, AI-based home security system. The system has a user-friendly interface, allowing users to start
model training and face detection with simple keyboard commands. Our goal is to introduce an innovative
home security system using facial recognition technology. Unlike traditional systems, this system trains
and saves images of friends and family members. The system scans this folder to recognize familiar faces
and provides real-time monitoring. If an unfamiliar face is detected, it promptly sends an email alert,
ensuring a proactive response to potential security threats.
Cloud Computing, being one of the most recent innovative developments of the IT world, has been
instrumental not just to the success of SMEs but, through their productivity and innovative contribution to
the economy, has even made a remarkable contribution to the economic growth of the United States. To
this end, the study focuses on how cloud computing technology has impacted economic growth through
SMEs in the United States. Relevant literature connected to the variables of interest in this study was
reviewed, and secondary data was generated and utilized in the analysis section of this paper. The findings
of this paper revealed that there have been meaningful contributions that the usage of virtualization has
made in the commercial dealings of small firms in the United States, and this has also been reflected in the
economic growth of the country. This paper further revealed that as important as cloud-based software is,
some SMEs are still skeptical about how it can help improve their business and increase their bottom line
and hence have failed to adopt it. Apart from the SMEs, some notable large firms in different industries,
including information and educational services, have adopted cloud computing technology and hence
contributed to the economic growth of the United States. Lastly, findings from our inferential statistics revealed no discernible difference in innovation between small and big businesses in the adoption of cloud computing. Both categories of businesses adopt cloud computing in the same way, and their contributions to the American economy show no significant difference in the usage of virtualization.
Energy-constrained Wireless Sensor Networks (WSNs) have garnered significant research interest in
recent years. Multiple-Input Multiple-Output (MIMO), or Cooperative MIMO, represents a specialized
application of MIMO technology within WSNs. This approach operates effectively, especially in
challenging and resource-constrained environments. By facilitating collaboration among sensor nodes,
Cooperative MIMO enhances reliability, coverage, and energy efficiency in WSN deployments.
Consequently, MIMO finds application in diverse WSN scenarios, spanning environmental monitoring,
industrial automation, and healthcare applications.
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to the fields of Computer Science and Information Systems. IJCSIT is an open-access, peer-reviewed scientific journal published in both electronic and print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication. IJCSIT publishes original research papers and review papers, as well as auxiliary material such as case studies and technical reports.
Demand for car parking grows with the number of car users. With the increased use of smartphones and their applications, users prefer mobile phone-based solutions. This paper proposes a Smart Parking Management System (SPMS) built on Arduino hardware, an Android application, and IoT. It gives the client the ability to check available parking spaces and reserve a parking spot. IR sensors are utilized to detect whether a parking space is occupied. Occupancy data are transmitted via a Wi-Fi module to the server and retrieved by the mobile application, which offers many options attractively and at no cost to users, and lets the user check reservation details. With IoT technology, the smart parking system can be connected wirelessly to easily track available locations.
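The reservation flow such a system needs server-side can be sketched as below; the class and method names are hypothetical illustrations of the SPMS logic, not the authors' implementation:

```python
# Hypothetical server-side sketch: track IR-sensor occupancy per spot
# and let the mobile app query availability and reserve a spot.
class ParkingLot:
    def __init__(self, n_spots):
        self.state = {i: "free" for i in range(n_spots)}

    def sensor_update(self, spot, occupied):
        """Called when the Arduino posts an IR-sensor reading over Wi-Fi."""
        if self.state[spot] != "reserved":
            self.state[spot] = "occupied" if occupied else "free"

    def available(self):
        """List of spots the mobile app can offer to the user."""
        return [s for s, v in self.state.items() if v == "free"]

    def reserve(self, spot):
        """Reserve a spot if it is still free; report success to the app."""
        if self.state[spot] == "free":
            self.state[spot] = "reserved"
            return True
        return False

lot = ParkingLot(4)
lot.sensor_update(0, True)                 # IR sensor reports spot 0 occupied
print(lot.reserve(1), lot.available())     # → True [2, 3]
```

Keeping "reserved" distinct from "occupied" prevents a sensor reading from silently releasing a spot a user has already booked.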
Welcome to AIRCC's International Journal of Computer Science and Information Technology (IJCSIT), your gateway to the latest advancements in the dynamic fields of Computer Science and Information Systems.
Computer-Assisted Language Learning (CALL) systems are computer-based tutoring systems that deal with linguistic skills. Adding intelligence to such systems is mainly based on using Natural Language Processing (NLP) tools to diagnose student errors, especially in language grammar. However, most such systems do not model student competence in linguistic skills, especially for the Arabic language. In this paper, we deal with basic grammar concepts of the Arabic language taught in the fourth grade of elementary school in Egypt, through the Arabic Grammar Trainer (AGTrainer), an intelligent CALL system. The implemented system trains the students through different
questions that deal with the different concepts and have different difficulty levels. Constraint-based student
modeling (CBSM) technique is used as a short-term student model. CBSM is used to define in small grain
level the different grammar skills through the defined skill structures. The main contribution of this paper
is the hierarchal representation of the system's basic grammar skills as domain knowledge. That
representation is used as a mechanism for efficiently checking constraints to model the student knowledge
and diagnose the student errors and identify their cause. In addition, constraint satisfaction, the number of trials the student takes to answer each question, and a fuzzy-logic decision system are used to determine the student's learning level for each lesson as a long-term model. The results of the evaluation
showed the system's effectiveness in learning in addition to the satisfaction of students and teachers with its
features and abilities.
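The constraint-checking idea behind CBSM can be sketched as follows; the two grammar constraints below are hypothetical stand-ins (the real system encodes Arabic grammar skills in its hierarchical skill structures):

```python
# Minimal sketch of constraint-based student modelling (CBSM): each
# constraint pairs a relevance condition with a satisfaction condition,
# and a relevant-but-violated constraint both flags the error and names
# its cause. These constraints are illustrative, not the system's rules.
CONSTRAINTS = [
    {
        "skill": "subject-verb agreement",
        "relevant": lambda ans: "verb" in ans,
        "satisfied": lambda ans: ans.get("verb_agrees", False),
        "cause": "verb does not agree with its subject",
    },
    {
        "skill": "case marking",
        "relevant": lambda ans: "case" in ans,
        "satisfied": lambda ans: ans.get("case") == "nominative",
        "cause": "wrong case ending for the subject",
    },
]

def diagnose(answer):
    """Return the causes of every relevant but violated constraint."""
    return [c["cause"] for c in CONSTRAINTS
            if c["relevant"](answer) and not c["satisfied"](answer)]

errors = diagnose({"verb": "yaktubu", "verb_agrees": True, "case": "accusative"})
print(errors)  # → ['wrong case ending for the subject']
```

A constraint that is irrelevant to the answer is simply skipped, which is what lets the same constraint base diagnose questions of very different difficulty levels.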
In the realm of computer security, the importance of efficient and reliable user authentication methods has
become increasingly critical. This paper examines the potential of mouse movement dynamics as a
consistent metric for continuous authentication. By analysing user mouse movement patterns in two
contrasting gaming scenarios, "Team Fortress" and "Poly Bridge," we investigate the distinctive
behavioral patterns inherent in high-intensity and low-intensity UI interactions. The study extends beyond
conventional methodologies by employing a range of machine learning models. These models are carefully
selected to assess their effectiveness in capturing and interpreting the subtleties of user behavior as
reflected in their mouse movements. This multifaceted approach allows for a more nuanced and
comprehensive understanding of user interaction patterns. Our findings reveal that mouse movement
dynamics can serve as a reliable indicator for continuous user authentication. The diverse machine
learning models employed in this study demonstrate competent performance in user verification, marking
an improvement over previous methods used in this field. This research contributes to the ongoing efforts to
enhance computer security and highlights the potential of leveraging user behavior, specifically mouse
dynamics, in developing robust authentication systems.
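As an illustration of the kind of input such models consume, the sketch below computes a few common mouse-dynamics features (speed, path length, turning angle) from one movement segment; this feature set is an assumption for illustration, not the paper's actual pipeline:

```python
import numpy as np

def mouse_features(points, timestamps):
    """Summary features of one mouse-movement segment (illustrative):
    path length, mean/peak speed, and mean turning angle."""
    pts = np.asarray(points, dtype=float)
    t = np.asarray(timestamps, dtype=float)
    steps = np.diff(pts, axis=0)                 # displacement per sample
    dt = np.diff(t)
    dist = np.linalg.norm(steps, axis=1)
    speed = dist / dt
    # Turning angle between consecutive movement steps.
    ang = np.arctan2(steps[:, 1], steps[:, 0])
    turn = np.abs(np.diff(ang))
    return {
        "path_length": float(dist.sum()),
        "mean_speed": float(speed.mean()),
        "peak_speed": float(speed.max()),
        "mean_turn": float(turn.mean()),
    }

# A short synthetic trajectory sampled every 10 ms.
pts = [(0, 0), (3, 4), (6, 8), (6, 13)]
ts = [0.00, 0.01, 0.02, 0.03]
f = mouse_features(pts, ts)
print(f["path_length"])  # → 15.0
```

Feature vectors like this, computed per window of activity, are what the machine learning models would classify as "genuine user" or "impostor" in a continuous-authentication loop.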
This research aims to further understanding in the field of continuous authentication using behavioural
biometrics. We are contributing a novel dataset that encompasses the gesture data of 15 users playing
Minecraft with a Samsung Tablet, each for a duration of 15 minutes. Utilizing this dataset, we employed
machine learning (ML) binary classifiers, being Random Forest (RF), K-Nearest Neighbors (KNN), and
Support Vector Classifier (SVC), to determine the authenticity of specific user actions. Our most robust
model was SVC, which achieved an average accuracy of approximately 90%, demonstrating that touch
dynamics can effectively distinguish users. However, further studies are needed to make it a viable option for authentication systems. You can access our dataset at the following link: https://github.com/AuthenTech2023/authentech-repo
This paper discusses the capabilities and limitations of GPT-3, a state-of-the-art language model, in the
context of text understanding. We begin by describing the architecture and training process of GPT-3, and
provide an overview of its impressive performance across a wide range of natural language processing
tasks, such as language translation, question-answering, and text completion. Throughout this research
project, a summarizing tool was also created to help us retrieve content from any types of document,
specifically IELTS Reading Test data in this project. We also aimed to improve the accuracy of the summarizing, as well as the question-answering capabilities of GPT-3, via long text
Neural networks, specifically Convolutional Neural Networks (CNNs), have proven highly effective for image segmentation and classification tasks in computer vision. These tasks have numerous practical applications, such as in medical imaging, autonomous driving, and surveillance. CNNs are capable of learning complex features directly from images and achieving outstanding performance across several datasets. In this work, we have utilized three different datasets to investigate the efficacy of various pre-processing and classification techniques in accurately segmenting and classifying different structures within MRI and natural images. We have utilized both sample gradient and Canny edge detection methods for pre-processing, and K-means clustering has been applied to segment the images. Image augmentation increases the size and diversity of datasets used to train image classification models. This work highlights transfer learning's effectiveness in image classification using CNNs and VGG16, and provides insights into the selection of pre-trained models and hyperparameters for optimal performance. We propose a comprehensive approach to image segmentation and classification that incorporates pre-processing techniques, the K-means algorithm for segmentation, and deep learning models such as CNN and VGG16 for classification.
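As a minimal sketch of the segmentation step described above, the following applies K-means clustering to the pixel intensities of a toy grayscale image (pure numpy; a real pipeline would apply the gradient or Canny pre-processing first, and the image here is synthetic, not MRI data):

```python
import numpy as np

def kmeans_segment(image, k=3, iters=20, seed=0):
    """Segment a grayscale image by K-means clustering of pixel intensities."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 1).astype(float)
    # Initialize centres from randomly chosen pixel values.
    centers = rng.choice(pixels.ravel(), size=k, replace=False).reshape(k, 1)
    for _ in range(iters):
        # Assign each pixel to its nearest cluster centre.
        labels = np.abs(pixels - centers.T).argmin(axis=1)
        # Move each centre to the mean of its assigned pixels.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean()
    return labels.reshape(image.shape)

# Toy "scan": three intensity bands plus noise.
rng = np.random.default_rng(1)
img = np.concatenate([
    rng.normal(0.1, 0.02, (10, 30)),
    rng.normal(0.5, 0.02, (10, 30)),
    rng.normal(0.9, 0.02, (10, 30)),
])
seg = kmeans_segment(img, k=3)
print(sorted(np.unique(seg)))  # cluster labels present in the segmentation
```

Clustering on raw intensity is the simplest variant; adding edge maps or spatial coordinates as extra feature columns is a common refinement.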
- The document presents 6 different models for defining foot size in Tunisia: 2 statistical models, 2 neural network models using unsupervised learning, and 2 models combining neural networks and fuzzy logic.
- The statistical models (SM and SHM) are based on applying statistical equations to morphological foot data.
- The neural network models (MSK and MHSK) use self-organizing Kohonen maps to cluster foot data and model full and half sizes.
- The fuzzy neural network models (MSFK and MHSFK) incorporate fuzzy logic into the neural network learning process to better account for uncertainty in foot sizes.
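The Kohonen-map idea behind the MSK/MHSK models can be sketched on one-dimensional data; the node count, learning schedule, and toy foot lengths below are illustrative assumptions (the cited models cluster multi-dimensional morphological foot data the same way):

```python
import numpy as np

def train_som(data, n_nodes=6, epochs=40, lr0=0.5, radius0=2.0, seed=0):
    """Train a 1-D self-organizing (Kohonen) map on scalar measurements."""
    rng = np.random.default_rng(seed)
    nodes = rng.uniform(data.min(), data.max(), n_nodes)
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                      # decaying learning rate
        radius = max(radius0 * (1 - t / epochs), 0.5)    # shrinking neighbourhood
        for x in rng.permutation(data):
            winner = int(np.abs(nodes - x).argmin())
            # Pull the winner and its grid neighbours toward the sample.
            for i in range(n_nodes):
                h = np.exp(-((i - winner) ** 2) / (2 * radius ** 2))
                nodes[i] += lr * h * (x - nodes[i])
    return np.sort(nodes)

# Toy data: foot lengths (cm) drawn around a few size groups.
rng = np.random.default_rng(2)
lengths = np.concatenate([rng.normal(m, 0.3, 40) for m in (22, 24, 26, 28)])
sizes = train_som(lengths)
print(sizes)  # ordered prototype lengths, one per "size" node
```

Each trained node acts as a size prototype; a new foot measurement is assigned to the size whose node is nearest, which is the clustering role the map plays in the models above.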
The security of Electric Vehicle (EV) charging has gained momentum after the increase in the EV adoption
in the past few years. Mobile applications have been integrated into EV charging systems that mainly use a
cloud-based platform to host their services and data. Like many complex systems, cloud systems are
susceptible to cyberattacks if proper measures are not taken by the organization to secure them. In this
paper, we explore the security of key components in the EV charging infrastructure, including the mobile
application and its cloud service. We conducted an experiment that initiated a Man in the Middle attack
between an EV app and its cloud services. Our results showed that it is possible to launch attacks against
the connected infrastructure by taking advantage of vulnerabilities that may have substantial economic and
operational ramifications on the EV charging ecosystem. We conclude by providing mitigation suggestions
and future research directions.
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to fields of Computer Science and Information Systems. The IJCSIT is a open access peer-reviewed scientific journal published in electronic form as well as print form. The mission of this journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication.
This paper describes the outcome of an attempt to implement the same transitive closure (TC) algorithm
for Apache MapReduce running on different Apache Hadoop distributions. Apache MapReduce is a
software framework used with Apache Hadoop, which has become the de facto standard platform for
processing and storing large amounts of data in a distributed computing environment. The research
presented here focuses on the variations observed among the results of an efficient iterative transitive
closure algorithm when run against different distributed environments. The results from these comparisons
were validated against the benchmark results from OYSTER, an open source Entity Resolution system. The
experiment results highlighted the inconsistencies that can occur when using the same codebase with
different implementations of MapReduce.
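As a concrete illustration, the fixpoint iteration at the heart of such an algorithm can be sketched in plain Python. This is a single-machine, semi-naive sketch under my own assumptions, not the paper's actual implementation: the MapReduce version distributes the join and deduplication steps across mappers and reducers, while here they are ordinary set operations.

```python
def transitive_closure(edges):
    """Compute the transitive closure of a directed edge set (set of pairs)."""
    closure = set(edges)
    # Semi-naive evaluation: only newly discovered pairs are joined against the
    # edge set each round, mirroring the per-iteration join/deduplicate cycle
    # that a MapReduce implementation would run as a sequence of jobs.
    frontier = set(edges)
    while frontier:
        new_pairs = {(a, d)
                     for (a, b) in frontier
                     for (c, d) in edges
                     if b == c and (a, d) not in closure}
        closure |= new_pairs
        frontier = new_pairs
    return closure
```

For example, `transitive_closure({(1, 2), (2, 3), (3, 4)})` adds the pairs `(1, 3)`, `(2, 4)`, and `(1, 4)` to the original three edges.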
Use PyCharm for remote debugging of WSL on a Windows machine (shadow0702a)
This document is a step-by-step guide to using PyCharm for remote debugging of the Windows Subsystem for Linux (WSL) on a local Windows machine. It begins with enabling the required permissions, followed by the installation and configuration of WSL.
The guide then explains how to set up the SSH service within the WSL environment and how to modify the inbound rules of the Windows firewall so that connectivity issues do not hinder the debugging process.
It next emphasizes verifying the connection between the Windows and WSL environments to confirm that they are ready for remote debugging.
It also shows how to configure the WSL interpreter and project files within PyCharm, which is essential for setting up debugging correctly and running the program in the WSL terminal.
Additionally, the document covers setting breakpoints, a fundamental part of debugging that lets the developer pause execution at chosen points and inspect the program's state.
Finally, it links to a reference blog with further guidance on configuring the remote Python interpreter in PyCharm.
Advanced control scheme of doubly fed induction generator for wind turbine... (IJECEIAES)
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. First, a doubly fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC), and second order sliding mode controller (SOSMC). Their results are compared in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Introduction: e-waste definition; sources of e-waste; hazardous substances in e-waste; effects of e-waste on environment and human health; need for e-waste management; e-waste handling rules; waste minimization techniques for managing e-waste; recycling of e-waste; disposal treatment methods of e-waste; mechanism of extraction of precious metal from leaching solution; global scenario of e-waste; e-waste in India; case studies.
Null Bangalore | Pentester's Approach to AWS IAM (Divyanshu)
# Abstract:
- Learn real-world methods for auditing AWS IAM (Identity and Access Management) as a pentester. We start with a brief discussion of IAM, then cover some typical misconfigurations and their potential exploits in order to reinforce IAM security best practices.
- Gain actionable insights into AWS IAM policies and roles, using a hands-on approach.
# Prerequisites:
- Basic understanding of AWS services and architecture
- Familiarity with cloud security concepts
- Experience using the AWS Management Console or AWS CLI.
- For the hands-on lab, create an account on [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
# Scenarios Covered:
- Basics of IAM in AWS
- Implementing IAM Policies with Least Privilege to Manage S3 Bucket
- Objective: Create an S3 bucket with least privilege IAM policy and validate access.
- Steps:
- Create S3 bucket.
- Attach least privilege policy to IAM user.
- Validate access.
- Exploiting IAM PassRole Misconfiguration
- Allows a user to pass a specific IAM role to an AWS service (e.g., EC2), typically used for service access delegation. The PassRole misconfiguration is then exploited to gain unauthorized access to sensitive resources.
- Objective: Demonstrate how a PassRole misconfiguration can grant unauthorized access.
- Steps:
- Allow user to pass IAM role to EC2.
- Exploit misconfiguration for unauthorized access.
- Access sensitive resources.
- Exploiting IAM AssumeRole Misconfiguration with Overly Permissive Role
- An overly permissive IAM role configuration can lead to privilege escalation: a role with administrative privileges is created and a user is allowed to assume it.
- Objective: Show how overly permissive IAM roles can lead to privilege escalation.
- Steps:
- Create role with administrative privileges.
- Allow user to assume the role.
- Perform administrative actions.
- Differentiation between PassRole and AssumeRole
Try at [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
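As a sketch of the first scenario, a least-privilege policy restricting an IAM user to object reads and writes in a single bucket might look like the following. The bucket name `pentest-demo-bucket` is a placeholder of my own, not one used in the lab.

```python
import json

# A least-privilege IAM policy sketch: the user may only get and put objects
# in one specific bucket, and nothing else. The bucket name below is a
# hypothetical placeholder.
LEAST_PRIVILEGE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ObjectAccessOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::pentest-demo-bucket/*",
        }
    ],
}

# Serialize to the JSON document that would be attached to the IAM user.
policy_json = json.dumps(LEAST_PRIVILEGE_POLICY, indent=2)
```

In the lab, this document would be attached to the IAM user and access validated by confirming that any action outside these two (for example `s3:DeleteObject` or access to other buckets) is denied.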
An improved modulation technique suitable for a three-level flying capacitor... (IJECEIAES)
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed
simplified modulation technique paves the way for more straightforward and
efficient control of multilevel inverters, enabling their widespread adoption and
integration into modern power electronic systems. Through the amalgamation of
sinusoidal pulse width modulation (SPWM) with a high-frequency square wave
pulse, this controlling technique attains energy equilibrium across the coupling
capacitor. The modulation scheme incorporates a simplified switching pattern
and a decreased count of voltage references, thereby simplifying the control
algorithm.
Batteries: Introduction; types of batteries; discharging and charging of battery; characteristics of battery; battery rating; various tests on battery; primary battery: silver button cell; secondary battery: Ni-Cd battery; modern battery: lithium-ion battery; maintenance of batteries; choice of batteries for electric vehicle applications.
Fuel Cells: Introduction; importance and classification of fuel cells; description, principle, components, and applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell, and direct methanol fuel cells.
Image Segmentation and Classification using Neural Network
Fatema Tuj Zohra, Rifa Tasfia Ratri, Shaheena Sultana, and Humayara Binte Rashid
Department of Computer Science and Engineering
Notre Dame University Bangladesh
Abstract. Image segmentation and classification tasks in computer vision have proven to be highly effec-
tive using neural networks, specifically Convolutional Neural Networks (CNNs). These tasks have numerous
practical applications, such as in medical imaging, autonomous driving, and surveillance. CNNs are capable
of learning complex features directly from images and achieving outstanding performance across several
datasets. In this work, we have utilized three different datasets to investigate the efficacy of various
pre-processing and classification techniques in accurately segmenting and classifying different structures
within MRI and natural images. We have utilized both sample gradient and Canny Edge Detection
methods for pre-processing, and K-means clustering has been applied to segment the images. Image
augmentation improves the size and diversity of datasets for training the models for image classification.
This work highlights transfer learning’s effectiveness in image classification using CNNs and VGG 16 that
provides insights into the selection of pre-trained models and hyperparameters for optimal performance.
We have proposed a comprehensive approach for image segmentation and classification, incorporating pre-
processing techniques, the K-means algorithm for segmentation, and employing deep learning models such
as CNN and VGG 16 for classification.
Keywords: Convolutional Neural Network, VGG 16, Image Segmentation, K-means, Image Classification.
1 INTRODUCTION
In the world of artificial intelligence (AI) and computer science, computer vision is a
branch that focuses on giving computers the ability to interpret, process, and comprehend
visual data from the outside environment. It involves developing algorithms and tech-
niques for processing and analyzing images and videos, and extracting meaningful insights
and information from them. A Convolutional Neural Network (CNN) is a type of deep
neural network designed for processing and evaluating data that has a grid-like structure,
such as images or videos. In computer vision, image identification is a
key task and CNNs have emerged as the most advanced technique for it. Convolutional
layers are used to extract information from images and fully connected layers are used to
produce predictions in CNNs [30]. CNN is a sort of neural network developed primarily
for image recognition tasks. Deep learning is another approach in machine learning that
focuses on training neural networks. It carries out tasks that require human-like
intelligence, such as speech recognition, object recognition in pictures,
and language translation. Recent developments in deep learning have helped the medical
imaging industry detect many diseases [4]. The main purpose of medical image
classification is to identify, with high accuracy, the parts of the human body that are harmful
to health [5]. In the pre-processing stage, MRI images are preprocessed to remove noise
and enhance contrast using a combination of histogram equalization and median filtering
techniques.
In the segmentation stage, the K-means clustering algorithm divides the images into
homogeneous segments based on color, texture, or intensity [29]. K-means is an unsupervised
learning algorithm that partitions data into K clusters, initially selecting random points
International Journal of Computer Science & Information Technology (IJCSIT) Vol 16, No 1, February 2024
DOI: 10.5121/ijcsit.2024.16102
as cluster centers and iteratively refining them. The algorithm assigns data points to the
nearest cluster center and updates centers based on the mean of points in each cluster.
The process continues until cluster centers stabilize or a specified iteration limit is reached.
Subtractive clustering is then employed to enhance segmentation by eliminating noise and
merging similar clusters, demonstrating the effectiveness through experimental results. In
the classification stage, pre-processed images are fed into a CNN model that is based
on the VGG 16 architecture [6]. Nath et al. provided a comprehensive overview of various
image classification methods, including traditional techniques and deep learning-based ap-
proaches. They discussed the benefits and drawbacks of various approaches and how they
were applied in various fields [7].
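The K-means procedure described above (random initialization, nearest-center assignment, mean update, stopping when centers stabilize) can be sketched in a few lines of NumPy. This is a generic illustration of the algorithm, not the authors' implementation:

```python
import numpy as np

def kmeans(points, k, iters=100, seed=0):
    """Partition `points` (an n x d array) into k clusters by iterative refinement."""
    rng = np.random.default_rng(seed)
    # Initially select random points as cluster centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each data point to the nearest cluster center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update each center to the mean of the points assigned to it.
        new_centers = np.array([points[labels == j].mean(axis=0)
                                for j in range(k)])
        # Stop when the cluster centers stabilize (or the iteration limit hits).
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

For image segmentation, the same function can be applied to pixel values (e.g., intensities or colors) so that each cluster corresponds to one segment.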
The motivation behind employing neural networks, particularly CNNs, for image segmen-
tation and classification lies in their demonstrated effectiveness in various computer vision
tasks. Neural networks offer improved accuracy, faster processing speeds, adaptability to
diverse data types, and the potential for novel applications. By leveraging the structure
and operation inspired by the human brain, these models can autonomously process and
analyze images, reducing the reliance on manual intervention.
However, the adoption of neural networks in image processing poses challenges. Training
large-scale deep neural networks demands substantial computational resources, and the
scalability of such models remains a significant challenge. Additionally, ensuring
generalizability across diverse datasets and real-world scenarios requires addressing issues related to
overfitting and model robustness. Furthermore, interpretability and explainability of neu-
ral network decisions can be challenging, especially in critical applications where human
understanding is crucial. Balancing the trade-off between model complexity and compu-
tational efficiency is an ongoing challenge in the deployment of neural networks for image
segmentation and classification. Despite challenges related to interpretability and
generalizability, the use of artificial intelligence-based visual systems, as demonstrated in fruit
classification through cameras and algorithms, showcases the potential for autonomous
image analysis. Continued research and innovation in neural network techniques are essential
to overcome challenges and unlock the full potential of these models in revolutionizing im-
age processing and analysis.
We have structured the rest of the paper as follows: Section 2 reviews the related works
in this field, highlighting the gaps in current knowledge and explaining how this work
addresses those gaps. Section 3 describes the models used for classification. Section 4 de-
scribes the methodology used to conduct the work. Section 5 presents the result of this
work, including performance analysis. Section 6 summarizes the findings of our work,
concludes, and suggests avenues for future research.
2 LITERATURE REVIEW
This section reviews the related works in this field, highlights the gaps in current knowledge,
and explains how this work addresses those gaps. Image segmentation and classification
is a central task in computer vision, which includes object recognition, medical
image analysis, autonomous driving, and more. Deep learning methods
such as CNN, VGG 16, and k-means clustering have been popular in recent years
for these applications. We have organized our literature review into two distinct sections,
each addressing a specific aspect of our work: Convolutional Neural Networks (CNNs) and
Transfer Learning.
In the section dedicated to Convolutional Neural Networks, we have extensively reviewed
prior research endeavors related to CNN models. Specifically, we have delved into the
historical body of work surrounding CNNs, analyzing their development and various ap-
plications.
In the context of Transfer Learning, our focus has been on VGG 16, a prominent
model in this domain. Within this section, we have incorporated relevant studies and
findings concerning VGG 16’s usage and its adaptations in the realm of transfer learning.
This approach enables us to comprehensively explore the landscape of previous research,
encompassing both the broader CNN field and the specific contributions of VGG 16 in
transfer learning applications.
2.1 Convolutional Neural Network (CNN)
CNNs have an architecture that helps them understand images step by step. They do this
by using three kinds of layers (convolutional layers, pooling layers, and fully connected layers)
that scan the image, group information, and then make sense of it. AlexNet, VGGNet,
GoogLeNet, ResNet, and DenseNet are just a few of the CNN designs that Sultana et al.
offered an overview of, along with information on how well they performed on well-known
image classification benchmarks like ImageNet [8]. A CNN model with two fully
connected layers and four convolutional layers was produced by Khan et al. To avoid
overfitting, they employed dropout and the Rectified Linear Unit (ReLU) activation function,
with a dropout rate of 0.5 [9]. A well-known benchmark dataset for computer vision is the
MNIST dataset. To attain high accuracy, Chattopadhyay et al. suggested a CNN architec-
ture that combines convolutional, max-pooling, dropout, and fully connected layers with
optimal hyperparameters [10]. Kumar et al. proposed a CNN architecture whose
convolutional layers have various filter sizes, with pooling layers for down-sampling the
feature maps. In addition, the authors employ batch normalization and dropout methods
to reduce overfitting and increase the network's generalization capabilities [11]. Kaushik
et al. proposed an approach that uses a CNN to learn the features of the image
and predict the segmentation map. Gomez et al. presented an approach that is evaluated
using a dataset of thermal images obtained from breast cancer patients and healthy sub-
jects. According to the results, the suggested method works well in categorizing thermal
pictures into normal and malignant breast tissues, and it has the potential to be employed
as a non-invasive tool for early breast cancer testing [12]. Tripathi analyzed the effect of
different factors, such as network architecture, data augmentation, and hyperparameters,
on classification accuracy. They concluded that deeper networks with appropriate regular-
ization techniques and data augmentation can significantly improve classification accuracy
[13].
2.2 Transfer Learning
Transfer learning is a strong approach that allows pre-trained models to be utilized for new
image categorization problems. It has achieved cutting-edge performance on many
benchmarks. Transfer learning is applied to a range of applications, including medical
image classification. To identify brain tumors in MRI images, Siddique et al. proposed a
CNN model that has a high level of accuracy. The proposed model is used as a diagnostic
tool in clinical settings to aid radiologists in the detection of brain tumors [14]. Agarwal
proposed a deep-learning approach for classifying cooking images into different states using
the VGG 19 network [15]. D.C. Febrianto et al. suggested a method for training a CNN
using a dataset of brain magnetic resonance imaging (MRI) images that contain both
normal and tumor scans [16]. Hoque provided valuable insights into the application of deep
learning models for medical image analysis, specifically for brain tumor detection, and
highlighted the potential benefits of using CNNs for this task. The comparative analysis
of VGG 16 and VGG 19 models provides a useful benchmark for future studies in this area
[17]. Abd-Ellah et al. compared the performance of the VGG 16 and VGG 19 networks on a
dataset of 500 MRI scans, consisting of 250 normal scans and 250 scans with tumors. They
also compared the performance of the VGG networks with a conventional CNN approach
[18]. Pravallika and Baskar suggested a brain tumor classification method based on image
processing that employs the VGG 16 CNN and support vector machine (SVM)
classifiers. They analyzed the proposed system on a dataset of 210 brain MRI images, and the
results show that the VGG 16 network outperforms the SVM classifier with an accuracy
of 95.2% compared to 89.5% [19]. Agus et al. proposed a system that involves training the
VGG 16 model on the MRI images to extract features followed by a classification layer
using softmax regression for classifying the image into one of the two categories, glioma
or non-glioma [20]. Simonyan et al. introduced the VGGNet architecture, which achieved
outstanding performance on the ImageNet dataset. They also demonstrated the efficiency
of transfer learning for image classification by adjusting the previously trained VGGNet
on a smaller dataset [34]. Long et al. presented a deep adaptation network (DAN) that can
learn domain-agnostic characteristics. They used various photos to highlight the usefulness
of DAN.
3 MODELS USED IN IMAGE CLASSIFICATION
In this section, we have discussed the models that have been used for classification. We
have used CNN and VGG 16 to classify our images.
3.1 Convolutional Neural Network
A convolutional neural network has three kinds of layers: an input layer, hidden layers,
and an output layer. Data is received by the input layer and then sent to the hidden layers.
The hidden layers extract features from the data, and each layer has multiple nodes or
neurons that perform the calculations. The output layer generates the outcome or prediction
based on the features gathered by the hidden layers. During training, the weights of the
nodes are adjusted to minimize the error between the predicted output and the actual
output. This process is repeated iteratively until the network achieves the desired accuracy.
CNNs, in particular, are specialized neural networks that have been designed to process images and other
types of multidimensional data [30]. Convolutional Neural Networks are often employed in
image categorization, object identification, and other computer vision applications. CNNs
are built to automatically learn and extract features from pictures using convolution and
pooling. Convolution is performed by applying a tiny filter or kernel over an input picture
and computing the dot product of the filter with each patch of pixels; the output is then
routed through an activation function like ReLU to introduce nonlinearity. The feature maps are
then down-sampled and the output dimensionality is reduced via pooling. Figure 1 shows
the architecture of a convolutional neural network [32]. Here are the key details of CNN:
1. Convolutional Layer: To extract characteristics from the input image, this layer uses
several filters (kernels). The filters convolve over the image and produce a feature
map. Convolutional layers can learn low-level features like edges, lines, and curves. The
formula for the 2D convolution operation in a convolutional layer can be expressed as
follows:
(I * K)(i, j) = Σm Σn I(i + m, j + n) K(m, n)
where I is the input image and K is the kernel.
2. ReLU Activation Layer: ReLU stands for Rectified Linear Unit. This activation
function is applied to the convolutional layer's output. This introduces non-linearity
into the model and permits it to pick up on more intricate
aspects. The ReLU function is defined as:
f(x) = max(0, x) (1)
3. Pooling Layer: This layer is used to reduce the size of the feature maps. It
takes the maximum or average value of each patch of pixels. This decreases
the size of the feature maps, which helps to reduce overfitting.
4. Fully Connected Layer: It takes the output of the previous layer and runs it through
a group of neurons that are fully connected to every neuron in the preceding layer. This
layer is used to classify the supplied image.
5. Softmax Layer: For generating a probability distribution over the classes, it applies
a softmax function to the output of the fully connected layer. The softmax
function is as follows:
softmax(xi) = e^(xi) / Σj e^(xj) (2)
6. Loss Function: A loss function computes the difference between a model's predicted
output and the actual result (i.e., the ground truth) for a certain
input. It is a crucial component of building a machine-learning model because it
directs the optimization process toward the best possible collection of model parameters.
CNNs can be trained using a variety of optimization algorithms, such as SGD
(stochastic gradient descent) or Adam. They can also be trained on large datasets,
such as ImageNet, and fine-tuned on smaller datasets for specific tasks. The most
advanced performance on a wide range of computer vision tasks is possible with CNNs
when trained via backpropagation and gradient descent. Numerous more uses include
style transfer, segmentation, object identification, and image classification.
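The layer operations listed above can be illustrated with minimal NumPy versions. This is a generic sketch, not the paper's model; note that, as in most CNN frameworks, the "convolution" below does not flip the kernel (it is cross-correlation):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution: dot product of the kernel with each image patch."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """f(x) = max(0, x), as in Eq. (1)."""
    return np.maximum(0, x)

def max_pool(x, size=2):
    """Keep the maximum of each non-overlapping size x size patch."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def softmax(x):
    """Probability distribution over classes, as in Eq. (2)."""
    e = np.exp(x - x.max())   # shift by the max for numerical stability
    return e / e.sum()
```

For example, a 6 x 6 image convolved with a 3 x 3 kernel yields a 4 x 4 feature map, which a 2 x 2 max pool reduces to 2 x 2.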
Fig. 1: CNN Architecture.
3.2 Visual Geometry Group
VGG (Visual Geometry Group) is a group of researchers from Oxford University that
specializes in computer vision and deep learning. They are well-known for their contribu-
tions to the area of convolutional neural networks that excel in image recognition tasks.
In 2014, Simonyan et al. suggested a new architecture for CNNs that delivered
outstanding outcomes on several image recognition benchmarks [34]. This architecture is known
as VGGNet. This consists of a series of convolutional layers with small 3x3 filters. These
layers are followed by max-pooling layers and end with several fully connected layers. The
VGGNet is known for its simplicity and is still widely used today as a baseline for image
recognition tasks. VGG 16 is a deep neural network with 16 layers. Figure 2 shows how the
VGG 16 model works layer by layer [33]. Here are the details of the VGG 16 architecture:
1. Input Layer: An RGB image with the dimensions 224 x 224 x 3 is used as the
input layer.
2. Convolutional Layers: There are 13 convolutional layers in VGG 16. Each
convolutional layer has a 3 x 3 kernel and uses a stride of 1 pixel. The number of
filters increases with depth, starting with 64 filters in the first
layer and doubling after each max-pooling layer.
3. Max Pooling Layers: There are 5 max-pooling layers in VGG 16. Each max-pooling
layer has a 2 x 2 kernel and a stride of 2 pixels.
4. Fully Connected Layers: VGG 16 has 3 fully connected layers. The first two have
4,096 neurons each, and the third has 1,000 neurons, corresponding to the number of
classes in the ImageNet dataset.
5. Softmax Layer: For obtaining the final probability distribution across the 1,000
classes, a softmax function is applied to the output of the last fully connected
layer.
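The layer counts above can be cross-checked against the standard VGG 16 configuration, sketched here in plain Python ("M" marks a 2 x 2 max-pooling layer; the filter counts are those of the original ImageNet model):

```python
# Standard VGG 16 configuration: numbers are convolutional-layer filter counts
# (3 x 3 kernels, stride 1); "M" marks a 2 x 2 max-pooling layer. Filter counts
# start at 64 and double after each pooling stage, capping at 512.
VGG16_CFG = [64, 64, "M",
             128, 128, "M",
             256, 256, 256, "M",
             512, 512, 512, "M",
             512, 512, 512, "M"]

conv_layers = [c for c in VGG16_CFG if c != "M"]
pool_layers = [c for c in VGG16_CFG if c == "M"]
fc_layers = [4096, 4096, 1000]   # the three fully connected layers

# The "16" in VGG 16 counts weight layers: 13 convolutional + 3 fully connected.
total_weight_layers = len(conv_layers) + len(fc_layers)
```

Running this confirms 13 convolutional layers, 5 max-pooling layers, and 16 weight layers in total.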
Fig. 2: VGG 16 Architecture.
4 METHODOLOGY
This section describes the methodology used to conduct the work. In our work, we have
followed a procedure that contains all of the methods or tools that we have used to analyze
image segmentation and classification in our work. Figure 3 shows the procedure of our
work. First of all, we have collected a dataset. We have worked on three datasets. All of
them have been collected from Kaggle [1].
Fig. 3: Workflow of the work
4.1 Data Collection
In our work, we have chosen three datasets shown in Table 1 from Kaggle based on two
classes of training and testing [1]. The first and second datasets are about brain tumor
images [21] and [22]. Dataset 1 and Dataset 2 contain around 3274 images and 7023 images
in total. They have 4 classes including glioma tumors, no tumors, meningioma tumors,
and pituitary tumors. The third dataset is about natural images. This dataset contains a
collection of 7 categories of natural images with a total of 8,789 images [1]. In our work,
we selected four categories: dog, cat, fruit, and flower.
Dataset 1: Train 2880, Test 394
Dataset 2: Train 5712, Test 1311
Dataset 3: Train 2642, Test 788
Table 1: Splitting Category of all Datasets
4.2 Dataset Preprocessing
Before analyzing the dataset, it is necessary to preprocess the dataset for better results.
The goal of preprocessing is to prepare the data for analysis and to increase the model’s
accuracy by reducing noise and removing inconsistencies. We have used Canny Edge De-
tection and Sample Gradient as preprocessing steps.
Canny Edge Detection: This is a widely used algorithm for detecting edges in
images. It works by identifying areas in the image with a significant change in intensity or
color and marking them as edges. The algorithm involves several steps, including smoothing
the image to remove noise, calculating the gradient of the image to determine the
edges, and non-maximum suppression to thin the edges [31]. A final thresholding step
produces a binary picture, with the background represented by black
pixels and the edges by white pixels. Figure 4 shows how Canny edge detection works
on four types of MRI images and natural images.
(a) MRI Images (b) Natural Images
Fig. 4: Canny Edge Detection
Sample Gradient: In image processing, the gradient is a measure of the rate at which
the pixel intensity changes in an image. In our work, we have used the Sobel gradient
for image preprocessing. The Sobel gradient is a simple and widely used method for edge
detection in image processing. S. B. Kulkarni and S. G. Bhirud described the Sobel gradient
as a simple and effective method for edge detection. They explain that the Sobel gradient
works by calculating the gradient magnitude of an image by convolving it with two filters,
one for the horizontal edges and another for the vertical edges [23]. Mustafa et al. explained
that the Sobel gradient is a popular edge detection method due to its simplicity and
effectiveness [24]. Wang developed Laplacian-operator-based edge detectors that utilize
the second-order derivative of Gaussian filters to extract the edges
from an image [30].
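The two-filter scheme described by Kulkarni and Bhirud can be sketched in NumPy as follows; this is a generic illustration of the Sobel gradient, not their implementation:

```python
import numpy as np

# Sobel filters: one responds to horizontal intensity changes (vertical edges),
# the other to vertical changes (horizontal edges).
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(image):
    """Gradient magnitude of a 2D grayscale image (valid region only)."""
    h, w = image.shape[0] - 2, image.shape[1] - 2
    gx = np.empty((h, w))
    gy = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            patch = image[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * SOBEL_X)   # horizontal-edge response
            gy[i, j] = np.sum(patch * SOBEL_Y)   # vertical-edge response
    # Combine the two responses into a single gradient magnitude.
    return np.sqrt(gx ** 2 + gy ** 2)
```

On a flat region the magnitude is zero, while a step edge between dark and bright areas produces a strong response along the edge.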
4.3 Image Segmentation
The technique of segmenting an image involves dividing it into different sections or regions
that are uniform in terms of properties such as color, texture, or intensity. Minaee et al. explain
the importance of image segmentation in different uses of computer vision including object
identification, recognition, and classification [25]. Fully convolutional networks (FCNs) are a kind of deep neural network
that can segment images pixel by pixel. To learn how to divide an image into several areas
according to their semantic significance, Long et al. suggested an approach that involves
training an FCN on a sizable dataset of annotated images [26]. One way to perform image
segmentation is using the K-means. Data are divided into K clusters based on similarity
using the unsupervised learning method K-means. Burney et al. presented a clear and
concise overview of the application of K-means clustering for segmentation, highlighting
the advantages of this approach and demonstrating its effectiveness through experimental
results [27].
4.4 Data Augmentation
Augmentation is used to boost the quantity and variety of training data without actually
gathering any new data. By applying transformations such as rotation, cropping, or added
noise to existing data, augmentation creates new samples that are still representative of
the original data, as shown in figure 5 for both MRI and natural images. The goal of data
augmentation is to increase the diversity and volume of training data, which can improve
the generalization performance of deep learning models [28].
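A minimal NumPy sketch of the transformations listed above (the image here is a random stand-in, not data from the paper's datasets):

```python
import numpy as np

def augment(img, rng):
    """Generate simple augmented variants of an image array (H, W, C)."""
    return [
        np.rot90(img),                                       # 90-degree rotation
        img[:, ::-1],                                        # horizontal flip
        np.clip(img + rng.normal(0, 0.05, img.shape), 0, 1), # Gaussian noise
        img[2:-2, 2:-2],                                     # center crop
    ]

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))       # stand-in for an MRI or natural image
variants = augment(image, rng)        # four new training samples from one image
```

Each variant preserves the semantic content of the original image while changing its pixel values, which is what makes the new samples useful for training.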
International Journal of Computer Science & Information Technology (IJCSIT) Vol 16, No 1, February 2024
Fig. 5: Data Augmentation: (a) MRI Images, (b) Natural Images
4.5 Our CNN Model Structure
In CNN models, various layers work together in a structured manner. Convolutional layers
extract image features, pooling layers decrease spatial resolution for efficiency, and fully
connected layers handle the final classification or regression task, depending on the problem
at hand. This is a sequential neural network model with the following layers:
1. Conv2D layer with 32 filters, kernel size of (3,3), ReLU activation function, and input
shape of (IMAGE SIZE, IMAGE SIZE, 3). This layer applies 32 different filters to the
input image, each of size 3x3, and applies the ReLU activation function to the output.
2. MaxPooling2D layer with pool size of (2,2). This layer reduces the size of the input
image by taking the maximum value within each 2x2 window.
3. Conv2D layer with 64 filters, kernel size of (3,3), and ReLU activation function. This
layer applies 64 different filters to the output of the previous layer, each of size 3x3,
and applies the ReLU activation function to the output.
4. MaxPooling2D layer with pool size of (2,2). This layer reduces the size of the input
image by taking the maximum value within each 2x2 window.
5. Conv2D layer with 32 filters, kernel size of (3,3), and ReLU activation function. This
layer applies 32 different filters to the output of the previous layer, each of size 3x3,
and applies the ReLU activation function to the output.
6. Flatten layer. This layer flattens the output of the previous layer into a 1D array.
7. Dense layer with 16 neurons and ReLU activation function. This layer applies a fully
connected layer to the output of the previous layer, with 16 neurons and ReLU acti-
vation function.
8. Dense layer with 4 neurons and softmax activation function. This layer applies a fully
connected layer to the output of the previous layer, with 4 neurons and softmax acti-
vation function.
The model is compiled using the ’adam’ optimizer, ’sparse categorical crossentropy’ as
the loss function, and ’accuracy’ as the evaluation metric.
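The paper does not state the value of IMAGE SIZE; assuming a 128x128x3 input and Keras defaults (’valid’ padding, stride 1), the layer-by-layer output sizes and parameter counts of the stack above can be traced with simple arithmetic:

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1 (Keras default)."""
    return size - kernel + 1

def conv_params(in_ch, out_ch, kernel=3):
    """Weights per filter = kernel*kernel*in_ch, plus one bias per filter."""
    return (kernel * kernel * in_ch + 1) * out_ch

IMAGE_SIZE = 128                      # assumed value; the paper does not state it

s = conv_out(IMAGE_SIZE)              # Conv2D(32, (3,3)): 128 -> 126
s //= 2                               # MaxPooling2D((2,2)): 126 -> 63
s = conv_out(s)                       # Conv2D(64, (3,3)):  63 -> 61
s //= 2                               # MaxPooling2D((2,2)): 61 -> 30
s = conv_out(s)                       # Conv2D(32, (3,3)):  30 -> 28
flat = s * s * 32                     # Flatten: 28*28*32 = 25088 features

params = [
    conv_params(3, 32),               # first Conv2D:  896
    conv_params(32, 64),              # second Conv2D: 18496
    conv_params(64, 32),              # third Conv2D:  18464
    flat * 16 + 16,                   # Dense(16):     401424
    16 * 4 + 4,                       # Dense(4):      68
]
```

The trace makes the efficiency argument concrete: almost all of the parameters sit in the first fully connected layer, which is why the pooling layers that shrink the feature map matter.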
4.6 Our Transfer Learning Model
The transfer learning approach entails leveraging previously trained models as a foundation for
new models. The VGG 16 architecture is a popular choice for transfer learning due to its
excellent performance in image recognition tasks. We have built a transfer learning model
based on the VGG 16 architecture using the Keras API. The pre-trained VGG 16 model
is loaded with the imagenet weights, and all the layers are set to non-trainable, except for
the last three layers in the VGG block, which are set to trainable. The pre-trained VGG
16 model comes first, followed by a flatten layer, two dropout layers, a dense layer with
128 neurons and ReLU activation, and a final dense layer with a softmax activation function
whose number of neurons equals the number of unique labels in the dataset. This new
sequential model is designed for multi-class classification tasks.
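As a rough sketch of the size of the new head: with the standard 224x224x3 VGG 16 input (an assumption; the paper does not state the input size) the convolutional base ends in a 7x7x512 feature map, and the parameter count of the new classification head (four classes assumed, as in Dataset 2) follows directly:

```python
# VGG 16's convolutional base halves the spatial resolution at each of its
# five pooling stages (a factor of 32 overall) and ends in 512 channels.
INPUT_SIZE = 224                       # assumed: the standard VGG 16 input size
NUM_CLASSES = 4                        # assumed: four labels, as in Dataset 2

feat = INPUT_SIZE // 32                # 7x7 spatial feature map
flat = feat * feat * 512               # Flatten: 25088 features

# Parameters of the new head only (dropout and flatten layers add none)
dense_128 = flat * 128 + 128           # Dense(128) weights + biases
dense_out = 128 * NUM_CLASSES + NUM_CLASSES
head_params = dense_128 + dense_out
```

This counts only the newly added layers; the last three VGG block layers that are unfrozen contribute additional trainable parameters on top of this.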
5 RESULT ANALYSIS
This section presents the results of this work, analyzing how the Convolutional Neural
Network (CNN) and VGG 16 architectures classify our different datasets into different
categories and evaluating their performance.
This analysis used CNN and VGG 16 to classify our datasets into different categories.
The Adam optimizer has been used to train the neural networks, with 10 epochs, a batch
size of 20, and sparse categorical cross-entropy as the loss function. The objective was to
evaluate the performance of these models in accurately classifying MRI and natural images
and to compare their results. We have reported the results for the 3 datasets, which show
that the accuracy of VGG 16 is better than that of CNN for all datasets. For Dataset 1,
the CNN model training accuracy is 99% and test accuracy is 72%, while VGG 16 achieved
98% in the training and 75% in the testing phase. In Dataset 2, both models achieved 99%
in the training phase, but in the testing phase the VGG 16 model (97%) outperformed the
CNN model (95%). In Dataset 3, the CNN model achieved a training accuracy of 99% and
a test accuracy of 82%, while the VGG 16 model achieved 99% in the training phase and
95% in the testing phase. Table 2 shows the training and test accuracy for all models.
Model     Dataset 1       Dataset 2       Dataset 3
          Train   Test    Train   Test    Train   Test
CNN        99      72      99      95      99      82
VGG 16     98      75      99      97      99      95

Table 2: Accuracy performance between CNN and VGG 16
5.1 Evaluation Matrix
In machine learning, performance metrics are used to evaluate the effectiveness of a model.
One such tool is the classification report, which summarizes the quality of a classification
model. The report includes performance metrics such as accuracy, precision, recall,
F1-score, and support [2].
– True Positive (TP): The model correctly predicts the positive class when the actual
outcome is positive.
– True Negative (TN): The model correctly predicts the negative class when the
actual outcome is negative.
– False Positive (FP): The model incorrectly predicts the positive class when the real
class is negative.
– False Negative (FN): The model incorrectly predicts the negative class when the
real class is positive.
– Accuracy: Accuracy measures the percentage of correct predictions made by the model.
Accuracy = (TP + TN)/(TP + TN + FP + FN) (3)
– Precision: Precision measures the percentage of positive images that were correctly
classified out of all the images that the model predicted as positive.
Precision = TP/(TP + FP) (4)
– Recall: Recall measures the percentage of positive images that were correctly classified
out of all the actual positive images.
Recall = TP/(TP + FN) (5)
– F1-Score: The F1-score is the harmonic mean of precision and recall and is used to
evaluate the overall performance of the model.
F1-score = 2 * (Precision * Recall)/(Precision + Recall) (6)
– Support: Support is the total number of images in a certain class. It represents the
number of instances in the dataset that belong to a particular class.
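The definitions above translate directly into code; a small helper (hypothetical, not from the paper) that computes per-class precision, recall, F1-score, and support:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1 and support for one class, per equations (4)-(6)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    support = sum(t == positive for t in y_true)   # actual members of the class
    return precision, recall, f1, support
```

Running the helper once per class label reproduces the per-class rows of a classification report like those in Tables 3 and 4.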
By evaluating these metrics, researchers can determine the quality of their model and
how well it performs in accurately identifying the correct labels for a given set of images.
Finally, we have calculated the Precision, Recall, and F1-Score of the proposed
CNN and VGG 16 model, which are commonly used performance evaluation metrics to
measure the effectiveness of an image classification model in accurately identifying the
correct labels for a given set of images.
In this work, experiments are conducted using three distinct datasets to train Convolu-
tional Neural Network (CNN) and VGG16 models, varying in the sizes of training and test
samples. Notably, Dataset 1, with 3000 data points, exhibited lower F1 scores compared
to the other two datasets. The decision to exclude results from Dataset 1 is based on mul-
tiple considerations. Firstly, the relatively small size of the dataset may limit the models’
ability to generalize effectively, potentially leading to decreased performance. Additionally,
concerns about an imbalanced distribution of samples across classes in Dataset 1 could
bias the models, particularly affecting performance metrics like the F1 score, especially
for minority classes. In conclusion, the exclusion of results from Dataset 1 is a strategic
choice aimed at ensuring the robustness and reliability of the conclusions derived from the
experiments.
(a) Dataset 2
Class                Precision   Recall   F1 score   Support
no tumor               1.00       0.99      0.99       405
meningioma tumor       0.94       0.87      0.90       306
glioma tumor           0.90       0.96      0.93       300
pituitary tumor        0.97       0.98      0.98       300

(b) Dataset 3
Class                Precision   Recall   F1 score   Support
cat                    0.81       0.67      0.73       259
dog                    0.58       0.67      0.62       159
flower                 0.86       0.93      0.90       183
fruit                  0.97       0.99      0.98       187

Table 3: Evaluation Matrix using CNN
The precision, recall, F1 score, and support for the CNN using Dataset 2 and Dataset
3 are shown in Table 3. The table demonstrates that Dataset 3 has lower precision,
recall, and F1 scores than Dataset 2; Dataset 2, shown in Table 3a, has the best precision,
recall, and F1 scores.
(a) Dataset 2
Class                Precision   Recall   F1 score   Support
glioma tumor           0.94       0.98      0.96       300
meningioma tumor       0.97       0.94      0.96       306
no tumor               1.00       1.00      1.00       405
pituitary tumor        0.99       0.98      0.99       300

(b) Dataset 3
Class                Precision   Recall   F1 score   Support
cat                    0.92       0.98      0.95       259
dog                    0.96       0.84      0.90       159
flower                 0.98       0.99      0.99       183
fruit                  1.00       1.00      1.00       187

Table 4: Evaluation Matrix using VGG 16
Table 4 displays the precision, recall, F1 score, and support results for the VGG 16
using Dataset 2 and Dataset 3. The table demonstrates that Dataset 2 has higher
precision, recall, and F1 scores than Dataset 3. For Dataset 2, shown in Table 4a, the model
performs flawlessly on the no tumor class, as seen by its 100% precision, 100% recall, and
100% F1 score. This suggests that the model made no false positive or false negative
predictions for that class, correctly classifying every instance of it in the dataset.
Fig. 6: Model Training History of Dataset 2: (a) CNN, (b) VGG 16
Figure 6 is a training graph of accuracy and loss for CNN and VGG 16 that shows
the performance of the model throughout training for Dataset 2. The accuracy graph of
CNN in figure 6a shows how the accuracy of the model changes over time during training.
Initially, the accuracy may be low, but as the model is trained on more data, it gradually
improves to 99.99%. The loss graph shows how the loss of the model changes over time
during training. The loss should generally decrease over time, as the model learns to make
better predictions.
The accuracy graph for VGG 16 in figure 6b shows the percentage of images that were
correctly classified by the model. Initially, the accuracy is low, but as the model is trained
on more data, the accuracy increases to 99.99%. The loss graph
for VGG 16 shows the amount of error between the predicted and actual labels for the
training data. The loss should generally decrease over time as the model learns to make
better predictions.
Fig. 7: Model Training History of Dataset 3: (a) CNN, (b) VGG 16
Figure 7 is a training graph of accuracy and loss for CNN and VGG 16 that shows
the performance of the model throughout training for Dataset 3. The accuracy graph of
CNN in figure 7a shows how the accuracy of the model changes over time during training.
Initially, the accuracy may be low, but as the model is trained on more data, it gradually
improves to 99%. The loss graph shows how the loss of the model changes over time during
training. The loss should generally decrease over time, as the model learns to make better
predictions.
The accuracy graph for VGG 16 in figure 7b shows the percentage of images that were
correctly classified by the model. Initially, the accuracy is low, but as the model is
trained on more data, the accuracy increases to 99.99%. The loss graph
for VGG 16 shows the amount of error between the predicted and actual labels for the
training data. The loss should generally decrease over time as the model learns to make
better predictions.
6 CONCLUSION
This study presents a comprehensive approach for image segmentation and classification
using pre-processing techniques, K-means algorithm for segmentation, and deep learning
models (CNN and VGG16) for classification. The proposed approach achieves high accuracy
for both MRI and natural images, and the VGG 16 model outperforms the CNN model in
terms of accuracy. It is shown that pre-processing techniques, segmentation, and
deep learning models can be combined to achieve high accuracy in image classification
tasks. The proposed method has the potential to be used in a variety of medical condi-
tions, as MRI imaging is commonly used in medical diagnosis and treatment planning. In
future research we can explore larger datasets, alternative segmentation techniques, trans-
fer learning, ensemble methods, and alternative metrics. These techniques can be used to
improve the robustness, accuracy, and applicability of the system. Further research can
explore the deployment of these systems in clinical settings and real-time segmentation
on low-power devices. As the field continues to evolve, there will be exciting opportunities
for more advanced techniques for analyzing and understanding visual data.
References
1. Kaggle, https://www.kaggle.com/, [Online; accessed 22-March-2023]
2. “Javatpoint,” https://www.javatpoint.com/performance-metrics-in-machine-learning [Online; accessed
25-December-2023].
3. P. Roy, “Natural images,” https://www.kaggle.com/prasunroy/natural-images, 2021 [Accessed:
March 23, 2023].
4. N. Abiwinanda, M. Hanif, S. T. Hesaputra, A. Handayani, and T. R. Mengko, “Brain tumor classification
using convolutional neural network,” in World Congress on Medical Physics and Biomedical Engineering
2018: June 3-8, 2018, Prague, Czech Republic (Vol. 1), pp. 183–189, Springer, 2019
5. E. Miranda, M. Aryuni, and E. Irwansyah, “A survey of medical image classification techniques,”
in 2016 International Conference on Information Management and Technology (ICIMTech), pp. 56-61,
2016
6. B. Padmini, C. Johnson, B. A. Kumar, and G. R. Yadav, “Brain tumor detection by using cnn and
vgg-16,”.
7. S. S. Nath, G. Mishra, J. Kar, S. Chakraborty, and N. Dey, “A survey of image classification methods
and techniques,” in 2014 International Conference on Control, Instrumentation, Communication and
Computational Technologies (ICCICCT), pp. 554-557, 2014.
8. F. Sultana, A. Sufian, and P. Dutta, “Advancements in image classification using convolutional neural
network,” CoRR, vol. abs/1905.03288, 2019.
9. H. A. Khan, W. Jue, M. Mushtaq, and M. U. Mushtaq, “Brain tumor classification in mri image using
convolutional neural network,” Mathematical Biosciences and Engineering, vol. 17, no. 5, pp. 6203-6216,
2020.
10. N. Chattyopadhyay, H. Maity, B. Debnath, D. Ghosh, R. Chakraborty, S. Mitra, R. Islam, and N.
Saha, “Classification of mnist image dataset using improved convolutional neural network”.
11. P. Kumar and U. Dugal, “Tensorflow based image classification using advanced convolutional neural
network,” International Journal of Recent Technology and Engineering (IJRTE), vol. 8, pp. 994-998,
2020.
12. J. Zuluaga-Gomez, Z. Al Masry, K. Benaggoune, S. Meraghni, and N. Zerhouni, “A cnn-based method-
ology for breast cancer diagnosis using thermal images,” Computer Methods in Biomechanics and
Biomedical Engineering: Imaging & Visualization, vol. 9, no. 2, pp. 131-145, 2021.
13. M. Tripathi, “Analysis of convolutional neural network based image classification techniques,” Journal
of Innovative Image Processing (JIIP), vol. 3, no. 02, pp. 100-117, 2021.
14. M. A. B. Siddique, S. Sakib, M. M. R. Khan, A. K. Tanzeem, M. Chowdhury, and N. Yasmin,
“Deep convolutional neural networks model-based brain tumor detection in brain mri images,” in 2020
Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)(I-SMAC),
pp. 909-914, IEEE, 2020.
15. R. Agarwal, “State classification of cooking images using vgg19 network”.
16. D. Febrianto, I. Soesanti, and H. Nugroho, “Convolutional neural network for brain tumor detection,”
in IOP Conference Series: Materials Science and Engineering, vol. 771, p. 012-031, IOP Publishing, 2020.
17. S. U. Hoque, “Performance comparison between vgg16 & vgg19 deep learning method with cnn for
brain tumor detection”.
18. M. K. Abd-Ellah, A. I. Awad, A. A. Khalaf, and H. F. Hamed, “Two-phase multi-model automatic
brain tumour diagnosis system from magnetic resonance images using convolutional neural networks,”
EURASIP Journal on Image and Video Processing, vol. 2018, no. 1, pp. 1-10, 2018.
19. C. R. Pravallika and R. Baskar, “Image processing based brain tumor classification using vgg16 com-
pared with svm to improve accuracy,” pp. 1398-1401, 2022.
20. M. E. Agus, S. Y. Bagas, M. Yuda, N. A. Hanung, and Z. Ibrahim, “Convolutional neural network
featuring vgg-16 model for glioma classification,” JOIV: International Journal on Informatics Visual-
ization, vol. 6, no. 3, pp. 660-666, 2022.
21. S. Bhuvaji, A. Kadam, P. Bhumkar, S. Dedge, and S. Kanchan, “Brain tumor classification (mri),”
2020.
22. M. Nickparvar, “Brain tumor mri dataset,” 2021.
23. S. B. Kulkarni and S. G. Bhirud, “Edge detection techniques for image segmentation: a survey of soft
computing approaches,” International Journal of Emerging Technology and Advanced Engineering, vol.
4, no. 8, pp. 610-614, 2014.
24. Z. Mustafa, M. A. Khan, and M. Riaz, “An overview of image edge detection techniques,” Journal of
Information Processing Systems, vol. 13, no. 1, pp. 1-20, 2017.
25. S. Minaee, Y. Boykov, F. Porikli, A. Plaza, N. Kehtarnavaz, and D. Terzopoulos, “Image segmentation
using deep learning: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.
44, no. 7, pp. 3523-3542, 2022.
26. J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in
Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3431-3440, 2015.
27. S. A. Burney and H. Tariq, “K-means cluster analysis for image segmentation,” International Journal
of Computer Applications, vol. 96, no. 4, 2014.
28. L. Perez and J. Wang, “The effectiveness of data augmentation in image classification using deep
learning,” CoRR, vol. abs/1712.04621, 2017.
29. N. Dhanachandra, K. Manglem, and Y. J. Chanu, “Image segmentation using k-means clustering
algorithm and subtractive clustering algorithm,” Procedia Computer Science, vol. 54, pp. 764–771,
2015.
30. X. Wang, “Laplacian operator-based edge detectors,” IEEE transactions on pattern analysis and ma-
chine intelligence, vol. 29, pp. 886–90, 06 2007.
31. Sekehravani, E. A., Babulak, E., and Masoodi, M. “Implementing canny edge detection algorithm for
noisy image. Bulletin of Electrical Engineering and Informatics”, 9(4), 1404-1410,2020.
32. R. Kaushik and S. Kumar, “Image segmentation using convolutional neural network,” Int. J. Sci.
Technol. Res, vol. 8, no. 11, pp. 667–675, 2019.
33. M. E. Agus, S. Y. Bagas, M. Yuda, N. A. Hanung, and Z. Ibrahim, “Convolutional neural network
featuring vgg-16 model for glioma classification,” JOIV: International Journal on Informatics Visual-
ization, vol. 6, no. 3, pp. 660–666, 2022.
34. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,”
arXiv preprint arXiv:1409.1556, 2014.
Authors
Fatema Tuj Zohra is a recent B.Sc. graduate in Computer Science and Engineering. With
a passion for advancing the field of machine learning, Fatema combines a strong founda-
tion in computer science with a profound curiosity for exploring innovative solutions. She
seeks to revolutionize real-world applications through her insightful contributions and
unwavering dedication to technological progress.
Rifa Tasfia Ratri received B.Sc. Engineering degree in Computer Science and Engi-
neering (CSE) from Notre Dame University Bangladesh. Her current research focuses on
machine learning and image processing.
Dr. Shaheena Sultana is a Professor and Chairman in the department of Computer
Science and Engineering at Notre Dame University Bangladesh. She received M.Sc. Engi-
neering degree in Computer Science and Engineering (CSE) from Bangladesh University
of Engineering and Technology (BUET). She did B.Sc. Engineering degree in Electrical
and Electronic Engineering (EEE) from Khulna University of Engineering and Technology
(KUET). She obtained her Ph.D. in CSE from BUET. Her research interests include Graph
Drawing, Graph Theory, VLSI Design, Embedded System, Data Mining.
Humayara Binte Rashid is currently teaching as a Lecturer in the Department of
Computer Science and Engineering of Notre Dame University Bangladesh (NDUB). She
passed B.Sc. in Computer Science and Engineering (CSE) from Military Institute of Sci-
ence and Technology (MIST). Her current research focuses on machine learning and Data
Mining.