Image Segmentation and Classification for Fin Whale Identification
Michael Ford
Kathleen Moriarty
Elijah Willie
Luke McQuaid
December 10, 2016
1 Problem Description
In many fields of zoology, the ability to identify individual animals is foundational to many research endeavors, and can allow scientists to discover aspects of animal biology such as social dynamics and population distribution. However, depending on the animal being studied, the process of performing an identification may be difficult or involve significant manual labour. A prime example is the identification of fin whales, a species of large cetacean that lives off the coast of British Columbia and whose individuals carry unique identifying markings. There are over 700 catalogued individual fin whales, and when an animal is encountered its photo must be compared by hand against a reference catalogue. In this project we applied various machine learning approaches with the goal of automating this process of fin whale identification.
Figure 1: Example image of a fin whale indicating unique identification features.
1.1 Data set
In collaboration with the Cetacean Research Program, Fisheries and Oceans Canada, we received a catalogue of 79 individuals with 884 images. This represents a small sample of the entire stored catalogue of identification photos; however, the labels for the catalogue are recorded on hard copy, and due to the significant labour involved in manually labeling photos we were restricted to this data set.
1.2 Identifying Challenges
1.2.1 Data set size
Previous work in the area of automated whale identification has shown that this is a feasible problem. In particular, the Kaggle Right Whale Recognition Contest of August 2015 [kag()] had private submissions achieving a log-loss of 0.59600. However, the Kaggle data set contained 4237 photos for 427 individuals, a significant difference in scale from our data set. We recognized from the outset that the size of our data set would restrict our ability to train complex models due to the high chance of over-fitting.
1.2.2 Signal-to-Noise
As can be seen in the figure below, many input photos have significant noise in the background. Due to the poor signal-to-noise ratio, and being unable to perform complex learning for feature extraction given our small data set, we decided to devote significant effort to pre-processing steps prior to doing any classification.
2 Pre-processing
2.1 Segmentation using Markov Random Fields
Markov Random Fields (MRFs) have over the past few decades become a widely used way of denoising and segmenting a broad range of image types. It was therefore natural for the pre-processing of the whale images to include a pipeline built around MRFs. We used a previously implemented MRF model [Sharma()] for processing the images. For our purposes, this model was used for image de-noising and for segmentation by edge detection. The model has user-settable parameters, which is useful for optimization since a user's needs may differ based on the context of use. For this model, we needed to specify the maximum number of iterations, the number of neighbours used in the k-means clustering that classifies a pixel based on its neighbours, and a value for the potential function used in the model.
This pipeline was very memory-demanding, and as such, great care was taken when picking parameters, since running all the samples through the pipeline was quite time-consuming. See the figure below for the result of the MRF applied to an image. This image was processed with the following parameters: (maxIter = 5, k = 3, and potential = 0.5). With these parameters the model took approximately 45 seconds per image to complete. Multiplying this by the total number of images in our data set (838 images), we see that applying this model to all of the images would take approximately 10.5 hours. These parameters proved to be the best compromise between computation time and quality of results, as other settings had longer computation times while yielding similar results.
(a) Original image before MRF transformation
(b) Image after MRF transformation
Figure 2: MRF applied to test image
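As a rough sketch of how such a pre-processing run could be driven in Python, assuming the [Sharma()] repository exposes a single-image entry point (the module name mrf_segmentation and function mrf_segment below are placeholders; the real interface may differ):

import os, time
import cv2  # OpenCV, used here only for image I/O

from mrf_segmentation import mrf_segment  # hypothetical wrapper around the [Sharma()] code

IN_DIR, OUT_DIR = "images/raw", "images/mrf"
MAX_ITER, K, POTENTIAL = 5, 3, 0.5        # parameters chosen in this section

os.makedirs(OUT_DIR, exist_ok=True)
start = time.time()
for name in sorted(os.listdir(IN_DIR)):
    img = cv2.imread(os.path.join(IN_DIR, name), cv2.IMREAD_GRAYSCALE)
    if img is None:
        continue
    # denoise/segment one photo (roughly 45 s per image with these settings)
    seg = mrf_segment(img, max_iter=MAX_ITER, k=K, potential=POTENTIAL)
    cv2.imwrite(os.path.join(OUT_DIR, name), seg)
print("total time: %.1f h" % ((time.time() - start) / 3600.0))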
2.2 Segmentation using Hidden Markov Random Field with Expectation Maximization (HMRF-EM)
In addition to the regular MRF used for image denoising and segmentation by edge detection, we also used a different model for the same purpose. This time we used a model based on a Hidden Markov Random Field, with the Expectation Maximization algorithm used to learn the most probable parameters. This algorithm is fully described in [Wang(2012)]. This source also provided an implemented model, which enabled us to tweak the parameters until we were able to remove an adequate amount of noise from the original image. This model was just as memory-intensive and time-consuming as the regular MRF model; however, it has one fewer parameter. We had to specify the number of iterations and the number of neighbours used in the k-nearest-neighbour step when classifying a pixel in the image. Great care was again taken when picking parameters so that as little noise as possible remained in the image compared to the original. Both of these models (MRF and HMRF-EM) were used for image denoising, segmentation, and generating extra features for downstream training of the final model.
See the figure below for the result of HMRF-EM applied to an image. This image was processed with the following parameters: (maxIter = 5 and k = 3). With these parameters the model took approximately 80 seconds per image to complete. Again, multiplying this by the total number of images in the data set (838 images), we see that applying this model to all of the images would take approximately 17 hours. Given this lengthy computation time, it is clear how necessary parameter optimization was for this model. These parameters nevertheless proved to be the best compromise between computation time and quality of results, as other settings had longer computation times while yielding similar results. Beyond having pre-implemented code available, these models were chosen because they are among the simplest statistical models for image denoising and segmentation that do not assume the variables within a system (here, the pixels of an input image) are independent. They therefore let us model important inter-pixel and conditional dependencies that can be exploited for our pre-processing.
(a) Original image before HMRF-EM transformation
(b) Image after HMRF-EM transformation
Figure 3: HMRF-EM applied to test image
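For illustration only, a simplified, ICM-style sketch of the HMRF-EM idea (this is not the implementation of [Wang(2012)], which should be consulted for the full algorithm): pixel labels are initialized from intensities, then we alternate between re-estimating per-class Gaussian parameters and updating labels by trading off the Gaussian likelihood against a Potts-style neighbourhood penalty.

import numpy as np

def hmrf_em_segment(img, k=3, max_iter=5, beta=1.0):
    """Toy HMRF-EM-style segmentation of a 2-D grayscale image with values in [0, 1]."""
    centers = np.quantile(img, np.linspace(0.1, 0.9, k))        # crude intensity-based init
    labels = np.abs(img[..., None] - centers).argmin(-1)
    for _ in range(max_iter):
        # M-step: per-class Gaussian mean and variance
        mu, var = [], []
        for c in range(k):
            pix = img[labels == c]
            mu.append(pix.mean() if pix.size else centers[c])
            var.append(pix.var() + 1e-6 if pix.size else 1e-2)
        mu, var = np.array(mu), np.array(var)
        # E-step / ICM update: unary negative log-likelihood plus neighbour disagreement penalty
        unary = 0.5 * ((img[..., None] - mu) ** 2 / var + np.log(var))      # shape (h, w, k)
        pad = np.pad(labels, 1, mode="edge")
        neigh = np.stack([pad[:-2, 1:-1], pad[2:, 1:-1],
                          pad[1:-1, :-2], pad[1:-1, 2:]], axis=-1)          # 4-neighbourhood labels
        disagree = (neigh[..., None, :] != np.arange(k)[None, None, :, None]).sum(-1)
        labels = (unary + beta * disagree).argmin(-1)
    return labels

# e.g. seg = hmrf_em_segment(gray.astype(float) / 255.0, k=3, max_iter=5)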
2.3 Cropping
2.3.1 Manual Cropping
While a model to crop images automatically and pipeline them into our CNN identifier was being developed, we still needed cropped images to work on the CNN itself. A small program was therefore created to expedite the process of manually cropping all the photos. Images were greatly reduced in size using an augmenting program, then cropping was done using code developed by [Rosebrock()] for cropping boxes out of an image. The cropped regions were then mapped back onto the full-size images to obtain a high-resolution version of each cropped photo.
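The coordinate-mapping step can be sketched as follows (illustrative only; the function and variable names are ours, not the project code): a box selected on the reduced image is scaled back to the original resolution before the high-resolution crop is extracted.

def crop_fullres(full_img, box_small, scale):
    """Map an (x, y, w, h) box chosen on a downscaled image back to the full-resolution image.
    `scale` is the factor the image was shrunk by (e.g. 0.25)."""
    x, y, w, h = box_small
    X, Y = int(round(x / scale)), int(round(y / scale))
    W, H = int(round(w / scale)), int(round(h / scale))
    return full_img[Y:Y + H, X:X + W]

# e.g. a box picked with the mouse on an image resized to 25% of the original:
# hi_res_crop = crop_fullres(original, (120, 40, 300, 150), scale=0.25)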
3 Whale Identification
3.1 Binary Classification using Pre-trained CNN
The initial modeling approach was to create a binary classifier using two identical, merged CNN structures. Siamese CNN structures have previously been applied with some success to facial recognition [Khalil-Hani(2014)] and to one-shot image recognition [Koch(2015)], which suggested that this type of model would be suitable for whale identification with our small data set. However, due to the small data set size and limited available computational resources, we decided to implement the model using a VGG16 structure pre-trained on the ImageNet database [Chollet(a)]. We removed the last classification block of the VGG, merged two identical models with learning turned off, and then tried several different classification structures, including one or two fully connected layers and a dropout layer, before a single classification node with a sigmoid activation function. Training was performed using full-frame images resized to 224x224. However, we were unable to obtain any results that modeled more than the proportion of positive to negative training examples. This led us to conclude that either the signal-to-noise ratio was not adequate, or that the pre-trained VGG network was not extracting features representative of the inter-individual variation.
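A rough sketch of this structure in the current Keras functional API (the original work used the 2016 Keras merge interface, so details differ; since both branches are identical and frozen, the sketch shares one frozen VGG16 base across the two inputs, and the 256-unit layer size and optimizer are illustrative assumptions):

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, pooling="avg")   # last classification block removed
base.trainable = False                                               # learning turned off on the shared branch

in_a = layers.Input(shape=(224, 224, 3))
in_b = layers.Input(shape=(224, 224, 3))
feat_a, feat_b = base(in_a), base(in_b)

merged = layers.Concatenate()([feat_a, feat_b])       # merge the two identical branches
x = layers.Dense(256, activation="relu")(merged)      # one or two fully connected layers were tried
x = layers.Dropout(0.5)(x)
out = layers.Dense(1, activation="sigmoid")(x)        # same / different individual

model = models.Model([in_a, in_b], out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])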
3.2 Minimal Compute Method to Establish Baseline Results
3.2.1 Method
A model of our minimal compute method, outlined in stages 1 and 2 below, can be seen in Fig. 4.
Figure 4: Model of Minimal Compute Method Pipeline
Stage 1: Feature Extraction
Training a state-of-the-art convolutional neural network (CNN) model for multiclass classification would require more compute power than we had at our disposal. Instead, a "CPU friendly" alternative method was used: whale images were fed through an Inception V3 CNN model pre-trained on the ImageNet data set (similar to Assignment 3 [Mori(2016)]). Outputs from the last average pooling layer were collected and saved as feature vectors for each whale image. Our hypothesis was that the Inception V3 model would have learned enough discriminative information about each whale image to make classification on top of these feature vectors possible.
Several variations of the original images were passed through the pre-trained CNN to obtain different sets of feature vectors (as depicted in Fig. 4). These sets were: original full images, cropped images, and cropped high-resolution images (the latter were not re-sized to the 299x299 V3 input dimensions).
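A minimal sketch of this feature extraction step, assuming the standard keras.applications interface (file paths and helper names are placeholders; pooling="avg" returns the output of the final average-pooling layer directly):

import numpy as np
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input
from tensorflow.keras.preprocessing import image

extractor = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def features_for(path, target_size=(299, 299)):
    img = image.load_img(path, target_size=target_size)
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return extractor.predict(x)[0]          # 2048-dimensional feature vector

# e.g. feats = np.stack([features_for(p) for p in image_paths])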
Stage 2: Support Vector Machine for Classification
The feature vectors of each image, as discussed in Stage 1, were then passed into a support vector machine to classify each image as one of 78 whales.
Feature vectors were either left unnormalized or normalized by scaling each vector to its unit norm (i.e., L2 normalization). This has been shown to be an effective pre-processing step in previous work using SVMs for classification problems ([Simonyan and Zisserman(2015)]).
With L2 normalization, each element $E_x$ of a feature vector $x$ becomes

$$E_{x,\mathrm{norm}} = \frac{E_x}{\lVert x \rVert_2} \qquad (1)$$
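The normalization of Eq. 1 and the classifier can be reproduced with scikit-learn; a minimal sketch with stand-in data (the kernel, C, and fold count below are illustrative assumptions, not the settings reported in Fig. 6):

import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(884, 2048))      # stand-in for the Inception V3 feature vectors
y = rng.integers(0, 78, size=884)     # stand-in whale identities

X_l2 = normalize(X, norm="l2")        # Eq. 1: each row scaled to unit L2 norm
clf = SVC(kernel="linear", C=1.0)
print(cross_val_score(clf, X_l2, y, cv=3).mean())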
However, after visualizing these normalized feature vectors (see Fig. 5), it should be noted that there did not appear to be enough variation between images that could be discerned from their V3-learned features.
Figure 5: Normalized Feature Vectors Learned from Inception V3
3.2.2 Results
Several hyper-parameters were tested during cross-validation; the results are shown in Fig. 6.
Figure 6: Cross-Validation Accuracy Reported Across Several Hyper-parameters
The best choice of hyper-parameters, i.e. the one resulting in the highest validation accuracy, was as follows. Surprisingly, despite L2 normalization outperforming in every other test, unnormalized features resulted in higher accuracy on the validation set. The highest validation accuracy was also achieved in conjunction with a new feature vector, created from the feature vectors of the full images, cropped images, and high-resolution cropped images:
$$\mathrm{Features}_{best} = \frac{\sum_{type} F_{type}}{\mathrm{num}F_{types}} \qquad (2)$$
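Read as an element-wise average over the per-variant feature vectors (our reading of Eq. 2), this combination amounts to the following sketch, with placeholder arrays standing in for the three feature sets:

import numpy as np

rng = np.random.default_rng(0)
full_f, crop_f, hires_f = (rng.normal(size=(884, 2048)) for _ in range(3))
# Eq. 2: element-wise mean across the feature-vector variants for each image
features_best = np.mean(np.stack([full_f, crop_f, hires_f]), axis=0)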
These hyper-parameters were then used to create our final 'CPU-friendly' model, which achieved a classification accuracy of 7.61 percent on our test set (see Fig. 7).
Figure 7: Test Accuracy, Using Chosen "Best" Hyper-parameters
The final model was also tested on whales which had more than 25 images included in the data set, resulting in only 12 classes. Our final model achieved a test accuracy of 19 percent on this simplified problem.
3.3 Dual Input Merged Model
In an attempt to create a model that took in as much signal as possible, we constructed a dual-input merged classification model whose inputs were the cropped original image and the output of the MRF pre-processing applied to the cropped original image. Both inputs were re-sized to 224x224. The structure was based on two independent VGG16 networks pre-trained on ImageNet. Training was performed on the final convolution block as well as an added fully connected layer on top of the VGG networks.
Figure 8: Merged model structure. Blue indicates layers where learning was turned off, while yellow
layers had learning turned on.
The model was trained using SGD with a slow learning rate of 0.00001 and momentum of 0.9, as suggested by 'Building powerful image classification models using very little data' [Chollet(b)]. While a training accuracy of 99.85% was achieved, this was accompanied by zero test accuracy when using 20% of the data set as test data. This indicates that the model was too complex for the size of the data set provided.
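A rough sketch of this structure (Fig. 8) in current Keras, under stated assumptions: the 256-unit fully connected layer size is illustrative, the class count is taken from the catalogue of 79 individuals, and the original 2016 code would have differed in API details.

from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16

NUM_WHALES = 79  # individuals in the catalogue

def vgg_branch(name):
    base = VGG16(weights="imagenet", include_top=False, pooling="avg",
                 input_shape=(224, 224, 3))
    base._name = name                       # give each branch a unique name so both fit in one model
    for layer in base.layers:
        # learning on only for the final convolution block (block5), off everywhere else
        layer.trainable = layer.name.startswith("block5")
    return base

in_img = layers.Input(shape=(224, 224, 3), name="cropped_image")
in_mrf = layers.Input(shape=(224, 224, 3), name="mrf_image")
merged = layers.Concatenate()([vgg_branch("vgg_img")(in_img),
                               vgg_branch("vgg_mrf")(in_mrf)])

x = layers.Dense(256, activation="relu")(merged)     # the added fully connected layer (size illustrative)
out = layers.Dense(NUM_WHALES, activation="softmax")(x)

model = models.Model([in_img, in_mrf], out)
model.compile(optimizer=optimizers.SGD(learning_rate=1e-5, momentum=0.9),
              loss="categorical_crossentropy", metrics=["accuracy"])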
3.4 Simple Convolutional Neural Network
Due to the inability of the pre-trained VGG network to extract relevant features, and the massive over-fitting that resulted when learning was applied to even a limited portion of the VGG, we attempted to train a simple CNN from scratch. The network consisted of a single 2D convolution layer, a max pooling layer, a fully connected 64-node hidden layer, dropout of 0.5, and a classification layer. ReLU was used as the activation function for hidden layers while softmax was used for the classification layer. See the figure below for results.
Input Learning Rate Momentum Train Accuracy Test accuracy
HMRF 0.1 0.0 0.142 0.0074
HMRF 0.00001 0.5 0.128 0.147
HMRF 0.00001 0.9 0.399 0.0662
Cropped 0.00001 0.9 0.0456 0.441
Figure 9: Results from training using different input data sets and parameters. HMRF input refers to the HMRF pre-processing applied to the whole data set, while Cropped refers to the manually cropped data set. All inputs were re-sized to 224x224.
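A minimal Keras sketch of this network, with assumptions hedged: the filter count and kernel size of the convolution layer were not reported and are illustrative, and the learning rate and momentum shown are one of the settings from Fig. 9.

from tensorflow.keras import layers, models, optimizers

NUM_WHALES = 79  # individuals in the catalogue
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),  # single 2D convolution layer
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),             # fully connected 64-node hidden layer
    layers.Dropout(0.5),
    layers.Dense(NUM_WHALES, activation="softmax"),  # classification layer
])
model.compile(optimizer=optimizers.SGD(learning_rate=1e-5, momentum=0.9),
              loss="categorical_crossentropy", metrics=["accuracy"])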
4 Discussion
4.1 Contributions
Michael initiated the project and collaborated with the team from Fisheries in order to obtain the data set and understand the problem from the scientists' perspective. Luke and Elijah teamed up to work on the pre-processing half of the project. While everyone met and dealt with issues together, we all had our own specific jobs. Luke developed a process for cropping images manually and focused on that for the first half of the project so that those working on whale classification could test and refine their programs on better cropped images. Once we had a full set of high-resolution cropped images, Luke continued by assisting where he could with whale classification and with Elijah's automated cropping tool. Kathleen was responsible for the Minimal Compute Method for Baseline Results and the Support Vector Machine for Classification, while Michael was responsible for the Binary Classification using a Pre-Trained CNN, the Dual-Input Merged Model, and the Simple Convolutional Neural Network.
References
[kag()] Kaggle Right Whale Recognition Contest. https://www.kaggle.com/c/noaa-right-whale-recognition.
[Sharma()] Kamal Kishor Sharma. github.com/kamalkishor/Wound_Image_Segmentation_by_Markov_Random_Field.
[Wang(2012)] Quan Wang. HMRF-EM-image: Implementation of the hidden Markov random field model and its expectation-maximization algorithm, 2012.
[Rosebrock()] Adrian Rosebrock. http://www.pyimagesearch.com/2015/03/09/capturing-mouse-clic.
[Khalil-Hani(2014)] M. Khalil-Hani and L. S. Sung. A Convolutional Neural Network Approach for Face Verification. 2014.
[Koch(2015)] Gregory Koch. Siamese neural networks for one-shot image recognition. PhD thesis, University of Toronto, 2015.
[Chollet(a)] Francois Chollet. deep-learning-models. https://github.com/fchollet/deep-learning-models.
[Mori(2016)] Dr. Greg Mori. Assignment 3: Deep learning. 2016.
[Simonyan and Zisserman(2015)] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. ICLR, 2015.
[Chollet(b)] Francois Chollet. Building powerful image classification models using very little data. https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html.
