Deep Learning libraries and first
experiments with Theano
Vincenzo Lomonaco
Alma Mater Studiorum - University of Bologna
vincenzo.lomonaco@studio.unibo.it
Abstract
In recent years, neural networks and deep learning techniques have been shown to perform well on many
problems in image recognition, speech recognition, natural language processing and many other tasks.
As a result, a large number of libraries, toolkits and frameworks have emerged, in different languages
and with different purposes. In this report we first survey these projects, and then choose the
framework that best suits our needs: Theano. Finally, we implement a simple convolutional neural network
with this framework to test both its ease of use and efficiency.
1. Introduction
Deep learning (deep structured learning
or hierarchical learning) is a set of al-
gorithms in machine learning that at-
tempt to model high-level abstractions in data
by using model architectures composed of mul-
tiple non-linear transformations. In the last
decade, the real impact of deep learning on
the industry and a renewed interest in the re-
search in this area led it to become part of
many state-of-the-art systems in different dis-
ciplines, particularly that of computer vision
and automatic speech recognition. Advances in
hardware have also been an important enabling
factor for the resurgence of neural networks
and deep learning architectures. In particular,
powerful graphics processing units (GPUs) are
highly suited for the kind of number crunch-
ing, matrix/vector math involved in machine
learning. GPUs have been shown to speed up
training algorithms by orders of magnitude,
bringing running times of weeks back to days.
However, it is well known that machine learn-
ing comes with huge maintenance costs and is
extremely hard to debug, particularly when
the algorithms are heavily optimized for spe-
cific hardware such as GPUs. For this reason, a
large number of libraries and software utilities
have emerged to help developers focus on the
model side, letting the framework or library
do the rest. In the following section we take a
look at different open-source projects that fol-
low this path. Then we focus on the four most
popular general-purpose frameworks, explain-
ing them in greater detail and choosing the
one that best suits our needs and our resources.
In the second part of the report, we test the
chosen framework by developing a simple con-
volutional neural network and performing some
experiments on a known dataset. Finally, in the
last section we draw the main conclusions.
2. Available Deep Learning
libraries and frameworks
In this section we provide an overview of
the main projects produced by the deep
learning community, which work at different
abstraction levels and serve different purposes.
In fact, each of them can be particularly help-
ful for a specific task. In general, they come
from research groups in both academia and
industry and are open source, licensed under
BSD or similar licenses.
Theano [24] is a Python library that al-
lows you to define, optimize, and evalu-
ate mathematical expressions involving multi-
dimensional arrays efficiently.
Pylearn2 [26] is a machine learning library.
Most of its functionality is built on top of
Theano. This means you can write Pylearn2
plugins (new models, algorithms, etc) using
mathematical expressions, and Theano will op-
timize and stabilize those expressions for you,
and compile them to a back-end of your choice
(CPU or GPU).
Torch [25] is a scientific computing frame-
work with wide support for machine learn-
ing algorithms. It is easy to use and efficient,
thanks to an easy and fast scripting language,
LuaJIT, and an underlying C/CUDA imple-
mentation.
Deeplearning4j [5] is the first commercial-
grade, open-source, distributed deep-learning
library written for Java and Scala. Integrated
with Hadoop and Spark, DL4J is designed to
be used in business environments, rather than
as a research tool. It aims to be cutting-edge
plug and play, more convention than configu-
ration, which allows for fast prototyping for
non-researchers.
Caffe [29] is a deep learning framework
made with expression, speed, and modularity
in mind. It is developed by the Berkeley Vi-
sion and Learning Center (BVLC) and by com-
munity contributors. Yangqing Jia created the
project during his PhD at UC Berkeley. Caffe
is released under the BSD 2-Clause license.
NVIDIA cuDNN [19] is a GPU-accelerated
library of primitives for deep neural networks.
It emphasizes performance, ease-of-use, and
low memory overhead. NVIDIA cuDNN is
designed to be integrated into higher-level ma-
chine learning frameworks, such as UC Berke-
ley’s popular Caffe software. The simple, drop-
in design allows developers to focus on de-
signing and implementing neural net models
rather than tuning for performance, while still
achieving the high performance modern paral-
lel computing hardware affords.
DeepLearnToolbox [6] is a Matlab/Octave
toolbox for deep learning. It includes deep be-
lief nets, stacked autoencoders, convolutional
neural nets, convolutional autoencoders and
vanilla neural nets. It is in its early stages of
development.
Cuda-Convnet2 [3] is a fast C++/CUDA im-
plementation of convolutional (or more gen-
erally, feed-forward) neural networks. It can
model arbitrary layer connectivity and net-
work depth. Any directed acyclic graph of
layers will do. Training is done using the back-
propagation algorithm, and it now supports
multi-GPU training parallelism.
RNNLM [30] is a toolkit, developed by Tomas
Mikolov, that can be used to train, evaluate and
use neural network based language models.
RNNLIB [27] is a recurrent neural network
library for sequence learning problems. Ap-
plicable to most types of spatio-temporal data,
it has proven particularly effective for speech
and handwriting recognition.
LUSH [12] is an object-oriented program-
ming language designed for researchers, ex-
perimenters, and engineers interested in large-
scale numerical and graphic applications. Lush
is designed to be used in situations where
one would want to combine the flexibility of a
high-level, weakly-typed interpreted language,
with the efficiency of a strongly-typed, natively-
compiled language, and with the easy integra-
tion of code written in C, C++, or other lan-
guages.
Eblearn.lsh [9] is a LUSH-based machine
learning library for energy-based learning,
led by Koray Kavukcuoglu. It includes
code for “predictive sparse decomposition”
and other sparse auto-encoder methods for un-
supervised learning.
Eblearn [34] is a C++ machine learning
library with a BSD license for energy-
based learning, convolutional networks, vi-
sion/recognition applications, etc. EBLearn
is primarily maintained by Pierre Sermanet at
NYU.
MShadow [15] is a lightweight CPU/GPU
matrix/tensor template library in C++/CUDA.
Its goal is to provide an efficient, device-
invariant and simple tensor library for machine
learning projects that aim for both simplicity
and performance. It supports CPU, GPU,
multi-GPU and distributed systems.
Nengo [23] is a graphical and scripting based
software package for simulating large-scale
neural systems. To use Nengo, you define
groups of neurons in terms of what they rep-
resent, and then form connections between
neural groups in terms of what computa-
tion should be performed on those representa-
tions. Nengo then uses the Neural Engineering
Framework (NEF) to solve for the appropri-
ate synaptic connection weights to achieve this
desired computation. Nengo also supports var-
ious kinds of learning. Nengo helps make de-
tailed spiking neuron models that implement
complex high-level cognitive algorithms.
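As an illustration of this workflow, here is a
minimal sketch using Nengo's Python scripting
interface (a sketch only: the exact API may differ
between Nengo versions, and the ensemble sizes
are arbitrary):

    import numpy as np
    import nengo

    with nengo.Network() as model:
        stim = nengo.Node(np.sin)              # a time-varying input signal
        a = nengo.Ensemble(100, dimensions=1)  # 100 neurons representing a scalar
        b = nengo.Ensemble(100, dimensions=1)
        nengo.Connection(stim, a)
        # the NEF solves for the synaptic weights that compute x**2
        nengo.Connection(a, b, function=lambda x: x ** 2)

    sim = nengo.Simulator(model)
    sim.run(1.0)                               # simulate one second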
cudamat [31] is a Python module for per-
forming basic dense linear algebra computa-
tions on the GPU using CUDA. The current fea-
ture set of cudamat is biased towards features
needed for implementing some common ma-
chine learning algorithms. Example implemen-
tations of some feedforward neural networks
and restricted Boltzmann machines are pro-
vided with cudamat.
Gnumpy [35] is a Python module that inter-
faces in a way almost identical to numpy, but
does its computations on the GPU. It runs on
top of cudamat.
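As a flavour of this numpy-like usage, a minimal
sketch (assuming a working cudamat installation;
we assume the function names mirror their numpy
counterparts, as in Gnumpy's documentation):

    import numpy as np
    import gnumpy as gpu

    a = gpu.garray(np.random.rand(1000, 1000))  # copy a numpy array to the GPU
    b = gpu.garray(np.random.rand(1000, 1000))
    c = gpu.dot(a, b)             # matrix product computed on the GPU
    print(c.sum())                # reductions work like numpy
    c_host = c.as_numpy_array()   # copy the result back to host memory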
CUV Library [4] is a C++ framework with
python bindings for easy use of Nvidia CUDA
functions on matrices. It contains an RBM im-
plementation, as well as annealed importance
sampling code and code to calculate the parti-
tion function exactly. It is provided by the AIS
lab at the University of Bonn.
ConvNet [1] is a Matlab toolbox for convo-
lutional neural networks, including the invari-
ant backpropagation algorithm (IBP). It has
versions for GPU and CPU, written in CUDA,
C++ and Matlab. All versions work identi-
cally. The GPU version uses kernels from Alex
Krizhevsky's library cuda-convnet2.
neuralnetworks [17] is a Java implementa-
tion of some of the algorithms for training deep
neural networks. GPU support is provided
via OpenCL and Aparapi. The architecture
is designed with modularity, extensibility
and pluggability in mind. At the moment
it supports multilayer perceptrons, restricted
Boltzmann machines, autoencoders, deep belief
networks, stacked autoencoders, and convolu-
tional networks with max pooling, average
pooling and stochastic pooling.
convnetjs [2] is a Javascript library for train-
ing deep learning models (mainly neural net-
works) entirely in the browser. The software
has no other dependencies and doesn't require
any installation. It currently supports com-
mon neural network modules, classification
(SVM/Softmax) and regression (L2) cost func-
tions, a MagicNet class for fully automatic neu-
ral network learning (automatic hyperparam-
eter search and cross-validation), the ability to
specify and train convolutional networks that
process images, and an experimental reinforce-
ment learning module based on Deep Q-Learning.
NuPIC [18] is a set of learning algorithms
written in Python and C++ that implements a
Hierarchical Temporal Memory (HTM), first
described in a white paper published by Nu-
menta in 2009 [28]. The learning algorithms
try to faithfully capture how layers of neurons
in the neocortex learn.
PyBrain [33] is a modular Machine Learning
Library for Python. Its goal is to offer flexi-
ble, easy-to-use yet still powerful algorithms
for Machine Learning Tasks and a variety of
predefined environments to test and compare
algorithms. It currently supports various algo-
rithms for neural networks, for reinforcement
learning (and the combination of the two), for
unsupervised learning, and evolution.
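As a flavour of this ease of use, a minimal sketch
that builds and trains a small network on the XOR
problem with PyBrain's shortcut API (the network
size and number of training passes are our own
arbitrary choices):

    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.datasets import SupervisedDataSet
    from pybrain.supervised.trainers import BackpropTrainer

    # a feed-forward net with 2 inputs, 3 hidden units and 1 output
    net = buildNetwork(2, 3, 1)

    # the XOR truth table as a supervised dataset
    ds = SupervisedDataSet(2, 1)
    for inp, target in [((0, 0), (0,)), ((0, 1), (1,)),
                        ((1, 0), (1,)), ((1, 1), (0,))]:
        ds.addSample(inp, target)

    trainer = BackpropTrainer(net, ds)
    for _ in range(1000):          # number of passes chosen arbitrarily
        trainer.train()
    print(net.activate((1, 0)))    # should be close to 1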
deepnet [7] is a GPU-based Python imple-
mentation of feed-forward neural nets, re-
stricted Boltzmann machines, deep belief nets,
autoencoders, deep Boltzmann machines and
convolutional nets. It is built on top of the
cudamat library by Volodymyr Mnih and the
cuda-convnet2 library by Alex Krizhevsky.
DeepPy [8] tries to combine state-of-the-art
deep learning models with a Pythonic interface
in an extensible framework. It runs on the CPU
or on Nvidia GPUs, when available (thanks
to CUDArray).
hebel [11] is a library for deep learning with
neural networks in Python using GPU acceler-
ation with CUDA through PyCUDA. It imple-
ments the most important types of neural net-
work models and offers a variety of different
activation functions and training methods such
as momentum, Nesterov momentum, dropout,
and early stopping.
Mocha [14] is a Deep Learning framework
for Julia, inspired by the C++ framework Caffe.
Efficient implementations of general stochastic
gradient solvers and common layers in Mocha
could be used to train deep/shallow (con-
volutional) neural networks, with (optional)
unsupervised pre-training via (stacked) auto-
encoders.
OpenDL [20] is a deep learning training li-
brary based on the Spark framework. OpenDL
supports gradient-based training of models
such as logistic regression (Softmax), back-
propagation networks, autoencoders, RBMs
and convolutional networks, all of which can
be incorporated into the OpenDL framework.
MGL [13] is a Common Lisp machine learn-
ing library by Gábor Melis with some parts
originally contributed by Ravenpack Interna-
tional. It mainly concentrates on various forms
of neural networks (boltzmann machines, feed-
forward and recurrent backprop nets). Most
of MGL is built on top of MGL-MAT so it has
BLAS and CUDA support. In general, the fo-
cus is on power and performance not on ease
of use.
Gensim [10] is an open-source vector space
modeling and topic modeling toolkit, imple-
mented in the Python programming language,
using NumPy, SciPy and optionally Cython for
performance. It is specifically intended for han-
dling large text collections, using efficient on-
line algorithms. Gensim includes implementa-
tions of tf–idf, random projections, deep learn-
ing with Google’s word2vec algorithm (reim-
plemented and optimized in Cython), hierar-
chical Dirichlet processes (HDP), latent seman-
tic analysis (LSA) and latent Dirichlet alloca-
tion (LDA), including distributed parallel ver-
sions.
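For instance, a minimal word2vec sketch (the toy
corpus and the hyperparameter values are ours,
purely for illustration; `size` is the embedding
dimensionality):

    from gensim.models import Word2Vec

    # a toy corpus: each sentence is a list of tokens
    sentences = [["deep", "learning", "is", "fun"],
                 ["neural", "networks", "learn", "deep", "representations"]]

    model = Word2Vec(sentences, size=10, window=2, min_count=1)

    print(model["deep"])              # the learned 10-dimensional vector
    print(model.most_similar("deep")) # nearest words in the embedding space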
ND4J [16] is a scientific computing library for
the JVM. It is meant to be used in production
environments rather than as a research tool,
which means routines are designed to run fast
with minimal RAM requirements. Its main
features are a versatile n-dimensional array
object, multiplatform functionality including
GPUs, and linear algebra and signal processing
functions.
As we have seen, there are a great number of
projects dealing with dense numeric computa-
tions on both CPUs and GPUs. Most of them
are optimized to suit DL algorithms and offer
a number of primitives or classes that help
developers write simple and readable code.
However, they work at different abstraction
levels, and it is not uncommon to find projects
built on top of other, lower-level projects.
3. Most popular frameworks
In this section we explore in greater detail
the most popular general-purpose frameworks
used today to solve various DL tasks.
We decided to focus on four main projects:
• Theano
• Torch
• Deeplearning4j
• Caffe
3.1. Theano
As we said before, Theano is a Python library
that lets you define, optimize, and evaluate
mathematical expressions, especially ones with
multi-dimensional arrays (numpy.ndarray). Us-
ing Theano it is possible to attain speeds rival-
ing hand-crafted C implementations for prob-
lems involving large amounts of data. It can
also surpass C on a CPU by many orders of
magnitude by taking advantage of recent GPUs.
Theano combines aspects of a computer alge-
bra system (CAS) with aspects of an optimizing
compiler. It can also generate customized C
code for many mathematical operations. This
combination of CAS with optimizing compi-
lation is particularly useful for tasks in which
complicated mathematical expressions are eval-
uated repeatedly and evaluation speed is crit-
ical. For situations where many different ex-
pressions are each evaluated once Theano can
minimize the amount of compilation/analysis
overhead, but still provide symbolic features
such as automatic differentiation.
There are currently two ways to use GPUs in
Theano: one supports only NVIDIA cards (the
CUDA backend), while the other, still in de-
velopment, should support any OpenCL
device as well as NVIDIA cards (the GpuArray
backend).
One thing to keep in mind is that the "build-
ing blocks" you get in Theano are not ready-
made neural network layer classes, but rather
symbolic function expressions that can be
composed into other expressions. The work
is done at a slightly lower level of abstrac-
tion, but this means a lot more flexibility.
(That said, if one needs 'plug and play' neural
networks, one can use Pylearn2, which is
built on top of Theano.)
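To make this concrete, here is a minimal sketch
(not part of the report's experiments) that builds
a symbolic expression, compiles it, and obtains
its gradient through automatic differentiation:

    import theano
    import theano.tensor as T

    # a symbolic matrix and an expression built from it
    x = T.dmatrix('x')
    s = 1 / (1 + T.exp(-x))          # element-wise logistic function

    # compile the expression into a callable function
    logistic = theano.function([x], s)
    print(logistic([[0, 1], [-1, -2]]))

    # symbolic (automatic) differentiation of a scalar cost
    cost = s.sum()
    grad = T.grad(cost, x)
    dlogistic = theano.function([x], grad)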
Theano was written at the LISA lab to sup-
port rapid development of efficient machine
learning algorithms and is released under a
BSD license.
3.2. Torch
The goal of Torch is to have maximum flexi-
bility and speed in building your scientific al-
gorithms while making the process extremely
simple. Torch comes with a large ecosystem of
community-driven packages in machine learn-
ing, computer vision, signal processing, par-
allel processing, image, video, audio and net-
working among others, and builds on top of
the Lua community.
At the heart of Torch are the popular neural
network and optimization libraries which are
simple to use, while having maximum flexibil-
ity in implementing complex neural network
topologies. You can build arbitrary graphs
of neural networks, and parallelize them over
CPUs and GPUs in an efficient manner.
Torch core features are:
• A powerful N-dimensional array.
• Lots of routines for indexing, slicing,
transposing, etc.
• Amazing interface to C, via LuaJIT.
• Linear algebra routines.
• Neural network, and energy-based mod-
els.
• Numeric optimization routines.
• Fast and efficient GPU support.
• Embeddable, with ports to iOS, Android
and FPGA backends.
In the words of Soumith Chintala, one of the
authors of Torch:
It’s like building some kind of electronic
contraption or, like, a Lego set. You
just can plug in and plug out all these
blocks that have different dynamics and
that have complex algorithms within
them.
At the same time Torch is actually not
extremely difficult to learn unlike, say,
the Theano library.
We’ve made it incredibly easy to use.
We introduce someone to Torch, and
they start churning out research really
fast.
So it has a slightly higher level of abstrac-
tion than Theano, but it lacks the familiar
and rich Python ecosystem. With respect to
GPU support, it works well with different
backends depending on the task. CUDA is
supported by installing the cutorch package.
Alternatives are NVIDIA cuDNN, which is
very reliable, the cuda-convnet2 bindings, or
the nnbhwd package.
3.3. Deeplearning4j
In a nutshell, Deeplearning4j lets you compose
deep nets from various shallow nets, each of
which forms a layer. This flexibility lets you
combine restricted Boltzmann machines, au-
toencoders, convolutional nets and recurrent
nets as needed in a distributed, production-
grade framework on Spark, Hadoop and else-
where.
To sum up, the DL4J’s main features are:
• A versatile n-dimensional array class.
• GPU integration.
• Scalable on Hadoop, Spark and Akka +
AWS and other platforms.
Torch, while powerful, was not designed
to be widely accessible to the Python-based
academic community, nor to corporate soft-
ware engineers, whose lingua franca is Java.
Deeplearning4j was written in Java to reflect
its focus on industry and ease of use. In fact,
usability is often the limiting factor that
inhibits wider adoption of deep-learning im-
plementations. A great thing in DL4J is that
you can choose CUDA compatible GPUs or
native CPUs for your backend processing by
changing just one line in a configuration file.
Moreover, while both Torch7 and DL4J employ
parallelism, DL4J's parallelism is automatic:
the setup of worker nodes and connections
is automated, allowing users to bypass lower-
level libraries while creating a massively par-
allel network on Spark, Hadoop, or with Akka
and AWS.
3.4. Caffe
Caffe is developed by the Berkeley Vision and
Learning Center (BVLC) and by community
contributors, and is released under the BSD
2-Clause license. Yangqing Jia created the
project during his PhD at UC Berkeley. In one
sip, Caffe is brewed for:
• Expression: models and optimizations
are defined as plaintext schemas instead
of code.
• Speed: for research and industry alike
speed is crucial for state-of-the-art mod-
els and massive data.
• Modularity: new tasks and settings re-
quire flexibility and extension.
• Openness: scientific and applied
progress call for common code, refer-
ence models, and reproducibility.
• Community: academic research, startup
prototypes, and industrial applications
all share strength by joint discussion and
development in a BSD-2 project.
Deep networks are compositional models that
are naturally represented as a collection of
inter-connected layers that work on chunks of
data. Caffe defines a net layer-by-layer in its
own model schema. The network defines the
entire model bottom-to-top from input data to
loss. As data and derivatives flow through the
network in the forward and backward passes
Caffe stores, communicates, and manipulates
the information as blobs: the blob is the stan-
dard array and unified memory interface for
the framework.
In terms of GPU support, Caffe works with
both CUDA and NVIDIA cuDNN. One current
shortcoming is multi-GPU support: at the
moment Caffe can use multiple GPUs only in
a standalone fashion.
4. Our resources and the
framework chosen
In this section we provide a brief explanation
of the framework we chose. First of all, we
limited our interest to the four frameworks
detailed above. Then we selected the one,
Theano, that best suits our needs. In fact, we
looked for a framework that has good sup-
port for Windows, a good set of tutorials
and examples, great portability, ease of use,
and wide adoption in the academic commu-
nity. Torch7, the newest version of Torch,
doesn't seem to provide complete Windows
support: even though, like Theano, it is opti-
mized for Linux x64 machines, it doesn't pro-
vide any documentation about installation on
Windows. Moreover, Caffe and Deeplearning4j
were discarded from the start because Caffe has
only an unofficial Windows port and Deep-
learning4j carries a lot of dependencies in the
Java ecosystem, which is not something we
want to bear. Another reason to choose Theano
over Torch7 is that it is widely used in the
academic community.
Torch7 is used by Google DeepMind, the Face-
book AI Research Group, the Computational
Intelligence, Learning, Vision, and Robotics
Lab at NYU and the Idiap Research Institute,
but it is mainly because its main authors are
part of these institutions. The main article of
Theano, instead, has more than 300 citations
from a wide range of academics, including Y.
Bengio, T. Mikolov, I. Goodfellow and a lot of
research groups from many universities like
the Bernstein Center for Computational Neuro-
science at the Freie Universität Berlin, the Har-
vard Intelligent Probabilistic Systems group,
the Reservoir Lab at Ghent University in Ghent
in Belgium, the Centre for Theoretical Neuro-
science, University of Waterloo, the Center for
Language and Speech Processing at the Johns
Hopkins University and many others.
Other prominent researchers in the area, in-
stead, seem not to use any of them. Juergen
Schmidhuber and his team at the Dalle Molle
Institute for Artificial Intelligence are pushing
their own framework, PyBrain; the Toronto
group led by Geoffrey Hinton and the Stanford
group led by Andrew Ng, finally, seem to pre-
fer custom implementations of ad-hoc algo-
rithms rather than general-purpose frameworks.
5. Experiments and results
In this section we provide a brief use case for
Theano: implementing a convolutional neural
network and testing it on a subsampled
NORB dataset.
We used and modified the code provided in the
Theano tutorial library [22] (which already im-
plements a LeNet-5 for recognizing handwrit-
ten digit images from the classic MNIST dataset)
and then performed two simple experiments
on it, reaching a good level of accuracy despite
the low amount of training data.
The dataset The original NORB dataset is in-
tended for experiments in 3D object recogni-
tion from shape. It contains images of 50 toys
belonging to 5 generic categories: four-legged
animals, human figures, airplanes, trucks, and
cars. The objects are imaged by two cameras
under 6 lighting conditions, 9 elevations (30 to
70 degrees every 5 degrees), and 18 azimuths
(0 to 340 every 20 degrees). In this work we
used a subsampled dataset in which each im-
age is 32x32 pixels, with 200 images per cat-
egory in the training set and 500 per category
in the validation and the test set.
The model The model is a slight modifi-
cation of the one described in the original work
by LeCun [21] on the NORB dataset. In this
case we consider one input channel of 32x32
pixels and apply the first convolutional layer
with a 3x3 filter. We end up with 8 feature
maps of 30x30 pixels, which are then sub-
sampled with max-pooling to 8 maps of 15x15.
After that, another convolution is applied with
a 6x6 filter and subsampled again to obtain
25 maps of 5x5. Before the fully connected
layer, the last convolution is applied using a
5x5 filter.
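A minimal Theano sketch of the convolution and
pooling stages described above (a simplified re-
construction, not the tutorial code: the weight
initialization is arbitrary and the import paths
follow the 2015-era Theano API):

    import numpy as np
    import theano
    import theano.tensor as T
    from theano.tensor.nnet import conv
    from theano.tensor.signal import downsample

    rng = np.random.RandomState(1234)

    def conv_pool(inp, filter_shape, pool=(2, 2)):
        # one convolution + max-pooling stage with a tanh activation
        W = theano.shared(rng.uniform(-0.1, 0.1, filter_shape)
                             .astype(theano.config.floatX))
        b = theano.shared(np.zeros(filter_shape[0],
                                   dtype=theano.config.floatX))
        out = downsample.max_pool_2d(conv.conv2d(inp, W), pool,
                                     ignore_border=True)
        return T.tanh(out + b.dimshuffle('x', 0, 'x', 'x'))

    x = T.tensor4('x')                 # (batch, 1, 32, 32) input images
    h1 = conv_pool(x, (8, 1, 3, 3))    # 8 maps of 30x30, pooled to 15x15
    h2 = conv_pool(h1, (25, 8, 6, 6))  # 25 maps of 10x10, pooled to 5x5
    # a last 5x5 convolution reduces each map to 1x1 before the
    # fully connected softmax layer over the 5 NORB categories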
Figure 1: The LeNet-5 model used for the experiments.
Taken from [32]
The experiments After having modified the
code to implement our model, we decided to
perform two simple experiments on our dataset
to see how it works. Since stochastic gradient
descent is used for training, we chose to dis-
tinguish the experiments only by the maximum
number of epochs: the first experiment runs up
to 100 epochs, the second up to 500. The orig-
inal code also had an early-stopping parameter
depending on the performance increment at
each epoch, but we chose to ignore it in order
to see how the model behaves up to the point
of overfitting. The other parameters were the
batch size, set to 100, and the learning rate, set
to 0.06 (note that these parameters stay the
same during the whole training). Theano lets
you decide on which hardware the code has to
be executed without changing a single line of
code. By default it searches for a GPU and if
it is not found it does the job on the CPU. In
our case the experiments were performed on
a laptop with an Intel Core 2 Duo and a CPU
usage-limit of 35%.
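For example, the device can be selected through
Theano's configuration flags, without touching the
model code (a minimal sketch; the same flags can
equally be set in a .theanorc file or on the
command line):

    import os

    # THEANO_FLAGS must be set before theano is imported; with device=gpu,
    # Theano tries the CUDA backend and falls back to the CPU if no GPU
    # is found
    os.environ['THEANO_FLAGS'] = 'device=gpu,floatX=float32'

    import theano
    print(theano.config.device)   # 'gpu' or 'cpu'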
The results Table 1 shows the time needed to
train the model in each experiment, together
with the error rate of the best model selected on
the validation set but computed on the joint
validation and test set.

Experiment      Training time   Error rate
Exp1-100ep      16.73 m         22.36 %
Exp2-500ep      72.00 m         22.16 %

Table 1: Experiment results: training time and
error rate of the best model found, computed on
the whole test+validation set.
In the first experiment the model could be
trained in about five minutes using the full
CPU, and the second one in about 25 minutes.
The reached accuracy is good considering that
only 200 images for each category are used in
the training set.
In figure 2 it is possible to see the mean of
the negative log likelihood computed for each
epoch during the training stage. In figure 3,
instead, the error rates on both the validation
set and the test set are plotted.
Figure 2: The negative log likelihood computed in the
training for each epoch of the first experiment.
Figure 3: The error rate percentage for both validation
and test set for each epoch in the first experi-
ment.
Note that the error rate on the test set is com-
puted only when the performance on the vali-
dation set improves on the best value seen so
far. After some
oscillations, the error keeps decreasing until
epoch 100 is reached.
Figure 4 and figure 5 show what happens
after the 100th epoch: the negative log likeli-
hood keeps decreasing, while the error rate on
the validation set no longer decreases; on the
contrary, it slightly increases.
Figure 4: The negative log likelihood computed in the
training for each epoch of the second experi-
ment.
Figure 5: The error rate percentage for both validation
and test set for each epoch in the second exper-
iment.
6. Conclusion
In this report we took a look at the different
projects that have come out in the last decade
as a result of a renewed interest in deep learning
and neural networks. They differ in the objec-
tives and features they offer, in the programming
language, and in the abstraction level at which
they work. We isolated four main projects that
were found to be the most popular in the re-
search community or in industry. All of them
offer great flexibility and are easy to use (to some
extent), even though they perform heavy compu-
tational optimizations for both CPUs and GPUs.
In the second part of the report we focused
on Theano, which is the best choice with respect
to our resources and needs. With it, we imple-
mented a convolutional neural network using
the code provided in the tutorial library. Then
we applied this model to a subsampled NORB
dataset and got the expected results. In con-
clusion, Theano seems to be very flexible and
portable, bringing with it the beauty of the
Python ecosystem, a good set of tutorials and
well documented code.
References
[1] ConvNet. https://github.com/sdemyanov/ConvNet. Accessed: 2015-03-03.
[2] ConvNetJS. http://cs.stanford.edu/people/karpathy/convnetjs/. Accessed: 2015-03-03.
[3] Cuda-convnet2. https://code.google.com/p/cuda-convnet2/. Accessed: 2015-03-03.
[4] CUV library. http://www.ais.uni-bonn.de/deep_learning/downloads.html. Accessed: 2015-03-03.
[5] Deeplearning4j. http://deeplearning4j.org/. Accessed: 2015-03-03.
[6] DeepLearnToolbox. https://github.com/rasmusbergpalm/DeepLearnToolbox. Accessed: 2015-03-03.
[7] deepnet. https://github.com/nitishsrivastava/deepnet. Accessed: 2015-03-03.
[8] DeepPy. https://github.com/andersbll/deeppy. Accessed: 2015-03-03.
[9] Eblearn.lsh. http://koray.kavukcuoglu.org/code.html. Accessed: 2015-03-03.
[10] Gensim. https://radimrehurek.com/gensim/. Accessed: 2015-03-03.
[11] hebel. https://github.com/hannes-brt/hebel. Accessed: 2015-03-03.
[12] Lush. http://lush.sourceforge.net/. Accessed: 2015-03-03.
[13] MGL. http://melisgl.github.io/mgl-pax-world/mgl-manual.html. Accessed: 2015-03-03.
[14] Mocha. https://github.com/pluskid/Mocha.jl. Accessed: 2015-03-03.
[15] MShadow. https://github.com/tqchen/mshadow. Accessed: 2015-03-03.
[16] ND4J. http://nd4j.org/. Accessed: 2015-03-03.
[17] neuralnetworks. https://github.com/ivan-vasilev/neuralnetworks. Accessed: 2015-03-03.
[18] NuPIC. http://numenta.org/. Accessed: 2015-03-03.
[19] NVIDIA cuDNN. https://developer.nvidia.com/cuDNN. Accessed: 2015-03-03.
[20] OpenDL. https://github.com/guoding83128/OpenDL. Accessed: 2015-03-03.
[21] NORB: Generic object recognition in images. http://www.cs.nyu.edu/~yann/research/norb/, 2011. Accessed: 2015-03-03.
[22] Theano documentation: Deep learning tutorials. http://deeplearning.net/tutorial/lenet.html, 2015. Accessed: 2015-03-03.
[23] Trevor Bekolay, James Bergstra, Eric Hunsberger, Travis DeWolf, Terrence C. Stewart, Daniel Rasmussen, Xuan Choo, Aaron Russell Voelker, and Chris Eliasmith. Nengo: a Python tool for building large-scale functional brain models. Frontiers in Neuroinformatics, 7, 2013.
[24] James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio. Theano: a CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy), volume 4, page 3. Austin, TX, 2010.
[25] Ronan Collobert, Koray Kavukcuoglu, and Clément Farabet. Torch7: A Matlab-like environment for machine learning. In BigLearn, NIPS Workshop, number EPFL-CONF-192376, 2011.
[26] Ian J. Goodfellow, David Warde-Farley, Pascal Lamblin, Vincent Dumoulin, Mehdi Mirza, Razvan Pascanu, James Bergstra, Frédéric Bastien, and Yoshua Bengio. Pylearn2: a machine learning research library. arXiv preprint arXiv:1308.4214, 2013.
[27] Alex Graves. RNNLIB: A recurrent neural network library for sequence learning problems, 2008.
[28] Jeff Hawkins and Dileep George. Hierarchical temporal memory: Concepts, theory and terminology. Technical report, Numenta, 2006.
[29] Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[30] Tomas Mikolov, Stefan Kombrink, Anoop Deoras, Lukas Burget, and Jan Cernocky. RNNLM: recurrent neural network language modeling toolkit. In Proc. of the 2011 ASRU Workshop, pages 196-201, 2011.
[31] Volodymyr Mnih. CUDAMat: a CUDA-based matrix class for Python. Department of Computer Science, University of Toronto, Tech. Rep. UTML TR, 4, 2009.
[32] Luca Nicolini. Convolutional neural network, 2012.
[33] Tom Schaul, Justin Bayer, Daan Wierstra, Yi Sun, Martin Felder, Frank Sehnke, Thomas Rückstieß, and Jürgen Schmidhuber. PyBrain. The Journal of Machine Learning Research, 11:743-746, 2010.
[34] Pierre Sermanet, Koray Kavukcuoglu, and Yann LeCun. EBLearn: Open-source energy-based learning in C++. In Tools with Artificial Intelligence, 2009. ICTAI'09. 21st International Conference on, pages 693-697. IEEE, 2009.
[35] Tijmen Tieleman. Gnumpy: an easy way to use GPU boards in Python. Department of Computer Science, University of Toronto, 2010.