Real-time and high-speed vibrissae monitoring with
dynamic vision sensors and embedded systems
Neuroscience Instrumentation
Aryan Esfandiari
Master's Thesis, Autumn 2016
UNIVERSITY OF OSLO
DEPARTMENT OF INFORMATICS
MSC. NANOELECTRONICS AND ROBOTICS
Real-time and high-speed vibrissae
monitoring with dynamic vision
sensors and embedded systems
ARYAN ESFANDIARI
supervised by
Professor Philipp Dominik HÄFLIGER
Department of Informatics
Professor Koen Gerard Alois VERVAEKE
Department of Basic Medical Sciences
Associate Professor Ketil RØED
Department of Physics
November 1, 2016
This thesis is proudly dedicated to
The King of my life, my father
The Queen of my life, my mother and
The Princess of my life, my sister.
Abstract
One of the largest challenges facing neuroscientists is instrumentation.
Instrumentation is tightly coupled to, and restricted by, current technology,
and it has a significant impact on research. Today, neuroscientists are often
obliged to move sequentially through multiple time-consuming processes to
obtain the desired analysis after an experiment. In addition to being
inconvenient, this can be an unreliable method for observation and analysis,
and it conflicts with real-time application requirements. The most recent
technologies in this field, such as three-dimensional high-speed videography
units, are typically developed to generate ever larger amounts of
information, which only amplifies these challenges.
This thesis shows that spike-based neuromorphic sensors, such as dynamic
vision sensors, open new possibilities for instrumentation. Under the right
circumstances, they are a strong competitor to conventional technologies,
compensating for their disadvantages and resolving current challenges in
physical experiments, while being more convenient and maintaining the
desired level of accuracy and reliability.
This thesis discusses and evaluates algorithms, presenting multiple
approaches to noise reduction. Moreover, an Artificial Neural Network, as a
classifier, and the Kalman filter, as an estimator, are compared to determine
the best option for a whisker movement monitoring and tracking algorithm.
From this, a modular weighted prediction-correction algorithm is developed
with a notable level of accuracy and the ability to retain its reliability
under noisy and unfavorable experimental conditions.
This thesis discusses several conventional embedded system platforms for
further implementations. The All Programmable System-on-Chip is identified
as the best platform for further research in this field, showing notable
headroom for the more complex dynamic vision sensors and comprehensive
algorithms expected in the future. This thesis contributes the
implementations necessary for embedded system platforms built on
state-of-the-art technologies, such as System-on-Chip components including
field-programmable gate arrays and microcontrollers. These can be used for
further research and development on asynchronous spike-based neuromorphic
sensors.
Contents
Contents iii
List of Tables v
List of Equations vi
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Goal of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Outline of the thesis . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Background 5
2.1 Vibrissae and active sensing . . . . . . . . . . . . . . . . . . . 5
2.1.1 Vibrissae . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.2 Active sensing . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 Whisker movement . . . . . . . . . . . . . . . . . . . . 8
2.2 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 Electrophysiology . . . . . . . . . . . . . . . . . . . . . 10
2.2.2 High-speed digital videography . . . . . . . . . . . . 12
2.3 Dynamic Vision Sensor . . . . . . . . . . . . . . . . . . . . . . 16
2.3.1 Influence of the human eye . . . . . . . . . . . . . . . 16
2.3.2 Event-based neuromorphic dynamic vision sensor . . 16
2.3.3 Address-Event Representation . . . . . . . . . . . . . 19
2.3.4 Asynchronous handshaking . . . . . . . . . . . . . . . 22
2.3.5 The receiver . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.6 Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3 Algorithm 33
3.1 Artificial whisker generator . . . . . . . . . . . . . . . . . . . 33
3.2 Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.1 Static noise . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.2 Dynamic noise . . . . . . . . . . . . . . . . . . . . . . 36
3.2.3 Performance evaluation . . . . . . . . . . . . . . . . 37
3.2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Artificial Neural Network . . . . . . . . . . . . . . . . . . . . 40
3.3.1 Single layer perceptron . . . . . . . . . . . . . . . . . . 41
3.3.2 Multilayer perceptron . . . . . . . . . . . . . . . . . . 43
3.3.3 MLP classification for whisker tracking in practice . . 47
3.3.4 Performance evaluation . . . . . . . . . . . . . . . . . 50
3.3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.4 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.1 Linear whisker tracking algorithm . . . . . . . . . . . 57
3.4.2 Linear whisker tracking algorithm in practice . . . . 61
3.4.3 Non-linear whisker tracking algorithm . . . . . . . . 66
3.4.4 Performance evaluation . . . . . . . . . . . . . . . . . 68
3.4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4 Implementation 75
4.1 Platforms for real-time processing . . . . . . . . . . . . . . . 75
4.2 Embedded Dynamic Vision Sensor . . . . . . . . . . . . . . . 76
4.2.1 ARM Cortex-M4/M0 MCU . . . . . . . . . . . . . . . 77
4.3 Field-programmable gate array . . . . . . . . . . . . . . . . . 83
4.3.1 Xilinx ZYNQ-7000 SoC . . . . . . . . . . . . . . . . . . 83
4.3.2 High-level synthesis . . . . . . . . . . . . . . . . . . . 92
4.3.3 Verification . . . . . . . . . . . . . . . . . . . . . . . . 93
4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5 Physical experiments 97
5.1 Experiments with artificial whisker . . . . . . . . . . . . . . . 97
5.1.1 Computer-aided design . . . . . . . . . . . . . . . . . 100
5.2 Experiments in laboratory . . . . . . . . . . . . . . . . . . . . 101
6 Results and discussions 105
6.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7 Conclusion 111
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Appendices 123
A Equations 125
A.1 Homogeneous transformation matrix . . . . . . . . . . . . . 125
B Matlab 127
B.1 eDVSEventClass . . . . . . . . . . . . . . . . . . . . . . . . . . 127
B.2 EventClass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
B.3 Artificial Whisker Generator . . . . . . . . . . . . . . . . . . . 129
C C/C++ 135
C.1 eDVSEventFIFO . . . . . . . . . . . . . . . . . . . . . . . . . . 135
C.2 Main . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
C.3 Noise reduction . . . . . . . . . . . . . . . . . . . . . . . . . . 137
C.4 Whisker tracking algorithm . . . . . . . . . . . . . . . . . . . 138
C.5 Whisker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
C.6 eDVSEvent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
C.7 ArtificialWhisker . . . . . . . . . . . . . . . . . . . . . . . . . 144
C.8 Arduino . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
D FPGA / VHDL 147
D.1 Vivado Block Design . . . . . . . . . . . . . . . . . . . . . . . 147
D.2 Schematic of eDVSWhiskerMonitoring . . . . . . . . . . . . . 149
D.3 Asynchronous handshaking . . . . . . . . . . . . . . . . . . . 154
D.4 Timestamps generator . . . . . . . . . . . . . . . . . . . . . . 156
D.5 FIFO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
D.6 AER Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
D.7 UVM Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
E CAD 163
F Miscellaneous 165
List of Tables
2.1 Basler acA2000-340kmNIR specifications . . . . . . . . . . . 13
2.2 Comparison of devices with TmpDiff128 silicon chip . . . . 19
2.3 Structure of a single AER-event made by the TmpDiff128
dynamic vision sensor silicon chip . . . . . . . . . . . . . . . 20
2.4 AER-events transaction speed and latency over UART protocol 21
2.5 Timestamp resolution on DVS devices. . . . . . . . . . . . . . 26
2.6 Comparison of conventional and ongoing whisker tracking
techniques. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.7 Comparison of high-speed videography and DVS. . . . . . . 31
3.1 Statistics of the noise reduction algorithms. . . . . . . . . . . 39
3.2 Statistics of the ANN algorithm . . . . . . . . . . . . . . . . . 54
3.3 Statistics of the whisker tracking algorithm . . . . . . . . . . 69
3.4 Performance of the non-linear whisker tracking algorithm . 70
3.5 Overall statistics of whisker tracking algorithms influenced
by ANN and Kalman filter. . . . . . . . . . . . . . . . . . . . 72
6.1 Statistics of noise availability with different illumination
sources and qualities . . . . . . . . . . . . . . . . . . . . . . . 106
6.2 Statistics of the conceived whisker tracking algorithm in a
real experimental environment . . . . . . . . . . . . . . . . . 107
List of Equations
3.1 Angle from AER-event’s Cartesian address coordinates . . . . 35
3.2 Static noise recognition . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Dynamic noise recognition . . . . . . . . . . . . . . . . . . . . . 37
3.4 Mathematical function of a single neuron . . . . . . . . . . . . 41
3.5 Activation or threshold function . . . . . . . . . . . . . . . . . 41
3.6 Updating of new weights . . . . . . . . . . . . . . . . . . . . . 43
3.7 Input vector from x and y addresses . . . . . . . . . . . . . . . 48
3.8 Estimated angle in form of real number . . . . . . . . . . . . . 49
3.9 Magnitude of the error . . . . . . . . . . . . . . . . . . . . . . . 51
3.10 Mean of the error . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.11 Mean of the error over amount of measurements . . . . . . . . 51
3.12 Mean of the error over amount of iterations . . . . . . . . . . . 52
3.13 Prediction-model . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.14 Scaled prediction-model . . . . . . . . . . . . . . . . . . . . . . 58
3.15 Angle measurement . . . . . . . . . . . . . . . . . . . . . . . . 58
3.16 Gain factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.17 Scaled gain factor . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.18 Estimate of the state . . . . . . . . . . . . . . . . . . . . . . . . 59
3.19 DVS to the algorithm transformation matrix . . . . . . . . . . 62
3.20 Algorithm to the DVS transformation matrix . . . . . . . . . . 62
3.21 Scalar measurement of non-linear whisker . . . . . . . . . . . 66
3.22 Vector measurement of non-linear whisker . . . . . . . . . . . 67
3.23 Averaging measurement of non-linear whisker . . . . . . . . . 67
Acknowledgement
Firstly, I would like to express my sincere gratitude to my main supervisor
Prof. Philipp Dominik Häfliger of the Department of Informatics at
University of Oslo. The door to Professor Häfliger’s office was always open
whenever I ran into a spot of trouble or had a question about my research.
He has been very supportive since day one and this accomplishment would
not have been possible without him.
I would also like to express my sincere gratitude to my secondary
supervisors Prof. Koen Gerard Alois Vervaeke of the Department of Basic
Medical Sciences and leader of the Laboratory for Neural Computation at
University of Oslo and Associate Prof. Ketil Røed of the Department of
Physics at University of Oslo for their endless support.
Alongside my supervisors, I would also like to thank Nanoelectronics,
Robotics and Intelligent Systems research groups led by Prof. Jim Tørresen
and Centre for Integrative Neuroplasticity led by Prof. Marianne Hafting
Fyhn. I also appreciate the participation of Associate Prof. Kyrre Glette
at University of Oslo and Prof. Tobi Delbrück at Inilabs for sharing their
pearls of wisdom with me during the course of this thesis.
Last, but by no means least, I would like to thank my fellow peers at the
University of Oslo and the University of California, Berkeley, who always
provided me with their assistance throughout my degree.
Aryan Esfandiari, November 2016
Chapter 1
Introduction
1.1 Motivation
Researchers have shown that animal behaviors often carry useful information
about the animals' states and properties, although a conclusive
understanding of these actions and behaviors has not yet been reached.
Actions carry signals for expression and communication with other animals,
such as fear, aggression, attention, and sexual attraction, to name a few
[60]. These signals can be executed by different sense organs, in particular
the vibrissae of some types of Mammalia [12].
Although human beings do not use facial hair as animals do, this form of
communication can be understood through human facial expressions in displays
of thoughtfulness, confusion, and happiness, among others; features which
can be extracted and distinguished by one of the human's most powerful
sensory organs [53]. Communication through vibrissae eases our understanding
of animals' intentions [60], though this remains an abstract and ambiguous
topic for researchers. The topic has attracted such interest that
researchers have begun to investigate artificial robotic whiskers [54].
To obtain this knowledge, neuroscientists have observed vibrissae, examining
possible indications of the relationship between vibrissae movements and
rodents' behaviors [57].
One of the largest challenges facing researchers is instrumentation, the use
of which is tied to and restricted by current technology. For whisker
movement monitoring, neuroscientists can either use traditional
electrophysiological methods, with unfavorable results, or the more
conventional high-speed videography to achieve the desired outcome. The
latter method is not without disadvantages either, including the processing
of large amounts of data, the requirement of additional hardware and, last
but not least, significant latency before the desired results are available,
in conflict with the terms and definitions of real-time processing.
With current solutions, neuroscientists are obliged to move sequentially
through multiple time-consuming processes to achieve the desired analysis
after experimentation. This can be an inconvenient, and in some cases
unreliable, method for observing and analyzing the results, significantly
impacting research.
1.2 Goal of the thesis
The primary goal of the current study is to open a new avenue of
instrumentation for neuroscience, one that provides capabilities under
real-time system requirements that conventional equipment cannot. New
techniques are discussed and suggested, including numerous algorithms and
implementations which contribute to the observation, monitoring, and
analysis of vibrissae movements. This is an attempt to provide researchers
with more convenient and less challenging experimental techniques, so that
the study of the relationship between vibrissae movements and rodent
behavior becomes more feasible and reliable.
Several approaches and achievements are discussed in this thesis, the most
important of which are listed as follows:
• The presentation of a novel technique for real-time and high-speed
vibrissae tracking and monitoring with dynamic vision sensors. This
thesis provides the necessary background for the use of this sensor in
desired projects.
• A discussion of algorithms for vibrissae movement monitoring through
several notable and conventional approaches, including an Artificial
Neural Network, as a classifier, and the Kalman filter, as an estimator.
• The provision of the necessary technical knowledge and implementation
for the use of a dynamic vision sensor, especially for embedded systems,
including All Programmable System-on-Chip, field-programmable gate
arrays, and microcontrollers, for further research concerning the
application and development of state-of-the-art technologies.
• Emphasis on the advantages of several conventional techniques, including
digital signal processors, high-level synthesis, hardware description
languages, and universal verification methodology, among others.
1.3 Outline of the thesis
Chapter 2: This chapter provides the background knowledge for the current
thesis. Neuroscientists' ambitions in investigating rodents' vibrissae
movements are described, followed by the challenges of ongoing
instrumentation, including electrophysiology and high-speed cameras, before
a new technique, the event-based neuromorphic dynamic vision sensor, is
presented.
Chapter 3: This chapter presents the whisker tracking algorithm developed in
the current thesis. The artificial whisker generator is described first,
followed by the noise issues that must be controlled and eliminated. A
selection of tracking algorithms is evaluated before the algorithm
specifically chosen for the current study is presented.
Chapter 4: This chapter presents the optimal embedded system platforms for
the current thesis. A selection of techniques is discussed to take advantage
of state-of-the-art technologies. The general requirements of real-time
applications and their relation to neuroscientific instrumentation are
described, before the two implementations judged most suitable under the
current study's conditions, and for future research and development, are
presented.
Chapter 5: This chapter illustrates the physical experiments conducted
during the current thesis. The artificial whisker built for evaluation is
described before the laboratory experiments are illustrated.
Chapter 6: This chapter provides the results of the conceived system. First,
statistics of the system are presented for evaluation, then the achievements
of the system are illustrated. Finally, the results are evaluated against
the current study's requirements and the neuroscientists' conditions.
Chapter 7: This chapter presents the conclusions of the conceived system.
The research questions are discussed and answered under the study's
conditions and constraints before further work is described.
Figure 1.1: Structure and outline of this thesis.
Chapter 2
Background
Chapter abstract: This chapter provides a brief review of the literature and
the prior knowledge for the current thesis. First, the neuroscientists'
ambitions in investigating rodents' vibrissae movement are described. This
is followed by the challenges of ongoing instrumentation. Finally, a new
technique, specifically chosen as the most suitable under the current
neuroscientific and real-time processing conditions and requirements, is
presented.
2.1 Vibrissae and active sensing
2.1.1 Vibrissae
Many small mammals, including laboratory rats and mice, possess, in
addition to a visual system, a complementary and well-characterized
sensory system driven by the tactile stimulation of prominent arrays of
sensitive vibrissae (hereafter referred to as whiskers), particularly those
located around the snout [75] as illustrated in figure 2.1.
Rodents' somatosensory systems are able to solve complex perceptual tasks,
such as determining the position, orientation, size, and shape of an object,
whether it is moving or not, its speed, direction, and texture, and whether
the object is living or nonliving [40].
Whiskers can be categorized into large whiskers (hereafter referred to as
macrovibrissae) and small whiskers (hereafter referred to as
microvibrissae). The length of a macrovibrissa can be up to approximately
50 mm, with a diameter of less than 1 mm at the base, narrowing towards the
tip [76]. The musculature of the mystacial pad, or so-called whisker-pad,
enables the control of whisker movement at incredible speed, up to
3000 deg/s at the whisker's tip [57].
Whiskers are very sensitive, comparable to small regions of the human body
such as the fingertips and lips, with the exception that many other mammals
are more dependent on their somatosensory systems than humans, and therefore
protect organs such as the whiskers from surfaces and objects that may cause
damage [86].
(a) Side view of mystacial whisker-pad fields.
(b) Schematic frontal view of the mystacial microvibrissae.
(c) Schematic frontal view of the mystacial macrovibrissae.
Figure 2.1: Functional architecture of the mystacial vibrissae [12].
2.1.2 Active sensing
Some types of Mammalia, including rodents like rats and mice, are able
to acquire information about their surrounding environment through a
collaboration between their head and highly sensitive whiskers.
Active sensing describes how a rat continuously explores and interprets an
unfamiliar environment or object, making the whiskers a non-passive sensor.
Active sensing is achieved by whisking, the sweeping of the whiskers back
and forth against an object to generate tactile sensory information through
contact with an environmental structure. In most cases, the process occurs
at frequencies between 5 and 12 Hz, as shown in figure 2.2 [39].
Figure 2.2: A whisker’s vibration at 25 Hz, recorded by a high-speed
camera at 30 fps. Note that the tip of the whisker (black arrow) has a higher
frequency than the base of the whisker (white arrow) [40].
Active sensing and exploration enable a rat to collect three key parameters
from its whiskers: which neurons fire (spatial), the timing of the firing
(temporal), and the intensity of the firing (temporal) [1]. These parameters
are used at later stages to encode important information about the contacted
object, such as its localization and identification. The potential
information encoding is illustrated in figure 2.3, which shows how several
parameters can be translated into useful information.
Figure 2.3: Potential information encoding with regard to the parameters
collected from whiskers [1].
With the unique and powerful advantage of its somatosensory system, a rat is
able to collect three-dimensional information in a wider variety of
environments, including those that are dark or noisy; something other
sensory organs, such as vision or hearing, can only do under specific
circumstances.
Several studies have shown that rats use both macrovibrissae and
microvibrissae for exploration. However, active movement is only performed
directly by the macrovibrissae, and it is with the aid of these that a rat
will encounter an object of interest within one or two whisk cycles,
orienting its head such that the microvibrissae field can contact the object
or area [86]. In other words, active sensing describes exploration as a
series of sequential, consecutive movements rather than simultaneous ones
[39]. Whisker movements are largely constrained to the horizontal axis, and
movements along the vertical axis are small. These vertical movements are
caused by the activity of extrinsic muscles that act to move the entire
whisker-pad together, as illustrated in figure 2.4 [1].
Figure 2.4: Comparison of whisker angles during a period of 5 seconds of
active sensing. Antero-posterior (AP) corresponds to the right whisker and
dorso-ventral (DV) corresponds to the left whisker [10].
2.1.3 Whisker movement
Whisker movement is an important topic for investigation as the whiskers
are highly sensitive and even small changes can lead to large changes
in signals in the sensory system [76]. This is significantly related to the
rodent’s state and understanding of objects and environmental properties,
such as object localization during active sensing [58].
Figure 2.5: Whisker movement observation and angle comparison recorded by a
high-speed camera during active sensing. Each collision between the whisker
and the object is marked by a black dot (0 ms, 120 ms, 230 ms) [75].
One of the most important factors of whisker movement, and one that is
highly related to the shape of the whisker, is the whisker's angle, as shown
in figure 2.6. This information corresponds to the force applied to the
whisker, as shown in figure 2.5. Fortunately, this is not only available to
the rat's internal sensory system, but is also visible to external vision
sensors such as human eyes and high-speed cameras. A whisker's angle
contains essential information which can help researchers and scientists
explore the relationship between a rat's behavior and the movement of its
whiskers [95].
(a) Detection and reconstruction of multiple whiskers.
(b) Whisker movement monitoring with corresponding whisker angles.
Figure 2.6: A typical monitoring of whisker movements for research and
investigation with conventional instruments [1].
Several studies have investigated the physical constraints of whisker
movement, including the study shown in figure 2.12, which clarifies two main
points. The first is that the base of the whisker has a smaller range of
movement than the tip. The second is that a macrovibrissa will move within
an arc of about 100° if force is applied naturally from the whisker-pad and
not by external factors such as objects [56]. In other words, the tip of an
average macrovibrissa, which has the largest range of movement of any part
of the whisker, will, in most active sensing scenarios and without any
external influence, move from -50° to +50°, as illustrated in figure 2.7.
Figure 2.7: Range of whisker movement under natural force from the
whisker-pad, where θ corresponds to 50° [10].
2.2 Related work
The study first examines two widely used electrophysiology methods in
neuroscience experiments on rodents, before discussing high-speed
videography, the state-of-the-art technique for monitoring whisker
movements.
2.2.1 Electrophysiology
One of the most conventional experimental methods on animals in neuroscience
is electrophysiology, typically electrophysiological monitoring, or
so-called electrography. This includes measurements of the electrical
activity of neurons, particularly action potential activity, by measuring
voltage changes or electric currents in neurons and organs.
Electrophysiology measurements are performed using electrodes, and the
technique is categorized either as invasive, involving the implantation of
electrodes into the body through minor surgery, or non-invasive,
corresponding to placing electrodes on the body, such as on the skin,
without any surgery.
Electroencephalography
Electroencephalography (EEG) is widely used in experiments on animals due to
its reliable monitoring of the electrophysiological activity of a small
group of neurons, or even a single neuron, with high temporal and spatial
resolution [66]. Electroencephalography is generally a non-invasive method
with electrodes attached along the scalp; however, in some circumstances,
invasive electrodes requiring surgery, such as a Utah multi-electrode array,
are needed. Even though EEG is commonly used for the investigation of rodent
behavior, few researchers have explored the relationship between EEG and
whisker movement due to its complexity.
Electromyography
Electromyography (EMG) detects the electrical potential generated by muscle
cells when these cells are electrically or neurologically activated, as
shown in figure 2.8. This method has been used to analyze the muscle
movements of whisker-pads with either surface or intramuscular needle
electrodes [75].
(a) Surface electrodes, which can classify whether a hand's grip tightens [8].
(b) Intramuscular needle electrodes for monitoring of tiny muscles [85].
(c) Comparison between signals on surface and intramuscular needle
electrodes [64].
Figure 2.8: Monitoring of muscle activity and classification with
electromyography (EMG) electrodes.
Researchers have not found EMG sufficient to reconstruct the shape of
whiskers, as several whisker parameters, such as weight, are dynamic,
meaning whisker movement can vary even when the force applied to a whisker
is constant [18]. The method has two additional significant disadvantages.
First, experiments often include multiple whiskers, whose muscles in small
areas affect each other and produce a complex combination of muscle
movements. Second, it is not possible to determine a whisker's position in
space, nor its elastic behavior, from the information available in an EMG
[57].
Figure 2.9: Comparison of whisker movement recorded by EMG (gray trace
θ∗) and high-speed digital camera (black trace θ) [76].
2.2.2 High-speed digital videography
Classical videography techniques generate major disadvantages for
whisker movement monitoring, typically arising from whiskers’ proper-
ties such as its thinness and high speed movement in complex patterns,
generating several crucial challenges for videography and excluding it as
an instrumentation option. Fortunately, recent improvements in the spatial
resolution of high-speed digital cameras have made videography sufficient
for the reconstructing the shapes of a whisker for further quantitative anal-
ysis of behaviors. High-speed digital videography is currently widely used
as a non-invasive technique in most whisker movement experiments ow-
ing to its sufficient accuracy and precision, figure 2.10.
Figure 2.10: Whisker movement observations and angle comparisons recorded by
a high-speed camera during an active sensing activity. Each collision
between the whisker and the object is marked by a black dot (0 ms, 120 ms,
230 ms) [18].
While a comparison of EMG results and high-speed camera recordings, as shown
in figure 2.9, may reveal similarities, it is important to note that EMG
requires specific adjustments for the properties and features of the
selected whiskers, such as a whisker's weight, to maintain desirable
results, while high-speed cameras require only abstract information and can
be applied to most types of whiskers, including microvibrissae.
High-speed cameras are characterized by a high frame rate at sufficient
resolution and a high-speed interface for transmitting data to a receiver.
Figure 2.11 and the specifications in table 2.1 describe a high-speed camera
from Basler which has been used in recent investigations at the Vervaeke
Laboratory for Neural Computation, with 340 fps at high resolution
(2048 x 1088), and up to a thousand frames per second if a lower resolution
is selected; a common capability of modern high-speed cameras [7].
Sensor           CMOSIS CMV2000 NIR-enhanced
Resolution       2048 x 1088 (2 MP)
Frame rate       340 fps
Mono/Color       Mono
Interface        Camera Link
Synchronization  External trigger, software and free-run
Lens mount       C-mount
Table 2.1: Basler acA2000-340kmNIR specifications
(a) Front view with C-mount connection.
(b) Back view with Camera Link interface.
Figure 2.11: High-speed camera from Basler used in recent projects at the
Laboratory for Neural Computation.
Videography methods have recently advanced to the next level with
three-dimensional (3D) technology that provides data from a 3D perspective
for further observation and analysis. Figure 2.12 shows a whisker's
movement, marked by three colored circles, throughout an episode of
continuous whisking. Each circle corresponds to a specific position on the
whisker during active sensing, where the blue circle corresponds
approximately to the base of the whisker and the red circle corresponds
approximately to the tip.
Figure 2.12: Whisker movement represented by three colored circles in a
space measured with 3D technology during active sensing [56].
A number of experiments with high-speed cameras have been examined, almost
all of which follow a common algorithm and structure after recording whisker
movement, shown in figure 2.13 and listed as follows, with a minimal code
sketch after the list:
1. For each iteration, process a single frame. Note there may be hundreds
or even thousands of high-resolution frames per second.
2. Extract the whisker from the background. It is beneficial to ease this
process at an earlier stage, for example by creating a significant
contrast between the environment and the whisker with a proper
background and lighting.
3. Distinguish between the whisker and the background, often achieved with
a combination of image processing approaches, such as edge detection and
the Hough transform. Most of these mathematical methods are available in
common image processing libraries; however, they require significant
high-performance resources.
4. Determine the whisker's angle with mathematical functions. This can
range from the angle of a straight line to more complex cases, such as
the angle of a curved whisker, requiring different mathematical
approaches such as piecewise polynomial functions, or so-called splines.
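The listing below is a minimal sketch of steps 2 to 4 in C++ with OpenCV. It
is illustrative only, not the pipeline of the cited studies; the file name,
Canny thresholds, and Hough threshold are assumptions of this sketch.

#include <opencv2/opencv.hpp>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    // Step 1: a real pipeline processes hundreds of frames per second;
    // here a single grayscale frame is loaded from disk.
    cv::Mat frame = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);
    if (frame.empty()) return 1;
    // Steps 2-3: separate the thin whisker from the (high-contrast)
    // background with edge detection, then find straight-line candidates.
    cv::Mat edges;
    cv::Canny(frame, edges, 50, 150);
    std::vector<cv::Vec2f> lines; // (rho, theta) pairs
    cv::HoughLines(edges, lines, 1, CV_PI / 180, 80);
    // Step 4: take the strongest line's angle as the whisker angle; a
    // curved whisker would instead require a spline or polynomial fit.
    if (!lines.empty())
        std::printf("whisker angle: %.1f deg\n", lines[0][1] * 180.0 / CV_PI);
    return 0;
}

Even this reduced sketch hints at the cost of the approach: every frame must
be fully traversed by the edge detector and the Hough transform, which is
the latency problem discussed below.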
(a) Recording of whisker movements.
(b) Distinguishing and reconstruction of different whiskers.
(c) Determining the whisker angle.
Figure 2.13: The sequence of multiple time-consuming processes needed to
achieve the desired analysis after an experiment [75].
However, traditional videography has significant disadvantages and may at
times be inefficient. The major disadvantages of videography as whisker
movement tracking instrumentation are as follows:
• Large amounts of data in the form of multidimensional arrays of
information. To maintain the desired accuracy and precision, it is
necessary to increase the frame rate and resolution, and the amount of
data grows proportionally. Typically, a few seconds of raw, uncompressed
bitmaps occupy hundreds of megabytes.
• The image processing algorithms used to extract useful information can
be complex and time-consuming, with high latency. In most cases, the
algorithm is executed after the recording process, so observations
cannot take place in real-time.
• A large fraction of the transmitted data contains useless and duplicated
information about the environment, such as the background or the
whiskers' surroundings. Processing it is unnecessary, yet it occupies
resources and causes greater latency.
• Most image processing algorithms require complex hardware with a number
of high-performance resources available.
2.3 Dynamic Vision Sensor
To understand the dynamic vision sensor's foundation, it is necessary to
examine how it is influenced by the human eye before turning to the sensor's
definition and technical specifications.
2.3.1 Influence of the human eye
Visual parts of the brain do not respond directly to light waves, but rather
to neural signals. Therefore, a transformation mechanism that converts light
waves into signals the brain can understand and recognize is necessary. For
this reason, the retina is of great importance [44].
The retina plays an essential role in the transformation of external
physical signals, such as light waves, into neural signals understandable to
the brain. Light waves pass through three different layers of cells in the
retina, as illustrated in figure 2.14 (a): ganglion cells, bipolar cells,
and receptor cells. The receptor layer contains two types of
photoreceptors, rods and cones, which help us to discriminate and classify
light waves, as illustrated in figure 2.14 (b).
Cones are responsible for daylight and detailed vision, as well as providing
the eye's sensitivity to color. Rods are responsible for dark-adapted vision
and work in dim light. Rods are more sensitive to light than cones, and
therefore respond and fire at lower light levels, acting as a kind of
sensitive motion sensor.
(a) The layers and structures of the eye.
(b) Light flow through ganglion, bipolar, and receptor cells.
Figure 2.14: Simple anatomy of the retina [43].
2.3.2 Event-based neuromorphic dynamic vision sensor
The asynchronous dynamic vision sensor (DVS) is a spike-based and event-
based neuromorphic sensor inspired by neuro-biological architecture and
the human retina. The foundation of the DVS is photoreceptors in the
human retina, particularly rods, as described in section 2.3.1.
(a) Stand-alone DVS with C-mount connection.
(b) TmpDiff128 dynamic vision sensor chip.
Figure 2.15: DVS provided by Inilabs [46].
A DVS registers events generated by relative intensity change caused by
illumination, or more precisely, by the quantized change of the log
intensity of a particular pixel between the current and the previous event,
as illustrated in figure 2.16 (b).
In other words, a DVS samples the intensity of the current state at a
specific pixel and compares it with the previous sample of the same pixel.
If the difference between these samples exceeds a threshold, the DVS
registers a new event and reports the location, or so-called address, of
the sampled pixel. This calculation occurs for all pixels, also referred to
as elements or cells. Simultaneously, the DVS reports whether the luminosity
change is ascending or descending, referred to as the polarity.
(a) Schematic of a single DVS element.
(b) Relative positive change of luminosity.
Figure 2.16: Schematic of event-based neuromorphic sensors and relative
intensity change caused by illumination [46].
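As a conceptual illustration only, the per-pixel behavior can be sketched in
software; the real element is an analog circuit, described next, and the
2.1% threshold below anticipates the comparator threshold given after this
sketch.

#include <cmath>

struct PixelState {
    double lastLogI; // log intensity at the time of the previous event
};

// Conceptual software model of a single DVS pixel. Returns +1 (ascending
// polarity), -1 (descending) or 0 (no event).
int updatePixel(PixelState& p, double intensity) {
    const double threshold = std::log(1.021);          // ~2.1% relative change
    double diff = std::log(intensity) - p.lastLogI;
    if (std::fabs(diff) < threshold) return 0;         // below threshold
    p.lastLogI += (diff > 0 ? threshold : -threshold); // quantized step
    return diff > 0 ? +1 : -1;                         // event polarity
}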
The DVS sensor in this project is a 0.35 µm TmpDiff128 silicon chip provided
by Inilabs. It is a pixel array sensor with 128 x 128 pixels, corresponding
to 16,384 elements, as shown in figure 2.15 (b). The structure and schematic
of a single element are illustrated in figure 2.16 (a) and can be described
as follows [62]:
1. A photoreceptor that responds logarithmically to the intensity.
2. An amplifier that removes DC mismatch and amplifies only the
change in intensity.
3. A comparator that fires according to the relative luminosity change,
with a threshold of 2.1%.
The TmpDiff128 silicon chip from Inilabs comes in different forms, which can
be selected according to the application's specifications and demands. A
comparison of the DVS devices provided by Inilabs follows:
(a) DVS128 BASIC (b) DVS128_PAER
(c) eDVS (d) eDVS MINI
Figure 2.17: Variety of different dynamic vision sensors provided by Inilabs
[46].
DVS128 BASIC
  Description: complete stand-alone package, ready to use
  Interface: USB 2.0
  Transaction speed: up to 1M events per second (EPS)
  Timestamp bit-width: 32-bit
  Receiver: CPLD
  Lens mount: C-mount

DVS128_PAER
  Description: only the DVS128 and its peripherals
  Interface: 16-bit parallel AER, for external transactions only
  Transaction speed: determined by the receiver
  Timestamp bit-width: —
  Receiver: —
  Lens mount: C-mount
  Direct AER interface: Rome and CAVIAR

eDVS
  Description: DVS128 with microcontroller
  Interface: UART, USB 2.0 FTDI and SPI
  Transaction speed: up to 600K EPS without timestamps, or 200K EPS with
  32-bit timestamps
  Timestamp bit-width: none, 8, 16, 24 or 32 bits (user-defined)
  Receiver: MCU (NXP LPC4337)
  Lens mount: S-mount
  General-purpose IO: available

eDVS MINI
  Description: DVS128 with a compact microcontroller
  Interface: UART TTL and USB 2.0 FTDI
  Transaction speed: up to 1320K EPS without timestamps, or 450K EPS with
  32-bit timestamps
  Timestamp bit-width: none, 8, 16, 24 or 32 bits (user-defined)
  Receiver: MCU (STM32F74xx)
  Lens mount: —

Table 2.2: Comparison of devices with the TmpDiff128 silicon chip
2.3.3 Address-Event Representation
Address-event representation (AER) is a communication protocol for
transferring spikes between bio-inspired chips. The main purpose of AER is
to provide a standardized protocol for transmitting the states of an array
of cells, such as a DVS, where each cell has a state varying continuously in
time [11]. Since the DVS is a two-dimensional array, in practice it requires
two dedicated circuits for the x and y directions, where each circuit
samples and encodes the address of the firing pixel, or so-called element,
into the corresponding event, as illustrated in figure 2.18 (a) [79].
The data width of AER varies according to the selected technology, but
overall the AER protocol fully optimizes the represented data, to minimize
latency and to avoid transferring unnecessary information such as headers or
duplicated data.
(a) DVS with a two-dimensional array and a dedicated sampling and encoding
entity [79].
(b) Parallel to serial communication, and vice versa, with multiplexers [17].
Figure 2.18: TmpDiff128 silicon chip and its surrounding entities.
The structure of a single 16-bit AER-event produced by the TmpDiff128
silicon chip is as follows:

Range          Description
AERData[15]    —
AERData[14:8]  Cartesian position of the AER-event with respect to y
AERData[7:1]   Cartesian position of the AER-event with respect to x
AERData[0]     Polarity of the AER-event: 1 if the luminosity change is
               ascending, 0 if it is descending

Table 2.3: Structure of a single AER-event made by the TmpDiff128 dynamic
vision sensor silicon chip
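A sketch of unpacking these fields in C++ is given below; the struct and
function names are this sketch's own, not from an Inilabs library.

#include <cstdint>

struct AerEvent {
    uint8_t x;        // AERData[7:1], Cartesian x address (0..127)
    uint8_t y;        // AERData[14:8], Cartesian y address (0..127)
    bool    polarity; // AERData[0]: 1 ascending, 0 descending luminosity
};

AerEvent decode(uint16_t raw) {
    AerEvent e;
    e.y        = (raw >> 8) & 0x7F; // mask out the 7 y-address bits
    e.x        = (raw >> 1) & 0x7F; // mask out the 7 x-address bits
    e.polarity = (raw & 0x1) != 0;  // lowest bit carries the polarity
    return e;
}

With the address unpacked this way, the Cartesian coordinates can be fed
directly into, for example, the angle computation of equation 3.1.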
It is also notable that with serial communication, such as the universal
asynchronous receiver/transmitter (UART) widely used on DVS devices from
Inilabs, a single 16-bit AER-event must be encapsulated at byte level, which
results in the transfer of 2 bytes. Moreover, UART is asynchronous,
including a start and a stop bit in its handshaking, which must be sent with
every byte. In other words, in the simplest form of UART (without parity and
with a single stop bit), the AER-event ends up occupying 20 bits without a
timestamp, as illustrated in figure 2.19, or 60 bits with a 32-bit
timestamp.
eDVS:      up to 600K EPS without timestamps (≈1500 ns per event), or
           200K EPS with 32-bit timestamps (≈5000 ns per event)
eDVS MINI: up to 1320K EPS without timestamps (≈750 ns per event), or
           450K EPS with 32-bit timestamps (≈2000 ns per event)

Table 2.4: AER-event transaction speed and latency over the UART protocol
Figure 2.19: UART sequence of 20 bits representing an AER-event without a
timestamp on the eDVS, with the default AER protocol settings.
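As an illustration of this encapsulation, the sketch below splits one 16-bit
AER-event into the two payload bytes carried by the UART frames of figure
2.19; the high-byte-first order is an assumption of this sketch.

#include <cstdint>

// Each byte additionally costs a start and a stop bit on the wire, giving
// the 20 bits of figure 2.19 for one event without a timestamp.
void toUartBytes(uint16_t event, uint8_t out[2]) {
    out[0] = static_cast<uint8_t>(event >> 8);   // upper byte: y address
    out[1] = static_cast<uint8_t>(event & 0xFF); // lower byte: x and polarity
}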
Address-Event raw file
Address Event DATA (AEDAT) files contain the information needed for an
efficient representation, including the AER-event and its timestamp, if any.
Some additional information is encapsulated in a header, including the
sensor parameters and the address and timestamp bit-widths, to provide
important parameters about the recorded activity.
The structure of AEDAT 2.0 is as follows:
1. The header begins with # (0x23) and includes preferences such as the
AEDAT version, the bit-widths of the address and timestamp, the
timestamp resolution, and the sensor's parameters. The header is
terminated with a carriage return (0x0D) followed by a newline (0x0A)
and null-null (0x00 0x00).
2. The start of a new AER-event is indicated with null-null (0x00 0x00).
3. A specific number of bytes represents an AER-event. This data is divided
into address and timestamp according to the bit-widths described in the
header.
4. Repeat from step two while events remain.
To optimize AEDAT for greater efficiency, it is created as a binary file
rather than ASCII, as all elements except the header are numbers and can
easily be represented in binary. A complication of this format is that it
cannot be read directly by humans; however, it reduces the file size for
faster extraction and avoids unnecessary conversions from ASCII to numbers
in further calculations.
#!AER-DAT2.0
# This is a raw AE data file - do not edit
# Data format is int32 address, int32 timestamp
# (8 bytes total), repeated for each event
# Timestamps tick is 1 us
# created Tue Nov 01 00:00:00 CET 2016
Code 2.1: A part of an AEDAT header
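A simplified reader for this layout might look as follows: it skips the '#'
header lines and then reads the 8-byte (int32 address, int32 timestamp)
records of code 2.1, ignoring the null markers described above. The
big-endian byte order and the file name are assumptions of this sketch.

#include <cstdint>
#include <cstdio>

int main() {
    std::FILE* f = std::fopen("recording.aedat", "rb");
    if (!f) return 1;
    int c;
    while ((c = std::fgetc(f)) == '#')                    // '#' header lines
        while ((c = std::fgetc(f)) != '\n' && c != EOF) {}
    if (c != EOF) std::ungetc(c, f);                      // start of events
    uint8_t b[8];
    while (std::fread(b, 1, 8, f) == 8) {                 // one event record
        uint32_t addr = (uint32_t(b[0]) << 24) | (uint32_t(b[1]) << 16)
                      | (uint32_t(b[2]) << 8)  |  uint32_t(b[3]);
        uint32_t ts   = (uint32_t(b[4]) << 24) | (uint32_t(b[5]) << 16)
                      | (uint32_t(b[6]) << 8)  |  uint32_t(b[7]);
        std::printf("address=0x%08x timestamp=%u us\n",
                    (unsigned)addr, (unsigned)ts);
    }
    std::fclose(f);
    return 0;
}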
2.3.4 Asynchronous handshaking
As discussed, the type of communication of the TmpDiff128 silicon chip
in any form of DVS is parallel asynchronous. At first glance, this may
appear unfavorable due its handshaking requirement which can generate
more logic and delay than synchronous communication, however; this is a
major advantage of the DVS.
The asynchronous DVS ensures that only new AER-events are sent, so the
receiver does not need to acquire and process data continuously unless a new
AER-event arrives. This makes the DVS an event-based sensor and
significantly minimizes latency, processing, and the occupation of
resources.
Several issues can occur in asynchronous communication: the sender and the
receiver lack a common clock domain for synchronization, data arrives
stochastically, and a sender with a higher frequency may send data
continuously to a receiver that is not ready for a new transaction, due to
its lower frequency and computational latency.
Most asynchronous communication schemes are based on some sort of
protocol involving a request used to initiate an action and a corresponding
acknowledgment, indicating a response to the request.
Four-phase handshaking
Four-phase handshaking is a widely used handshaking protocol as it has
the advantage of a feedback signal. This is particularly useful when a
sender and receiver are in different clock domains and the sender must
wait for the availability of the receiver before the next transaction due to
latency in processing.
Four-phase handshaking is used in the DVS to establish AER transactions. The
protocol adds two signals, referred to as request and acknowledge, to the
16-bit AER-event, as illustrated in figure 2.20. In other words, 18
dedicated signals are the minimum requirement for communicating with an
asynchronous event-based DVS: 2 for the handshake and 16 for the AER-event.
Figure 2.20: Asynchronous handshaking between the talker and the listener,
which in this thesis are the DVS and the receiver [16].
The structure of the four-phase handshake protocol is as follows [32]:
1. The talker activates the request signal, figure 2.21 (1).
2. When the listener detects a request activity, it activates its acknowl-
edge signal, (2).
3. When the talker detects the acknowledge activity, it deactivates its
request signal, (3).
4. When the listener detects the request deactivation, it deactivates its
acknowledge signal, (4).
5. When the talker detects the acknowledge deactivation, it returns to
the initial state and is ready for the next transaction, (5).
Figure 2.21: Four-phase handshaking protocol with the period of data
validity in a push operation [16].
As illustrated in figure 2.21, the data from the sender must be valid and
free of metastability for a certain period. The availability of the data can
be controlled with tristate buffers, driven either by the sender (a
so-called push operation) or by the receiver (a so-called pull operation).
The DVS uses a push operation, so the TmpDiff128 silicon chip guarantees
that valid data is available while its request signal is high, and the
receiver can begin data acquisition as long as the request signal remains
high.
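In software terms, the listener side of steps (1) to (5) can be sketched as
below, where readRequest(), setAcknowledge(), and readData() are
hypothetical accessors for the REQ, ACK, and data lines; the actual
receivers in this thesis implement this in hardware (see appendix D.3).

#include <cstdint>

bool     readRequest();          // hypothetical: level of the REQ line
void     setAcknowledge(bool v); // hypothetical: drive the ACK line
uint16_t readData();             // hypothetical: sample the 16 data lines

uint16_t receiveEvent() {
    while (!readRequest()) {}    // (1) talker raises request
    uint16_t event = readData(); //     data is valid while REQ is high (push)
    setAcknowledge(true);        // (2) listener acknowledges
    while (readRequest()) {}     // (3) talker deactivates request
    setAcknowledge(false);       // (4) listener deactivates acknowledge
    return event;                // (5) ready for the next transaction
}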
2.3.5 The receiver
As described, the TmpDiff128 silicon chip generates 16-bit AER-events and
requires signals for handshake establishment. These signals are transferred
through asynchronous parallel communication for further acquisition.
However, several processes must be completed by the receiver before the data
is available for further processing in its final application.
In most cases, the receiver executes the handshake for the asynchronous
communication and only forwards the acquired AER-event to the final
processing unit over another form of communication, such as USB, though in
some circumstances the AER data can be processed natively. The latter method
is not suitable for stand-alone applications, as it requires native
programming of the microcontroller at a lower abstraction level; on the
other hand, it minimizes latency significantly, as all processing occurs
internally without unnecessary communication and routing to external units.
Receivers are typically built around an application-specific integrated
circuit (ASIC), a complex programmable logic device (CPLD), a
field-programmable gate array (FPGA), as shown in figure 2.22, or a
microcontroller (MCU). According to table 2.2, the DVS128 BASIC contains an
on-board CPLD, while the eDVS and eDVS MINI have on-board MCUs. The
DVS128_PAER has no receiver; one must be selected by the user according to
demand, through direct AER interfaces such as CAVIAR [38].
Figure 2.22: A general structure of an AER receiver, optimized for
neuromorphic systems, which can be implemented and reconfigured based on
VLSI and ASIC/FPGA [17].
As noted, the receiver can also bypass the acquired AER-event and send it to
the next unit for further processing in the final application. The
communication can be selected with regard to the available peripherals, as
well as the requirements and demands. The acquired AER-event can be bypassed
either serially or in parallel. The two communication schemes are described
as follows:
• Parallel communication dedicates 16 bits of data in addition to the
handshaking signals and is the most suitable and beneficial communication
method due to its parallel execution, which causes less latency. Parallel
communication requires a number of dedicated signals, which may not be
feasible in some circumstances or on some platforms. This type of
communication is mostly suitable for embedded systems and is available on
the DVS128_PAER and the eDVS (before the on-board microcontroller).
• Serial communication is the most common type of communication, in the
form of USB or UART, available on most receivers and processing units.
With this type of communication, the receiver acquires the AER-event in
parallel but bypasses the data sequentially over a serial data bus, with
the help of arbiters for encoding and decoding, as illustrated in figure
2.18 (b).
Generally, the primary tasks and responsibilities of the receiver are as
follows (a sketch of such a receiver loop is given after the list):
1. Perform the handshaking to establish asynchronous communication with the
TmpDiff128 silicon chip (described in greater detail in section 2.3.4).
2. Sample the new AER-event, typically 16 bits.
3. Validate the acquired AER-event. This can be completed with a parity
check of specific bits.
4. Store the acquired AER-event in the AER First In, First Out buffer
(FIFO).
5. Generate, or acquire from its source, the timestamp.
6. Store the timestamp in the TIME-FIFO.
7. Either process the AER-event natively and transfer only the desired
result, or bypass and forward the complete AER-event, with or without
timestamp, to the selected communication protocol for further
processing.
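The list above can be condensed into a hypothetical receiver main loop.
receiveEvent() is the handshake sketch from section 2.3.4; the other helpers
stand in for hardware- and application-specific code and are this sketch's
own names.

#include <cstdint>

uint16_t receiveEvent();                      // tasks 1-2: handshake + sample
bool     isValid(uint16_t event);             // task 3: e.g. a parity check
uint32_t readTimestamp();                     // task 5: RTC or counter
bool     pushFifo(uint16_t event, uint32_t t);// tasks 4 and 6: AER/TIME FIFOs
void     forward(uint16_t event, uint32_t t); // task 7: bypass to next unit

void receiverLoop() {
    for (;;) {
        uint16_t event = receiveEvent();
        if (!isValid(event)) continue;        // drop corrupted events
        uint32_t ts = readTimestamp();
        if (pushFifo(event, ts))
            forward(event, ts);               // or process natively (task 7)
        // else: FIFO full (overflow); the handling is policy-dependent
    }
}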
Timestamp
Timestamps are used to tag each event with a unique identification. This
identification can be used to keep all continuously acquired events in the
correct order, and for further processing such as the noise reduction
algorithms described in section 3.2.
Timestamps cause additional latency, as they generate more data to be
transferred, though they can be highly valuable.
Timestamps are obtained from hardware, a real-time clock (RTC), often an
on-board chip with a 1 kHz resolution, which works well for receivers with a
transaction speed below 1000 EPS.
Device        Timestamp bit-width
DVS128 BASIC  32-bit
DVS128_PAER   —
eDVS          none, 8, 16, 24 or 32-bit (user-defined)
eDVS MINI     none, 8, 16, 24 or 32-bit (user-defined)

Table 2.5: Timestamp resolution on DVS devices.
Another way to generate timestamps is with counters, which increment their
value with each clock cycle. In this case, however, it is not possible to
recover the actual wall-clock time of an event at later stages should the
need arise.
First In, First Out
Asynchronous communication and handshaking were discussed in section 2.3.4.
The approaches discussed make it possible for two systems in different clock
domains to communicate with each other. Though handshaking is a sufficient
method for synchronization, it does not guarantee that no data is lost. Data
loss can occur if the sender transfers data faster than the receiver is able
to accept it, for example because the receiver is slow to bypass the
AER-event, or because processing an AER-event takes longer and occupies the
receiver. This issue is called an overwrite error: the receiver cannot
acquire the new AER-event in time, and the data is overwritten by a newer
AER-event before it is acquired.
This issue can be solved with queues such as a FIFO, which is widely used in
most data acquisition applications. A First In, First Out buffer stores data
in memory in sequence, and this data is retained until it is acquired by the
beneficiary, as shown in figure 2.23.
Figure 2.23: First In, First Out (FIFO). The data received first is the first to
be sent out by a request from the beneficiary [16].
Another consideration is the size of the FIFO, in other words, how many
AER-events it can store. This is proportional to the transaction speed, the
size of each FIFO element, and the available memory. Choosing a reasonable
FIFO size avoids overflow errors and ensures that the FIFO always has memory
available for new AER-events. A minimal software sketch of such a FIFO
follows.
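The listing below is a sketch only; the thesis's own implementations are
eDVSEventFIFO in C++ (appendix C) and a VHDL FIFO (appendix D). Pushing into
a full buffer is reported to the caller, which corresponds to the overflow
condition discussed above.

#include <cstddef>
#include <cstdint>

template <std::size_t N>
class EventFifo {
    uint16_t    buf[N];                // fixed-size ring buffer storage
    std::size_t head = 0, tail = 0, count = 0;
public:
    bool push(uint16_t e) {            // false signals an overflow error
        if (count == N) return false;
        buf[head] = e;
        head = (head + 1) % N;
        ++count;
        return true;
    }
    bool pop(uint16_t& e) {            // false when the FIFO is empty
        if (count == 0) return false;
        e = buf[tail];
        tail = (tail + 1) % N;
        --count;
        return true;
    }
    std::size_t size() const { return count; }
};

A fixed capacity N, chosen from the expected event rate and the consumer's
latency, keeps memory use bounded while absorbing bursts of events.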
2.3.6 Libraries
The DVS provided by Inilabs comes with several libraries. These libraries
support conventional environments, such as Matlab and C/C++, but are
primarily based on Java.
Numerous available libraries and the source code of previous projects were
examined during the current thesis, a favorable way to become familiar with
the DVS in general and to develop a greater understanding of its properties,
protocols, and use. The most influential of these projects were a high-speed
particle tracker with the DVS, the StaticBioVis AER silicon retina sensor
[79], and foveated [4], all supervised by Professor Philipp Dominik
Häfliger, also the primary supervisor of the current thesis.
• jAER: This is the official application for the DVS, with the most
continuous updates and support from the vendor. It provides many
user-defined settings and preferences.
jAER is based on the Java environment, which makes it suitable as a
cross-platform stand-alone application, with a user-friendly graphical user
interface and easy access to essential features, such as connecting to the
DVS and adjusting parameters for convenient exploration.
Another significant benefit of jAER is that it is open-source and can be
adapted and extended according to experimental requirements and demands when
the need arises. It is possible to strip jAER down to a few functions and an
automatic connection to the selected DVS, or to make it more complex, with
the execution of algorithms and computations.
Some jAER applications are illustrated in figure 2.24: (a) shows a simple
representation of occurred AER-events as white pixels, stored in a FIFO; (b)
is an extended version which also represents the polarity of the events,
where green pixels correspond to high AER-events and red pixels to low
AER-events, described in greater detail in table 2.3 in section 2.3.3; (c)
shows the AER-events chronologically within a defined threshold.
(a) Representation of AER-events.
(b) High and low polarities.
(c) Darker greys represent newer events and lighter greys represent older
events.
Figure 2.24: Monitoring of AER-events with jAER [48].
Figure 2.25: Experiments with DVS BASIC and jAER during this thesis.
• cAER: Inilabs provides a pure C/C++ library called cAER [63]. This
library provides all the functions necessary to communicate with the DVS
and begin acquiring AER-events. This application programming interface
(API) is not particularly suitable as a stand-alone application; however,
it can be combined with other libraries, such as algorithms and graphical
user interfaces, to provide outstanding low-level applications with fewer
intermediate layers between the runtime application and the required
hardware.
Under some circumstances, cAER can be favorable for some types of
embedded systems. This API requires several additional libraries which
need to be installed and configured in advance, meaning that cAER must
be executed in an operating system environment. Since these required
libraries are mostly available through Debian package management, Ubuntu
has been considered the best choice. Ubuntu is also a widely supported
operating system for most embedded system platforms, such as the
Raspberry Pi and FPGA-based boards.
• Matlab: Inilabs provides a MathWorks Matlab library for data processing
and data analysis. This library contains simple functions which are mostly
for post-processing, including AER-event extraction and presentation.
It is useful for algorithm development, as it allows the necessary
operations on AER-events while the developer simultaneously takes
advantage of Matlab's environment and its combination of advanced
libraries.
Matlab is also highly suitable for calculations and arithmetic operations
because all data, including AER-events, can be represented as matrices.
This library has been widely used for the development of algorithms,
modifications, and comparisons during this thesis. Its advantages have
primarily lain in offline post-processing analysis of raw AEDAT files in
which whisker movements were recorded and stored in advance.
2.4 Discussion
The previous sections discussed two typical whisker movement monitoring
approaches: EEG and EMG from electrophysiology, in addition to high-speed
videography. Together these describe the instrumentation of almost all
recent research and investigations on rodent behavior, in particular
whisker movements. It is clear that the selection of a technique involves
unique trade-offs which must be weighed against an experiment's
requirements. For this reason, it is important to understand the
advantages and disadvantages of each approach.
For easy comparison, table 2.6 summarizes the most important properties of
the discussed techniques with regard to whisker movement monitoring. The
table shows high-speed digital videography to be the best selection and the
most beneficial for whisker movement monitoring. The high-speed camera is
the only non-invasive instrument that can measure the actual shape of the
whisker and its position with a sufficient resolution.
Although high-speed videography has clear advantages, some of its
drawbacks can be significant for the desired instrumentation. The most
important issue is the latency caused by the amount of information that
must be processed, which conflicts with the definition of real-time
systems. It is challenging to develop an instrument that can process this
amount of data and maintain the desired real-time properties.
Description. EEG: monitoring of electrical activity of the brain with electrodes placed along the scalp. EMG: monitoring of electric potential generated by muscle cells with surface or needle electrodes. High-speed videography: capturing continuous frames by an image sensor which converts light waves to digital signals.

Resolution. EEG: very high, at the level of single or multiple neurons. EMG: low, at the level of musculature areas. High-speed videography: sufficient, at the micrometer level.

Non-invasive. EEG: partially. EMG: fully. High-speed videography: fully.

Format of the output. EEG: activity spikes with frequency and amplitude. EMG: activity spikes with frequency and amplitude. High-speed videography: complex digital image of the entire environment.

Speed of data transaction. EEG: very high due to the simplicity of the output signal. EMG: very high due to the simplicity of the output signal. High-speed videography: very low due to the complexity and size of the image.

Angle of the whisker. EEG: no. EMG: mostly for a single whisker, but too complex for multiple whiskers. High-speed videography: yes.

Shape of the whisker. EEG: no. EMG: no. High-speed videography: yes.

Position of the whisker. EEG: no. EMG: no. High-speed videography: yes.

Table 2.6: Comparison of conventional and ongoing whisker tracking techniques.
The latter part of this chapter described an additional approach, namely
a spiked-based neuromorphic sensor which covers almost all the same
considerations as high-speed videography while providing the desired
eligibility and suitability for whisker movement monitoring purposes.
However, the DVS also has limitations and constraints. If these are
managed, the DVS can be an outstanding replacement for conventional
high-speed videography, especially for neuroscience instrumentation and
neuroplasticity studies.
For greater clarification, and to summarize the most important features of
both high-speed videography and the DVS, a comparison is given in table 2.7.
Description. High-speed videography: capturing continuous frames by an image sensor which converts light waves to digital signals. Dynamic Vision Sensor: event-based neuromorphic sensor which responds to relative intensity change.

Resolution. High-speed videography: very high, 2048 x 1088 (Basler acA2000-340kmNIR). Dynamic Vision Sensor: very low, 128 x 128 (DVS128 or TmpDiff128).

Communication. High-speed videography: synchronous. Dynamic Vision Sensor: asynchronous.

Format of the output. High-speed videography: complex digital image of the entire environment. Dynamic Vision Sensor: coordinate of the new event in the form of an AER-event.

Available parameters. High-speed videography: multidimensional array of pixels with binary representation of intensity for each pixel (grayscale). Dynamic Vision Sensor: AER-event of the occurred change in the form of a Cartesian coordinate and its polarity.

Communication interface. High-speed videography: Camera Link (Basler acA2000-340kmNIR). Dynamic Vision Sensor: UART, USB/FTDI (eDVS4337).

Amount of available data from an experimental recording of whisker movements in 60 seconds. High-speed videography: approximately one gigabyte or more for the raw file and 100 megabytes for compressed MPEG-4, which requires additional offline processing after the actual recording. Dynamic Vision Sensor: proportional to the whisker activity, but approximately 12 megabytes in a normal situation and up to 72 megabytes in the worst-case scenario, in the form of raw AER-events without any noise reduction algorithms.

Table 2.7: Comparison of high-speed videography and DVS.
As discussed, the DVS is a fascinating and impressive solution for the
purpose of this thesis, provided that a suitable algorithm and its
implementation are in place.
In terms of the algorithm, the whisker tracking algorithm must provide
a reliable outcome with a high level of accuracy given the available
information, which in this case is limited compared to other approaches,
such as high-speed videography. The algorithm must
be designed so it can best take advantage of the DVS and its asynchronous
event-based behavior. It should also be designed to provide the desired
goal with high performance and minimized latency. Notice that with the
use of the DVS, noise must be managed. This can be challenging because
at times noise and important information can have identical properties. It
is also important to note that it can be challenging to develop an algorithm
that can handle multiple whiskers in a region of interest where not only the
estimated angle must be achieved, but the algorithm must also distinguish
between different whiskers.
When it comes to implementation, the official libraries provided by the
vendor are not particularly efficient for the needs of the current thesis,
which requires intense high-speed tracking.
There are some important issues concerning the DVS libraries, such as the
inability of jAER to be executed as a real-time application according to
real-time requirements, described in more detail in section 4.1. Most of
the libraries must be executed on general-purpose operating systems, which
are not considered suitable real-time operating systems (RTOSs) due to
non-deterministic behaviors and preemptions.
This issue is more considerable for jAER, as it adds an additional layer
between the application and the operating system, the Java Virtual Machine
(JVM). The JVM provides many behind-the-scenes features to improve the
convenience and performance of the application; on the other hand, it
takes away full control of the development and makes the system more
stochastic, again in conflict with real-time system conditions.
Though the cAER is more suitable for embedded systems than the jAER, it
is not a perfect and optimal option due to its dependency on an operating
system and because the required libraries must be ready in advance, a
process not fully supported by most RTOSs.
2.5 Conclusion
Under unique circumstances, the DVS can be an outstanding competitor
to conventional high-speed videography devices, especially for the goals
of this thesis, namely high-speed whisker movement monitoring and
tracking. The DVS can address almost all of the considerations and issues
raised for high-speed videography while providing noticeable eligibility,
reliability, and efficiency.
The selection of the DVS presupposes that certain topics are addressed,
including the algorithm and its implementation on embedded system
platforms. The algorithm must be able to take the best advantage of the
little available information while eliminating noise and providing a high
level of reliability and accuracy, and the implementation must attempt to
sufficiently meet real-time system definitions and conditions.
Chapter 3
Algorithm
Chapter abstract: This chapter presents a whisker tracking algorithm for the
current thesis. First, the artificial whisker generator is described, for further
performance evaluation of the whisker tracking algorithms. Then, essential noise
issues which must be controlled and eliminated are observed. Additionally, a
selection of tracking algorithms is evaluated in light of signal representation
employed by the DVS and the application requirements. Finally, an algorithm
specifically chosen to be the most optimal for the current study’s conditions and
constraints is presented.
3.1 Artificial whisker generator
For the current thesis, it became clear that the algorithm must be evaluated
at an earlier stage before further decisions were made. In this case,
evaluation included observing the performance of the algorithm during
algorithm development and the adjustment of parameters.
Initially, the evaluation was done by eye, which is useful for catching
significant errors but imprecise for evaluating the reliability of the
desired algorithm, especially for errors with smaller margins and over
long-term execution.
This motivated a reverse-engineering approach using supervised examples,
essentially pairs consisting of an input object and a desired output value.
In this case, rather than having stochastic, random inputs without any
correlation to the output, the algorithm is fed a deterministic input with
a known expected output.
To this end, an artificial whisker generator was conceived which simulates
a constructed whisker with continuous movement and generates corre-
sponding AER-events and the actual angle of the whisker. With this, all
evaluations can be fully autonomous and self-checking, as inputs, in this
case AER-events, can be compared to an expected output, here the whisker
angle, with high accuracy. Moreover, the artificial whisker generator can be
used to generate examples for supervised learning, described in the current
chapter.
The artificial whisker generator accepts whisker and movement properties
from the user, such as whisker length, intensity of AER-events, speed,
and number of movements, among others. From these parameters, it generates
two primary files, described below:
• AEDAT file: This file format is the raw outcome of the DVS and
contains a collection of occurred AER-events, described in detail in
section 2.3.3. In this case, an AEDAT file is generated with regard to
the whisker's properties and movements, transformed and represented as
artificial AER-events. The file follows all necessary structures specified
by Inilabs, such as the header and binary format, and is fully compatible
with Inilabs libraries and applications, such as jAER, described in
section 2.3.6 (a hedged writer sketch follows this list).
• Example data: A text file with tab-separated information for further
evaluation and analysis. This file is a collection of lines, each of
which contains an artificially generated AER-event and the angle of
the whisker at the particular moment. This information is used later
for the evaluation of algorithms. Here, the AER-events are the inputs
which will simulate the actual DVS, and the actual angle is the desired
output. With this file, it is possible to find the error and observe how
parameter adjustments affect the performance of the algorithm.
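As a concrete illustration of the AEDAT bullet above, the sketch below writes artificial events to a file from Matlab. The '#!AER-DAT2.0' header line and the big-endian 32-bit address/timestamp record pairs follow the AEDAT 2.0 format used by jAER; the DVS128 bit positions assumed here (polarity in bit 0, x in bits 1 to 7, y in bits 8 to 14) and the helper name writeAedat are assumptions to be checked against the Inilabs documentation.

% Hypothetical AEDAT 2.0 writer sketch (bit layout assumed; verify against jAER).
function writeAedat(filename, x, y, pol, ts)
    % x, y, pol, ts: equally long vectors describing the artificial AER-events.
    fid = fopen(filename, 'w', 'b');        % 'b': big-endian, as AEDAT expects
    fprintf(fid, '#!AER-DAT2.0\r\n');       % mandatory format header line
    addr = bitor(bitor(uint32(pol(:)), ...        % polarity in bit 0 (assumed)
                 bitshift(uint32(x(:)), 1)), ...  % x coordinate in bits 1-7 (assumed)
                 bitshift(uint32(y(:)), 8));      % y coordinate in bits 8-14 (assumed)
    data = zeros(2 * numel(addr), 1, 'uint32');   % interleave address and timestamp
    data(1:2:end) = addr;
    data(2:2:end) = uint32(ts(:));
    fwrite(fid, data, 'uint32');            % written in the file's big-endian format
    fclose(fid);
end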
When it comes to the intensity of the whisker, the user can select anything
up to full intensity, which constructs a clear whisker represented by many
AER-events. Evaluating the algorithm at lower intensities is useful
because, in real-world conditions, lighting and other external factors
affect the number of AER-events generated by the actual DVS. In other
words, with the artificial whisker generator, the developed algorithm can
be evaluated under different environmental conditions.
Another advantage of the artificial whisker generator is the capability of
noise generation. A user can select between a static or dynamic noise,
or even a combination, to generate artificial noise. This means that a
representation of the constructed whisker can be combined with a number
of meaningless random AER-events to evaluate the algorithm and its
performance in a noisy environment. Notice that even with noise, the
example data contains the expected and the desired whisker angle.
The artificial whisker generator is written in fully object-oriented
Matlab. The user can adjust all parameters as desired and execute a number
of movements. Code B.3 is an example of the use of the artificial whisker
generator, creating a whisker object with desired properties, such as 80%
of the full length, 60% of the full intensity, and an initial angle of 90°.
In addition, dynamic noise has been added, described in section 3.2.2. In
this example, the dynamic noise is 10%, meaning that for every 10
AER-events representing the constructed whisker, one meaningless AER-event,
i.e. noise, is generated, making the outcome more stochastic and realistic.
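Since Code B.3 itself resides in the appendix, the call sequence below is only a hedged approximation of such a session; the class name ArtificialWhisker and the methods addDynamicNoise, sweep, and export are hypothetical placeholders, not the generator's actual interface.

% Hypothetical usage sketch of the artificial whisker generator (names assumed).
w = ArtificialWhisker('Length', 0.8, ...    % 80% of the full length
                      'Intensity', 0.6, ... % 60% of the full intensity
                      'InitialAngle', 90);  % initial angle of 90 degrees
w.addDynamicNoise(0.10);                    % one noise event per 10 whisker events
w.sweep(45, 135, 'Movements', 20);          % execute a number of movements
w.export('whisker.aedat', 'examples.txt');  % AEDAT file and example data file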
At the final stage, the artificial whisker generator produces the two files:
the official AEDAT file and the text file of examples containing AER-events
as inputs and the actual, desired whisker angle as output.
Figure 3.1: Workflow and achievements of the artificial whisker generator. (a) A single whisker generated from numerous artificial AER-events. (b) Sequence of artificial whisker movements and generation of the AEDAT file. (c) Monitoring of the generated AEDAT file with the official jAER from Inilabs.
3.2 Noise
Noise is an important factor that affects almost all algorithms, though to
different degrees. It is one of the main issues of the current thesis, in
particular with the use of the DVS: since AER-events are the only available
information, it is crucial to distinguish between relevant data and noise.
In an ideal development process with a noiseless environment, all incoming
AER-events from the DVS would be guaranteed to be useful and important.
The developer could then focus solely on the algorithm and assume that all
AER-events (the only parameter of the algorithm) have equal weight. In that
case, the algorithm could be a simple mathematical function where the
whisker angle is computed sequentially from each AER-event.
Equation 3.1 shows a simple function that determines the whisker angle
from an AER-event.
Z = atan2(y, x) (3.1)
where x and y are extracted from the AER-event’s Cartesian coordinates.
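In Matlab this per-event estimate is a direct transcription of equation 3.1; the conversion to degrees is added here only for readability.

% Naive per-event angle estimate (equation 3.1).
x = 40; y = 95;                 % Cartesian coordinates of one AER-event
theta = atan2(y, x);            % whisker angle Z of equation 3.1, in radians
thetaDeg = theta * 180 / pi;    % converted to degrees for inspection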
This is of course far from reality. In most cases, noise is an important
and considerable issue which can be alleviated but never fully ignored;
significant control is required to distinguish between useful information
and noise.
The control of noise is an essential operation as otherwise the performance
of algorithms will become stochastic with wide variation, making it
difficult to develop and evaluate the desired algorithm. Therefore, the
algorithm must have the ability to manage and reduce noise to maintain
its reliability and functionality.
In this thesis, noise has been divided into two main categories, static and
dynamic noise, which will be discussed in the following sections.
3.2.1 Static noise
Static noise typically occurs with common properties such as areas. This
type of noise is visible in the same area almost all the time and is easy to
detect as it is expected during observations. This type of noise typically
arises from the environment and DVS properties.
An algorithm for static noise reduction marks which areas contain static
noise and rejects events from these areas as unimportant information. This
can be accomplished with a geometric mask, in this case a circle, which
selects the region of interest and rejects the rest of the area that may
not contain important information.
isNoise(x, y) = 1 if (DVSResolution/2 − x)^2 + (DVSResolution/2 − y)^2 ≥ R^2, and 0 otherwise    (3.2)

where x and y are extracted from the AER-event's Cartesian coordinates,
DVSResolution is the resolution of the DVS, and R is the radius of the
desired circle. Note that the radius is squared so that the comparison is
consistent with the squared distance on the left-hand side.
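A direct Matlab transcription of equation 3.2 is sketched below; DVSResolution = 128 would match the DVS128, and the squared radius keeps the code consistent with the equation above.

% Static noise mask (equation 3.2): reject events outside a centered circle.
function noise = isStaticNoise(x, y, R, DVSResolution)
    cx = DVSResolution / 2;     % center of the pixel array
    cy = DVSResolution / 2;
    noise = ((cx - x)^2 + (cy - y)^2) >= R^2;  % true outside the region of interest
end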
3.2.2 Dynamic noise
Dynamic noise occurs randomly, which makes this type of noise more
difficult to determine. Several batch and post-processing algorithms exist,
though most are based on a non-recursive structure and require a stack of
memory, which is unfavorable for real-time processing and for embedded
system constraints.
As discussed, an AER-event always contains Cartesian coordinates and
polarity, but in most cases a timestamp is also available, described in
section 2.3.5. Although the timestamp is optional, it can be used for
important purposes, including dynamic noise reduction.
In the current thesis, an algorithm for distinguishing dynamic noise
through the use of timestamps was inspired by the cAER library from
Inilabs [63]. This algorithm is based on the intensity of incoming
AER-events, where a higher intensity indicates more important data. To be
more precise, the difference between the current and previous timestamp is
computed and compared with a user-defined threshold. If this difference is
smaller than the threshold, the AER-event is assumed to be potentially
useful information; if the difference is greater, the AER-event is assumed
to be noise:
isNoise(t_i, t_{i−1}) = 1 if |t_i − t_{i−1}| ≥ τ, and 0 if |t_i − t_{i−1}| < τ    (3.3)

where t_i is the current timestamp, t_{i−1} is the previous timestamp
extracted from received and collected AER-events, and τ is the user-defined
timing threshold.
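A minimal Matlab sketch of equation 3.3 is given below, with the previous timestamp carried as explicit state between calls; tau corresponds to the timing threshold τ.

% Dynamic noise filter (equation 3.3): sparse events are treated as noise.
function [noise, tPrev] = isDynamicNoise(t, tPrev, tau)
    % t: timestamp of the incoming AER-event; tPrev: previous timestamp.
    noise = abs(t - tPrev) >= tau;  % a large gap means low intensity, i.e. noise
    tPrev = t;                      % the current timestamp becomes the new state
end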
This emphasizes the advantage of asynchronous event-based sensors: the
developer is guaranteed that only occurred events are sent, in the order of
the actual changes, rather than synchronously at an agreed-upon frequency.
3.2.3 Performance evaluation
Random sample consensus (RANSAC) was used to evaluate the developed
noise reduction algorithms, chosen for its estimation and noise recognition
advantages.
RANSAC divides AER-events into two main categories: useful information
(inliers) and useless information and noise (outliers). It is an iterative
algorithm; however, it requires all data points to be available before
processing can take place, making RANSAC a post-processing algorithm.
Within the Matlab environment, RANSAC has been widely used in the current
thesis to evaluate noise reduction performance. Here, the performance of a
noise reduction algorithm is measured by comparing the amount of data it
accepts and rejects against RANSAC's determination of inliers and outliers.
Moreover, RANSAC was used for parameter adjustment of the noise reduction
approaches before further implementation.
Since AER-events are acquired sequentially and RANSAC is a post-processing
algorithm which requires all available data in advance, a vector containing
a predetermined number of acquired AER-events was assembled and provided to
RANSAC to achieve the desired evaluation.
The structure and steps of RANSAC are as follows [31]:
1. Select random sample of the minimum required size to fit model.
2. Compute a putative model from a sample set.
3. Compute the set of inliers for this model from the whole data set.
4. Repeat 1-3 until a model with the most inliers over all samples is
found.
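For illustration, the four steps above are sketched below in Matlab for a straight-line model through buffered event coordinates; the two-point minimal sample and the inlier distance threshold are illustrative choices, not the exact configuration used for the evaluations.

% Minimal RANSAC sketch: fit a line to buffered AER-event coordinates.
function [bestInliers, bestModel] = ransacLine(x, y, iterations, distThreshold)
    n = numel(x);
    bestInliers = false(n, 1);
    bestModel = [0 0 0];                       % line model [a b c] with ax + by + c = 0
    for k = 1:iterations
        idx = randperm(n, 2);                  % 1. minimal sample: two random points
        p1 = [x(idx(1)); y(idx(1))];
        p2 = [x(idx(2)); y(idx(2))];
        d = p2 - p1;
        if norm(d) == 0, continue; end
        normal = [-d(2); d(1)] / norm(d);      % 2. putative line through the sample
        c = -normal' * p1;
        dist = abs(normal(1) * x(:) + normal(2) * y(:) + c);  % 3. distances to the line
        inliers = dist < distThreshold;
        if nnz(inliers) > nnz(bestInliers)     % 4. keep the model with the most inliers
            bestInliers = inliers;
            bestModel = [normal' c];
        end
    end
end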
Figure 3.2: Performance evaluation of noise reduction approaches. (a) Recorded whisker movements provided by the artificial whisker generator. (b) Recognition of inliers and outliers with RANSAC.
3.2.4 Results
Results of the static noise reduction algorithm are plotted below:
Figure 3.3: Performance of the static noise reduction algorithm. (a) Physical artificial whisker movements. (b) Generated geometrical mask where R = eDVSResolution/2 = 64. (c) The outcome of the static noise reduction algorithm after filtering static noise.
Results of the dynamic noise reduction algorithm are plotted below:
Figure 3.4: Performance of the dynamic noise reduction algorithm. (a) Physical artificial whisker movements. (b) A sequence of timestamps with the threshold function where τ = 50. (c) The outcome of the dynamic noise reduction algorithm after filtering dynamic noise.
Results of the combined static and dynamic noise reduction algorithms in
their most optimal state are plotted below:
Figure 3.5: Results of the combined static and dynamic noise reduction algorithms. (a) Physical artificial whisker movements. (b) The outcome of the noise reduction algorithms after filtering both types of noise.
Overall statistics for the static, dynamic, and combined noise reduction
algorithms are listed below:
Filter        RANSAC inliers   RANSAC outliers   Algorithm: not noise   Algorithm: noise
Static        78.5%            21.5%             93.0%                  7.0%
Dynamic       78.5%            21.5%             91.5%                  8.5%
Combination   78.5%            21.5%             87.0%                  13.0%

Table 3.1: Statistics of the noise reduction algorithms.
3.3 Artificial Neural Network
Machine learning is a state-of-the-art technology for autonomous learning
and self-adapting classification algorithms, covering a wide range of
approaches such as artificial neural networks. Machine learning allows
patterns to be found (pattern recognition) and useful information to be
extracted from massive datasets (data mining), tasks which can be very
complex for human beings to analyze [52]. The purpose of machine learning
in the current thesis is to shift control over developing the desired
whisker movement monitoring and tracking algorithm to a computer.
Before continuing, it is worthwhile to investigate the foundation of
machine learning for discussion of more complicated aspects.
• Unsupervised learning: Attempts to find patterns without any specified
desired results, exploring only the current inputs. The computer is
primarily in charge of finding patterns and performing clustering and
classification based on the similarity of inputs [41]. Unsupervised
learning is suitable when targets are not available and the classification
or desired results are more abstract or unknown.
• Supervised learning: Attempts to develop an algorithm from examples,
each pairing a number of inputs with related, expected outputs. The
developed algorithm behaves as a look-up table when inputs are close to
the provided examples, but is also able to provide a reasonable output
when inputs are less familiar [108]. Supervised learning is suitable when
examples can be provided with a set of inputs (training data) and
desirable outputs (targets).
The artificial whisker generator, described in section 3.1, contributed the
whisker movement examples for supervised learning, containing both training
data and targets, with examples from simulating and constructing an
artificial whisker just as an actual DVS would provide them. With this
contribution, supervised learning can be considered, as its conditions are
met and examples are available.
There are several approaches within supervised learning. One of the most
established and conventional is the Artificial Neural Network (ANN). The
ANN is able to solve complex problems, depending on its optimization and
modification, and has several models which can be implemented according to
the conditions and required complexity. These models include the
Single-layer Perceptron (SLP) and the Multilayer Perceptron (MLP), the
extended version of the SLP.
The foundations of the ANN with SLP and MLP are described in the following
sections as a brief introduction to essential topics and necessary
prerequisite knowledge. However, machine learning and ANNs are complex and
extensive topics, and it is not possible to discuss them in full within
this thesis.
3.3.1 Single layer perceptron
Hebb’s rule states that changes in the strength of synaptic connections are
proportional to the correlation of the firing of the two connecting neurons
[65]. The mathematical model of a neuron, provided by McCulloch and
Pitts, is illustrated in figure 3.6. In actual neurons, the dendrite receives
electrical signals from the axons of other neurons. In the perceptron, these
electrical signals are represented as numerical values [88]. The perceptron
is nothing more than a collection of McCulloch and Pitts neurons, together
with a set of inputs and weights that connect the inputs to the neurons.
A perceptron contains a set of inputs x_i that are multiplied by weights
w_i. The neuron sums these values, as formulated in equation 3.4. Based on
the summed value, an activation or threshold function decides whether the
neuron fires ('spikes'), as formulated in equation 3.5 [65].
Figure 3.6: McCulloch and Pitts mathematical model of a neuron, referred
to as a single perceptron.
The mathematical function of a single neuron, h, is as follows:

h = Σ_{i=1}^{n} w_i x_i    (3.4)

where w_i is the synaptic weight, x_i is the input, and n is the number of
inputs.
The activation or threshold function, y, is as follows:

y = g(h) = 1 if h > ϕ, and 0 if h ≤ ϕ    (3.5)

where g(h) is the function of the neuron and ϕ is a threshold.
With the function in equation 3.5, the perceptron will either fire or not. A
real neuron generates a spike train, a series of continuous spikes over a
period that can be encoded and translated into information.
Notice that the activation function does not necessarily need to be a
binary step function (on or off); it can be a non-linear function, such as
a sigmoid function, which provides a real number [6].
The bias
For equations 3.4 and 3.5, the final output is proportional to the inputs,
but in some cases, it may be favorable to have an actual output from the
perceptron, even if all inputs are zero. To solve this issue, a bias is used.
A bias is typically a neuron with a constant input of −1, but its weight
is dynamic and may change during the learning process. To simplify
descriptions and equations, biases are treated as ordinary inputs with a
constant value but a dynamic weight.
Learning rate
Another consideration is how much a weight is updated in each learning
iteration in equation 3.6. This is decided by the constant learning rate η.
If this value is too large, the learning process changes the weights so
drastically that the perceptron may never reach the desired state. On the
other hand, if this value is too small, the desired state will be achieved,
but only after many iterations. The suggested value is typically
0.1 < η < 0.4 [99].
The algorithm
As discussed, examples are available for supervised learning, indicating
the expected output according to the corresponding input. This means that
the perceptron can only provide an expected output if all its neurons make
the correct decision.
The perceptron algorithm takes charge of the learning so the perceptron
can increasingly provide the correct answers and decisions.
A neuron is made up of inputs, weights, and a threshold. The inputs cannot
change because they are external; as such, only the weights and the
threshold can be changed. Since most learning in a neural network happens
through the weights, the algorithm can adapt itself and learn by updating
the perceptron's weights [65].
Updating the weights is performed in each learning iteration, as follows:

w_ij ← w_ij − η (y_j − t_j) x_i    (3.6)

where w_ij is the current weight connecting input node i to neuron j, η is
the constant learning rate, y_j is the outcome of the activation function
from equation 3.5, t_j is the target (the expected value), and x_i is the
input.
The learning algorithm for the SLP is based on trial and error, meaning
that the weight update in equation 3.6 must be executed many times to
obtain the desired results. In other words, the learning process iterates
for a fixed number of iterations or until the SLP has reached the desired
state. A sufficient approximation of the desired algorithm is reached when
significant improvement is no longer observed and the performance
converges; stopping at this point is also referred to as early stopping
[36].
The structure of the perceptron algorithm is as follows [65]:
1. Initialization: Set all of the weights wij to small (positive and
negative) random numbers.
2. Training: For a number of iterations, or until the outputs are correct,
perform the following for each input vector:
(a) Compute the activation of each neuron j using equations 3.4 and
3.5.
(b) Update each of the weights individually using equation 3.6.
3. Recall: Compute the activation of each neuron j using equations 3.4
and 3.5.
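The three steps translate into a compact training loop; the sketch below is a minimal Matlab version for a single-layer perceptron with the bias treated as an extra input of −1, trained here on the linearly separable logical OR as an illustrative example.

% Minimal single-layer perceptron training sketch (equations 3.4-3.6).
X = [0 0; 0 1; 1 0; 1 1];                  % input vectors (one per row)
T = [0; 1; 1; 1];                          % targets: logical OR, linearly separable
Xb = [X, -ones(size(X, 1), 1)];            % bias as a constant input of -1
w = 0.1 * (rand(size(Xb, 2), 1) - 0.5);    % 1. initialization: small random weights
eta = 0.25;                                % learning rate in the suggested range
for iter = 1:20                            % 2. training for a number of iterations
    for i = 1:size(Xb, 1)
        h = Xb(i, :) * w;                  % weighted sum (equation 3.4)
        y = double(h > 0);                 % threshold activation (equation 3.5)
        w = w - eta * (y - T(i)) * Xb(i, :)';  % weight update (equation 3.6)
    end
end
recall = double(Xb * w > 0);               % 3. recall: forward pass on all inputs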
3.3.2 Multilayer perceptron
An SLP is not able to solve complex equations and can only identify linear
problems, which in most cases does not meet all application requirements.
Real-world problems are usually non-linear, and a comprehensive algorithm
must be capable of solving these types of equations sufficiently. To solve
non-linear problems, the SLP, essentially a single layer of perceptrons,
can be extended to an MLP, which can solve complex problems depending on
its optimization and modification [35].
An MLP contains several layers of perceptrons, in proportion to the
complexity of its decision boundaries, as illustrated in figure 3.7. All
perceptrons in the MLP are fully connected, meaning that all nodes in one
layer are connected to all nodes in the next layer, a structure similar to
neural networks in general.
The topology of an MLP is as follows [87]:
1. Input layer: Distribute a vector of input values to the next layer. This
layer does not perform any computations.
2. Hidden layers: This can be one or more layers. These layers accept
the output from the previous layer, weight it, and pass it through a
non-linear activation function.
3. Output layer: Takes the output from the final hidden layer, weights
it, and possibly passes it through an output activation function to
produce the target values. This layer typically provides a non-linear
output, but this is not required.
Figure 3.7: The general three-layer artificial neural network.
Note that an MLP always has an input and an output layer and can have
numerous hidden layers. The output layer, together with all hidden layers,
has a dedicated bias, described more specifically for SLPs in section
3.3.1. The number of output units does not need to equal the number of
input units, and the number of hidden units can be larger or smaller than
the number of input or output units. An MLP is often referred to as a
two-layer network, corresponding to a network with an input and output
layer in addition to a single hidden layer, while a three-layer network has
an additional hidden layer.
Figure 3.8: Schematic of the effective learning shape of each stage of a
Multilayer Perceptron (MLP) [65].
As noted, the number of layers determines the complexity of the MLP
and its problem solving capabilities. Therefore, layers must be selected
considering requirements and conditions so the MLP can solve the problem
appropriately.
The correlation between the number of layers and the complexity is
illustrated in figure 3.8 and described as follows [34]:
• Single layer: This is able to position a hyper-plane in the input space
(the SLP).
• Two layers (one hidden layer): This is able to describe a decision
boundary which surrounds a single convex region of the input space.
• Three layers (two hidden layers): This is able to generate arbitrary
decision boundaries.
Back-propagation of error
When it comes to learning, the added complexity of the MLP relative to the
SLP becomes even more apparent. One of the most conventional learning
methods for the MLP is back-propagation, which is based on trial and error
and is made up of two main algorithms performed alternately until the
desired performance is obtained.
The forward algorithm feeds through all layers, calculating the activations
of all hidden and output nodes, similar to equations 3.4 and 3.5. However,
the inputs are not necessarily external and can also be nodes from the
previous layer.
The backward algorithm is more complicated, as all layers are proportional
to each other and changes in one layer impact the other layers. In other
words, the error must be corrected with respect to numerous perceptrons.
The back-propagation algorithm uses a gradient descent technique to
minimize the sum-of-squares error, the difference between the actual output
and the target [73].
Gradient descent is a first-order iterative optimization algorithm which
finds a local minimum of a function by taking steps proportional to the
negative of the gradient at the current point, as illustrated in figure 3.9
[2]. Notice that in this case the function is the sum-of-squares error,
which must be differentiated with respect to the selected weight. Weights
are adjusted after every iteration until they converge and the error is
reduced to an acceptable value.
Figure 3.9: The error surface in the nonlinear case for one weight. The
gradient of error (E) in weight space is computed and the weights are
moved along the negative gradient.
The algorithm
The multilayer perceptron algorithm is structured as follows [65]:
1. Initialization: Set all of the layer weights to small (positive and
negative) random numbers.
2. Training: For a number of iterations, or until the outputs are correct,
perform the following for each input vector:
(a) Forwards phase: The input layer along with the hidden layers are
used to decide whether the nodes fire or not. This process feeds
through all layers until it reaches the output layer with the final
decision. Equations 3.4 and 3.5 from the SLP apply if they also take
nodes from the previous layer as their input.
(b) Compute the sum-of-squares error: The error is computed as
the sum-of-squares difference between the actual output and the
target.
(c) Backwards phase: The error feeds backwards through the
network to update the output and hidden layer weights.
3. Recall: Perform the forward phase as described above to compute
the final decision and outcome of the network.
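To make the forward and backward phases concrete, here is a minimal Matlab sketch of a two-layer network trained by gradient descent on the sum-of-squares error; the layer sizes, learning rate, and placeholder data are illustrative, and the log-sigmoid is used in both layers as in the network chosen later in this chapter.

% Minimal two-layer MLP with back-propagation (illustrative sizes and data).
logsig = @(h) 1 ./ (1 + exp(-h));          % log-sigmoid activation
n = 8; hidden = 10; m = 1;                 % input, hidden, and output dimensions
X = rand(100, n); T = rand(100, m);        % placeholder training examples
W1 = 0.1 * (rand(n + 1, hidden) - 0.5);    % hidden weights (+1 row for the bias)
W2 = 0.1 * (rand(hidden + 1, m) - 0.5);    % output weights (+1 row for the bias)
eta = 0.2;
for iter = 1:1000
    Xb = [X, -ones(size(X, 1), 1)];        % forwards phase: hidden activations
    H  = logsig(Xb * W1);
    Hb = [H, -ones(size(H, 1), 1)];
    Y  = logsig(Hb * W2);                  % forwards phase: network output
    dOut = (Y - T) .* Y .* (1 - Y);        % backwards phase: output deltas
    dHid = (dOut * W2(1:end-1, :)') .* H .* (1 - H);  % backwards phase: hidden deltas
    W2 = W2 - eta * Hb' * dOut;            % gradient descent weight updates
    W1 = W1 - eta * Xb' * dHid;
end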
3.3.3 MLP classification for whisker tracking in practice
As discussed, examples must be available to build an MLP for whisker
tracking; the artificial whisker generator described in section 3.1 takes
charge of this requirement and delivers artificial whisker movements just
as they would be acquired from the actual DVS.
The structure of the MLP is designed by the developer with the aim of
whisker movement monitoring and tracking. The MLP's parameters must also be
adjusted to increase performance through trial and error, whereby several
approaches and adjustments can be conducted and compared before a final
decision is made.
Figure 3.10: Whisker tracking algorithm with a two-layer artificial neural
network conceived for this thesis.
Different MLP structures were considered which attempt to take advantage of
the artificial whisker generator and are expected to provide a reasonable
estimation of the whisker angle. These structures are also designed to be
compatible with the different properties of any arbitrary whisker,
including various lengths and environments with conditions that may
generate noise.
The following sections describe the MLP which is specifically chosen to be
the most optimal under the current study’s conditions and constraints, as
illustrated in figure 3.10.
Input layer
The input layer should take the most advantage of the AER-events provided
by the artificial whisker generator. As discussed in section 2.3.3, an
AER-event contains x and y coordinates, a polarity, and in most cases an
additional n-bit timestamp.
To feed the MLP with essential information, the extracted x and y
coordinates are used as input nodes, where a number of previous AER-events
are stored in a FIFO. In this case, half of the input vector contains x
coordinates and half contains y coordinates.
The input vector of the MLP which is fed by the extracted x and y
coordinates is as follows:
x̃ = [x_1 … x_{n/2}  y_1 … y_{n/2}]    (3.7)
where x and y coordinates are extracted from previous AER-events and n
is the length of the input vector.
Notice that timestamps, which the discussed AER-events may lack, are
irrelevant in this case and can be omitted entirely. Note that an
additional bias node will also be added.
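A sketch of how the input vector of equation 3.7 could be assembled is shown below, assuming the last n/2 x and y coordinates are already kept in FIFO vectors named fifoX and fifoY (hypothetical names).

% Assemble the MLP input vector (equation 3.7) from the most recent AER-events.
n = 16;                                   % illustrative input vector length
recentX = fifoX(end - n/2 + 1:end);       % last n/2 x coordinates from the FIFO
recentY = fifoY(end - n/2 + 1:end);       % last n/2 y coordinates from the FIFO
xTilde = [recentX(:); recentY(:)]';       % [x1 ... x_{n/2}  y1 ... y_{n/2}]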
Hidden layers
When it comes to the number of hidden layers, the desired algorithm and
its complexity must be considered. The MLP’s layer topology, as well as
the purpose of hidden layers, and the differences between the two-layer
and three-layer network, are described in section 3.3.2.
For this thesis, a two-layer network is considered wherein the hidden layer
contains 10 nodes. This has been selected based on trial and error and
suggestions from several studies [87]. Note that an additional bias node
will also be added.
A log-sigmoid transfer function, along with a constant gain factor, is
employed as the activation function of the hidden layer nodes, providing a
non-linear logistic output [26].
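Assuming the gain factor is written as β, this activation takes the familiar logistic form g(h) = 1 / (1 + e^(−βh)), which maps any real activation h smoothly onto the interval (0, 1).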
Output layer
Clearly, the algorithm must provide a sufficient approximation of the
output, here the estimated whisker angle. This estimation is expected to be
as close as possible to the target provided by the artificial whisker
generator.
For this purpose, a real number is provided which indicates the estimated
angle. This number can be either an integer or a float, depending on the
design and the provided target. However, it is not recommended to use more
than one decimal place; according to the specifications, a higher-resolution
estimated angle is not required, and it would only further complicate the
entire MLP.
The output of the MLP, as a single real-number angle, is:

θ̂ = y    (3.8)
As in the hidden layer, the log-sigmoid function with the constant gain
factor is also used as the activation function for the output layer.
Simulations
Several environments provide libraries for simulating artificial neural
networks which include all the necessary approaches. A user does not need
to build everything from scratch; available algorithms can be used for
fast and convenient simulations, adjustments, and decision making without
comprehensive, in-depth knowledge of ANNs.
Two of the most conventional environments, both used in this thesis, are
described in the following paragraphs.
• RapidMiner: A graphical open-source application for machine learning,
data mining, and business analytics, among other tools [3]. Its most
significant advantages are that it is free and cross-platform, although it
must be executed on the JVM. Its graphical user interface, along with
comprehensive documentation, makes it a suitable choice for ANN
simulations.
• Matlab Neural Network Toolbox: An extension for Matlab which pro-
vides the necessary algorithms and functions to train, visualize, and sim-
ulate neural networks [69]. Matlab’s environment, as described in section
2.3.6, is advantageous as it has the ability to combine numerous approaches
provided by Matlab together with user-defined implementations, such as
classes, functions, and scripts [9].
To train and perform simulations on MLPs during this thesis, Matlab was
used because of its ability to integrate previously developed algorithms.
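As a sketch of that workflow, the calls below use the standard Neural Network Toolbox interface (feedforwardnet, train); the hidden-layer size and logsig transfer functions mirror the network of section 3.3.3, while the example-file loading and the angle normalization are placeholder assumptions.

% Sketch of MLP training with the Matlab Neural Network Toolbox (placeholder I/O).
data = dlmread('whisker_examples.txt', '\t');  % example file from the generator (assumed name)
X = data(:, 1:end-1)';                   % inputs: columns of AER coordinates
T = data(:, end)' / 180;                 % target angle, normalized for the logsig output
net = feedforwardnet(10);                % two-layer network with 10 hidden nodes
net.layers{1}.transferFcn = 'logsig';    % log-sigmoid in the hidden layer
net.layers{2}.transferFcn = 'logsig';    % and in the output layer
net = train(net, X, T);                  % back-propagation training
thetaHat = net(X) * 180;                 % recall: estimated whisker angles in degrees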
 
JJ_Thesis
JJ_ThesisJJ_Thesis
JJ_Thesis
 
book_dziekan
book_dziekanbook_dziekan
book_dziekan
 

Kürzlich hochgeladen

An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxPurva Nikam
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 

Kürzlich hochgeladen (20)

An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptx
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 

Real-time and high-speed vibrissae monitoring with dynamic vision sensors and embedded systems

2.1.1 Vibrissae . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.2 Active sensing . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 Whisker movement . . . . . . . . . . . . . . . . . . . 8
2.2 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 Electrophysiology . . . . . . . . . . . . . . . . . . . 10
2.2.2 High-speed digital videography . . . . . . . . . . . 12
2.3 Dynamic Vision Sensor . . . . . . . . . . . . . . . . . . . . . 16
2.3.1 Influence of the human eye . . . . . . . . . . . . . . 16
2.3.2 Event-based neuromorphic dynamic vision sensor . . 16
2.3.3 Address-Event Representation . . . . . . . . . . . . . 19
2.3.4 Asynchronous handshaking . . . . . . . . . . . . . . 22
2.3.5 The receiver . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.6 Libraries . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3 Algorithm 33
3.1 Artificial whisker generator . . . . . . . . . . . . . . . . . . 33
3.2 Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.1 Static noise . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.2 Dynamic noise . . . . . . . . . . . . . . . . . . . . . . 36
3.2.3 Performance evaluation . . . . . . . . . . . . . . . . 37
3.2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Artificial Neural Network . . . . . . . . . . . . . . . . . . . 40
3.3.1 Single layer perceptron . . . . . . . . . . . . . . . . . 41
3.3.2 Multilayer perceptron . . . . . . . . . . . . . . . . . 43
3.3.3 MLP classification for whisker tracking in practice . . 47
3.3.4 Performance evaluation . . . . . . . . . . . . . . . . 50
3.3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.4 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.1 Linear whisker tracking algorithm . . . . . . . . . . 57
3.4.2 Linear whisker tracking algorithm in practice . . . . 61
3.4.3 Non-linear whisker tracking algorithm . . . . . . . . 66
3.4.4 Performance evaluation . . . . . . . . . . . . . . . . 68
3.4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4 Implementation 75
4.1 Platforms for real-time processing . . . . . . . . . . . . . . . 75
4.2 Embedded Dynamic Vision Sensor . . . . . . . . . . . . . . 76
4.2.1 ARM Cortex-M4/M0 MCU . . . . . . . . . . . . . . . 77
4.3 Field-programmable gate array . . . . . . . . . . . . . . . . 83
4.3.1 Xilinx ZYNQ-7000 SoC . . . . . . . . . . . . . . . . . 83
4.3.2 High-level synthesis . . . . . . . . . . . . . . . . . . 92
4.3.3 Verification . . . . . . . . . . . . . . . . . . . . . . . . 93
4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5 Physical experiments 97
5.1 Experiments with artificial whisker . . . . . . . . . . . . . . 97
5.1.1 Computer-aided design . . . . . . . . . . . . . . . . . 100
5.2 Experiments in laboratory . . . . . . . . . . . . . . . . . . . 101
6 Results and discussions 105
6.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7 Conclusion 111
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Appendices 123
A Equations 125
A.1 Homogeneous transformation matrix . . . . . . . . . . . . . 125
B Matlab 127
B.1 eDVSEventClass . . . . . . . . . . . . . . . . . . . . . . . . . 127
B.2 EventClass . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
B.3 Artificial Whisker Generator . . . . . . . . . . . . . . . . . . 129
C C/C++ 135
C.1 eDVSEventFIFO . . . . . . . . . . . . . . . . . . . . . . . . . 135
C.2 Main . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
C.3 Noise reduction . . . . . . . . . . . . . . . . . . . . . . . . . 137
C.4 Whisker tracking algorithm . . . . . . . . . . . . . . . . . . 138
C.5 Whisker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
C.6 eDVSEvent . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
C.7 ArtificialWhisker . . . . . . . . . . . . . . . . . . . . . . . . 144
C.8 Arduino . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
D FPGA / VHDL 147
D.1 Vivado Block Design . . . . . . . . . . . . . . . . . . . . . . 147
D.2 Schematic of eDVSWhiskerMonitoring . . . . . . . . . . . . 149
D.3 Asynchronous handshaking . . . . . . . . . . . . . . . . . . 154
D.4 Timestamps generator . . . . . . . . . . . . . . . . . . . . . 156
D.5 FIFO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
D.6 AER Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . 160
D.7 UVM Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . 161
E CAD 163
F Miscellaneous 165

List of Tables

2.1 Basler acA2000-340kmNIR specifications . . . . . . . . . . . 13
2.2 Comparison of devices with the TmpDiff128 silicon chip . . 19
2.3 Structure of a single AER-event made by the TmpDiff128 dynamic vision sensor silicon chip . . . . . . . . . . . . . . . 20
2.4 AER-event transaction speed and latency over the UART protocol . . 21
2.5 Timestamp resolution on DVS devices . . . . . . . . . . . . . 26
2.6 Comparison of conventional and ongoing whisker tracking techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.7 Comparison of high-speed videography and DVS . . . . . . 31
3.1 Statistics of the noise reduction algorithms . . . . . . . . . . 39
3.2 Statistics of the ANN algorithm . . . . . . . . . . . . . . . . 54
3.3 Statistics of the whisker tracking algorithm . . . . . . . . . . 69
3.4 Performance of the non-linear whisker tracking algorithm . 70
3.5 Overall statistics of whisker tracking algorithms influenced by ANN and Kalman filter . . . . . . . . . . . . . . . . . . . 72
6.1 Statistics of noise availability with different illumination sources and qualities . . . . . . . . . . . . . . . . . . . . . . 106
6.2 Statistics of the conceived whisker tracking algorithm in a real experimental environment . . . . . . . . . . . . . . . . . 107
List of Equations

3.1 Angle from AER-event's Cartesian address coordinates . . . 35
3.2 Static noise recognition . . . . . . . . . . . . . . . . . . . . . 36
3.3 Dynamic noise recognition . . . . . . . . . . . . . . . . . . . 37
3.4 Mathematical function of a single neuron . . . . . . . . . . . 41
3.5 Activation or threshold function . . . . . . . . . . . . . . . . 41
3.6 Updating of new weights . . . . . . . . . . . . . . . . . . . . 43
3.7 Input vector from x and y addresses . . . . . . . . . . . . . . 48
3.8 Estimated angle in form of a real number . . . . . . . . . . . 49
3.9 Magnitude of the error . . . . . . . . . . . . . . . . . . . . . 51
3.10 Mean of the error . . . . . . . . . . . . . . . . . . . . . . . . 51
3.11 Mean of the error over the number of measurements . . . . 51
3.12 Mean of the error over the number of iterations . . . . . . . 52
3.13 Prediction model . . . . . . . . . . . . . . . . . . . . . . . . 58
3.14 Scaled prediction model . . . . . . . . . . . . . . . . . . . . 58
3.15 Angle measurement . . . . . . . . . . . . . . . . . . . . . . . 58
3.16 Gain factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.17 Scaled gain factor . . . . . . . . . . . . . . . . . . . . . . . . 59
3.18 Estimate of the state . . . . . . . . . . . . . . . . . . . . . . . 59
3.19 DVS-to-algorithm transformation matrix . . . . . . . . . . . 62
3.20 Algorithm-to-DVS transformation matrix . . . . . . . . . . . 62
3.21 Scalar measurement of a non-linear whisker . . . . . . . . . 66
3.22 Vector measurement of a non-linear whisker . . . . . . . . . 67
3.23 Averaging measurement of a non-linear whisker . . . . . . . 67
Acknowledgement

Firstly, I would like to express my sincere gratitude to my main supervisor, Prof. Philipp Dominik Häfliger of the Department of Informatics at the University of Oslo. The door to Professor Häfliger's office was always open whenever I ran into a spot of trouble or had a question about my research. He has been very supportive since day one, and this accomplishment would not have been possible without him.

I would also like to express my sincere gratitude to my secondary supervisors, Prof. Koen Gerard Alois Vervaeke of the Department of Basic Medical Sciences and leader of the Laboratory for Neural Computation at the University of Oslo, and Associate Prof. Ketil Røed of the Department of Physics at the University of Oslo, for their endless support.

Alongside my supervisors, I would also like to thank the Nanoelectronics, Robotics and Intelligent Systems research groups led by Prof. Jim Tørresen, and the Centre for Integrative Neuroplasticity led by Prof. Marianne Hafting Fyhn. I also appreciate the participation of Associate Prof. Kyrre Glette at the University of Oslo and Prof. Tobi Delbrück at Inilabs, who shared their pearls of wisdom with me during the course of this thesis.

Last, but by no means least, I would like to thank my fellow peers at the University of Oslo and the University of California, Berkeley, who always provided me with assistance throughout my degree.

Aryan Esfandiari, November 2016
Chapter 1

Introduction

1.1 Motivation

Researchers have shown that behaviors in animals often contain useful information concerning their states and properties, although a conclusive understanding of these actions and behaviors has not yet been reached. Actions carry signals for expression and communication with other animals, such as fear, aggression, attention, and sexual attraction, to name a few [60]. These signals can be executed by different sense organs, in particular by the vibrissae of some types of Mammalia [12]. Although human beings do not use facial hair as animals do, this form of communication can be understood through human facial expressions of thoughtfulness, confusion, and happiness, among others; features which can be extracted and distinguished by one of humans' most powerful sensory organs [53]. Communication through vibrissae eases our understanding of animals' intentions [60], though this continues to be an abstract and ambiguous topic for researchers. The topic has attracted such interest that researchers have begun to investigate artificial robotic whiskers [54]. To obtain this type of knowledge, neuroscientists have observed vibrissae, examining possible indications of the relationship between vibrissae movements and rodents' behaviors [57].

One of the largest challenges facing researchers is instrumentation equipment, the use of which is related to and restricted by current technology. For whisker movement monitoring purposes, neuroscientists can either use traditional electrophysiological methods with unfavorable results or more conventional high-speed videography to achieve the desired outcome. However, the latter method is not without disadvantages either, including the processing of large amounts of data, the requirement of additional hardware, and, last but not least, the significant latency of the desired results, in conflict with the terms and definitions of real-time processing. With current solutions, neuroscientists are obligated to move sequentially through multiple time consuming processes to achieve the desired analysis after experimentation. This can be an inconvenient and, in some cases, unreliable method for observing and analyzing the results, significantly impacting research.
1.2 Goal of the thesis

The primary goal of the current study is to open a brand-new aspect of instrumentation for neuroscience, one that provides different capabilities in light of real-time system requirements than conventional equipment. New techniques are discussed and suggested, including numerous algorithms and implementations which contribute to the observation, monitoring and analysis of vibrissae movements. This is an attempt to provide researchers with more convenient and less challenging experimental techniques, so that the study and investigation of the relationship between vibrissae movements and rodent behavior becomes more feasible and reliable. Several approaches and achievements are discussed in this thesis, the most important of which are listed as follows:

• The presentation of a novel technique for real-time and high-speed vibrissae tracking and monitoring with dynamic vision sensors. This thesis provides the necessary background for the use of this sensor in related projects.

• A discussion of algorithms for vibrissae movement monitoring through several outstanding and conventional approaches, including an Artificial Neural Network, as a classifier, and the Kalman filter, as an estimator.

• The provision of the necessary technical knowledge and implementation for the use of a dynamic vision sensor, especially on embedded systems, including all programmable System-On-Chip, field-programmable gate arrays and microcontrollers, for further research concerning the application and development of state-of-the-art technologies.

• Emphasis on the advantages of several conventional techniques, including digital signal processors, high-level synthesis, hardware description languages, and universal verification methodology, among others.
1.3 Outline of the thesis

Chapter 2: This chapter provides the background knowledge for the current thesis. Neuroscientists' ambitions in investigating rodents' vibrissae movements are described. This is followed by the challenges of ongoing instruments, including electrophysiology and high-speed cameras, before a new technique, referred to as the event-based neuromorphic dynamic vision sensor, is presented.

Chapter 3: This chapter presents a whisker tracking algorithm for the current thesis. The artificial whisker generator is described first, and the noise issues which must be controlled and eliminated are examined. A selection of tracking algorithms is evaluated before an algorithm specifically chosen for the current study is presented.

Chapter 4: This chapter presents the optimal embedded system platforms for the current thesis. A selection of techniques is discussed to take advantage of state-of-the-art technologies. The general requirements for real-time applications and their relation to neuroscientific instrumentation are described, before two implementations, chosen as the most optimal under the current study's conditions and for future research and development, are presented.

Chapter 5: This chapter illustrates the physical experiments conducted during the current thesis. The artificial whisker built for evaluation is described, before the laboratory experiments are illustrated.

Chapter 6: This chapter provides the results of the conceived system. First, statistics of the system are presented for evaluation, before the achievements of the system are illustrated. Finally, the results are evaluated against the current study's requirements and the neuroscientists' conditions.

Chapter 7: This chapter presents the conclusion of the conceived system. The research questions are discussed and answered under the study's conditions and constraints, before further research is described.
Figure 1.1: Structure and the outline of this thesis.
Chapter 2

Background

Chapter abstract: This chapter provides a brief insight into the literature review and prior knowledge for the current thesis. First, the neuroscientists' ambitions in investigating rodents' vibrissae movement are described. This is followed by the challenges of ongoing instrumentation. Finally, a brand-new technique, specifically chosen as the most optimal under the current neuroscientific and real-time processing conditions and requirements, is presented.

2.1 Vibrissae and active sensing

2.1.1 Vibrissae

Many small mammals, including laboratory rats and mice, possess, in addition to a visual system, a complementary and well-characterized sensory system driven by the tactile stimulation of prominent arrays of sensitive vibrissae (hereafter referred to as whiskers), particularly those located around the snout [75], as illustrated in figure 2.1. Rodents' somatosensory systems are able to solve complex perceptual tasks, such as determining the position, orientation, size, and shape of an object, whether something is moving or not, its speed, direction and texture, and whether the object is living or nonliving [40].

Whiskers can be categorized into large whiskers (hereafter referred to as macrovibrissae) and small whiskers (hereafter referred to as microvibrissae). The length of a macrovibrissa can be up to approximately 50 mm, with a diameter of less than 1 mm at the base, narrowing towards the tip [76]. The musculature of the mystacial pad, or so-called whisker-pad, enables the control of whisker movement with incredible speed, up to 3000 deg/s at the whisker's tip [57]. Whiskers are very sensitive and can be understood in terms of smaller regions on the human body, such as fingertips and lips, with the exception that many other mammals are more dependent on their somatosensory systems than humans, and are therefore able to protect their somatosensory systems, such as whiskers, from surfaces and objects that may incur damage [86].
Figure 2.1: Functional architecture of the mystacial vibrissae [12]: (a) side view of the mystacial whisker-pad fields; (b) schematic frontal view of the mystacial microvibrissae; (c) schematic frontal view of the mystacial macrovibrissae.

2.1.2 Active sensing

Some types of Mammalia, including rodents like rats and mice, are able to acquire information about their surrounding environment through a collaboration between their head and highly sensitive whiskers.

Active sensing describes how a rat continuously explores and understands an unfamiliar environment or object, making whiskers a non-passive sensor. Active sensing is achieved by whisking, denoting the sweeping back and forth of whiskers against an object to generate tactile sensory information through contact with an environmental structure. In most cases, the process occurs at frequencies between 5 and 12 Hz, as shown in figure 2.2 [39].

Figure 2.2: A whisker's vibration at 25 Hz, recorded by a high-speed camera at 30 fps. Note that the tip of the whisker (black arrow) has a higher frequency than the base of the whisker (white arrow) [40].
Active sensing and exploration make a rat capable of collecting three key parameters from its whiskers: the firing of neurons (spatial), the timing of the firing (temporal), and the intensity of the firing (temporal) [1]. These parameters are used at later stages to encode the collected data into important information about the contacted object, such as its localization and identification. The potential information encoding is illustrated in figure 2.3, which shows how several parameters can be translated into useful information.

Figure 2.3: Potential information encoding with regard to the parameters collected from whiskers [1].

With the unique and powerful advantage of a somatosensory system, a rat is able to collect information in three dimensions from a wider variety of environments, including those that are dark or noisy; a process which may vary from other sensory organs, such as vision or hearing, suitable for specific circumstances.

Several studies have shown that rats use both macrovibrissae and microvibrissae for exploration. However, active movement is only directly completed by the macrovibrissae, and it is with the aid of these that a rat will encounter an object of interest within one or two whisk cycles, orienting its head such that the microvibrissae field can contact the object or area [86]. In other words, this describes active sensing and exploration as a series of sequential and consecutive movements rather than simultaneous ones [39]. Whisker movements are largely constrained to the horizontal axis, and movements along the vertical axis are small. These vertical movements are caused by the activity of extrinsic muscles that act to move the entire whisker-pad together, as illustrated in figure 2.4 [1].

Figure 2.4: Comparison of the whisker's angle during active sensing within a period of 5 seconds. Antero-posterior (AP) corresponds to the right whisker and dorso-ventral (DV) corresponds to the left whisker [10].
2.1.3 Whisker movement

Whisker movement is an important topic for investigation, as whiskers are highly sensitive and even small changes can lead to large changes in the signals in the sensory system [76]. This is significantly related to the rodent's state and its understanding of objects and environmental properties, such as object localization during active sensing [58].

Figure 2.5: Whisker movement observation and angle comparison recorded by a high-speed camera during active sensing. Each collision between the whisker and the object is described by a black dot (0 ms, 120 ms, 230 ms) [75].

One of the most important factors of whisker movement, and one that is highly related to the shape of the whisker, is the angle of the whisker, as shown in figure 2.6. This information corresponds to the force that is applied to the whisker, as shown in figure 2.5. Fortunately, this is not only available to the rat's internal sensory system, but is also visible through external vision sensors such as human eyes and high-speed cameras. A whisker's angle contains essential information which can help researchers and scientists explore the relationship between a rat's behavior and the movement of its whiskers [95].
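In its simplest form, the whisker angle is just the orientation of the line from the whisker's base to its tip. The following is a minimal C++ illustration of this computation for two tracked points; the coordinates are assumed inputs from any tracking method (videography here, or DVS events later in this thesis), and the function name is illustrative.

#include <cmath>
#include <cstdio>

const double kPi = 3.14159265358979323846;

// Angle of the base-to-tip line relative to the horizontal axis,
// in degrees; atan2 handles all four quadrants correctly.
double whiskerAngleDegrees(double baseX, double baseY,
                           double tipX, double tipY) {
    return std::atan2(tipY - baseY, tipX - baseX) * 180.0 / kPi;
}

int main() {
    // Example: tip displaced 40 units right and 25 units up from the base
    std::printf("angle: %.1f deg\n", whiskerAngleDegrees(0, 0, 40, 25));
    return 0;
}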
Figure 2.6: A typical monitoring of whisker movements for research and investigation with conventional instruments [1]: (a) detection and reconstruction of multiple whiskers; (b) whisker movement monitoring with corresponding whisker angles.

Several studies have investigated the physical constraints of whisker movement, including the study shown in figure 2.12, which clarifies two main points. The first is that the base of the whisker has a smaller area of movement than the tip of the whisker. The second is that a macrovibrissa will move within an area of 100° if force is applied naturally from the whisker-pad and not from external factors, such as objects [56]. In other words, the tip of the average macrovibrissa, which has the largest area of movement compared to the rest of the whisker, will, in most active sensing scenarios and without any external influence, move from -50° to +50°, as illustrated in figure 2.7.

Figure 2.7: Area of whisker movement with natural force from the whisker-pad, where θ corresponds to 50° [10].
2.2 Related work

This study first examines two widely used electrophysiology methods in neuroscience experiments on rodents before discussing high-speed videography. The latter technique is state-of-the-art for monitoring whisker movements.

2.2.1 Electrophysiology

One of the most conventional experimental methods on animals in neuroscience is electrophysiology, typically electrophysiological monitoring or so-called electrography. This includes measurements of the electrical activity of neurons, particularly action potential activity, by measuring voltage changes or electric current in neurons and organs. Electrophysiology measurements are performed using electrodes, and the technique is categorized as either invasive, involving the attachment of electrodes inside the body through small surgeries, or non-invasive, corresponding to placing electrodes on the body, such as over the skin, without any surgery.

Electroencephalography

Electroencephalography (EEG) is widely used in most experiments on animals due to its reliable monitoring of the electrophysiological activities of a small group of neurons, or even a single neuron, with high temporal and spatial resolution [66]. Electroencephalography is generally a non-invasive method with electrodes attached along the scalp; however, in some circumstances, invasive electrodes placed surgically, such as a Utah multi-electrode array, are required. Even though EEG is commonly used for the investigation of rodent behaviors, not many researchers have explored the relationship between EEG and whisker movement, due to its complexity.

Electromyography

Electromyography (EMG) detects the electrical potential generated by muscle cells when these cells are electrically or neurologically activated, as shown in figure 2.8. This method has been used to analyze the muscle movements of whisker-pads with either surface or intramuscular needle electrodes [75].
Figure 2.8: Monitoring of muscle activity and classification with electromyography (EMG) electrodes: (a) surface electrodes, which can classify whether a hand's grip tightens [8]; (b) intramuscular needle electrodes for the monitoring of tiny muscles [85]; (c) comparison between the signals of surface and intramuscular needle electrodes [64].

Researchers have not found EMG sufficient to reconstruct the shape of whiskers, as several whisker parameters are dynamic, such as weight, meaning whisker movement can vary even when the force applied to a whisker is consistent [18]. The method has two additional significant disadvantages. First, experiments often include multiple whiskers, where muscles in small areas affect each other and develop a complex combination of muscle movements. Second, it is not possible to determine a whisker's position in space, nor its elastic behavior, with the information available from an EMG [57].

Figure 2.9: Comparison of whisker movement recorded by EMG (gray trace θ∗) and a high-speed digital camera (black trace θ) [76].
2.2.2 High-speed digital videography

Classical videography techniques have major disadvantages for whisker movement monitoring, typically arising from the whiskers' properties, such as their thinness and high-speed movement in complex patterns. These generate several crucial challenges for videography and have excluded it as an instrumentation option. Fortunately, recent improvements in the spatial resolution of high-speed digital cameras have made videography sufficient for reconstructing the shape of a whisker for further quantitative analysis of behaviors. High-speed digital videography is currently widely used as a non-invasive technique in most whisker movement experiments, owing to its sufficient accuracy and precision, as shown in figure 2.10.

Figure 2.10: Whisker movement observations and angle comparisons recorded by a high-speed camera during an active sensing activity. Each collision between the whisker and the object is presented by the black dots (0 ms, 120 ms, 230 ms) [18].

While a comparison of EMG results and high-speed cameras, as shown in figure 2.9, may present similarities, it is important to note that EMG requires specific adjustments concerning the properties and features of the selected whiskers, such as a whisker's weight, to maintain desirable results, while high-speed cameras require only abstract information and can be applied to most types of whiskers, such as microvibrissae.
High-speed cameras are characterized by a high frame rate with sufficient resolution and a high-speed interface for transmitting data to a receiver. Figure 2.11 and the specifications in table 2.1 describe a high-speed camera from Basler which has been used in recent investigations at the Vervaeke Laboratory for Neural Computation, offering 340 fps at high resolution (2048 x 1088) and up to one thousand frames per second if a lower resolution is selected, a common frame rate for modern high-speed cameras [7].

Table 2.1: Basler acA2000-340kmNIR specifications

Sensor           CMOSIS CMV2000 NIR-enhanced
Resolution       2048 x 1088 (2 MP)
Frame rate       340 fps
Mono/Color       Mono
Interface        Camera Link
Synchronization  External trigger, software and free-run
Lens mount       C-mount

Figure 2.11: High-speed camera from Basler used in recent projects at the Laboratory for Neural Computation: (a) front view with C-mount connection; (b) back view with Camera Link interface.
Videography methods have recently escalated to the next level with three-dimensional (3D) technology that provides data in a 3D perspective for further observation and analysis. Figure 2.12 shows the whisker's movement, defined by three colored circles, throughout an episode of continuous whisking. Each circle corresponds to a specific position on the whisker during active sensing, where the blue circle approximately corresponds to the base of the whisker and the red circle approximately corresponds to the tip of the whisker.

Figure 2.12: Whisker movement represented by three colored circles in space, measured with 3D technology during active sensing [56].

A number of experiments with high-speed cameras have been examined, almost all of which follow a common algorithm and structure after recording whisker movement, shown in figure 2.13, listed as follows and sketched in code below:

1. For each iteration, process a single frame of an image. Note that there may be hundreds and even thousands of frames of high-resolution images in one second.

2. Extract the whisker from the background. In this case, it is beneficial to ease this process at an earlier stage, for example, by creating a significant contrast between the environment and the whisker with a proper background and lighting.

3. Distinguish between the whisker and the background, often achieved with a combination of image processing approaches, such as edge detection and the Hough transform. Notably, most of these mathematical methods are available in common image processing libraries; however, they require significant high-performance resources.

4. Determine the whisker's angle with mathematical functions. This can range from the angle of a linear line to more complex phenomena, such as the angle of a non-linear line, requiring different mathematical approaches such as piecewise polynomial functions, or so-called splines.
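To make the pipeline concrete, the following is a minimal C++ sketch of steps 1 to 4 using OpenCV. The file name and the Canny/Hough thresholds are illustrative assumptions, and the sketch models the whisker as a single straight line; a real implementation would need spline fitting for curved whiskers, as noted in step 4.

#include <opencv2/opencv.hpp>
#include <cstdio>
#include <vector>

int main() {
    cv::VideoCapture capture("whisker_recording.avi"); // assumed input file
    cv::Mat frame, gray, edges;
    while (capture.read(frame)) {                       // step 1: one frame per iteration
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::Canny(gray, edges, 50, 150);                // steps 2-3: edge detection
        std::vector<cv::Vec2f> lines;
        cv::HoughLines(edges, lines, 1, CV_PI / 180, 80); // step 3: Hough transform
        if (!lines.empty()) {
            // step 4: angle of the strongest detected line; in Hough
            // space, theta is the angle of the line's normal vector
            double angleDeg = lines[0][1] * 180.0 / CV_PI;
            std::printf("frame angle: %.1f deg\n", angleDeg);
        }
    }
    return 0;
}

Even this toy version hints at the cost of the frame-based approach: every pixel of every frame passes through the edge detector and the Hough accumulator, regardless of whether anything in the scene has changed.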
Figure 2.13: Sequences of multiple time consuming processes to achieve the desirable analysis after an experiment [75]: (a) recording of whisker movements; (b) distinguishing and reconstruction of different whiskers; (c) determining the whisker angle.

However, traditional videography has significant disadvantages and may at times be inefficient. Some major disadvantages of videography in terms of whisker movement tracking instrumentation are as follows:

• Large and significant amounts of data in the form of multidimensional arrays of information. To maintain desirable accuracy and precision, it is necessary to increase the frame rate and resolution, proportional to the amount of data. Typically, hundreds of megabytes describe a few seconds of raw, uncompressed bitmaps.

• The use of image processing algorithms to extract useful information can be complex and time consuming, with high latency. In most cases, the algorithm is executed after the recording process. As such, observations cannot take place in real-time.

• A large amount of the transmitted data contains mostly useless and duplicate information about the environment, such as the background or the whiskers' surroundings. This data is nevertheless processed, which is unnecessary, occupies resources, and causes greater latency.

• Most image processing algorithms require complex hardware with a number of available high-performance resources.
2.3 Dynamic Vision Sensor

To understand the dynamic vision sensor's foundation, it is necessary to examine how it is influenced by the human eye before examining the sensor's definition and technical specifications.

2.3.1 Influence of the human eye

The visual parts of the brain do not respond directly to light waves, but rather to neural signals. Therefore, a transformation mechanism that converts light waves into signals understandable and recognizable by the brain is necessary. For this reason, the retina is of great importance [44].

The retina plays an essential role in the transformation of external physical signals, such as light waves, into neural signals understandable by the brain. Light waves pass through three different layers of cells in the retina, as illustrated in figure 2.14 (a): ganglion cells, bipolar cells, and receptor cells. The receptor layer contains two types of photoreceptors, rods and cones, that help us to discriminate and classify light waves, as illustrated in figure 2.14 (b). Cones are responsible for daylight and detailed vision, as well as providing the eye's sensitivity to color. Rods are responsible for dark-adapted vision and work in dim and dark light. Rods are more sensitive to light than cones, and as such they respond and fire from less light, acting as a kind of sensitive motion sensor.

Figure 2.14: Simple anatomy of the retina [43]: (a) the layers and structures of the eye; (b) light flow through ganglion, bipolar, and receptor cells.

2.3.2 Event-based neuromorphic dynamic vision sensor

The asynchronous dynamic vision sensor (DVS) is a spike-based and event-based neuromorphic sensor inspired by neuro-biological architecture and the human retina. The foundation of the DVS is the photoreceptors in the human retina, particularly rods, as described in section 2.3.1.
Figure 2.15: DVS provided by Inilabs [46]: (a) stand-alone DVS with C-mount connection; (b) TmpDiff128 dynamic vision sensor chip.

A DVS registers events generated by relative intensity changes caused by illumination, or, more precisely, by the quantized change of the log intensity of a particular pixel between the current and previous event, as illustrated in figure 2.16 (b). In other words, a DVS samples the intensity of the current state at a specific pixel and compares it with a previous sample of the same pixel. If the difference between these samples exceeds a threshold, the DVS assumes a new event and reports the location, or so-called address, of the sampled pixel. This calculation occurs for all pixels, also referred to as elements or cells. Simultaneously, the DVS reports whether the luminosity change is ascending or descending, referred to as the polarity.

Figure 2.16: Schematic of event-based neuromorphic sensors and the relative intensity change caused by illumination [46]: (a) schematic of a single DVS element; (b) related positive change of luminosity.
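The per-pixel principle can be summarized in a short behavioral sketch in C++. The frame-based input, array sizes, and function names are illustrative assumptions made here for clarity; the real sensor performs this comparison continuously and asynchronously in analog circuitry, with a contrast threshold of about 2.1% on the TmpDiff128 chip described below.

#include <cmath>
#include <cstdint>
#include <vector>

struct AEREvent { uint8_t x, y; bool ascending; };

const int kSize = 128;                      // 128 x 128 pixel array
const double kThreshold = std::log(1.021);  // ~2.1% relative change

// Compare each pixel's current log intensity against the value stored
// at its last event; fire an event and reset the reference when the
// difference exceeds the threshold.
std::vector<AEREvent> generateEvents(const double intensity[kSize][kSize],
                                     double lastLog[kSize][kSize]) {
    std::vector<AEREvent> events;
    for (int y = 0; y < kSize; ++y) {
        for (int x = 0; x < kSize; ++x) {
            double logI = std::log(intensity[y][x]);
            double diff = logI - lastLog[y][x];
            if (std::abs(diff) > kThreshold) {
                events.push_back({static_cast<uint8_t>(x),
                                  static_cast<uint8_t>(y), diff > 0.0});
                lastLog[y][x] = logI;   // reset reference at event time
            }
        }
    }
    return events;
}

The logarithmic comparison is what gives the sensor its wide dynamic range: a roughly 2.1% relative change triggers an event regardless of the absolute illumination level.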
The DVS sensor in this project is a 0.35 µm TmpDiff128 silicon chip provided by Inilabs. It is a pixel array sensor with 128 x 128 pixels, corresponding to 16,384 elements, as shown in figure 2.15 (b). The structure and schematic of a single element are illustrated in figure 2.16 (a) and can be described as follows [62]:

1. A photoreceptor that responds logarithmically to the intensity.

2. An amplifier that removes DC mismatch and amplifies only the change in intensity.

3. A comparator that fires with regard to the relative luminosity change, with a threshold of 2.1%.

The TmpDiff128 silicon chip from Inilabs comes in different forms, which can be selected with regard to the application's specifications and demands. A comparison of the DVS devices provided by Inilabs is given in figure 2.17 and table 2.2.

Figure 2.17: Variety of dynamic vision sensors provided by Inilabs [46]: (a) DVS128 BASIC; (b) DVS128_PAER; (c) eDVS; (d) eDVS MINI.
Table 2.2: Comparison of devices with the TmpDiff128 silicon chip

DVS128 BASIC
  Description: complete stand-alone package, ready to use
  Interface: USB 2.0
  Transaction speed: up to 1M events per second (EPS)
  Timestamp bit-width: 32-bit
  Type of receiver: CPLD
  Lens mount: C-mount

DVS128_PAER
  Description: only the DVS128 and its peripherals
  Interface: 16-bit parallel AER, for external transactions only
  Transaction speed: determined by the receiver
  Lens mount: C-mount
  Direct AER interface: Rome and CAVIAR

eDVS
  Description: DVS128 with microcontroller
  Interface: UART, USB 2.0 FTDI and SPI
  Transaction speed: up to 600K EPS without any timestamp, or 200K EPS with 32-bit timestamp
  Timestamp bit-width: none, 8, 16, 24 or 32 bits (user-defined)
  Type of receiver: MCU (NXP LPC4337)
  Lens mount: S-mount

eDVS MINI
  Description: DVS128 with a compressed and tiny microcontroller
  Interface: UART TTL and USB 2.0 FTDI
  Transaction speed: up to 1320K EPS without any timestamp, or 450K EPS with 32-bit timestamp
  Timestamp bit-width: none, 8, 16, 24 or 32 bits (user-defined)
  Type of receiver: MCU (STM32F74xx)

2.3.3 Address-Event Representation

Address-event representation (AER) is a communication protocol for transferring spikes between bio-inspired chips. The main purpose of AER is to have a standardized protocol for the transaction of the states of an array of cells, such as a DVS, where each cell has a continuous state variation with respect to time [11]. Since the DVS is a two-dimensional array, in practice it requires two dedicated encoder designs with respect to the x and y directions, where each design samples and encodes the address of a firing pixel, or so-called element, into the corresponding event, as illustrated in figure 2.18 (a) [79].
The data width of the AER varies according to the selected technology, but overall, the AER protocol fully optimizes the represented data to minimize latency and to avoid transferring unnecessary information, such as headers or duplicated data.

Figure 2.18: TmpDiff128 silicon chip and its surrounding entities: (a) DVS with a two-dimensional array and dedicated sample-and-encode entities [79]; (b) parallel to serial communication and vice versa with multiplexers [17].

The structure of a single 16-bit AER-event made by the TmpDiff128 silicon chip is as follows:

Range            Description
AERData[15]      —
AERData[14:8]    Cartesian position of the AER-event with respect to Y
AERData[7:1]     Cartesian position of the AER-event with respect to X
AERData[0]       Polarity of the AER-event: 1 if the luminosity change is ascending, 0 if descending

Table 2.3: Structure of a single AER-event made by the TmpDiff128 dynamic vision sensor silicon chip

It is also notable that with serial communication, such as the Universal Asynchronous Receiver/Transmitter (UART) widely used on DVS devices from Inilabs, a single 16-bit AER-event needs to be encapsulated at byte level, which results in the transfer of two bytes. Moreover, UART is asynchronous, including a start and a stop bit in its handshaking, which must be established for the transaction of every byte. In other words, in the simplest form of UART (without any parity bit and with a single stop bit), the AER-event ends up occupying 20 bits without any timestamp, as illustrated in figure 2.19, or 60 bits with a 32-bit timestamp.
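Unpacking such an event on the receiver side is a matter of a few bit operations. The following C++ sketch decodes a single 16-bit AER-event word according to the bit layout in table 2.3; the names and the example word are illustrative assumptions for the plain, timestamp-free event format.

#include <cstdint>
#include <cstdio>

struct DecodedEvent { uint8_t x, y; bool ascending; };

DecodedEvent decodeAER(uint16_t word) {
    DecodedEvent e;
    e.y = (word >> 8) & 0x7F;    // AERData[14:8]: y address, 0..127
    e.x = (word >> 1) & 0x7F;    // AERData[7:1]:  x address, 0..127
    e.ascending = word & 0x01;   // AERData[0]:    polarity
    return e;
}

int main() {
    // Example: assemble a word for an event at (x=17, y=42), ascending
    uint16_t word = (42u << 8) | (17u << 1) | 1u;
    DecodedEvent e = decodeAER(word);
    std::printf("x=%u y=%u polarity=%s\n", unsigned(e.x), unsigned(e.y),
                e.ascending ? "ascending" : "descending");
    return 0;
}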
Table 2.4: AER-event transaction speed and latency over the UART protocol

eDVS: up to 600K EPS without any timestamp, or 200K EPS with 32-bit timestamp; ≈1500 ns per event without any timestamp and ≈5000 ns with 32-bit timestamp.
eDVS MINI: up to 1320K EPS without any timestamp, or 450K EPS with 32-bit timestamp; ≈750 ns per event without any timestamp and ≈2000 ns with 32-bit timestamp.

Figure 2.19: UART sequence of 20 bits representing an AER-event without any timestamp on the eDVS, using the default AER-protocol settings.

Address-Event raw file

Address Event DATA (AEDAT) files contain information for efficient representation, including the AER-event and its timestamp, if any. Some additional information is also encapsulated in a header, including the sensor parameters and the address and timestamp bit-widths, to provide important parameters about the recorded activity. The structure of AEDAT 2.0 is as follows:

1. The header begins with # (0x23) and includes preferences such as the version of AEDAT, the bit-widths of the address and timestamp, the timestamp resolution, and the sensor's parameters. The header is completed with a carriage return (0x0D) followed by a newline (0x0A) and null-null (0x00 0x00).

2. The start of a new AER-event is indicated with null-null (0x00 0x00).

3. A specific number of bytes represents an AER-event. This data is divided into address and timestamp, with regard to the bit-widths described in the header.

4. Repeat from step two if there are events left.

To optimize the AEDAT for greater efficiency, it is created as a binary file and not as ASCII, as all elements except the header are numbers and can easily be represented in binary. A complication of this format is that it cannot be read directly by humans; however, it reduces the file size for faster extraction and avoids unnecessary conversions from ASCII to numbers for further calculations.

#!AER-DAT2.0
# This is a raw AE data file - do not edit
# Data format is int32 address, int32 timestamp
# (8 bytes total), repeated for each event
# Timestamps tick is 1 us
# created Tue Nov 01 00:00:00 CET 2016

Code 2.1: A part of an AEDAT header
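As a concrete illustration of the steps above, the following C++ sketch reads an AEDAT 2.0 file: it skips the '#' header lines and then reads repeated 8-byte records of a 32-bit address followed by a 32-bit timestamp. The sketch assumes the big-endian byte order used by jAER-style tools, under which the upper two zero bytes of a 16-bit DVS address appear as the null-null marker described in step two; the file name is illustrative.

#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Read a 32-bit big-endian integer from the stream.
static uint32_t readBigEndian32(std::istream& in) {
    unsigned char b[4] = {0, 0, 0, 0};
    in.read(reinterpret_cast<char*>(b), 4);
    return (uint32_t(b[0]) << 24) | (uint32_t(b[1]) << 16) |
           (uint32_t(b[2]) << 8) | uint32_t(b[3]);
}

int main() {
    std::ifstream file("recording.aedat", std::ios::binary);
    // Step 1: consume header lines, each beginning with '#'
    while (file.peek() == '#') {
        std::string line;
        std::getline(file, line);
    }
    // Steps 2-4: read address/timestamp pairs until the end of file
    while (true) {
        uint32_t address = readBigEndian32(file);
        uint32_t timestamp = readBigEndian32(file);
        if (!file) break;   // stop once a full record can no longer be read
        std::printf("address=0x%08x timestamp=%u us\n",
                    unsigned(address), unsigned(timestamp));
    }
    return 0;
}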
of this format is that it cannot be directly read by humans; however, it reduces the file size for faster extraction and avoids unnecessary conversions from ASCII to numbers for further calculations.

  #!AER-DAT2.0
  # This is a raw AE data file - do not edit
  # Data format is int32 address, int32 timestamp
  # (8 bytes total), repeated for each event
  # Timestamps tick is 1 us
  # created Tue Nov 01 00:00:00 CET 2016

Code 2.1: A part of an AEDAT header

2.3.4 Asynchronous handshaking

As discussed, the communication of the TmpDiff128 silicon chip in any form of DVS is parallel and asynchronous. At first glance, this may appear unfavorable due to its handshaking requirement, which can generate more logic and delay than synchronous communication; however, this is a major advantage of the DVS. Asynchronous communication ensures that only new AER-events are sent, and the receiver does not need to acquire and process continuously unless there is a new AER-event. This makes the DVS an event-based sensor and significantly minimizes latency, processing, and occupation of resources.

Several issues can occur in asynchronous communication, for example, if the sender and the receiver lack a common clock domain for synchronization, if data becomes available stochastically, or if a sender with a higher frequency sends data continuously to a receiver which may not be ready for a new transaction due to its lower frequency and computational latency. Most asynchronous communication schemes are therefore based on some sort of protocol involving a request, used to initiate an action, and a corresponding acknowledgment, indicating a response to the request.

Four-phase handshaking

Four-phase handshaking is a widely used handshaking protocol as it has the advantage of a feedback signal. This is particularly useful when a sender and receiver are in different clock domains and the sender must wait for the availability of the receiver before the next transaction due to latency in processing.

Four-phase handshaking is used in the DVS to establish AER transactions. This handshaking protocol adds two signals, referred to as the request and acknowledge signals, to the 16-bit AER-event, as illustrated in figure 2.20. In other words, 18 bits is the minimum number of dedicated signals required to communicate with an asynchronous event-based DVS: 2 for the handshake and 16 for the AER-event.
Figure 2.20: Asynchronous handshaking between the talker and the listener, which in this thesis are the DVS and the receiver [16].

The structure of the four-phase handshake protocol is as follows [32]:

1. The talker activates the request signal, figure 2.21 (1).
2. When the listener detects request activity, it activates its acknowledge signal, (2).
3. When the talker detects the acknowledge activity, it deactivates its request signal, (3).
4. When the listener detects the request deactivation, it deactivates its acknowledge signal, (4).
5. When the talker detects the acknowledge deactivation, it returns to the initial state and is ready for the next transaction, (5).

Figure 2.21: Four-phase handshaking protocol with the period of data validity in a push operation [16].

As illustrated in figure 2.21, the data from the sender must be valid and without metastability for a certain period. The availability of data can be controlled with tristate-buffers, driven either by the sender's so-called push operation or by the receiver's so-called pull operation. The DVS uses a push operation, meaning the TmpDiff128 silicon chip guarantees that valid data is available while its request signal is high, so the receiver can begin data acquisition as long as the request signal remains high.

2.3.5 The receiver

As described, the TmpDiff128 silicon chip generates 16-bit AER-events and requires signals for handshake establishment. These signals are transferred
through asynchronous parallel communication for further acquisition. However, several processes need to be completed by the receiver before the data is available for further processing in its final application.

In most cases, the receiver executes the handshake for the asynchronous communication and only forwards the acquired AER-event to the final processing unit over other forms of communication, such as USB, though in some circumstances AER data can be processed natively. The latter method is not suitable for stand-alone applications as it requires native programming of the microcontroller and a lower abstraction level; on the other hand, it minimizes latency significantly as all processing occurs internally without unnecessary communication and routing with external units.

The receivers are typically implemented with an application-specific integrated circuit (ASIC), complex programmable logic device (CPLD), field-programmable gate array (FPGA) as shown in figure 2.22, or microcontroller (MCU). According to table 2.2, the DVS128 BASIC contains an on-board CPLD while the eDVS and eDVS MINI have an on-board MCU. The DVS128_PAER has no receiver; one must be selected by the user with regard to demand through direct AER interfaces, such as CAVIAR [38].

Figure 2.22: A general structure of an AER receiver, optimized for neuromorphic systems, which can be implemented and reconfigured based on VLSI and ASIC-FPGA [17].

As noted, the receiver can also bypass the acquired AER-event and send it to the next unit for further processing in the final application. This communication can be selected concerning available peripherals, as well
as requirements and demands. The acquired AER-event can be bypassed either serially or in parallel. Each communication protocol is described as follows:

• Parallel communication dedicates 16 bits of data in addition to the handshaking signals and is the most suitable and beneficial communication method due to its parallel execution and lower latency. Parallel communication, in this case, requires a number of dedicated signals, which may not be feasible in some circumstances or on some platforms. This type of communication is mostly suitable for embedded systems and is available on the DVS128 PAER and the eDVS (before the on-board microcontroller).

• Serial communication is the most common type of communication, in the form of USB or UART, available on most receivers and processing units. With this type of communication, the receiver acquires the AER-event in parallel but bypasses the data sequentially via a serial data bus with the help of arbiters for encoding and decoding, as illustrated in figure 2.18 (b).

Generally, the primary tasks and responsibilities of the receiver are as follows (a sketch follows this list):

1. Perform handshaking to establish asynchronous communication with the TmpDiff128 silicon chip (described in greater detail in section 2.3.4).
2. Sample the new AER-event, typically 16 bits.
3. Validate the acquired AER-event. This can be accomplished with a parity check of specific bits.
4. Store the acquired AER-event into the AER First In, First Out (FIFO) queue.
5. Generate or acquire the timestamp from its source.
6. Store the timestamp into the TIME-FIFO.
7. Either process the AER-event natively and transfer only the desired result, or bypass and forward the complete AER-event, with or without timestamp, to the selected communication protocol for further processing.
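The following is a minimal C sketch of one acquisition pass covering tasks 1, 2, 5, and the FIFO hand-off. The memory-mapped register names and addresses are hypothetical placeholders, the real mapping depends entirely on the chosen receiver platform, and the busy-wait loops stand in for what would typically be interrupt-driven logic; the fifo_push routine is sketched in the next section.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical memory-mapped registers; addresses and bit
 * assignments are placeholders, not any real platform's map. */
#define REQ_PIN   (*(volatile uint32_t *)0x40000000) /* from DVS */
#define ACK_PIN   (*(volatile uint32_t *)0x40000004) /* to DVS   */
#define AER_DATA  (*(volatile uint32_t *)0x40000008) /* 16 bits  */
#define TIMER_US  (*(volatile uint32_t *)0x4000000C) /* free-run */

extern bool fifo_push(uint16_t event, uint32_t timestamp);

/* One pass of the receiver's four-phase handshake (push operation):
 * sample the event while REQ is high, then complete the handshake. */
void aer_acquire_once(void)
{
    while (!REQ_PIN) { }                 /* (1) wait for request      */
    uint16_t event = (uint16_t)AER_DATA; /* data valid while REQ high */
    uint32_t ts    = TIMER_US;           /* timestamp the event       */
    ACK_PIN = 1;                         /* (2) acknowledge           */
    while (REQ_PIN) { }                  /* (3) sender drops request  */
    ACK_PIN = 0;                         /* (4) handshake completed   */
    fifo_push(event, ts);                /* queue for processing      */
}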
Timestamp

Timestamps are used to encapsulate each event with a unique identification. This identification can be used to keep all acquired continuous events in the correct order and for further processing, such as the noise reduction algorithms described in section 3.2.

Timestamps cause additional latency as they generate more data to be transferred, though they can be highly valuable. Timestamps are typically obtained from hardware with a real-time clock (RTC), often an on-board chip with a 1 kHz resolution, which works well for receivers with transaction speeds below 1000 EPS.

  Device       | Resolution
  DVS128 BASIC | 32-bit
  DVS128 PAER  | —
  eDVS         | none, 8, 16, 24 or 32-bit (user-defined)
  eDVS MINI    | none, 8, 16, 24 or 32-bit (user-defined)

Table 2.5: Timestamp resolution on DVS devices.

Another way to generate timestamps is through counters, which increase their value with each clock cycle. In this case, however, it is not possible to extract the actual wall-clock time of an event at a later stage should the need arise.

First In, First Out

Considerations for asynchronous communication are discussed in section 2.3.4 alongside handshaking. The discussed approaches make it possible for two systems in different clock domains to communicate with each other. Though handshaking is a sufficient method for synchronization, it does not guarantee that data is never lost. Loss can occur if the sender transfers data faster than the receiver is able to accept it, perhaps because the receiver is slow to bypass the AER-event or because processing an AER-event takes a long time and occupies the receiver. This issue is called an overwrite error, indicating that the receiver cannot acquire the new AER-event in time and the data will be overwritten by a new AER-event before it is acquired.

This issue can be solved by queues such as a FIFO, which is widely used in most data acquisition applications. A First In, First Out queue stores data in memory in sequence, and this data is maintained until it is acquired by the beneficiary, as shown in figure 2.23.

Figure 2.23: First In, First Out (FIFO). The data received first is the first to be sent out by a request from the beneficiary [16].
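A compact way to realize such a queue on an embedded receiver is a ring buffer. The following C sketch stores the AER-event and its timestamp together in one element; the power-of-two size and the single-producer/single-consumer assumption are simplifications, not requirements of the DVS.

#include <stdint.h>
#include <stdbool.h>

/* A minimal AER FIFO as a ring buffer. The size is a power of two
 * so index wrap-around reduces to a bit mask. One element holds both
 * the AER-event and its timestamp, keeping the two in lock-step. */
#define FIFO_SIZE 1024u               /* must be a power of two */

typedef struct { uint16_t event; uint32_t timestamp; } aer_elem_t;

static aer_elem_t buf[FIFO_SIZE];
static volatile uint32_t head = 0, tail = 0;

bool fifo_push(uint16_t event, uint32_t timestamp)
{
    if (head - tail == FIFO_SIZE)     /* full: report overflow */
        return false;
    buf[head & (FIFO_SIZE - 1)] = (aer_elem_t){ event, timestamp };
    head++;
    return true;
}

bool fifo_pop(aer_elem_t *out)
{
    if (head == tail)                 /* empty */
        return false;
    *out = buf[tail & (FIFO_SIZE - 1)];
    tail++;
    return true;
}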
Another consideration is the size of the FIFO; in other words, how many AER-events it can store. This is proportional to the speed of transactions, the size of each FIFO element, and the available memory. Choosing a reasonable size for the FIFO avoids overflow errors and ensures that the FIFO always has memory available for new AER-events.

2.3.6 Libraries

The DVS devices provided by Inilabs come with several libraries. These libraries support conventional environments, such as Matlab and C/C++, but primarily Java. Numerous available libraries and the source code of previous projects were examined during the current thesis, providing a favorable way to become familiar with the DVS in general and to develop a greater understanding of its properties, protocols, and use. The most influential among these projects were the high-speed particle tracker with the DVS, the StaticBioVis AER silicon retina sensor [79], and a foveated sensor [4], supervised by Professor Philipp Dominik Häfliger, also the primary supervisor of the current thesis.

• jAER: This is the official application for the DVS, with the most continuous updates and support from the vendor. It provides many user-defined settings and preferences. jAER is based on the Java environment, which makes it suitable as a cross-platform stand-alone application due to its user-friendly graphical user interface and easy access to essential features, such as connecting to the DVS and adjusting parameters for convenient exploration. Another significant benefit of jAER is that it is open-source and can be adapted and extended to experimental requirements and demands when the need arises. It is possible to modify jAER to be as simple as possible, with a few functions and an automatic connection to the selected DVS, or to be more complex, with the execution of algorithms and computations. Some jAER applications are illustrated in figure 2.24, where figure 2.24 (a) shows a simple representation of occurred AER-events with white
pixels, stored into a FIFO. Figure 2.24 (b) is an extended version which also represents the polarity of the events, where green pixels correspond to high AER-events and red pixels correspond to low AER-events, described in greater detail in table 2.3 in section 2.3.3. Figure 2.24 (c) shows the AER-events chronologically within a defined threshold.

(a) Representation of AER-events. (b) High and low polarities. (c) Darker greys represent newer events and lighter greys represent older events.

Figure 2.24: Monitoring of AER-events with jAER [48].

Figure 2.25: Experiments with the DVS BASIC and jAER during this thesis.

• cAER: Inilabs provides a pure C/C++ library called cAER [63]. This library provides all the necessary functions to communicate with the DVS and begin acquiring AER-events. This application programming interface (API) is not particularly suitable as a stand-alone application; however, it can be imported along with other libraries, such as algorithms and graphical user interfaces, to provide outstanding low-level applications
with fewer intermediate layers between the runtime application and the underlying hardware. Under some circumstances, cAER can be favorable for certain types of embedded systems. This API requires several additional libraries which need to be installed and configured in advance. This means that cAER must be executed in an operating system environment; since these required libraries are mostly available through Debian package management, Ubuntu has been considered the best choice. Ubuntu is also a widely supported operating system for most embedded system platforms, such as the Raspberry Pi and FPGAs.

• Matlab: Inilabs provides a MathWorks Matlab library for data processing and analysis. This library contains simple functions which are mostly for post-processing, including AER-event extraction and presentation. The library is useful for algorithm development as it allows the necessary operations on AER-events while the developer simultaneously takes advantage of Matlab's environment and its combination of advanced libraries. Matlab is also highly suitable for calculations and arithmetic operations because all data, including AER-events, can be represented as matrices. This library has been widely used for the development, modification, and comparison of algorithms during this thesis. Its advantages have primarily lain in offline post-processing analysis of raw AEDAT files in which whisker movements were recorded and stored in advance.

2.4 Discussion

The previous sections discussed two typical whisker movement monitoring approaches. EEG and EMG were selected from electrophysiology, in addition to high-speed videography, describing the instrumentation of almost all recent research and investigations on rodent behavior, in particular whisker movements. It is clear that each technique has unique trade-offs which must be considered with regard to an experiment's requirements. For this reason, it is important to understand the advantages and disadvantages of each approach.

For easy comparison, table 2.6 describes the most important properties of the discussed ongoing techniques with regard to whisker movement monitoring. The table shows high-speed digital videography to be the best selection and the most beneficial for whisker movement monitoring. The high-speed camera is the only non-invasive instrument that can measure the actual shape of the whisker and its position with a sufficient resolution.

Although high-speed videography has clear advantages, some of its drawbacks can be significant for the desired instrumentation and achievements. The most important issue is the latency caused by the amount of available information required to be processed, in conflict with the definition of real-
time systems. It is challenging to develop an instrument that can process this amount of data and maintain the desired real-time preferences.

                          | EEG | EMG | High-speed videography
Description               | Monitoring of the electrical activity of the brain with electrodes placed along the scalp | Monitoring of the electric potential generated by muscle cells with surface or needle electrodes | Capturing continuous frames with an image sensor which converts light waves to digital signals
Resolution                | Very high, at single- or multiple-neuron level | Low, at musculature-area level | Sufficient, at micrometer level
Non-invasive              | Partially | Fully | Fully
Format of the output      | Activity spikes with frequency and amplitude | Activity spikes with frequency and amplitude | Complex digital image of the entire environment
Speed of data transaction | Very high due to the simplicity of the output signal | Very high due to the simplicity of the output signal | Very low due to the complexity and size of the image
Angle of the whisker      | — | Mostly for a single whisker but too complex for multiple whiskers | Yes
Shape of the whisker      | — | — | Yes
Position of the whisker   | — | — | Yes

Table 2.6: Comparison of conventional and ongoing whisker tracking techniques.

The latter part of this chapter described an additional approach, namely a spike-based neuromorphic sensor, which covers almost all the same considerations as high-speed videography while also providing the desired eligibility and suitability for whisker movement monitoring purposes. However, the DVS also has limitations and constraints. If these considerations are managed, the DVS can be an outstanding replacement for conventional high-speed videography, especially for neuroscience instrumentation and neuroplasticity studies.

For greater clarification, and to summarize the most important features of both high-speed videography and the DVS, a comparison is given in
table 2.7.

                        | High-speed videography | Dynamic Vision Sensor
Description             | Capturing continuous frames with an image sensor which converts light waves to digital signals | Event-based neuromorphic sensor which responds to relative intensity changes
Resolution              | Very high, 2048 × 1088 (Basler acA2000-340kmNIR) | Very low, 128 × 128 (DVS128 or TmpDiff128)
Communication           | Synchronous | Asynchronous
Format of the output    | Complex digital image of the entire environment | Coordinates of the new event in the form of an AER-event
Available parameters    | Multidimensional array of pixels with a binary representation of intensity for each pixel (grayscale) | AER-event of the occurred change in the form of Cartesian coordinates and its polarity
Communication interface | Camera Link (Basler acA2000-340kmNIR) | UART, USB/FTDI (eDVS4337)
Amount of data from a 60-second experimental recording of whisker movements | Approximately one gigabyte or more for the raw file and 100 megabytes for compressed MPEG-4, which requires additional offline processing after the actual recording | Proportional to the whisker activity, but approximately 12 megabytes in a normal situation and up to 72 megabytes in the worst-case scenario, in the form of raw AER-events without any noise reduction algorithms

Table 2.7: Comparison of high-speed videography and the DVS.

As discussed, the DVS is a fascinating and impressive solution for the purpose of this thesis, under the condition that a suitable algorithm and implementation are provided.

In terms of the algorithm, the whisker tracking algorithm must be able to provide a reliable outcome with a high level of accuracy with regard
to the available information, which, in this case, is limited compared to other approaches such as high-speed videography. The algorithm must be designed so it can take best advantage of the DVS and its asynchronous event-based behavior. It should also be designed to provide the desired goal with high performance and minimal latency. Notice that with the use of the DVS, noise must be managed. This can be challenging because noise and important information can at times have identical properties. It is also important to note that it can be challenging to develop an algorithm that can handle multiple whiskers in a region of interest, where not only must the estimated angle be achieved, but the algorithm must also distinguish between different whiskers.

When it comes to implementation, the official libraries provided by the vendor are not particularly efficient with regard to the needs of the current thesis, which requires intense high-speed tracking. There are some important issues concerning the DVS libraries, such as the inability of jAER to be executed as a real-time application according to real-time requirements, described in more detail in section 4.1. Most of the libraries must be executed on general-purpose operating systems, which are not suitable as real-time operating systems (RTOSs) due to non-deterministic behaviors and preemption. This issue is even more pronounced for jAER as it adds an additional layer between the application and the operating system: the Java Virtual Machine (JVM). The JVM provides many behind-the-scenes features to improve the convenience and performance of the application; on the other hand, it takes away full control of the development and makes the system more stochastic, again in conflict with real-time system conditions. Though cAER is more suitable for embedded systems than jAER, it is not a perfect and optimal option due to its dependency on an operating system and because the required libraries must be prepared in advance, a process not fully supported by most RTOSs.

2.5 Conclusion

Under the right circumstances, the DVS can be an outstanding competitor to conventional high-speed videography devices, especially for the goals of this thesis, namely high-speed whisker movement monitoring and tracking. The DVS can address almost all the considerations and issues raised for high-speed videography with respect to the desired achievements, providing noticeable eligibility, reliability, and efficiency.

The selection of the DVS presupposes that some topics are considered and addressed, namely the algorithm and its implementation on embedded system platforms, where the algorithm must be able to take best advantage of the limited available information while eliminating noise and providing a high level of reliability and accuracy. The implementation must attempt to sufficiently meet real-time system definitions and conditions.
Chapter 3

Algorithm

Chapter abstract: This chapter presents a whisker tracking algorithm for the current thesis. First, the artificial whisker generator is described, for further performance evaluation of the whisker tracking algorithms. Then, essential noise issues which must be controlled and eliminated are observed. Additionally, a selection of tracking algorithms is evaluated in light of the signal representation employed by the DVS and the application requirements. Finally, an algorithm specifically chosen to be the most optimal for the current study's conditions and constraints is presented.

3.1 Artificial whisker generator

For the current thesis, it became clear that the algorithm had to be evaluated at an early stage, before further decisions were made. In this case, evaluation included observing the performance of the algorithm during algorithm development and parameter adjustment. Initially, the evaluation was completed through observation by eye, useful for spotting significant errors but imprecise for evaluating the reliability of the desired algorithm, especially for errors with smaller margins and over long-term execution.

This motivated a reverse-engineering approach using supervised examples, essentially pairs consisting of an input object and a desired output value. In this case, rather than having stochastic and random inputs without any correlation to the output, the algorithm is fed a deterministic input with a known expected output.

To this end, an artificial whisker generator was conceived which simulates a constructed whisker with continuous movement and generates the corresponding AER-events together with the actual angle of the whisker. With this, all evaluations can be fully autonomous and self-checking, as the inputs, in this case AER-events, can be compared against an expected output, here the whisker angle, with high accuracy. Moreover, the artificial whisker generator can be used to generate examples for supervised learning, described later in the current chapter.
The artificial whisker generator accepts whisker and movement properties from the user, such as whisker length, intensity of AER-events, speed, and number of movements, among others. From these parameters, it generates two primary files, described below:

• AEDAT file: This file format is the raw outcome of the DVS and contains a collection of occurred AER-events, described in detail in section 2.3.3. In this case, an AEDAT file is generated according to the whisker's properties and movements. This information is transformed and represented as artificial AER-events. The file follows all the necessary structures specified by Inilabs, such as the header and binary file format, and is fully compatible with Inilabs libraries and applications, such as jAER, described in section 2.3.6.

• Example data: A text file with tab-separated information for further evaluation and analysis. This file is a collection of lines, each of which contains an artificially generated AER-event and the angle of the whisker at that particular moment. This information is used later for the evaluation of algorithms. Here, the AER-events are the inputs which simulate the actual DVS, and the actual angle is the desired output. With this file, it is possible to find the error and observe how parameter adjustments affect the performance of the algorithm.

When it comes to the intensity of the whisker, the user can select up to full intensity, constructing a clearly visible whisker represented by many AER-events. This makes it possible to evaluate the algorithm at lower intensities as well, since in real-world conditions lighting and other external factors affect the number of AER-events generated by the actual DVS. In other words, with the artificial whisker generator, the developed algorithm can be evaluated under different environmental conditions.

Another advantage of the artificial whisker generator is its capability for noise generation. A user can select static or dynamic noise, or a combination, to generate artificial noise. This means that a representation of the constructed whisker can be combined with a number of meaningless random AER-events to evaluate the algorithm and its performance in a noisy environment. Notice that even with noise, the example data contains the expected and desired whisker angle.

The artificial whisker generator is written fully object-oriented in Matlab. The user can adjust all parameters as desired and execute a number of movements. Code B.3 is an example of the use of the artificial whisker generator, making a whisker object with the desired properties, such as 80% of the full length, 60% of the full intensity, and an initial angle of 90°. In addition, dynamic noise has been added, described in section 3.2.2. In this example, the dynamic noise is 10%, meaning that for every 10 AER-events which represent the constructed whisker, a useless AER-event, i.e. noise, is generated, making the outcome more stochastic and genuine.
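The generator itself is implemented in Matlab (code B.3). Purely as an illustration of the underlying idea, the following C sketch emits events along a straight whisker anchored at the sensor centre with a chosen length, intensity (event count), and dynamic-noise fraction; all names and the tab-separated output format are hypothetical, not the thesis implementation.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define RES 128                       /* DVS128 resolution */
#define PI  3.14159265358979323846

/* Emit artificial AER-events for a whisker at angle theta_deg, plus
 * a noise event per 1/noise_frac whisker events on average.
 * Each line: x, y, polarity, true angle (the supervised target). */
static void emit_whisker(double theta_deg, double length_frac,
                         int n_events, double noise_frac)
{
    double th = theta_deg * PI / 180.0;
    for (int i = 0; i < n_events; i++) {
        /* random point along the whisker shaft */
        double r = (rand() / (double)RAND_MAX) * length_frac * (RES / 2);
        int x = (int)(RES / 2 + r * cos(th));
        int y = (int)(RES / 2 + r * sin(th));
        printf("%d\t%d\t%d\t%.1f\n", x, y, rand() & 1, theta_deg);

        if (noise_frac > 0 && (rand() / (double)RAND_MAX) < noise_frac)
            printf("%d\t%d\t%d\t%.1f\n",            /* meaningless event */
                   rand() % RES, rand() % RES, rand() & 1, theta_deg);
    }
}

int main(void)
{
    /* 80 % length, 90 degrees initial angle, 10 % dynamic noise */
    emit_whisker(90.0, 0.8, 200, 0.10);
    return 0;
}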
At the final stage, the artificial whisker generator produces two files: the official AEDAT file format and a text file of examples which contains AER-events as inputs and the actual, desired whisker angle as output.

(a) A single whisker generated from numerous artificial AER-events. (b) Sequence of artificial whisker movements and generation of the AEDAT file. (c) Monitoring of the generated AEDAT file with the official jAER from Inilabs.

Figure 3.1: Workflow and achievements of the artificial whisker generator.

3.2 Noise

Noise is an important factor that affects almost all algorithms, though to different degrees. It is one of the main issues of the current thesis, particularly with the use of the DVS, since with so little available information it is important to distinguish between relevant data and noise.

In an ideal development process with a noiseless environment, all incoming AER-events from the DVS would be guaranteed to be useful and important. As such, the developer could focus solely on the algorithm and assume that all AER-events (the only input of the algorithm) have equal weight. In this case, the algorithm could be a simple mathematical function through which the whisker angle is computed from each AER-event sequentially. Equation 3.1 shows a simple function that determines the whisker angle from an AER-event:

Z = atan2(y, x)        (3.1)

where x and y are extracted from the AER-event's Cartesian coordinates. For example, an event at (x, y) = (30, 52) relative to the whisker base yields Z = atan2(52, 30) ≈ 60°.

This is of course far from reality. In most cases, noise is an important and considerable issue which can be alleviated but not fully ignored. It requires significant control to distinguish between useful information and noise. Controlling noise is an essential operation, as otherwise the performance of the algorithms becomes stochastic with wide variation, making it difficult to develop and evaluate the desired algorithm. Therefore, the algorithm must be able to manage and reduce noise to maintain its reliability and functionality.
In this thesis, noise has been divided into two main categories, static and dynamic noise, which are discussed in the following sections.

3.2.1 Static noise

Static noise typically occurs with common properties, such as a fixed area. This type of noise is visible in the same area almost all the time and is easy to detect as it is expected during observations. It typically arises from the environment and from DVS properties.

An algorithm developed to perform static noise reduction indicates which areas contain static noise and rejects events from these areas, which are assumed to carry unimportant information. This can be accomplished with a geometric mask, in this case a circle, which selects the region of interest and rejects the rest of the area which may not contain important information:

isNoise(x, y) = { 1, if (DVSResolution/2 − x)² + (DVSResolution/2 − y)² ≥ R²
                  0, otherwise }        (3.2)

where x and y are extracted from the AER-event's Cartesian coordinates, DVSResolution is the resolution of the DVS, and R is the radius of the desired circle.

3.2.2 Dynamic noise

Dynamic noise occurs randomly, making this type of noise more difficult to determine. Several batch and post-processing algorithms exist, though most are based on a non-recursive structure and require large amounts of memory, unfavorable for real-time processing and incompatible with embedded system constraints.

As discussed, an AER-event always contains Cartesian coordinates and a polarity, but in most cases a timestamp is also available, as described in section 2.3.5. Although the timestamp is optional, it can be used for important purposes, including dynamic noise reduction.

In the current thesis, an algorithm for distinguishing dynamic noise through the use of timestamps was inspired by the cAER library from Inilabs [63]. This algorithm is based on the intensity of the AER-events, where a higher intensity of incoming AER-events indicates a higher level of importance of the data. To be more precise, the difference between the current and previous timestamp is computed and compared with a user-defined threshold. If this difference is smaller than the threshold, it is assumed that the AER-event may be useful information; if the difference is greater, the AER-event is assumed to be noise:
isNoise(tᵢ, tᵢ₋₁) = { 1, if |tᵢ − tᵢ₋₁| ≥ ε
                      0, if |tᵢ − tᵢ₋₁| < ε }        (3.3)

where tᵢ is the current timestamp, tᵢ₋₁ is the previous timestamp extracted from received and collected AER-events, and ε is a defined timing threshold.

This emphasizes the advantage of asynchronous event-based sensors. With these, the developer is guaranteed that only events that occur are sent, in the order of the actual changes, not synchronized and sequentially at an agreed-upon frequency.

3.2.3 Performance evaluation

Random sample consensus (RANSAC) was used to evaluate the developed noise reduction algorithms, chosen for its estimation and noise recognition advantages. RANSAC allows the AER-events to be divided into two main categories: useful information, the inliers, and useless information or noise, the outliers. RANSAC is iterative; however, it requires all data points to be available before processing can take place. As such, it is a post-processing algorithm.

Within the Matlab environment, RANSAC has been widely used in the current thesis to evaluate noise reduction performance. Here, the performance of a noise reduction algorithm is measured by comparing the amount of data it accepts and rejects against RANSAC's determination of inliers and outliers. Moreover, RANSAC was used for parameter adjustment of the noise reduction approaches before further implementation.

Since AER-events are acquired sequentially and RANSAC is a post-processing algorithm which requires all data in advance, a vector containing a predetermined number of acquired AER-events was assembled and provided to RANSAC to achieve the desired evaluation.

The structure and steps of RANSAC are as follows [31]:

1. Select a random sample of the minimum required size to fit the model.
2. Compute a putative model from this sample set.
3. Compute the set of inliers for this model from the whole data set.
4. Repeat 1-3 until a model with the most inliers over all samples is found.
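As a concrete illustration of these four steps, the following C sketch applies them to the whisker case by modeling the whisker as a straight line through the sensor centre (the whisker base), so the minimal sample of step 1 is a single event. The distance threshold and iteration count are illustrative assumptions; the Matlab RANSAC actually used in this thesis is more general.

#include <math.h>
#include <stdlib.h>

#define RES    128      /* DVS resolution                      */
#define D_MAX  3.0      /* inlier distance threshold in pixels */
#define ITERS  200      /* number of RANSAC iterations         */

/* Steps 1-4 applied to n events. Writes a 0/1 inlier flag per event
 * under the best model found and returns the inlier count. */
int ransac_whisker(const int *x, const int *y, int n, int *inlier)
{
    double best_th = 0.0;
    int best = 0;
    for (int it = 0; it < ITERS; it++) {
        int k = rand() % n;                        /* step 1 */
        double th = atan2(y[k] - RES / 2.0,        /* step 2 */
                          x[k] - RES / 2.0);
        int count = 0;
        for (int i = 0; i < n; i++) {              /* step 3 */
            double dx = x[i] - RES / 2.0, dy = y[i] - RES / 2.0;
            /* perpendicular distance to the line through the centre */
            if (fabs(dx * sin(th) - dy * cos(th)) < D_MAX)
                count++;
        }
        if (count > best) { best = count; best_th = th; } /* step 4 */
    }
    for (int i = 0; i < n; i++) {   /* label events under best model */
        double dx = x[i] - RES / 2.0, dy = y[i] - RES / 2.0;
        inlier[i] = fabs(dx * sin(best_th) - dy * cos(best_th)) < D_MAX;
    }
    return best;
}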
(a) Recorded whisker movements provided by the artificial whisker generator. (b) Recognition of inliers and outliers with RANSAC.

Figure 3.2: Performance evaluation of the noise reduction approaches.

3.2.4 Results

Results of the static noise reduction algorithm are plotted below:

(a) Physical artificial whisker movements. (b) Generated geometrical mask where R = eDVSResolution/2 = 64. (c) The outcome of the static noise reduction algorithm after filtering static noise.

Figure 3.3: Performance of the static noise reduction algorithm.
Results of the dynamic noise reduction algorithm are plotted below:

(a) Physical artificial whisker movements. (b) A sequence of timestamps with the threshold function where ε = 50. (c) The outcome of the dynamic noise reduction algorithm after filtering dynamic noise.

Figure 3.4: Performance of the dynamic noise reduction algorithm.

Results of the combination of both static and dynamic noise reduction algorithms in their most optimal state are plotted below:

(a) Physical artificial whisker movements. (b) The outcome of the noise reduction algorithms after filtering both types of noise.

Figure 3.5: The results of the combination of both static and dynamic noise reduction algorithms.

Overall statistics for the combination of both static and dynamic noise reduction algorithms are listed below:

               RANSAC               Algorithm
               Inliers | Outliers | Not noise | Noise
  Static       78.5%   | 21.5%    | 93.0%     | 7.0%
  Dynamic      78.5%   | 21.5%    | 91.5%     | 8.5%
  Combination  78.5%   | 21.5%    | 87.0%     | 13.0%

Table 3.1: Statistics of the noise reduction algorithms.
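The two filters behind these statistics reduce to a few comparisons per event. The following is a minimal C sketch of equations 3.2 and 3.3; the mask radius and timing threshold are the values from figures 3.3 and 3.4, and the combination simply ORs the two tests.

#include <stdbool.h>
#include <stdint.h>

#define RES     128u   /* DVS resolution              */
#define R_MASK   64u   /* radius of the circular mask */
#define EPS_US   50u   /* timing threshold (epsilon)  */

/* Equation 3.2: static mask, rejects events outside a circular
 * region of interest centred on the sensor. */
bool is_static_noise(uint8_t x, uint8_t y)
{
    int dx = (int)(RES / 2) - x;
    int dy = (int)(RES / 2) - y;
    return (dx * dx + dy * dy) >= (int)(R_MASK * R_MASK);
}

/* Equation 3.3: dynamic filter, flags an event as noise when it is
 * too far in time from the previously accepted event. */
bool is_dynamic_noise(uint32_t t_now, uint32_t t_prev)
{
    uint32_t dt = t_now - t_prev;   /* timestamps assumed monotonic */
    return dt >= EPS_US;
}

/* Combined filter, as used for the statistics in table 3.1. */
bool is_noise(uint8_t x, uint8_t y, uint32_t t_now, uint32_t t_prev)
{
    return is_static_noise(x, y) || is_dynamic_noise(t_now, t_prev);
}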
3.3 Artificial Neural Network

Machine learning is a state-of-the-art technology for autonomous learning and self-adapting classification algorithms, comprising a wide range of approaches, such as artificial neural networks. Machine learning allows patterns to be found (pattern recognition) and useful information to be extracted from massive datasets (data mining). Such tasks can be very complex for human beings to analyze [52]. The purpose of machine learning in the current thesis is to shift control over developing the desired whisker movement monitoring and tracking algorithm to a computer.

Before continuing, it is worthwhile to investigate the foundations of machine learning before discussing more complicated aspects.

• Unsupervised learning: Attempts to find patterns without any specifically desired results, by exploration of the current inputs. The computer is primarily in charge of finding patterns and performing clustering and classification with the help of the similarity of inputs [41]. Unsupervised learning is suitable when targets are not available and the classification and desired results are more abstract or unknown.

• Supervised learning: Attempts to develop an algorithm from examples that pair a number of inputs with related, expected outputs. The developed algorithm behaves like a look-up table when inputs are close to the provided examples, but at the same time is able to provide a reasonable output when inputs are less familiar [108]. Supervised learning is suitable when examples can be provided as a set of inputs (training data) and desired outputs (targets).

The artificial whisker generator, described in section 3.1, contributed the whisker movement examples for supervised learning, containing both training data and targets from simulations of a constructed artificial whisker, just as an actual DVS would provide. With this contribution, supervised learning can be considered, as its conditions are met and examples are available.

There are several approaches within supervised learning. One of the most established and conventional is the Artificial Neural Network (ANN). The ANN is able to solve complex problems given appropriate optimization and modification. The ANN has several models which can be implemented according to the conditions and required complexity. These models include the Single-layer Perceptron (SLP) and the Multilayer Perceptron (MLP), the extended version of the SLP.

The foundations of the ANN, with the SLP and MLP, are described in the following sections as a brief introduction to essential topics and necessary prerequisite knowledge. However, machine learning and ANNs are complex and extensive topics, and so it is not possible to discuss them in
full within this thesis.

3.3.1 Single layer perceptron

Hebb's rule states that changes in the strength of synaptic connections are proportional to the correlation of the firing of the two connecting neurons [65]. The mathematical model of a neuron, provided by McCulloch and Pitts, is illustrated in figure 3.6. In actual neurons, the dendrite receives electrical signals from the axons of other neurons. In the perceptron, these electrical signals are represented as numerical values [88]. The perceptron is nothing more than a collection of McCulloch and Pitts neurons, together with a set of inputs and weights that couple the inputs to the neurons.

A perceptron contains a set of inputs xᵢ that are multiplied by weights wᵢ. The neuron sums these values, as formulated in equation 3.4. Based on the summed value, an activation or threshold function decides whether the neuron fires ('spikes'), as formulated in equation 3.5 [65].

Figure 3.6: McCulloch and Pitts mathematical model of a neuron, referred to as a single perceptron.

The mathematical function of a single neuron, h, is as follows:

h = ∑ᵢ₌₁ⁿ wᵢxᵢ        (3.4)

where wᵢ is the synaptic weight, xᵢ is the input, and n the number of inputs.

The activation or threshold function, y, is as follows:

y = g(h) = { 1, if h > ϕ
             0, if h ≤ ϕ }        (3.5)

where g(h) is the function of the neuron and ϕ is a threshold.
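As a minimal C sketch, equations 3.4 and 3.5 together reduce to a dot product followed by a comparison:

#include <stddef.h>

/* Equations 3.4 and 3.5: weighted sum of the inputs followed by a
 * binary threshold (the McCulloch and Pitts neuron of figure 3.6). */
int perceptron_fire(const double *w, const double *x, size_t n,
                    double phi)
{
    double h = 0.0;
    for (size_t i = 0; i < n; i++)   /* eq. 3.4: h = sum of w_i * x_i */
        h += w[i] * x[i];
    return h > phi;                  /* eq. 3.5: fire iff h > phi     */
}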
With the function in equation 3.5, the perceptron will either fire or not. A real neuron generates a spike train, a series of continuous spikes over a period, that can be encoded and translated into information. Notice that the activation function does not necessarily need to be a binary step function, either on or off; it can be a non-linear function, such as a sigmoid, which provides a real number [6].

The bias

In equations 3.4 and 3.5, the final output is proportional to the inputs, but in some cases it may be favorable to have an actual output from the perceptron even if all inputs are zero. To solve this issue, a bias is used. A bias is typically a neuron with a constant input of −1 whose weight is dynamic and may change during the learning process. To simplify descriptions and equations, biases are treated as ordinary inputs with a constant value but dynamic weight.

Learning rate

Another consideration is how much a weight is updated in each learning iteration of equation 3.6. This is decided by the constant learning rate η. If this value is too large, the learning process changes the weights drastically, which may prevent the perceptron from ever reaching the desired state. On the other hand, if this value is too small, the desired state will be achieved, but only after many iterations. The suggested value is typically 0.1 < η < 0.4 [99].

The algorithm

As discussed, examples are available for supervised learning, indicating the expected output for the corresponding input. This means that the perceptron can only provide the expected output if all its neurons make the correct decision. The perceptron algorithm takes charge of the learning so that the perceptron increasingly provides correct answers and decisions.

A neuron is made up of inputs, weights, and a threshold. The inputs cannot change, because they are external; as such, only the weights and threshold can be changed. Since most learning in the neural network happens in the weights, the algorithm can adapt itself and learn by updating the perceptron's weights [65].
Updating the weights is performed in each learning iteration, for each weight wᵢⱼ, as follows:

wᵢⱼ ← wᵢⱼ − η (yⱼ − tⱼ) · xᵢ        (3.6)

where wᵢⱼ is the current weight connecting input node i to neuron j, η is the constant learning rate, yⱼ is the outcome of the activation function from equation 3.5, tⱼ is the target or expected value, and xᵢ is the input.

The learning algorithm for the SLP is based on a trial-and-error method, meaning that the weight update in equation 3.6 must be executed several times to reach the desired result. In other words, the learning process must iterate a number of times, or until the SLP has achieved the desired state. A sufficient approximation of the desired algorithm is reached when significant improvement is no longer observed and the performance converges, also referred to as early stopping [36].

The structure of the perceptron algorithm is as follows [65]:

1. Initialization: Set all of the weights wᵢⱼ to small (positive and negative) random numbers.
2. Training: For a number of iterations, or until the outputs are correct, perform the following for each input vector:
(a) Compute the activation of each neuron j using the activation function h from equation 3.4 along with equation 3.5.
(b) Update each of the weights individually using equation 3.6.
3. Recall: Compute the activation of each neuron j using equations 3.4 and 3.5.

3.3.2 Multilayer perceptron

An SLP is not able to solve complex problems and can only handle linearly separable problems, which in most cases does not meet all application requirements. Usually, real-world problems are non-linear, and a comprehensive algorithm must be capable of solving these types of problems sufficiently. To solve non-linear problems, the SLP, essentially a single layer of perceptrons, can be extended to an MLP, which can solve complex problems given appropriate optimization and modification [35].

An MLP contains several layers of perceptrons, in proportion to the complexity of its decision boundaries, as illustrated in figure 3.7. All perceptrons in the MLP are fully connected, meaning that all nodes in one layer are connected to all nodes in the next layer, an arrangement that resembles neural networks in general.
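Before moving on to the multilayer topology, the following C sketch ties equations 3.4-3.6 into one training epoch over the single-layer case. It assumes the bias convention described above (a constant −1 carried as the last input, with the threshold ϕ fixed at 0) and reuses the perceptron_fire routine sketched earlier; the learning rate is an illustrative value within the suggested range.

#include <stddef.h>

#define ETA 0.25   /* learning rate, within the suggested 0.1-0.4 */

extern int perceptron_fire(const double *w, const double *x,
                           size_t n, double phi);

/* One training epoch of the perceptron rule (equation 3.6) over m
 * examples. Each input vector x[j] has n elements, the last being
 * the constant -1 bias input; t[j] is the 0/1 target. */
void slp_train_epoch(double *w, size_t n,
                     const double *x, const int *t, size_t m)
{
    for (size_t j = 0; j < m; j++) {
        const double *xj = &x[j * n];
        int y = perceptron_fire(w, xj, n, 0.0);  /* forward pass */
        for (size_t i = 0; i < n; i++)           /* eq. 3.6      */
            w[i] -= ETA * (double)(y - t[j]) * xj[i];
    }
}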
The topology of an MLP is as follows [87]:

1. Input layer: Distributes a vector of input values to the next layer. This layer does not perform any computations.
2. Hidden layers: One or more layers which accept the output from the previous layer, weight it, and pass it through a non-linear activation function.
3. Output layer: Takes the output from the final hidden layer, weights it, and possibly passes it through an output activation function to produce the target values. This layer typically provides a non-linear output, but this is not strictly necessary.

Figure 3.7: The general three-layer artificial neural network.

Note that an MLP always has an input and an output layer and can have numerous hidden layers. The output layer, together with all hidden layers, has a dedicated bias, described more specifically for the SLP in section 3.3.1. The number of output units does not need to be equal to the number of input units, and the number of hidden units can be more or fewer than the number of input or output units.

An MLP is often referred to as a two-layer network. This corresponds to a network with an input and output layer in addition to a single hidden layer, while a three-layer network corresponds to an additional hidden layer.
Figure 3.8: Schematic of the effective learning shape of each stage of a Multilayer Perceptron (MLP) [65].

As noted, the number of layers determines the complexity of the MLP and its problem-solving capabilities. Therefore, the layers must be selected considering the requirements and conditions, so the MLP can solve the problem appropriately. The correlation between the number of layers and the complexity is illustrated in figure 3.8 and described as follows [34]:

• Single layer: Able to position a hyper-plane in the input space (the SLP).
• Two layers (one hidden layer): Able to describe a decision boundary which surrounds a single convex region of the input space.
• Three layers (two hidden layers): Able to generate arbitrary decision boundaries.

Back-propagation of error

When it comes to learning, the difference in complexity between the SLP and MLP becomes even more apparent. One of the most conventional learning methods for the MLP is back-propagation, which is based on trial and error and consists of two main algorithms performed back and forth until the desired result is obtained.

The forward algorithm feeds through all layers, calculating the activations of all hidden and output nodes, similarly to equations 3.4 and 3.5, although the inputs are not necessarily external and can also be nodes from the previous layer.

The backward algorithm is more complicated, as all layers are proportional to each other and changes in one layer will impact the other layers. In other words, the error must be corrected with respect to numerous perceptrons. The back-propagation algorithm uses a gradient descent
technique to minimize the sum-of-squares error, the difference between the actual output and the target [73].

Gradient descent is a first-order iterative optimization algorithm which finds a local minimum of a function by taking steps proportional to the negative of the gradient at the current point, as illustrated in figure 3.9 [2]. Notice that in this case the function is the sum-of-squares error, which must be differentiated with respect to the selected weight. Weights are adjusted after every iteration until they converge and the error is reduced to an acceptable value.

Figure 3.9: The error surface in the nonlinear case for one weight. The gradient of the error (E) in weight space is computed and the weights are moved along the negative gradient.

The algorithm

The multilayer perceptron algorithm is structured as follows [65]:

1. Initialization: Set all of the layer weights to small (positive and negative) random numbers.
2. Training: For a number of iterations, or until the outputs are correct, perform the following for each input vector:
(a) Forward phase: The input layer, along with the hidden layers, is used to decide whether the nodes fire or not. This process feeds through all layers until it reaches the output layer with the final decision. Equations 3.4 and 3.5 from the SLP apply if they also take nodes from the previous layer as their input.
(b) Compute the sum-of-squares error: The error is computed as the sum-of-squares difference between the actual output and the target.
(c) Backward phase: The error feeds backwards through the network to update the output and hidden layer weights.
3. Recall: Perform the forward phase as described above to compute the final decision and outcome of the network.
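A minimal C sketch of the forward phase (step 2a) for a two-layer network with a single output follows. The flattened weight layout, the gain factor, and the bias-as-last-weight convention are implementation assumptions; the log-sigmoid activation mirrors the one chosen in section 3.3.3 below, but this is a sketch, not the thesis's Matlab implementation.

#include <math.h>
#include <stddef.h>

/* Log-sigmoid activation with a constant gain factor. */
static double logsig(double h, double gain)
{
    return 1.0 / (1.0 + exp(-gain * h));
}

/* Forward phase of a two-layer MLP (one hidden layer). Bias nodes
 * are folded in as an extra constant -1 input per layer. Weight
 * layout: wh[j*(n_in+1)+i] feeds input i to hidden node j (index
 * n_in is the bias weight), and wo[0..n_hid] feeds the hidden layer
 * (plus bias) to the single output node. */
double mlp_forward(const double *x, size_t n_in,
                   const double *wh, double *hidden, size_t n_hid,
                   const double *wo, double gain)
{
    for (size_t j = 0; j < n_hid; j++) {
        double h = -1.0 * wh[j * (n_in + 1) + n_in]; /* bias input */
        for (size_t i = 0; i < n_in; i++)
            h += wh[j * (n_in + 1) + i] * x[i];
        hidden[j] = logsig(h, gain);
    }
    double o = -1.0 * wo[n_hid];                     /* bias input */
    for (size_t j = 0; j < n_hid; j++)
        o += wo[j] * hidden[j];
    return logsig(o, gain);   /* later scaled to an angle estimate */
}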
3.3.3 MLP classification for whisker tracking in practice

As discussed, to be able to build an MLP for whisker tracking, examples must be available; the artificial whisker generator described in section 3.1 can take charge of this requirement and deliver artificial whisker movements just as they would be acquired from the actual DVS. The structure of the MLP is designed by the developer with the purpose and aim of whisker movement monitoring and tracking. The MLP's parameters must also be adjusted to increase performance through trial and error, by which several approaches and adjustments can be conducted and compared before a final decision is made.

Figure 3.10: Whisker tracking algorithm with a two-layer artificial neural network conceived for this thesis.

Different structures of MLP have been considered which attempt to take advantage of the artificial whisker movement generator and are expected to provide a reasonable estimation of the whisker angle. These structures are also designed to be compatible with the different properties of any arbitrary whisker, including those of various lengths and in
environments with different conditions which may generate noise. The following sections describe the MLP which was specifically chosen to be the most optimal under the current study's conditions and constraints, as illustrated in figure 3.10.

Input layer

The input layer should take the most advantage of the AER-events provided by the artificial whisker generator. As discussed in section 2.3.3, an AER-event contains x and y coordinates, a polarity, and in most cases an additional n-bit timestamp.

To feed the MLP with essential information, the extracted x and y coordinates are used as input nodes, where a number of previous AER-events are stored in a FIFO. In this case, half of the input vector contains x coordinates and half contains y coordinates. The input vector of the MLP, fed by the extracted x and y coordinates, is as follows:

x̃ = [x₁ … x_{n/2} y₁ … y_{n/2}]        (3.7)

where the x and y coordinates are extracted from previous AER-events and n is the length of the input vector. Notice that the timestamps of the AER-events are irrelevant in this case and as such can be completely omitted. Note that an additional bias node will also be added.

Hidden layers

When it comes to the number of hidden layers, the desired algorithm and its complexity must be considered. The MLP's layer topology, the purpose of hidden layers, and the differences between the two-layer and three-layer network are described in section 3.3.2.

For this thesis, a two-layer network is considered, wherein the hidden layer contains 10 nodes. This was selected based on trial and error and suggestions from several studies [87]. Note that an additional bias node will also be added. A log-sigmoid transfer function, along with a constant gain factor, is employed as the activation function of the hidden layer nodes, providing a non-linear logistic sigmoid [26].

Output layer

Clearly, the algorithm must provide a sufficient approximation of the output, here the estimated whisker angle. This estimation is expected
to be as close as possible to the target provided by the artificial whisker generator. For this purpose, a real number is provided which indicates the estimated angle. This number can be either an integer or a float, depending on the design and the provided target. However, it is not recommended to keep more than one decimal place: according to the specifications, a higher-resolution estimated angle is not required, and it would only further complicate the entire MLP.

The output of the MLP, as a single real-number angle, is:

θ̂ = y        (3.8)

where y is the single output of the MLP. As in the hidden layer, the log-sigmoid function, along with the constant gain factor, is also used as the activation function for the output layer.

Simulations

To simulate artificial neural networks, several outstanding environments provide libraries which include all the necessary approaches. A user does not need to build everything from scratch; available algorithms can be used for fast and convenient simulations, adjustments, and decision making without comprehensive or in-depth knowledge of ANNs. Two of the most conventional environments, which were also used in this thesis, are described in the following paragraphs.

• RapidMiner: A graphical open-source application for machine learning, data mining, and business analytics, among other tools [3]. The most significant advantage of this application is that it is often free and can be used across platforms; however, it must be executed on the JVM. Its graphical user interface, along with comprehensive documentation, makes it a suitable choice for ANN simulations.

• Matlab Neural Network Toolbox: An extension for Matlab which provides the necessary algorithms and functions to train, visualize, and simulate neural networks [69]. Matlab's environment, as described in section 2.3.6, is advantageous as it can combine the numerous approaches provided by Matlab with user-defined implementations, such as classes, functions, and scripts [9].

To train and perform simulations on MLPs during this thesis, Matlab was used because of its ability to integrate previously developed algorithms,