Robust Abandoned Object Detection based on Life-cycle State Measurement

     Wei-Hsin Hsu (徐維忻), Hung-I Pai (白宏益), Shen-Zheng Wang (王舜正),
                San-Lung Zhao (趙善隆), Kung-Ming Lan (藍坤銘)

   Identification and Security Technology Center, Systems Development and Solutions Division,
                Industrial Technology Research Institute, Hsin-Chu, Taiwan
   E-mail: hsuweihsin@itri.org.tw, HIPai@itri.org.tw, st@itri.org.tw, slzhao@itri.org.tw,
           blueriver@itri.org.tw



                     ABSTRACT

In public areas, objects may be abandoned through careless forgetting or for the purpose of a terrorist attack. If such abandoned objects can be detected automatically by a video surveillance system, forgotten objects can be returned to their owners and terrorist attacks can be prevented. In this paper, we propose an automatic abandoned object detection algorithm to meet this requirement. The algorithm includes a double-background framework, which generates two background models at different sampling rates. The framework rapidly extracts un-moving object candidates, and the abandoned objects are then selected from those candidates according to their appearance features. Finally, this paper proposes a finite state machine that models the Life-cycle State Measurement, called the LCSM module, to prevent missed detections caused by occlusion or illumination changes. When an abandonment event happens, the LCSM module launches an alarm at the event's starting time and stops it at the ending time. To evaluate the performance, we test the algorithm on 10 videos. The experimental results show that the algorithm is feasible, since the false alarm rate and the missing rate are both very low.

Keywords: Pure background; Instant background; un-moving object aggregated map; LCSM

                 1. INTRODUCTION

Video surveillance has become an important problem today. With the progress of technology, cameras are installed for surveillance in many places, including public areas, the insides of buildings, and even public roadways. However, because of insufficient human resources, we cannot monitor every camera manually in real time. In this study, we propose a new abandoned object detection technology, which receives the video stream from a camera and detects abandoned objects within a few minutes. The technology can be used in applications such as dangerous abandoned object detection and abandoned luggage detection for passengers. Moreover, the system also lowers personnel costs.

[1] provides a two-background framework (long-term and short-term), which uses a pair of backgrounds with different characteristics to segment the related foregrounds and extract abandoned objects. Our background-modeling technique follows [1] and tries to improve it. The advantage of the method in [1] is that it does not track all objects, so it saves the computational cost of tracking. However, this method is not perfect yet; the temporal rate of background construction is quite important. In [1], the long-term and short-term backgrounds are updated periodically, but if the long-term background is updated before the abandoned object is detected, the detection fails and the result is affected. In this study, we try to find the optimal timing to reduce the risk of detection failures caused by long-term background updating. The details are described in Section 2.

[2] provides a two-level background framework, which uses a linear combination, belongs to the single-background-modeling family, and tracks each moving object. The framework detects moving objects with optical flow technology, because the optical flow of an object changes while the object is moving; it can therefore easily distinguish moving objects from static objects. Static objects and staying humans are then separated by a human-pattern recognition method. However, this method still has limits as a filter. For example, it is hard to cover every possible shape of a still human, so enough human-pattern templates are required. As the number of templates grows, the recognition rate increases, but so does the computational cost; it is therefore unsuitable when performance is the priority. In Section 2, we propose a method that filters objects by their features
to avoid the performance problem caused by using the human-pattern method.

[3] provides a two-background framework (current and buffered backgrounds) that tracks all objects and records each object's information in order to determine whether it is occluded. The advantage is that the method stays locked on its target even while the target is occluded. In Section 2, we refine the idea from [3] and present the theory of the Life-cycle State Measurement (LCSM), which makes the detection of abandoned objects in different environments more convincing and the detection results more reasonable.

           Fig. 1. Symbol definitions and flow

An abandoned object detection method based on object tracking could be feasible. However, tracking-based methods become inefficient when too many objects are tracked, since they track not only the abandoned object but all other objects as well, and the computation is therefore heavy. Instead of tracking-based technology, we use a new method based on different sampling rates to avoid this problem. In this paper, we divide the system framework into three technical processes. First, we obtain the pure background BP(t) and the instant background BI(t) at time t according to different sampling rates; BP(t) denotes the original background without any moving objects, and BI(t) denotes the background obtained from the video sequence over a short period. Computing the frame differences between the current frame BC(t) and each of BP(t) and BI(t) yields the pure foreground FP(t) and the instant foreground FI(t). Following the rules in Section 2, we extract the un-moving objects in the current frame from FP(t) and FI(t) and obtain the current un-moving object map St(t). Second, the process accumulates a value for each pixel over the successive maps St(t) to obtain the un-moving object aggregated map S; this step removes objects that remain for only a short period. When enough pixels of an object in S accumulate beyond the threshold, we extract the object from S and save it. Moreover, the extracted objects are filtered according to the features Shape, Compactness, and Area. The final process is the most important issue in this paper: the Life-cycle State Measurement (LCSM). The concept of the LCSM comes from software engineering, where it originally describes the life cycle of software; this paper applies the idea to abandoned object detection, so that an abandoned object takes a different state in each situation. In this paper, the states include the growing state, the stable state, the aging state, and the dead state. We assign the proper state to each abandoned object in situations such as occlusion or removal; in this way, abandoned object processing becomes more reasonable. This issue is discussed in Section 2, and Fig. 1 defines all the symbols and describes the relationships among them.

        2. ABANDONED OBJECT DETECTION SYSTEM

                  Fig. 2. System Overview

In this paper, the system consists of three modules, as shown in Fig. 2. The first module, un-moving object detection, is composed of foreground detection, the un-moving object decision, and real un-moving object extraction by aggregation. The second module is the abandoned object decision; its function is to cluster the image pixels of un-moving objects from the un-moving object aggregated map received from module 1. The clustered un-moving objects are then filtered by the object feature filtering method, and the abandoned objects are decided. The final module, the life cycle of the abandoned object, decides the persistent time of an abandoned object event according to a finite state machine.

According to this definition of the modules, the three main corresponding technologies are presented in the following.
2.1 The un-moving object decision

As discussed in the Introduction, detecting abandoned objects by tracking objects can cause performance problems, because most of the computation is spent tracking moving objects. In [1][3], double backgrounds are proposed to remove moving objects and retain un-moving objects; this raises performance compared with tracking methods.

2.1.1. Background updating

From the video sequence, we extract frames as the update sources of the two background models BP(t) and BI(t) at different intervals, as shown in Fig. 3.

   Fig. 3. Pure and Instant sampling rate illustration

BP(t) is kept as a pure image without any moving objects over a long period, while BI(t) is an image captured at a short-period sampling rate; objects are expected to appear in BI(t) when an abandonment event happens. The current frame at time t is denoted by BC(t). After the background model BP(t) is estimated, we can easily obtain a foreground map FP(t), which includes both moving and un-moving objects, by computing the frame difference between BP(t) and BC(t). To extract the un-moving objects from the foreground map FP(t), we compute another foreground map FI(t) that includes only moving objects, as shown in Fig. 4; FI(t) is obtained from the frame difference between BI(t) and BC(t). If an un-moving object stays at a position, the value at the object's position in the map FP(t) is 1 and the value in the map FI(t) is 0, so this processing can easily extract the currently un-moving objects. According to our experiments, the sampling period of BP(t) is about 25 frames and that of BI(t) is about 15 frames.

Because of illumination variance, it is difficult to update BP(t). In general, BP(t) is updated from BC(t). However, it is hard to keep the background pure when updating from BC(t), because BC(t) may contain moving or un-moving objects, which are easily updated into BP(t). For that reason, to keep BP(t) pure, we select the foreground map FP(t) as a mask and then linearly combine the masked pixels of BP(t) and BC(t); this avoids updating objects into BP(t) while remaining adaptive to illumination changes. In practice, we cannot guarantee the accuracy of FP(t), so we cannot completely prevent noise from being updated into BP(t). Such noise increases quickly when the update frequency is raised, whereas when the update frequency of BP(t) is lower, the ability to adapt to illumination changes declines. Therefore, in this paper, we propose three updating rules, applied at different times, to adapt the background model BP(t). The background is not updated at every frame; the update rate of BP(t) is as defined in the previous paragraph, and the updating rules are defined as follows:

a) The foreground map FP(t) is used as a mask to select the pixels that need to be updated. The selected pixels of BP(t) are linearly combined with BC(t) for updating.

b) If the number of pixels in the un-moving object aggregated map S exceeds an assigned threshold, it means that noise is accumulating or the lighting is changing. In that condition, BP(t) is replaced by BC(t).

c) If there are no moving objects in BC(t) for a long duration and no abandoned objects are detected, BP(t) is replaced by BC(t).

In the second rule, the un-moving object aggregated map S is a map that accumulates the possibility of each pixel belonging to an abandoned object. The details of the map S are described in the next subsection (Sec. 2.1.2).

2.1.2. Un-moving object decision

Following the last section, we first compute the frame difference between BC(t) and BP(t), and between BC(t) and BI(t). The frame difference method is image subtraction between two images. The resulting difference maps FP(t) and FI(t) are shown in Fig. 4.

    Fig. 4. Pure and Instant foreground illustration
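To make Section 2.1 concrete, the following is a minimal NumPy sketch of the frame differencing, the masked update rule (a), and the maps of Algo. 1 and Algo. 2 (Sec. 2.1.2 and 2.1.3). It assumes the convention that a foreground map takes value 1 where the frame differs from the corresponding background; the threshold, the blend weight, and the Iv/Dv values are illustrative assumptions, not values fixed by the paper.

```python
import numpy as np

# Illustrative parameters (assumptions, to be tuned per scene):
DIFF_T = 30        # frame-difference threshold
ALPHA = 0.05       # linear-combination weight for updating B_P (rule a)
IV, DV = 3, 1      # aggregation increment / decrement (Iv, Dv)

def foreground(background, frame, thresh=DIFF_T):
    """Frame difference: 1 where the frame differs from the background."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > thresh).astype(np.uint8)

def update_pure_background(b_p, b_c, f_p, alpha=ALPHA):
    """Rule (a): blend B_P toward B_C only where F_P marks background,
    so objects are not absorbed into the pure background."""
    out = b_p.astype(np.float32)
    mask = (f_p == 0)
    out[mask] = (1 - alpha) * out[mask] + alpha * b_c[mask]
    return out.astype(np.uint8)

def unmoving_map(f_p, f_i):
    """Algo. 1: a pixel is un-moving when it differs from the pure
    background (F_P = 1) but is already absorbed into the instant
    background (F_I = 0)."""
    return ((f_p == 1) & (f_i == 0)).astype(np.uint8)

def aggregate(s, s_t, iv=IV, dv=DV):
    """Algo. 2: raise S where S_t marks an un-moving pixel and decay it
    elsewhere, so brief stays never reach the detection threshold."""
    return np.clip(s + np.where(s_t == 1, iv, -dv), 0, 255)
```

A brief stay decays back to zero under `aggregate`, while a persistent object accumulates Iv per frame until it crosses whatever threshold the deployment assigns.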
If an object stays in the environment, the value at the same position (x, y) satisfies FP(t)(x, y) = 1 and FI(t)(x, y) = 0. The algorithm is shown in Algo. 1.

In the algorithm, FP(t) and FI(t) are binary maps whose pixel values are 0 or 1. When the pixel value FP(t)(x, y) equals 1 and the pixel value FI(t)(x, y) equals 0, the pixel has begun to stop and stay in the current frame. Therefore, we can extract the un-moving object map St, as shown in Fig. 5.

Input: FP(t), FI(t)    Output: St(t)
For each position (x, y) inside the maps FP(t) and FI(t)
    If (FP(t)(x, y) = 1 & FI(t)(x, y) = 0)
        Set St(t)(x, y) = 1
    Else
        Set St(t)(x, y) = 0

  Algo. 1. The algorithm for obtaining the un-moving object map

Fig. 5. Un-moving object map illustration: only the un-moving object is
       presented. The right map is St.

2.1.3. Real Un-moving Object extraction

St contains the currently un-moving objects. However, it cannot indicate whether they are real un-moving objects, because they could just be a person staying for a few seconds or an object placed temporarily. Therefore, St must be accumulated frame by frame into an un-moving object aggregated map S, from which the real un-moving objects are extracted. The map S is defined by Algo. 2.

Input: St(t), S, Iv, Dv    Output: S
For each position (x, y) inside St(t) and S
    If (St(t)(x, y) = 1)
        S(x, y) = S(x, y) + Iv
    Else
        S(x, y) = S(x, y) - Dv

            Algo. 2. The algorithm for updating S

For each pixel, if St(t)(x, y) equals 1, then the value at the same position (x, y) of S is increased by a constant value (Iv); otherwise, it is decreased by a constant value (Dv). Therefore, if an object stays for a long period, the values at the object's positions in map S increase continuously until enough pixel values exceed the threshold we assign; at that time, those pixels can be regarded as parts of un-moving objects. The advantage of this method is that it prevents temporary objects, which stay for only a few seconds, from being regarded as abandoned objects. Moreover, when too many pixels are marked at once, the situation may indicate that too much noise entered BP(t) after updating, or that the lighting changed too much; this information is fed back as one of the update-timing conditions of BP(t).

  Fig. 6. Un-moving object aggregated map illustration

2.2 Abandoned object decision

The method of [2] uses human patterns to recognize humans and abandoned objects, but human patterns alone are not enough. In normal situations, an abandoned object is usually a piece of luggage or a briefcase, and a bomb is usually packed in a box with a regular shape. Therefore, we can focus only on objects with regular shapes and discard the rest using object feature filtering.

2.2.1. Abandoned object clustering

From the un-moving object aggregated map S obtained in Section 2.1, we extract each object using a clustering method. The extracted un-moving objects may include noise, staying humans, luggage, and so on. In general, we care about regularly shaped objects such as baggage, briefcases, and boxes; therefore, we focus on those kinds of objects based on three specific features.

2.2.2. Object feature filtering

Following Subsection 2.2.1, we filter each object based on object features. In this paper, we use three object features. The first feature is Area; the goal of the Area constraint is to filter out objects that are too large or too small. The Area feature is shown in (1):

Area = size(Object)                                (1)

where size(Object) is the number of pixels of the object.
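As a sketch of this subsection's three-feature filter, the code below implements the Area feature in (1) together with the Shape and Compactness features defined next in (2) and (3), for a binary object mask. The perimeter argument (e.g. from a contour tracer) and all the bounds in the final filter are illustrative assumptions.

```python
import numpy as np

def area(mask):
    """Eq. (1): the number of object pixels."""
    return int(mask.sum())

def shape(mask, perimeter):
    """Eq. (2): 4*pi*Area / Perimeter^2, close to 1 for a disk and
    smaller for irregular outlines."""
    return 4.0 * np.pi * area(mask) / float(perimeter) ** 2

def compactness(mask):
    """Eq. (3): mean fraction of each object pixel's eight neighbours
    that are also object pixels (zero-padded border)."""
    padded = np.pad(mask, 1)
    neigh = sum(np.roll(np.roll(padded, dy, 0), dx, 1)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0))[1:-1, 1:-1]
    return float(neigh[mask == 1].sum()) / (8.0 * area(mask))

def is_abandoned_candidate(mask, perimeter,
                           a_min=50, a_max=5000,
                           shape_min=0.5, comp_min=0.5):
    """Keep only compact, regularly shaped blobs in a plausible size
    range (all bounds are assumed, to be tuned per scene)."""
    return (a_min <= area(mask) <= a_max
            and shape(mask, perimeter) >= shape_min
            and compactness(mask) >= comp_min)
```

For a filled 3 x 3 square with perimeter 12, Shape evaluates to pi/4, matching the expectation that regular blobs score higher than dispersed ones.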
The second feature is Shape, the state of the object's appearance. The function is shown in (2):

Shape = 4π · size(Object) / Perimeter^2            (2)

where Perimeter is the total edge length of the object, which reflects the object's outline. When the object's shape is irregular, Shape is smaller; when the shape is regular, Shape is larger (for a perfect disk, Shape = 4π · πr^2 / (2πr)^2 = 1). Generally speaking, this feature is good for filtering out human shapes.

The last feature is Compactness. If the object is more dispersed, the value of Compactness is smaller. The function is shown in (3):

Compactness = ( Σ_{i ∈ Object} n_i ) / ( size(Object) · 8 )    (3)

where n_i is, for each object pixel i, the number of its eight neighboring pixels that also belong to the object.

An object extracted from S must satisfy all three feature constraints before it is finally considered an abandoned object. Therefore, most objects can easily be filtered out by these features.

2.3 The Life-cycle State Measurement

The advantage of our method is that we do not track all objects, so the efficiency is better. However, the object features are easily affected by illumination changes and similar disturbances. Moreover, occlusion is a harsh problem: when an object is occluded, the abandoned object could be discarded because its Area feature is reduced. When an object is occluded for only a short period, the system should keep it instead of discarding it. Therefore, the set that stores abandoned objects should have a temporal register. In other words, when we obtain objects from S in each frame, we compute a relationship between those objects and the set of abandoned objects detected before, including discarded abandoned objects. The relationship indicates whether the present object set (objecti(t)) is connected to the previously detected object set (objectj(t-n), 1<n<t-1). This processing ensures that each abandoned object has stayed for a period of time.

The relationship is computed as follows. For each objecti(t), its center position is compared with the abandoned object set that we already have (objectj(t-n), 1<n<t-1). If the center position of objecti(t) is located inside the bounding box of one of the abandoned objects in the set, then objectj(t-n) and objecti(t) are related, and objectj(t-n) is said to have relationship satisfaction. Otherwise, objecti(t) could be a new object, a removed object, or an occlusion, which is called relationship dissatisfaction; if it is a new object, objecti(t) is added into Set(objectj(t-n), 1<n<t-1). Through this processing, we can decide the relation and state of each object in Set(objectj(t-n), 1<n<t-1). When an abandoned object is identified, we do not launch an alarm to the user immediately; instead, we make the decision based on the state of the current abandoned object. Following the definition of the software life cycle from software engineering, we apply the idea to our study. The Life-cycle State Measurement (LCSM) includes four states: the growing state, the stable state, the aging state, and the dead state. Fig. 7 shows the finite state machine of the LCSM.

          Fig. 7. The Finite State Machine of LCSM

An abandoned object recorded in Set(objectj(t-n), 1<n<t-1) is initially assigned the growing state. When the growing time is finished, the state changes into the stable state or the aging state; when the aging time runs out, the state can change into the dead state.

The algorithm for each state is illustrated below. The symbols used in the illustration are defined in Table 1.

     Table 1. The definitions of symbols in LCSM
symbol        illustration
ObjState      The life-cycle state of the current object
Newobj        A Boolean indicating whether the object is a
              new abandoned object or not
GrowingT      A Boolean indicating whether the growing time
              (GrowingTime) is finished or not. GrowingTime
              is the duration from the Growing State to the
              Stable State
AgingT        A Boolean indicating whether the aging time
              (AgingTime) is
              finished or not. AgingTime is the duration
              from the Aging State to the Dead State
ObjFeature    A Boolean indicating whether the features of
              the abandoned object satisfy the conditions
              assigned by the user or not
ObjRelation   A Boolean indicating whether relationship
              satisfaction holds after the relationship
              computation or not

Growing State: When a new abandoned object is created, we give it a time buffer (GrowingTime) to grow, to avoid a false alarm caused by erroneous detection due to occlusion or an incomplete area. Therefore, this state is used from the beginning until the GrowingTime is finished.

ObjState = Growing State
    if Newobj = True & GrowingT = False &
      ObjFeature = True & ObjRelation = True           (4)

Stable State: The stable state means that the GrowingTime is finished, the object's features are satisfied, and relationship satisfaction holds; relationship satisfaction means that the object has not been removed or occluded. If the state of an abandoned object changes to the stable state, the system launches an alarm to the user, and the object is boxed and fed back to the user.

ObjState = Stable State
  if GrowingT = True & ObjFeature = True &
   ObjRelation = True                                  (5)

Aging State: The aging state happens when the relationship or the feature conditions are not satisfied, which usually stands for occlusion or removal of the object. At that time, the state changes to the aging state. We also give the object a time buffer (AgingTime) to age; if the conditions of the stable state are met again within the AgingTime, the state returns to the stable state; otherwise, the state finally changes to the dead state.

  ObjState = Aging State
   if GrowingT = True & AgingT = False &
     (ObjFeature = False | ObjRelation = False)        (6)

Dead State: When the state changes into the dead state, the object is ignored and deleted from Set(objectj(t-n), 1<n<t-1) within a few minutes.

     ObjState = Dead State
      if AgingT = True                                 (7)

If the state is the growing state but the conditions of the growing state are not all satisfied, the GrowingT timer is stopped and the state changes into a special situation called the unstable state. When the unstable state continues for a long period of time, the object is finally killed by itself. Therefore, through the LCSM processing, we can avoid some unstable situations and make the detection more reasonable.

                3. EXPERIMENT RESULTS

We test ten videos whose dimensions are 352 × 288 or 320 × 240, divided into four types; each type has the same background, so the parameters within a type are kept the same, to show that the same parameters can be used for the same background. When abandoned objects are detected, the system draws a bounding box on them even under occlusion; when objects are removed, the alarm is still retained for a period of time.

Fig. 8. Abandoned object detection results: when the abandoned
object is occluded, the bounding box still keeps it; when the
object is removed, the bounding box is also kept for a duration.

The top pictures in Fig. 8 show that the system can still select the object for a while when the object is occluded; the other pictures show that the system can still select the object for a while after the object is taken away.

In testing, we use Sensitivity and Specificity to verify our results. Every test video contains at least one abandoned object; an object must stay for over 3 seconds to be considered abandoned, and shorter stays are regarded as non-abandoned-object events. The test videos include 10 abandoned objects in total. Table 2 gives the definitions of True positive (TP), False positive (FP), False negative (FN), and True negative (TN), and Table 3 lists the results on the 10 test videos.
                                                              696
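Sensitivity and specificity are computed from the TP/FP/FN/TN counts defined in Table 2; as a quick arithmetic check against the reported figures (a simple sketch, with the counts taken from Table 3):

```python
# Quick arithmetic check of the reported figures, using the TP/FP/FN/TN
# counts from Table 3 (TP=9, FP=4, FN=1, TN=50).
def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (fp + tn)

print(round(100 * sensitivity(9, 1), 1))    # -> 90.0
print(round(100 * specificity(50, 4), 1))   # -> 92.6
```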
Table 2 defines True Positive (TP), False Positive (FP), False Negative (FN), and True Negative (TN), and Table 3 gives the results over the 10 test videos.

       Table 2. Definition of TP, FP, FN, and TN
TP    Abandoned objects are detected correctly
FP    Non-abandoned objects are detected as abandoned objects
TN    Non-abandoned objects are detected as non-abandoned objects
FN    Abandoned objects are detected as non-abandoned objects

The 10 test videos come from popular databases, and the results show that the proposed methods are effective for abandoned object detection on two counts: high accuracy and low computing cost. The sensitivity is 90% and the specificity is 92.6%, which demonstrates the high accuracy of the proposed methods. The average processing speed is around 30 fps, and a real-time test using an IP camera runs at about 25 fps, which demonstrates the low computing cost and shows that the methods can run in real time.

       Table 3. Result of Sensitivity and Specificity
                     Actual Positive     Actual Negative
Detected Positive    TP = 9              FP = 4
Detected Negative    FN = 1              TN = 50

Sensitivity = TP / (TP + FN) = 90.0%
Specificity = TN / (FP + TN) = 92.6%

                 4. CONCLUSION

In this paper, reasonable results are obtained by applying foreground analysis, feature filtering, and the LCSM mechanism. However, the techniques are not flawless. For example, the updated BP(t) still contains noise over long periods of time, even though a mechanism is proposed to replace BP(t); therefore, missed abandoned object detections cannot be entirely avoided. The next problem concerns feature filtering. In normal situations, feature filtering can separate humans from objects, but it can make false decisions when foreground detection is incomplete, or when a person's foreground looks like a rectangular, static object. In the future, we will make this abandoned object detection technology more reliable and useful in video surveillance.

                  REFERENCES

[1] F. Porikli, "Detection of Temporarily Static Regions by Processing Video at Different Frame Rates," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2007), Sept. 2007.

[2] W.-Y. Chen, M.-F. Ho, C.-L. Huang, S. T. Lee, and C. Cheng, "Detecting Abandoned Objects in Video-Surveillance System," in The 21st IPPR Conference on Computer Vision, Graphics, and Image Processing (CVGIP 2008), 2008.

[3] A. Singh, S. Sawan, M. Hanmandlu, V. K. Madasu, and B. C. Lovell, "An Abandoned Object Detection System Based on Dual Background Segmentation," in Proceedings of the 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, 2009, pp. 352-357.

[4] J. Wang and W. Ooi, "Detecting Static Objects in Busy Scenes," Technical Report TR99-1730, Department of Computer Science, Cornell University, February 1999.

[5] M. Bhargava, C.-C. Chen, M. S. Ryoo, and J. K. Aggarwal, "Detection of Abandoned Objects in Crowded Environments," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, 2007, pp. 271-276.

[6] R. Mathew, Z. Yu, and J. Zhang, "Detecting New Stable Objects in Surveillance Video," in Proceedings of the IEEE 7th Workshop on Multimedia Signal Processing, 2005, pp. 1-4.

[7] F. Porikli, Y. Ivanov, and T. Haga, "Robust Abandoned Object Detection Using Dual Foregrounds," EURASIP Journal on Advances in Signal Processing, vol. 2008, 2008.
HIERARCHICAL METHOD FOR FOREGROUND DETECTION USING
                     CODEBOOK MODEL
                        Jing-Ming Guo (郭景明), Member, IEEE and Chih-Sheng Hsu (徐誌笙)

                                             Department of Electrical Engineering
                                    National Taiwan University of Science and Technology
                                                      Taipei, Taiwan
                                     E-mail: jmguo@seed.net.tw, seraph1220@gmail.com

                    ABSTRACT
This paper presents a hierarchical scheme with block-based and pixel-based codebooks for foreground detection. The codebook is mainly used to compress information to achieve a highly efficient processing speed. In the block-based stage, 12 intensity values are employed to represent a block. The algorithm extends the concept of Block Truncation Coding (BTC), and thus further improves processing efficiency by enjoying its low-complexity advantage. In detail, the block-based stage can remove most of the noise without reducing the True Positive (TP) rate, yet it has low precision. To overcome this problem, the pixel-based stage is adopted to enhance the precision, which also reduces the False Positive (FP) rate. Moreover, short-term information is employed to improve background updating for adaptive environments. As documented in the experimental results, the proposed algorithm provides performance superior to that of the former related approaches.

Keywords- Background subtraction; foreground detection; shadow detection; visual surveillance; BTC

1. INTRODUCTION
In visual surveillance, background subtraction is an important issue for extracting foreground objects for further analysis, such as human motion analysis. A challenging problem for background subtraction is that backgrounds are usually non-stationary in practice, e.g., waving trees, rippling water, and lighting changes. Another difficult problem is that the foreground generally suffers from shadow interference, which leads to wrong analysis of foreground objects. Hence, the background model is highly demanded to be adaptively manipulated via background maintenance. In [1], some of the well-known issues in background maintenance are introduced.
   To overcome shadows, some well-known methods can be adopted, such as the RGB model, the HSV model, gradient information, and ratio edge. In particular, Horprasert et al. [2] proposed employing a statistical RGB color model to remove shadows. However, it suffers from some drawbacks, including: 1) more processing time is required to compute thresholds; 2) the non-stationary background problem cannot be solved; and 3) a fixed threshold near the origin is used, which offers less flexibility. Another RGB color model, proposed by Carmona et al. [18], can solve the third problem of [2], yet it needs too many parameters for its color model. In [3] and [4], the HSV color model is employed to detect shadows. Shadows are defined by a diminution of the luminance and saturation values when the hue variation is smaller than a predefined threshold parameter. In [5] and [6], gradient information is employed to detect shadows, which achieves good results. Yet, multiple steps are required for removing shadows, which increases the complexity. Zhang et al. [24] proposed a ratio edge method to detect shadows, and geometric heuristics were used to improve the performance. However, the main problem of this scheme is its high complexity.
   Most foreground detection methods are pixel-based, and one of the most popular is the MOG. Stauffer and Grimson [7], [8] proposed the MOG, using multiple Gaussian distributions to represent each pixel in background modeling. The advantage is overcoming non-stationary backgrounds, which provides better adaptation for background modeling. Yet it has some drawbacks. One of them is the standard deviation (SD): if the SD is too small, a pixel may easily be judged as foreground, and vice versa. Another drawback is that it cannot remove shadows, since the matching criterion simply classifies a pixel as background when it is within 2.5 times the SD. Chen et al. [9] proposed a hierarchical method with MOG; the method also employs a block- and pixel-based strategy, yet shadows cannot be removed with their method. Martel-Brisson and Zaccarin [10] presented a novel pixel-based statistical approach to model moving cast shadows of non-uniform and intensity-varying objects. This approach employs MOG's learning strategy to build statistical models describing moving cast shadows, yet the model requires more time for learning. Benedek and Sziranyi [23] chose the CIE L*u*v* space to detect foregrounds or shadows by MOG, and texture features are employed to enhance the segmentation results. The main problem of this scheme is its low processing speed.
   Kim et al. [11] presented a real-time algorithm for foreground detection which samples background pixel values and then quantizes them into codebooks. This approach improves processing speed by compressing background information. Moreover, two features, layered modeling/detection and adaptive codebook updating, are presented to further improve the algorithm. In [12] and [13], the concepts of Kohonen networks and Self-Organizing Maps (SOMs) [14] were proposed to build the background model. The background model can automatically adapt in a self-organizing manner and without prior knowledge. Patwardhan et al. [15] proposed robust foreground detection by propagating layers using maximum-likelihood assignment; pixels that share similar statistics are clustered into "layers" and modeled as a union of such nonparametric layer models. The pixel-layer manner of foreground detection requires more time for processing, at around 10 frames per second on a standard laptop
computer. In our observation, classifying each pixel to represent various types of features after the background training period is a good manner of building an adaptive background model. Also, it can overcome the non-stationary problem in background classification.
   Another class of foreground detection methods is texture-based, in which Heikkila and Pietikainen [16] presented an efficient texture-based method using adaptive local binary pattern (LBP) histograms to model the background of each pixel. The LBP method employs circular neighboring pixels to label the thresholded difference between the neighboring pixels and the center pixel. The results are considered as a binary number which can fully represent the texture of a pattern.
   In this study, a hierarchical method is proposed for background subtraction using both block- and pixel-based stages to model the background. The block-based strategy comes from the traditional compression scheme BTC [17], which divides an image into non-overlapping blocks, with each pixel in a block substituted by a high mean or a low mean. The BTC algorithm employs only two distinct intensity values to represent a block. In this paper, however, four intensity values are employed to represent a block, and each pixel in a block is substituted by the high-top mean, high-bottom mean, low-top mean, or low-bottom mean. The block-based background modeling can efficiently detect foreground without reducing TP, yet its precision is rather low. To overcome this problem, the pixel-based codebook strategy is involved to compress background information, simultaneously maintaining the high-speed advantage and enhancing the accuracy. Moreover, a color model modified from the former approach [18] is used to distinguish shadow, highlight, background, and foreground. The modified structure simplifies the used parameters and thus improves the efficiency. As documented in the experimental results, the proposed method can effectively solve the non-stationary background problem. One specific problem for background subtraction is that a moving object becomes stationary foreground when it stands still for a while during the period of background construction. Consequently, this object would become part of the background model. For this, short-term information is employed to solve this problem in background model construction.
   The paper is organized as follows. Section 2 presents the initial background model in the background training period, which includes the block-based and pixel-based codebooks. Section 3 reports background subtraction by the proposed hierarchical scheme. Section 4 introduces the short-term information within the background model. Section 5 documents experimental results in terms of accuracy and efficiency, and compares with the former MOG [7], Rita's method [4], CB [11], Chen's method [9], and Chiu's method [22]. Section 6 draws conclusions.

2. INITIAL BACKGROUND MODEL
In this study, two types of codebooks are constructed for block-based and pixel-based background modeling. The proposed background modeling is similar to CB [11]. The advantage of CB is its high efficiency in background model building. In our observation, CB employs more information to build the background, while the proposed method adopts the concept of MOG [7] by simply using weights to classify foreground and background, and thus provides an even higher efficiency advantage; the precision is also higher than that of CB. Another difference between the proposed method and CB is that two stages, namely the block-based and pixel-based stages, are involved in background model construction, while only one stage is used in CB. In the block-based stage, multiple neighboring pixels are classified as a unit, while a pixel is the basic unit in the pixel-based stage. Figure 1 shows the structure of the background model, which is composed of block-based and pixel-based stages. The details are introduced in the following sub-sections.

Fig. 1. Structure of background construction model (the background model consists of a block-based part and a pixel-based part).

2.1 Block feature in block-based stage
The block feature used in this study is extended from the BTC algorithm, which maintains the first and second moments in a block. Although BTC is a highly efficient coding scheme, we further reduce its complexity by modifying the corresponding high mean and low mean. Moreover, we extend the BTC algorithm by using four intensity values to represent a block to increase the recognition confidence: each pixel in a block is substituted by the High-top mean (Ht), High-bottom mean (Hb), Low-top mean (Lt), or Low-bottom mean (Lb). Suppose an image is divided into non-overlapping blocks, and each block is of size M x N. Let x1, x2, ..., xm be the pixel values in a block, where m = M x N. The average value of a block is

   x_bar = (1/m) * sum_{i=1..m} x_i                                  (1)

The high mean Hm and low mean Lm are defined as

   Hm = sum_{i=1..m}(x_i | x_i >= x_bar) / q,
   Lm = sum_{i=1..m}(x_i | x_i < x_bar) / (m - q)                    (2)

where q denotes the number of pixels equal to or greater than x_bar. Notably, if q is equal to m or 0, then all the values in the block are forced to be identical to x_bar; in this case, Ht, Hb, Lt, and Lb are assigned with x_bar. Otherwise, three thresholds (x_bar, Hm, and Lm) are employed to distinguish the four intensity values Ht, Hb, Lt, and Lb, as defined below:

   Ht = sum_{i=1..m}(x_i | x_i >= Hm) / p,
   Hb = sum_{i=1..m}(x_i | x_bar <= x_i < Hm) / (q - p)              (3)

   Lt = sum_{i=1..m}(x_i | Lm <= x_i < x_bar) / (m - q - k),
   Lb = sum_{i=1..m}(x_i | x_i < Lm) / k                             (4)
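Eqs. (1)-(4) can be read as a straightforward per-block computation. The following sketch (illustrative only; plain Python over a flattened block) mirrors the definitions, including the degenerate fallbacks described in the text:

```python
# Illustrative implementation of the four-mean block feature, Eqs. (1)-(4).
# q = number of pixels >= the block mean, p = number >= Hm, k = number < Lm;
# degenerate cases fall back to the mean, Hm, or Lm as described in the text.
def four_means(block):
    m = len(block)
    mean = sum(block) / m                          # Eq. (1)
    high = [x for x in block if x >= mean]
    low = [x for x in block if x < mean]
    q = len(high)
    if q in (0, m):                                # all pixels identical
        return mean, mean, mean, mean
    hm = sum(high) / q                             # Eq. (2), high mean
    lm = sum(low) / (m - q)                        # Eq. (2), low mean
    top = [x for x in high if x >= hm]
    p = len(top)
    if p in (0, q):
        ht = hb = hm
    else:                                          # Eq. (3)
        ht = sum(top) / p
        hb = (sum(high) - sum(top)) / (q - p)
    bottom = [x for x in low if x < lm]
    k = len(bottom)
    if k in (0, m - q):
        lt = lb = lm
    else:                                          # Eq. (4)
        lb = sum(bottom) / k
        lt = (sum(low) - sum(bottom)) / (m - q - k)
    return ht, hb, lt, lb
```

Applied per color channel, the four returned means form the 12-dimensional Vblock vector described next.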
where p denotes the number of pixels equal to or greater than Hm. If p is equal to q or 0, then both Ht and Hb are assigned a value equal to Hm. The variable k denotes the number of pixels smaller than Lm. If k is equal to (m - q) or 0, then both Lt and Lb are assigned a value equal to Lm. In RGB color spaces, each color channel of a divided block is transformed to yield a set of Ht, Hb, Lt, and Lb. Thus, a block is represented by Vblock = (RHt, GHt, BHt, RHb, GHb, BHb, RLt, GLt, BLt, RLb, GLb, BLb).
   The reason the proposed block feature provides performance superior to the former schemes is that, unlike traditional BTC, the codeword size for a block is increased from six to twelve to better characterize the texture of the block for block-based background reconstruction. Moreover, the BTC-based strategy can significantly reduce the complexity to adapt to real-time applications. Compared with the former hierarchical method of Chen [9], in which texture information is employed to form a 48-dimensional feature, the proposed method can effectively classify foreground and background using only 12 dimensions. Moreover, the processing speed is superior to that of Chen's method.

2.2 Initial background model for block-based codebook
In the block-based stage, an image is divided into non-overlapping blocks, and each block constructs its own codebook. N training frames are used to build the block-based codebook; thus each codebook of a block has N block vectors for training the background model. Let X be a training sequence for a block consisting of N block vectors: X = {xblock_1, xblock_2, ..., xblock_N}. Let C = (c1, c2, ..., cL) represent the codebook for a block consisting of L codewords. Each block has a different codebook size based on the codewords' weights. Each codeword ci, i = 1, ..., L, consists of a block vector vblock_i = (RHt_i, GHt_i, BHt_i, RHb_i, GHb_i, BHb_i, RLt_i, GLt_i, BLt_i, RLb_i, GLb_i, BLb_i) and a weight wi.
   In the training phase, an input block vector xblock is compared with each codeword in the codebook. If no match is found or there is no codeword in the codebook, the input vector is created as a new codeword in the codebook. Otherwise, the matched codeword is updated and its weight value is increased. To determine which codeword is the best-matched candidate, the match function introduced in sub-section 2.4 is employed. The detailed algorithm is given below.

   Algorithm for block-based codebook construction
Step 1: L <- 0, C <- empty set
Step 2: for t = 1 to N do
   I.   xblock_t = (RHt_t, GHt_t, BHt_t, RHb_t, GHb_t, BHb_t,
        RLt_t, GLt_t, BLt_t, RLb_t, GLb_t, BLb_t)
   II.  find the codeword cm in C = {ci | 1 <= i <= L} matching xblock_t
        based on: Matching_function(xblock_t, vblock_m) = true
   III. If C is empty or there is no match, then L <- L + 1.
        Create a new codeword cL by setting:
           vblock_L <- xblock_t
           wL <- 1/N
   IV.  Otherwise, update the matched codeword cm, consisting of
        vblock_m and wm, by setting:
           vblock_m <- (1 - alpha) * vblock_m + alpha * xblock_t     (5)
           wm <- wm + 1/N
   end for
Step 3: select background codewords in the codebook:
   I.   Sort the codewords in descending order according to their weights
   II.  B = argmin_b ( sum_{k=1..b} wk > T )                         (6)

where alpha denotes the learning rate, empirically set at 0.05 in this study. Step 3 demarcates the background in the same way as MOG [7]. A codeword with a bigger weight has a higher likelihood of being a background codeword in the background codebook. The codewords are sorted in descending order according to their weights, and the codewords meeting Eq. (6) are selected as the background codebook, where T denotes an empirical threshold with value 0.8.

2.3 Initial background model for pixel-based codebook
The algorithm for codebook construction in the pixel-based stage is similar to the block-based stage, with the basic unit changed from a block to a pixel. Let X be a training sequence for a pixel consisting of N RGB vectors: X = (xpixel_1, xpixel_2, ..., xpixel_N). Let F = (f1, f2, ..., fL) be the codebook for a pixel consisting of L codewords. Each pixel has a different codebook size based on the codewords' weights. Each codeword fi, i = 1, ..., L, consists of a pixel vector vpixel_i = (Ri, Gi, Bi) and a weight wi.
   In step 2(II), the codeword fm matching xpixel_t is found based on match_function(xpixel_t, vpixel_m), which will be introduced in Section 2.4. In step 2(III), if F is empty or there is no match, a new codeword fL is created by assigning xpixel_t to vpixel_L. Otherwise, fm is updated by assigning (1 - alpha) * vpixel_m + alpha * xpixel_t to vpixel_m. In step 3, the parameters alpha and T are identical to those of the block-based stage.
   The proposed block-based and pixel-based procedures are used to establish the background model, which is similar to CB [11]. The main difference is that CB employs more information to build the background, while the proposed method adopts the concept of MOG [7] by simply using weights to classify foreground and background, and thus provides a higher efficiency advantage; the precision is also higher than that of CB [11].

2.4 Match function
The match function for n dimensions employed in this study, in terms of squared distance, is given below:

   d^T d / N < sigma^2                                               (7)
where d = (lambda * I)^(-1) (x - v), and the empirical value of the standard deviation sigma is between 2.5 and 5, with 2.5 as a tight bound and 5 as a loose bound. The identity matrix I is of size N x N, where N = 12 and 3 in the block-based and pixel-based stages, respectively. The match function can be applied in n dimensions; the proposed block vector vblock in the block-based stage has 12 dimensions, and the pixel vector vpixel has 3 dimensions. A match is found when a sample falls within lambda = 2.5 standard deviations of the mean of one of the codewords. The output of the match function is as below:

   match_function(x, v) = { true,  if d^T d / N < sigma^2;
                            false, otherwise }                       (8)

In the pixel-based phase, the color model is exploited to classify a pixel only when no match is found. This strategy can significantly improve the efficiency.

3. FOREGROUND DETECTION
The proposed foreground detection stage is also divided into block-based and pixel-based stages. In the block-based stage, the match function introduced in Section 2.4 is employed to distinguish background from foreground. If a block is classified as background, it is fed to pixel-based background model updating to adapt to the current environment conditions. Yet this raises a disadvantage by increasing the processing time for foreground detection. For this, the threshold T_update is used to enable the updating phase, which means the updating is conducted every T_update frames. Empirically, T_update is set at 2~5 to guarantee the adaptation of the background model. Using the color model function, which will be introduced in Section 3.5, and the match function, the current frame can be classified into four states: background, foreground, highlight, and shadow. Figure 2 shows the proposed foreground detection flow chart, which is detailed in the following sub-sections.

3.1 Foreground detection with block-based stage

Fig. 2. Flow chart for foreground detection (input sequence -> foreground detection with block-based and pixel-based stages -> background, foreground, shadow, and highlight labels; background results update the pixel-based model, and short-term information whose weight exceeds T_add is inserted into the block-based and pixel-based background models).

3.2 Pixel-based background updating model
To adapt to the current environment conditions, when a block is classified as background in the block-based stage, the corresponding pixel-based background model needs to be updated. Yet this raises a disadvantage by increasing the processing time for foreground detection. For this, the threshold T_update is used to enable the updating phase, which means the updating is conducted every T_update frames. Empirically, T_update is set at 2~5 to guarantee the adaptation of the background model. Meanwhile, the match function is used to find the matched codeword for updating. The details of the algorithm are organized as below.
Block-based stage is employed to separate background
and foreground. Although the block-based stage has low                 Algorithm for pixel-based background model updating
precision, it can ensure the detected foreground without             Step 1: xpixel=(R,G,B)
reducing TP rate when σ is set with a small value as a               Step 2: if the accumulated time is equal to T_update, then
tight bound. However, a small σ increases FP rate as well.           do
Therefore, there is a trade-off in choosing the value of σ.                1)      for all codewords in B in Eq. (6), find the
Herein, the empirical value is set at 2.5 in this work.                            codeword fm matching to xpixel based on :
                                                                                        Match_function(xpixle, vpixel_m)=true
 Algorithm for background subtraction using block-based                            Update         the       matched           codeword as
                            codebook                                               v pixel _ m  (1   )v pixel _ m   x pixel
Step 1: xblock=(RHt, GHt, BHt, RHb, GHb, BHb, RLt, GLt, BLt,
       RLb, GLb, BLb)
Step2: for all codewords in B in Eq. (6), find the                   3.3 Foreground detection with pixel-based stage
       codeword cm matching to xblock based on :                     If a pixel is classified as foreground in block-based, then
           Match_function(xblock, vblock_m)=true
                                                                     input pixel xpixel=(R,G,B) proceeds to pixel-based stage to
       Update the matched codeword as in Eq. (5)                     determine the state of a pixel. Algorithm for pixel-based
                                                                     background subtraction is similar to block-based. The
                         Foreground if there is no match;
               block )  
Step 3: BS(x                                                         only difference is on the match function. Herein, the color
                         Background otherwise.                      model and match function are used to determine a pixel
                                                                     vector belongs to shadow, highlight, background, or



                                                               701
foreground. The detailed algorithm is organized below.                         between 2 and 3.5.
                                                                               I _ max   v , I _ min   v
 Algorithm for background subtraction using pixel-based                                                                              (11)
                                                                               Where β>1 and γ<1. In our experiments, β is set in
                             codebook
Step 1: xpixel=(R,G,B)                                                         between 1.1 and 1.25, and γ is in between 0.7 and 0.85.
Step 2: for all codewords in B in Eq. (6), find the                            The range [I_max, I_min] is used for measuring comp_I;
        codeword fm matching to xpixel based on :                              if comp_I is not in this range, the pixel is classified as
                                                                               foreground. The overall color model is organized as
          s  color _ mod el _ function ( x pixel , v pixel )
                                                                               below:
          If s is classified as background, then do
            v pixelm  (1   )v pixelm   x pixel                          Color_model_function(x, v) =
                                                                               Background     if match_func tion (x, v)  true;
3.4 Color model                                                                                      proj_I
                                                                               Highlight      else            tan θ & v  comp _ I  I_max;
In [18], the proposed color model can classify a RGB                                               comp_I
color pixel into shadow, highlight, background, and                            
                                                                               Shadow                proj_I
foreground. However, many parameters are employed in                                           else             tan θ & I_min  comp _ I  v ;
                                                                                                    comp _ I
this model, which leads to a disadvantage by increasing                        
the computational complexity. In this work, the number of                      Foreground     otherwise.
parameters is reduced to three, namely θ, β, and γ, to                                                                                  (12)
reduce the complexity. Figure 3 shows the modified color                       4. BACKGROUND MODEL UPDATING WITH
model.                                                                         SHORT TERM INFORMATION
                     G                                                         As indicated in Fig. 2, the background model updating
                                                                               with the short term information is divided to two stages:
                              I_max
                                                                               First, construct short term information model with
                                                                               foreground region; second, if a codeword in short term
                                                                               information model accumulates enough weights, this
                          _I
                         mp




                                      Vi
                                                                               codeword will be inserted to background model for
                         Co




                                             Proj_I
                     I_min                                                     foreground detection. This strategy yields an advantage: A
                                           Xi (input pixel)
                                                                               user can control the lasting period of a stationary
                               θ                                               foreground which can be inserted to the background
                                                                               model. However, a non-stationary foreground region will
                                                              R                lead to too much unnecessary codewords in short term
             B
                                                                               information model. For this, time information is added to
                 Fig. 3. Modified color model                                  a codeword and a threshold is used to decide whether a
                                                                               codeword is reserved or deleted. In addition, identical
Given an input pixel vector x, the match function is                           strategy is applied to background model as well.
employed for measuring if it is in background state. If the                       The procedures of the short term information
vector x is classified as foreground by the match function,                    construction for block-based and pixel-based phases are
then we compare the angle, tanθ. If proj_I/comp_I is                           identical. The main concept is to add an additional model
greater than tanθ, the vector x must be foreground.                            S called the short term information model. The S records
Otherwise, the input vector x may fall within this color                       foreground regions after foreground detection. The model
model bound. Subsequently, the variables I_max and                             construction is similar to that of Section 2. Yet, herein the
I_min are calculated; if comp_I falls in between v and                         time information (S_time) is added for a codeword. In
I_max, the pixel is classified as highlight; if the pixel                      addition, three additional thresholds, S_delete, T_add, and
value is not in between I_max and I_min, the pixel is                          B_delete, are employed: S_delete is used to determine
classified as foreground.                                                      whether a codeword is reserved or deleted. If the current
       Given an input pixel vector x=(R,G,B) and                               time subtracted by the last time information of a
background vector v  ( R , G , B ) ,                                          codeword is smaller than S_delete, then the codeword is
                                                                               unnecessary in the codebook, and thus it is deleted from
 x  R2  G2  B2 , v  R 2  G 2  B 2
                                                                               the codebook. T_add is used to decide whether a
 x, v  ( RR  GG  BB )                                                       codeword is inserted to background model. If a codeword
                              x, v                                             accumulates enough weights, then this codeword can be a
comp _ I  x cos                                                 (9)         part of background model. B_delete is used to determine
                               v
                                                                               whether a codeword is reserved or deleted in background
proj _ I    x  comp _ I 2
                 2
                                                                  (10)         model when short term information is inserted to
                                                                               background model, and sets the parameter B_delete
where comp_I is used to determine a pixel vector belongs
                                                                               equals to T_add times S_delete (which is the worst case)
to shadow or highlight; where proj_I is used to measure
                                                                               to ensure reserve the last updated time for codeword in
the nearest distance with background vector v. If
                                                                               background model. The overall procedure of the
proj_I/comp_I is greater than tanθ, then the pixel is
                                                                               algorithm is organized as below.
classified as foreground. Herein, θ is empirically set in



                                                                         702
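To make the classification rule concrete, the match test of Eq. (8) and the color model of Eqs. (9)-(12) can be sketched in Python as below. The function names, the flat RGB-tuple codeword representation, the assumption that θ is given in degrees, and the concrete θ, β, γ values (chosen within the empirical ranges stated above) are illustrative assumptions, not the authors' implementation.

```python
import math

SIGMA = 2.5                              # match-function bound (Sec. 3.1)
TAN_THETA = math.tan(math.radians(3.0))  # theta in [2, 3.5]; degrees assumed
BETA, GAMMA = 1.2, 0.8                   # I_max / I_min scaling (Eq. 11)

def match_function(x, v):
    """Eq. (8): mean squared difference against the codeword mean v."""
    d2 = sum((xi - vi) ** 2 for xi, vi in zip(x, v))
    return d2 / len(x) <= SIGMA ** 2

def color_model_function(x, v):
    """Eq. (12): classify pixel x against background codeword v."""
    if match_function(x, v):
        return "background"
    norm_v = math.sqrt(sum(vi ** 2 for vi in v))
    comp_i = sum(xi * vi for xi, vi in zip(x, v)) / norm_v   # Eq. (9)
    norm_x2 = sum(xi ** 2 for xi in x)
    proj_i = math.sqrt(max(norm_x2 - comp_i ** 2, 0.0))      # Eq. (10)
    i_max, i_min = BETA * norm_v, GAMMA * norm_v             # Eq. (11)
    if comp_i > 0 and proj_i / comp_i <= TAN_THETA:
        if norm_v < comp_i <= i_max:
            return "highlight"
        if i_min <= comp_i < norm_v:
            return "shadow"
    return "foreground"
```

With v = (100, 100, 100), a uniformly brighter pixel such as (115, 115, 115) lands in the highlight band, a dimmer (85, 85, 85) in the shadow band, and a chromatically different pixel falls outside the tan θ cone and is labeled foreground.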
Algorithm for short term information model construction
Step 1: Given a background model B with the initial background model, create a new model S for recording foreground regions.
Step 2: Add a time information parameter (B_time) to every codeword in B for recording the current time (C_time). S is assigned an empty set.
Step 3:
    I.  Find a matched codeword in B for an input image. A "match" is determined when a codeword is found during the codeword updating in Eq. (5) and its B_time is equal to C_time.
    II. If no matched codeword is found in B, then search for the matched codeword in S for the foreground region, and do the following steps:
        i.   Find the codeword s_m in S = { s_i | 1 ≤ i ≤ L } matching to the input vector x based on the matching function.
        ii.  If S is empty or there is no match, then L ← L + 1 and create a new codeword s_L by setting:
                 v_L = x,  w_L = 1,  S_time_L = C_time
        iii. Otherwise, update the matched codeword s_m, consisting of v_m, w_m, and S_time_m, by setting:
                 v_m ← (1 − α)·v_m + α·x
                 w_m ← w_m + 1
                 S_time_m ← C_time
Step 4: S ← { s_m | (C_time − S_time_m) < S_delete }
Step 5: Check the weight of every codeword in S. If the weight of a codeword is greater than T_add, then do the following steps:
    I.  B ← { c_m | (C_time − B_time_m) < B_delete }
    II. Add the codeword as short term information at the head of B.
Step 6: Repeat the algorithm from Step 3 to Step 5.

5. EXPERIMENTAL RESULTS
For measuring the accuracy of the results, the criteria FP rate, TP rate, Precision, and Similarity [12] are employed, as defined below:

    FP rate = fp / (fp + tn),   TP rate = tp / (tp + fn),
    Precision = tp / (tp + fp),   Similarity = tp / (tp + fp + fn),

where tp, tn, fp, and fn denote the numbers of true positives, true negatives, false positives, and false negatives, respectively; (tp + fn) indicates the total number of pixels present in the foreground, and (fp + tn) indicates the total number of pixels present in the background. Our experimental results are obtained without any post-processing or short term information, for measuring the accuracy of the results.
    Figure 4 shows the test sequences [19] of size 320x240 with IR (row 1), Campus (row 2) and Highway_I (row 3). To provide a better understanding of the detected results, three colors, red, green, and blue, are employed to represent shadows, highlight, and foreground, respectively. Figure 4(b) shows the detected results using the block-based stage with blocks of size 10x10, in which most of the noise can be removed. Figure 4(c) shows the results obtained by the hierarchical block-based and pixel-based stages. Apparently, the pixel-based stage can significantly enhance the detection precision. Yet, we would like to point out a weakness of the proposed method. As can be seen in the third row of Fig. 4 (Highway_I), when the color of the shadow is dark, it will be classified as foreground. Since a lower threshold is set for the color model of the proposed method, when the value exceeds the threshold it will be classified as shadow. The problem can be eased by increasing the threshold; yet, as can be seen in Fig. 4, some of the foreground is then classified as shadow. In summary, the proposed method performs well for shadows of small intensity, yet it cannot provide perfect performance for shadows of greater intensity.

[Fig. 4 images: (a) (b) (c)]
Fig. 4. Classified results of sequence [19] for IR (row 1), Campus (row 2) and Highway_I (row 3) with shadow (red), highlight (green), and foreground (blue). (a) Original image, (b) block-based stage only with blocks of size 10x10, and (c) proposed method.

    Figure 5 shows the test sequence WT [21] of a non-stationary background with a waving tree, containing 287 frames of size 160x120. Compared with the five former methods, MOG [7], Rita's method [4], CB [11], Chen's method [9], and Chiu's method [22], the proposed method can provide better performance in handling non-stationary backgrounds. Moreover, Figure 5 shows the detected results with different block sizes using simply the block-based codebook. Apparently, most noise is removed without reducing the TP rate. Most importantly, the processing speed is highly efficient with the block-based strategy; yet, low precision is its drawback. To overcome this problem, the pixel-based stage is involved to enhance the precision, which can also reduce the FP rate. Figure 5 also shows the detected results using the proposed hierarchical scheme (block-based stage and pixel-based stage) with various block sizes. Figures 6(a)-(d) show the accuracy values, FP rate, TP rate, Precision, and Similarity,
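As a rough illustration of Steps 3-5 of the short term information algorithm, the following Python sketch maintains a short term model S, prunes stale codewords with S_delete, and promotes persistent codewords into the background model B via T_add. The Codeword class, the α value, the concrete threshold values, and the removal of a promoted codeword from S are assumptions made for this sketch, not the authors' implementation.

```python
from dataclasses import dataclass

ALPHA = 0.05                 # codeword learning rate (assumed value)
S_DELETE = 30                # frames a short-term codeword may stay unmatched
T_ADD = 100                  # weight needed before promotion to background
B_DELETE = T_ADD * S_DELETE  # worst-case bound for B, as in Sec. 4

@dataclass
class Codeword:
    v: tuple      # mean color vector
    w: int = 1    # accumulated weight
    time: int = 0 # last update time (S_time / B_time)

def update_short_term(S, B, x, c_time, match):
    """One pass of Steps 3-5 for a single foreground pixel x."""
    # Step 3-II: update a matching short-term codeword, or create one.
    for s in S:
        if match(x, s.v):
            s.v = tuple((1 - ALPHA) * vi + ALPHA * xi for vi, xi in zip(s.v, x))
            s.w += 1
            s.time = c_time
            break
    else:
        S.append(Codeword(v=x, time=c_time))
    # Step 4: drop short-term codewords unmatched for S_DELETE frames.
    S[:] = [s for s in S if c_time - s.time < S_DELETE]
    # Step 5: promote heavy codewords to the head of B, pruning stale B
    # entries first (removal from S after promotion is a sketch assumption).
    for s in list(S):
        if s.w > T_ADD:
            B[:] = [c for c in B if c_time - c.time < B_DELETE]
            B.insert(0, s)
            S.remove(s)
    return S, B
```

Feeding the same pixel for more than T_add consecutive frames promotes its codeword into B, which is how a stationary foreground (e.g. an abandoned object) eventually becomes background.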
CVGIP 2010 Part 2
Dr. rustom kanga latest trends in video analytics for railwaysimadhammoud
 
Fingerprint Recognition Technique(PDF)
Fingerprint Recognition Technique(PDF)Fingerprint Recognition Technique(PDF)
Fingerprint Recognition Technique(PDF)Sandeep Kumar Panda
 

Andere mochten auch (6)

SECURITYHIGHEND CSPL
SECURITYHIGHEND CSPLSECURITYHIGHEND CSPL
SECURITYHIGHEND CSPL
 
Final PPT
Final PPTFinal PPT
Final PPT
 
Fingerprint detection
Fingerprint detectionFingerprint detection
Fingerprint detection
 
Dr. rustom kanga latest trends in video analytics for railways
Dr. rustom kanga latest trends in video analytics for railwaysDr. rustom kanga latest trends in video analytics for railways
Dr. rustom kanga latest trends in video analytics for railways
 
Fingerprint Recognition Technique(PDF)
Fingerprint Recognition Technique(PDF)Fingerprint Recognition Technique(PDF)
Fingerprint Recognition Technique(PDF)
 
Fingerprint recognition
Fingerprint recognitionFingerprint recognition
Fingerprint recognition
 

Ähnlich wie CVGIP 2010 Part 2

Analysis of Human Behavior Based On Centroid and Treading Track
Analysis of Human Behavior Based On Centroid and Treading  TrackAnalysis of Human Behavior Based On Centroid and Treading  Track
Analysis of Human Behavior Based On Centroid and Treading TrackIJMER
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN
A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN
A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN cscpconf
 
A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...
A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...
A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...sipij
 
Object Detection and Tracking AI Robot
Object Detection and Tracking AI RobotObject Detection and Tracking AI Robot
Object Detection and Tracking AI RobotIRJET Journal
 
IRJET- Comparative Analysis of Video Processing Object Detection
IRJET- Comparative Analysis of Video Processing Object DetectionIRJET- Comparative Analysis of Video Processing Object Detection
IRJET- Comparative Analysis of Video Processing Object DetectionIRJET Journal
 
Moving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulinkMoving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulinkeSAT Publishing House
 
A Robust Method for Moving Object Detection Using Modified Statistical Mean M...
A Robust Method for Moving Object Detection Using Modified Statistical Mean M...A Robust Method for Moving Object Detection Using Modified Statistical Mean M...
A Robust Method for Moving Object Detection Using Modified Statistical Mean M...ijait
 
A Critical Survey on Detection of Object and Tracking of Object With differen...
A Critical Survey on Detection of Object and Tracking of Object With differen...A Critical Survey on Detection of Object and Tracking of Object With differen...
A Critical Survey on Detection of Object and Tracking of Object With differen...Editor IJMTER
 
IRJET - Direct Me-Nevigation for Blind People
IRJET -  	  Direct Me-Nevigation for Blind PeopleIRJET -  	  Direct Me-Nevigation for Blind People
IRJET - Direct Me-Nevigation for Blind PeopleIRJET Journal
 
Detection and Tracking of Objects: A Detailed Study
Detection and Tracking of Objects: A Detailed StudyDetection and Tracking of Objects: A Detailed Study
Detection and Tracking of Objects: A Detailed StudyIJEACS
 
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEMDEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEMIJMIT JOURNAL
 
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...Editor IJCATR
 
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...cscpconf
 
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operatorProposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operatorQUESTJOURNAL
 
Real time object tracking and learning using template matching
Real time object tracking and learning using template matchingReal time object tracking and learning using template matching
Real time object tracking and learning using template matchingeSAT Publishing House
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 

Ähnlich wie CVGIP 2010 Part 2 (20)

Analysis of Human Behavior Based On Centroid and Treading Track
Analysis of Human Behavior Based On Centroid and Treading  TrackAnalysis of Human Behavior Based On Centroid and Treading  Track
Analysis of Human Behavior Based On Centroid and Treading Track
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN
A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN
A Novel Background Subtraction Algorithm for Person Tracking Based on K-NN
 
A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...
A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...
A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...
 
Object Detection and Tracking AI Robot
Object Detection and Tracking AI RobotObject Detection and Tracking AI Robot
Object Detection and Tracking AI Robot
 
[IJET-V1I3P20] Authors:Prof. D.S.Patil, Miss. R.B.Khanderay, Prof.Teena Padvi.
[IJET-V1I3P20] Authors:Prof. D.S.Patil, Miss. R.B.Khanderay, Prof.Teena Padvi.[IJET-V1I3P20] Authors:Prof. D.S.Patil, Miss. R.B.Khanderay, Prof.Teena Padvi.
[IJET-V1I3P20] Authors:Prof. D.S.Patil, Miss. R.B.Khanderay, Prof.Teena Padvi.
 
IRJET- Comparative Analysis of Video Processing Object Detection
IRJET- Comparative Analysis of Video Processing Object DetectionIRJET- Comparative Analysis of Video Processing Object Detection
IRJET- Comparative Analysis of Video Processing Object Detection
 
Moving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulinkMoving object detection using background subtraction algorithm using simulink
Moving object detection using background subtraction algorithm using simulink
 
A Robust Method for Moving Object Detection Using Modified Statistical Mean M...
A Robust Method for Moving Object Detection Using Modified Statistical Mean M...A Robust Method for Moving Object Detection Using Modified Statistical Mean M...
A Robust Method for Moving Object Detection Using Modified Statistical Mean M...
 
A Critical Survey on Detection of Object and Tracking of Object With differen...
A Critical Survey on Detection of Object and Tracking of Object With differen...A Critical Survey on Detection of Object and Tracking of Object With differen...
A Critical Survey on Detection of Object and Tracking of Object With differen...
 
IRJET - Direct Me-Nevigation for Blind People
IRJET -  	  Direct Me-Nevigation for Blind PeopleIRJET -  	  Direct Me-Nevigation for Blind People
IRJET - Direct Me-Nevigation for Blind People
 
Detection and Tracking of Objects: A Detailed Study
Detection and Tracking of Objects: A Detailed StudyDetection and Tracking of Objects: A Detailed Study
Detection and Tracking of Objects: A Detailed Study
 
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEMDEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
 
D018112429
D018112429D018112429
D018112429
 
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
 
Csit3916
Csit3916Csit3916
Csit3916
 
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
 
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operatorProposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
 
Real time object tracking and learning using template matching
Real time object tracking and learning using template matchingReal time object tracking and learning using template matching
Real time object tracking and learning using template matching
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 

Kürzlich hochgeladen

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 

Kürzlich hochgeladen (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

CVGIP 2010 Part 2

ABSTRACT

In public areas, objects may be abandoned through careless forgetting or for the purposes of a terrorist attack. If those abandoned objects can be detected automatically by a video surveillance system, forgotten objects can be returned to their owners and terrorist attacks can be stopped. In this paper, we propose an automatic abandoned object detection algorithm to satisfy this requirement. The algorithm includes a double-background framework, which generates two background models at different sampling rates. The framework rapidly extracts un-moving object candidates, and abandoned objects are then extracted from those candidates according to their appearance features. Finally, this paper proposes a finite state machine, the Life-cycle State Measurement (LCSM) module, to prevent missed detections caused by occlusion or illumination changes. When an abandonment event happens, the LCSM module can launch or stop an alarm at the event's starting or ending time. To evaluate performance, we tested the algorithm on 10 videos. The experimental results show that the algorithm is feasible, since both the false alarm rate and the missing rate are very low.

Keywords: Pure background; Instant background; un-moving object aggregated map; LCSM

1. INTRODUCTION

Video surveillance has become an important issue today. As technology has progressed, cameras have been set up in many places, including public areas, building interiors, and even public roadways. However, because of insufficient human resources, we cannot have a person monitor every camera at all times. In this study, we propose a new abandoned object detection technology, which receives the video stream from a camera and then detects abandoned objects within a few minutes. The technology can be used in applications such as dangerous abandoned object detection and abandoned luggage detection for passengers, and the system also lowers personnel costs.

[1] provides a two-background framework (long-term and short-term). The framework uses a pair of backgrounds with different characteristics to segment the related foregrounds and extract abandoned objects. Our background model building technology follows [1] and tries to improve it. The advantage of the method in [1] is that it does not track all objects, so it saves the computation spent on tracking. However, the method is not perfect: the temporal rate of background construction is quite important. In [1], the long-term and short-term backgrounds are updated periodically; if the long-term background is updated before the abandoned object is detected, detection fails and the result is affected. In this study, we try to find the optimal update timing to reduce the risk of detection failure caused by long-term background updating. The details are described in Section 2.

[2] provides a two-level background framework that uses linear combination; it belongs to the single-background building methods and tracks each moving object. The framework uses optical flow to detect moving objects, because the optical flow of an object changes while the object is moving, which makes it easy to distinguish moving objects from static ones. Static objects and stationary people are separated by a human pattern recognition method. However, this method still has limits for filtering: it is hard to account for every shape a stationary person may take, so enough human pattern templates are required. As the number of templates grows, the recognition rate increases, but the computational cost rises as well; it is therefore unsuitable when performance is the priority. In Section 2, we propose a method that filters objects by object feature filtering to avoid the performance problem caused by the human pattern method.
[3] provides a two-background framework (current and buffered background) that tracks all objects and records each object's information to determine whether it is occluded. Its advantage is that it stays locked on a target even when the target is occluded. In Section 2, we refine the idea from [3] and present the theory of the Life-cycle State Measurement (LCSM), which makes the handling of each abandoned object in different environments more convincing and the detection results more reasonable.

2. ABANDONED OBJECT DETECTION SYSTEM

An abandoned object detection method based on object tracking could be feasible. However, tracking methods have low efficiency when too many objects are tracked, since they track not only the abandoned object but all other objects as well, and the computation is heavy. Therefore, we use a new method based on different sampling rates instead of tracking-based technology to avoid this problem. In this paper, we divide the system framework into three technological processes. First, we obtain the pure background, called BP(t), and the instant background, called BI(t), at different sampling rates at time t; BP(t) is the original background without any moving objects, and BI(t) is the background received from the video sequence over a short period. By computing the frame differences between the current frame BC(t) and BP(t) and BI(t), we get the pure foreground, called FP(t), and the instant foreground, called FI(t). Following the rules in Section 2, we extract the un-moving objects in the current frame according to FP(t) and FI(t), obtaining the current un-moving object map St(t). Second, the processing aggregates a value for each pixel over the maps St(t) to obtain the un-moving object aggregated map S; this step gets rid of objects that remain for only a short period. When some of the objects in S are aggregated over the threshold, we draw those objects out of S and save them. Moreover, those objects are filtered by feature filtering, including shape, compactness, and area. The final process is the most important issue in this paper: the Life-cycle State Measurement, called LCSM. The concept of LCSM comes from software engineering, where it originally describes the life cycle of software; this paper applies the idea to abandoned object detection, so that an abandoned object takes a different state in each situation. In this paper, the states include the growing state, the stable state, the aging state, and the dead state. We assign the proper state to each abandoned object in situations such as occlusion or removal; in this way, abandoned object processing becomes more reasonable. This issue is discussed in Section 2, and Fig. 1 shows the definitions of all the symbols and their relationships.

Fig. 1. Symbol definitions and flow

Fig. 2. System Overview

In this paper, the system consists of three modules, as shown in Fig. 2. The first module, un-moving object detection, is composed of foreground detection, un-moving object decision, and real un-moving object extraction by aggregated processing. The second module is the abandoned object decision; the function of this module is to cluster the image pixels of un-moving objects from the un-moving object aggregated map received from module 1. Moreover, those un-moving objects are filtered by the object feature filtering method and the abandoned objects are decided. The final module, the life cycle of the abandoned object, decides the persistent time of an abandoned object event according to a finite state machine. According to the definition of the modules, the three main corresponding technologies are presented in the following.
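The four LCSM states named above can be viewed as a small finite state machine. The sketch below is only illustrative: the state names follow the paper, but the event names and transition rules (`confirmed`, `occluded`, `removed`, `timeout`) are our own assumptions, since the paper's exact transition conditions are given later.

```python
# Illustrative sketch of the LCSM finite state machine. State names follow
# the paper (growing, stable, aging, dead); the events and transition rules
# here are hypothetical placeholders, not the paper's exact conditions.

GROWING, STABLE, AGING, DEAD = "growing", "stable", "aging", "dead"

# event -> {current state -> next state}; unmatched events leave the state as-is
TRANSITIONS = {
    "confirmed":  {GROWING: STABLE},                 # aggregation passed threshold
    "occluded":   {STABLE: AGING},                   # object temporarily hidden
    "reappeared": {AGING: STABLE},                   # occlusion ended, alarm persists
    "removed":    {STABLE: AGING, AGING: DEAD},      # object taken away
    "timeout":    {AGING: DEAD},                     # aged out, stop the alarm
}

class AbandonedObject:
    def __init__(self):
        self.state = GROWING

    def handle(self, event):
        # Stay in the current state if the event does not apply to it.
        self.state = TRANSITIONS.get(event, {}).get(self.state, self.state)
        return self.state

obj = AbandonedObject()
assert obj.handle("confirmed") == STABLE    # candidate becomes an alarm
assert obj.handle("occluded") == AGING      # occlusion does not kill the event
assert obj.handle("reappeared") == STABLE   # alarm survives the occlusion
assert obj.handle("removed") == AGING
assert obj.handle("timeout") == DEAD        # alarm stops
```

Modeling occlusion as a transition to an aging state, rather than as a deletion, is what lets the alarm survive a person briefly walking in front of the object.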
2.1 The un-moving object decision

According to the discussion in the Introduction, abandoned object detection by object tracking methods can cause performance problems, because most of the computation is spent tracking moving objects. In [1][3], double backgrounds are proposed to remove moving objects and retain un-moving objects. With this method, the performance is better than that of tracking methods.

2.1.1. Background updating

In the video sequence, we extract frames as the update sources of the two background models BP(t) and BI(t) at different intervals, as shown in Fig. 3.

Fig. 3. Pure and instant sampling rate illustration

BP(t) is kept as a pure image without any moving objects over a long period, and BI(t) is an image captured at a sampling rate with a short period. Objects are expected to appear in BI(t) when an abandoned object event happens. The current frame at time t is denoted by BC(t). After the background model BP(t) is estimated, we can easily obtain a foreground map FP(t), including both moving and un-moving objects, by computing the frame difference between BP(t) and BC(t). To extract the un-moving objects from the foreground map FP(t), we extract another foreground map FI(t), which includes only moving objects, as shown in Fig. 4. The foreground map FI(t) is obtained by computing the frame difference between BI(t) and BC(t). If an un-moving object stays at a position, the value at the object position in the map FP(t) should be 1 and the value at the object position in the map FI(t) should be 0. Therefore, this process can easily extract the un-moving objects at present. In our experiments, the sampling period of BP(t) is about 25 frames, and that of BI(t) is about 15 frames.

Because of illumination variance, it is difficult to update BP(t). In general, BP(t) can be updated with pixels sourced from BC(t). However, it is hard to keep the background pure using BC(t), because BC(t) can contain moving or un-moving objects, and those objects are easily updated into BP(t). For that reason, to keep BP(t) pure, we select the foreground map FP(t) as a mask, and then use linear combination to blend the masked pixels of BP(t) and BC(t). This avoids updating objects into BP(t) while remaining suitable for illumination changes. In practice, we cannot guarantee the accuracy of FP(t), and we cannot prevent noise from being updated into BP(t); such noise increases quickly when the update frequency is raised. However, when the update frequency of BP(t) is lowered, the ability to adapt to illumination changes declines. Therefore, in this paper, we propose three updating rules, applied at different timings, to adapt the background model BP(t). The background is not updated for every frame; the update rate of BP(t) is defined as in the previous paragraph, and the updating rules are defined as follows:

a) The foreground map FP(t) is used as a mask to select the pixels that need to be updated. The selected pixels in BP(t) are linearly combined with BC(t) for updating.

b) If the number of pixels in the un-moving object aggregated map S exceeds an assigned threshold, it means that noise is increasing or the lighting is changing. In that condition, BP(t) is replaced by BC(t).

c) If there are no moving objects in BC(t) for a long duration and no abandoned objects are detected, BP(t) is replaced by BC(t).

In the second rule, the un-moving object aggregated map S accumulates the possibility that each pixel belongs to an abandoned object. The details of the map S are described in the next subsection (Sec. 2.1.2).

2.1.2. Un-moving object decision

Following the last section, we first compute the frame difference between BC(t) and BP(t), and between BC(t) and BI(t). The frame difference method is image subtraction between two images. The difference results FP(t) and FI(t) are shown in Fig. 4.

Fig. 4. Pure and instant foreground illustration
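The three updating rules for BP(t) can be sketched as below. This is an assumption-laden illustration: the blending weight, the noise threshold, and the idle-scene flags are hypothetical parameters, and the interpretation that rule (a) blends only non-foreground pixels (so objects stay out of BP(t)) is our reading of the masking step.

```python
import numpy as np

def update_pure_background(BP, BC, FP, S, alpha=0.05, noise_thresh=5000,
                           scene_idle=False, abandoned_active=False):
    """Apply the three updating rules (a)-(c) for the pure background BP(t)."""
    out = BP.astype(np.float32)
    # Rule (a): blend only pixels outside the detected foreground (FP == 0),
    # so that objects are not updated into BP(t).
    mask = (FP == 0)
    out[mask] = (1 - alpha) * out[mask] + alpha * BC.astype(np.float32)[mask]
    # Rule (b): too many aggregated un-moving pixels means noise or a lighting
    # change; replace BP(t) with the current frame.
    if np.count_nonzero(S > 0) > noise_thresh:
        out = BC.astype(np.float32)
    # Rule (c): no moving objects for a long duration and no abandoned objects
    # pending; also replace BP(t) with the current frame.
    if scene_idle and not abandoned_active:
        out = BC.astype(np.float32)
    return out.astype(np.uint8)
```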
If an object stays in the environment, the values at the same position (x, y) satisfy FP(t)(x, y) = 1 and FI(t)(x, y) = 0. The algorithm is shown in Algo. 1.

In the algorithm, FP(t) and FI(t) are binary maps whose pixel values are 0 or 1. When the pixel value of FP(t)(x, y) is equal to 1 and the pixel value of FI(t)(x, y) is equal to 0, this pixel has begun to stop and stay in the current frame. Therefore, we can extract the un-moving object map St, as shown in Fig. 5.

Input: FP(t), FI(t)
Output: St(t)
For each position (x, y) inside the maps FP(t) and FI(t)
  If (FP(t)(x, y) = 1 & FI(t)(x, y) = 0)
    Set St(t)(x, y) = 1
  Else
    Set St(t)(x, y) = 0

Algo. 1. The algorithm for obtaining the un-moving object map

Fig. 5. Un-moving object map illustration: only the un-moving objects are presented. The right map is St.

2.1.3. Real un-moving object extraction

St includes the current un-moving objects. However, it cannot indicate that they are real un-moving objects, because those objects could just be a person staying for a few seconds or objects placed temporarily. Therefore, St must be accumulated over each frame continuously to form an un-moving object aggregated map S for extracting the real un-moving objects. The map S is defined in Algo. 2.

Input: St(t), S, Iv, Dv
Output: S
For each position (x, y) inside St(t) & S
  If (St(t)(x, y) = 1)
    S(x, y) = S(x, y) + Iv
  Else
    S(x, y) = S(x, y) - Dv

Algo. 2. The algorithm for the aggregated map S

For each pixel, if St(t)(x, y) is equal to 1, the value at the same position (x, y) of S increases by a constant value (Iv); otherwise, it decreases by a constant value (Dv). Therefore, if objects stay for a long period, the values at the object positions in map S increase continuously until enough pixel values exceed the thresholds we assign; at that time, those pixels can be regarded as parts of un-moving objects. The advantage of this method is that it prevents temporary objects, which stay for only a few seconds, from being regarded as abandoned objects. Moreover, when the number of pixels whose value is equal to 1 is too large, the situation may imply that too much noise entered BP(t) after updating, or that the lighting changed too much. This information can be fed back as an update-timing condition for BP(t).

Fig. 6. Un-moving object aggregated map illustration

2.2 Abandoned object decision

The method of [2] uses a human pattern to distinguish humans from abandoned objects, but the human pattern alone is not enough. In a normal situation, an abandoned object is usually a piece of luggage or a briefcase, and a bomb is usually packaged in a box with a regular shape. Therefore, we can focus only on objects that have a regular shape, discarding the rest using object feature filtering.

2.2.1. Abandoned object clustering

According to the un-moving object aggregated map S obtained in Section 2.1, we extract each object from S using a clustering method. Those un-moving objects may include noise, staying humans, luggage, and so on. Generally, we care about objects such as baggage with a regular shape, briefcases, and boxes; therefore, we focus on those kinds of objects based on three special features.

2.2.2. Object feature filtering

According to the need from Subsection 2.2.1, we filter each object based on object features. In this paper, we use three object features. The first object feature is Area. The goal of the Area constraint is to filter out objects that are too large or too small. The Area feature is shown in Eq. (1):

Area = size(Object)   (1)
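Taken together with the Shape and Compactness features of Eqs. (2)-(3), the three-feature filter can be sketched on a binary object mask as follows. The perimeter estimate (counting object pixels that touch the background) is an assumption, since the paper does not state how the edge length is measured.

```python
import numpy as np

def object_features(mask):
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())                         # Eq. (1): Area
    if area == 0:
        return 0, 0.0, 0.0
    padded = np.pad(mask, 1).astype(np.int32)
    # n_i: number of the 8 neighbours of each pixel that also lie in the object.
    neigh = sum(np.roll(np.roll(padded, dy, axis=0), dx, axis=1)
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0))[1:-1, 1:-1]
    # Perimeter approximated as the count of object pixels with an empty neighbour.
    perimeter = int((mask & (neigh < 8)).sum())
    shape = 4 * np.pi * area / perimeter ** 2      # Eq. (2): Shape
    compactness = neigh[mask].sum() / (area * 8)   # Eq. (3): Compactness
    return area, shape, compactness
```

An object would then pass the filter only when all three values fall inside user-assigned ranges, which suppresses irregular silhouettes such as people while keeping box-like objects.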
where size(Object) means the sum of the pixels of the object.

The second feature is Shape, the state of the object's appearance. The function is shown in Eq. (2):

Shape = size(Object) x 4 pi / (Perimeter)^2   (2)

where Perimeter indicates the total edge length of the object, i.e., its shape. When the shape of the object is not regular, Shape is smaller; otherwise, Shape is larger. Generally speaking, this feature is good for filtering out human shapes.

The last feature is Compactness. If the object is more dispersed, the value of Compactness is smaller. The function is shown in Eq. (3):

Compactness = ( SUM_{i in Object} n_i ) / (size(Object) x 8)   (3)

where n_i means that, for each pixel, the eight neighboring pixels are searched; whenever one of the eight pixels also belongs to the object, n_i is increased by 1.

The objects extracted from S should satisfy these three features, and only then are they finally considered abandoned objects. Therefore, we can easily filter out most objects through these features.

2.3 The Life-cycle State Measurement

The advantage of our method is that we do not track all objects, so the efficiency is better. However, the object features are easily affected by illumination changes and other factors. Moreover, occlusion is a harsh issue: when an object is occluded, the abandoned object could be discarded because its Area feature is reduced. When objects are occluded for a short period, the system should keep them instead of discarding them. Therefore, the set storing abandoned objects should act as a temporal register. In other words, when we get objects from S for each frame, those objects should be run through the relationship algorithm against the set of abandoned objects detected before and the discarded abandoned objects. The relationship defines whether the current abandoned object, object_i(t), connects to an object detected before, object_j(t-n), 1<n<t-1, or not. This processing ensures that each abandoned object has stayed for a period of time.

The relationship is computed as follows. For each object_i(t), its center position is compared with the abandoned object set that we already have, Set(object_j(t-n), 1<n<t-1). If the center position of object_i(t) is located inside the bounding box of one of the abandoned objects from the Set, there is a relationship between object_j(t-n) and object_i(t), and object_j(t-n) is said to have relationship satisfaction. Otherwise, the case could be a new object, a removed object, or an occlusion, and is called relationship dissatisfaction; if it is a new object, object_i(t) is added into the Set. Therefore, we can decide the relation and state of each object in the Set through this processing. When an abandoned object is identified, we do not launch an alarm to the user immediately; instead, we make a decision based on the state of the current abandoned object. Following the definition of the software life cycle from software engineering, we apply this idea in our study. The Life-cycle State Measurement (LCSM) includes four states: the growing state, the stable state, the aging state, and the dead state. Fig. 7 shows the finite state machine of LCSM.

Fig. 7. The finite state machine of LCSM

The beginning state of an abandoned object recorded in Set(object_j(t-n), 1<n<t-1) is assigned the growing state. When the timing of the growing state is finished, the state is changed into the stable state or the aging state. When the timing of the aging state has elapsed, the object can move into the dead state.

The following illustrates the algorithm of each state. The symbols are defined in Table 1 for convenience of illustration.

Table 1. The definitions of symbols in LCSM

ObjState: The life-cycle state of the current object.
Newobj: Boolean; indicates whether the object is a new abandoned object or not.
GrowingT: Boolean; indicates whether the growing time (GrowingTime) is finished or not. GrowingTime is the duration from the growing state to the stable state.
AgingT: Boolean; indicates whether the aging time (AgingTime) is finished or not. AgingTime is the duration from the aging state to the dead state.
ObjFeature: Boolean; indicates whether the features of the abandoned object satisfy the conditions assigned by the user or not.
ObjRelation: Boolean; indicates whether relationship satisfaction holds through the relationship computation or not.

Growing state: When a new abandoned object is created, we give it a time buffer (GrowingTime) to grow, to avoid a false alarm due to erroneous detection caused by occlusion or a non-complete area. Therefore, this state is used at the beginning, until the GrowingTime is finished.

ObjState = Growing State
if Newobj = True & GrowingT = False & ObjFeature = True & ObjRelation = True   (4)

Stable state: The stable state means the GrowingTime is finished, the object's features are satisfied, and relationship satisfaction holds. Relationship satisfaction means the object is not removed or occluded. If the state of an abandoned object changes to the stable state, the system launches an alarm to the user, and the object is boxed and fed back to the user.

ObjState = Stable State
if GrowingT = True & ObjFeature = True & ObjRelation = True   (5)

Aging state: The aging state happens when the relationship or the feature conditions are not satisfied, which usually stands for an occluded or removed object. At that time, the state is changed to the aging state. We also give the object a time buffer (AgingTime) to age; if the conditions of the stable state are met again within the AgingTime, the state returns to the stable state. Otherwise, the state finally changes to the dead state.

ObjState = Aging State
if GrowingT = True & AgingT = False & (ObjFeature = False | ObjRelation = False)   (6)

Dead state: When the state changes into the dead state, the object is ignored and deleted from Set(object_j(t-n), 1<n<t-1) within a few minutes.

ObjState = Dead State
if AgingT = True   (7)

If the state is the growing state but the conditions of the growing state are not all satisfied, the GrowingT timer is stopped and the state changes into a special situation called the unstable state. When the unstable state continues for a long period of time, the object is finally removed. Therefore, through the LCSM processing, we can avoid some unstable situations and make the detection more reasonable.

3. EXPERIMENT RESULTS

We test ten videos whose dimensions are 352 x 288 or 320 x 240. The videos are divided into four types, and each type has the same background; therefore, the parameters for each type should be identical, to show that the same parameters can be used for the same background. When abandoned objects are detected, the system draws bounding boxes on them even in occlusion situations; when objects are removed, the alarm is still retained for a period of time.

Fig. 8. Abandoned object detection results: when the abandoned object is occluded, the bounding box still keeps it; when the object is removed, the bounding box is also kept for a duration.

The top pictures in Fig. 8 show that the system can still select the object for a while when the object is occluded; the others show that the system can still select the object for a while when the object is picked up.

In testing, we use Sensitivity and Specificity to verify our results. All the actions in those test videos include at least one abandoned object; an object must stay for more than 3 seconds to be considered an abandoned object event, and shorter stays are counted as non-abandoned object events. The test videos include 10 abandoned objects.
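As a quick arithmetic check of the rates reported below, sensitivity and specificity follow directly from the confusion counts of Table 3 (TP = 9, FP = 4, FN = 1, TN = 50):

```python
# Sensitivity = TP / (TP + FN); Specificity = TN / (FP + TN).
def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (fp + tn)

print(round(100 * sensitivity(9, 1), 1))   # 90.0
print(round(100 * specificity(50, 4), 1))  # 92.6
```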
Table 2 gives the definitions of true positive (TP), false positive (FP), false negative (FN), and true negative (TN), and Table 3 lists the results over the 10 test videos.

Table 2. Definition of TP, FP, FN, TN

TP: Abandoned objects are detected correctly.
FP: Non-abandoned objects are detected as abandoned objects.
TN: Non-abandoned objects are detected as non-abandoned objects.
FN: Abandoned objects are detected as non-abandoned objects.

The 10 test videos come from popular databases, and our results show that the proposed methods for abandoned object detection are effective on two counts: high accuracy and low computing cost. The sensitivity is 90% and the specificity is 92.6%, which shows the high accuracy achieved by our methods. The average frame rate is around 30 fps, and a real-time test using an IP camera runs at about 25 fps; this shows the low computing cost and that the methods work in real time.

Table 3. Result of Sensitivity and Specificity

              Positive   Negative
Positive      TP = 9     FP = 4
Negative      FN = 1     TN = 50

Sensitivity = TP / (TP + FN) = 90.0%
Specificity = TN / (FP + TN) = 92.6%

4. CONCLUSION

In this paper, reasonable results are obtained by applying techniques of foreground analysis, feature filtering, and the LCSM mechanism. However, the techniques are not flawless. For example, the updated BP(t) still accumulates noise over a long period of time, even though a mechanism is proposed to replace BP(t); missed abandoned object detections cannot be avoided entirely. The next problem concerns feature filtering. In normal situations, feature filtering can separate humans from objects, but it can make a false decision due to non-complete foreground detection, or for people whose foregrounds look like rectangular, static objects. In the future, we will make this abandoned object detection technology more reliable and useful in video surveillance.

REFERENCES

[1] F. Porikli, "Detection of Temporarily Static Regions by Processing Video at Different Frame Rates," IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2007), 5-7 Sept. 2007.
[2] W.-Y. Chen, M.-F. Ho, C.-L. Huang, S. T. Lee, and C. Cheng, "Detecting Abandoned Objects in Video-Surveillance System," The 21st IPPR Conference on Computer Vision, Graphics, and Image Processing (CVGIP 2008).
[3] A. Singh, S. Sawan, M. Hanmandlu, V. K. Madasu, and B. C. Lovell, "An Abandoned Object Detection System Based on Dual Background Segmentation," Proceedings of the 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, pp. 352-357.
[4] J. Wang and W. Ooi, "Detecting Static Objects in Busy Scenes," Technical Report TR99-1730, Department of Computer Science, Cornell University, February 1999.
[5] M. Bhargava, C.-C. Chen, M. S. Ryoo, and J. K. Aggarwal, "Detection of Abandoned Objects in Crowded Environments," Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, 2007, pp. 271-276.
[6] R. Mathew, Z. Yu, and J. Zhang, "Detecting New Stable Objects in Surveillance Video," Proceedings of the IEEE 7th Workshop on Multimedia Signal Processing, 2005, pp. 1-4.
[7] F. Porikli, Y. Ivanov, and T. Haga, "Robust Abandoned Object Detection Using Dual Foregrounds," EURASIP Journal on Advances in Signal Processing, vol. 2008, 2008.
HIERARCHICAL METHOD FOR FOREGROUND DETECTION USING CODEBOOK MODEL

Jing-Ming Guo (郭景明), Member, IEEE and Chih-Sheng Hsu (徐誌笙)
Department of Electrical Engineering
National Taiwan University of Science and Technology
Taipei, Taiwan
E-mail: jmguo@seed.net.tw, seraph1220@gmail.com

ABSTRACT

This paper presents a hierarchical scheme with block-based and pixel-based codebooks for foreground detection. The codebook is mainly used to compress information to achieve a highly efficient processing speed. In the block-based stage, 12 intensity values are employed to represent a block. The algorithm extends the concept of Block Truncation Coding (BTC), and thus it can further improve the processing efficiency by enjoying its low-complexity advantage. In detail, the block-based stage can remove most of the noise without reducing the True Positive (TP) rate, yet it has low precision. To overcome this problem, the pixel-based stage is adopted to enhance the precision, which also reduces the False Positive (FP) rate. Moreover, short-term information is employed to improve background updating for adaptive environments. As documented in the experimental results, the proposed algorithm provides performance superior to that of the former related approaches.

Keywords- Background subtraction; foreground detection; shadow detection; visual surveillance; BTC

1. INTRODUCTION

In visual surveillance, background subtraction is an important issue for extracting foreground objects for further analysis, such as human motion analysis. A challenging problem for background subtraction is that backgrounds are usually non-stationary in practice, e.g., waving trees, rippling water, lighting changes, etc. Another difficult problem is that the foreground generally suffers from shadow interference, which leads to wrong analysis of foreground objects. Hence, the background model is highly demanded to be adaptively manipulated via background maintenance. In [1], some of the well-known issues in background maintenance are introduced.

To overcome shadows, some well-known methods can be adopted, such as the RGB model, the HSV model, gradient information, and ratio edge. In particular, Horprasert et al. [2] proposed employing a statistical RGB color model to remove shadows. However, it suffers from some drawbacks: 1) more processing time is required to compute thresholds, 2) the non-stationary background problem cannot be solved, and 3) a fixed threshold near the origin is used, which offers less flexibility. Another RGB color model, proposed by Carmona et al. [18], can solve the third problem of [2], yet it needs too many parameters for its color model. In [3] and [4], the HSV color model is employed to detect shadows; a shadow is defined by a diminution of the luminance and saturation values when the hue variation is smaller than a predefined threshold parameter. In [5] and [6], gradient information is employed to detect shadows, which achieves good results. Yet multiple steps are required for removing shadows, which increases the complexity. Zhang et al. [24] proposed a ratio edge method to detect shadows, and geometric heuristics were used to improve the performance. However, the main problem of this scheme is its high complexity.

Most foreground detection methods are pixel-based, and one of the popular methods is the MOG. Stauffer and Grimson [7], [8] proposed the MOG, which uses multiple Gaussian distributions to represent each pixel in background modeling. The advantage is overcoming non-stationary backgrounds, which provides better adaptation for background modeling. Yet it has some drawbacks. One concerns the standard deviation (SD): if the SD is too small, a pixel may easily be judged as foreground, and vice versa. Another drawback is that it cannot remove shadows, since the matching criterion simply classifies a pixel as background when it is within 2.5 times the SD. Chen et al. [9] proposed a hierarchical method with MOG; the method also employs a block- and pixel-based strategy, yet shadows cannot be removed with their method. Martel-Brisson and Zaccarin [10] presented a novel pixel-based statistical approach to model moving cast shadows of non-uniform and intensity-varying objects. This approach employs MOG's learning strategy to build statistical models for describing moving cast shadows, yet the model requires more time for learning. Benedek and Sziranyi [23] chose the CIE L*u*v* space to detect foregrounds or shadows by MOG, and texture features are employed to enhance the segmentation results. The main problem of this scheme is its low processing speed.

Kim et al. [11] presented a real-time algorithm for foreground detection which samples background pixel values and then quantizes them into codebooks. This approach can improve the processing speed by compressing background information. Moreover, two features, layered modeling/detection and adaptive codebook updating, are presented for further improving the algorithm. In [12] and [13], the concepts of Kohonen networks and Self-Organizing Maps (SOMs) [14] were proposed to build the background model. The background model can automatically adapt in a self-organizing manner and without prior knowledge. Patwardhan et al. [15] proposed robust foreground detection by propagating layers using the maximum-likelihood assignment; pixels that share similar statistics are then clustered into "layers" and modeled as a union of such nonparametric layer models. The pixel-layer manner of foreground detection requires more time for processing, at around 10 frames per second on a standard laptop
computer. In our observation, classifying each pixel to represent various types of features after the background training period is a good manner for building an adaptive background model. Also, it can overcome the non-stationary problem for background classification. Another family of foreground detection methods can be classified as texture-based, in which Heikkila and Pietikainen [16] presented an efficient texture-based method using adaptive local binary pattern (LBP) histograms to model the background of each pixel. The LBP method employs circular neighboring pixels to label the thresholded difference between the neighboring pixels and the center pixel. The results are considered as a binary number which can fully represent the texture of a pattern.

In this study, a hierarchical method is proposed for background subtraction using both block- and pixel-based stages to model the background. The block-based strategy comes from the traditional compression scheme BTC [17], which divides an image into non-overlapping blocks, and each pixel in a block is substituted by a high mean or a low mean. The BTC algorithm simply employs two distinct intensity values to represent a block. Yet, in this paper, four intensity values are employed to represent a block, and each pixel in a block is substituted by the high-top mean, high-bottom mean, low-top mean, or low-bottom mean. The block-based background modeling can efficiently detect foreground without reducing TP, yet the precision is rather low. To overcome this problem, the pixel-based codebook strategy is involved to compress background information, simultaneously maintaining the high-speed advantage and enhancing the accuracy. Moreover, a color model modified from the former approach [18] is used to distinguish shadow, highlight, background, and foreground. The modified structure simplifies the used parameters and thus improves the efficiency. As documented in the experimental results, the proposed method can effectively solve the non-stationary background problem. One specific problem for background subtraction is that a moving object becomes stationary foreground when it stands still for a while during the period of background construction. Consequently, this object shall become part of the background model. For this, short-term information is employed to solve this problem in background model construction.

The paper is organized as below. Section 2 presents the initial background model in the background training period, which includes the block-based and pixel-based codebooks. Section 3 reports background subtraction by the proposed hierarchical scheme. Section 4 introduces the short-term information with the background model. Section 5 documents experimental results, in terms of accuracy and efficiency, and compares with the former MOG [7], Rita's method [4], CB [11], Chen's method [9], and Chiu's method [22]. Section 6 draws conclusions.

2. INITIAL BACKGROUND MODEL

In this study, two types of codebooks are constructed for block-based and pixel-based background modeling. The proposed background modeling is similar to CB [11]. The advantage of CB is its high efficiency in background model building. In our observation, CB employs more information to build the background, whereas the proposed method employs the concept of MOG [7] by simply using weights to classify foreground and background, and thus can provide an even higher efficiency advantage; its precision is also higher than that of CB. Another difference between the proposed method and CB is that two stages, namely the block-based and pixel-based stages, are involved in background model construction, while only one stage is used in CB. In the block-based stage, multiple neighboring pixels are classified as a unit, while a pixel is the basic unit in the pixel-based stage. Figure 1 shows the structure of the background model, which is composed of the block-based and pixel-based stages. The details are introduced in the following sub-sections.

Fig. 1. Structure of the background construction model: the background model is composed of a block-based part and a pixel-based part.

2.1 Block feature in block-based stage

The block feature used in this study is extended from the BTC algorithm, which maintains the first and second moments in a block. Although BTC is a highly efficient coding scheme, we further reduce its complexity by modifying the corresponding high mean and low mean. Moreover, we extend the BTC algorithm by using four intensity values to represent a block to increase the recognition confidence; each pixel in a block is substituted by the High-top mean (Ht), High-bottom mean (Hb), Low-top mean (Lt), or Low-bottom mean (Lb). Suppose an image is divided into non-overlapping blocks, and each block is of size M x N. Let x1, x2, ..., xm be the pixel values in a block, where m = M x N. The average value of a block is

x_bar = (1/m) SUM_{i=1}^{m} x_i   (1)

The high mean Hm and low mean Lm are defined as

Hm = ( SUM_{i=1}^{m} (x_i | x_i >= x_bar) ) / q ,   Lm = ( SUM_{i=1}^{m} (x_i | x_i < x_bar) ) / (m - q)   (2)

where q denotes the number of pixels equal to or greater than x_bar. Notably, if q is equal to m or 0, then all the values in a block are forced to be identical to x_bar; in this case, Ht, Hb, Lt, and Lb are assigned the value x_bar. Otherwise, three thresholds (x_bar, Hm, and Lm) are employed to distinguish the four intensity values Ht, Hb, Lt, and Lb, as defined below:

Ht = ( SUM_{i=1}^{m} (x_i | x_i >= Hm) ) / p ,   Hb = ( SUM_{i=1}^{m} (x_i | x_bar <= x_i < Hm) ) / (q - p)   (3)

Lt = ( SUM_{i=1}^{m} (x_i | Lm <= x_i < x_bar) ) / (m - q - k) ,   Lb = ( SUM_{i=1}^{m} (x_i | x_i < Lm) ) / k   (4)

where p denotes the number of pixels equal to or greater
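The four-mean block feature of Eqs. (1)-(4) can be sketched for one (grayscale) block as below; the fallback when a sub-split is empty follows the degenerate cases stated in the text, and the helper names are our own.

```python
import numpy as np

def block_feature(block):
    """Four-mean summary (Ht, Hb, Lt, Lb) of one block, per Eqs. (1)-(4)."""
    x = np.asarray(block, dtype=np.float64).ravel()
    x_bar = x.mean()                              # Eq. (1)
    high, low = x[x >= x_bar], x[x < x_bar]
    if high.size == 0 or low.size == 0:           # q = m or 0: flat block
        return x_bar, x_bar, x_bar, x_bar
    Hm, Lm = high.mean(), low.mean()              # Eq. (2)
    sub = lambda part, fallback: part.mean() if part.size else fallback
    Ht = sub(x[x >= Hm], Hm)                      # Eq. (3)
    Hb = sub(x[(x >= x_bar) & (x < Hm)], Hm)
    Lt = sub(x[(x >= Lm) & (x < x_bar)], Lm)      # Eq. (4)
    Lb = sub(x[x < Lm], Lm)
    return Ht, Hb, Lt, Lb
```

Applying this per color plane yields the 12-dimensional block vector Vblock described next.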
than Hm. If p is equal to q or 0, then both Ht and Hb are assigned a value equal to Hm. The variable k denotes the number of pixels smaller than Lm. If k is equal to (m - q) or 0, then both Lt and Lb are assigned a value equal to Lm. In RGB color space, each color plane of a divided block is transformed to yield a set of Ht, Hb, Lt, and Lb. Thus, a block is represented by Vblock = (RHt, GHt, BHt, RHb, GHb, BHb, RLt, GLt, BLt, RLb, GLb, BLb).

The reason the proposed block feature provides performance superior to the former schemes is that, unlike traditional BTC, the codeword size for a block is increased from six to twelve to better characterize the texture of the block for block-based background reconstruction. Moreover, the BTC-based strategy can significantly reduce the complexity to adapt to real-time applications. Compared with the former Chen's hierarchical method [9], in which texture information is employed to form a 48-dimension feature, the proposed method can effectively classify foreground and background using only 12 dimensions. Moreover, the processing speed is superior to Chen's method.

2.2 Initial background model for block-based codebook

In the block-based stage, an image is divided into non-overlapping blocks, and each block constructs its own codebook. A training sequence of N frames is used to build the block-based codebook; thus, each block's codebook has N block vectors for training the background model. Let X be a training sequence for a block consisting of N block vectors: X = {xblock_1, xblock_2, ..., xblock_N}. Let C = (c1, c2, ..., cL) represent the codebook for a block consisting of L codewords. Each block has a different codebook size based on the codewords' weights. Each codeword ci, i = 1, ..., L, consists of a block vector vblock_i = (RHt_i, GHt_i, BHt_i, RHb_i, GHb_i, BHb_i, RLt_i, GLt_i, BLt_i, RLb_i, GLb_i, BLb_i) and a weight wi.

In the training phase, an input block vector xblock is compared with each codeword in the codebook. If no match is found, or there is no codeword in the codebook, the input codeword is created in the codebook. Otherwise, the matched codeword is updated and its weight increased. To determine which codeword is the best matched candidate, the match function introduced in Sub-section 2.4 is employed for measuring. The detailed algorithm is given below.

Algorithm for block-based codebook construction
Step 1: L <- 0, C <- empty set
Step 2: for t = 1 to N do
  I. xblock_t = (RHt_t, GHt_t, BHt_t, RHb_t, GHb_t, BHb_t, RLt_t, GLt_t, BLt_t, RLb_t, GLb_t, BLb_t)
  II. find the codeword cm in C = { ci | 1 <= i <= L } matching xblock_t based on:
      Matching_function(xblock_t, vblock_m) = true
  III. If C is empty or there is no match, then L <- L + 1. Create a new codeword cL by setting:
      vblock_L <- xblock_t
      wL <- 1/N
  IV. Otherwise, update the matched codeword cm, consisting of vblock_m and wm, by setting:
      vblock_m <- (1 - a) vblock_m + a xblock_t
      wm <- wm + 1/N   (5)
  end for
Step 3: select the background codewords in the codebook:
  I. Sort the codewords in descending order according to their weights.
  II. B = argmin_b ( SUM_{k=1}^{b} wk > T )   (6)

where a denotes the learning rate, empirically set at 0.05 in this study. Step 3 demarcates the background in the same way as MOG [7]. A codeword with a bigger weight has a higher likelihood of being a background codeword in the background codebook. The codewords are sorted in descending order according to their weights, and then the codewords meeting Eq. (6) are selected as the background codebook, where T denotes an empirical threshold with value 0.8.

2.3 Initial background model for pixel-based codebook

The algorithm for codebook construction in the pixel-based stage is similar to the block-based stage, with the basic unit changed from a block to a pixel. Let X be a training sequence for a pixel consisting of N RGB vectors: X = (xpixel_1, xpixel_2, ..., xpixel_N). Let F = (f1, f2, ..., fL) be the codebook for a pixel consisting of L codewords. Each pixel has a different codebook size based on the codewords' weights. Each codeword fi, i = 1, ..., L, consists of a pixel vector vpixel_i = (Ri, Gi, Bi) and a weight wi.

In Step 2(II), the codeword fm matching xpixel_t is found based on match_function(xpixel_t, vpixel_m), which will be introduced in Section 2.4. In Step 2(III), if F is empty or there is no match, a new codeword fL is created by assigning xpixel_t to vpixel_L. Otherwise, fm is updated by assigning (1 - a) vpixel_m + a xpixel_t to vpixel_m. In Step 3, the parameters a and T are identical to those of the block-based stage.

The proposed block-based and pixel-based procedures are used to establish the background model, which is similar to CB [11]. The main difference is that CB employs more information to build the background, whereas the proposed method employs the concept of MOG [7] by simply using weights to classify foreground and background, and thus provides a higher efficiency advantage; the precision is also higher than that of CB [11].

2.4 Match function

The match function for n dimensions employed in this study, in terms of squared distance, is given below:

d^T d / N <= lambda^2   (7)
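Steps 1-3 of the codebook construction can be sketched as follows. This is a hedged illustration: a simplified match test (mean squared distance against a fixed bound) stands in for the normalized match function of Section 2.4, and it works the same way for 12-dimensional block vectors and 3-dimensional pixel vectors.

```python
import numpy as np

def build_codebook(samples, alpha=0.05, sigma=5.0, T=0.8):
    """Train a codebook of [vector, weight] pairs and keep the background part."""
    codebook = []
    N = len(samples)
    for x in samples:
        x = np.asarray(x, dtype=np.float64)
        # Step 2(II): find a matching codeword (simplified distance test).
        m = next((c for c in codebook
                  if np.mean((x - c[0]) ** 2) <= sigma ** 2), None)
        if m is None:                              # Step 2(III): new codeword
            codebook.append([x.copy(), 1.0 / N])
        else:                                      # Step 2(IV): update, Eq. (5)
            m[0] = (1 - alpha) * m[0] + alpha * x
            m[1] += 1.0 / N
    # Step 3: heaviest codewords whose weights sum past T form the background
    # codebook, as in Eq. (6).
    codebook.sort(key=lambda c: c[1], reverse=True)
    background, acc = [], 0.0
    for c in codebook:
        background.append(c)
        acc += c[1]
        if acc > T:
            break
    return background
```

Because weights grow by 1/N per matched frame, a codeword seen in most training frames dominates the sorted list and is retained, while rare (foreground-like) codewords fall below the cumulative threshold T.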
where d = (σI)^(−1)·(x − v), and the empirical value of the standard deviation σ is between 2.5 and 5, with 2.5 a tight bound and 5 a loose bound. The identity matrix I is of size N×N, where N = 12 and N = 3 in the block-based and pixel-based stages, respectively. The match function can be applied for n dimensions: the proposed block vector v_block in the block-based stage has 12 dimensions, and the pixel vector v_pixel has 3 dimensions. A match is found when a sample falls within λ = 2.5 standard deviations of the mean of one of the codewords. The output of the match function is as below:

  match_function(x, v) = true, if d^T·d / N ≤ λ²; false, otherwise.    (8)

In the pixel-based phase, the color model is exploited to classify a pixel only when no match is found. This strategy significantly improves the efficiency.

Fig. 2. Flow chart for foreground detection. (The chart runs from the input sequence through block-based and pixel-based foreground detection; background blocks update the pixel-based background model while shadows and highlights are handled separately; short term information is constructed for foreground regions in both stages, and codewords whose weight exceeds T_add are inserted into the block-based and pixel-based background models.)

3. FOREGROUND DETECTION

The proposed foreground detection stage is also divided into block-based and pixel-based stages. In the block-based stage, the match function introduced in Section 2.4 is employed to distinguish background from foreground. If a block is classified as background, it is fed to pixel-based background model updating for adapting to the current environment conditions. Yet, this raises a disadvantage by increasing the processing time for foreground detection.
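The match test of Eqs. (7) and (8) can be sketched as below. Since (σI)^(−1) merely scales each component by 1/σ, the matrix solve collapses to a per-component division; it is written in matrix form here only to mirror the equation. The same function serves the 12-dimensional block vectors and the 3-dimensional pixel vectors.

```python
import numpy as np

SIGMA = 2.5   # empirical standard deviation (tight bound; up to 5 for a loose bound)
LAM = 2.5     # lambda: a match lies within 2.5 standard deviations of the codeword

def match_function(x, v, sigma=SIGMA, lam=LAM):
    """Eqs. (7)/(8): d = (sigma*I)^-1 (x - v); match iff d.T d / N <= lam^2.

    Works for any dimension N: N = 12 for block vectors, N = 3 for pixel vectors.
    """
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    d = np.linalg.solve(sigma * np.eye(x.size), x - v)   # (sigma I)^-1 (x - v)
    return float(d @ d) / x.size <= lam ** 2
```

With σ = λ = 2.5 a single-channel deviation of 5 still matches, while a deviation of 20 does not, which is the tight-bound behavior described above.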
For this, the threshold T_update is used to enable the updating phase, meaning that updating is conducted every T_update frames. Empirically, T_update is set at 2~5 to guarantee the adaptation of the background model. Using the color model function, which will be introduced in Section 3.4, together with the match function, the current frame can be classified into four states: background, foreground, highlight, and shadow. Figure 2 shows the proposed foreground detection flow chart, which is detailed in the following sub-sections.

3.1 Foreground detection with block-based stage

The block-based stage is employed to separate background and foreground. Although the block-based stage has low precision, it can ensure that the foreground is detected without reducing the TP rate when σ is set to a small value as a tight bound. However, a small σ increases the FP rate as well, so there is a trade-off in choosing the value of σ. Herein, the empirical value is set at 2.5 in this work.

Algorithm for background subtraction using block-based codebook
Step 1: x_block = (R_Ht, G_Ht, B_Ht, R_Hb, G_Hb, B_Hb, R_Lt, G_Lt, B_Lt, R_Lb, G_Lb, B_Lb)
Step 2: for all codewords in B in Eq. (6), find the codeword c_m matching x_block based on: match_function(x_block, v_block_m) = true, and update the matched codeword as in Eq. (5)
Step 3: BS(x_block) = Foreground, if there is no match; Background, otherwise.

3.2 Pixel-based background updating model

To adapt to the current environment conditions, when a block is classified as background in the block-based stage, the corresponding pixel-based background model needs to be updated. Yet this raises a disadvantage by increasing the processing time for foreground detection. For this, the threshold T_update is used to enable the updating phase, meaning that updating is conducted every T_update frames. Empirically, T_update is set at 2~5 to guarantee the adaptation of the background model. Meanwhile, the match function is used to find the matched codeword for updating. The details of the algorithm are organized as below.

Algorithm for pixel-based background model updating
Step 1: x_pixel = (R, G, B)
Step 2: if the accumulated time is equal to T_update, then do
  1) for all codewords in B in Eq. (6), find the codeword f_m matching x_pixel based on: match_function(x_pixel, v_pixel_m) = true
  2) update the matched codeword as v_pixel_m ← (1 − α)·v_pixel_m + α·x_pixel

3.3 Foreground detection with pixel-based stage

If a pixel is classified as foreground in the block-based stage, the input pixel x_pixel = (R, G, B) proceeds to the pixel-based stage to determine its state. The algorithm for pixel-based background subtraction is similar to the block-based one; the only difference is in the match function. Herein, the color model and the match function are used to determine whether a pixel vector belongs to shadow, highlight, background, or
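The block-based subtraction and the throttled pixel-model updating described above can be sketched as follows. This is a hedged sketch, not the paper's code: ALPHA is an assumed learning rate, and T_UPDATE = 3 is an assumed concrete value inside the stated 2~5 range.

```python
import numpy as np

T_UPDATE = 3    # updating phase runs every T_UPDATE frames (paper: 2~5)
ALPHA = 0.05    # Eq. (5)-style running-average rate (value assumed)
SIGMA, LAM = 2.5, 2.5

def match(x, v):
    # Section 2.4 test: d = (SIGMA * I)^-1 (x - v); match iff d.T d / N <= LAM^2
    d = (np.asarray(x, dtype=float) - v) / SIGMA
    return float(d @ d) / d.size <= LAM ** 2

def block_subtract(x_block, codebook):
    """BS(x_block): Background iff some codeword in B matches (updating it);
    Foreground otherwise."""
    for cw in codebook:
        if match(x_block, cw):
            cw[:] = (1 - ALPHA) * cw + ALPHA * np.asarray(x_block, dtype=float)
            return "Background"
    return "Foreground"

def maybe_update_pixel_model(frame_no, x_pixel, codebook):
    """Pixel-based model updating, throttled to every T_UPDATE frames."""
    if frame_no % T_UPDATE != 0:
        return
    for cw in codebook:
        if match(x_pixel, cw):
            cw[:] = (1 - ALPHA) * cw + ALPHA * np.asarray(x_pixel, dtype=float)
            return
```

The throttle is the design point: pixel-level updating is the expensive step, so skipping it on most frames trades a little adaptation lag for speed.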
foreground. The detailed algorithm is organized below.

Algorithm for background subtraction using pixel-based codebook
Step 1: x_pixel = (R, G, B)
Step 2: for all codewords in B in Eq. (6), find the codeword f_m matching x_pixel based on: s = color_model_function(x_pixel, v_pixel_m); if s is classified as background, then do v_pixel_m ← (1 − α)·v_pixel_m + α·x_pixel

3.4 Color model

The color model proposed in [18] can classify an RGB pixel into shadow, highlight, background, and foreground. However, many parameters are employed in that model, which increases the computational complexity. In this work, the number of parameters is reduced to three, namely θ, β, and γ, to reduce the complexity. Figure 3 shows the modified color model.

Fig. 3. Modified color model. (The background vector v_i and the input pixel x_i span the angle θ; comp_I is the projection of x_i onto v_i, proj_I is the orthogonal residual, and I_max and I_min bound the admissible intensity range along v_i.)

Given an input pixel vector x, the match function is first employed to test whether it is in the background state. If x is classified as foreground by the match function, the ratio proj_I/comp_I is compared with tan θ: if proj_I/comp_I is greater than tan θ, the vector x must be foreground; otherwise, x may fall within the color model bound. Subsequently, the variables I_max and I_min are calculated: if comp_I falls between |v| and I_max, the pixel is classified as highlight, and if comp_I is not between I_min and I_max, the pixel is classified as foreground.

Given an input pixel vector x = (R, G, B) and a background vector v = (R̄, Ḡ, B̄), let

  |x| = sqrt(R² + G² + B²),  |v| = sqrt(R̄² + Ḡ² + B̄²),  ⟨x, v⟩ = R·R̄ + G·Ḡ + B·B̄

  comp_I = |x|·cos θ = ⟨x, v⟩ / |v|    (9)
  proj_I = sqrt(|x|² − comp_I²)    (10)

where comp_I is used to determine whether a pixel vector belongs to shadow or highlight, and proj_I measures the nearest distance of x from the background vector v. If proj_I/comp_I is greater than tan θ, the pixel is classified as foreground. Herein, θ is empirically set between 2 and 3.5.

  I_max = β·|v|,  I_min = γ·|v|    (11)

where β > 1 and γ < 1. In our experiments, β is set between 1.1 and 1.25, and γ between 0.7 and 0.85. The range [I_min, I_max] bounds comp_I; if comp_I is not in this range, the pixel is classified as foreground. The overall color model is organized as below:

  color_model_function(x, v) =
    Background, if match_function(x, v) = true;
    Highlight, else if proj_I/comp_I ≤ tan θ and |v| ≤ comp_I ≤ I_max;
    Shadow, else if proj_I/comp_I ≤ tan θ and I_min ≤ comp_I ≤ |v|;
    Foreground, otherwise.    (12)

4. BACKGROUND MODEL UPDATING WITH SHORT TERM INFORMATION

As indicated in Fig. 2, the background model updating with short term information is divided into two stages: first, a short term information model is constructed from the foreground regions; second, if a codeword in the short term information model accumulates enough weight, it is inserted into the background model for foreground detection. This strategy yields an advantage: a user can control how long a stationary foreground must last before it is inserted into the background model. However, a non-stationary foreground region leads to many unnecessary codewords in the short term information model. For this, time information is added to each codeword, and a threshold is used to decide whether a codeword is reserved or deleted; the identical strategy is applied to the background model as well.

The procedures of short term information construction for the block-based and pixel-based phases are identical. The main concept is to add an additional model S, called the short term information model, which records the foreground regions after foreground detection. The model construction is similar to that of Section 2, except that time information (S_time) is added to each codeword. In addition, three thresholds, S_delete, T_add, and B_delete, are employed. S_delete determines whether a codeword is reserved or deleted: if the current time minus the last update time of a codeword exceeds S_delete, the codeword is considered unnecessary and is deleted from the codebook. T_add decides whether a codeword is inserted into the background model: if a codeword accumulates enough weight, it can become part of the background model. B_delete determines whether a codeword is reserved or deleted in the background model when short term information is inserted; the parameter B_delete is set equal to T_add times S_delete (the worst case) to ensure that the last updated time of each codeword in the background model is preserved. The overall procedure of the algorithm is organized as below.
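Eqs. (9)-(12) can be collected into a single function, sketched below as an interpretation rather than the authors' code. The parameters beta and gamma sit inside the stated ranges; the units of θ are not specified in the text, so tan_theta = 0.06 (roughly θ = 3.5 degrees) is an assumed placeholder. The boolean `matched` stands for the result of the Section 2.4 match function, computed by the caller.

```python
import math

def color_model_function(x, v, matched, tan_theta=0.06, beta=1.2, gamma=0.8):
    """Classify an RGB pixel x against background vector v per Eqs. (9)-(12)."""
    if matched:                                   # match_function(x, v) = true
        return "Background"
    norm_v = math.sqrt(sum(c * c for c in v))     # |v|
    comp_i = sum(a * b for a, b in zip(x, v)) / norm_v          # Eq. (9)
    if comp_i <= 0:                               # behind the background vector
        return "Foreground"
    norm_x2 = sum(c * c for c in x)               # |x|^2
    proj_i = math.sqrt(max(norm_x2 - comp_i * comp_i, 0.0))     # Eq. (10)
    i_max, i_min = beta * norm_v, gamma * norm_v                # Eq. (11)
    if proj_i / comp_i <= tan_theta and norm_v <= comp_i <= i_max:
        return "Highlight"                        # brighter along v, small angle
    if proj_i / comp_i <= tan_theta and i_min <= comp_i <= norm_v:
        return "Shadow"                           # darker along v, small angle
    return "Foreground"                           # Eq. (12), default case
```

A pixel collinear with v but brighter falls in the highlight band, one slightly darker falls in the shadow band, and a chromatically different pixel fails the angle test and stays foreground.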
Algorithm for short term information model construction
Step 1: given a background model B with the initial background model, create a new model S for recording foreground regions.
Step 2: add a time information parameter (B_time) to every codeword in B for recording the current time (C_time); S is assigned the empty set.
Step 3:
  I. find a matched codeword in B for the input image; a "match" occurs when a codeword is found during the codeword updating of Eq. (5), and its B_time is set equal to C_time.
  II. if no matched codeword is found in B, then search for the matched codeword in S for the foreground region, and do the following steps:
    i. find the codeword s_m in S = {s_i | 1 ≤ i ≤ L} matching the input vector x based on the match function.
    ii. if S = ∅ or there is no match, then L ← L + 1 and create a new codeword s_L by setting: v_L ← x, w_L ← 1, S_time_L ← C_time.
    iii. otherwise, update the matched codeword s_m, consisting of v_m, w_m and S_time_m, by setting: v_m ← (1 − α)·v_m + α·x, w_m ← w_m + 1, S_time_m ← C_time.
Step 4: S ← {s_m | (C_time − S_time_m) ≤ S_delete}
Step 5: check the weight of every codeword in S; if the weight of a codeword is greater than T_add, then do the following steps:
  I. B ← {c_m | (C_time − B_time_m) ≤ B_delete}
  II. add the codeword as short term information at the head of B.
Step 6: repeat the algorithm from Step 3 to Step 5.

5. EXPERIMENTAL RESULTS

For measuring the accuracy of the results, the criteria FP rate, TP rate, Precision, and Similarity [12] are employed, defined as below:

  FP rate = fp / (fp + tn),  TP rate = tp / (tp + fn),
  Precision = tp / (tp + fp),  Similarity = tp / (tp + fp + fn),

where tp, tn, fp, and fn denote the numbers of true positives, true negatives, false positives, and false negatives, respectively; (tp + fn) is the total number of pixels presented in the foreground, and (fp + tn) is the total number of pixels presented in the background. Our experimental results are reported without any post processing and without short term information, so as to measure the accuracy of the method itself.

Figure 4 shows the test sequences [19] of size 320x240 with IR (row 1), Campus (row 2), and Highway_I (row 3). To provide a better understanding of the detected results, three colors, red, green, and blue, are employed to represent shadows, highlight, and foreground, respectively. Figure 4(b) shows the detected results using the block-based stage with blocks of size 10x10, in which most of the noise is removed. Figure 4(c) shows the results obtained by the hierarchical block-based and pixel-based stages. Apparently, the pixel-based stage can significantly enhance the detection precision. Yet, we would like to point out a weakness of the proposed method: as can be seen in the third row of Fig. 4 (Highway_I), when the color of a shadow is dark, it is classified as foreground. Since a low threshold is set for the color model of the proposed method, a value exceeding the threshold is classified as shadow. The problem can be eased by increasing the threshold; yet, as can be seen in Fig. 4, some of the foreground is then classified as shadow. In summary, the proposed method performs well for shadows of small intensity, yet it cannot provide perfect performance for shadows of greater intensity.

Fig. 4. Classified results of sequence [19] for IR (row 1), Campus (row 2) and Highway_I (row 3) with shadow (red), highlight (green), and foreground (blue): (a) original image, (b) block-based stage only with blocks of size 10x10, and (c) the proposed method.

Figure 5 shows the test sequence WT [21] of a non-stationary background with a waving tree, containing 287 frames of size 160x120. Compared with five former methods, MOG [7], Rita's method [4], CB [11], Chen's method [9], and Chiu's method [22], the proposed method provides better performance in handling non-stationary backgrounds. Moreover, the detected results with different block sizes using the block-based codebook only are shown. Apparently, most noise is removed without reducing the TP rate; most importantly, the processing speed is highly efficient with the block-based strategy. Yet, low precision is its drawback. To overcome this problem, the pixel-based stage is involved to enhance the precision, which also reduces the FP rate. The detected results using the proposed hierarchical scheme (block-based stage and pixel-based stage) with various block sizes are also shown. Figures 6(a)-(d) show the accuracy values, FP rate, TP rate, Precision, and Similarity,