International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print),
ISSN 0976 – 6375 (Online), Volume 1, Number 1, May - June (2010), pp. 136-146
© IAEME, http://www.iaeme.com/ijcet.html
LITERATURE REVIEW OF FACIAL MODELING AND
ANIMATION TECHNIQUES
Mr. K. Gnanamuthu Prakash
Research Scholar
Anna University of Technology, Coimbatore
Coimbatore – 641 047
Dr. S. Balasubramanian
Research Scholar
Anna University of Technology, Coimbatore
Coimbatore – 641 047
ABSTRACT
A major unsolved problem in computer graphics is the construction and animation
of realistic human facial models. Traditionally, facial models have been built
painstakingly by manual digitization and animated by ad hoc parametrically controlled
facial mesh deformations or kinematic approximations of muscle actions. Fortunately,
animators are now able to digitize facial geometries through the use of scanning range
sensors and animate them through the dynamic simulation of facial tissues and muscles.
However, these techniques require considerable user input to construct facial models of
individuals suitable for animation. Realistic facial animation is achieved through
geometric and image manipulations. Geometric deformations usually account for the
shape and deformations unique to the physiology and expressions of a person. Image
manipulations model the reflectance properties of the facial skin and hair to achieve small
scale detail that is difficult to model by geometric manipulation alone.
INTRODUCTION
Computer facial animation is primarily an area of computer graphics that
encapsulates models and techniques for generating and animating images of the human
head and face. Two-dimensional facial animation is commonly based upon the
transformation of images, including both images from still photography and sequences of
video. Image morphing is a technique which allows in-between transitional images to be
generated between a pair of target still images or between frames from sequences of
video. These morphing techniques usually consist of a combination of a geometric
deformation technique, which aligns the target images, and a cross-fade which creates the
smooth transition in the image texture. Another form of animation from images consists
of concatenating sequences captured from video. A further technique, called Video
Rewrite, cuts existing footage of an actor into segments corresponding to phonetic units,
which are blended together to create new animations of a speaker. Video Rewrite uses
computer vision techniques to track lip movements in video automatically, and these
features are used in the alignment and blending of the extracted phonetic units. This
technique generates animations of the lower part of the face only; these are then
composited with video of the original actor to produce the final animation.
Three-dimensional head models provide the most powerful means of generating computer
facial animation. One early model was a mesh of 3D points controlled by a set of
conformation and expression parameters. The former group controls the relative
location of facial feature points such as eye and lip corners. Changing these parameters
can re-shape a base model to create new heads. Different methods for initializing such a
“generic” model from individual (3D or 2D) data have been proposed and successfully
implemented. Parameterized models are effective because they use a limited number of
parameters associated with the main facial feature points. The MPEG-4 standard
defines a minimum set of parameters for facial animation. Animation is done by changing
parameters over time. Facial animation is approached in different ways; traditional
techniques include
1. shapes/morph targets,
2. skeleton-muscle systems,
3. bones/cages,
4. motion capture on points on the face and
5. knowledge based solver deformations.
Facial animation is now attracting more attention than ever before in its 25 years
as an identifiable area of computer graphics. Imaginative applications of animated
graphical faces are found in sophisticated human-computer interfaces, interactive games,
multimedia titles, VR telepresence experiences, and, as always, in a broad variety of
production animations. Graphics technologies underlying facial animation now run the
gamut from key framing to image morphing, video tracking, geometric and physical
modeling, and behavioral animation. Supporting technologies include speech synthesis
and artificial intelligence. Whether the goal is to synthesize realistic faces or fantastic
ones, representing the dynamic facial likeness of humans and other creatures is giving
impetus to a diverse and rapidly growing body of cross-disciplinary research.
LITERATURE REVIEW
Facial modeling and animation research falls into two major categories, those
based on geometric manipulations and those based on image manipulations. Each realm
comprises several subcategories. Geometric manipulations include key-framing and
geometric interpolations [A. Enmett 1985, F. I. Parke 1991], parameterizations [M.
Cohen et al. 1993], finite element methods [B. Guenter 1992], muscle based modeling [K.
Waters 1987], visual simulation using pseudo muscles [P. Kalra et al. 1992], spline
models [C. L. Y. Wang 1994] and free-form deformations [S. Coquillart 1990]. Image
manipulations include image morphing between photographic images [T. Beier et al.
1992], texture manipulations [M. Oka et al. 1987], image blending [F. Pighin et al. 1998],
and vascular expressions [P. Kalra et al. 1994].
As stated by Ekman (1975), humans are highly sensitive to visual messages sent
voluntarily or involuntarily by the face. Consequently, facial animation requires specific
algorithms able to render the natural characteristics of the motion with a high degree of
realism. Basic facial animation and modeling have been studied extensively, and several
models have been proposed. For example, in the Parke models (1975, 1982)
the set of facial parameters is based on both observation and the underlying structures
that cause facial expression. The animator can create any facial image by specifying the
appropriate set of parameter values. Motions are described as a pair of numeric tuples
which identify the initial frame, final frame, and interpolation. Pearce et al. (1986)
introduced a small set of keywords to extend the Parke model.
Platt and Badler (1981) have designed a model that is based on underlying facial
structure. The skin is the outside level, represented by a set of 3D points that define a
surface which can be modified. The bones represent an initial level that cannot be moved.
Between both levels, muscles are groups of points with elastic arcs.
Waters (1987) represents the action of muscles using primary motivators on a
non-specific deformable topology of the face. The muscle actions themselves are tested
against FACS (Facial Action Coding System), whose action units correspond directly to
one muscle or a small group of muscles. Two types of muscles are created: linear/parallel
muscles that pull and sphincter muscles that squeeze. Magnenat-Thalmann et al. (1988)
defined a model where the action of a muscle is simulated by a procedure, called an
Abstract Muscle Action procedure (AMA), which acts on the vertices composing the
human face figure. A human face can be animated by manipulating the facial
parameters using AMA procedures. By combining the facial parameters obtained by the
AMA procedures in different ways, more complex entities can be constructed that
correspond to the well-known concept of facial expression. Nahas et al. (1988)
propose a method based on the B-spline. They use a digitizing system to obtain position
data on the face, from which they extract a certain number of points and organize them in
a matrix. This matrix is used as a set of control points for a 5-dimensional bicubic
B-spline surface. The model is animated by moving these control points.
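To make the idea of animating a surface by moving its control points concrete, the following is a minimal sketch of evaluating a single uniform bicubic B-spline patch. It is not the implementation of Nahas et al.; the function names `bspline_basis` and `eval_patch`, the 4x4 grid and the 3-component points are assumptions chosen for illustration. The code accepts control points of any dimension, so a 5-component point could be used in the same way.

```python
def bspline_basis(t):
    """Four uniform cubic B-spline blending weights for t in [0, 1]."""
    return (
        (1 - t) ** 3 / 6.0,
        (3 * t**3 - 6 * t**2 + 4) / 6.0,
        (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0,
        t**3 / 6.0,
    )


def eval_patch(ctrl, u, v):
    """Evaluate one bicubic patch; ctrl is a 4x4 grid of d-dimensional points."""
    bu, bv = bspline_basis(u), bspline_basis(v)
    dim = len(ctrl[0][0])
    point = [0.0] * dim
    for i in range(4):
        for j in range(4):
            w = bu[i] * bv[j]  # tensor-product weight of control point (i, j)
            for k in range(dim):
                point[k] += w * ctrl[i][j][k]
    return point


# Hypothetical flat control grid: x = i, y = j, z = 0.
ctrl = [[[float(i), float(j), 0.0] for j in range(4)] for i in range(4)]
before = eval_patch(ctrl, 0.5, 0.5)

# "Animate" by raising a single control point; the surface follows smoothly.
ctrl[1][1][2] = 1.0
after = eval_patch(ctrl, 0.5, 0.5)
```

Because the blending weights sum to one, the surface point is a weighted average of nearby control points, which is why displacing one control point produces a smooth, local bulge rather than a sharp spike.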
CLASSIFICATION OF FACIAL MODELING AND ANIMATION
METHODS
The taxonomy in Figure 1 illustrates the diversity of approaches to facial
animation. Exact classifications are complicated by the lack of exact boundaries between
methods and the fact that recent approaches often integrate several methods to produce
better results. The review that follows introduces interpolation techniques and
parameterizations, followed by animation methods using 2D and 3D morphing
techniques. The Facial Action Coding System, a frequently used facial description tool,
is summarized. Physics based modeling and simulated muscle modeling are discussed, as
are techniques for increased realism, including wrinkle generation, vascular expression
and texture manipulation. Individual modeling and model fitting are also described.
Figure 1: Classification of facial modeling and animation methods
FACIAL MODELING TECHNIQUES
POLYGONAL
Polygonal modeling specifies each 3D point exactly; the points are connected to each
other as polygons. This is an exacting way to place topology (points) where it is needed
on a face and not where it is not.
PATCHES (NURBS)
Patches (or sets of splines) indirectly define a smooth curved surface from a set
of control points. A small number of control points (called CVs in Maya) can define a
complex surface. One type of spline is called NURBS, which stands for Non-Uniform
Rational B-Splines. This type of patch allows each control point to have its own weight,
which affects the “pinch” of the curve at that point, so NURBS are considered the most
versatile of patches. They work very well for smooth organic objects and are therefore
well suited to facial modeling; however, several issues arise.
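The effect of a per-control-point weight can be seen in a small sketch. It uses a rational quadratic Bezier curve, which is the single-span special case of a NURBS curve, rather than a full NURBS surface; the function name `rational_bezier` and the sample points are assumptions for illustration only.

```python
def rational_bezier(points, weights, t):
    """Point on a rational quadratic Bezier curve at parameter t in [0, 1]."""
    basis = ((1 - t) ** 2, 2 * t * (1 - t), t**2)  # Bernstein polynomials
    denom = sum(b * w for b, w in zip(basis, weights))
    return tuple(
        sum(b * w * p[k] for b, w, p in zip(basis, weights, points)) / denom
        for k in range(len(points[0]))
    )


# Hypothetical control polygon with its middle control point at height 2.
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]

equal = rational_bezier(points, (1.0, 1.0, 1.0), 0.5)   # ordinary Bezier curve
pinched = rational_bezier(points, (1.0, 4.0, 1.0), 0.5)  # heavier middle weight
```

With equal weights the midpoint of the curve sits halfway up to the middle control point; raising that point's weight pulls the curve toward it without moving any control point.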
SUB-DIVISION SURFACES
Sub-division surfaces are a fairly new modeling technique that gives the
control and flexibility of polygons with the ease of use and smoothness of patches. Tony
DeRose (who wrote the paper on Sub-D surfaces and created a working version for Pixar,
first used in Geri’s Game) has slides on the advantages of Sub-D surfaces. Sub-D surfaces
give detail only where it is needed. Paul Aichele discussed this on our Pixar trip,
using Geri’s head as the example.
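The refinement idea behind subdivision can be sketched in one dimension lower. The example below applies Chaikin corner cutting to a closed polygon; this is a curve analogue chosen for brevity and is not the surface scheme used at Pixar. The function name `chaikin` and the square input are assumptions for illustration.

```python
def chaikin(points, closed=True):
    """One round of Chaikin corner cutting on a polygon of 2D points."""
    n = len(points)
    last = n if closed else n - 1
    refined = []
    for i in range(last):
        p, q = points[i], points[(i + 1) % n]
        # Replace each edge by two points at 1/4 and 3/4 along it.
        refined.append((0.75 * p[0] + 0.25 * q[0], 0.75 * p[1] + 0.25 * q[1]))
        refined.append((0.25 * p[0] + 0.75 * q[0], 0.25 * p[1] + 0.75 * q[1]))
    return refined


# A coarse control polygon (hypothetical): the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
once = chaikin(square)   # 8 points, corners cut off
twice = chaikin(once)    # 16 points, visibly rounder
```

Each round doubles the number of points and smooths the shape, so a modeler edits only the coarse polygon while the displayed result is smooth.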
DIGITIZING
Facial models can be created by digitizing live humans or physical models. There
are several techniques. One that does not need an expensive digitizer uses fiducial points
to reconstruct a 3D model from 2D photographs. Now, however, automatic digitizing
equipment like that from CyberWare is regularly used to create high resolution 3D
models of live human models, complete with color data. While digitized models are very
useful for many applications, such as Stanford’s Digital Michelangelo Project, the
data from these systems is typically too high in resolution and not semantically set up for
facial animation.
PHOTOGRAMMETRIC ACQUISITION
Web based avatar companies and others use techniques that take a
photograph of a human face (sometimes from the front and side) and map it onto a
pre-made, animatable 3D model through a registration process, in which key points
on the photograph (the corners of the eyes, eyebrows, mouth, and so on) are picked with
the mouse to register the image with the model. Several researchers are working on
automatic registration techniques, but lighting conditions on a live face or photograph
and other standardization issues plague the process.
FACE GENERATION
Face generation systems genetically evolve the type of face you want or have you
search or surf through a theoretical space of faces as with the . Still other face generation
systems allow you to pick from a database of facial parts to create a head as you might
with a police artist sketch kit such as the Faces system we used as an assignment.
FACIAL ANIMATION TECHNIQUES
a) KEYFRAMING
The most widely used animation technique is key-framing, where the animator
creates key poses of an articulated model and the animation system interpolates the “in
between” frames from that set of key-frame data. Typically the data being keyed and
interpolated are transformations (translation, rotation, scale) of rigid objects such as the
hierarchical parts of the human body. The problem with facial animation is that the face
has no rigid parts that move in relation to each other and could be key-framed. Hence
there is a myriad of facial animation techniques, distinguished by what is actually
key-framed and interpolated to get smooth animation of the flexible surface of the face.
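The interpolation of “in between” frames can be sketched for a single keyed value. The following assumes linear interpolation between neighboring keys, which is only one of several interpolation schemes an animation system might offer; the function name `sample_track` and the `jaw_open` track are hypothetical.

```python
from bisect import bisect_right


def sample_track(keys, frame):
    """Linearly interpolate a list of (frame, value) keys sorted by frame."""
    frames = [f for f, _ in keys]
    if frame <= frames[0]:
        return keys[0][1]   # hold the first key before the track starts
    if frame >= frames[-1]:
        return keys[-1][1]  # hold the last key after the track ends
    hi = bisect_right(frames, frame)
    (f0, v0), (f1, v1) = keys[hi - 1], keys[hi]
    t = (frame - f0) / (f1 - f0)
    return v0 + t * (v1 - v0)


# Hypothetical track for one keyed value: closed, fully open, slightly open.
jaw_open = [(0, 0.0), (10, 1.0), (20, 0.25)]
in_between = [sample_track(jaw_open, f) for f in (5, 10, 15)]
```

The animator supplies only the three keys; every other frame is computed. What the keyed value controls (a rigid transform, a blend weight, a model parameter) is exactly where the facial techniques in this section differ.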
b) MORPH TARGETS
One widely used basic technique is to create a model of a face in a rest position.
Then, using what are essentially modeling techniques, the points of a copy of that face
are edited to make other faces in different phoneme and expression states (copying the
rest face ensures that they share its topology). A facial animation sequence is then
produced by morphing (point interpolation) between this set of matching faces.
The disadvantage of this technique is that the animator is only picking from a
set of pre-made face morphs and is thereby limited to the expressions possible from that
set in the final animated face sequence. There are several variants of this morphing
technique, most notably compound or hierarchical morph targets, which allow the
animator to blend several faces together with differing weights and/or morph only
specific areas of the face. Again, all versions of this technique limit the creative process
by only allowing the animator to pick from a pre-made set of expressions (or forcing the
animator to stop the animation process to create additional morph targets).
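Weighted blending of morph targets that share a topology can be written as rest + sum_i w_i * (target_i - rest), applied vertex by vertex. The sketch below assumes that additive formulation; the function name `blend_shapes` and the two-vertex "meshes" are hypothetical and far smaller than a real face.

```python
def blend_shapes(rest, targets, weights):
    """Return rest + sum_i w_i * (target_i - rest), vertex by vertex."""
    blended = []
    for v, rest_vertex in enumerate(rest):
        vertex = list(rest_vertex)
        for target, w in zip(targets, weights):
            for k in range(len(vertex)):
                vertex[k] += w * (target[v][k] - rest_vertex[k])
        blended.append(tuple(vertex))
    return blended


# Hypothetical two-vertex meshes with identical vertex order (same topology).
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
smile = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
jaw_open = [(0.0, 0.0, -1.0), (1.0, 0.0, 0.0)]

# Half a smile combined with a fully open jaw.
face = blend_shapes(rest, [smile, jaw_open], [0.5, 1.0])
```

The result can only ever be a combination of the supplied targets, which is the limitation described above: an expression outside their span requires modeling a new target.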
c) CHARACTER ANIMATION TOOLS
Tools that were made and are useful for character and organic animation have
been used to animate the face. Typically these techniques are not straightforward and are
cumbersome because they are not very well suited for a flexible face. They include free
form deformations, “bones”, point cluster techniques and others.
d) PARAMETERIZED SYSTEMS
Fred Parke's early work in facial animation at the University of Utah and the New York
Institute of Technology led to the first facial animation system. It used a control
parameterization method in which animation becomes a process of specifying and
controlling parameter set values as a function of time. Parameter systems are able to
animate facial expressions as well as facial types. The Face Lift program used for The
Sims uses a simplified version of a production parameterization system. Most parameter
systems use Paul Ekman's FACS (Facial Action Coding System), which describes facial
muscle movement, as a basis or starting point for specifying and defining the range of
parameters. Ken Perlin's Java-based web system also uses a simple but very effective
parameterized technique. Parameter systems create a very compact specification for
facial animation and are therefore ideally suited for the web and games.
e) MUSCLE SIMULATION SYSTEMS
Keith Waters's work uses a simple simulation of muscle deformation to animate the
face. It uses two types of muscles: linear muscles that pull and sphincter muscles that
squeeze. He uses a mass and spring technique to animate or deform the skin. The muscle
control system has a one to one correspondence to known facial muscles and to Ekman's
FACS. An extended, fast and open source version of Waters's technique, with applications
for games and our real-time systems, is called Expressions.
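A linear muscle that pulls can be sketched as a displacement of nearby skin vertices toward the muscle's fixed attachment point. The version below is a deliberately simplified illustration, not Waters's published formulation: it assumes a spherical zone of influence around the insertion point with a cosine falloff, and it omits the mass and spring skin model. The function name `pull_vertices` and all parameters are hypothetical.

```python
import math


def pull_vertices(vertices, attachment, insertion, contraction, radius):
    """Move vertices near `insertion` toward `attachment`.

    contraction is in [0, 1]; vertices at distance >= radius are unaffected.
    """
    moved = []
    for v in vertices:
        d = math.dist(v, insertion)
        if d >= radius:
            moved.append(tuple(v))
            continue
        # Assumed falloff: 1 at the insertion point, 0 at the rim of the zone.
        falloff = 0.5 * (1.0 + math.cos(math.pi * d / radius))
        s = contraction * falloff
        moved.append(tuple(vi + s * (ai - vi) for vi, ai in zip(v, attachment)))
    return moved


# Hypothetical skin points and a muscle running from (0, 1, 0) to the origin.
skin = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
pulled = pull_vertices(
    skin,
    attachment=(0.0, 1.0, 0.0),
    insertion=(0.0, 0.0, 0.0),
    contraction=0.5,
    radius=1.0,
)
```

A single contraction value per muscle drives many vertices at once, which is what makes a muscle-level control system compact enough to map onto descriptions such as FACS.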
f) MOTION CAPTURE
Facial animation has also been achieved via performance systems, where a live
performance is digitized and applied to the facial model rather than created by an
animator. Motion capture is the most widely used performance technique. Systems
typically use one or several cameras to track small, point-like reflective stickers attached
in strategic positions on the performer's face.
g) SPEECH GENERATED SYSTEMS
Another form of performance animation system uses the voice alone to create not
only the synchronized lip movement but also the movement of other parts of the face,
including the eyebrows, blinks and eye movement, and head movement (neck rotation).
These systems analyze the voice to get lip-sync phoneme positions and also use pitch,
volume, sentence semantics (dividing speech into sentence sections based on pauses) and
other cues to approximate the animation of a face. These systems, in standalone form or
combined with photogrammetric techniques, are being used both in linear animation
systems and in real-time web-based applications.
VIDEO BASED ANIMATION OF PEOPLE
To create animations that have both natural motion and a photo-realistic
appearance, we need to combine motion capture and image based (or video based)
techniques. The goal is to build video based representations of annotated example
motions. Unlike standard motion capture techniques that are based on markers or other
devices, we need to annotate body and facial configurations directly in unconstrained
video. In static scenes the user could supply annotations by hand, but for video
sequences automatic techniques are crucial: at 30 frames per second, 10 min of video has
18,000 images, and no one has the budget, patience, and consistency to annotate them by
hand. To build libraries of example motions, we also need techniques that annotate
coarse motion categories automatically. For example, a 10 minute video
of someone talking could be transformed into a video-based library of more than 2,000
phonetic lip motions (phonemes or visemes).
VIDEO REWRITE
Video Rewrite uses existing footage to automatically create new video of a person
mouthing words that she did not speak in the original footage. This technique is useful in
movie dubbing, for example, where the movie sequence can be modified to sync the
actors' lip motions to the new soundtrack.
VIDEO MOTION CAPTURE
Video motion capture is a vision based motion capture technique that is able
to recover high degree-of-freedom articulated human body configurations in complex
video sequences. It does not require any markers, body suits, or other devices attached to
the subject.
CONCLUSIONS
Computer facial animation is now being used in a multitude of important fields. It
brings a more human, social and dramatic reality to computer games, films and
interactive multimedia, and is growing in both use and importance. Authoring computer
facial animation with complex and subtle expressions is still difficult and fraught with
problems. It is currently mostly authored using generalized computer animation
techniques, which often limit the quality and quantity of facial animation production.
Given additional computer power, facial understanding and software sophistication, new
face-centric methods are emerging, but they are typically ad hoc in nature. This research
attempts to define and organizationally categorize current and emerging methods,
including surveying facial animation experts to define the current state of the field,
perceived bottlenecks and emerging techniques.
REFERENCES
1. Ekman P. and Friesen WV. (1975), Unmasking the Face: A Guide to Recognizing
Emotions from Facial Clues, Prentice-Hall.
2. Parke FI (1975) A Model for Human Faces that allows Speech Synchronized Animation,
Computers and Graphics, Pergamon Press, Vol.1, No.1, pp.1-4.
3. Parke FI (1982) Parameterized Models for Facial Animation, IEEE Computer Graphics
and Applications, Vol.2, No.9, pp.61-68.
4. Pearce A, Wyvill B, Wyvill G and Hill D (1986) Speech and expression: a Computer
Solution to Face Animation, Proc. Graphics Interface '86, pp.136-140.
5. Platt S, Badler N (1981) Animating Facial Expressions, Proc. SIGGRAPH '81, pp.245-
252.
6. Waters K (1987) A Muscle Model for Animating Three-Dimensional Facial Expression,
Proc. SIGGRAPH '87, Vol.21, No.4, pp.17-24.
7. Magnenat-Thalmann N, Thalmann D (1987) The Direction of Synthetic Actors in the
film Rendez-vous à Montréal, IEEE Computer Graphics and Applications, Vol.7, No.12.
8. Nahas M, Huitric H and Saintourens M (1988) Animation of a B-spline Figure, The
Visual Computer, Vol.3, No.5.
9. A. Enmett, Digital portfolio: Tony de Peltrie, Computer Graphics World, 1985, vol.
8(10), pp. 72–77
10. F. I. Parke, Techniques of facial animation. In N. Magnenat-Thalmann and D. Thalmann,
editors, New Trends in Animation and Visualization, 1991, Chapter 16, pp. 229–241,
John Wiley and Sons
11. M. Cohen, D. Massaro, Modeling co-articulation in synthetic visual speech. In N.
Magnenat-Thalmann and D. Thalmann, editors, Models and Techniques in Computer
Animation, 1993, pp. 139–156, Springer-Verlag, Tokyo
12. B. Guenter, A system for simulating human facial expression. In State of the Art in
Computer Animation, 1992, pp. 191–202
13. K. Waters, A muscle model for animating three-dimensional facial expression. In
Maureen C. Stone, editor, Computer Graphics (Siggraph proceedings, 1987), vol. 21,
pp. 17–24
14. P. Kalra, A. Mangili, N. M. Thalmann, D. Thalmann, Simulation of Facial Muscle
Actions Based on Rational Free Form Deformations, Eurographics 1992, vol. 11(3), pp.
59–69
15. C. L. Y. Wang, D. R. Forsey, Langwidere: A New Facial Animation System, Proceedings
of Computer Animation, 1994, pp. 59–68
16. S. Coquillart, Extended Free-Form Deformation: A Sculpturing Tool for 3D Geometric
Modeling, Computer Graphics, 1990, vol. 24, pp. 187–193
17. T. Beier, S. Neely, Feature-based image metamorphosis, Computer Graphics (Siggraph
proceedings 1992), vol. 26, pp. 35–42
18. M. Oka, K. Tsutsui, A. Ohba, Y. Kurauchi, T. Tago, Real-time manipulation of
texture-mapped surfaces. In Siggraph 21, 1987, pp. 181–188, ACM Computer Graphics
19. F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, D. H. Salesin, Synthesizing Realistic
Facial Expressions from Photographs, Siggraph proceedings, 1998, pp. 75–84
20. P. Kalra, N. Magnenat-Thalmann, Modeling of Vascular Expressions in Facial
Animation, Computer Animation, 1994, pp. 50–58