2. It’s a crazy multimedia world!
– Network is everywhere… but very heterogeneous
– Terminals are “same same”… but different!
– Content must be designed with a priori knowledge of its future use
– Applications are platform-centric instead of user-centric
Fragmented value chain: Production, Transmission, Consumption
– Capturing devices, HW/SW providers, playback devices
– Networks
– Authoring tools, IPR holders, terminal manufacturers
– Service providers
– End users
– Scenarios
INTEROPERABILITY NEEDED!!!
3. MPEG has several names
• Common: MPEG = Moving Picture Experts Group
• Official: MPEG = ISO/IEC JTC1/SC29/WG11
ISO: over 224 technical committees
JTC1: Joint Technical Committee with IEC
SC24: Computer graphics and image processing (GKS, PHIGS, CGM, VRML, X3D)
SC29: Coding of audio, picture, multimedia and hypermedia information
– WG1 JPEG: Coding of still pictures
– WG11 MPEG: Coding of moving pictures and audio
4. MPEG has several subgroups
Requirements, Audio, Video, Systems, 3D Graphics
... and produced several successful standards
MPEG-1, ISO/IEC 11172, ’92: Video & audio (CD-ROM)
MPEG-2, ISO/IEC 13818, ’94: Video & audio (DVD & DVB)
MPEG-4, ISO/IEC 14496, ’99: Multimedia & interactive applications
MPEG-7, ISO/IEC 15938, ’01: Metadata (description of content)
MPEG-21, ISO/IEC 21000, ’02: Terminal & network specification
MPEG-A MPEG-B MPEG-C MPEG-D MPEG-E MPEG-M MPEG-U MPEG-V
5. MPEG-4 Features
MPEG-4 Objects
Natural – still images, audio, 2D/3D video
Synthetic – audio, 2D/3D objects and scenes
Compression
MPEG-4 Objectives
Provide technologies for efficient
compression and transmission
MPEG-4 Terminal
Composition, at end-user side, of natural
and synthetic objects, into hybrid and
interactive scenes
7. Video Compression
– MPEG-4 Visual and MPEG-4 AVC
3D Graphics Compression
8. No common representation
– Heterogeneous kinds of data
• Different types of geometry, appearance and animation models
– Always easier to specify a new data representation format than to learn an existing one
Very different application domains
11. FAST TRANSPORT THROUGH THE NETWORK
Size doesn’t matter
Size doesn’t matter
Size is of main importance
13. To define a standard format for compressed 3D
synthetic content. In other words to be for
graphics what MP3 and AAC are for audio,
MPEG-2 and MPEG-4 are for video and JPEG is
for still images.
Additionally "MPEG 3D Graphics" aims at providing
mechanisms such as APIs to enable easy
integration and development of applications
using its standard representation tools.
14. MPEG 3D Graphics within the MPEG standards family
Core technologies in MPEG 3D Graphics
Compressing other standards (COLLADA, X3D, …) with MPEG
Virtual Worlds Interoperability for avatars with MPEG
31. Approximation of target surface
• Method: tessellation with predefined curved patches
• Quality: higher order (polynomial/rational) ⇒ Cn continuity
Mesh definition
• Connectivity: regular grid of quads or triangles
⇒ planar topology
• Geometry: list of control points {…, Pk = (xk, yk, zk), …}
Mesh coding
• Connectivity (lossless): implicit
• Geometry (lossy): coordinate quantisation + prediction from connectivity
32. Tensor product of cubic Bézier curves
Single patch (4x4 control points)
vs.
two patches (4x7 control points)
Compression
(3547 polygons; 1215 vertices)
vs.
(86 patches; 212 control points)
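A minimal Python sketch of how a single patch (4x4 control points) is evaluated as a tensor product of cubic Bézier curves. The function names and the flat test patch are illustrative assumptions, not taken from the standard.

```python
from math import comb


def bernstein3(i, t):
    """Cubic Bernstein basis polynomial B_i(t) for i in 0..3."""
    return comb(3, i) * t**i * (1 - t) ** (3 - i)


def bezier_patch_point(ctrl, u, v):
    """Evaluate a bicubic Bezier patch at parameters (u, v) in [0, 1].

    ctrl is a 4x4 grid of (x, y, z) control points.
    """
    point = [0.0, 0.0, 0.0]
    for i in range(4):
        for j in range(4):
            # Tensor product: one cubic basis per parametric direction.
            w = bernstein3(i, u) * bernstein3(j, v)
            for k in range(3):
                point[k] += w * ctrl[i][j][k]
    return tuple(point)


# Hypothetical flat patch: control point (i, j) sits at (i, j, 0).
flat = [[(float(i), float(j), 0.0) for j in range(4)] for i in range(4)]
corner = bezier_patch_point(flat, 0.0, 0.0)
centre = bezier_patch_point(flat, 0.5, 0.5)
```

Sampling (u, v) over a grid at the terminal produces the polygons to render, so only the control points need to be transmitted.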
33. MPEG-4 Part 16 (2003): NURBS
• Based on VRML97 Amd., originally proposed by blaxxun
• Support for NURBS curves and patches
Specific nodes for Bézier curves and patches (for increased efficiency)
• Support for free-form deformations
35. SS = limit of recursive refinement of base control mesh
NB: refinement affects both connectivity (of abstract graph) and geometry (of 3D mapping)
Geometry smoothing achieved with stencils particular to each scheme
[Figure: border/sharp vs. interior edge stencils of “butterfly” scheme. Border/sharp weights: -1/16, 9/16, 9/16, -1/16. Interior weights: ½, ½, 1/8, 1/8, -1/16, -1/16, -1/16, -1/16]
SS inherently define hierarchically nested LODs
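A small Python sketch of one refinement step's geometry rule, using the “butterfly” stencil weights from this slide. The function names and argument grouping are assumptions for illustration; finding the neighbours in a real mesh (the connectivity part of refinement) is not shown.

```python
def butterfly_interior(ends, opposite, wings):
    """New vertex inserted on an interior edge.

    ends:     the 2 end points of the edge         (weight 1/2 each)
    opposite: the 2 vertices facing the edge       (weight 1/8 each)
    wings:    the 4 outer "wing" vertices          (weight -1/16 each)
    """
    weighted = (
        [(1 / 2, p) for p in ends]
        + [(1 / 8, p) for p in opposite]
        + [(-1 / 16, p) for p in wings]
    )
    return tuple(sum(w * p[k] for w, p in weighted) for k in range(3))


def butterfly_border(p0, p1, p2, p3):
    """New vertex inserted on a border/sharp edge between p1 and p2,
    using four consecutive vertices along the border."""
    weights = (-1 / 16, 9 / 16, 9 / 16, -1 / 16)
    points = (p0, p1, p2, p3)
    return tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(3))
```

Both sets of weights sum to 1, so the rule leaves flat or straight configurations unchanged: four equally spaced collinear border points produce the exact midpoint.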
36. Approximation of target surface
• Method: tessellation with curved patches
• Quality: higher order ⇒ Cn continuity
Mesh definition
• Connectivity: list of triangles/quads, e.g., {…, Tn = {in0, in1, in2}, …}
⇒ arbitrary (manifold) topology
• Geometry: list of control points {…, Pk = (xk, yk, zk), …}
Mesh coding
• Connectivity (lossless): as for polygonal (manifold) mesh
• Geometry (lossy): as for polygonal (manifold) mesh
37. Polygons
+ Are the simplest approach (linear approximation)
+ Can resolve fine details and handle arbitrary topologies
– Lead to unstructured, huge meshes
Patches
+ Are a more powerful approach (higher order approximation)
+ Are convenient for coarse and smooth models
– Need cumbersome trimming and stitching mechanisms
SSs
+ Connect and unify the two extremes above
++ Provide multi-resolution handles for hierarchical coding/editing
44. MPEG-4 Part 16 (2003): “plain” + wavelet SSs
• “Plain” SSs for mesh smoothing
Considered schemes: Catmull-Clark, [extended] Loop, butterfly
No details are added but…
normal control achievable through edge/vertex tagging of initial control mesh
• Wavelet/detailed SSs for surface approximation
Possibly tagged base mesh
Details are added after each subdivision step, which are…
wavelet-transformed according to one of several possible schemes
Most suited for multi-resolution editing/animation
Most suited for view-dependent transmission
56. MPEG 3DGC
Scalable Complexity 3D Mesh Compression
• Not all 3DG applications have the same compression needs
• Not all 3DG applications can afford to spend extra CPU/GPU on compression
[Figure: TFAN, SVA, SC3DMC, QBCR; same quantization and binarization blocks]
Towards the continuum model: enlarge the application domain where MPEG-4 3DG can be used
57. MPEG 3DGC
Scalable Complexity 3D Mesh Compression
Attributes
x, y, z (floats) + normals (floats)
x, y, z (floats) + colors (floats)
x, y, z (floats) + … (floats)
Connectivity
i, j, k (integers)
i, j, k (integers)
i, j, k (integers)
58. SC-3DMC
General schema
Connectivity (lossless), i, j, k:
– Connectivity analysis: TFAN, SVA
– Prediction: Delta
– Binarization: BP, FLB
– Entropy encoding: CABAC, AC, BAC
Attributes (lossy), x, y, z:
– Quantization
– Prediction: Parallelogram, Delta, Barycentre
– Binarization: BP, FLB
– Entropy encoding: CABAC, AC
5 different paths with different performances
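A hedged Python sketch of two of the attribute blocks named on this slide, quantization and parallelogram prediction. The bit depth, the value range and all function names are assumptions for illustration; binarization and entropy encoding are left out.

```python
def quantize(value, lo, hi, bits):
    """Uniformly map a float in [lo, hi] to an integer on `bits` bits (lossy)."""
    levels = (1 << bits) - 1
    return round((value - lo) / (hi - lo) * levels)


def dequantize(q, lo, hi, bits):
    """Inverse mapping used by the decoder."""
    levels = (1 << bits) - 1
    return lo + q / levels * (hi - lo)


def parallelogram_residual(a, b, c, d):
    """Residual of vertex d, predicted from an already coded triangle (a, b, c).

    d belongs to the neighbouring triangle that shares edge (a, b);
    the prediction completes the parallelogram: a + b - c.
    All points are quantized integer triples.
    """
    return tuple(dk - (ak + bk - ck) for ak, bk, ck, dk in zip(a, b, c, d))


# Hypothetical quantized vertices: the residual is much smaller than d itself.
residual = parallelogram_residual((0, 0, 0), (10, 0, 0), (5, -8, 0), (5, 9, 0))
```

Small residuals clustered around zero are what make the later binarization and entropy encoding stages effective.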
62. SC-3DMC
What do we measure?
Encoding performance, but also complexity for the decoder (and encoder)
[Figure: performance against complexity for TFAN, SVA-AC, 3DMC, SVA-BAC, SVA-BP, NCA and BIFS]
72. Vector Graphics Primitives
Color profile
Line Color Profile (LC)
Area Color Point (AC)
Line (LN), bounded by 2 Terminal Points (TP)
Color Patch (PA)
Line Segment (LS), bounded by 2 Line Points (LP)
Sub-Texture (ST)
Line marked as control line (Skeleton)
80. Piece-wise linear functions defined by a set of (time, value) pairs
Standardized by MPEG-4 for:
- position
- orientation
- scale
- coordinate in IFS
- normals in IFS
Straightforward animation based on key-frames
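A minimal Python sketch of such a key-frame interpolator for a 3D position. The function name, the clamping outside the key range and the sample keys are assumptions for illustration.

```python
from bisect import bisect_right


def interpolate(keys, values, t):
    """Piece-wise linear interpolation of key-framed values.

    keys:   increasing list of times
    values: one (x, y, z) tuple per key
    """
    if t <= keys[0]:
        return values[0]
    if t >= keys[-1]:
        return values[-1]
    i = bisect_right(keys, t) - 1          # segment [keys[i], keys[i + 1]]
    f = (t - keys[i]) / (keys[i + 1] - keys[i])
    return tuple(a + f * (b - a) for a, b in zip(values[i], values[i + 1]))


# Hypothetical position track with three keys.
keys = [0.0, 0.5, 1.0]
values = [(0, 0, 0), (1, 0, 0), (1, 2, 0)]
quarter = interpolate(keys, values, 0.25)
three_quarters = interpolate(keys, values, 0.75)
```

Orientations are usually interpolated on rotations rather than component-wise, so this sketch applies as written only to positions, scales and coordinates.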
82. Re-sampling, sub-sampling: two
methods supported by MPEG-4
Path preserving mode
Key preserving mode
Three schemes supported by
MPEG-4
Orientation Interpolators
Coordinate Interpolators
Position Interpolators
Exploit temporal redundancy for the most common interpolators
83. Re-sampling, sub-sampling: two methods supported by MPEG-4
Path preserving mode
Key preserving mode
Three schemes supported by MPEG-4
Orientation Interpolators
Coordinate Interpolators
Position Interpolators
[Figure: four plots of key value against key (time), with N, N-1, N-2 and N-3 key values]
Sub-sampling or re-sampling based on minimal distortion
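One way to picture sub-sampling based on minimal distortion is a greedy loop that repeatedly drops the key whose removal changes the curve least. This Python sketch is an assumption for illustration, with a simple local error measure on scalar values; it is not the procedure specified by MPEG-4.

```python
def removal_error(keys, values, i):
    """Distortion caused by dropping interior key i: distance between its
    value and the value linearly interpolated from its two neighbours."""
    f = (keys[i] - keys[i - 1]) / (keys[i + 1] - keys[i - 1])
    predicted = values[i - 1] + f * (values[i + 1] - values[i - 1])
    return abs(values[i] - predicted)


def subsample(keys, values, target):
    """Greedily remove interior keys until only `target` keys remain."""
    keys, values = list(keys), list(values)
    while len(keys) > max(target, 2):
        i = min(range(1, len(keys) - 1),
                key=lambda j: removal_error(keys, values, j))
        del keys[i]
        del values[i]
    return keys, values


# Hypothetical track with N = 5 keys, reduced to N-1 and N-2 keys.
track_keys = [0, 1, 2, 3, 4]
track_values = [0, 1, 2, 5, 4]
four = subsample(track_keys, track_values, 4)
three = subsample(track_keys, track_values, 3)
```

The first and last keys are always kept, so the animation duration is unchanged.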
86. Face, MPEG-4 V1,
1999
Body, MPEG-4 Amd1,
2000
Skinned Model, MPEG-4 Part 16,
2003
Two frameworks: human-like (FBA) and generic skeleton (BBA)
87. Geometry:
Seamless mesh: shapes sharing
the same vertices list
Texture:
Image Mapping on vertices
sub-set
Hierarchy:
Skeleton layer
Muscle layer
Seamless mesh affected by a hierarchical skeleton
88. 1D controllers: bone & muscle
for each bone and each muscle
- a list of affected vertices
- a measure of affectedness
are provided
Right balance between control parameters and influence volume
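A simplified Python sketch of the data each controller carries: a list of affected vertices with a weight per vertex. For brevity the bone motion here is a plain translation, which is an assumption; real bones apply full transforms, and the dictionary layout and names are illustrative only.

```python
def apply_bones(vertices, bones):
    """Displace vertices by the weighted motion of each bone.

    vertices: list of (x, y, z) tuples
    bones:    list of dicts with
              "offset":  (dx, dy, dz) translation of the bone (assumed)
              "weights": {vertex index: measure of affectedness}
    """
    moved = [list(v) for v in vertices]
    for bone in bones:
        for index, weight in bone["weights"].items():
            for k in range(3):
                moved[index][k] += weight * bone["offset"][k]
    return [tuple(v) for v in moved]


# Hypothetical arm: vertex 0 is not affected, vertex 1 half, vertex 2 fully.
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
forearm = {"offset": (0.0, 2.0, 0.0), "weights": {1: 0.5, 2: 1.0}}
posed = apply_bones(rest, [forearm])
```

Vertices absent from a controller's list are untouched, which is what keeps the influence volume local.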
89. [Figure: two coding pipelines from uncompressed BAP/FAP/BBA data to a binary file. Frame #n: prediction (I and P frames), quantization, entropy coding. Segment #n: DCT, DC and AC coefficients, prediction (I and P segments), quantization (DC Q, AC Q), entropy coding. Entropy coders shown: arithmetic coding, Huffman coding]
Basic comp.: prediction, frequency transform, quantization and entropy encoding
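A minimal Python sketch of two of the listed components, quantization and prediction, applied to one animation parameter over successive frames. The step size and function names are assumptions for illustration; the frequency transform and the entropy coder are omitted.

```python
def encode_frames(frames, step):
    """Quantize each frame value, then code it as a difference from the
    previous quantized value. The first residual plays the role of an
    intra (I) frame, the following ones of predicted (P) frames."""
    residuals = []
    previous = 0
    for value in frames:
        q = round(value / step)
        residuals.append(q - previous)
        previous = q
    return residuals


def decode_frames(residuals, step):
    """Rebuild the (lossy) frame values from the residuals."""
    values = []
    previous = 0
    for r in residuals:
        previous += r
        values.append(previous * step)
    return values


# Hypothetical joint angle sampled over four frames.
coded = encode_frames([10.0, 10.4, 11.1, 11.0], step=0.5)
decoded = decode_frames(coded, step=0.5)
```

After the first frame the residuals stay close to zero, which is the property the entropy encoding stage exploits to reach a low bit-rate.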
90. Face, MP4 V1,
2kbps
Body, MP4 Amd1,
5-30 kbps
Skinned Model, MP4 Part 16,
5-30 kbps for a human-like skeleton
Very low bit-rate
92. Defined as a base mesh and
a collection of target meshes
Animation obtained by
updating the weights of the
target meshes
BBA stream updated to
include morph data
Usable for any kind of 3D
object
Local and precise control for shape deformation
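A short Python sketch of morphing with a base mesh and weighted target meshes. The blending rule used here, base plus the weighted offsets towards each target, is a common convention assumed for illustration; the names and sample meshes are hypothetical.

```python
def morph(base, targets, weights):
    """Blend a base mesh with target meshes.

    base:    list of (x, y, z) vertices
    targets: list of meshes with the same vertex count as base
    weights: one weight per target, updated over time by the animation
    """
    result = []
    for i, vertex in enumerate(base):
        moved = list(vertex)
        for target, weight in zip(targets, weights):
            for k in range(3):
                moved[k] += weight * (target[i][k] - vertex[k])
        result.append(tuple(moved))
    return result


# Hypothetical two-vertex mesh with a single target applied at 50 %.
base_mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target_mesh = [(0.0, 1.0, 0.0), (1.0, 2.0, 0.0)]
half = morph(base_mesh, [target_mesh], [0.5])
```

Only the weights change from frame to frame, so the animation stream carries a few numbers per frame instead of full vertex positions.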
94. Cluster the vertices with
respect to their motion
Encode a cluster motion by
an affine transform
Encode the residual error at
vertex level by traditional
approach (DCT/W,
quantization, entropy encoding)
96. BIFS-Anim & BIFS Command
Remote animation
- BIFS-Anim
- BIFS-Command
Programmatic animation
- ECMA Script (Script node): part of the scene description
- Java code (MPEG-J stream): standardized API for accessing the scene graph
Complex application scenarios can be built
98. MPEG 3DG
How to measure compression performance?
On-line benchmarking platform:
www.MyMultimediaWorld.com
• Allows easy integration of proprietary algorithms by using an API in C++
• No need to disclose the algorithm source code
• Benchmark automatically updated for new content
• Restricts and refines the benchmark by means of easy-to-control parameters.
99. MPEG 3DG
How to measure compression performance?
On-line benchmarking: www.MyMultimediaWorld.com
[Figure: platform architecture. Web site; filter and presentation engine; benchmark manager 1, benchmark manager 2; 3D compression benchmark, benchmark 2, benchmark 3; indexed MDB; MP4; extended MP7; 3D compression benchmark manager API; algorithm 1, algorithm 2, …, algorithm n]
101. MPEG 3DG
How to measure compression performance?
Benchmarking platform, per-object visualization
- object properties (number of vertices/triangles, number of components and file size),
- distortion graph (linear or logarithmic),
- compression gain with respect to 3DMC,
- encoding time,
- decoding time.
102. MPEG 3DG
How to measure compression performance?
Benchmarking platform, global visualization
Filters
- semantic category,
- average distortion,
- number of vertices,
- number of connected components in the
object,
- database subset.
103. MPEG 3DG
How to measure compression performance?
Benchmarking platform, global visualization
108. Any XML scene representation
[Figure: XML; (binary) XML; binarisation; 3D graphics compression]
109. Layer 1: Textual XML representation of the scene
110. Layer 2: Binarized XML layer
Contains the unclassified elements of the scene graph
111. Layer 3: Compressed layer
Contains specific types of media information (geometry, animation, ...)
112. Result: Multiplexed layers 2 and 3
113. MPEG-4 P25
Encoder side: possible implementation (informative)
[Figure: encoder side, with XMT, COLLADA, X3D and MP4]
Compression ratio: 40-70:1
114. MPEG-4 P25
Decoder side (normative)
[Figure: decoder side, with MP4 and XMT, COLLADA, X3D]
Lossless for the data structure, lossless or lossy for graphics primitives
116. Standards for Avatars as visualization support
Standard | Generic features | Avatar representation
COLLADA | 3D objects/scenes | Generic object
VRML/X3D | 3D objects/scenes; application behavior | H-Anim
MPEG-4 | 2D/3D objects/scenes; application behavior; compression | A set of dedicated nodes; a dedicated compressed stream
117. Standards for Avatars as interaction support
Representation | Features
HumanML | human physical description, emotion
EmotionML | emotion, facial expressions, gestures
BML | speech, gesture, gaze
MPML | speech
VHML | facial and body animation, emotional representation
CML | character attribute and animation definition
118. Why does none of the existing standards solve the issue of avatar interoperability in Virtual Worlds?
– Virtual Worlds are proprietary applications, and the 3D assets, including the avatars, have economic value
– Virtual Worlds, and 3D applications in general, have specific data formats allowing rendering optimization
However, while maintaining strict control for economic and technical reasons, Virtual Worlds allow users to personalize the avatars
119. Avatar template
Attributes that can be modified by the user
Specify the set of Personalization Parameters (PP) that transforms a template of an arbitrary Virtual World into the user-designed avatar
120. Which avatar features can be personalized?
Mainly the appearance
Analysis of Second Life, IMVU, Entropia Universe, Sony PlayStation and HumanML
121. Very heterogeneous set of personalization parameters
Entropia Universe
Nintendo Wii
HumanML
Second Life
PlayStation
123. Interoperability at the Animation level
<Animation>
<Greeting>
<Salute>salut</Salute>
<Cheer>cheer</Cheer>
</Greeting>
<Fighting>
<shoot>pousse</shoot>
<throw>throw</throw>
</Fighting>
</Animation>
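A small Python sketch of how a player might read this Animation element into a lookup table. It assumes, for illustration only, that each element name is the standardized animation label and its text is the name that animation has in the source Virtual World; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The Animation element from this slide, unchanged.
SNIPPET = """
<Animation>
<Greeting>
<Salute>salut</Salute>
<Cheer>cheer</Cheer>
</Greeting>
<Fighting>
<shoot>pousse</shoot>
<throw>throw</throw>
</Fighting>
</Animation>
"""


def animation_map(xml_text):
    """Map 'Category/Label' to the animation name found in the element text."""
    root = ET.fromstring(xml_text.strip())
    return {
        f"{category.tag}/{clip.tag}": clip.text
        for category in root
        for clip in category
    }


mapping = animation_map(SNIPPET)
```

With such a table a target Virtual World can look up a label like `Greeting/Salute` and bind it to its own clip, whatever the clip was called at the source.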
124. Motion retargeting
[Figure: avatar template in VW1 with a “Walk” animation in VW1; avatar template in VW2 with no “Walk” animation defined in VW2]
125. “Control” element
[Figure: body control and face control]
<Control>
<BodyFeaturesControl>
<UpperBodyBones>
<LClavicle>my_LCalvicle</LClavicle>
<RClavicle>my_RCalvicle</RClavicle>
</UpperBodyBones>
</BodyFeaturesControl>
<FaceFeaturesControl>
<HeadOutline>
<Left X="0.23" Y="1.25" Z="7.26"/>
<Right X="0.25" Y="1.25" Z="7.21"/>
</HeadOutline>
</FaceFeaturesControl>
</Control>
126. MPEG-V (Personalization Parameters)
MPEG-V + MPEG-4
The Avatar in VW1
The Avatar in VW2
The Avatar in an external player
134. MPEG 3DGC
Multi-resolution 3DMC : two proposed methods
1. Work on
- document the use case scenarios (object examination, navigation, …)
- database
- comparison with other codecs
- comparison method (PSNR, Resolution)
- platform for testing
2. You are invited to contribute