Basics of MPEG-4 3D Graphics Compression

Marius Preda, PhD
Chairman of MPEG 3DG

Institut MINES TELECOM

  It’s a crazy multimedia world!
–  Network is everywhere… but very heterogeneous
–  Terminals are “same same”… but different!
–  Content must be designed with a priori knowledge of its future use
–  Applications are platform-centric instead of user-centric
  Fragmented value chain

INTEROPERABILITY NEEDED!!!
[Figure: value chain from Production (capturing devices, authoring tools) through Transmission (HW/SW providers, networks, IPR holders, service providers, scenarios) to Consumption (playback devices, terminal manufacturers, end users)]

MPEG has several names
•  Common: MPEG = Moving Picture Experts Group
•  Official: MPEG = ISO/IEC JTC1/SC29/WG11
ISO: over 224 technical committees
JTC1: Joint Technical Committee with IEC
SC24: Computer graphics and image processing (GKS, PHIGS, CGM, VRML, X3D)
SC29: Coding of audio, picture, multimedia and hypermedia information
WG1 JPEG: Coding of still pictures
WG11 MPEG: Coding of moving pictures and audio

MPEG has several subgroups
Requirements, Systems, Video, Audio, 3D Graphics

... and produced several successful standards

Standard  Reference          Scope
MPEG-1    ISO/IEC 11172 ’92  Video & audio (CD-ROM)
MPEG-2    ISO/IEC 13818 ’94  Video & audio (DVD & DVB)
MPEG-4    ISO/IEC 14496 ’99  Multimedia & interactive applications
MPEG-7    ISO/IEC 15938 ’01  Metadata (description of content)
MPEG-21   ISO/IEC 21000 ’02  Terminal & network specification

MPEG-A MPEG-B MPEG-C MPEG-D MPEG-E MPEG-M MPEG-U MPEG-V

MPEG-4 Features
MPEG-4 Objects
Natural: still images, audio, 2D/3D video
Synthetic: audio, 2D/3D objects and scenes

MPEG-4 Objectives
Provide technologies for efficient compression and transmission

MPEG-4 Terminal
Composition, at end-user side, of natural and synthetic objects into hybrid and interactive scenes

MPEG-4 Features
System architecture

MPEG-4 stream:
•  Scene layer: interactive scene description
•  Object descriptor layer
•  Media data layer

  Video Compression
–  MPEG-4 Visual and MPEG-4 AVC

  3D Graphics Compression

  No common representation
–  Heterogeneous kind of data
•  Different types of geometry, appearance and animation models
–  It is always easier to specify a new data representation format than
to learn an existing one

  Very different application domains

FAST TRANSPORT THROUGH THE NETWORK
[Figure: three application domains; size is of main importance for one of them, size doesn’t matter for the other two]

To define a standard format for compressed 3D
synthetic content. In other words to be for
graphics what MP3 and AAC are for audio,
MPEG-2 and MPEG-4 are for video and JPEG is
for still images.
Additionally "MPEG 3D Graphics" aims at providing
mechanisms such as APIs to enable easy
integration and development of applications
using its standard representation tools.

MPEG 3D Graphics within the MPEG standards family

Core technologies in MPEG 3D Graphics

Compressing other standards (COLLADA, X3D, …) with MPEG

Virtual Worlds Interoperability for avatars with MPEG

Approximation of target surface
•  Method: tessellation with planar facets
•  Quality: first order (linear) ⇒ no smoothness (C0 continuity)

Mesh definition: IFS (Indexed Face Set)
•  Connectivity: list of faces {…, Pn = {in0, in1, in2, …}, …}
⇒ arbitrary topology (non-manifold, open, higher genus, etc.)
•  Geometry: list of vertices {…, Vk = (xk, yk, zk), …}

Mesh coding
•  Connectivity (lossless): triangle strips, triangle+vertex trees, etc.
•  Geometry (lossy): coordinate quantisation + prediction from conn.
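
The geometry step above (coordinate quantisation followed by prediction) can be illustrated with a minimal Python sketch. It assumes uniform quantisation on a bounding-box grid and a plain delta predictor (each vertex predicted from the previous one); the function names and the 10-bit setting are illustrative only and are not taken from the MPEG-4 specification, whose coders predict from connectivity.

```python
def quantize(vertices, bits=10):
    """Map float coordinates to integers on a (2**bits - 1)-step grid per axis."""
    lo = [min(v[a] for v in vertices) for a in range(3)]
    hi = [max(v[a] for v in vertices) for a in range(3)]
    steps = (1 << bits) - 1
    # A flat axis would give a zero step; fall back to 1.0 to avoid dividing by zero.
    scale = [(hi[a] - lo[a]) / steps or 1.0 for a in range(3)]
    q = [tuple(round((v[a] - lo[a]) / scale[a]) for a in range(3)) for v in vertices]
    return q, lo, scale


def dequantize(q, lo, scale):
    """Inverse mapping; the error is at most half a quantisation step per axis."""
    return [tuple(lo[a] + c[a] * scale[a] for a in range(3)) for c in q]


def delta_encode(q):
    """Replace each quantised vertex by its difference from the previous one."""
    prev = (0, 0, 0)
    residuals = []
    for c in q:
        residuals.append(tuple(c[a] - prev[a] for a in range(3)))
        prev = c
    return residuals


def delta_decode(residuals):
    """Rebuild the quantised vertices by accumulating the residuals."""
    prev = (0, 0, 0)
    q = []
    for r in residuals:
        prev = tuple(prev[a] + r[a] for a in range(3))
        q.append(prev)
    return q
```

Only the quantisation is lossy here: the prediction step is exactly invertible, and its small residuals are what an entropy coder would then compress.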

LOD concept
•  1976: Clark introduced the idea
•  Main interest: rendering efficiency

Taxonomy of LOD extraction techniques
•  Static vs. dynamic
•  Global vs. local
•  Progressive vs. hierarchical LODs

Successful simplification techniques
•  1996: Hoppe’s edge collapses
•  1997: Garland’s quadrics ⇒ qslim

Progressive 3D mesh coding
•  1996: Hoppe’s PM (Progressive Mesh)
•  1998: IBM’s PFS (Progressive Forest Split)

Triangles: 69451 vs. 29519

MPEG-4 (1999): IFSs
•  Based on VRML97
•  Arbitrary topology meshes
•  “Properties” (normals, colours and textures)

MPEG-4 Amd.1 (2000): 3DMC (3D Mesh Coding)
•  40-50:1 compression of IFSs by IBM’s TS (Topological Surgery)
•  Incremental transmission and rendering
•  Progressive coding by IBM’s PFS
•  Error resilience by SAIT

  IFS surfaces
  Patches

•  Method: tessellation with predefined curved patches
•  Quality: higher order (polynomic/rational) ⇒ Cn continuity

Mesh definition
•  Connectivity: regular grid of quads. or triangles
⇒ planar topology
•  Geometry: list of control points {…, Pk = (xk, yk, zk), …}

Mesh coding
•  Connectivity (lossless): implicit
•  Geometry (lossy): coordinate quantisation + prediction from conn.

Tensor product of cubic Bézier curves

Single patch (4x4 control points)
vs.
two patches (4x7 control points)

Compression

 (3547 polygons; 1215 vertices)
vs.
(86 patches; 212 control points)

MPEG-4 Part 16 (2003): NURBS
•  Based on VRML97 Amd., originally proposed by blaxxun
•  Support for NURBS curves and patches
  Specific nodes for Bézier’s curves and patches (for increased efficiency)
•  Support for free-form deformations

  IFS surfaces
  Patches
  Subdivision surfaces

SS = limit of recursive refinement of base control mesh
NB: refinement affects both connectivity (of abstract graph) and geometry (of 3D mapping)
Geometry smoothing achieved with stencils particular to each scheme
Border/sharp vs. interior edge stencils of “butterfly” scheme:
•  Border/sharp edge: -1/16, 9/16, 9/16, -1/16
•  Interior edge: ½, ½ (edge endpoints), 1/8, 1/8 (opposite vertices), -1/16, -1/16, -1/16, -1/16 (outer vertices)
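
As a rough illustration of how the “butterfly” stencil weights are applied, the Python sketch below computes the new vertex inserted on an edge. The function and argument names are hypothetical; only the weights (½, 1/8, -1/16 for interior edges and -1/16, 9/16, 9/16, -1/16 for border/sharp edges) come from the slide.

```python
def butterfly_edge_point(a, b, c, d, wings):
    """New vertex on an interior edge (a, b).

    a, b  : edge endpoints, weight 1/2 each
    c, d  : vertices opposite the edge in the two adjacent triangles, weight 1/8 each
    wings : the four outer vertices of the stencil, weight -1/16 each
    """
    if len(wings) != 4:
        raise ValueError("the interior stencil needs exactly four outer vertices")
    return tuple(
        0.5 * (a[i] + b[i])
        + 0.125 * (c[i] + d[i])
        - 0.0625 * sum(w[i] for w in wings)
        for i in range(3)
    )


def butterfly_border_point(p0, p1, p2, p3):
    """New vertex on a border/sharp edge (p1, p2); p0 and p3 are its neighbours along the border."""
    return tuple((-p0[i] + 9 * p1[i] + 9 * p2[i] - p3[i]) / 16 for i in range(3))
```

Both sets of weights sum to 1, so a flat or constant neighbourhood is reproduced exactly; the scheme is interpolating because existing vertices are never moved.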

SS inherently define hierarchically nested LODs

•  Method: tessellation with curved patches
•  Quality: higher order ⇒ Cn continuity

Mesh definition
•  Connectivity: list of triangles/quads., e.g., {…, Tn = {in0, in1, in2}, …}
⇒ arbitrary (manifold) topology
•  Geometry: list of control points {…, Pk = (xk, yk, zk), …}

Mesh coding
•  Connectivity (lossless): as for polygonal (manifold) mesh
•  Geometry (lossy): as for polygonal (manifold) mesh

Polygons
+ Are the simplest approach (linear approximation)
+ Can resolve fine details and handle arbitrary topologies
– Lead to unstructured, huge meshes

Patches
+ Are a more powerful approach (higher order approximation)
+ Are convenient for coarse and smooth models
– Need cumbersome trimming and stitching mechanisms

SSs
+ Connect and unify the two extremes above
++ Provide multi-resolution handles for hierarchical coding/editing

Catmull-Clark’s (1978): quadrilateral, primal, approximating, C2

Loop’s (1987): triangular, primal, approximating, C2

Dyn++’s “butterfly” (1990): triangular, primal, interpolating, C1

  IFS surfaces
  Patches
  Wavelet subdivision surfaces

Target surface
Base
mesh

Subdivide Add details


Price (requirements)
•  Base mesh extraction
•  Subdivision scheme = predictor
•  Details (3D vectors) = prediction errors ⇒ remeshing

Prize (advantages)
•  Predictive coding ⇒ immediate (if smooth target mesh)
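
The “subdivision scheme = predictor, details = prediction errors” idea can be sketched on a 1D analogue: a polyline of heights instead of a surface. This is only an assumption-laden toy (midpoint subdivision as predictor, an odd number of samples, hypothetical function names), not the wavelet scheme of the standard.

```python
def subdivide(coarse):
    """Midpoint subdivision: keep every coarse sample and insert edge midpoints."""
    fine = []
    for p, q in zip(coarse, coarse[1:]):
        fine.append(p)
        fine.append((p + q) / 2)
    fine.append(coarse[-1])
    return fine


def analyse(target_fine):
    """Split a fine signal (odd length) into a coarse base and detail coefficients."""
    coarse = target_fine[::2]
    predicted = subdivide(coarse)
    details = [t - p for t, p in zip(target_fine, predicted)]
    return coarse, details


def synthesise(coarse, details):
    """Subdivide the base, then add the details back."""
    return [p + d for p, d in zip(subdivide(coarse), details)]
```

For a smooth target the details are close to zero, which is what makes them cheap to code; a rough target produces large details wherever the predictor fails.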

Spatial partitioning
Adding details
in appearing
parts

Removing details
in disappearing
parts
ZT ZT ZT … ZT

MPEG-4 Part 16 (2003): “plain” + wavelet SSs
•  “Plain” SSs for mesh smoothing
  Considered schemes: Catmull-Clark, [extended] Loop, butterfly
  No details are added but…
  normal control achievable through edge/vertex tagging of initial control mesh
•  Wavelet/detailed SSs for surface approximation
  Possibly tagged base mesh
  Details are added after each subdivision step, which are…
  wavelet-transformed according to one of several possible schemes
  Most suited for multi-resolution editing/animation
  Most suited for view-dependent transmission

  IFS surfaces
  Patches
  Mesh Grid

CW (Connectivity-Wireframe)

RG (Reference-Grid)

Vertex offset is a relative value
Update vertex position when grid is deformed or animated

  IFS surfaces
  Patches
  Mesh Grid
  Solids

Solid primitives

Solid models: the “arithmetic of forms”

Implicit equation:

Quadrics (2nd order):

Quartics (4th order):

Multiplication table F0 * F1:
F0 \ F1   0   1   2
0         0   0   0
1         0   1   2
2         0   2   4
A cube = multiplication of 3 degenerated quadrics

Multiplication of two forms

Examples of solid operations

Architecture
Mechanics

Biotechnology Virtual models

Exact geometry

Compactness

21 Kb 37 Kb 407 Kb 1.1 Mb

  IFS surfaces
  Patches
  Subdivision surf.
  Wavelet SS
  Mesh Grid
  Solids
  SC-3DMC

MPEG 3DGC
Scalable Complexity 3D Mesh Compression
•  Not all 3DG applications have the same needs
in compression
•  Not all the 3DG applications can afford
spending extra CPU/GPU for compression

TFAN

SVA
SC3DMC

QBCR Same Quantization and
Binarization blocks

Towards the continuum model: enlarge the application domain
where MPEG-4 3DG can be used

MPEG 3DGC
Scalable Complexity 3D Mesh Compression

Attributes:
x, y, z (floats) + normals (floats)
x, y, z (floats) + colors (floats)
x, y, z (floats) + … (floats)
Connectivity:
i, j, k (integers)
i, j, k (integers)
i, j, k (integers)

SC-3DMC
General Schema

Connectivity (lossless), i, j, k:
Connectivity Analysis (TFAN, SVA) → Prediction (Delta) → Binarization (BP, FLB) → Entropy Encoding (CABAC, AC, BAC)

Attributes (lossy), x, y, z:
Quantization → Prediction (Parallelogram, Delta, Barycentre) → Binarization (BP, FLB) → Entropy Encoding (CABAC, AC)

5 different paths with different performances

SC-3DMC
Connectivity Analysis: Empty

Do Nothing

[Figure: two triangles on vertices 1, 2, 3, 4, written out as the index list 123234]

SC-3DMC
Connectivity Analysis: SVA
4 modes for encoding how consecutive faces share vertices
[Figure legend: Previous Face, Current Face, Shared Vertex]
Mode 0: faces 1, 2, 3, 2, 3, 4 (code: 12304)
Mode 1: faces 1, 2, 3, 4, 5, 6 (code: 1231 5)
Mode 2: faces 1, 2, 3, 3, 4, 5 (code: 123245)
Mode 3: faces 1, 2, 3, 1, 2, 3 (code: 1233)

SC-3DMC
Connectivity Analysis: TFAN

Split the mesh into a set of triangle fans and encode each fan

1675
62937
2984
Transformed in local indices

SC-3DMC
What do we measure?

Encoding performance, but also complexity for the decoder (and encoder)
[Plot: performance vs. complexity for TFAN, SVA-AC, 3DMC, SVA-BAC, SVA-BP, NCA and BIFS]

•  Visual Texture Coding: generic compression tool

•  Synthesized Textures and Procedural Textures: very compact representation for specific applications

•  Depth Image-based Representation: less calculation intensive

•  MPEG-4 3DMC
•  MPEG-4 MeshGrid
•  MPEG-4 Subdivision Surfaces

3D Mesh

•  MPEG-4 VTC
•  JPEG 2000

3D object
2D Texture

Zero-Tree

WT

3D mipmaps

Mipmapping (MIP = “multum in parvo”, much in little) = multi-resolution textures

Without mipmapping With mipmapping
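
A multi-resolution texture pyramid of this kind can be sketched in a few lines of Python. The sketch assumes a square greyscale image whose side is a power of two and a plain 2x2 box filter; real texture coders (for instance the wavelet-based ones mentioned above) use other filters, and the function name is illustrative.

```python
def build_mipmaps(image):
    """Return the list of levels, from full resolution down to 1x1.

    `image` is a square list of rows of numbers; each coarser level halves
    the side by averaging 2x2 blocks of the level above it.
    """
    levels = [image]
    while len(levels[-1]) > 1:
        src = levels[-1]
        half = len(src) // 2
        levels.append([
            [
                (src[2 * y][2 * x] + src[2 * y][2 * x + 1]
                 + src[2 * y + 1][2 * x] + src[2 * y + 1][2 * x + 1]) / 4
                for x in range(half)
            ]
            for y in range(half)
        ])
    return levels
```

A renderer then samples from the level whose texel size best matches the on-screen footprint, which removes the aliasing visible without mipmapping.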

Zero-Tree

WT

Computational Graceful Degradation: Quality => processing power

Region of Interest with resolution/quality selection

Visual
importance
depends on
viewing
angle

Low
Resolution

High
Resolution

Packet selection by using error-resilience markers

  IFS surfaces   VTC
  Patches   Synthesized texture
  Subdivision surf.
  Wavelet SS
  Mesh Grid
  Solids
  SC-3DMC

Vector Graphics Primitives

Color profile

Line Color Profile (LC)
Area Color Point (AC)
Line (LN), bounded by 2 Terminal Points (TP)
Color Patch (PA)
Line Segment (LS), bounded by 2 Line Points (LP)
Sub-Texture (ST)
Line marked as control line (Skeleton)

65x96 pixels
10 seconds animation
1.35 kB

  Subdivision surf.   Procedural texture
  Wavelet SS
  Mesh Grid
  Solids
  SC-3DMC

DEF Fabric ProceduralTexture {
type 2 width 256 height 256
cellWidth 4 cellHeight 4
roughness 1
distortion 0.05
seed 114300
color [ 0.898 0.89418 0.95294, 0.34118 0.29418 0.70196, 0 0 0, 0 0 0 ]
aWarpmap [ 0 0, 0.03 1, 0.88 1, 1 0 ]
bWarpmap [ 0 0, 0.48 1, 1 0 ]
aWeights [ 0, 0, 0, 0, 0, 0, 0.56, 0, 0, 0, 0, 0, 0, 0, 0.20, 0.24 ]
bWeights [ 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0 ]
}
DEF Marble ProceduralTexture {
width 256 height 256
roughness 1
seed 22209
color [ 0.8 0.7098 0.6902, 0.95686 0.8902 0.87451,
0.87451 0.37255 0.23529, 0.95686 0.8902 0.87451 ]
aWarpmap [ 0 1, 0.33 0, 1 1 ]
bWarpmap [ 0 0, 0.55 0, 0.6 1, 0.65 0, 1 0 ]
bWeights [ 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0 ]
}

  Wavelet SS   DIBR
  Mesh Grid
  Solids
  SC-3DMC

Reconstruct 3D representation from projections

Depth image

Projection

  Mesh Grid   Point Texture
  Solids
  SC-3DMC

  IFS surfaces   VTC   Interpolators
  Solids
  SC-3DMC

Piece-wise linear functions defined
by a set of (time, value) pairs

Standardized by MPEG-4 for:
- position
- orientation
- scale

- coordinate in IFS
- normals in IFS

Straightforward animation based on key-frames
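
A piece-wise linear interpolator over (time, value) pairs can be sketched as follows. The names `keys` and `values` are illustrative, the values are treated as plain 3D vectors (suitable for position or scale), and orientations would need rotation-aware interpolation that this sketch does not cover.

```python
from bisect import bisect_right


def interpolate(keys, values, t):
    """Evaluate a piece-wise linear key-frame curve at time t.

    keys   : increasing list of key times
    values : one tuple per key
    Times outside the key range are clamped to the first or last value.
    """
    if t <= keys[0]:
        return values[0]
    if t >= keys[-1]:
        return values[-1]
    i = bisect_right(keys, t) - 1          # index of the key at or just before t
    f = (t - keys[i]) / (keys[i + 1] - keys[i])
    return tuple(a + f * (b - a) for a, b in zip(values[i], values[i + 1]))
```

For example, with keys `[0.0, 0.5, 1.0]` and values `[(0, 0, 0), (1, 0, 0), (1, 2, 0)]`, evaluating at `t = 0.25` lands halfway along the first segment.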

Large amount of data for high quality, smooth animation

Re-sampling, sub-sampling: two
methods supported by MPEG-4

Path preserving mode
Key preserving mode

Three schemes supported by
MPEG-4
Orientation Interpolators
Coordinate Interpolators
Position Interpolators

Exploit temporal redundancy for the most common interpolators

[Figure: key value vs. key (time) plots of the same animation curve with N, N-1, N-2 and N-3 key values]

Sub-sampling or re-sampling based on minimal distortion
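
One way to picture sub-sampling based on minimal distortion is a greedy loop that repeatedly drops the key whose removal changes the curve the least. The Python sketch below is an assumption for illustration (scalar key values, local error measured only at the removed key, hypothetical function names); it is not the normative MPEG-4 procedure.

```python
def removal_error(keys, values, i):
    """Error made at key i if it is dropped and its neighbours are joined linearly."""
    t0, t1, t2 = keys[i - 1], keys[i], keys[i + 1]
    f = (t1 - t0) / (t2 - t0)
    predicted = values[i - 1] + f * (values[i + 1] - values[i - 1])
    return abs(values[i] - predicted)


def subsample(keys, values, target_count):
    """Greedily remove interior keys until only target_count keys remain.

    The first and last keys are always kept, so at least two keys survive.
    """
    keys, values = list(keys), list(values)
    while len(keys) > max(target_count, 2):
        i = min(range(1, len(keys) - 1),
                key=lambda j: removal_error(keys, values, j))
        del keys[i]
        del values[i]
    return keys, values
```

Going from N to N-1, N-2 and N-3 key values then corresponds to running the loop one, two and three times.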

A dedicated elementary stream for IC, multiplexed into an AFX stream

  Patches   Synthesized texture   Bone-Based
  Solids
  SC-3DMC

  Face, MPEG-4 V1,
1999

  Body, MPEG-4 Amd1,
2000

  Skinned Model, MPEG-4 Part 16,
2003

Two frameworks: human-like (FBA) and generic skeleton (BBA)

  Geometry:
Seamless mesh: shapes sharing
the same vertices list

  Texture:
Image Mapping on vertices
sub-set

  Hierarchy:
Skeleton layer
Muscle layer

Seamless mesh affected by a hierarchical skeleton

  1D controllers: bone & muscle

  for each bone and each muscle
- a list of affected vertices
- a measure of affectedness
are provided

Right balance between control parameters and influence volume
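
The “list of affected vertices plus a measure of affectedness” can be read as per-vertex weights, which suggests the widely used linear blend skinning. The sketch below assumes that technique, 3x4 affine bone matrices and weight normalisation; none of these details are confirmed by the slides, and the names are hypothetical.

```python
def transform(matrix, p):
    """Apply a 3x4 affine matrix (rotation/scale part plus translation column) to point p."""
    return tuple(
        matrix[r][0] * p[0] + matrix[r][1] * p[1] + matrix[r][2] * p[2] + matrix[r][3]
        for r in range(3)
    )


def skin(vertices, bones):
    """Deform a seamless mesh with weighted bone transforms.

    bones: list of (matrix, influence) where influence maps a vertex index
    to its weight. Vertices that no bone affects are returned unchanged.
    """
    acc = [[0.0, 0.0, 0.0] for _ in vertices]
    total = [0.0] * len(vertices)
    for matrix, influence in bones:
        for idx, w in influence.items():
            q = transform(matrix, vertices[idx])
            for a in range(3):
                acc[idx][a] += w * q[a]
            total[idx] += w
    return [
        tuple(c / t for c in a) if t > 0 else tuple(v)
        for v, a, t in zip(vertices, acc, total)
    ]
```

Because all shapes share one vertex list, a vertex near a joint can carry weights from both adjacent bones and bend smoothly instead of tearing.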

[Figure: two encoding pipelines for uncompressed BAP/FAP/BBA data, one frame-based (frame #n, I and P frames) and one segment-based (segment #n, I and P segments, DCT with DC and AC coefficients); blocks shown: prediction, quantization (DC Q, AC Q), arithmetic coding, Huffman coding, binary file]

Basic comp.: prediction, freq.transform, quantization and entropy encoding
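
The frequency-transform and quantization steps can be illustrated on one segment of a single animation parameter. The Python sketch below assumes an orthonormal DCT-II, a 16-sample segment and one uniform quantisation step for all coefficients; prediction between segments and entropy coding are left out, and the function names are illustrative.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    return [
        math.sqrt((1 if k == 0 else 2) / n)
        * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(
            c[k] * math.sqrt((1 if k == 0 else 2) / n)
            * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n)
        )
        for i in range(n)
    ]


def encode_segment(samples, step):
    """Transform a segment and quantise every coefficient with the same step."""
    return [round(c / step) for c in dct(samples)]


def decode_segment(quantised, step):
    """Dequantise and transform back to the time domain."""
    return idct([q * step for q in quantised])
```

For a slowly varying parameter most AC coefficients quantise to zero, which is where the very low bit-rates come from once an entropy coder is applied.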

  Face, MP4 V1,
2kbps

  Body, MP4 Amd1,
5-30 kbps

  Skinned Model, MP4 Part 16,
5-30 kbps for a human like
skeleton

Very low bit-rate

  Subdivision surf.   Procedural texture   Morphing
  Solids
  SC-3DMC

  Defined as a base mesh and
a collection of target meshes

  Animation obtained by
updating the weights of the
target meshes

  BBA stream updated to
include morph data

  Usable for any kind of 3D
object

Local and precise control for shape deformation
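
Morphing between a base mesh and weighted target meshes can be sketched as below. The sketch assumes the common blend-shape convention in which each target contributes its weighted offset from the base; the slides do not state the exact formula, and the function name is hypothetical.

```python
def morph(base, targets, weights):
    """Blend a base mesh with target meshes.

    base    : list of (x, y, z) vertices
    targets : list of meshes, each with the same vertex count as base
    weights : one weight per target; all zeros returns the base shape
    """
    result = []
    for i, b in enumerate(base):
        result.append(tuple(
            b[a] + sum(w * (t[i][a] - b[a]) for t, w in zip(targets, weights))
            for a in range(3)
        ))
    return result
```

Animating the object then only requires streaming the weight vector per frame, not the vertex positions.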

  Wavelet SS   DIBR   FAMC
  Solids
  SC-3DMC

  Cluster the vertices with
respect to their motion

  Encode a cluster motion by
an affine transform

  Encode the residual error at
vertex level by traditional
approach (DCT/W,
quantization, entropy encoding)

  Remote and
programmatic
  Solids
  SC-3DMC

•  Remote animation: BIFS-Anim & BIFS Command
[Figure: Bifs-Anim and Bifs-Command streams updating the scene]

•  Programmatic animation
- ECMA Script: Script node, part of the scene description
- Java code: MPEG-J stream, standardized API for accessing the scene graph

Complex application scenario can be built

  Mesh Grid   Remote and
  Point Texture
programmatic
  Solids
  SC-3DMC

MPEG 3DG
How to measure compression performance?

On-line benchmarking platform:
www.MyMultimediaWorld.com

•  Allows easy integration of proprietary algorithms by using an
API in C++
•  No need to disclose the algorithm source code
•  Benchmark automatically updated for new content
•  Restricts and refines the benchmark by means of
easy-to-control parameters

MPEG 3DG
On-line benchmarking : www.MyMultimediaWorld.com
[Figure: platform architecture. Web site (3D compression benchmark, Benchmark 2, Benchmark 3); filter and presentation engine; benchmark managers 1 and 2; indexed MDB (MP4, extended MP7); 3D compression benchmark manager API; Algorithm 1, Algorithm 2, …, Algorithm n]

MPEG 3DG
On-line benchmarking : www.MyMultimediaWorld.com

[Figure: benchmarking API. Vertex buffer → encoder (proprietary coder library): GetNbBitstream (n), DumpBitSream → bitstream (1, n) → decoder (proprietary decoder library): DumpCompressedVB → .Compressed_VB (1, n)]

MPEG 3DG

Benchmarking platform, per-object visualization

- object properties (number of vertices/
triangles, number of components and file
size),
- distortion graph (linear or logarithmic),
- compression gain with respect to 3DMC,
- encoding time,
- decoding time.

MPEG 3DG

Benchmarking platform, global visualization

Filters

- semantic category,
- average distortion,
- number of vertices,
- number of connected components in the
object,
- database subset.

MPEG 3DG

Benchmarking platform, global visualization

Geometry compression:
-compression ratio 40:1

Animation compression:
-compression ratio 100:1

XML Scene Representation
[Figure: any XML → (binary) XML binarisation + 3D graphics compression]

Layer 1: textual XML representation of the scene
Layer 2: binarized XML layer; contains the unclassified elements of the scene graph
Layer 3: compressed layer; contains specific types of media information (geometry, animation, ...)
Result: multiplexed layers 2 and 3

MPEG-4 P25
Encoder side: possible implementation (informative)

Encoder Side

MP4
XMT,
COLLADA,
X3D

Comp ratio: 40-70:1

MPEG-4 P25
Decoder side (normative)

Decoder Side

MP4
XMT,
COLLADA,
X3D

Lossless for the data structure, lossless or lossy for graphics primitives

Standards for Avatars as visualization support

Standard   Generic features                                         Avatar representation
COLLADA    3D objects/scenes                                        Generic object
VRML/X3D   3D objects/scenes, application behavior                  H-Anim
MPEG-4     2D/3D objects/scenes, application behavior, compression  A set of dedicated nodes, a dedicated compressed stream

Standards for Avatars as interaction support

Representation Features
HumanML human physical description, emotion

EmotionML emotion, facial expressions, gestures

BML speech, gesture, gaze
MPML speech
VHML facial and body animation, emotional representation

CML character attribute and animation definition

Why does none of the existing standards solve the issue of avatar
interoperability in Virtual Worlds?

–  Virtual Worlds are proprietary applications and the 3D assets, including
the avatars, have economic value
–  Virtual Worlds, and 3D applications in general, have specific data formats
allowing rendering optimization

However, while maintaining a strict control for economic and technical
reasons, Virtual Worlds allow users to personalize the avatars

Avatar
Template

Attributes that can be
modified by the user

Specify the set of Personalization Parameters (PP) that transforms a
template of an arbitrary Virtual World into the user designed avatar

What avatar feature can be personalized?

Mainly the appearance

Analysis of Second Life, IMVU, Entropia Universe, Sony PlayStation and HumanML

Very heterogeneous set of personalization parameters

[Figure: personalization parameters in Entropia Universe, Nintendo Wii, HumanML, Second Life and PlayStation]

“Appearance” element
<Appearance>

<Body>
<BodyHeight value="165"/>
<BodyFat value="15"/>
</Body>

<Head>
<HeadShape value="oval"/>
<EggHead value="true"/>
</Head>

</Appearance>
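
To show how a player might read such personalization parameters, here is a small Python sketch using the standard `xml.etree.ElementTree` module. It assumes a well-formed fragment with quoted attribute values and the element names shown on the slide; the `read_parameters` helper and the dotted key format are hypothetical and not part of MPEG-V.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the "Appearance" element; attribute values are quoted
# so that the text is well-formed XML.
APPEARANCE_XML = """
<Appearance>
  <Body>
    <BodyHeight value="165"/>
    <BodyFat value="15"/>
  </Body>
  <Head>
    <HeadShape value="oval"/>
    <EggHead value="true"/>
  </Head>
</Appearance>
"""


def read_parameters(xml_text):
    """Flatten the two-level structure into {'Group.Parameter': value} strings."""
    root = ET.fromstring(xml_text.strip())
    params = {}
    for group in root:
        for item in group:
            params[f"{group.tag}.{item.tag}"] = item.get("value")
    return params
```

A target Virtual World would then map each recognised key onto its own avatar template and ignore the ones it does not support.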

Interoperability at the Animation level

<Animation>

<Greeting>
<Salute>salut</Salute>
<Cheer>cheer</Cheer>
</Greeting>

<Fighting>
<shoot>pousse</shoot>
<throw>throw</throw>
</Fighting>

</Animation>

Motion retargeting

[Figure: avatar template in VW1 with a “Walk” animation defined in VW1; avatar template in VW2 with no “Walk” animation defined in VW2]

“Control” element (body control and face control)
<Control>
<BodyFeaturesControl>
<UpperBodyBones>
<LClavicle>my_LClavicle</LClavicle>
<RClavicle>my_RClavicle</RClavicle>
</UpperBodyBones>
</BodyFeaturesControl>

<FaceFeaturesControl>
<HeadOutline>
<Left X="0.23" Y="1.25" Z="7.26"/>
<Right X="0.25" Y="1.25" Z="7.21"/>
</HeadOutline>
</FaceFeaturesControl>

</Control>

MPEG-V
(Personalization
Parameters)

MPEG-V + MPEG-4
The Avatar in The Avatar in
VW1 VW2

The Avatar in an
external player

MPEG 3DGC
Multi-Resolution 3DMC

Current 3DG content in VW

© Samsung © Second Life

Tomorrow’s 3DG content in VW
(100 times denser)
© OTOY © OTOY

MPEG 3DGC
Reconfigurable Graphics Coding

A framework that allows the decoder to be set up at run time

Thank you!

www.MyMultimediaWorld.com
www-artemis.it-sudparis.eu
marius.preda@it-sudparis.eu

MPEG 3DGC
Multi-resolution 3DMC

Progressive mesh [Hoppe’96] Progressive Forest Split (PFS) [Taubin’98] Patch coloring [Cohen-Or’99]

Valence-based decimation approach [Alliez’01] Octree-based compression [Peng’05]

Spectral coding [Karni’01]

MPEG 3DGC
Multi-resolution 3DMC : two proposed methods

1. Progressive TFAN
Original

Interpolated with 1%
of the vertices

MPEG 3DGC

2. KLT encoder


MPEG 3DGC

1. Work on
-  document the use case scenarios (object examination, navigation, …)
-  database
-  comparison with other codecs
-  comparison method (PSNR, Resolution)
-  platform for testing

2. You are invited to contribute
