Visualizing very large models in real time is now a common problem. Large models are widespread in film, video games, CAD design, medical imaging, seismic analysis, geospatial data, and so on, and visualizing them is difficult. This seminar presents the techniques that can currently overcome these limitations and make real-time visualization of large 3D models possible.
1. www.crs4.it/vic/
Massive Model Rendering
Fabio Marton
CRS4
Visual Computing
2. F. Marton– CRS4/Visual Computing, October 2012
Goal: interactive inspection of
massive models on PC platforms…
Massive datasets rendered on a commodity PC
3. F. Marton– CRS4/Visual Computing, October 2012
Application domains / data sources
• Many important application domains
• Today’s models exceed
– O(10^8-10^10) samples
– O(10^9-10^11) bytes
• Varying
– Dimensionality
– Topology
– Sampling distribution
Data sources:
• Local terrain models: 2.5D – flat – dense regular sampling
• Planetary terrain models: 2.5D – spherical – dense regular sampling
• Laser scanned models: 3D – moderately simple topology – low depth complexity – dense
• CAD models: 3D – complex topology – high depth complexity – structured – ‘ugly’ mesh
• Natural objects / simulation results: 3D – complex topology + high depth complexity + unstructured/high frequency details
4. F. Marton– CRS4/Visual Computing, October 2012
The (minimal) challenge: real-time
rendering of massive static models
• Explore very large models at interactive
rates
– Update screen at “interactive rates” as viewpoint
changes
[Diagram: Storage (giga/terabytes) → I/O over limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…) → Projection + Visibility + Shading, driven by view parameters → Screen (megapixels/frame at 10-100 fps)]
5. F. Marton– CRS4/Visual Computing, October 2012
A real-time data filtering problem!
• Models of unbounded complexity on limited
computers
– Need for output-sensitive techniques (O(N), not O(K))
• We assume less data on screen (N) than in model (K → ∞)
– Need for memory-efficient techniques (maximize cache
hits!)
– Need for parallel techniques (maximize CPU/GPU core
usage)
[Diagram: Storage, O(K = unbounded) bytes (triangles, points, …) → I/O over limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…) → Projection + Visibility + Shading, driven by view parameters → Screen, O(N = 1M-100M) pixels at 10-100 Hz]
6. F. Marton– CRS4/Visual Computing, October 2012
A real-time data filtering problem!
[Diagram: Storage, O(K = unbounded) bytes (triangles, points, …) → I/O over limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…) → small working set → Projection + Visibility + Shading, driven by view parameters → Screen, O(N = 1M-100M) pixels at 10-100 Hz]
7. F. Marton– CRS4/Visual Computing, October 2012
Output-sensitive techniques
• At preprocessing time: build multiresolution (MR) hierarchy
– Data prefiltering!
– Visibility + simplification
– Not output sensitive
• At run-time: selective view-dependent refinement from out-of-core data
– Must be output sensitive
– Access to prefiltered data under real-time constraints
– Visibility + LOD
[Figure: multiresolution hierarchy, from COARSE to FINE]
8. F. Marton– CRS4/Visual Computing, October 2012
Output-sensitive techniques
[Figure: the run-time refinement FRONT in the multiresolution hierarchy; nodes are classified as occluded / out-of-view, inaccurate, or accurate]
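The run-time refinement described above can be sketched as a top-down traversal of the prefiltered hierarchy. This is only a minimal illustration under stated assumptions: `Node`, `select_front`, `is_visible` and `projected_error` are hypothetical names, not the authors' code, and the hierarchy is a plain in-memory tree rather than out-of-core data.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical hierarchy node holding prefiltered data for a region."""
    name: str
    error: float                                  # assumed object-space error
    children: List["Node"] = field(default_factory=list)


def select_front(
    root: Node,
    is_visible: Callable[[Node], bool],
    projected_error: Callable[[Node], float],
    tolerance: float,
) -> List[Node]:
    """Return the nodes to render for the current view."""
    front: List[Node] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if not is_visible(node):
            continue                      # occluded / out-of-view: skip the subtree
        if projected_error(node) <= tolerance or not node.children:
            front.append(node)            # accurate (or finest available): render it
        else:
            stack.extend(node.children)   # inaccurate: refine
    return front
```

The work done depends on the size of the selected front and of the culled branches that are visited, not on the size of the full model, which is the sense in which such a traversal is output sensitive.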
9. F. Marton– CRS4/Visual Computing, October 2012
Our contributions
GPU-friendly output-sensitive techniques
• Chunk-based multiresolution structures
– Combine space partitioning + level of detail
– Same structure used for visibility and detail culling
• Seamless combination of chunks
– Dependencies ensure consistency at the level of chunks
• Complex rendering primitives
– GPU programming features
– Curvilinear patches, view-dependent voxels, …
• Chunk-based external memory management
– Compression/decompression, block transfers, caching
[Diagram: off-line, partitioning and simplification build the multiresolution structure (data + dependency); on-line, the structure feeds adaptive rendering on the GPU through a cache, over network / bus]
10. F. Marton– CRS4/Visual Computing, October 2012
Our contributions
GPU-friendly output-sensitive techniques
*-BDAM – Local and Global Terrain Models
Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR)
EG 2003, IEEE Viz 2003, EG 2005
Adaptive Tetrapuzzles – Dense meshes
Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR)
SIGGRAPH 2004
Layered Point Clouds – Dense clouds
Gobbetti/Marton (CRS4)
SPBG 2004 / Computers & Graphics 2004
Far Voxels – General
Gobbetti/Marton (CRS4)
SIGGRAPH 2005
Blockmaps – Hybrid volumetric city model
Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Di Benedetto/Scopigno (CNR)
EG 2007
MOVR – Volumetric models
Gobbetti/Marton/Iglesias Guitian (CRS4)
CGI 2008
11. F. Marton– CRS4/Visual Computing, October 2012
Our contributions
GPU-friendly output-sensitive techniques
[Same list of techniques as on slide 10, annotated by rendering approach: the label RASTERIZATION appears next to the upper entries of the list, the label RAYCASTING next to MOVR]
12. F. Marton– CRS4/Visual Computing, October 2012
Our contributions
GPU-friendly output-sensitive techniques
[Same list of techniques as on slide 10, annotated by framework: the label MESH-BASED FRAMEWORK appears next to the upper entries of the list, the label MESH-LESS FRAMEWORK next to the lower entries]
13. F. Marton– CRS4/Visual Computing, October 2012
Our contributions
GPU-friendly output-sensitive techniques
[Same list of techniques as on slide 10, with two entries added alongside it: Chunked Multi-Triangulations (Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR), IEEE Viz 2005) and View-dep. Volumetric Model (in progress). Each is connected to the listed techniques by "Specialize" and "Generalize" arrows]
15. F. Marton– CRS4/Visual Computing, October 2012
Real-time adaptive meshes
• The problem: efficiently
create view-dependent
meshes
• Constraints:
– must approximate original
surface with controlled
screen-space error
– must preserve continuity
(conforming meshes)
– must handle meshes of
varying topology
– must be efficiently rendered
16. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
The Multi Triangulation Framework
• Theoretical basis
– MT multiresolution framework (Puppo 1996)
• Our contribution
– GPU friendly implementation based on surface chunks with boundary constraints
– Optimized implicit specializations (TetraPuzzles/V-Partitions)
– Parallel out-of-core pre-processing and out-of-core run-time
[Diagram: off-line, partitioning and simplification build the multiresolution structure (data + dependency); on-line, the structure feeds adaptive rendering on the GPU through a cache, over network / bus]
Cignoni, Ganovelli, Gobbetti, Marton, Ponchio, and Scopigno.
Batched Multi Triangulation.
In Proc. IEEE Visualization. Pages 207-214. October 2005.
17. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
The Multi Triangulation Framework
• Consider a sequence of local
modifications over a given
description D
– Each modification replaces a
portion of the domain with a
different conforming portion
(simplified)
– f_1: the floor (the portion being replaced)
– g_1: the new fragment
D' = (D \ f) ∪ g
D_{i+1} = D_i ⊕ g_{i+1}
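A local modification can be illustrated with plain sets of triangle identifiers. The sketch below is an assumption-laden toy (`apply_modification` and the triangle names are hypothetical): it only models which triangles are present, not their geometry or the conformity of the replacement.

```python
def apply_modification(mesh: set, floor: set, fragment: set) -> set:
    """Replace the floor by the new fragment: mesh minus floor, plus fragment."""
    if not floor <= mesh:
        # The portion being replaced must be part of the current description.
        raise ValueError("floor must be contained in the current mesh")
    return (mesh - floor) | fragment


# Toy example: two triangles are replaced by one simplified triangle.
D = {"t1", "t2", "t3", "t4"}
D_prime = apply_modification(D, floor={"t1", "t2"}, fragment={"s1"})
```

Applying a sequence of such modifications one after the other gives the sequence of descriptions D_0, D_1, … used by the framework.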
18. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
The Multi Triangulation Framework
• Dependencies
between
modifications can
be arranged in a
DAG
19. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
The Multi Triangulation Framework
• Dependencies
between
modifications can
be arranged in a
DAG
– Adding a sink to
the DAG we can
associate each
fragment to an arc
leaving a node
20. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
MT Cuts
• A cut of the DAG
defines a new
representation
– Just paste all the
fragments above the
cut
D* = D_0 ⊕ g_1 ⊕ g_4
21. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
MT Cuts
• A cut of the DAG defines
a new representation
– Collect all the fragment
floors of cut arcs and you
get a new conforming
mesh
D* = D_0 ⊕ g_1 ⊕ g_4 = f_{0,∞} ∪ f_{0,2} ∪ f_{0,3} ∪ f_{1,3} ∪ f_{1,∞} ∪ f_{4,∞}
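One way to read a cut is as a set of modifications that is closed under the DAG dependencies and is applied to the base description. The sketch below illustrates that reading only; the names (`close_under_dependencies`, `extract`) and the representation (sets of triangle identifiers, modifications stored in creation order so that each one depends only on earlier ones) are assumptions, not the MT data structure itself.

```python
from typing import Dict, Iterable, List, Set, Tuple

Modification = Tuple[str, Set[str], Set[str]]   # (name, floor, fragment)


def close_under_dependencies(selected: Iterable[str],
                             depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Add every modification that a selected modification depends on."""
    closed: Set[str] = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name not in closed:
            closed.add(name)
            stack.extend(depends_on.get(name, ()))
    return closed


def extract(base: Set[str], modifications: List[Modification],
            selected: Iterable[str], depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Apply the dependency-closed selection to the base, in creation order."""
    wanted = close_under_dependencies(selected, depends_on)
    mesh = set(base)
    for name, floor, fragment in modifications:
        if name in wanted:
            mesh = (mesh - floor) | fragment
    return mesh


base = {"a", "b", "c", "d"}
mods = [
    ("g1", {"a", "b"}, {"ab"}),
    ("g2", {"ab", "c"}, {"abc"}),   # needs g1, which creates "ab"
    ("g3", {"d"}, {"d2"}),
]
deps = {"g2": {"g1"}}
result = extract(base, mods, ["g2"], deps)
```

Selecting `g2` alone pulls in `g1` through the dependency, which is the kind of consistency constraint that the DAG arcs encode.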
22. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
GPU Friendly MT
• Chunked MT assumes
fragments are triangle
patches with proper
boundary constraints
– DAG << original mesh
(patches composed of
thousands of triangles)
– Structure memory +
traversal overhead
amortized over thousands
of triangles
– Per-patch optimizations
23. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
GPU Friendly MT
• Chunked MT assumes
regions provide good
hierarchical space-
partitioning
– Compact
• Close-to-spherical
– Used for computing fast
projected error upper
bounds
– Used for visibility
queries
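A compact, close-to-spherical region makes it cheap to bound the projected error of a patch. The sketch below shows one common isotropic approximation, assuming a bounding sphere per region and a perspective camera; the function name and parameters are hypothetical, and the bound is conservative only with respect to distance (it ignores off-axis stretching of the projection).

```python
import math


def projected_error_bound(model_error, center, radius, eye,
                          fov_y_deg, viewport_height_px):
    """Bound, in pixels, on the screen size of an object-space error."""
    # Closest the region can be to the eye, using its bounding sphere.
    distance = math.dist(center, eye) - radius
    if distance <= 0.0:
        return math.inf   # eye inside the sphere: no finite bound, force refinement
    # Pixels covered by one unit of length at unit distance from the eye.
    pixels_per_unit = viewport_height_px / (2.0 * math.tan(math.radians(fov_y_deg) / 2.0))
    return model_error * pixels_per_unit / distance
```

The same sphere can be tested against the view frustum for the visibility queries mentioned above.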
24. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
GPU Friendly MT
• Construction
– Start with hires triangle soup
– Partition model using a
hierarchical space
partitioning scheme
– Construct non-leaf cells by
bottom-up recombination
and simplification of lower
level cells
– Assign model space errors
to cells
• Rendering
– Refine conformal hierarchy, render selected precomputed cells
– Project errors to screen
– Dual queue
25. F. Marton– CRS4/Visual Computing, October 2012
Chunked Multi Triangulations
DAG problems
• Not all MTs are good MTs!
– The topology of dependencies
may lower the adaptivity of the
multiresolution structure
• Cascading dependencies are BAD!!!
– The geometry of DAG regions
may cause problems in view-
dependent rendering
• Compact regions
• Proposed solutions:
– SIGGRAPH 2004: Efficient
constrained technique
(TetraPuzzles)
– IEEE Viz 2005: General
construction technique (V-
Partition)
– … see also QVDR, IEEE Viz 2004
and other related work…
26. F. Marton– CRS4/Visual Computing, October 2012
Adaptive TetraPuzzles
• Construction
– Start with hires triangle
soup
– Partition model using a
conformal hierarchy of
tetrahedra
– Construct non-leaf cells by
bottom-up recombination
and simplification of lower
level cells
• Rendering
– Refine conformal
hierarchy, render selected
precomputed cells
28. F. Marton– CRS4/Visual Computing, October 2012
Adaptive TetraPuzzles
Overview
• Construction
– Start with hires triangle
soup
– Partition model using a
conformal hierarchy of
tetrahedra
– Construct non-leaf cells by
bottom-up recombination
and simplification of lower
level cells
• Rendering
– Refine conformal hierarchy, render selected precomputed cells
[Figure: view-dependent mesh refinement]
29. F. Marton– CRS4/Visual Computing, October 2012
Adaptive TetraPuzzles
Results
Michelangelo’s St. Matthew
Source: Digital Michelangelo
Project
Data: 374M triangles
Intel Xeon 2.4GHz 1GB
GeForce FX 5800U AGP8X
30. F. Marton– CRS4/Visual Computing, October 2012
Advantages of mesh-based
multiresolution models
• First GPU bound methods
for very large meshes
– Adaptive conforming
meshes
• Reduced overdraw
– Extensive optimization
• Stripification, cache
coherence, compression, …
– State of the art
performance
• GPU bound, >4Mtri/frame at
>30 fps on modern GPUs
• Extremely high quality for
large dense models with
“well behaved” surface
31. F. Marton– CRS4/Visual Computing, October 2012
Limitations of mesh-based
multiresolution models
• Visibility and multiresolution
solved as separate problems
– Error measured on boundary
surfaces
– LOD construction based on
local surface
coarsening/simplification
operations
– LOD construction unaware of
visibility (view-independent
approximations)
• Hard to apply to models with
high detail and complex
topology and high depth
complexity!
32. F. Marton– CRS4/Visual Computing, October 2012
Overcoming limitations of local
mesh refinement techniques
• Tight integration of
visibility and LOD
construction
– Multi-scale modeling of
appearance rather than
geometry
– Volume-based rather
than surface-based
34. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Handling Huge Complex 3D models
• General purpose technique
that targets many model
kinds
• Underlying ideas
– Multi-scale modeling of
appearance rather than
geometry
– Volume-based rather than
surface-based
– Tight integration of
visibility and LOD
construction
– GPU accelerated
(programmability +
batching)
35. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
The Far Voxel Concept
• Assumption: opaque surfaces,
non participating medium
• Goal is to represent the
appearance of complex far
geometry
– Near geometry can be
represented at full resolution
• Idea is to discretize a model
into many small volumes
located in the neighborhood of
surfaces
– Approximates how a small
subvolume of the model reflects
the incoming light
=> View-dependent cubical voxel
36. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
The Far Voxel Concept
• Assumption: opaque surfaces,
non participating medium
• Goal is to represent the
appearance of complex far
geometry
– Near geometry can be
represented at full resolution
• Idea is to discretize a model
into many small volumes
located in the neighborhood of
surfaces
– Approximates how a small
subvolume of the model reflects
the incoming light
=> View-dependent voxel
37. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
The Far Voxel Concept
• A far voxel returns color
attenuation given
– View direction
– Light direction
• Rendered using a customized vertex shader executed on the GPU
Shader = f(view direction, light direction)
39. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Construction overview: Inner nodes
• Sample a model subvolume to build a grid of far voxels
• Voxels are far
– Project to worst case θmax
– Viewed not closer than dmin
[Figure: section of the 3D grid of far voxels, with dmin and θmax marked]
40. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Construction overview: Inner nodes
• Sample a model subvolume to build a grid of far voxels
• Voxels are far
– Project to worst case θmax
– Viewed not closer than dmin
• Raycasting samples original model and identifies visible voxels
[Figure: section of the 3D grid of far voxels, with dmin and θmax marked]
42. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Construction overview: Object Space
Occlusion
• Environment occlusion
• Cull interior part of grid of far voxels
[Figure: section of the 3D grid of far voxels, with dmin and θmax marked and culled interior voxels crossed out]
44. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Construction overview: Object Space
Occlusion
• Environment occlusion
• Cull interior part of grid of far voxels
• Culls 40% of the high depth complexity Boeing 777 model
• Worst case θmax = 0.5 deg (~10 pixel tolerance for 1024x1024 viewport using 50 deg FOV)
• Minimize artifacts due to leaking of occluded parts of different colors
[Figure: section of the 3D grid of far voxels, with dmin and θmax marked and culled interior voxels crossed out]
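The quoted pixel tolerance follows from the angular tolerance with simple arithmetic. The sketch below assumes a uniform mapping of angle to pixels across the field of view (a small-angle approximation); `angle_to_pixels` is a hypothetical helper name.

```python
def angle_to_pixels(angle_deg: float, viewport_px: int, fov_deg: float) -> float:
    """Approximate screen extent of an angle, assuming pixels spread evenly over the FOV."""
    return angle_deg * viewport_px / fov_deg


# 1024 pixels over 50 degrees is 20.48 pixels per degree,
# so 0.5 degrees covers about 10 pixels.
tolerance_px = angle_to_pixels(0.5, 1024, 50.0)
```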
45. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Construction overview: Far Voxel
• Consider voxel subvolume
• Samples gathered from
unoccluded directions
– Sample:
• (BRDF, n) = f(view direction)
46. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Construction overview: Far Voxel
• Consider voxel subvolume
• Samples gathered from
unoccluded directions
– Sample:
• (BRDF, n) = f(view direction)
• Compress shading
information by fitting
samples to a compact
analytical representation
47. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Construction overview: Far Voxel Shaders
• Build all the K different far voxel representations
– K = flat, smooth, …
– Principal component analysis
• Evaluate each representation error
– Compare real values (samples) with the voxel approximations from the sample direction
Err(k) = [equation image missing]
• Choose approximation with lowest error
[Figure: flat proxy: 2 components; smooth proxy: 6 components; others…]
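The selection step can be sketched as follows. The error measure on the slide is not recoverable, so this sketch assumes a sum of squared differences between samples and approximation; the function name, the scalar "direction" and the two candidate proxies are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def choose_representation(
    samples: List[Tuple[float, float]],
    candidates: Dict[str, Callable[[float], float]],
) -> Tuple[str, Dict[str, float]]:
    """Pick the candidate whose values best match the gathered samples.

    samples: (direction, value) pairs gathered from unoccluded directions.
    candidates: name -> approximation evaluated at a sample direction.
    """
    def error(approx: Callable[[float], float]) -> float:
        # Assumed error measure: sum of squared differences.
        return sum((approx(d) - v) ** 2 for d, v in samples)

    errors = {name: error(fn) for name, fn in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors


samples = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
candidates = {
    "constant": lambda d: 0.5,   # stand-in for a simple proxy
    "linear": lambda d: d,       # stand-in for a richer proxy
}
best, errors = choose_representation(samples, candidates)
```

In the actual technique each candidate would be a compact analytical shader model fitted to the samples; only the "evaluate all, keep the lowest error" logic is shown here.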
48. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Rendering
• Hierarchical traversal with coherent culling
– Stop when out-of-view, occluded (GPU
feedback), or accurate enough
• Leaf node: Triangle rendering
– Draw the precomputed triangle strip
• Inner node: Voxel rendering
– For each far voxel type
• Enable its shader
• Draw all its view dependent primitives using
glDrawArrays
– Splat voxels as antialiased point primitives
– Limits
• Does not consider primitive opacity
• Rendering quality similar to one-pass point splat methods (no sorting/blending)
[Figure labels: Triangles, Far Voxels]
49. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Results
• Tested on extremely complex heterogeneous surface
models
– St.Matthew, Boeing 777, Richtmyer Meshkov isosurf., all at
once
• Tested in a number of situations
– Single processor / cluster construction
– Workstation viewing, large scale display
Model                          Triangles   Size
St. Matthew                    373M        14.5 GB
Boeing 777                     350M        13.7 GB
Richtmyer Meshkov isosurface   472M        18.4 GB
All at once                    1.2G        46.6 GB
50. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Results
• 1-16 Athlon 2200+ CPU, 3 x 70GB ATA 133 Disk
(IDE+NFS)
• 1-20K triangles/sec
– Scales well, limited by slow disk I/O for large meshes
– Slow!! (but similar to recent adaptive tessellation methods)
• Avg. triangles per leaf 5K
• Avg. voxels per inner node 2.5K
5h18m (16 CPU) 6h51m (16 CPU) 8h06m (16 CPU)
10.6 GB 14.9 GB 16.1 GB 41.6 GB
51. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Results
• Xeon 2.4GHz, 70GB SCSI 320 Disk, GeForce FX6800GT AGP
8x
• Window size: from video resolution to stereo projector
display
– St.Matthew, Boeing, Isosurface: 640 x 480
– All at once: 640 x 480 and Stereo 2 x 1024 x 768
• Pixel tolerance: [Target 1 | Actual ~0.9 | Max ~10]
• Resident set size limited to ~200 MB
640 x 480
20 Fps
42 MPrim/s
2 x 1024 x 768
45 Fps 44 Fps 34 Fps 20 Fps
51 MPrim/s 42 MPrim/s 41 MPrim/s 40 MPrim/s
52. F. Marton– CRS4/Visual Computing, October 2012
Far Voxels
Conclusions
• General purpose
technique that targets
many model kinds
– Seamless integration of
• multiresolution
• occlusion culling
• out-of-core data management
– High performance
– Scalability
• Main limitations
– Slow preprocessing
– Non-photorealistic rendering
quality
Intel Xeon 2.4GHz 1GB, GeForce 6800GT AGP8X
54. F. Marton– CRS4/Visual Computing, October 2012
Our contributions
GPU-friendly output-sensitive techniques
[Same overview as on slide 13, with the last entry relabeled "MOVR – COVRA Volumetric models"]
55. www.crs4.it/vic/
Recent Advances in Massive
Volume Visualization
56. F. Marton– CRS4/Visual Computing, October 2012
Introduction
Goal
• Visualization of massive scalar
volumes without size limitations
– A single-pass raycasting
technique working out-of-core on
GPU parallel architectures
• Compress data to facilitate data
streaming and 4D visualizations
– Novel compression architecture
and novel compression methods
57. F. Marton– CRS4/Visual Computing, October 2012
Introduction
Teaser
Compression-domain adaptive volume rendering based on sparse representation of
voxel blocks. NVIDIA GTX 560
58. F. Marton– CRS4/Visual Computing, October 2012
The Visual Computer 2008 & 2010
MOVR: A single-pass raycasting
technique working out-of-core on
GPU parallel architectures
59. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Volume rendering problem
[Figure: a ray cast through a pixel accumulates samples along the volume; labels: accumulation, early ray termination, empty space skipping, order independent, order dependent]
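The accumulation along a ray, with early ray termination, can be sketched with standard front-to-back alpha compositing. This is a generic scalar illustration, not the renderer presented in the talk: the function name, the scalar colors and the opacity threshold are assumptions, and empty space skipping is not modeled.

```python
from typing import Iterable, Tuple


def composite_front_to_back(
    samples: Iterable[Tuple[float, float]],
    opacity_threshold: float = 0.99,
) -> Tuple[float, float, int]:
    """Accumulate (color, alpha) samples ordered from the eye into the volume.

    Returns the accumulated color, the accumulated opacity and the
    number of samples actually processed.
    """
    color = 0.0
    alpha = 0.0
    steps = 0
    for sample_color, sample_alpha in samples:
        weight = (1.0 - alpha) * sample_alpha   # what still shows through
        color += weight * sample_color
        alpha += weight
        steps += 1
        if alpha >= opacity_threshold:
            break                               # early ray termination
    return color, alpha, steps


ray = [(1.0, 0.5), (0.0, 0.5), (1.0, 1.0), (1.0, 1.0)]
color, alpha, steps = composite_front_to_back(ray)
```

Front-to-back order is what makes early termination possible: once the accumulated opacity saturates, samples further along the ray cannot change the pixel.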
60. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Volume rendering problem
• Current interactive
solutions are based on
GPU architectures
– Massive parallelism
– Huge memory bandwidth
• E.g. GeForce GTX 580
– Has 192.4 GB/s of memory bandwidth
– Has 1581.1 GFLOPS
[ hardwareinsight.com ]
61. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Related work. Moderately sized volumes
• Current high quality solutions based on GPUs implementing …
– Slice-based methods
– Ray casting techniques
• The full volume must fit in GPU memory
[Figures: Li et al., 2003; Krüger et al., 2003]
62. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Contribution to the state-of-the-art
• Multiresolution out-of-core Volume Renderer
– Preprocessing
• build multiresolution octree of volume bricks
– Rendering:
• Adaptive CPU loading of the data from local/remote repository
cooperates with separate render thread fully executed in the
GPU
• Stackless traversal of an adaptive working set
• Exploitation of the visibility feedback
E. Gobbetti, F. Marton, and J. A. Iglesias Guitián.
A single-pass GPU ray casting framework for interactive out-of-core rendering of massive volumetric datasets.
The Visual Computer, 24, 2008.
J. A. Iglesias Guitián, E. Gobbetti and F. Marton.
View-dependent exploration of massive volumetric models on large-scale light field displays.
The Visual Computer, 26, 2010.
63. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Contribution to the state-of-the-art
• Use CPU for …
– Creation & loading
– Octree refinement
– Encode current cut using a spatial index
• Use GPU for …
– Stackless octree traversal
• Using neighbour pointers
– Rendering
• Flexible ray traversal / compositing strategies
• Improved visibility feedback
[Figures: architecture overview; neighbour pointer navigation]
64. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Method overview
[Flowchart. Creation and maintenance (CPU): offline preprocessing fills the storage (octree node / volume database); an adaptive loader drives octree refinement, checking whether the current working set has enough accuracy (no: keep refining; yes: prepare to render). Rendering (GPU): render, sending visibility feedback back to the refinement.]
65. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Visibility feedback
• Working set reduction
– Opaque 1731 -> 1035 bricks
– Transp. 1984 -> 1789 bricks
• Rendered on window size 1024x576
66. F. Marton– CRS4/Visual Computing, October 2012
Massive Volumes Visualization
Results (2/2)
Interactive exploration of
a 16bit 2GB CT volume on
a consumer NVidia 8800
GTS graphics board with
640MB (2008)
67. F. Marton– CRS4/Visual Computing, October 2012
Compression-Domain
Volume Rendering
• 60 time steps of the 432^3 supernova dataset
68. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Introduction
• Limited bandwidth and memory =>
– LOD (MOVR)
– Compression
• Compression is fully exploited if data is
maintained in compressed form through the
entire pipeline
– Compression-domain volume renderers + deferred filtering
• Highly asymmetric encoding/decoding schemes
– We can afford slow offline compression and precomputation
– Fast real-time data decoding, interpolation and shading
– Spatially independent random-access to data
69. F. Marton– CRS4/Visual Computing, October 2012
State-of-the-art
• CPU decompression
– Do not limit bandwidth and memory
• [Ning & Hesselink, 92] and many others...
• [Gobbetti et al. 08, Iglesias et al. 10]
• Hardware based
– E.g. S3TC [Brown], NVidia VTC [Craighead]
– Full random access
– Limited compression
• GPU decompression
– Full working set GPU decompression
• Tensor Approximation [Suter et al.2010]
• Do not limit memory
• Limit Bandwidth
– Partial working set
• Limit both memory and bandwidth
70. F. Marton– CRS4/Visual Computing, October 2012
Tensor Approximation
(CRS4 & UZH 2010)
• Multiresolution
• Brick Based
• Extract dominant data features
• Real Time GPU Reconstruction
– Full Working set
• Bandwidth optimization
• Memory Consumption
S. Suter, J. A. Iglesias Guitián, F.Marton, M. Agus, A. Elsener, C. Zollikofer,
M. Gopi, E. Gobbetti, and R. Pajarola.
Interactive Multiscale Tensor Reconstruction for Multiresolution
Volume Visualization. In: IEEE Transactions on Visualization and
Computer Graphics, pp. 2135–2143, vol 17, 2011
71. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Contribution to the state-of-the-art
• COVRA: Compression-domain Output-sensitive
Volume Rendering Architecture
– Novel architecture with parameterized cache behaviour
– Supports and extends state-of-the-art compression methods
• ☺ Efficient multisampling (HQ shading)
• ☺ No perspective limitations
• ☺ Fully adaptive multiresolution approach
• ☺ Multipass working set decompression
• ☺ High compression ratios and signal quality
J. A. Iglesias Guitián, F. Marton and E. Gobbetti.
COVRA: A Compression-Domain Output-Sensitive Volume Rendering Architecture Based on Sparse Representation of Voxel Blocks.
In: Proceedings of EuroVis 2012
72. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
COVRA: Overview
• Main concepts:
– Preprocessor builds multiresolution octree of compressed nodes
– Data travels in compressed form until the last stage
– Fully adaptive rendering
– Highly integrated decompression / rendering supporting high quality filtering and shading
73. F. Marton– CRS4/Visual Computing, October 2012
Run-time
COVRA: Subtree management
• Three rendering steps:
1. CPU multiresolution octree adaptive refinement
2. Partitioning of the octree into a set of subtrees
• Use the size of the GPU cache for decompressed data as constraint
• Front-to-back order determined at run time during the octree traversal
3. Subtree decompression, raycasting and compositing
• Decompress to a temporary buffer or to the available GPU cache
• Raycast the decompressed octree nodes
• Compose with previous results
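The three steps above can be sketched on the CPU as follows. This is a minimal illustration, not the COVRA implementation: the function names `partition_front_to_back` and `composite_front_to_back` are hypothetical, the subtree partitioning is simplified to greedy batching of a node list that is assumed to be already sorted front to back, and each node is assumed to contribute a single (color, alpha) sample.

```python
from typing import List, Sequence, Tuple


def partition_front_to_back(node_sizes: Sequence[int],
                            cache_capacity: int) -> List[List[int]]:
    """Greedily group node indices (already sorted front to back) into
    batches whose total decompressed size fits in the cache."""
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, size in enumerate(node_sizes):
        if size > cache_capacity:
            raise ValueError("a single node does not fit in the cache")
        if current and used + size > cache_capacity:
            batches.append(current)      # close the batch that is full
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        batches.append(current)
    return batches


def composite_front_to_back(samples: Sequence[Tuple[float, float]],
                            color: float = 0.0,
                            alpha: float = 0.0) -> Tuple[float, float]:
    """Accumulate (color, alpha) samples front to back, starting from the
    result of the previously rendered batches."""
    for sample_color, sample_alpha in samples:
        color += (1.0 - alpha) * sample_alpha * sample_color
        alpha += (1.0 - alpha) * sample_alpha
    return color, alpha


# Toy run: five nodes, already sorted front to back (assumed sizes and samples).
node_sizes = [4, 4, 2, 6, 3]
node_samples = [(1.0, 0.5), (0.5, 0.5), (0.2, 0.1), (0.8, 0.3), (0.4, 0.9)]
batches = partition_front_to_back(node_sizes, cache_capacity=8)

color, alpha = 0.0, 0.0
for batch in batches:
    # A real renderer would decompress this batch into the GPU cache here.
    batch_samples = [node_samples[i] for i in batch]
    color, alpha = composite_front_to_back(batch_samples, color, alpha)
```

Because compositing is carried over from one batch to the next, the multipass result matches what a single pass over all nodes would produce, which is what allows the working set to exceed the cache.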
74. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Sparse coding of volume blocks
• Each multiresolution octree node is decomposed into blocks.
• Each block, made of few^3 voxels, is compressed
[Figure: a single octree node containing overlapping information; a compressed block]
• Each block is represented by a sparse linear combination of a few dictionary elements
– Data specific representation
– Compression is achieved by storing indices and magnitudes
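Decoding such a block amounts to a short weighted sum of dictionary elements. The sketch below assumes a block is stored as a list of (index, magnitude) pairs; the name `decode_block` and the tiny 4-voxel dictionary are made up for illustration and are not taken from COVRA.

```python
from typing import List, Sequence, Tuple


def decode_block(dictionary: Sequence[Sequence[float]],
                 code: Sequence[Tuple[int, float]]) -> List[float]:
    """Reconstruct one block as a weighted sum of a few dictionary atoms."""
    block = [0.0] * len(dictionary[0])
    for atom_index, magnitude in code:
        atom = dictionary[atom_index]
        for i, value in enumerate(atom):
            block[i] += magnitude * value
    return block


# Toy dictionary: three atoms, each describing a 4-voxel block.
dictionary = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
]

# Compressed block: two (index, magnitude) pairs instead of four voxels.
code = [(0, 4.0), (2, 2.0)]
block = decode_block(dictionary, code)  # [3.0, 3.0, 1.0, 1.0]
```

Each block can be decoded independently of its neighbours, which is what gives the renderer random access to the compressed data.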
75. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Sparse coding of volume blocks
• Generalization of vector quantization
– Combine vectors instead of choosing single ones
– Overcomes limitations due to dictionary sizes
• Generalization of data-specific bases
– Dictionary is an overcomplete basis
– Sparse projection
• Encoding in two steps
– Training: Find data specific dictionary
– Sparse coding: Find best representation of each block using linear combination of dictionary elements under sparsity constraint
• We employ ORMP via Cholesky decomposition
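The sparse coding step can be illustrated with plain matching pursuit, a simpler greedy relative of ORMP. The sketch below is not ORMP and does not use a Cholesky factorization; it only shows the idea of picking a few atoms under a sparsity constraint. It assumes unit-norm atoms, and the name `matching_pursuit` and the toy dictionary are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def matching_pursuit(dictionary: Sequence[Sequence[float]],
                     signal: Sequence[float],
                     max_atoms: int) -> Tuple[List[Tuple[int, float]], List[float]]:
    """Greedy sparse coding: repeatedly pick the unit-norm atom that is most
    correlated with the residual and subtract its contribution."""
    residual = list(signal)
    code: Dict[int, float] = {}
    for _ in range(max_atoms):
        best_index, best_corr = None, 0.0
        for k, atom in enumerate(dictionary):
            corr = sum(a * r for a, r in zip(atom, residual))
            if abs(corr) > abs(best_corr):
                best_index, best_corr = k, corr
        if best_index is None:           # residual is orthogonal to all atoms
            break
        code[best_index] = code.get(best_index, 0.0) + best_corr
        best_atom = dictionary[best_index]
        residual = [r - best_corr * a for r, a in zip(residual, best_atom)]
    return sorted(code.items()), residual


# Toy dictionary of unit-norm atoms for 4-voxel blocks.
dictionary = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
]
code, residual = matching_pursuit(dictionary, [3.0, 3.0, 1.0, 1.0], max_atoms=2)
```

With this orthonormal toy dictionary two atoms reproduce the block exactly. With an overcomplete dictionary the result is generally an approximation, and order-recursive variants such as ORMP re-optimize all selected coefficients at each step.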
76. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Finding an optimal dictionary
• We employ the K-SVD algorithm for dictionary training
– Algorithm for designing overcomplete dictionaries for sparse representations [Aharon et al. 06]
• But running K-SVD calculations directly on massive volumes would be infeasible, therefore …
– … we applied the concept of coreset [Agarwal et al. 05] to smartly subsample and reweight the original training set [Feldman & Langberg 11, Feigin et al. 11]
77. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Dictionary learning (K-SVD)
• K-SVD can be seen as a K-Means generalization
• Basic steps:
– Sparse coding of signals in X, producing Γ
– Update dictionary atoms given the sparse representations
• Optimize one atom at a time, keeping the rest fixed
• The size of E is proportional to the number of training signals
– As in [Rubinstein et al. 08] we replace the SVD computation
with a simpler numerical approximation
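For reference, the two steps can be written in the usual K-SVD notation. This is the standard textbook formulation, not a transcription of the slide's lost formulas; the sparsity bound $T$ and the row notation $\gamma^{j}$ are symbols introduced here.

```latex
% Dictionary learning objective: X holds the training signals as columns,
% D the dictionary atoms d_j as columns, \Gamma the sparse codes.
\[
  \min_{D,\,\Gamma}\ \lVert X - D\,\Gamma \rVert_F^2
  \quad \text{subject to} \quad
  \lVert \gamma_i \rVert_0 \le T \ \text{ for every column } \gamma_i \text{ of } \Gamma
\]

% Atom update for atom k: error matrix with atom k removed,
% where \gamma^{j} denotes the j-th row of \Gamma.
\[
  E_k = X - \sum_{j \neq k} d_j\, \gamma^{j}
\]

% d_k and \gamma^{k} are then replaced by the best rank-1 approximation of E_k,
% restricted to the columns of the signals that actually use atom k.
```

Since $E_k$ has one column per training signal, its size grows with the training set, which is why reducing the number of training signals matters for massive volumes.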
78. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Coreset construction
• Calculations on massive input volumes are still infeasible, but we can …
– … reduce the amount of data used for training
– … use importance sampling
• We associate an importance with each of the original blocks, namely the standard deviation of its entries
– Pick C elements with probability proportional to their importance
– More important blocks should end up in our coreset
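The sampling step can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: `block_importance` and `sample_coreset` are hypothetical names, blocks are plain lists of voxel values, and sampling is done with replacement.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def block_importance(block: Sequence[float]) -> float:
    """Importance of a block: standard deviation of its entries."""
    return statistics.pstdev(block)


def sample_coreset(blocks: Sequence[Sequence[float]],
                   coreset_size: int,
                   rng: random.Random) -> Tuple[List[int], List[float]]:
    """Pick coreset_size block indices with probability proportional to
    their importance. Returns the indices and the per-block probabilities."""
    importances = [block_importance(b) for b in blocks]
    total = sum(importances)
    if total == 0.0:
        raise ValueError("all blocks are constant; nothing to sample")
    probabilities = [s / total for s in importances]
    chosen = rng.choices(range(len(blocks)), weights=probabilities, k=coreset_size)
    return chosen, probabilities


blocks = [
    [1.0, 1.0, 1.0, 1.0],   # constant block: importance 0, never picked
    [0.0, 2.0, 0.0, 2.0],   # importance 1
    [0.0, 4.0, 0.0, 4.0],   # importance 2, picked twice as often on average
]
chosen, probabilities = sample_coreset(blocks, coreset_size=10, rng=random.Random(0))
```

Homogeneous regions such as empty space get little or no weight, so the training set concentrates on blocks that carry detail.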
79. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
Coreset construction
• Non-uniform sampling introduces a severe bias
– Scale each selected block by a weight that depends on its associated selection probability
– Applying K-SVD to the scaled coefficients will converge to a dictionary associated with the original problem
• Coreset scalability
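Why reweighting removes the sampling bias can be seen from a short expectation argument. The weight formula on the slide was not recovered, so the choice $w_i = 1/\sqrt{C\,p_i}$ below is an assumption (the standard importance-sampling weight for a squared-error objective), with $p_i > 0$ the probability of picking block $x_i$ and $i_1, \dots, i_C$ the sampled indices.

```latex
% Assumed weight for block i, drawn with probability p_i in each of C draws.
\[
  w_i = \frac{1}{\sqrt{C\,p_i}}
\]

% Expected weighted squared error over the C sampled blocks:
\[
  \mathbb{E}\!\left[\, \sum_{c=1}^{C} w_{i_c}^{2}\,
      \lVert x_{i_c} - D\,\gamma_{i_c} \rVert_2^{2} \right]
  = \sum_{c=1}^{C} \sum_{i} p_i\, \frac{1}{C\,p_i}\,
      \lVert x_i - D\,\gamma_i \rVert_2^{2}
  = \sum_{i} \lVert x_i - D\,\gamma_i \rVert_2^{2}
\]
```

Under this assumption the weighted coreset error equals, in expectation, the error over all blocks with non-zero probability, so minimizing it targets the same dictionary as the original problem.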
80. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
COVRA: Results
• PSNR vs. Bits Per Sample
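For readers unfamiliar with the quality metric, PSNR can be computed as sketched below. The function name `psnr` and the sample values are illustrative only and are not taken from the reported results.

```python
import math
from typing import Sequence


def psnr(original: Sequence[float],
         reconstructed: Sequence[float],
         peak: float) -> float:
    """Peak signal-to-noise ratio in dB; peak is the largest possible
    sample value (e.g. 255 for 8-bit data)."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0.0:
        return math.inf                  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


quality_db = psnr([0.0, 255.0], [0.0, 245.0], peak=255.0)
```

Higher PSNR at the same number of bits per sample means better signal quality for the same storage cost.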
81. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
COVRA: Results
• Comparison against state-of-the-art GPU-based
decompression methods
83. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
COVRA: Results
• Gradient mapped to RGB color
84. F. Marton– CRS4/Visual Computing, October 2012
Volume Compression
COVRA: Video
Compression-domain adaptive volume rendering based on sparse representation of
voxel blocks. NVIDIA GTX 560. (2012)
85. F. Marton– CRS4/Visual Computing, October 2012
Summary and Conclusions
Summary
• Improved the scalability of state-of-the-art volume rendering techniques
– MOVR: a novel single-pass GPU ray casting framework supporting flexible ray traversal and incorporating visibility feedback for interactive exploration of large volumes without size limitations
• Improved compression and streaming of large and time-varying volumes
– COVRA: a novel compression-domain architecture supporting state-of-the-art compression methods, random access to compressed data and HQ shading
– A novel compression method for massive volumes based on sparse coding (K-SVD) and coreset training sets
86. F. Marton– CRS4/Visual Computing, October 2012
Our contributions
GPU-friendly output-sensitive techniques
• *-BDAM – Local and Global Terrain Models
– Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR)
– EG 2003, IEEE Viz 2003, EG 2005
• Adaptive Tetrapuzzles – Dense meshes
– Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR)
– SIGGRAPH 2004
• Chunked Multi-Triangulations
– Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR)
– IEEE Viz 2005
• Layered Point Clouds – Dense clouds
– Gobbetti/Marton (CRS4)
– SPBG 2004 / Computers & Graphics 2004
• Far Voxels – General
– Gobbetti/Marton (CRS4)
– SIGGRAPH 2005
• Blockmaps – Hybrid volumetric city model
– Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Di Benedetto/Scopigno (CNR)
– EG 2007
• MOVR – COVRA Volumetric models
– Gobbetti/Marton/Iglesias Guitian (CRS4)
– CGI 2008
[Diagram labels relating the techniques: Specialize, Generalize, View-dep. Volumetric Model, In progress]
87. F. Marton– CRS4/Visual Computing, October 2012
A real-time data filtering problem!
• Models of unbounded complexity on limited computers
– Need for output-sensitive techniques (O(N), not O(K))
• We assume less data on screen (N) than in model (K → ∞)
– Need for memory-efficient techniques (maximize cache hits!)
– Need for parallel techniques (maximize CPU/GPU core usage)
[Diagram: Storage, O(K = unbounded) bytes (triangles, points, …) → I/O over limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…) → small working set → projection + visibility + shading, driven by view parameters → Screen, O(N = 1M-100M) pixels at 10-100 Hz]
89. F. Marton– CRS4/Visual Computing, October 2012
THANK YOU!
Questions and Answers
Next Session
Technologies for improving real-time immersive exploration of massive (volumetric) models,
presented by Marco Agus