Presentation about two emerging standards activities that I started and led in MPEG: point cloud compression (PCC), a new image and video format, and Network Based Media Processing (NBMP) for media delivery in 5G networks. Presented at Philips R&D in Eindhoven, the Netherlands.
1. A summary of two emerging next generation media standards: MPEG
Point Cloud Compression & Network Based Media Processing
Rufael Mekuria, Dirk Griffioen,
Philips Research, 20 March 2018
2. Unified Streaming
Pioneer in HTTP Streaming (e.g. mod H.264)
First to support MPEG DASH Streaming in 2011
Software for video streaming workflows
DRM, Packaging, Content Stitching, live video
Embedded in cloud, Telco and CDN environments
Standards: DASH-IF, MPEG, 3GPP, DVB
3. About the speaker
MSc (EE) Delft 2011, PhD VU Amsterdam 2017
TNO (2011), CWI (2011-2016), Unified (2016-date)
MPEG AhG Chair PCC, Jan 2014 – July 2017
MPEG NBMP co-Chair, July 2017 to date
5. Real and Virtual Engagement in Realistic Immersive Environments
Point Cloud Compression/transmission:
Immersive Communications (2014) in Reverie FP7
Highly realistic representation for immersive communications reveriefp7.eu
Human is reconstructed as a photo realistic 3D Cloud (or mesh) of Points in a 3D space!
Challenges: Low bit rate, real-time encoding, color coding, inter frame coding, scalability
7. A collection of points
Not related to each other
Typically no order
Typically no local topology (no mesh!)
Each point is given by
a position (X,Y,Z)
a color (R,G,B) or (Y,U,V)
possibly other attributes such as transparency, time of acquisition, etc.
Point Cloud
Compression
Point cloud content from Microsoft
research laboratory Donated to MPEG
Point Cloud Format
8. .ply files = an example raw data format for point clouds
(compare to YUV for video coding)
How many points?
Static case up to several tens of millions,
depending on the application
Dynamic case ~1 million per frame, 30 fps
Probably more is needed for good VR
Format
Geometry XYZ
Fixed precision for VR applications
Float still often used
Colors RGB
As usual, integer 8/10 bits.
Possibly other attributes
(not present in the ply file here on the left)
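The point counts above translate into a raw data rate with simple arithmetic. A minimal sketch, assuming 32-bit floats for geometry and 8-bit integers for color (both assumptions, chosen to match the "float still often used" and "integer 8/10 bits" remarks):

```python
# Back-of-the-envelope raw bitrate for a dynamic point cloud.
POINTS_PER_FRAME = 1_000_000   # "~1 million per frame" from the slide
FRAMES_PER_SECOND = 30         # "30 fps" from the slide
BYTES_GEOMETRY = 3 * 4         # assumption: x, y, z stored as 32-bit floats
BYTES_COLOR = 3 * 1            # assumption: R, G, B stored as 8-bit integers


def raw_bitrate_bits_per_second(points, fps, bytes_per_point):
    """Uncompressed bitrate in bits per second."""
    return points * fps * bytes_per_point * 8


bytes_per_point = BYTES_GEOMETRY + BYTES_COLOR
bitrate = raw_bitrate_bits_per_second(POINTS_PER_FRAME, FRAMES_PER_SECOND, bytes_per_point)
print(f"{bytes_per_point} bytes/point, {bitrate / 1e9:.1f} Gbit/s uncompressed")
```

Under these assumptions the uncompressed stream is in the Gbit/s range, which is why low bit rate coding is listed among the challenges.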
ply
format ascii 1.0
element vertex 764940
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
211 63 63 127 98 73
213 63 61 134 109 87
212 62 63 122 97 75
212 63 62 129 102 79
212 63 63 124 98 76
213 62 63 122 98 76
213 63 62 128 104 81
213 63 63 124 99 78
215 61 63 120 97 76
214 63 60 141 117 95
214 63 61 135 111 89
215 63 60 144 120 97
215 63 61 133 109 87
214 62 62 126 102 80
214 62 63 122 98 77
214 63 62 128 104 82
etc.
one point X Y Z R G B
no order! Swapping points
does not change the data
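The file layout above can be read with a few lines of code. This is a minimal sketch, not a full PLY reader: it assumes an ASCII file with a single vertex element, and the sample is truncated to three points for illustration. The function name `parse_ascii_ply` is hypothetical.

```python
def parse_ascii_ply(text):
    """Parse a minimal ASCII PLY with a single 'vertex' element.

    Returns (property_names, points), where each point is a tuple of floats.
    Only the header layout shown on the slide is handled.
    """
    lines = text.strip().splitlines()
    if lines[0].strip() != "ply":
        raise ValueError("not a PLY file")
    names, count, body_start = [], 0, None
    for i, line in enumerate(lines[1:], start=1):
        tokens = line.split()
        if tokens[:2] == ["element", "vertex"]:
            count = int(tokens[2])
        elif tokens and tokens[0] == "property":
            names.append(tokens[-1])
        elif tokens == ["end_header"]:
            body_start = i + 1
            break
    if body_start is None:
        raise ValueError("missing end_header")
    points = [tuple(float(v) for v in row.split())
              for row in lines[body_start:body_start + count]]
    return names, points


SAMPLE = """ply
format ascii 1.0
element vertex 3
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
211 63 63 127 98 73
213 63 61 134 109 87
212 62 63 122 97 75
"""

names, points = parse_ascii_ply(SAMPLE)
# Order carries no meaning: the same points in a different order are the same cloud.
shuffled = [points[2], points[0], points[1]]
same_cloud = sorted(points) == sorted(shuffled)
print(names, len(points), same_cloud)
```

Comparing the sorted lists is one simple way to express that swapping points does not change the data.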
9. An application of point cloud: free-view point (6DoF) for sport
Scene model
360°/omnidirectional background
reshaping depending on viewpoint
3D object
occlusion, parallax (in HMD)
position relative to the background
Free-view path
viewer body position freely chosen on the free-view path
+ free head movement (in HMD)
Figure labels: 360° background, 3D objects, free-view path
https://www.youtube.com/watch?v=Q-LNA9KlHhw
12. How to render point clouds?
Giving size to points
Splats, rectangles, cubes (=3D pixels)
Trade-off size vs. texture high frequency
Meshing, and using illumination techniques
A demo using PCC contents (and renderer)
8i content
Technicolor based rendering
14. MPEG Use Case immersive mixed reality tele-presence
Pipeline blocks, capture to display: multi depth camera capture (or other 3D capture); N RGB + depth images (or other sensor data); 3D reconstruction software; reconstructed 3D human; 3D point cloud representation; 3D source encoding; packetization & transmission; IP network; reception & synchronization; 3D source decoding; real-time 3D rendering; composition in virtual world
Figure 1: Transmission pipeline for conferencing with 3D geometry
15. MPEG Use Case : High Quality broadcast with parallax
16. MPEG Use Case : Cultural Heritage
Figure: Cultural heritage in the 3D Cloud project
Key requirements for this application are:
Progressive coding to enable increasing quality.
Color attribute coding is needed, preferably 8-12 bits per component
Generic attributes coding such as for material properties.
Lossless is important to enable the best representation
when possible
Typical point clouds in this use case have the following
characteristics:
1 million up to billions of points (e.g. [13])
Color attributes of 8-12 bits per color component
Can contain multiple clusters/groups of points
17. Table 2 – MOS on the perceived quality
Columns: Point cloud, Images
Point Cloud Compression in MPEG VR Ecosystem (1)
18. MPEG Work
Datasets
Quality Metric, evaluation methodology (BD-Rate) (2015)
Anchor Codec developed (2015)
Subjective Metric, Rendering (2016)
Large Industry consensus on methodology (2016)
Call for Proposals, first version (2017)
19. Data sets cat 1
Red and black
Egyptian Mask
House without a
roof
Frog
Facade
Loot
21. Data sets cat 3
Contributed by automotive companies
such as Mitsubishi (this example) and Ford
22. Quality Metric (also implemented open source)
Decoded cloud
Original cloud
Symmetric distances:
the max of D to O and O to D
PSNR using scalar based on peak value (based on the size of the bounding cube)
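$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{3p^2}{\mathrm{MSE}}\right)$$
$$d_{rms}(V_{or}, V_{deg}) = \sqrt{\frac{1}{K} \sum_{v_l \in V_{or}} \lVert v_l - v_{nn\_deg} \rVert_2^2} \tag{4}$$
$$d_{sym\_rms}(V_{or}, V_{deg}) = \max\left(d_{rms}(V_{or}, V_{deg}),\; d_{rms}(V_{deg}, V_{or})\right) \tag{5}$$
$$d_{haus}(V_{or}, V_{deg}) = \max_{v_l \in V_{or}} \lVert v_l - v_{nn\_deg} \rVert_2 \tag{6}$$
$$d_{sym\_haus}(V_{or}, V_{deg}) = \max\left(d_{haus}(V_{or}, V_{deg}),\; d_{haus}(V_{deg}, V_{or})\right) \tag{7}$$
$$psnr_{geom} = 10 \log_{10}\left(\frac{\lVert \max_{x,y,z}(V_{deg}) \rVert_2^2}{\left(d_{sym\_rms}(V_{or}, V_{deg})\right)^2}\right) \tag{8}$$
$$d_{y}(V_{or}, V_{deg}) = \sqrt{\frac{1}{K} \sum_{v_l \in V_{or}} \lVert y(v_l) - y(v_{nn\_deg}) \rVert_2^2} \tag{9}$$
$$psnr_{y} = 10 \log_{10}\left(\frac{\lVert \max_{y}(V_{deg}) \rVert_2^2}{\left(d_{y}(V_{or}, V_{deg})\right)^2}\right) \tag{10}$$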
Tool for computing metric available in MPEG, and as open source with the anchor codec:
https://github.com/RufaelDev/pcc-mp3dg (old version)
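The geometry distances and geometry PSNR defined on this slide can be sketched in a few lines. This is an illustrative sketch only, not the MPEG tool: it uses a brute-force nearest neighbour search, takes K as the number of points in the first cloud, and takes the peak as the norm of the per-axis maxima of the decoded cloud. The function names are hypothetical.

```python
import math


def _nn_sq_dist(p, cloud):
    """Squared Euclidean distance from p to its nearest neighbour in cloud (brute force)."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def d_rms(a, b):
    """Root mean square nearest-neighbour distance from cloud a to cloud b."""
    return math.sqrt(sum(_nn_sq_dist(p, b) for p in a) / len(a))


def d_haus(a, b):
    """One-sided Hausdorff distance: the largest nearest-neighbour distance from a to b."""
    return math.sqrt(max(_nn_sq_dist(p, b) for p in a))


def d_sym(dist, original, degraded):
    """Symmetric distance: the max of the two one-sided distances."""
    return max(dist(original, degraded), dist(degraded, original))


def psnr_geom(original, degraded):
    """Geometry PSNR in dB; peak is the squared norm of the per-axis maxima of the degraded cloud."""
    peak_sq = sum(max(p[i] for p in degraded) ** 2 for i in range(3))
    return 10 * math.log10(peak_sq / d_sym(d_rms, original, degraded) ** 2)


original = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
degraded = [(0.0, 0.0, 1.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
print(d_sym(d_rms, original, degraded), d_sym(d_haus, original, degraded))
print(round(psnr_geom(original, degraded), 2))
```

For clouds of realistic size the brute-force search would be replaced by a spatial index such as a k-d tree; the definitions stay the same.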
23. Benchmark anchor
Extension of PCL codec, available in open source, used reference technologies:
Design and implementation
R. Mekuria, K. Blom and P. Cesar, "Design, Implementation, and Evaluation of a Point Cloud Codec for Tele-Immersive Video,"
in IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 4, pp. 828-842, April 2017.
doi: 10.1109/TCSVT.2016.2543039
24. Subjective Evaluation
Rendering on unknown camera path
Raw HD video
Evaluation on large screen using video based methodology
Rate points corresponding to the anchor and selected bit-rates
Paper on the assessment methodology available:
R. Mekuria, S. Laserre and C. Tulvan, "Performance assessment of
point cloud compression," 2017 IEEE Visual Communications and
Image Processing (VCIP), St. Petersburg, FL, 2017, pp. 1-4.
doi: 10.1109/VCIP.2017.8305132
25. Call for Proposals
- 13 responses including all major mobile device manufacturers
(Nokia, Huawei, Samsung, Apple and Sony), but also
Technicolor, Owlii, 8i and Unified Streaming
- 3 Test Models, will become 1 or 2 hopefully
- Octree based coding for static and mobile mapping point
clouds
- Mapping to image grid for dynamic point clouds (holograms)
and further compression using HEVC
- Timeline, FDIS expected late 2019 or 2020
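To make the octree bullet concrete, here is a generic sketch of octree occupancy coding. It is not the MPEG test model: it assumes integer coordinates in a cube of side 2**depth, writes one occupancy byte per internal node in breadth-first order, and applies no entropy coding. The function names are hypothetical.

```python
def encode_octree(points, depth):
    """Serialise integer points in [0, 2**depth)^3 as breadth-first occupancy bytes.

    Each internal node emits one byte; bit k is set when child octant k holds
    at least one point. Duplicate points collapse into a single leaf.
    """
    out = bytearray()
    level = [((0, 0, 0), list(set(points)))]      # (node origin, points inside)
    for d in range(depth):
        half = 1 << (depth - d - 1)               # child cell size at this level
        next_level = []
        for (ox, oy, oz), pts in level:
            children = {}
            for x, y, z in pts:
                k = ((x >= ox + half) << 2) | ((y >= oy + half) << 1) | (z >= oz + half)
                children.setdefault(k, []).append((x, y, z))
            out.append(sum(1 << k for k in children))
            for k in sorted(children):
                origin = (ox + half * (k >> 2 & 1),
                          oy + half * (k >> 1 & 1),
                          oz + half * (k & 1))
                next_level.append((origin, children[k]))
        level = next_level
    return bytes(out)


def decode_octree(data, depth):
    """Inverse of encode_octree: rebuild the set of occupied leaf positions."""
    level, pos = [(0, 0, 0)], 0
    for d in range(depth):
        half = 1 << (depth - d - 1)
        next_level = []
        for ox, oy, oz in level:
            byte = data[pos]
            pos += 1
            for k in range(8):
                if byte >> k & 1:
                    next_level.append((ox + half * (k >> 2 & 1),
                                       oy + half * (k >> 1 & 1),
                                       oz + half * (k & 1)))
        level = next_level
    return set(level)


points = [(0, 0, 0), (1, 0, 0), (7, 7, 7), (7, 6, 7)]
encoded = encode_octree(points, depth=3)
print(len(encoded), decode_octree(encoded, depth=3) == set(points))
```

Because only occupied nodes are subdivided, sparse clouds need far fewer bytes than a full voxel grid; a real codec would additionally entropy code the occupancy bytes.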
26. Acknowledgement
- Thanks to 8i for the datasets and Technicolor for the renderer
- Thanks to all companies for participating in the MPEG Call for
Proposals
- Thanks to CWI and Reverie FP7 for the support
- Thanks to Unified Streaming and H2020 for continuing
support for this work
27. MPEG NBMP
Motivation: cloud native edge converged networking with
mixed types of hardware
NBMP Overview
Reference Architecture
Media ingest example use case
Expected Timeline
28. cloud native edge converged networking
with mixed types of hardware
Diagram labels: user equipment, access network (4G LTE, LTE Broadcast, 5G wireless, Cloud-RAN), aggregation network, edge cloud, MEC cloud, ETSI MEC orchestration, radio network information API, regional cloud, core network, virtualized network, scalable data center cloud infrastructure
29. Superfluidity Cloud converged 5G: 4 I’s (I for independence)
Location independence (multiple data centers, fine grained allocation of
the location)
Hardware independence (run on commodity hardware x64, x86, ARM,
FPGA, GPU, dedicated hardware via API)
Time independence (fast instantiation and teardown of services
components)
Scale independence (scale from one to many users instantaneously)
Joint work with Nokia, Intel, RedHat, Citrix, BT, Telefonica, OnApp etc.
30. Media Distribution in 5G Superfluidity
Distributed processing (DRM, on-the-fly conversion of file format,
content stitching), also in the mobile edge
May run on different hardware (USP on Raspberry Pi, set-top box)
Fast near instant spinning up of media services like TV stations
Huge scalability from 1 to millions of users by scaling OpenStack
instances automatically based on telemetry and machine learning
http://www.unified-streaming.com/blog/5g-superfluidity-and-future-streaming-video
http://www.unified-streaming.com/blog/superfluid-5g-mobile-network-architecture
31. MPEG NBMP
Leverage edge converged 5G architecture for media
API for media processing in the cloud/network
Data exchange formats for improved media processing
(e.g. guided transcoding, content stitching)
Descriptors for media processing workflows, chaining of
media processing functions
NBMP “framework” for MPEG-I
Aligned with 5G media distribution activity in 3GPP
Re-use existing MPEG standards
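The idea of a workflow descriptor that chains media processing functions can be illustrated with a toy model. Everything in this sketch is hypothetical: the `Workflow` class, the function names and the media descriptor are invented for illustration and do not reflect any NBMP syntax, which was still in its exploration phase at the time of this talk.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Workflow:
    """Hypothetical workflow description: an ordered chain of named functions."""
    name: str
    steps: List[str] = field(default_factory=list)


# Hypothetical function registry; real processing would operate on media segments.
FUNCTIONS: Dict[str, Callable[[dict], dict]] = {
    "stitch": lambda m: {**m, "clips": ["preroll"] + m["clips"]},
    "package_dash": lambda m: {**m, "format": "DASH"},
}


def run(workflow: Workflow, media: dict) -> dict:
    """Apply each step of the workflow to the media descriptor, in order."""
    for step in workflow.steps:
        media = FUNCTIONS[step](media)
    return media


result = run(Workflow("vod", ["stitch", "package_dash"]),
             {"clips": ["main"], "format": "CMAF"})
print(result)
```

Keeping the description (which functions, in which order) separate from the functions themselves is what would allow a network or cloud operator to place each step on different hardware or at a different location.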
35. Media Processing Entity
Diagram: media source (e.g. camera, PC, storage, live encoder) → NBMP format → Media Processing Entity (control functions, processing functions) → publish format (e.g. CMAF, DASH, MPU, HLS, MPEG-2 TS) → media sink (e.g. player)
NBMP format: media resources, supplementary information, workflow (instruction) description
36. Example: Unified Origin on Edge
Origin
Content
One format in the network
Reduce cache size by storing one format
All formats are created on request only
Geo-specific rules can be applied
Cloud Edge
37. Example prototypes related NBMP
- Papers:
- Rufael Mekuria, Jelte Fennema, and Dirk Griffioen. 2016. Multi-Protocol Video
Delivery with Late Trans-Muxing. In Proceedings of the 2016 ACM on Multimedia
Conference (MM '16). ACM, New York, NY, USA, 92-96. DOI:
https://doi.org/10.1145/2964284.2967189
- Open implementation: https://github.com/unifiedstreaming/ltm-edge
- Arjen Wagenaar, Dirk Griffioen, and Rufael Mekuria. 2017. Unified Remix: a
Server Side Solution for Adaptive Bit-Rate Streaming with Inserted and Edited
Media Content. In Proceedings of the 8th ACM on Multimedia Systems
Conference (MMSys'17)
- Open implementation: https://github.com/unifiedremix/remix
38. NBMP Summary
- Cloud/Telco interfaces for distributed media processing
- Wide range of use cases key for immersive media delivery, adopted for
MPEG-I
- Exploration phase
- Unified Streaming chairs with Samsung/SK Telecom
- CfP Expected in April 2018
- Responses Expected in July 2018
- Delay could happen
- Unified Streaming working mainly on media ingest part (not discussed in
this presentation)
- Unified Open to collaborate with Philips on a use case if this fits with some
Philips Technologies
For comparison purposes, we set the SC3DMC coders to code with 8 bits and the differential coding options. We compared the original captured meshes to the decoded meshes using a tool that measures the symmetric distance between the surfaces. This metric is often known as the Hausdorff distance. Results on a set of live captured models, compressed and decoded, show that the qualities are comparable. Note that a lower value implies better quality.