Building efficient 5G NR base stations with Intel® Xeon® Scalable Processors

Daniel Towner, Senior Wireless Systems Architect
Intel Corp. Network Platform Group

Introduction
These slides cover:
• Building Physical Layer software for Virtual Radio Access Networks (vRAN)
• What 5G means for Physical Layer compute requirements
• Introducing Intel® Xeon® Scalable processors and Intel® Advanced Vector Extensions 512 (Intel® AVX-512)

Why must general purpose processors be good at signal processing?

Virtual Radio Access Networks (vRAN)
vRAN is being used to create a new style of adaptable RAN:
• Efficient scaling of resources
• Pooling reduces the overall compute capacity required
• Improved load balancing
• New types of services
[Figure: Access Network, Radio Access Technology, Core Network]

Custom Systems on a Chip (SoCs) for RAN
The benefits of vRAN come from being able to build the entire radio access network on virtualised general purpose processors.
Previous generation RANs had custom processing (e.g., DSPs, accelerators) in some parts of the system:
• Difficult to scale to different deployment sizes.
• Limited co-location makes pooling hard.
• Custom hardware requires significant expense early in the design cycle.

Network Stack Layers
Network stack, top to bottom:
• Application
• Presentation
• Session
• Transport
• Network (L3)
• Data Link (L2)
• Physical (L1)
L2 and above: high software complexity, low/modest processing requirements. Easily handled by a general purpose processor.
L1: low software complexity, huge data processing requirements. Often requires hardware accelerators or special DSPs.
For vRAN the general purpose processor must be capable of handling signal processing.

The Promise of 5G
5G promises many new features and improvements over previous standards:
• More bandwidth
• Greater capacity
• Lower latency
• Internet of Things
• Ultra reliability
What do these features mean for the Physical Layer signal processing?

The Effect of 5G on the Physical Layer
Capacity
Improve RF utilisation:
• Beamforming
• NOMA
• Massive MIMO
Requires sophisticated floating-point signal processing functions to make better use of spectrum.
Increased bandwidth
• More bands
• mmWave
More data to process at any given time.
Lower latency
Faster turn-around, resulting in the need to compute results faster than previously required.

Case Study: Beamforming
Beamforming is used to improve capacity
Beamforming is a sophisticated algorithm, and requires floating-point to work well. The virtualized general purpose processor must be able to run signal processing in floating-point on large data sets.
We will revisit this later to see how Intel’s processors enable it to be done.
Omni-directional: frequency/time multiplexing to target different UEs.
Beamforming: spatial multiplexing as well as frequency and time multiplexing improves spectral efficiency.
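As a rough illustration of spatial multiplexing, the sketch below steers an 8-element antenna array towards one UE and shows that a UE in another direction receives much less of the signal. The slides do not say which beamforming algorithm is used. The half-wavelength uniform linear array, the matched (conjugate) weights, the angles, and the names `steering_vector` and `beam_gain` are all assumptions made for this example.

```python
import cmath
import math


def steering_vector(num_antennas, angle_deg, spacing_wavelengths=0.5):
    """Phase seen at each element of a uniform linear array for a plane
    wave arriving from angle_deg (assumed geometry, not from the slides)."""
    phase_step = (2.0 * math.pi * spacing_wavelengths
                  * math.sin(math.radians(angle_deg)))
    return [cmath.exp(1j * phase_step * n) for n in range(num_antennas)]


def beam_gain(weights, angle_deg):
    """Magnitude of the weighted sum of the antenna signals for a unit
    plane wave arriving from angle_deg."""
    arrival = steering_vector(len(weights), angle_deg)
    return abs(sum(w.conjugate() * a for w, a in zip(weights, arrival)))


NUM_ANTENNAS = 8

# Matched weights pointing the beam at a UE at +30 degrees,
# normalised so that the gain in the beam direction is 1.
weights = [w / NUM_ANTENNAS for w in steering_vector(NUM_ANTENNAS, 30.0)]

gain_towards_ue = beam_gain(weights, 30.0)   # close to 1.0
gain_elsewhere = beam_gain(weights, -20.0)   # well below 1.0

print(f"gain towards UE at +30 deg: {gain_towards_ue:.3f}")
print(f"gain towards UE at -20 deg: {gain_elsewhere:.3f}")
```

Every output sample is a complex floating-point multiply-accumulate across all antennas, which is why the floating-point throughput of the processor matters here.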

Delivering General Purpose 5G vRAN Compute
We have:
• More complex algorithms
• Processing more data
• With less time to do it!
We have to compete against accelerators and DSPs for Physical Layer processing, and do so in the more demanding world of 5G.
The general purpose processor needs to deliver dramatically better compute capabilities than previously possible.
This is what Intel® Xeon® Scalable processors (formerly code-named Skylake) can deliver.

With Intel® Advanced Vector Extensions 512 (Intel® AVX-512)

Intel® Xeon® Scalable Processors
The Xeon Scalable processor has many improved features over the previous generation Xeon processors:
• Intel® AVX-512: A comprehensive extension to the existing vector instruction set.
• Improvements to the cache hierarchy.
• Improved microarchitecture to deliver more instructions per cycle.
We need to look at some of these in more detail to understand how and why they help with building 5G vRAN.

Compute: 2x Data Throughput compared to previous generation
• Twice as many floating-point units
• Basic instructions are now 512b, instead of 256b
• L1 cache bandwidth has doubled
• L2 bandwidth has doubled
[Figure: core block diagram showing L3 cache, 1MB L2 cache, 32K data cache, 32K instruction cache, two load units, one store unit, two float units, ALU, shuffle, instruction scheduler, and a 32 x 512b register file]
https://www.intel.com/content/www/us/en/architecture-and-technology/avx-512-overview.html

Data: 4x as Much Storage as previous generation
• Non-inclusive caching improves the overall efficiency of the memory hierarchy (i.e., data is not replicated in different cache levels), and gives each core more storage.
• Each register is twice as big and there are twice as many.
• L2 cache up to 1MB from 256KB.
[Figure: core block diagram showing L3 cache, 1MB L2 cache, 32K data cache, 32K instruction cache, two load units, one store unit, two float units, ALU, shuffle, instruction scheduler, and a 32 x 512b register file]
The processor’s working set increases in size, thereby improving efficiency.
5G wireless algorithms tend to fit neatly into the new register sizes, making them more efficient.
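The 4x figure follows from simple arithmetic, sketched below. The register counts are the architectural ones in 64-bit mode (16 registers of 256 bits for Intel® AVX2, 32 registers of 512 bits for Intel® AVX-512), and the L2 sizes are the ones quoted on this slide. The variable and function names are only for illustration.

```python
# Architectural vector registers per core in 64-bit mode.
AVX2_REGISTERS, AVX2_BITS = 16, 256        # ymm0..ymm15
AVX512_REGISTERS, AVX512_BITS = 32, 512    # zmm0..zmm31

# L2 cache sizes per core, as quoted on the slide.
OLD_L2_KB, NEW_L2_KB = 256, 1024


def register_file_bytes(count, bits):
    """Total bytes held by `count` registers of `bits` bits each."""
    return count * bits // 8


avx2_bytes = register_file_bytes(AVX2_REGISTERS, AVX2_BITS)
avx512_bytes = register_file_bytes(AVX512_REGISTERS, AVX512_BITS)

register_ratio = avx512_bytes // avx2_bytes   # twice as big, twice as many
l2_ratio = NEW_L2_KB // OLD_L2_KB

# Number of elements that fit in one 512-bit register.
float32_lanes = AVX512_BITS // 32
int16_lanes = AVX512_BITS // 16

print(f"vector registers: {avx2_bytes} B -> {avx512_bytes} B ({register_ratio}x)")
print(f"L2 cache: {OLD_L2_KB} KB -> {NEW_L2_KB} KB ({l2_ratio}x)")
print(f"lanes per register: {float32_lanes} x float32, {int16_lanes} x int16")
```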

New Instructions: Intel® AVX-512
Many new instructions, ranging from bit manipulation to sophisticated floating-point operations.
Instructions aren’t just wider but do more too. What took several instructions in previous processors can now be done in one instruction.
In combination the compute efficiency has improved. There is 2x as much compute resource, and more than 2x as much processing can be done.
Instruction set support by processor family:
Intel® Xeon® processor families (formerly Haswell and Broadwell): SSE*, AVX, AVX2
Intel® Xeon® processor Scalable family (formerly code-named Skylake-SP): SSE*, AVX, AVX2, AVX512CD, AVX512F, AVX512DQ, AVX512BW, AVX512VL

New Instructions Continued…
Some of the important new instructions:
• Masking – operate on selected SIMD elements only.
• Ternary logic – compute any boolean function of 3 inputs in one instruction.
• Bigger set of conversions possible.
• Extended floating-point operations.
How does this help 5G vRAN?
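The masking and ternary logic items on this slide can be modelled in a few lines of scalar code. The sketch below is a plain Python model of the lane semantics, not real intrinsics: `masked_add` and `ternary_logic` are hypothetical helper names, and the truth-table indexing (first operand as the most significant index bit) reflects our understanding of how the ternary logic instruction is defined.

```python
def masked_add(src, a, b, mask):
    """Merge-masking: lane i becomes a[i] + b[i] when bit i of mask is set,
    otherwise it keeps the old value src[i]."""
    return [x + y if (mask >> i) & 1 else keep
            for i, (keep, x, y) in enumerate(zip(src, a, b))]


def ternary_logic(a, b, c, table, width=32):
    """Bitwise function of three inputs. For each bit position the three
    input bits form an index 0..7 into the 8-bit truth table `table`."""
    result = 0
    for bit in range(width):
        index = ((((a >> bit) & 1) << 2)
                 | (((b >> bit) & 1) << 1)
                 | ((c >> bit) & 1))
        result |= ((table >> index) & 1) << bit
    return result


XOR3 = 0x96     # truth table for a ^ b ^ c
SELECT = 0xCA   # truth table for (a & b) | (~a & c), a bitwise select

# Only lanes 0 and 2 are enabled; lanes 1 and 3 keep their old value (0).
lanes = masked_add([0, 0, 0, 0], [1, 2, 3, 4], [10, 20, 30, 40], 0b0101)

# Two boolean operations collapsed into one table-driven operation.
parity = ternary_logic(0b1111, 0b1010, 0b0110, XOR3, width=4)
picked = ternary_logic(0b1100, 0b1010, 0b0110, SELECT, width=4)

print(lanes, bin(parity), bin(picked))
```

Without these instructions a masked update needs a separate compare and blend, and a three-input boolean function needs two or more logic instructions.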

How Does Intel® AVX-512 Help Beamforming?
Sequential beamforming (one data set at a time):
• Input from multiple data sources: wider loads, gather instructions, multi-input permute.
• Heavy-duty floating point algorithm: more floating point units available.
• Algorithm requires special instructions to handle edge cases: ternary logic, NaN/Inf handling, and so on.
• Output to multiple data destinations: wider stores, scatter instructions, multi-input permute.
Run beamforming on multiple data sets (external data parallelism):
• Number of data sets is governed by the number of SIMD lanes: Intel® AVX-512 provides more lanes.
• Some beams will be marked as invalid and won’t generate a usable answer: mask register switches off individual lanes.
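The data-parallel variant can be pictured as one beam per SIMD lane, with a mask bit per lane saying whether that beam is valid. The sketch below is a scalar Python model of that idea, assuming a simple weighted-sum beamformer and a small lane count for readability. The function name `beamform_lanes`, the data layout (one row per antenna, one column per lane) and the test values are illustrative assumptions, not taken from the slides.

```python
def beamform_lanes(weights, samples, valid_mask, lanes):
    """Weighted sum over antennas, computed for `lanes` beams at once.

    weights[k][lane] and samples[k][lane] hold the values for antenna k.
    Lanes whose bit in valid_mask is clear are skipped and stay at zero,
    which models a mask register switching off individual lanes.
    """
    out = [0j] * lanes
    for w_row, x_row in zip(weights, samples):   # one step per antenna
        for lane in range(lanes):                # one instruction on real SIMD
            if (valid_mask >> lane) & 1:
                out[lane] += w_row[lane] * x_row[lane]
    return out


LANES = 4   # a 512-bit register would hold 16 single-precision values
NAN = float("nan")

# Two antennas, four beams. Beam 2 is invalid and carries NaN weights.
weights = [[1, 1, NAN, 1],
           [1j, 1j, NAN, 1j]]
samples = [[1, 2, 3, 4],
           [1, 1, 1, 1]]

beams = beamform_lanes(weights, samples, valid_mask=0b1011, lanes=LANES)
print(beams)
```

Because the invalid lane is never written, its NaN weights cannot leak into the result, and the valid lanes need no special-case branch.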

How Does Intel® AVX-512 Help Modulation Mapping?
[Figure: digital data streaming in (…1101010100010101) -> Mapper -> radio modulation data streaming out (S0 S1 S2 S3 S4 S5 S6…)]
High-throughput streaming of input bits and output symbols: wider load/store, higher cache bandwidth.
QPSK
QPSK mapping is a direct bit to symbol conversion: mask registers allow direct lookup.
QAMn
QAM mapping is a table lookup of bit groups to symbols (e.g., 0101 -> S5, from a table of entries S0, S1, up to Sn). The large table size requires regions of memory/cache to be set aside to store the table. Throughput is governed by number of loads per cycle from L1, not the raw compute throughput.
Intel® AVX-512 allows the tables to be stored in the register file itself: more and wider registers, multi-input permutes, masked blends.
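The two mapping styles can be sketched in scalar Python as follows. The slides do not give the constellations, so this example assumes the QPSK and 16QAM bit-to-symbol formulas from the 3GPP NR modulation mapper as we understand them. The function names are hypothetical, and the list lookup only models what a permute on a register-resident table would do.

```python
import math


def qpsk_map(bits):
    """Direct conversion: each bit selects the sign of I or Q
    (0 -> +1/sqrt(2), 1 -> -1/sqrt(2)). No table is needed."""
    scale = 1.0 / math.sqrt(2.0)
    return [complex(scale * (1 - 2 * bits[i]), scale * (1 - 2 * bits[i + 1]))
            for i in range(0, len(bits), 2)]


def build_qam16_table():
    """16-entry table indexed by a group of 4 bits (first bit is the MSB).
    Sixteen 32-bit values would fit in a single 512-bit register."""
    scale = 1.0 / math.sqrt(10.0)
    table = []
    for index in range(16):
        b0, b1 = (index >> 3) & 1, (index >> 2) & 1
        b2, b3 = (index >> 1) & 1, index & 1
        i_part = (1 - 2 * b0) * (2 - (1 - 2 * b2))
        q_part = (1 - 2 * b1) * (2 - (1 - 2 * b3))
        table.append(complex(scale * i_part, scale * q_part))
    return table


def qam16_map(bits, table):
    """Table lookup of 4-bit groups, modelling a permute that uses the
    bit group as the index into the table."""
    symbols = []
    for i in range(0, len(bits), 4):
        index = ((bits[i] << 3) | (bits[i + 1] << 2)
                 | (bits[i + 2] << 1) | bits[i + 3])
        symbols.append(table[index])
    return symbols


qam16_table = build_qam16_table()
qpsk_symbols = qpsk_map([0, 0, 1, 1])
qam_symbols = qam16_map([0, 1, 0, 1, 0, 0, 0, 0], qam16_table)
average_power = sum(abs(s) ** 2 for s in qam16_table) / len(qam16_table)

print(qpsk_symbols)
print(qam_symbols)
```

Higher-order constellations have larger tables, which is where the extra registers and multi-input permutes mentioned on the slide come in.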

How Does Intel® AVX-512 Help Polar List Decoding?
[Figure: noisy digital data in -> Polar List Decoder -> error corrected clean digital data out. ListN decoding in SIMD lanes; each small square is an integer value (LLR). Decoding sequence runs left-to-right.]
• Polar operates on blocks of data: 2x compute, wider registers, more registers.
• Each element has one of two different operations performed on it at any given point: mask registers.
• Reorder elements across or within lanes: multi-source permute, wider registers, finer element granularity.
• New instructions can allow faster processing: range instruction (e.g., ‘prior min product’).
• Final ‘LLR’ integer values converted to raw bits: threshold to mask instructions.
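The per-element choice between two operations, and the final threshold step, can be modelled as below. The slides do not name the two operations. This sketch assumes they are the usual min-sum `f` and `g` updates of successive-cancellation polar decoding, and that a negative LLR decodes to bit 1. All function names are hypothetical, and the code is a scalar model rather than real SIMD.

```python
def f_op(a, b):
    """Min-sum check update: product of the signs times the smaller magnitude."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def g_op(a, b, u):
    """Variable update given the partial-sum bit u: b + a if u == 0, b - a if u == 1."""
    return b - a if u else b + a


def masked_update(llr_a, llr_b, partial_sums, g_mask):
    """Lane i applies g when bit i of g_mask is set, otherwise f.
    This models selecting the operation per element with a mask register."""
    return [g_op(a, b, u) if (g_mask >> i) & 1 else f_op(a, b)
            for i, (a, b, u) in enumerate(zip(llr_a, llr_b, partial_sums))]


def llr_to_bit_mask(llrs):
    """Threshold each integer LLR against zero and pack the hard decisions
    into a bit mask (bit i is 1 when llrs[i] is negative)."""
    mask = 0
    for i, llr in enumerate(llrs):
        if llr < 0:
            mask |= 1 << i
    return mask


updated = masked_update([3, -5, 2, -1], [-4, 6, 7, -2], [0, 1, 0, 1], 0b1010)
decided = llr_to_bit_mask(updated)

print(updated, bin(decided))
```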

Intel® AVX-512 for Signal Processing
This presentation has shown three different signal processing kernels:
• Beamforming: heavy duty floating point
• Modulation mapping: high-throughput bit bashing
• Polar decoding: per-element conditional processing
All three benefit from the Intel® AVX-512 instruction set, and the same is true of other important signal processing kernels.
Intel provides an SDK called FlexRAN, a comprehensive set of software building blocks that the RAN ecosystem can use to build virtualized LTE and 5G NR VNFs.

Conclusion
• To realize the benefits of vRAN, general purpose processors need to handle signal processing effectively.
• Intel® Xeon® Scalable Processors deliver many new improvements which allow them to take on the considerable challenge of 5G Physical Layer processing.
• In vRAN one compute platform can take on everything from edge to cloud, being placed where needed.
[Figure: Smart Devices, Radio Access Technology, Access and Edge Network, Core Network, Cloud, NFV/SDN, mmWave/LTE/NB-IoT/WiFi]

Notices and Disclaimers
Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits
referred to as "Spectre" and "Meltdown". Implementation of these updates may make these results inapplicable to your device or system.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations
and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance
tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other
products. For more information go to www.intel.com/benchmarks.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a
particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in
trade.
This document contains information on products, services and/or processes in development. All information provided here is subject to
change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
The products and services described may contain defects or errors known as errata which may cause deviations from published
specifications. Current characterized errata are available on request.
Intel, the Intel logo and Xeon, are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others
© Intel Corporation.
