In this talk, given at the EHSM 2014 event ( http://ehsm.eu ), I explain what an MMU is and how it works. I then explain how I ported NetBSD (and EdgeBSD, a fork of NetBSD) to the open source LM32 CPU, to which I added an MMU.
2. About me
• Yann Sionneau
• Embedded software developer
• Working at Sequans Communications
• M-Labs contributor
• @yannsionneau on twitter
• Email: yann.sionneau@gmail.com
3. I’m going to talk about…
How to run NetBSD and
EdgeBSD on the
Milkymist One
4. Agenda
• I) The hardware part: the MMU
–What an MMU is and how it works
• II) The software part
–How to port NetBSD to a new CPU
12. LatticeMico32 CPU
• 32-bit Harvard architecture RISC
• Big endian
• 6 pipeline stages
• Fully bypassed
• Optional configurable I/D caches
– Direct mapped or
– 2-way set associative
• Wishbone on-chip bus
13. LatticeMico32, good points
• Small
• Portable (works with several FPGA vendors)
• Fast (~100 MHz on Spartan 6)
• Actually works
• GCC/Binutils/GDB/QEMU/uClinux/OpenWrt support
• OPEN SOURCE
18. What’s a pipeline?
• « In computing, a pipeline is a set of
data processing elements connected
in series, where the output of one
element is the input of the next
one. »
-- Pipeline (computing), Wikipedia
19. What’s a pipeline?
[Diagram: IN -> Data processing element 1 -> Data processing element 2 -> Data processing element 3 -> OUT; the OUT of each element is the IN of the next]
32. What’s the MMU’s job?
• Translate « virtual addresses » into « physical
addresses »
• Memory protection against unwanted
execution of code or data write (e.g. software
bug or security issue)
– Management of memory access rights
41. [Diagram: CPU pipeline, TLB, Page Table and Main Memory; a VA goes in, a PA comes out]
VA: Virtual Address
PA: Physical Address
TLB: Translation Lookaside Buffer
42. [Diagram: CPU pipeline, TLB, Page Table and Main Memory, now with the Operating System added; the arrows leaving the Operating System are labelled "Updates the", "Gets information from the" and "Updates the"]
43. Features?
• Page size
–Only 4 kB
32-bit physical address:
xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
How many bits of an address indicate the
offset within a given page?
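With 4 kB pages the answer follows from log2(4096) = 12: the low 12 bits of an address are the offset within the page and the remaining 20 bits are the page number. A minimal C sketch of that split (the macro and function names are illustrative, not taken from the LM32 or NetBSD sources):

```c
#include <stdint.h>

#define PAGE_SIZE   4096u                 /* 4 kB pages, as on the slide */
#define PAGE_SHIFT  12u                   /* log2(4096) */
#define PAGE_MASK   (PAGE_SIZE - 1u)      /* 0x00000FFF */

/* Offset inside the page: the low 12 bits of the address. */
static inline uint32_t page_offset(uint32_t addr)
{
    return addr & PAGE_MASK;
}

/* Page number: the remaining high 20 bits of the address. */
static inline uint32_t page_number(uint32_t addr)
{
    return addr >> PAGE_SHIFT;
}
```

For the address 0xA0001004 used later in the talk, this gives page number 0xA0001 and offset 0x004.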
51. The TLB
Tag [10]   Physical page number [20]   Read-only [1]   Valid [1]
0xABC      0xABC00                     0               0
0x280      0xB0001                     1               1
0x300      0x00001                     0               1

VA = 0xA0001004 = 0xA0001 | 004
  Virtual page number (VPN) = 0xA0001
  Offset in the page = 0x004 = 4
VPN = 0xA0001 = 1010 0000 00 | 00 0000 0001 (binary)
  Tag = upper 10 bits = 1010 0000 00 = 0x280
  TLB index = lower 10 bits = 00 0000 0001 = 1, used to select a TLB line
TLB line 1 (second row, counting from 0) is valid and its tag 0x280 = the VPN tag 0x280: hit
  Physical page number = 0xB0001
Physical Address = 0xB0001 | 004 = 0xB0001004
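The lookup walked through above can be sketched in C. This is only a software model of what the hardware does, assuming the field widths shown in the table (10-bit tag, 10-bit index, 12-bit offset); the struct layout and the name `tlb_lookup` are illustrative assumptions, not the actual LM32 implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12u      /* 4 kB pages */
#define INDEX_BITS   10u      /* 10-bit TLB index */
#define TLB_ENTRIES  1024u    /* 2^10 lines */

struct tlb_entry {
    uint16_t tag;        /* upper 10 bits of the virtual page number */
    uint32_t ppn;        /* 20-bit physical page number */
    bool     read_only;
    bool     valid;
};

/* Translate va. Returns true on a hit and stores the physical address
 * in *pa; returns false on a TLB miss and leaves *pa untouched. */
static bool tlb_lookup(const struct tlb_entry *tlb, uint32_t va, uint32_t *pa)
{
    uint32_t offset = va & ((1u << PAGE_SHIFT) - 1u);
    uint32_t vpn    = va >> PAGE_SHIFT;
    uint32_t index  = vpn & (TLB_ENTRIES - 1u);   /* low 10 bits of the VPN */
    uint32_t tag    = vpn >> INDEX_BITS;          /* high 10 bits of the VPN */
    const struct tlb_entry *e = &tlb[index];

    if (!e->valid || e->tag != tag)
        return false;                             /* TLB miss */

    *pa = (e->ppn << PAGE_SHIFT) | offset;
    return true;
}
```

With line 1 holding tag 0x280 and physical page number 0xB0001, looking up 0xA0001004 hits and yields 0xB0001004, as on the slide.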
61. Porting NetBSD
• 1°) NetBSD cross compilation toolchain
– build.sh
– Makefiles here and there
– Arch-specific directories
This makes it possible to run:
$ ./build.sh -U -m lm32 tools
62. Porting NetBSD
• 2°) Support for built-ins in libkern
– NetBSD kernel is
• Not linked against libgcc
• Linked against libkern
– Need to implement the basic arithmetic functions
that gcc emits calls to in object code
– Implementation in sys/lib/libkern/arch/lm32
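As an illustration of the kind of routine this involves: when the CPU is built without a hardware divider, gcc can emit calls to helpers such as `__udivsi3` for unsigned 32-bit division. The sketch below is a hypothetical stand-in named `soft_udivsi3` (so it cannot clash with the real helper), using plain shift-and-subtract division; it is not the code from sys/lib/libkern/arch/lm32.

```c
#include <stdint.h>

/* Hypothetical stand-in for a helper such as __udivsi3: restoring
 * shift-and-subtract division, producing one quotient bit per iteration. */
uint32_t soft_udivsi3(uint32_t num, uint32_t den)
{
    uint32_t quot = 0;
    uint32_t rem = 0;

    if (den == 0)
        return 0;   /* arbitrary choice for this sketch */

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next bit of the numerator. */
        rem = (rem << 1) | ((num >> bit) & 1u);
        if (rem >= den) {
            rem -= den;
            quot |= 1u << bit;
        }
    }
    return quot;
}
```

The loop uses only shifts, compares and subtractions, which is why a routine of this shape is usable on a core that lacks a divide instruction.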
63. Porting NetBSD
• 3°) Building my first kernel
– Create sys/arch/lm32 and sys/arch/milkymist
– Populate
• sys/arch/<cpu|soc>/include
• sys/arch/<cpu|soc>/conf
– Stub, stub, stub…
This makes it possible to run:
$ ./build.sh -m milkymist -U kernel=GENERIC
66. Porting NetBSD
• 7°) Implement pmap(9)
pmap -- machine-dependent portion of the virtual
memory system
– pmap_bootstrap()
– pmap_init, pmap_create, pmap_destroy …
– SW-managed TLB? -> use sys/uvm/pmap/
– Used by PowerPC Book E and LM32
67. Porting NetBSD
• 8°) Implement copyin/copyout
• 9°) Implement atomic operations
– No atomic instructions -> use a RAS (Restartable Atomic
Sequence) to implement CAS (Compare And Swap)
– Other atomic ops are built around this CAS
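To show how another atomic operation can be built around CAS, here is a minimal sketch of an atomic add as a retry loop. The function names are hypothetical, and `cas_32` below is a plain C stand-in that is not atomic by itself; in the kernel that role is played by the RAS-protected routine.

```c
#include <stdint.h>

/* Stand-in for the CAS primitive: plain C, NOT atomic by itself.
 * It returns the value that was in *val before the attempt. */
static uint32_t cas_32(volatile uint32_t *val, uint32_t old, uint32_t new_val)
{
    uint32_t cur = *val;

    if (cur == old)
        *val = new_val;
    return cur;
}

/* Hypothetical atomic add built on CAS: retry until nobody changed
 * *val between our read and our swap. Returns the new value. */
static uint32_t atomic_add_32_sketch(volatile uint32_t *val, uint32_t delta)
{
    uint32_t old, new_val;

    do {
        old = *val;
        new_val = old + delta;
    } while (cas_32(val, old, new_val) != old);

    return new_val;
}
```

The same loop shape works for and, or, increment and similar operations: compute the new value from a snapshot, then let CAS decide whether the snapshot was still current.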
68. RAS CAS
uint32_t _atomic_cas_32(volatile uint32_t *val, uint32_t old,
uint32_t new);
_atomic_cas_32:
_atomic_cas_ras_start:
lw r4, (r1+0) /* load *val into r4 */
bne r4, r2, 1f /* if *val (r4) != old (r2), skip the store */
sw (r1+0), r3 /* *val = new (r3) */
_atomic_cas_ras_end:
1:
mv r1, r4 /* return the value read from *val */
ret
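What makes the sequence between `_atomic_cas_ras_start` and `_atomic_cas_ras_end` atomic is that an interrupted thread never resumes in the middle of it. A minimal sketch of that check, assuming a hypothetical helper `ras_fixup_pc` that the kernel would call with the saved program counter before resuming a thread:

```c
#include <stdint.h>

/* Hypothetical helper: if the saved PC lies inside [ras_start, ras_end),
 * rewind it to ras_start so that the whole load/compare/store sequence
 * is replayed from the beginning. Otherwise leave the PC alone. */
static uintptr_t ras_fixup_pc(uintptr_t pc, uintptr_t ras_start,
                              uintptr_t ras_end)
{
    if (pc >= ras_start && pc < ras_end)
        return ras_start;
    return pc;
}
```

Replaying is safe because nothing has been written to memory until the final store, and once the store has executed the PC is already at the end label.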
69. Porting NetBSD
• 10°) Add support for interrupts
– Write a function to register interrupt handlers
• 11°) Have a running system clock
– Write cpu_initclocks()
– Write clock irq handler
• Call hardclock()
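A function that registers interrupt handlers can be as simple as a table indexed by IRQ number. The sketch below is an assumption-laden illustration: the names `irq_establish` and `irq_dispatch`, the table layout and the 32-slot size (one per bit of a 32-bit pending mask) are hypothetical, not the actual port's code.

```c
#include <stddef.h>
#include <stdint.h>

#define NIRQ 32   /* assumption: one slot per bit of a 32-bit pending mask */

typedef void (*irq_handler_t)(void *arg);

static struct {
    irq_handler_t func;
    void *arg;
} irq_table[NIRQ];

/* Register a handler. Returns 0 on success, -1 if irq is out of range,
 * func is NULL, or the slot is already taken. */
static int irq_establish(unsigned irq, irq_handler_t func, void *arg)
{
    if (irq >= NIRQ || func == NULL || irq_table[irq].func != NULL)
        return -1;
    irq_table[irq].func = func;
    irq_table[irq].arg = arg;
    return 0;
}

/* Called from the low-level exception entry with the pending-IRQ mask. */
static void irq_dispatch(uint32_t pending)
{
    for (unsigned irq = 0; irq < NIRQ; irq++)
        if ((pending & (1u << irq)) && irq_table[irq].func != NULL)
            irq_table[irq].func(irq_table[irq].arg);
}

/* Example handler: a clock interrupt that only counts ticks; a real
 * clock handler would call hardclock() here. */
static void count_tick(void *arg)
{
    (*(unsigned *)arg)++;
}
```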
70. Other functions to write
• Switch context from one thread to another
– cpu_switchto(9)
• Copy data and abort on page fault
– kcopy(9)
• Save current context
– setfault()
• Low level code to finish up fork() operation
– cpu_lwp_fork(9)
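The interplay between saving a context and aborting a copy on a page fault can be imitated in user space with setjmp/longjmp. This is only an analogy under loud assumptions: a NULL pointer plays the role of an unmapped address, `simulated_page_fault` stands in for the real fault handler, and none of these names come from the port.

```c
#include <errno.h>
#include <setjmp.h>
#include <stddef.h>
#include <string.h>

static jmp_buf fault_ctx;     /* context saved by the "setfault" step */
static int     fault_armed;   /* non-zero while a recovery context is valid */

/* Stand-in for the page-fault handler: if a recovery context is armed,
 * jump back to it instead of treating the fault as fatal. */
static void simulated_page_fault(void)
{
    if (fault_armed)
        longjmp(fault_ctx, 1);
}

/* Like memcpy, but returns EFAULT if a fault occurs during the copy. */
static int kcopy_sketch(const void *src, void *dst, size_t len)
{
    if (setjmp(fault_ctx) != 0) {   /* execution resumes here after a fault */
        fault_armed = 0;
        return EFAULT;
    }
    fault_armed = 1;
    if (src == NULL || dst == NULL)
        simulated_page_fault();     /* pretend the access faulted */
    memcpy(dst, src, len);
    fault_armed = 0;
    return 0;
}
```

In the kernel the fault is raised by the MMU rather than by an explicit check, but the control flow is the same: save a context first, and let the fault path return to it with an error.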
71. Other functions to write
• Block interrupts to protect critical sections
– spl(9)
• Init CPU and print copyright message
– cpu_startup(9)
• Determine the root file system device
– cpu_rootconf(9)
• Etc…
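The usual pattern behind the spl(9) family is "raise the level, remember the old one, restore it afterwards". A minimal model of that pattern, with hypothetical names and a plain variable where a real port would program the CPU's interrupt mask:

```c
/* Current interrupt priority level in this model; interrupts at or
 * below this level are considered blocked. */
static int current_ipl;

/* Raise the level (never lower it) and return the previous level. */
static int splraise_sketch(int new_ipl)
{
    int old = current_ipl;

    if (new_ipl > current_ipl)
        current_ipl = new_ipl;
    return old;
}

/* Restore a level previously returned by splraise_sketch(). */
static void splx_sketch(int saved_ipl)
{
    current_ipl = saved_ipl;
}
```

A critical section is then bracketed as `s = splraise_sketch(level); ... splx_sketch(s);`, which nests correctly because each section restores exactly the level it found.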
72. Porting NetBSD
• To boot user space
– Create dummy ramdisk with /sbin/init
– Build kernel with MFS
– Insert ramdisk with mdsetimage
– Boot it!
74. Thank you!
Sébastien Bourdeauducq, Michael Walle, Robert
Swindells, Stefan Kristiansson, Lars-Peter
Clausen, Pierre Pronchery, Radoslaw Kujawa,
Youri Mouton, Matt Thomas, tech-kern@, M-
Labs mailing list, and many more
Milkymist One: an electronic device for generating artistic video performances at parties and concerts
You can capture live dancers, apply video effects such as rotations, zooms and translations, and project the result onto a screen or a wall
It reacts in real time, synchronized to the audio input, and can be controlled via a MIDI keyboard or DMX (the protocol used to control stage lighting and effects)
FPGA: an array of configurable logic blocks, linked together by a programmable switch matrix
Previously I said that accessing main memory is slow.
Here the MMU accesses the page table (PT, in RAM) for every translation; aren't we slowing our CPU down?
TLB: a clever name for « cache of virtual page -> physical page translations »
The first time you translate a page -> go to the PT in RAM
The next time you translate the same page -> TLB hit in 1 cycle
In LM32, as in MIPS or PowerPC Book E, the MMU does not read the page table itself to refill the TLB (no hardware page table walker).
Instead, the MMU raises an exception and lets the OS update the TLB.
The TLB is entirely managed by software.
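A software-managed refill can be sketched as follows: on a TLB miss exception the OS looks the page up in its own page table and installs the translation in the TLB. The flat 16-entry page table, the struct layouts and the name `tlb_miss_handler` are assumptions made for illustration; a real pmap uses a more elaborate structure.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12u      /* 4 kB pages */
#define TLB_ENTRIES  1024u
#define PT_PAGES     16u      /* tiny flat page table, illustration only */

struct tlb_entry { uint32_t tag; uint32_t ppn; bool valid; };
struct pte       { uint32_t ppn; bool valid; };

static struct tlb_entry tlb[TLB_ENTRIES];
static struct pte       page_table[PT_PAGES];   /* indexed by VPN */

/* TLB-miss exception handler: find the page in the OS page table and
 * install the translation. Returns false if the page is not mapped,
 * in which case a real kernel would go on to handle a page fault. */
static bool tlb_miss_handler(uint32_t fault_va)
{
    uint32_t vpn = fault_va >> PAGE_SHIFT;

    if (vpn >= PT_PAGES || !page_table[vpn].valid)
        return false;

    struct tlb_entry *e = &tlb[vpn % TLB_ENTRIES];  /* line chosen by index */
    e->tag   = vpn / TLB_ENTRIES;                   /* remaining VPN bits */
    e->ppn   = page_table[vpn].ppn;
    e->valid = true;
    return true;
}
```

After the handler returns, the faulting instruction is retried and this time hits in the TLB.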
kcopy: copy data like memcpy, aborts on page fault
setfault: saves the current context so that it can be restored later if we take a page fault
cpu_lwp_fork() is the machine-dependent portion of fork1() which finishes a fork operation
cpu_startup: init cpu, print copyright message
spl: raise and lower the interrupt priority level used by kernel code to block interrupts in critical sections
cpu_rootconf: determine the root file system device
Thank you for attending, and thanks to all those who helped with this work