Porting NetBSD on the open source LatticeMico32 CPU
Yann Sionneau
M-Labs
@ EHSM 2014
About me
• Yann Sionneau
• Embedded software developer
• Working at Sequans Communications
• M-Labs contributor
• @yannsionneau on twitter
• Email: yann.sionneau@gmail.com
I’m going to talk about…
How to run NetBSD and EdgeBSD on the Milkymist One
Agenda
• I) The hardware part: the MMU
–What an MMU is and how it works
• II) The software part
–How to port NetBSD to a new CPU
Milkymist One?!
The Milkymist One uses an FPGA
What’s an FPGA??
• A chip
FPGA internals
Milkymist System-on-Chip
LatticeMico32 CPU
• 32-bit Harvard architecture RISC
• Big Endian
• 6 stages
• Fully bypassed
• Optional configurable I/D caches
– Direct mapped or
– 2-way set associative
• Wishbone on-chip bus
LatticeMico32, Good points
• Small
• Portable (works with several FPGA vendors)
• Fast (~100 MHz on Spartan-6)
• Actually works
• GCC, Binutils, GDB, QEMU, uClinux and OpenWrt support
• OPEN SOURCE
LatticeMico32, Bad points
• No Memory Management Unit… yet!
– Done!
Used in…
• Closed source commercial ASICs
• Open source projects
• Can reach 800 MHz in a TSMC 90 nm standard-cell process
LatticeMico32 pipeline
What’s a pipeline?
• « In computing, a pipeline is a set of
data processing elements connected
in series, where the output of one
element is the input of the next
one. »
-- Pipeline (computing), Wikipedia
What’s a pipeline?
[Figure: three data processing elements connected in series; the OUT of each element feeds the IN of the next one]
What’s a pipeline?
$ cat .bash_history | grep 'cat' | wc -l
6
What’s a CPU pipeline?
Pipelined instruction execution
Stages: A (Address), F (Fetch), D (Decode), X (eXecute), M (Memory), W (Writeback)
Clock cycle   1  2  3  4  5  6  7
Instr. 1      A  F  D  X  M  W
Instr. 2         A  F  D  X  M  W
Instr. 3            A  F  D  X  M
Instr. 4               A  F  D  X
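The staircase pattern above follows one rule: if one instruction enters the pipeline per clock cycle and nothing stalls, instruction i is in stage number (cycle - i). The sketch below is only an illustration of that rule for an ideal, stall-free 6-stage pipeline; the function and constant names are made up here and do not come from the LatticeMico32 sources.

```c
/* Stage letters as used on the slides, in pipeline order. */
static const char LM32_STAGES[] = "AFDXMW";
enum { LM32_NUM_STAGES = 6 };

/*
 * Stage occupied by instruction `instr` (1-based) during clock cycle
 * `cycle` (1-based), assuming one instruction issues per cycle and the
 * pipeline never stalls. Returns ' ' when the instruction is not in
 * the pipeline during that cycle.
 */
char stage_at(unsigned instr, unsigned cycle)
{
    if (instr == 0 || cycle < instr)
        return ' ';                     /* not issued yet */
    unsigned s = cycle - instr;         /* 0 = first stage (A) */
    return s < LM32_NUM_STAGES ? LM32_STAGES[s] : ' ';
}

/* Cycle in which the last of `n` back-to-back instructions leaves W. */
unsigned cycles_to_finish(unsigned n)
{
    return n == 0 ? 0 : LM32_NUM_STAGES + n - 1;
}
```

Under these assumptions, four instructions need 6 + 4 - 1 = 9 cycles in total, against 24 cycles if each instruction had to finish before the next one started.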
Before
[Figure: the CPU internals and main memory exchange physical addresses (PA) only]
After
[Figure: the CPU internals use virtual addresses and main memory sees physical addresses; the translation in between can raise an exception]
What’s the MMU’s job?
• Translate « virtual addresses » into « physical addresses »
• Protect memory against unwanted code execution or data writes (e.g. caused by a software bug or a security issue)
– Management of memory access rights
[Figure: the CPU pipeline issues a VA, which becomes a PA before reaching main memory]
VA : Virtual Address
PA : Physical Address
How does the MMU know the VA->PA translation?
– From the page table
Why « Page »?
One mapping per address:
• 0x00000004 -> 0x10000000
• 0x00000005 -> 0x10000001
• 0x00000006 -> 0x10000002
Etc…
This is WRONG!!!
One mapping per page instead (the low three hex digits pass through unchanged):
• 0x00000*** -> 0x10000***
• 0x00001*** -> 0x10001***
• 0x00002*** -> 0x10002***
Etc…
[Figure: the same diagram with a TLB added next to the page table, and the operating system linked to both by arrows labelled « Updates the », « Gets information from the » and « Updates the »]
TLB : Translation Lookaside Buffer
Features?
• Page size
–Only 4 kB
32-bit physical address:
xxxxxxxx xxxxxxxx xxxx | xxxx xxxxxxxx
How many bits of an address indicate the offset within a given page?
– Page number: bits [31:12], 20 bits
– Offset: bits [11:0], 12 bits
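The split above can be sketched in C as a shift and a mask. This is only an illustration of the bit layout for 4 kB pages; the macro and function names are chosen for this example and are not taken from the NetBSD or LatticeMico32 sources.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                                  /* 4 kB = 2^12 bytes */
#define PAGE_MASK  ((UINT32_C(1) << PAGE_SHIFT) - 1u)   /* 0x00000fff */

/* Bits [31:12]: which page the address falls in. */
uint32_t page_number(uint32_t addr)
{
    return addr >> PAGE_SHIFT;
}

/* Bits [11:0]: byte offset inside that page. */
uint32_t page_offset(uint32_t addr)
{
    return addr & PAGE_MASK;
}

/* Rebuild an address from a page number and an offset. */
uint32_t make_addr(uint32_t page, uint32_t offset)
{
    return (page << PAGE_SHIFT) | (offset & PAGE_MASK);
}
```

Translation only ever replaces the page number; the offset is copied through untouched, which is why one table entry can cover 4096 addresses.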
Features?
• 2 TLBs (Translation Lookaside Buffers)
–ITLB (instructions)
–DTLB (data)
• Each TLB contains 1024 entries
–How many bits are needed to index the TLB? 10 bits! (2^10 = 1024)
Features?
• No hardware page-tree walker
– i.e. TLB is software assisted
[Figure: the TLB as a black box. Inputs: virtual address; load or store?; instruction or data? Outputs: physical address; access granted/denied; or « I don’t know! », i.e. a TLB miss]
Let’s have a look inside
The TLB
Line  Tag [10]  Physical page number [20]  Read-only [1]  Valid [1]
0     0xABC     0xABC00                    0              0
1     0x280     0xB0001                    1              1
2     0x300     0x00001                    0              1

VA = 0xA0001004
Virtual page number (VPN) = 0xA0001, page offset = 0x004
VPN = 0xA0001 -> 1010 0000 00 | 00 0000 0001
TLB index (low 10 bits of the VPN, used to select a TLB line) = 1
Tag (high 10 bits of the VPN) = 1010 0000 00 -> 0x280, equal to the tag stored in line 1
Physical page number = 0xB0001
Physical Address = 0xB0001004
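The walk-through above can be sketched as a direct-mapped lookup in C. This is a software model written for illustration, assuming the field widths shown on the slides (10-bit tag, 20-bit physical page number, read-only and valid bits); the type and function names and the three result codes are assumptions of this sketch, not the actual LatticeMico32 MMU interface.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 1024u   /* per the slides: 10 index bits */

struct tlb_entry {
    uint16_t tag;        /* upper 10 bits of the virtual page number */
    uint32_t ppn;        /* 20-bit physical page number */
    bool     read_only;
    bool     valid;
};

/* Hypothetical result codes for this model. */
enum tlb_result { TLB_HIT, TLB_MISS, TLB_WRITE_FAULT };

/*
 * Direct-mapped lookup: VA[21:12] selects the line, VA[31:22] must
 * match the stored tag. There is no hardware page-table walker, so on
 * a miss software (the OS) has to refill the entry.
 */
enum tlb_result tlb_translate(const struct tlb_entry tlb[TLB_ENTRIES],
                              uint32_t va, bool is_store, uint32_t *pa)
{
    uint32_t vpn   = va >> 12;
    uint32_t index = vpn & (TLB_ENTRIES - 1u);
    uint16_t tag   = (uint16_t)(vpn >> 10);
    const struct tlb_entry *e = &tlb[index];

    if (!e->valid || e->tag != tag)
        return TLB_MISS;
    if (is_store && e->read_only)
        return TLB_WRITE_FAULT;

    *pa = (e->ppn << 12) | (va & 0xFFFu);   /* offset passes through */
    return TLB_HIT;
}
```

With line 1 filled in as on the slide (tag 0x280, physical page 0xB0001, read-only, valid), a load from 0xA0001004 hits and yields 0xB0001004, while a store to the same address is refused in this model because the entry is read-only.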
Porting NetBSD
• 1°) NetBSD cross compilation toolchain
– build.sh
– Makefiles here and there
– Arch-specific directories
This makes it possible to run:
$ ./build.sh -U -m lm32 tools
Porting NetBSD
• 2°) Support for built-ins in libkern
– NetBSD kernel is
• Not linked against libgcc
• Linked against libkern
– Need to implement the basic arithmetic helper functions that gcc emits calls to in object code
– Implementation in sys/lib/libkern/arch/lm32
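As an illustration of the kind of helper meant here: when a target has no hardware divider, GCC typically emits calls to routines such as `__udivsi3` for unsigned 32-bit division. The slides do not say which helpers the lm32 port needed, so the function below is an assumption, written as a generic shift-and-subtract division and deliberately named `sketch_udivsi3` so it cannot be mistaken for the code in sys/lib/libkern/arch/lm32.

```c
#include <stdint.h>

/*
 * Unsigned 32-bit division by shift-and-subtract (restoring division).
 * Division by zero returns 0 here; that choice is arbitrary for this
 * sketch and a real implementation may behave differently.
 */
uint32_t sketch_udivsi3(uint32_t n, uint32_t d)
{
    uint32_t q = 0;   /* quotient, built one bit at a time */
    uint32_t r = 0;   /* running remainder, always < d after each step */

    if (d == 0)
        return 0;

    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u);   /* bring down the next bit of n */
        if (r >= d) {
            r -= d;
            q |= UINT32_C(1) << i;
        }
    }
    return q;
}
```

The loop uses only shifts, compares and subtractions, which is what makes this style of routine usable on a CPU configured without a divide unit.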
Porting NetBSD
• 3°) Building my first kernel
– Create sys/arch/lm32 and sys/arch/milkymist
– Populate
• sys/arch/<cpu|soc>/include
• sys/arch/<cpu|soc>/conf
– Stub, stub, stub…
This makes it possible to run:
$ ./build.sh -m milkymist -U kernel=GENERIC
Porting NetBSD
• 4°) Write basic console driver for early prints
struct consdev milkymist_com_cons = {
[…]
milkymist_com_cngetc, /* cn_getc: kernel getchar interface */
milkymist_com_cnputc, /* cn_putc: kernel putchar interface */
[…]
};
Porting NetBSD
• 5°) Implement exception handlers
• 6°) Call milkymist_startup() C code
– Initialize console driver
• -> consinit() -> milkymist_uart_cnattach()
• cn_tab = &milkymist_com_cons;
– Initialize virtual memory subsystem
• Call MD pmap_bootstrap()
– Let the kernel boot
• Call NetBSD MI main()
Porting NetBSD
• 7°) Implement pmap(9)
pmap -- machine-dependent portion of the virtual memory system
– pmap_bootstrap()
– pmap_init, pmap_create, pmap_destroy …
– Software-managed TLB? -> use the common code in sys/uvm/pmap/
– (used by PowerPC Book E and LM32)
Porting NetBSD
• 8°) Implement copyin/copyout
• 9°) Implement atomic operations
– No atomic instruction -> CAS (Compare And Swap) implemented as a RAS (Restartable Atomic Sequence)
– Other atomic ops built around this CAS
RAS CAS
int _atomic_cas_32(volatile uint32_t *val, uint32_t old, uint32_t new);
_atomic_cas_32:
_atomic_cas_ras_start:
	lw	r4, (r1+0)	/* load *val into r4 */
	bne	r4, r2, 1f	/* if *val != old (r2), skip the store */
	sw	(r1+0), r3	/* *val = new (r3) */
_atomic_cas_ras_end:
1:
	mv	r1, r4		/* return the value read from *val */
	ret
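To show how other atomic operations can be built around this CAS, here is a sketch in C. `sketch_cas_32` is a plain C stand-in for the assembly routine above and is not atomic by itself; on the real port the atomicity comes from the kernel restarting the sequence when it is interrupted. Both function names are invented for this example.

```c
#include <stdint.h>

/*
 * Stand-in for the assembly routine: returns the value found in *val
 * and stores `new_val` only if that value equals `old`.
 * NOT atomic in this plain C form; for illustration only.
 */
uint32_t sketch_cas_32(volatile uint32_t *val, uint32_t old, uint32_t new_val)
{
    uint32_t cur = *val;
    if (cur == old)
        *val = new_val;
    return cur;
}

/*
 * Add built around CAS: compute the new value from a snapshot, then
 * retry if *val changed between the snapshot and the swap.
 * Returns the new value.
 */
uint32_t sketch_atomic_add_32_nv(volatile uint32_t *val, int32_t delta)
{
    uint32_t old, new_val;

    do {
        old = *val;
        new_val = old + (uint32_t)delta;
    } while (sketch_cas_32(val, old, new_val) != old);

    return new_val;
}
```

The same read, compute, CAS, retry loop works for and, or, increment and decrement, so only the CAS itself needs the restartable-sequence treatment.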
Porting NetBSD
• 10°) Add support for interrupts
– Write a function to register interrupt handlers
• 11°) Have a running system clock
– Write cpu_initclocks()
– Write clock irq handler
• Call hardclock()
Other functions to write
• Switch context from one thread to another
– cpu_switchto(9)
• Copy data and abort on page fault
– kcopy(9)
• Save current context
– setfault()
• Low level code to finish up fork() operation
– cpu_lwp_fork(9)
Other functions to write
• Block interrupts to protect critical sections
– spl(9)
• Init CPU and print copyright message
– cpu_startup(9)
• Determine the root file system device
– cpu_rootconf(9)
• Etc…
Porting NetBSD
• To boot user space
– Create dummy ramdisk with /sbin/init
– Build kernel with MFS
– Insert ramdisk with mdsetimage
– Boot it!
Porting NetBSD
DEMO
Thank you!
Sébastien Bourdeauducq, Michael Walle, Robert Swindells, Stefan Kristiansson, Lars-Peter Clausen, Pierre Pronchery, Radoslaw Kujawa, Youri Mouton, Matt Thomas, tech-kern@, M-Labs mailing list, and many more
Questions?
NetBSD/milkymist Memory Layout
[Figure: address space from 0 to 0xffffffff, split into user space and kernel space, with marks at 0xc0000000 and 0xc8000000. Labels: RAM window, user stack, kernel stack, DDR SDRAM: 128 MB]
More Related Content

Similar to Porting NetBSD to the open source LatticeMico32 CPU

Porting NetBSD to the LatticeMico32 open source CPU by Yann Sionneau
Porting NetBSD to the LatticeMico32 open source CPU by Yann SionneauPorting NetBSD to the LatticeMico32 open source CPU by Yann Sionneau
Porting NetBSD to the LatticeMico32 open source CPU by Yann Sionneaueurobsdcon
 
Windows debugging sisimon
Windows debugging   sisimonWindows debugging   sisimon
Windows debugging sisimonSisimon Soman
 
Windows kernel debugging workshop in florida
Windows kernel debugging   workshop in floridaWindows kernel debugging   workshop in florida
Windows kernel debugging workshop in floridaSisimon Soman
 
SignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer OptimizationSignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer OptimizationSignalFx
 
The forgotten art of assembly
The forgotten art of assemblyThe forgotten art of assembly
The forgotten art of assemblyMarian Marinov
 
Hacklu11 Writeup
Hacklu11 WriteupHacklu11 Writeup
Hacklu11 Writeupnkslides
 
Lessons learned from Isbank - A Story of a DB2 for z/OS Initiative
Lessons learned from Isbank - A Story of a DB2 for z/OS InitiativeLessons learned from Isbank - A Story of a DB2 for z/OS Initiative
Lessons learned from Isbank - A Story of a DB2 for z/OS InitiativeCuneyt Goksu
 
Ak12 upgrade
Ak12 upgradeAk12 upgrade
Ak12 upgradeAccenture
 
DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...
DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...
DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...Felipe Prado
 
Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Sperasoft
 
Code and memory optimization tricks
Code and memory optimization tricksCode and memory optimization tricks
Code and memory optimization tricksDevGAMM Conference
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Jonathan Salwan
 
[Advantech] PAC SW Multiprog Tutorial step by step
[Advantech] PAC SW Multiprog Tutorial step by step [Advantech] PAC SW Multiprog Tutorial step by step
[Advantech] PAC SW Multiprog Tutorial step by step Ming-Hung Hseih
 
Windows Debugging with WinDbg
Windows Debugging with WinDbgWindows Debugging with WinDbg
Windows Debugging with WinDbgArno Huetter
 
Fundamentals of Complete Crash and Hang Memory Dump Analysis
Fundamentals of Complete Crash and Hang Memory Dump AnalysisFundamentals of Complete Crash and Hang Memory Dump Analysis
Fundamentals of Complete Crash and Hang Memory Dump AnalysisDmitry Vostokov
 
basic computer programming and micro programmed control
basic computer programming and micro programmed controlbasic computer programming and micro programmed control
basic computer programming and micro programmed controlRai University
 
A compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCoreA compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCoreTadeu Zagallo
 
Ccna2 mod3-configuring a-router
Ccna2 mod3-configuring a-routerCcna2 mod3-configuring a-router
Ccna2 mod3-configuring a-router97148881557
 

Similar to Porting NetBSD to the open source LatticeMico32 CPU (20)

Porting NetBSD to the LatticeMico32 open source CPU by Yann Sionneau
Porting NetBSD to the LatticeMico32 open source CPU by Yann SionneauPorting NetBSD to the LatticeMico32 open source CPU by Yann Sionneau
Porting NetBSD to the LatticeMico32 open source CPU by Yann Sionneau
 
Windows debugging sisimon
Windows debugging   sisimonWindows debugging   sisimon
Windows debugging sisimon
 
Windows kernel debugging workshop in florida
Windows kernel debugging   workshop in floridaWindows kernel debugging   workshop in florida
Windows kernel debugging workshop in florida
 
SignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer OptimizationSignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer Optimization
 
The forgotten art of assembly
The forgotten art of assemblyThe forgotten art of assembly
The forgotten art of assembly
 
Hacklu11 Writeup
Hacklu11 WriteupHacklu11 Writeup
Hacklu11 Writeup
 
Lessons learned from Isbank - A Story of a DB2 for z/OS Initiative
Lessons learned from Isbank - A Story of a DB2 for z/OS InitiativeLessons learned from Isbank - A Story of a DB2 for z/OS Initiative
Lessons learned from Isbank - A Story of a DB2 for z/OS Initiative
 
Ak12 upgrade
Ak12 upgradeAk12 upgrade
Ak12 upgrade
 
DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...
DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...
DEF CON 27 - XILING GONG PETER PI - exploiting qualcom wlan and modem over th...
 
Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks
 
Code and memory optimization tricks
Code and memory optimization tricksCode and memory optimization tricks
Code and memory optimization tricks
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes
 
[Advantech] PAC SW Multiprog Tutorial step by step
[Advantech] PAC SW Multiprog Tutorial step by step [Advantech] PAC SW Multiprog Tutorial step by step
[Advantech] PAC SW Multiprog Tutorial step by step
 
Windows Debugging with WinDbg
Windows Debugging with WinDbgWindows Debugging with WinDbg
Windows Debugging with WinDbg
 
Fundamentals of Complete Crash and Hang Memory Dump Analysis
Fundamentals of Complete Crash and Hang Memory Dump AnalysisFundamentals of Complete Crash and Hang Memory Dump Analysis
Fundamentals of Complete Crash and Hang Memory Dump Analysis
 
2014 ii c08t-sbc pic para ecg
2014 ii c08t-sbc pic para ecg 2014 ii c08t-sbc pic para ecg
2014 ii c08t-sbc pic para ecg
 
basic computer programming and micro programmed control
basic computer programming and micro programmed controlbasic computer programming and micro programmed control
basic computer programming and micro programmed control
 
A compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCoreA compact bytecode format for JavaScriptCore
A compact bytecode format for JavaScriptCore
 
Ccna2 mod3-configuring a-router
Ccna2 mod3-configuring a-routerCcna2 mod3-configuring a-router
Ccna2 mod3-configuring a-router
 
microprocessors
microprocessorsmicroprocessors
microprocessors
 

Recently uploaded

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 

Recently uploaded (20)

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 

Porting NetBSD to the open source LatticeMico32 CPU

  • 1. Porting NetBSD on the open source LatticeMico32 CPU Yann Sionneau M-Labs @ EHSM 2014
  • 2. About me • Yann Sionneau • Embedded software developer • Working at Sequans Communication • M-Labs contributor • @yannsionneau on twitter • Email: yann.sionneau@gmail.com
  • 3. I’m going to talk about… How to run NetBSD and EdgeBSD on the Milkymist One
  • 4. Agenda • I) The hardware part: the MMU –What is a MMU and how it works • II) The software part –How to port NetBSD to a new CPU
  • 8. The Milkymist One uses an FPGA
  • 12. LatticeMico32 CPU • 32 bits Harvard Architecture RISC • Big Endian • 6 stages • Fully bypassed • Optional configurable I/D caches – Direct mapped or – 2-way set associative • Wishbone on-chip bus
  • 13. LatticeMico32 , Good points • Small • Portable (works with several FPGA vendors) • Fast (~100 MHz on Slowtanpartan 6) • Actually works • GCC/Binutils/GDB/Qemu/uCLinux/OpenWRT support • OPEN SOURCE
  • 14. LatticeMico32, Bad points • No Memory Management Unit… yet!
  • 15. LatticeMico32, Bad points • No Memory Management Unit… yet! Done 
  • 16. Used in… • Closed source commercial ASICs • Open source projects • Can achieve 800 MHz in TSMC 90nm standard cell process
  • 18. What’s a pipeline? • « In computing, a pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next one. » -- Pipeline (computing), Wikipedia
  • 19. What’s a pipeline? Data processing element 1 Data processing element 2 Data processing element 3 IN IN INOUTOUT OUT
  • 20. What’s a pipeline? $ cat .bash_history | grep 'cat' | wc -l 6
  • 21. What’s a CPU pipeline?
  • 22. What’s a CPU pipeline?
  • 23. Pipelined instruction execution Instr. number Pipeline Stage 1 A 2 3 4 Clock cycle 1 2 3 4 5 6 7
  • 24. Pipelined instruction execution Instr. number Pipeline Stage 1 A F 2 A 3 4 Clock cycle 1 2 3 4 5 6 7
  • 25. Pipelined instruction execution Instr. number Pipeline Stage 1 A F D 2 A F 3 A 4 Clock cycle 1 2 3 4 5 6 7
  • 26. Pipelined instruction execution Instr. number Pipeline Stage 1 A F D X 2 A F D 3 A F 4 A Clock cycle 1 2 3 4 5 6 7
  • 27. Pipelined instruction execution Instr. number Pipeline Stage 1 A F D X M 2 A F D X 3 A F D 4 A F Clock cycle 1 2 3 4 5 6 7
  • 28. Pipelined instruction execution Instr. number Pipeline Stage 1 A F D X M W 2 A F D X M 3 A F D X 4 A F D Clock cycle 1 2 3 4 5 6 7
  • 29. Pipelined instruction execution Instr. number Pipeline Stage 1 A F D X M W 2 A F D X M W 3 A F D X M 4 A F D X Clock cycle 1 2 3 4 5 6 7
  • 31. Main Memory CPU Internal Raising exception After VIRTUAL ADDRESSES PHYSICAL ADDRESSES
  • 32. What’s the MMU’s job? • Translate « virtual addresses » into « physical addresses » • Memory protection against unwanted execution of code or data write (e.g. software bug or security issue) – Memory right access management
  • 33. Main Memory CPU pipeline VA PA VA : Virtual Address PA : Physical Address
  • 34. Main Memory CPU pipeline VA PA VA : Virtual Address PA : Physical Address How does the MMU know the VA->PA translation ?
  • 35. Main Memory CPU pipeline VA PA VA : Virtual Address PA : Physical Address Page Table
  • 36. Main Memory CPU pipeline VA PA VA : Virtual Address PA : Physical Address Page TableWhy « PAGE »?
  • 37. Why « Page »? • 0x00000004 -> 0x10000000 • 0x00000005 -> 0x10000001 • 0x00000006 -> 0x10000002 Etc…
  • 38. Why « Page »? • 0x00000004 -> 0x10000000 • 0x00000005 -> 0x10000001 • 0x00000006 -> 0x10000002 Etc… This is WRONG!!!
  • 39. Why « Page »? • 0x00000*** -> 0x10000*** • 0x00001*** -> 0x10001*** • 0x00002*** -> 0x10002*** Etc…
  • 40. Main Memory CPU pipeline VA PA VA : Virtual Address PA : Physical Address Page Table
  • 41. Main Memory CPU pipeline VA PA VA : Virtual Address PA : Physical Address Page Table TLB TLB : Translation Lookaside Buffer
  • 42. Main Memory CPU pipeline VA PA VA : Virtual Address PA : Physical Address Page Table TLB Operating System Updates the Gets information from the Updates the
  • 43. Features? • Page size –Only 4 kB 32 bits physical address : xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx How many bits of an address indicate the offset within a given page?
  • 44. Features? • Page size –Only 4 kB 32 bits physical address : xxxxxxxx xxxxxxxx xxxx xxxx xxxxxxxx Page number [31:12] 20 bits Offset [11:0] 12 bits
  • 45. Features? • 2 TLB (Translation Lookaside Buffer) –ITLB –DTLB • Each TLB contains 1024 entries –How many bits needed to index the TLB?
  • 46. Features? • 2 TLB (Translation Lookaside Buffer) –ITLB –DTLB • Each TLB contains 1024 entries –How many bits needed to index the TLB? 10 bits!
  • 47. Features? • No hardware page-tree walker – i.e. TLB is software assisted
  • 48. Virtual address Load or store? Instruction or Data? Physical address Access granted/denied
  • 49. Virtual address Load or store? Instruction or Data? Physical address Access granted/denied I don’t know!
  • 50. Let’s have a look inside
  • 51. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001004
  • 52. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004 (page number, then offset in the page)
  • 53. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4
  • 54. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4. Virtual page number = 0xA0001
  • 55. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4. Virtual page number = 0xA0001. VPN = 0xA0001 -> 1010 0000 0000 0000 0001
  • 56. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4. Virtual page number = 0xA0001. TLB index = 1. VPN = 0xA0001 -> 1010 0000 00 | 00 0000 0001; the low 10 bits are the TLB index, used to select a TLB line
  • 57. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4. Virtual page number = 0xA0001. TLB index = 1. VPN = 0xA0001 -> 1010 0000 00 | 00 0000 0001; the low 10 bits are the TLB index, used to select a TLB line
  • 58. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4. Virtual page number = 0xA0001. TLB index = 1. VPN = 0xA0001 -> 1010 0000 00 | 00 0000 0001. Tag = 0x280 -> 1010 0000 00, equal to the tag stored in the selected TLB line
  • 59. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4. Virtual page number = 0xA0001. TLB index = 1. VPN = 0xA0001 -> 1010 0000 00 | 00 0000 0001. Tag = 0x280 -> 1010 0000 00, equal to the tag stored in the selected TLB line. Physical page number = 0xB0001
  • 60. The TLB: columns Tag [10] | Physical page number [20] | Read-only [1] | Valid [1]; rows 0xABC | 0xABC00 | 0 | 0; 0x280 | 0xB0001 | 1 | 1; 0x300 | 0x00001 | 0 | 1. VA = 0xA0001 004. Page offset = 4. Virtual page number = 0xA0001. TLB index = 1. VPN = 0xA0001 -> 1010 0000 00 | 00 0000 0001. Tag = 0x280 -> 1010 0000 00, equal to the tag stored in the selected TLB line. Physical page number = 0xB0001. Physical address = 0xB0001004
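The lookup walked through on slides 51 to 60 can be written as a few lines of C. This is a sketch of the arithmetic only, not the actual LM32 hardware description; struct tlb_entry, tlb_lookup and demo_tlb are hypothetical names, and only the line used in the worked example (index 1) is populated.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12u                  /* offset: bits [11:0] */
#define INDEX_BITS  10u                  /* 1024 TLB lines */
#define TLB_ENTRIES (1u << INDEX_BITS)

struct tlb_entry {
    uint32_t tag;        /* upper 10 bits of the virtual page number */
    uint32_t ppn;        /* 20-bit physical page number */
    bool     read_only;
    bool     valid;
};

/* TLB line 1 holds the entry from the slides: tag 0x280 -> page 0xB0001. */
struct tlb_entry demo_tlb[TLB_ENTRIES] = {
    [1] = { 0x280u, 0xB0001u, true, true },
};

/* Returns true on a hit and stores the physical address in *pa. */
bool tlb_lookup(const struct tlb_entry *tlb, uint32_t va, uint32_t *pa)
{
    uint32_t offset = va & ((1u << PAGE_SHIFT) - 1u);
    uint32_t vpn    = va >> PAGE_SHIFT;          /* bits [31:12] */
    uint32_t index  = vpn & (TLB_ENTRIES - 1u);  /* low 10 bits of the VPN */
    uint32_t tag    = vpn >> INDEX_BITS;         /* high 10 bits of the VPN */
    const struct tlb_entry *e = &tlb[index];

    if (!e->valid || e->tag != tag)
        return false;                            /* TLB miss */
    *pa = (e->ppn << PAGE_SHIFT) | offset;
    return true;
}
```

For VA = 0xA0001004 this selects line 1, matches tag 0x280 and produces the physical address 0xB0001004, as on slide 60. A store to that page would additionally have to honour the read-only bit, which this sketch does not check.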
  • 61. Porting NetBSD • 1°) NetBSD cross compilation toolchain – build.sh – Makefiles here and there – Arch-specific directories This makes the following possible: $ ./build.sh -U -m lm32 tools
  • 62. Porting NetBSD • 2°) Support for built-ins in libkern – NetBSD kernel is • Not linked against libgcc • Linked against libkern – Need to implement the basic arithmetic functions that gcc emits calls to in object code – Implementation in sys/lib/libkern/arch/lm32
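As an illustration of the kind of routine this step refers to: when a target has no suitable instruction, gcc calls helper functions (libgcc names such as __udivsi3 or __mulsi3) that the kernel must then provide itself. The slides do not list which helpers the lm32 port needed, so the function below is a generic sketch under a hypothetical name, not the code in sys/lib/libkern/arch/lm32.

```c
#include <stdint.h>

/*
 * Unsigned 32-bit division by shift-and-subtract (restoring division).
 * Dividing by zero returns 0 here; that is a choice made for this
 * sketch, not a documented behaviour of any real helper.
 */
uint32_t soft_udiv32(uint32_t num, uint32_t den)
{
    uint32_t quot = 0;
    uint32_t rem = 0;

    if (den == 0)
        return 0;

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. */
        rem = (rem << 1) | ((num >> bit) & 1u);
        if (rem >= den) {
            rem -= den;
            quot |= 1u << bit;
        }
    }
    return quot;
}
```

The loop uses only shifts, compares and subtractions, so it does not itself depend on a divide instruction.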
  • 63. Porting NetBSD • 3°) Building my first kernel – Create sys/arch/lm32 and sys/arch/milkymist – Populate • sys/arch/<cpu|soc>/include • sys/arch/<cpu|soc>/conf – Stub, stub, stub… This makes the following possible: $ ./build.sh -m milkymist -U kernel=GENERIC
  • 64. Porting NetBSD • 4°) Write basic console driver for early prints
        struct consdev milkymist_com_cons = {
                […]
                milkymist_com_cngetc, /* cn_getc: kernel getchar interface */
                milkymist_com_cnputc, /* cn_putc: kernel putchar interface */
                […]
        };
  • 65. Porting NetBSD • 5°) Implement exception handlers • 6°) Call milkymist_startup() C code – Initialize console driver • -> consinit() -> milkymist_uart_cnattach() • cn_tab = &milkymist_com_cons; – Initialize virtual memory subsystem • Call MD pmap_bootstrap() – Let the kernel boot • Call NetBSD MI main()
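The console hookup from steps 4 and 6 boils down to a table of function pointers and a global pointer to the active table. The sketch below shows that idea with a deliberately simplified structure; struct mini_consdev, cn_tab_sketch and the fake_ functions are hypothetical, the real struct consdev has more members and different signatures, and a memory buffer replaces the UART transmit register.

```c
#include <stddef.h>

/* Simplified stand-in for NetBSD's struct consdev (hypothetical). */
struct mini_consdev {
    int  (*cn_getc)(void);     /* kernel getchar interface */
    void (*cn_putc)(int c);    /* kernel putchar interface */
};

/* A memory buffer plays the role of the UART transmit register. */
char   fake_uart[64];
size_t fake_uart_len;

static void fake_cnputc(int c)
{
    if (fake_uart_len < sizeof(fake_uart) - 1)
        fake_uart[fake_uart_len++] = (char)c;
}

static int fake_cngetc(void)
{
    return -1;                 /* no input in this sketch */
}

static struct mini_consdev fake_cons = { fake_cngetc, fake_cnputc };

/* Plays the role of cn_tab: NULL until the console is attached. */
struct mini_consdev *cn_tab_sketch;

/* Counterpart of consinit(): select the console driver. */
void consinit_sketch(void)
{
    cn_tab_sketch = &fake_cons;
}

/* Early print: push each character through the selected driver. */
void early_puts(const char *s)
{
    if (cn_tab_sketch == NULL)
        return;                /* console not attached yet: drop output */
    while (*s != '\0')
        cn_tab_sketch->cn_putc(*s++);
}
```

Output sent before consinit_sketch() runs is silently lost, which is why the console is initialized so early in the startup code.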
  • 66. Porting NetBSD • 7°) Implement pmap(9): the machine-dependent portion of the virtual memory system – pmap_bootstrap() – pmap_init(), pmap_create(), pmap_destroy() … – Software-managed TLB? -> sys/uvm/pmap/ (used by PowerPC Booke and LM32)
  • 67. Porting NetBSD • 8°) Implement copyin/copyout • 9°) Implement atomic operations – No atomic instruction -> RAS (Restartable Atomic Sequence) CAS (Compare And Swap) – Other atomic ops built around this CAS
  • 68. RAS CAS
        int _atomic_cas_32(volatile uint32_t *val, uint32_t old, uint32_t new);
        _atomic_cas_32:
        _atomic_cas_ras_start:
                lw   r4, (r1+0)      /* load *val into r4 */
                bne  r4, r2, 1f      /* compare r4 (*val) and old (r2) */
                sw   (r1+0), r3
        _atomic_cas_ras_end:
        1:
                mv   r1, r4          /* return (*val) */
                ret
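To show how other atomic operations can be built around this CAS, here is a C sketch of an atomic add written as a retry loop. cas32_sketch is a plain C stand-in for the assembly routine and is not atomic by itself; on the real system the atomicity comes from the kernel restarting the sequence between the two RAS labels if it gets interrupted. Both function names are hypothetical, and the stand-in returns uint32_t rather than the int of the slide's prototype.

```c
#include <stdint.h>

/*
 * Stand-in for _atomic_cas_32: if *val equals old, store new_val.
 * Always returns the value that was read from *val.
 * NOT atomic in plain C; see the note above.
 */
uint32_t cas32_sketch(volatile uint32_t *val, uint32_t old, uint32_t new_val)
{
    uint32_t cur = *val;

    if (cur == old)
        *val = new_val;
    return cur;
}

/* Atomic add built on CAS: retry until nobody changed *val in between. */
uint32_t atomic_add_32_sketch(volatile uint32_t *val, uint32_t delta)
{
    uint32_t old;

    do {
        old = *val;
    } while (cas32_sketch(val, old, old + delta) != old);

    return old + delta;        /* the new value */
}
```

The same loop shape works for and, or, increment and swap: compute the new value from a snapshot, then let the CAS decide whether the snapshot was still current.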
  • 69. Porting NetBSD • 10°) Add support for interrupts – Write a function to register interrupt handlers • 11°) Have a running system clock – Write cpu_initclocks() – Write clock irq handler • Call hardclock()
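A function to register interrupt handlers usually amounts to a small table plus a dispatcher called from the interrupt exception. The sketch below assumes 32 interrupt lines reported as a bitmask; irq_establish, irq_dispatch and count_tick are hypothetical names and do not reproduce the interface of the actual port.

```c
#include <stddef.h>
#include <stdint.h>

#define NIRQ 32u                       /* assumption: one bit per line */

typedef void (*irq_handler_t)(void *arg);

struct irq_slot {
    irq_handler_t fn;
    void         *arg;
};

static struct irq_slot irq_table[NIRQ];

/* Register a handler; returns 0 on success, -1 if invalid or taken. */
int irq_establish(unsigned irq, irq_handler_t fn, void *arg)
{
    if (irq >= NIRQ || fn == NULL || irq_table[irq].fn != NULL)
        return -1;
    irq_table[irq].fn = fn;
    irq_table[irq].arg = arg;
    return 0;
}

/* Called with the pending bitmask; returns how many handlers ran. */
unsigned irq_dispatch(uint32_t pending)
{
    unsigned handled = 0;

    for (unsigned irq = 0; irq < NIRQ; irq++) {
        if ((pending & (1u << irq)) != 0 && irq_table[irq].fn != NULL) {
            irq_table[irq].fn(irq_table[irq].arg);
            handled++;
        }
    }
    return handled;
}

/* Example handler: counts ticks, where a clock handler would call hardclock(). */
void count_tick(void *arg)
{
    (*(unsigned *)arg)++;
}
```

A clock driver would register its handler once from cpu_initclocks() and then let the timer interrupt drive it.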
  • 70. Other functions to write • Switch context from one thread to another – cpu_switchto(9) • Copy data and abort on page fault – kcopy(9) • Save current context – setfault() • Low level code to finish up fork() operation – cpu_lwp_fork(9)
  • 71. Other functions to write • Block interrupts to protect critical sections – spl(9) • Init CPU and print copyright message – cpu_startup(9) • Determine the root file system device – cpu_rootconf(9) • Etc…
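The spl(9) idea of blocking interrupts around a critical section can be modelled in a few lines. This sketch treats priority levels as plain integers and keeps the current level in a variable; a real port would translate each level into hardware interrupt-mask state. All names ending in _sketch, and irq_allowed, are hypothetical.

```c
/* Current interrupt priority level; 0 means nothing is blocked. */
static int current_ipl;

/* Raise the level (never lower it) and return the previous one. */
int splraise_sketch(int level)
{
    int old = current_ipl;

    if (level > current_ipl)
        current_ipl = level;
    return old;
}

/* Restore a level previously returned by splraise_sketch(). */
void splx_sketch(int old)
{
    current_ipl = old;
}

/* An interrupt is delivered only if its level is above the current one. */
int irq_allowed(int irq_level)
{
    return irq_level > current_ipl;
}
```

Kernel code follows the pattern: s = raise; touch the shared data; restore s. Because the old level is saved and restored, critical sections nest correctly.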
  • 72. Porting NetBSD • To boot user space – Create dummy ramdisk with /sbin/init – Build kernel with MFS – Insert ramdisk with mdsetimage – Boot it!
  • 74. Thank you! Sébastien Bourdeauducq, Michael Walle, Robert Swindells, Stefan Kristiansson, Lars-Peter Clausen, Pierre Pronchery, Radoslaw Kujawa, Youri Mouton, Matt Thomas, tech-kern@, M-Labs mailing list, and many more
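  • 76. NetBSD/milkymist Memory Layout (diagram): address space from 0 to 0xffffffff, divided into user space and kernel space, with boundaries marked at 0xc0000000 and 0xc8000000; labelled regions: RAM window, user stack, kernel stack; DDR SDRAM: 128 MB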

Editor's Notes

  1. Say « Memory Management Unit »
  2. Electronic device aimed at generating artistic video performance in parties and concerts
  3. You can capture live dancers, apply video effects such as rotation, zoom in/out and translation, and project the result onto a screen or a wall. It reacts in real time, synchronized to the audio input, and can be controlled via a MIDI keyboard or DMX (a protocol used to control stage lighting and effects).
  4. Array of configurable logic blocks, linked together by a programmable switch matrix
  5. Previously I said it’s slow to access main memory. Here the MMU accesses the PT (in RAM) each time to get translations; aren’t we slowing our CPU down?
  6. TLB: clever word for « cache for PVA -> PPA translations ». The first time you want to translate a page, you go to the PT in RAM. The next time you translate the same page, you get a TLB hit in 1 cycle.
  7. In LM32, like MIPS or PowerPC Booke, the MMU does not read the page table itself to refill the TLB (no hardware page tree walker). Instead the MMU raises an exception and lets the OS update the TLB. The TLB is entirely managed by software.
  8. kcopy: copies data like memcpy, aborts on page fault. setfault: saves the current context for later restoring if we take a page fault. cpu_lwp_fork() is the machine-dependent portion of fork1(), which finishes a fork operation.
  9. cpu_startup: init CPU, print copyright message. spl: raise and lower the interrupt priority level, used by kernel code to block interrupts in critical sections. cpu_rootconf: determine the root file system device.
  10. Thank you for attending, and thanks to all those who helped with this work.