1. CPU Verification Metrics
Shahram Salamian
CPU Verification Manager
Mobility Group - Texas Design Center
6/27/2006
2. CPU Verification
- Architectural Verification (AV)
- Implementation meets Intel architecture definition
- Arch state compared against Architectural simulator
- Split across a few categories
- Done at full chip (FC).
- uArchitecture Verification (uAV)
- Cluster level (Fetch, LD/ST, etc)
- Full chip
- Cluster level checkers, templates, etc
- Power management features Verification
- Formal Verification
- System level Verification
- DFD/DFT feature Verification
3. Architecture Verification
- Set of directed & semi-random templates generating instruction
set level (assembly) tests
- Accumulated over years. Assumed to have high coverage
- Large test base. Smaller subset with good sampling used first
- Highly scrutinized by mgmt. Needs to be almost perfect by tape out
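As a rough illustration of tracking pass rate against a goal line, the following sketch computes a weekly pass rate and lists the work weeks that fall short of their goal. The names `WeekResult`, `pass_rate`, `weeks_below_goal` and all of the numbers are hypothetical; the slides do not describe the actual tooling or goals.

```python
from dataclasses import dataclass


@dataclass
class WeekResult:
    work_week: int
    passed: int  # tests that passed this week
    run: int     # tests that were run this week


def pass_rate(result: WeekResult) -> float:
    """Percent of tests that passed in one work week (0.0 if none ran)."""
    return 100.0 * result.passed / result.run if result.run else 0.0


def weeks_below_goal(results, goal_line):
    """Return (work_week, pass_rate) for every week under its goal.

    goal_line maps work_week -> required pass rate in percent.
    """
    behind = []
    for r in results:
        rate = pass_rate(r)
        if rate < goal_line.get(r.work_week, 0.0):
            behind.append((r.work_week, round(rate, 1)))
    return behind


# Hypothetical data, for illustration only.
results = [
    WeekResult(1, 450, 1000),
    WeekResult(2, 700, 1000),
    WeekResult(3, 930, 1000),
]
goal_line = {1: 40.0, 2: 75.0, 3: 95.0}
print(weeks_below_goal(results, goal_line))  # [(2, 70.0), (3, 93.0)]
```

Because the goal approaches 100% near tape out, a rate that looks high in absolute terms (93% here) can still be behind plan.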
[Chart: Legacy Tests Passrate. Y-axis: % passing, 0 to 100. X-axis: work week. Series: pass_rate, goal_line.]
4. UArchitecture Verification
Coverage
- Functional coverage conditions jointly specified by verification & design
- Internal tools to specify and measure
- Use random and/or directed-random templates to cover
- Conditions are typically prioritized based on complexity, bugs, etc
- Tape-out targets vary by cluster.
- Focusing on raw coverage can be misleading
- A small set of "easy"-to-cover monitors can skew the covered %
- Can be misinterpreted by mgmt as having great or bad coverage
- Has to be looked at in conjunction with bug count, pass rate, etc
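To make the skew concrete, here is a minimal sketch comparing raw coverage with a priority-weighted figure. The monitor records, the `priority` field, and the weighting scheme are assumptions for illustration; the slides only say that conditions are prioritized, not how the internal tools report them.

```python
def raw_coverage(monitors):
    """Percent of monitors hit, every monitor counted equally."""
    hit = sum(1 for m in monitors if m["hit"])
    return 100.0 * hit / len(monitors)


def weighted_coverage(monitors):
    """Percent of total priority weight that has been hit."""
    total = sum(m["priority"] for m in monitors)
    hit = sum(m["priority"] for m in monitors if m["hit"])
    return 100.0 * hit / total


# Hypothetical data: eight easy, low-priority monitors are covered,
# while two hard, high-priority monitors are not.
monitors = (
    [{"name": f"easy_{i}", "priority": 1, "hit": True} for i in range(8)]
    + [
        {"name": "hard_0", "priority": 6, "hit": False},
        {"name": "hard_1", "priority": 6, "hit": False},
    ]
)

print(raw_coverage(monitors))       # 80.0
print(weighted_coverage(monitors))  # 40.0
```

The same data reads as 80% covered or 40% covered depending on the view, which is why a single raw percentage should not be reported on its own.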
5. Bug Rate
[Chart: Bug Rate. Y-axis: 0 to 160. Legend: New Bugs (14), Open Bugs (7), Open LT Bugs (0), "Smoke Alarm".]
- Many different views of bug database (Total bugs, open bugs, etc)
- Smoke alarm set based on previous projects' bug history
- Exceeding smoke alarm causes scrutiny by design & validation
- Design reviews of areas where bug count jumps up
- At times, it is a sign of better checkers or new tests coming online
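One possible way to express the smoke alarm check is sketched below. How the alarm level is actually derived is not stated in the slides; taking the highest weekly count seen on previous projects is purely an assumption, as are the function names and the data.

```python
def alarm_from_history(histories):
    """Assumed rule: alarm level per week is the highest new-bug count
    any previous project saw in that week."""
    weeks = histories[0].keys()
    return {wk: max(h[wk] for h in histories) for wk in weeks}


def smoke_alarm_weeks(new_bugs, alarm):
    """Return the weeks whose new-bug count exceeds that week's alarm level."""
    return [wk for wk, count in sorted(new_bugs.items()) if count > alarm[wk]]


# Hypothetical weekly new-bug counts, keyed by work week.
previous_projects = [
    {10: 25, 11: 30, 12: 20},
    {10: 30, 11: 28, 12: 25},
]
current_project = {10: 14, 11: 35, 12: 22}

alarm = alarm_from_history(previous_projects)
print(alarm)                                      # {10: 30, 11: 30, 12: 25}
print(smoke_alarm_weeks(current_project, alarm))  # [11]
```

A flagged week is a prompt for review, not a verdict: as noted above, a jump can also come from better checkers or new tests.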
6. RTL Lines Of Change
[Chart: RTL Lines Of Change. Y-axis: 0 to 80000. X-axis: month, Sep-00 to Nov-02. Series: # of changed lines, # of RTL checkins.]
- RTL change rate measures stability; stable RTL allows the verification
team to make progress in exercising it
- Also measure RTL change request rate & type of request
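The two series in this metric (changed lines and check-ins per period) can be tallied from a check-in log along the following lines. The log format and the function name `change_rate_by_month` are hypothetical; the slides do not say which revision control system or reporting tool was used.

```python
from collections import defaultdict


def change_rate_by_month(checkins):
    """Aggregate a check-in log into per-month totals.

    checkins: iterable of (month, lines_changed) pairs.
    Returns {month: (number_of_checkins, total_lines_changed)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for month, lines in checkins:
        totals[month][0] += 1
        totals[month][1] += lines
    return {month: tuple(counts) for month, counts in totals.items()}


# Hypothetical check-in log, for illustration only.
checkins = [
    ("2001-01", 1200),
    ("2001-01", 300),
    ("2001-02", 450),
]
print(change_rate_by_month(checkins))
# {'2001-01': (2, 1500), '2001-02': (1, 450)}
```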
7. Health Of the Model (HOM)
[Chart: Health of the Model (HOM). Y-axis: Score, 0 to 100. X-axis: Time to Tape out, Qtr 5 to Qtr 0.]
- Measures functional convergence trend. Informs project on how RTL
health is affecting verification team's progress
- Uses an empirical formula based on past projects' data
- Incorporates new bugs, bugs unresolved, and verification team's
ability to make progress. Subjective & quantitative components
- Low HOM relative to goal drives actions on fixing bugs and issues
affecting verification
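The slides do not give the HOM formula, so the sketch below is only a guess at its general shape: a 0 to 100 score that drops with new bugs, unresolved bugs, and a subjective rating of how well the verification team can make progress. The function name, the linear form, and every weight are hypothetical; a real formula would be fitted to past projects' data.

```python
def hom_score(new_bugs, unresolved_bugs, progress_rating,
              w_new=1.0, w_unresolved=2.0, w_progress=4.0):
    """Illustrative health score in the range 0 to 100 (higher is healthier).

    new_bugs, unresolved_bugs: quantitative inputs for the period.
    progress_rating: subjective rating, 0 (blocked) to 10 (unimpeded).
    The weights are placeholders, not values from the presentation.
    """
    penalty = (
        w_new * new_bugs
        + w_unresolved * unresolved_bugs
        + w_progress * (10 - progress_rating)
    )
    # Clamp so that a very unhealthy model bottoms out at 0.
    return max(0.0, min(100.0, 100.0 - penalty))


print(hom_score(new_bugs=14, unresolved_bugs=7, progress_rating=8))   # 64.0
print(hom_score(new_bugs=60, unresolved_bugs=30, progress_rating=2))  # 0.0
```

Comparing the score with a per-quarter goal then drives the actions described above.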
8. 2nd Tier metrics
- Cycles run each week, licenses, etc
- Bugs caught at full chip vs cluster.
- Used to improve test bench quality
- Bug cause
- New condition hit, or a result of timing, another bug fix, etc
- Test bench & other validation collateral bugs
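As a small example of the full chip vs cluster view, the sketch below computes, per cluster, the share of bugs that were only caught at full chip. The record fields (`cluster`, `found_at`) and the data are assumptions; the idea is that a high share hints at a gap in that cluster's test bench.

```python
from collections import Counter


def full_chip_share(bugs):
    """Return {cluster: percent of that cluster's bugs found at full chip}."""
    total = Counter(b["cluster"] for b in bugs)
    at_fc = Counter(b["cluster"] for b in bugs if b["found_at"] == "full_chip")
    return {c: 100.0 * at_fc[c] / total[c] for c in total}


# Hypothetical bug records, for illustration only.
bugs = [
    {"cluster": "fetch", "found_at": "cluster"},
    {"cluster": "fetch", "found_at": "cluster"},
    {"cluster": "fetch", "found_at": "cluster"},
    {"cluster": "fetch", "found_at": "full_chip"},
    {"cluster": "ld_st", "found_at": "full_chip"},
    {"cluster": "ld_st", "found_at": "cluster"},
]
print(full_chip_share(bugs))  # {'fetch': 25.0, 'ld_st': 50.0}
```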