Implementing Useful Skew
Using Skew Groups
Matthew Mei
Cisco Systems
Outline
• Overview of skew
• Example design affected by skew
• What is useful skew
• Using skew groups to achieve useful skew
• Experimental results of trials on example design
• Inserting clock buffers to achieve useful skew
• Comparing skew groups and buffer insertion
• Conclusions
Skew
[Diagram: a clock port driving a launch flip flop and a capture flip flop]
• Skew equals the insertion delay at the capture flip flop minus the insertion delay at the launch flip flop
• The insertion delay is reported by:
report_clock_timing -to <pin> -type latency -setup
• Common path pessimism removal is reported by:
report_crpr -from <pin1> -to <pin2> -setup
The Example Design
• The design used 40 nm technology
• The block was about 8000 µm × 4000 µm
• Block utilization was about 75%, while standard cell utilization was only about 20% (~600K cells)
• The block was mostly Ternary Content Addressable Memories (TCAMs), which are large memory macros used for fast searches
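Example Failing Path (Diagram)
[Diagram: clk_core drives the memory (launch) and the capture flip flops. Annotated delays: 1.1783 ns clock insertion delay at the memory, 1.0460 ns clock insertion delay at the capture flip flops, 1.4831 ns through the memory, 0.0000 ns on the net]
• Thus, the skew is equal to:
1.0460 ns - 1.1783 ns = -0.132 ns
• Therefore, this timing path has -132 ps of skew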
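As a quick check of the arithmetic above, the following Python sketch computes skew as capture insertion delay minus launch insertion delay, using the two clock network delays of the example path. The function name `skew_ns` is illustrative only and is not part of any tool.

```python
def skew_ns(capture_insertion_ns: float, launch_insertion_ns: float) -> float:
    """Skew = insertion delay at capture minus insertion delay at launch."""
    return capture_insertion_ns - launch_insertion_ns


# Clock network delays of the example failing path.
launch_ns = 1.1783   # clock arrival at the memory (launch)
capture_ns = 1.0460  # clock arrival at the capture flip flop

path_skew = skew_ns(capture_ns, launch_ns)
print(f"skew = {path_skew:.4f} ns ({path_skew * 1000:.0f} ps)")
```

A negative value means the capture clock arrives earlier than the launch clock, which shortens the time available for setup.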
Example Failing Path (Timing Report)
Path Type: max
Point                                        Incr       Path
----------------------------------------------------------------
clock clk_core (rise edge)                 0.0000     0.0000
clock network delay (propagated)           1.1783     1.1783
w/m_36x1/CLK                               0.0000     1.1783 r
w/m_36x1/QXY[13]                           1.4831     2.6614 f
w/r0_data_read1_s_36x1_13_ (net)           0.0000     2.6614 f
w/r1_data_read1_s_36x1_reg_13_/D           0.0000 &   2.6614 f
data arrival time                                     2.6614

clock clk_core (rise edge)                 1.6670     1.6670
clock network delay (propagated)           1.0460     2.7130
clock uncertainty                         -0.0580     2.6550
w/r1_data_read1_s_36x1_reg_13_/CK          0.0000     2.6550 r
library setup time                        -0.1197     2.5353
data required time                                    2.5353
----------------------------------------------------------------
data required time                                    2.5353
data arrival time                                    -2.6614
----------------------------------------------------------------
slack (VIOLATED)                                     -0.1261
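The slack in the report can be reproduced from its individual entries. The sketch below is a minimal Python model of the setup check, assuming slack is data required time minus data arrival time; the function and argument names are illustrative and do not come from the tool.

```python
def setup_slack_ns(launch_clock_ns, data_path_ns, period_ns,
                   capture_clock_ns, uncertainty_ns, setup_ns):
    """Setup slack = data required time minus data arrival time."""
    arrival_ns = launch_clock_ns + data_path_ns
    required_ns = period_ns + capture_clock_ns - uncertainty_ns - setup_ns
    return required_ns - arrival_ns


# Values copied from the timing report.
report = dict(launch_clock_ns=1.1783, data_path_ns=1.4831, period_ns=1.6670,
              capture_clock_ns=1.0460, uncertainty_ns=0.0580, setup_ns=0.1197)

slack = setup_slack_ns(**report)
print(f"slack = {slack:.4f} ns")

# Same path with the capture clock arriving as late as the launch clock (zero skew).
balanced = dict(report, capture_clock_ns=report["launch_clock_ns"])
print(f"slack with zero skew = {setup_slack_ns(**balanced):.4f} ns")
```

With these numbers the -0.1323 ns of skew is slightly larger than the -0.1261 ns violation, so in this simple model the path would just meet timing if the two clock arrivals were equal.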
Example Failing Path (Layout)
• Pipeline flip flops had already been added and magnet placed
Using Skew Groups to Achieve Useful Skew
[Diagram: clk_core feeding the TCAMs and the pipeline flip flops, with the target skew marked on the clock path highlighted in red]
• To improve the setup timing performance, delay can be added to the red clock path
• Skew groups were tried first to achieve the target skew
• Manual buffer insertion was also tried (covered later)
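To illustrate why adding delay to the capture clock path helps setup timing, here is a first-order Python sketch. It assumes that every picosecond added to the capture clock moves the data required time later by the same amount and that nothing else on the path changes. The added-delay values are hypothetical and chosen only for illustration; they are not the target skews or results of the trials.

```python
def setup_slack_with_added_capture_delay(slack_ns: float,
                                         added_delay_ns: float) -> float:
    """First-order model: delaying the capture clock by d raises setup slack by d."""
    return slack_ns + added_delay_ns


baseline_slack_ns = -0.1261  # slack of the example failing path

# Hypothetical added delays, for illustration only.
for added in (0.05, 0.10, 0.15):
    new_slack = setup_slack_with_added_capture_delay(baseline_slack_ns, added)
    status = "MET" if new_slack >= 0 else "VIOLATED"
    print(f"+{added:.2f} ns -> slack {new_slack:+.4f} ns ({status})")
```

In this model the example path needs at least 0.1261 ns of extra capture clock delay to meet setup. The same delay is taken away from setup on paths launched by the delayed flip flops and from hold on paths captured by them, which is why hold results need to be tracked alongside setup.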
Skew Groups
• Skew groups were defined before clock tree synthesis
• The following commands were used before clock_opt to create a skew group:
set_skew_group -name <name> -target_skew <skew> <pins list>
report_skew_group -name <name>
commit_skew_group
• The pins list in the example design included the clock pins of about 8000 flip flops
• Target skews of 50 ps, 120 ps, 200 ps, 240 ps, and 300 ps were tried
Skew Groups
Effective Skew vs. Target Skew
[Chart: Effective Skew vs. Target Skew. x-axis: Target Skew (ns); y-axis: Effective Skew (ns); series: Clock Opt Effective Skew, Route Opt Effective Skew, Post Route Effective Skew]
Skew Groups
Setup Timing Performance
[Chart: Negative Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns); series: WNS, TNS]
[Chart: Failing Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths]
Skew Groups
Hold Timing Performance
[Chart: Failing Hold Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths]
[Chart: Negative Hold Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns); series: Worst Hold, Total Hold]
Skew Groups
Path Skew Distribution
[Chart: Cumulative Distribution of Path Skew Among Skew Group Flip Flops. x-axis: Skew of Individual Path (ns); y-axis: Number of Flops (Cumulative); series: Effective Skew 0.005 ns, 0.085 ns, 0.121 ns, 0.138 ns]
Skew Groups
Effects on Clock Tree
• Using skew groups causes the clock tree to branch out at an early level
• As a result, the TCAMs and the pipeline flip flops share little or no common clock path, and zero common path pessimism was removed between them
• The clock tree becomes more complex, with more cells and more routing
Skew Groups
Clock Tree Cells and Buffer Area
[Chart: Clock Tree vs. Target Skew. x-axis: Target Skew (ns), with points Control, 0.05, 0.12, 0.2, 0.24, 0.3; y-axes: Buffer Area (µm²) and Number of Clock Cells; series: Buffer Area, Clock Cells]
• Skew groups increased the clock tree size by about 250 cells
Skew Groups
Power Consumption
[Chart: Power Increase vs. Target Skew. x-axis: Target Skew (ns), with points 0.05, 0.12, 0.2, 0.24, 0.3; y-axes: Increase in Total Power (%) and Increase in Clock Tree Power (%); series: Percent Total Power Increase, Percent Clock Tree Power Increase]
• On average, clock tree power increased by 5.16% and total block power by 0.66%
Manual Buffer Insertion to Achieve Useful Skew
[Diagram: clk_core feeding the TCAMs and the pipeline flip flops, with the target skew marked on the clock path]
• The intuitive way of inserting delay is to manually insert clock buffers:
insert_buffer -no_of_cells <num buffers> <pins list> <buffer type>
• The resulting skew is set by the number and type of buffers rather than specified as a numerical value
Manual Buffer Insertion
• Clock buffers were inserted right before clock tree routing
• Two buffers of low drive strength were used. Each buffer added about 40 ps of delay
• The pins list in the example design included the clock pins of the same ~8000 flip flops
• The clock buffer insertion resulted in a “Post Route Effective Skew” of about 0.084 ns
• On average, 38 ps of common path pessimism was removed between the TCAMs and the flip flops
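Because the skew from buffer insertion follows from the buffer count and type, the expected added delay can be estimated up front. The Python sketch below multiplies the buffer count by the approximate per-buffer delay quoted above and compares it with the reported post-route effective skew. The function name is illustrative, and the comparison is only indicative, since the slides do not state how effective skew is computed.

```python
def nominal_added_delay_ps(num_buffers: int, delay_per_buffer_ps: float) -> float:
    """Nominal delay added to a clock pin by a chain of identical buffers."""
    return num_buffers * delay_per_buffer_ps


nominal_ps = nominal_added_delay_ps(num_buffers=2, delay_per_buffer_ps=40.0)
reported_ps = 84.0  # post-route effective skew of about 0.084 ns

print(f"nominal added delay: {nominal_ps:.0f} ps")
print(f"reported effective skew: {reported_ps:.0f} ps "
      f"(difference {reported_ps - nominal_ps:.0f} ps)")
```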
Manual Buffer Insertion
Setup Timing Performance
[Chart: Negative Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns); series: WNS, WNS (clkbuf), TNS, TNS (clkbuf)]
[Chart: Failing Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths; series: Failing Paths, Failing Paths (clkbuf)]
Manual Buffer Insertion
Hold Timing Performance
[Chart: Failing Hold Paths vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Failing Paths; series: Failing Paths, Failing Paths (clkbuf)]
[Chart: Negative Hold Slack vs. Effective Skew. x-axis: Effective Skew (ns); y-axis: Negative Slack (ns); series: Worst Hold, Worst Hold (clkbuf), Total Hold, Total Hold (clkbuf)]
Manual Buffer Insertion
Path Skew Distribution
[Chart: Cumulative Distribution of Path Skew Among Skew Group Flip Flops. x-axis: Path Skew (ns); y-axis: Number of Flops (Cumulative); series: Effective Skew 0.005 ns, 0.085 ns, 0.121 ns, 0.138 ns, and clkbuf]
Manual Buffer Insertion
Power Consumption
• Buffer insertion resulted in about 22000 clock cells, dramatically increasing power
[Chart: Power Increase vs. Target Skew. x-axis: Target Skew (ns), with points 0.05, 0.12, 0.2, 0.24, 0.3, clkbuf; y-axes: Increase in Total Power (%) and Increase in Clock Tree Power (%); series: Percent Total Power Increase, Percent Clock Tree Power Increase]
Conclusions
• Both methods are easy to set up in IC Compiler
• Skew groups:
– Make it easy to specify the target skew
– Result in a smaller increase in cells, power, and area
• Manual buffer insertion:
– Relies on past experience for buffer selection
– Results in a larger increase in cells, power, and area
Questions?
