here

Of 67:1
Optimization Techniques for
High–Performance Digital Circuits
Chandu Visweswariah
IBM Thomas J. Watson Research Center
Yorktown Heights, NY
(914)945–3124
[email protected]
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Embedded Tutorial
Session 4B
Optimization Techniques for High–Performance Digital Circuits
Acknowledgements
Nonlinear optimization
Andy Conn and Chai Wah Wu
Dynamic tuning
Ruud Haring
Static tuning
Sachin Sapatnekar, University of Minnesota
Statistical tuning
Sani Nassif, IBM Austin Research Lab.
Special topics and suggestions on the paper
Phillip Restle (clock/interconnect tuning), Ken Shepard
(noise–aware tuning), Prabhakar Kudva
Project partners
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Phil Strenski, Bill Wright, Nam Nguyen, Abe Elfadel, David
Ling, Paula Coulman, Greg Morrill, Katya Scheinberg,
Peter O’Brien, Jeff Soreff
Optimization Techniques for High–Performance Digital Circuits
Of 67:2
Warning
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Opinionated presentation ahead!
Optimization Techniques for High–Performance Digital Circuits
Of 67:3
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:4
Trends
Introduction and Motivation
increased custom design
competition drives aggressive tuning with focus on
performance
quicker turnaround (design re–use is key)
focus on manufacturability
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
for the highest possible performance
for rapid, repeatable and robust tuning (productivity)
for design re–use
for objective comparisons of circuit alternatives
to ensure manufacturability
Automatic circuit optimization is essential
Optimization Techniques for High–Performance Digital Circuits
Of 67:5
Introduction and Motivation
Update
design
manual tuning is a slow, iterative and error–prone procedure
it relies on human intuition and experience
Simulate
Good
enough?
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Design re–use is not easy!
Optimization Techniques for High–Performance Digital Circuits
Of 67:6
Some Definitions
Statement of the problem
given a logically correct schematic, optimally assign sizes
to transistors and/or wires
metrics: delay, slew, area, signal integrity, parametric yield
i
Optimization problem
min f (x)
Objective function
s.t. c e(x) 0
Equality constraint
Inequality constraints s.t. c l(x) 0
{
s.t. c g(x) 0
s.t. x l x x u
Simple bounds
Minimax optimization
min [ max f i(x) ], i 1, 2, , m
x
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Semi–infinite problem
min f (x) s.t. g(x, p) 0 for all p P
x
Optimization Techniques for High–Performance Digital Circuits
Of 67:7
Desired Features
ratio–ing of transistor sizes
grouping of similar structures
ease of use and back–annotation of results
accommodation of complex timing constraints
layout constraints
simultaneous transistor/wire sizing
Design requirements
Algorithmic requirements
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
convergence (local? global?)
minimax optimization
flexible constraints and objective functions
manufacturability (ability to tune at multiple process
corners)
failure recovery
Optimization Techniques for High–Performance Digital Circuits
Of 67:8
Taxonomy
based on time–domain simulation of the circuit
accurate but slow; limited to small circuits
requires input patterns; tunes only identified paths
best suited to small data–flow circuits or ‘‘cross sections”
Dynamic tuning
Static tuning
based on static timing analysis of the circuit
all paths considered at once
no input patterns required; works well on control circuits
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Monte Carlo analysis across process corners
extreme case analysis
yield prediction
design centering
Statistical tuning
Optimization Techniques for High–Performance Digital Circuits
Of 67:9
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:10
Variables &
simple bounds
Y
Dynamic Tuning
Objective fn.
& constraints
Nonlinear
optimizer
Done?
Of 67:11
Simulation info.
(netlist, patterns,
device models)
Tuner
Back annotate
results on
the schematic
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
x new N
Circuit
simulator
Optimization Techniques for High–Performance Digital Circuits
Function &
gradient values
Advantages
Dynamic Tuning
accurate and realistic
detects failing circuits
false paths are avoided
Disadvantages
inherits all the weaknesses of transient simulation
– slow, can handle only small circuits
– requires input patterns
paths to be tuned must be identified and sensitized
requires a working circuit to begin
vulnerable to omission of tacit requirements
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
– loading on previous stage
– rise/fall time
– shrinking of non–switching transistors
Optimization Techniques for High–Performance Digital Circuits
Of 67:12
Dynamic Tuning – Methodology Impact
dynamic tuning is best applied to pre–layout schematics
after extraction, correspondence between transistor sizes
and diffusion capacitances is lost => impossible to tune
(poses a new extraction challenge)!
Dynamic
tuner
Of 67:13
since specification of a dynamic tuning run is quite involved,
save tuning criteria as attributes of the schematic
encourages design re–use; changes the paradigm!
1. Determine paths
2. Prune circuit
3. Sensitize paths
can use a static timer automatically to generate dynamic
tuning problems
Static
timer
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Static and dynamic tuning complement each other!
Optimization Techniques for High–Performance Digital Circuits
Dynamic Tuners
DELIGHT.SPICE (Nye et al, 1988)
Of 67:14
SPICE–based
minimax PHASE I–II–III optimization by method of feasible
directions
handled semi–infinite constraints
tailored to small analog circuits
JiffyTune (Conn et al, 1996)
SPECS–based (device modeling simplifications, fast
event–driven simulation at reduced accuracy)
relies on fast adjoint gradient computation
uses the general purpose nonlinear optimization package
LANCELOT
– trust–region approach
– augmented Lagrangian merit function
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
tailored to large circuits (5K+ transistors)
Optimization Techniques for High–Performance Digital Circuits
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:15
Gradient Computation
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Circuit tuning is best approached by means
of gradient–based nonlinear optimization
Optimization Techniques for High–Performance Digital Circuits
Of 67:16
Gradient Computation
Never use Derivative Free (DF) if Derivative (D)
is possible
DF popular since easy to understand
DF popular since easy to implement
useful only if # variables < 20
not robust
not efficient (especially for circuit applications)
no convergence theory
Most implementations of DF:
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:17
That’s why gradient computation is so important!
Optimization Techniques for High–Performance Digital Circuits
Gradient Computation
Nonlinear
optimization
Administration
and overhead
Gradient computation is often the bottleneck
Gradient
evaluation
Function
evaluation
example: 1,000 transistor circuit, all transistors tunable
Of 67:18
50 measurements in the objective function and constraints
50 iterations to convergence
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
number of gradients required =
1,000 transistors x 6 parameters per transistor
x 50 measurements x 50 iterations = 15 million!
Optimization Techniques for High–Performance Digital Circuits
Direct and Adjoint Sensitivity Computation
Direct method
v
–
C v.
p
v^
^
C
v^
+
+– ^
R
i
i R
p
^
v^ i R Ri
p
+
^
i
^
–
–
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
i C dv C v
p
dt
.
directly differentiate electrical characteristics of devices:
+
i
R
v iR
R
i
v
i
R
p
p
p
+
–
i v
C
i C dv
dt
dv
dv C
i
C
(
)
p dt
p
dt p
Optimization Techniques for High–Performance Digital Circuits
Of 67:19
Procedure
Direct Method
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
add a zero–valued source at each parameter
run a nominal transient simulation
turn off the original sources; activate the zero–valued
sources with the results of the nominal simulation
solve the ‘‘sensitivity circuit”
re–use nominal LU factors at each time point
any number of measurements, one parameter
can run the nominal and sensitivity simulations
simultaneously
in our previous example:
number of gradient analyses = 1000 transistors x 6
parameters per transistor x 50 iterations = 300,000
Optimization Techniques for High–Performance Digital Circuits
Of 67:20
KVL
KCL
Adjoint Method via Tellegen’s Theorem
branches
current of branches incident at each node 0
1, 0
A topological incidence matrix Ai b 0
v bi v nifrom – v nito i
^
^
same
topology
^
^
^
i bi v bT i b [A Tv n] T i b v nT A i b 0
v^bi i bT v^b i bT A T v^n [A i b] T v^n 0
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:21
instantaneous power of branch 0 Conservation of energy
bi
Tellegen’s theorem
i
bi
v
i
i
Optimization Techniques for High–Performance Digital Circuits
v b Ti b 0
v b A Tv n
nodes
Adjoint Sensitivity via Tellegen’s Theorem, Contd.
i
^
i
v^b,i b
^
Tellegen’s theorem applied to a perturbed circuit
v b, i b
v b v b, i b i b
^
i
i
i
bi
i
^
^
i
i
i bi – i bi v^bi ) 0
i
i
^
Key relation!
i
vb ib
ib vb
0
i
i
(vb vb ) ib (ib ib ) vb 0
i
(v
i
Motivation
^
Vs
^
others
–vI iI iV vV typical term
I
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:22
k pk , we can pick up all the gradients fpk
if we can manipulate into
and then into f Optimization Techniques for High–Performance Digital Circuits
,
Adjoint Sensitivity Computation
Consider a circuit with only resistors and current sources:
^
for any current source:
i I;
i I I
I
^
Typical term (v I i I–I v^I)
v R i R R R i R R i R
for any resistor:
v R i R R;
^
^
^
Rs
^
–vI iI –I vI iR iR R
Is
By choosing v^R R i R, we get typical term i R i R R.
total circuit:
Is
given a performance v Ik, f v Ik.
^
choose i Ik –1
and
^
f
i Rj i Rj .
R j
pick off the required sensitivities:
Thus
f
–v^Ij
I j
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
any number of parameters can be accommodated at
once!
Optimization Techniques for High–Performance Digital Circuits
Of 67:23
Procedure
Adjoint Method
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
add a zero–valued source at each measurement point
run a nominal transient simulation
turn off the original sources; activate the zero–valued
sources appropriately
reverse time and control
solve the ‘‘adjoint circuit”
re–use the LU factors at each time point (modulo
time–point mismatch problems)
obtain gradients by convolving nominal and adjoint
waveforms
any number of parameters, one measurement
in our previous example:
number of gradient analyses = 50 measurements x 50
iterations = 2,500
Optimization Techniques for High–Performance Digital Circuits
Of 67:24
Adjoint Lagrangian Sensitivity Computation
Motivation
the gradient computation is the bottleneck; it is the stumbling
block in being able to tune an entire circuit macro
Of 67:25
if the problem has m measurements and p tunable parameters,
we need to compute m × p gradients at each iteration
if m >> p, the direct method is used, which involves solving
an associated sensitivity circuit p times
if p >> m the adjoint method is used, which involves solving
an associated adjoint circuit m times
Is there a way of solving an associated circuit just once to
obtain all the gradients of interest?
Adjoint Lagrangian formulation
for the purposes of optimization, yes, it is possible to get all
the required gradients at once!
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
key observation: optimizer minimizes scalar merit function
Optimization Techniques for High–Performance Digital Circuits
p 1, p 2, , p n
i
d1
.
slope v 1
An Example to Demonstrate the Theory
+
–
p 1, p 2, , p n
Nominal simulation
Sensitivity run #1
d 1
1
–
. conv(p i) i
p i
v1
p 1, p 2, , p n
min d 1
s.t. d 2 0
d 1 d 2 1 d 22
2
Sensitivity run #2
d
2
– 1. conv(p i) i
p i
v2
Combine the gradients
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
d 1 ( d 2) d 2
p i
p i
p i
Optimization Techniques for High–Performance Digital Circuits
.
slope v 2
d2
Of 67:26
+
–
h2
d1
h1
slope v 1
d
– 1. ( 2)
v2
p 1, p 2, , p n
min d 1
s.t. d 2 0
d 1 d 2 1 d 22
2
p 1, p 2, , p n
h2
d2
.
slope v 2
An Example to Demonstrate the Theory,. Contd.
Nominal simulation
h – 1.
1
v1
Single sensitivity run
Gradients
conv(p ) i
i
p i
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:27
any number of measurements or constraints handled at once!
any number of parameters handled at once (inherent in adjoint)!
speedup #measurements
Optimization Techniques for High–Performance Digital Circuits
Example
Run time measurements on an example
Direct method Adjoint
method
635.4
(10.6 minutes)
827.1
5.8%
(57 solutions)
0.001%
1.82 ms
Adjoint
Lagrangian
21.9
213.6
11.4%
(1 solution)
Of 67:28
0.00003%
0.063 ms (63 ms)
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
39,041
(10.8 hours)
39,233
3.3%
(6,114
solutions)
0.06%
112 ms
IOmux circuit from an actual LPM chip with 6,900 MOSFETs,
770 RC wire models, 57 functions, 6,114 parameters, 348,498
gradients
Gradient computation
CPU time (s)
Total run time (s)
Run time overhead
per sensitivity circuit
solution
Overhead per gradient
Run time per gradient
Optimization Techniques for High–Performance Digital Circuits
Run time of adjoint method
Complexity Analysis
Run time of direct method
Of 67:29
Adjoint Lagrangian run time
The 3–D plots on this page
have been deleted so that
the document can be
viewed with Ghostview
Run time with heuristic choice of method
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
The 3–D plots on this page
have been deleted so that
the document can be
viewed with Ghostview
Optimization Techniques for High–Performance Digital Circuits
Application to Piecewise Approximate Simulation
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
i–v characteristics are approximated by piecewise
approximate functions
when the i–v characteristics are differentiated, very simple
elements are obtained
event times of the nominal circuit correspond one–to–one
with the event times of the sensitivity or adjoint circuits =>
time–point mismatch problems are solved!
in SPECS, the sensitivity or adjoint circuit consists of
disconnected grounded capacitors with impulses of
charge being exchanged at event times
convolution of piecewise approximate waveforms is
efficient and can be computed in an event–driven fashion
particularly in this context, adjoint computation of
sensitivities is extremely attractive!
Optimization Techniques for High–Performance Digital Circuits
Of 67:30
Gradient Computation: Summary
i
t C W
t I W
eff
eff
i
1
ds
2
I ds W eff PW
C i W eff PW
i
t 2 C i W eff
C i W eff PW
method of choice for large circuits is the adjoint method
the adjoint method is even more attractive if the optimizer
builds a scalar merit function
adjoint Lagrangian formulation requires close cooperation
between the simulator and the nonlinear optimizer
sensitivity computation involves plenty of chain ruling and
combining of gradients (beware of errors!); diffusion
capacitances must be taken into account
delay t 1 t 2
t
t
delay
1
– 2
PW
PW PW
t I W
eff
ds
1
I ds W eff PW
t 1
t 1 C i area W eff
C i
perimeter W eff
W eff
PW
C i area W eff PW
C i perimeter
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:31
Never use gradient–free approaches if you have a choice!
Optimization Techniques for High–Performance Digital Circuits
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:32
Static Tuning
AT out max(AT i d ij)
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:33
all paths considered at once
arrival times (ATs) and required arrival times (RATs) computed
at each node of the graph; the difference is the slack
no input vectors required
analysis is conservative in both early and late modes
various timing checks can be performed as part of the
analysis
Optimization Techniques for High–Performance Digital Circuits
Convex Functions and Convex Sets
xb
f
xb
Non–convex
xa
f(x) is convex if the line joining any two points f(xa ) and
f(xb ) lies on or above the function
f
xa
Convex
formally, f(x) is convex if
Of 67:34
f (x a x b) f (x a) f (x b) 1, 0, 0
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
any local minimum of a convex function is a global
minimum
Optimization Techniques for High–Performance Digital Circuits
Convex Problems
min f (x)
s.t. c(x) 0
is a convex problem if f(x) is convex and the feasible
region is a convex set; i.e., f(x) and c(x) are both convex
functions
i
ij
j xi ,
j 0, ij any local solution of a convex problem is a global solution
Posynomials
g(x) j
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
a posynomial is like a polynomial but with positive
coefficients and real exponents
by substituting xi =e zi , we can map a posynomial into a
convex function
Optimization Techniques for High–Performance Digital Circuits
Of 67:35
Transistor Sizing
min Area
s.t. Delay Target
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
area S xi
is a posynomial
delay ∝ RC, R ∝ x –1, C ∝ x with positive coefficients
path delay = S gate delays is a posynomial
hence both the objective function and the constraint are
posynomials
they can be mapped into convex functions
any local solution is therefore a global solution!
note that Elmore delay is a posynomial function of
transistor and wire sizes
Optimization Techniques for High–Performance Digital Circuits
Of 67:36
Procedure
TILOS (Fishburn 1985)
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:37
set all transistors to the minimum size (minimum area
solution)
conduct a static timing analysis using Elmore (or Penfield–
Rubinstein) delay models
find the sensitivity of delay to transistor size for all
transistors on the critical path
bump us the size of the transistor with the largest negative
sensitivity by a fixed increment
repeat until delay cannot be improved further
entire delay–area trade–off curve can be generated
Optimization Techniques for High–Performance Digital Circuits
F
Inaccuracies in TILOS
A
E
D
C
B
Path at a time model is inadequate
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
may be better to size A than B, C and D
if F–E is near–critical and A–D is critical, size A, not D
CONTRAST (Vaidya 1989) solves the convex problem
exactly
there have been subsequent efforts to incorporate power
optimization and simultaneous gate/wire tuning in the
same general framework
Optimization Techniques for High–Performance Digital Circuits
Of 67:38
Pros and Cons of Static Tuning
Advantages
fast (linear time complexity); can handle large circuits
pattern–independent; all paths considered at once
interconnect easily accommodated into basic algorithm
Disadvantages
poor delay model; tuning gains are suspect
– Elmore and other RC–based delays inadequate
– lumping into ‘‘equivalent inverters” inadequate
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
improving the delay model often destroys the posynomial
form, sacrificing the guarantee of a global solution
conservative and prone to false–path problems
not applicable to full–custom circuitry or circuits without
delay rules
non–working circuits are not detectable
Very interesting and challenging area of research!
Optimization Techniques for High–Performance Digital Circuits
Of 67:39
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:40
Statistical Tuning
Yield loss on a fabrication line
catastrophic (e.g., dust particles)
parametric (or circuit–limited)
Impact of parametric yield loss
in non–sorted designs, bad parts have to be discarded
in sorted designs like microprocessors, poor yield
means fewer parts in the ‘‘high profit” bins
variations dominated by ACLV effects
general goal is to characterize the variability in parameter
space and then map the variability into performance
space
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Aggressive tuning can drive a design into
a precipitous corner of the design space!
Optimization Techniques for High–Performance Digital Circuits
Of 67:41
P1
2
4
Uncertainty
3
5
6
7
Variability
Variability and Uncertainty
1
10
9
8
P2
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:42
variability arises from known simulatable phenomena like
Vdd or processing variations
uncertainty arises in the face of unknown or unsimulatable
phenomena like substrate noise or modeling errors
uncertainty forces conservatism in the design in order to
guarantee sufficient yield
Optimization Techniques for High–Performance Digital Circuits
p1
99%
p1
2
3
4
Of 67:43
p2
Values of Performance contours
1
Parameter Space and Performance Histograms
90%
60%
p2
Distribution of parameters
#
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
12 3 4
Performance histogram
Optimization Techniques for High–Performance Digital Circuits
Given
Problem Statement
a design specification
a target performance
a model of the variability and uncertainty
Determine
Of 67:44
a minimal representative sampling of the parameter space
– Monte Carlo methods
the expected spread of the performance
– extreme case analysis
the distribution of the performance
– yield estimation
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
how to change the design to improve the performance
– design centering
– yield optimization
Optimization Techniques for High–Performance Digital Circuits
Monte Carlo Analysis
sample the parameter space
use correlation and principal component analysis to
reduce the dimensionality of the sampled space
the sampled points are like process corners
run a simulation or tuning at each sample point
given sufficient sample points, can predict worst– and
best–case behavior, distribution of performance, yield
Example
convert a nominal tuning problem to a problem across the
process corners j:
x
j
min[max f j(x)]
x
ij
min [ max f ij(x) ]
x
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
min f (x)
]
c j(x) 0
max f i(x)
i
c(x) 0
min [
x
Optimization Techniques for High–Performance Digital Circuits
Of 67:45
Extreme Case Analysis
attempts to predict the worst–case behavior of the circuit
also need to predict the best–case behavior for short
paths
statistical analysis too expensive for large circuits
Shortcut
perform analysis on typical circuit for typical performances
compute the sensitivity of performances to parameters to
determine the direction in which the parameters must be
moved to mimic worst–case behavior
identify extreme–case parameter sets
use these parameter sets on all other circuits
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Unfortunately, sensitivities can be different
for different performances and for different
circuits/circuit families.
Optimization Techniques for High–Performance Digital Circuits
Of 67:46
p1
Yield Estimation
4
Yield loss
3
Values of 2
Of 67:47
12 3 4
Performance histogram
#
yield is determined in parameter space
many academic approaches such as yield integrals and
geometric methods
1
Performance contours p 2
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
instead, can use intelligent sampling to build performance
model as a function of the important parameters
(response surface model RSM) and derive complete PDF
characterizing the performance
Optimization Techniques for High–Performance Digital Circuits
Yield Optimization and Design Centering
Yield optimization
very expensive, instead:
– optimize the performance of the worst circuit
– optimize the performance of the extreme circuits
– optimize yield as estimated by intelligent sampling
once a good RSM is built, yield optimization can be
RSM–based
Design centering
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:48
attempts to create robust designs by ‘‘centering” the circuit
in the feasible region
the technique used may be geometric or ‘‘method of
moments”
Optimization Techniques for High–Performance Digital Circuits
Statistical Tuning: The Bottom Line
statistical design and tuning is obviously crucial to
obtaining reasonable yields and hence profits
design for manufacturability (design for profit) must be an
integral part of the design methodology
obtaining meaningful statistical models is a challenge
manufacturability analysis is an art practised by a few
experts!
Monte Carlo and extreme–case analyses are the most
popular techniques since they are easy to understand and
the most accessible among available tools
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Much work needs to be done in bringing
theoretical approaches proposed in the
literature to mainstream practice.
Optimization Techniques for High–Performance Digital Circuits
Of 67:49
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:50
Nonlinear Optimization: Terminology
0
c i(x) s i 0
s.t. s i 0
ici(x)
constrained vs. unconstrained optimization
linear vs. nonlinear programming
feasibility and feasible region
constraints that are ‘‘tight” are called activities
c i(x)
inequalities typically converted to equalities using slack
variables:
Lagrangian: (x, ) f (x) i
optimality conditions: c i(x *) 0 (primal feasibility) and
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
x(x *, *) 0 (dual feasibility)
Optimization Techniques for High–Performance Digital Circuits
Of 67:51
Some Key Messages
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
circuit tuning is a nonlinear problem; no amount of
familiarity with linear programming will change this fact
we are interested in solving large problems
we are interested in general–purpose optimization with
general nonlinear constraints
the good news: there has been tremendous progress in
large–scale nonlinear optimization in the last 15 years
MINOS (active set and line–search methods): very
effective on constraints that are linear or near linear
LANCELOT (trust–region methods): very effective with
highly nonlinear constraints
CUTE testing environment allows the testing of several
optimization packages with relatively low investment
Optimization Techniques for High–Performance Digital Circuits
Of 67:52
MINOS (Stanford, ca. 1980)
min F(x) c Tx d Ty
nonlinear linear
s.t. f (x) A 1y b 1
nonlinear constraints
A 2x A 3y b 2
linear constraints
lx x ux
simple bounds
ly y uy
is remapped to
~
~
~
min F(x) c Tx d Ty kT[f (x) – f(x)] 1 [f (x) – f(x)] T[f (x) – f(x)]
2
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
augmentation
objective function Lagrangian of
nonlinear part
of constraints
~
s.t. f(x) A 1y b 1
linearized constraints
A 2x A 3y b 2
linear constraints
lx x ux
simple bounds
ly y uy
where
~
where f(x) f (x) J k(x–x k)
(linearization)
Optimization Techniques for High–Performance Digital Circuits
Of 67:53
MINOS, Cont’d.
large–scale, general–purpose, nonlinear optimization
package
intuition:
– separate linear and nonlinear parts
– use active set methods for the linear part (LP–like)
– use augmented Lagrangian for the nonlinear part
– minimize quadratic model of the merit function
subject to linear model of constraints (QP)
– line–search methods for minimizing the quadratic
model
in practice, very effective, especially if
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
– most constraints are linear or near linear
– real number of degrees of freedom at the solution is
not large
Optimization Techniques for High–Performance Digital Circuits
Of 67:54
LANCELOT
large–scale, general–purpose, nonlinear optimization package
efficient, especially for nonlinear constraints
exploits structure via group partial separability
Major Iteration
update l or m value(s)
formulate augmented Lagrangian and iteratively:
Minor Iteration
trust–region approach
simple bounds handled explicitly and well
minimize merit function
test for convergence
many options:
Of 67:55
– l l 2 trust regions (box and sphere)
– various preconditioners and initializations available
– approximate/accurate bounded quadratic problem (BQP) solver
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
has successfully handled 20,000 variable problems with
20,000 nonlinear constraints
Optimization Techniques for High–Performance Digital Circuits
c (x) 1
i
i
2
b
ci(x)2
i
penalty function
to weight feasibility
x2
Of 67:56
Quadratic model
min f (x 1, x 2)
0 x1 a
0 x2 b
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Lagrangian
i
How LANCELOT Works
(x, , ) f (x) v
w
composite function objective
for unconstrained function
–f
minimization
x1
u
Trust region
Simple bounds
a
xk
Optimization Techniques for High–Performance Digital Circuits
Minimax Optimization
x
i
the original problem is min [ max { f i(x) } ], i 1, 2, , m
f 2(x)
x*
x, z
x
z
not directly differentiable, so remap to min (z) s.t. z f i(x) i
f 1(x)
xk
Special considerations
i* 1
Of 67:57
, so initialize i 1m i
initialize z to max fi (x) after the first function evaluation
optimality condition:
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
can handle minimax problems in other ways
Optimization Techniques for High–Performance Digital Circuits
Andy and Chandu’s Top–10 Key Messages!
10.Circuit tuning is a nonlinear problem. The only guarantee
is convergence to a local stationary point (hopefully the
problem is convex and hopefully we get improvement).
9. We need general–purpose, large–scale, robust optimizers;
we require flexible objective functions and constraints
(exchanging objective function and constraints does not a
new ICCAD paper make).
Of 67:58
8. Tremendous progress has been made in nonlinear optimization in the last 15 years; do not use outdated methods!
7. Never use derivative–free methods unless you have no
choice! Efficient sensitivity computation is essential in any
high–performance circuit tuner.
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
6. In circuit tuning, CPU time is dominated by simulation and
gradient computation; must leverage optimizer to reduce
number of iterations!
Optimization Techniques for High–Performance Digital Circuits
Andy and Chandu’s Top–10 Key Messages!
Of 67:59
5. ‘‘Black box” optimization under–utilizes capabilities of a
package (and often leads to greatly sub–optimal solutions)!
Enhancements like the ‘‘adjoint Lagrangian” formulation
require tinkering with the optimizer source.
4. Follow Seidel’s advice, ‘‘If you know it, use it!”
3. Carefully specify all aspects of the problem you really want
to solve (forces careful understanding of the circuit and
problem being posed)!
2. Any simulation–based optimization is inherently noisy!
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
1. Replace good solutions to bad models by potentially less
good solutions to better models!
Optimization Techniques for High–Performance Digital Circuits
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:60
Observations and Guidelines
N
Y
If you are evaluating a proposal, research paper or
vendor software:
1. Does the tool have a proof of convergence?
2. Does it use a nonlinear optimizer that the nonlinear
optimization community laughed off the face of the
earth several years ago?
3. Is it gradient–based?
Does it stab blindly at n–dimensional space?
Are the gradients computed by finite–differences?
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:61
4. Is the simulation/delay model sufficiently accurate? Are the tuner gains credible?
Optimization Techniques for High–Performance Digital Circuits
Handy–Dandy Questionnaire, Cont’d.
5. Are there heuristics involved?
Y
N
6. Will the tool have manageable methodology impact? 7. Is the tool flexible for application to different
types of circuits?
8. Is it easy to use?
Does it allow ratio–ing? Grouping?
Do the resulting circuits lend themselves to your
layout style and tools?
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:62
If you checked even a single red box, render swift judgment!
Optimization Techniques for High–Performance Digital Circuits
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:63
Special Topics
Interconnect tuning
topic of a separate embedded tutorial by J. Cong,
session 8B on Tuesday afternoon at 4:00 p.m.
simultaneous tuning of gates and wires has been shown
to yield superior results
reduced–order modeling has allowed analysis of large
interconnect networks in an efficient manner; these
models are amenable to sensitivity computations
interconnect tuners work best when tightly coupled to
the floorplanner and layout data
several approaches:
– wire sizing
– buffer placement
– topology changes
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
need a holistic solution
special considerations for clock and power distribution
Optimization Techniques for High–Performance Digital Circuits
Of 67:64
Special Topics
Noise–aware tuning
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Of 67:65
noise checking usually carried out on a static timing basis
noise considerations pose additional constraints to the tuner
these constraints may take the form of semi–infinite
constraints; or they may be derived from rules of thumb
stability criteria can be additional metrics in the tuning problem
often a direct trade–off between noise immunity and
performance
channel lengths may be tuned for additional noise immunity
Optimization Techniques for High–Performance Digital Circuits
Outline
1. Introduction and motivation
2. Dynamic tuning
3. Gradient computation
direct and adjoint methods
4. Static tuning
5. Statistical tuning
6. Nonlinear optimization as applied to circuit tuning
7. Observations and guidelines
8. Special topics
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
interconnect tuning
power, clock and ground networks
signal integrity
9. Research opportunities
Optimization Techniques for High–Performance Digital Circuits
Of 67:66
Research Opportunities
tuning based on static timing analysis (while maintaining
accuracy, reasonable speed, convergence and the ability
to accommodate custom circuitry)
tremendous opportunity in applying statistical design
techniques to improve yields; at 1 GHz and above,
manufacturability is key!
noise–aware tuning (on the heels of formal noise analysis
tools)
incremental circuit Hessian computation
mixed integer/continuous problems
semi–infinite problems
multi–criteria optimization
C. Visweswariah, ICCAD ’97 Embedded Tutorial, Session 4B
Thank you for your attention!
Optimization Techniques for High–Performance Digital Circuits
Of 67:67