[pdf]

Transformation-Based Verification Using Generalized
Retiming
Andreas Kuehlmann1 and Jason Baumgartner2
1
2
Cadence Berkeley Labs, Berkeley, CA 94704
IBM Enterprise Systems Group, Austin, TX 78758
Abstract. In this paper we present the application of generalized retiming for
temporal property checking. Retiming is a structural transformation that relocates registers in a circuit-based design representation without changing its actual input-output behavior. We discuss the application of retiming to minimize
the number of registers with the goal of increasing the capacity of symbolic state
traversal. In particular, we demonstrate that the classical definition of retiming
can be generalized for verification by relaxing the notion of design equivalence
and physical implementability. This includes (1) omitting the need for equivalent
reset states by using an initialization stump, (2) supporting negative registers,
handled by a general functional relation to future time frames, and (3) eliminating peripheral registers by converting them into simple temporal offsets. The
presented results demonstrate that the application of retiming in verification can
significantly increase the capacity of symbolic state traversal. Our experiments
also demonstrate that the repeated use of retiming interleaved with other structural simplifications can yield reductions beyond those possible with single applications of the individual approaches. This result suggests that a tool architecture
based on re-entrant transformation engines can potentially decompose and solve
verification problems that otherwise would be infeasible.
1 Introduction
The main bottleneck of temporal property checking is the potentially exorbitant computational resources necessary for state traversal. In general, there is no clear dependency
between the structure or size of the analyzed circuit and the resource requirements to
perform reachability analysis. However, a smaller number of state bits, i.e., registers,
generally correlates with a lower memory and runtime consumption for performing
state traversal. In particular, for BDD-based techniques [1, 2] fewer registers result in
fewer BDD variables which typically decreases the size of the BDDs representing the
set of states and transitions among them. Similarly, in SAT-based state enumeration [3],
the complexity of the state recording device directly depends on the number of registers.
A second motivation for our work comes from the observation that a reduced number
of registers often decreases the functional correlation between them. Intuitively, this
produces a less scattered state encoding which results in a more compact BDD or cube
structure for BDD or SAT-based reachability analysis, respectively.
In this paper we discuss the application of retiming to reduce the number of registers with the goal of improving symbolic reachability analysis. Retiming is commonly referred to as a structural transformation of a circuit-based design description
that changes the positions of the state holding elements without modifying the inputoutput behavior [4]. Traditionally, the use of retiming is focused on design synthesis
with two constraints that fundamentally limit the solution space: the circuit must be
physically implementable and it must preserve its original input-output behavior. In
property verification these restrictions can be lifted, which results in significantly more
freedom for register minimization. There are three extensions of classical retiming for
a generalized application in verification. First, a temporally partitioned state traversal
eliminates the restriction on the retimed circuit of having an equivalent reset state. Second, a generalized symbolic state traversal algorithm can handle “negative registers.”
This significantly increases the solution space for legal retimings by removing the nonnegative register count constraints from the problem formulation. Third, state bits which
are exclusively driven by primary inputs or drive only primary outputs represent a mere
temporal shift of peripheral values, and can be suppressed for state space traversal.
In this paper we describe the application of retiming for verification using these
three generalizations. This work provides a specific approach in a more general scheme
for property checking which uses a set of targeted circuit transformations. In an enginebased architecture, a retiming engine is applied as one step in a series of transformations
which gradually simplify the verification problem until it can be solved by a terminal
engine (e.g., BDD- or SAT-based). Note that such a modular, transformation-based approach was key in making automatic logic synthesis practical [5].
2 Illustrating Example
Figure 1a shows a circuit example with six registers R1 ; : : : ; R6 , two inputs a and b,
and one output p. Using a notation introduced in Section 4, the initial states of the six
1
1
2
1
1
1
registers are assumed to be I = (I21
; I24 ; I24 ; I36 ; I54 ; I6p ) = (1; 0; 0; 1; 0; 1). The
subscript and superscript denote the circuit arc and the register position along this arc,
respectively. Further, let p 1 be a predicate to be checked for all reachable states.
Retiming moves registers forward and backward across gates with the goal of minimizing their count. The corresponding optimization problem can be formulated as an
Integer Linear Program (ILP) using a directed graph model of the circuit [4]. The graph
vertices and arcs represent gates and interconnection (i.e., wires), respectively. A special host vertex is introduced which is connected to all inputs and outputs. Figure 1b
shows the retiming graph for the given example. The arc labels denote the number of
registers at the corresponding nets. The ILP determines a lag for each vertex which
represents the number of registers moved backward through it [4].
The original definition of retiming for synthesis requires preserving input-output
behavior. With this restriction, the circuit of Figure 1a cannot be retimed since registers
R1 and R2 have incompatible initial states and cannot be merged by a backward move.
To show this, if both registers were shared with a joint initial state of 1, the sequence
(a; b) = ((0; 0); (1; 0); (0; 0)) would produce p = (1; 1; 1) and p = (1; 1; 0) in the original and retimed circuit, respectively. Similarly, for a joint initial state of 0 the sequence
(a; b) = ((1; 0); (0; 0); (1; 0); (0; 0)) would distinguish the behavior of the circuits.
In verification, we need not to preserve input-output equivalence of the retimed circuit as long as we can preserve the truth of the given properties. The requirement for
I=1
I=0
R1
R5
1
G2
G
G5
G4
R2 R3
I=1
I=1
a
b
0
0
I=0 I=0
G1
G3
1
G6
0
G3
G6
1
b
host
(b)
(a)
G5
1
0
a
R6
0
G4
0
p
R4
0
2
G2
1
p
0
0
0
Fig. 1: Retiming example: (a) original circuit, (b) retiming graph.
equivalent reset states can be relaxed by unrolling the circuit for multiple cycles until
the set of retimed initial states is uniquely determined. This corresponds to a temporal
decomposition of the verification task into two parts: (1) checking a bounded acyclic
initialization structure, further referred to as the retiming stump, and (2) checking the
retimed circuit, further referred to as the retimed recurrence structure. The first part involves a SAT check to prove the correctness of the properties for the time frames that are
included in the retiming stump. The second part involves model checking the retimed
circuit, which effectively provides an inductive correctness proof for all remaining time
frames. The initialization state of the retimed circuit can be computed by symbolically
simulating the retiming stump up to the retimed recurrence structure.
Registers at the inputs and outputs are mere temporal signal offsets and do not
impact the state reachability of the circuit core [6]. Thus, they can be ignored during
reachability analysis. For failing properties, the offsets are restored by temporal shifts
in the counter-example trace. Adopting the terminology from Malik et al. [7] we will
refer to this method as peripheral retiming. For peripheral retiming the host vertex is
removed from the retiming graph, causing the ILP to pull as many registers as possible
out of the circuit. Figure 2a shows the graph for a maximal peripheral retiming of the
example ignoring initial state equivalence. The arc labels represent the register counts
of the original and retimed circuit. The vertex labels denote their lag, i.e., the number
of registers that have been pushed backward through them. As shown, by merging R1
and R2 and removing R6 , the register count could be reduced from six to four.
1/0
−1
G1
0/1
1/1
0
G
2/1
−1
1/0
−1
G4
2
−1
G1
G5
0/1
1/1
0
G2
2/0
−2
−2
G4
G5
0/−1
−1
a
(a)
1/1
−1
G3
b −1
−1
−1
G6
G3
a
1/0
p −2
(b)
1/0
−1
b −1
−2
1/0
G6
p −3
Fig. 2: Graphs for relaxed retimings for the example of Fig. 1 : (a) peripheral retiming ignoring
reset state equivalence, (b) retiming with negative registers permitted.
G2
G1
~
R2
a
b
...
~
R3
~
R1
G4
G5
...
=
ASSERT(1)
p
G6
G3
...
...
non−det
(b)
(a)
0 0
1 1
b a
2 2
b a
−1
1
I 21
1
I 24
~
R1
b a
0
R’1
...
...
−1
2
...
...
I 24
1
I 36
~
R2
−2
1
I 54
−2
1
p0
−2
−2
~
R3
...
~
R’3
−2
−2
I 6p
~
R’2
1
p
p2
p
3
(c)
Fig. 3: Retiming result of Fig. 2b: (a) retimed circuit, (b) intuitive interpretation of negative registers, (c) interpretation of the unrolled circuit structure (dark: retiming stump, medium shaded:
retiming recurrence structure, lightly shaded: retiming top).
A third relaxation of retiming is achieved by enabling negative register counts at
the arcs. This approach is motivated by the fact that registers merely denote functional
relations between different time frames. In logic synthesis, clocked or unclocked delay
elements are used to physically implement these relations. Such delays can only realize
backward constraints, each consisting of a combinational expression in the present and a
variable in a future time frame. In symbolic verification, this limitation can be lifted and
arbitrary relations can be handled. This includes forward constraints between variables
in the current time frame and expressions in future time frames, represented by negative
registers. In contrast to the common case of symbolic forward traversal, constraints
imposed by negative registers delay the decision about the actual reachability of a state
until all referred future time frames are processed. This results in a third component for
the above described temporal verification decomposition, reflected by the retiming top.
To enable negative registers, the non-negativity constraints on the arc labels are removed from the ILP. Figure 2b shows the resulting retiming graph for the example. By
using one negative register, the total register count is reduced to three. Figure 3a shows
the resulting circuit. Note that these three registers reflect the actual temporal relations
present in the loops and reconverging paths of the original circuit. Figure 3b gives an
intuitive interpretation of negative registers in a circuit context. In symbolic reachability
analysis, negative registers can simply be handled by exchanging the current and next
state variables in the transition relation. Figure 3c illustrates the retiming process using
the unrolled circuit structure. The medium shaded area reflects the retimed recurrence
structure which is passed to symbolic model checking. The dark area denotes the retiming stump which is used to compute the initial state for the retimed circuit and to verify
p for the first two time frames. The lightly shaded area represents the retiming top.
The actual verification process consists of several steps. First, we need to prove
that the property holds for the retiming stump using a SAT check. In the given example, it easy to show that pi 1 for i = 0; 1; 2. Further, the set of initial states I~ for
the retimed recurrence structure is computed by symbolically executing the stump, re1 ~1
~1 ) j 9a0 :9b0 :9v:(I~1 a0 ^ I~1 v ^ I~1 1)g =
; I34 ; I
sulting in I~ = f(I~12
54
12
34
54
f(0; 0; 1); (0; 1; 1); (1; 0; 1); (1; 1; 1)g. Next, starting from these initial states, symbolic
traversal is performed on the retimed structure. This leads to a counter example for the
1 ~1 ~1
1
1
initial state (I~12
; I34 ; I54 ) = (0; 1; 1) with the inputs a = 0 and b = 0. Further, the
1
retiming top imposes a constraint on the negative register I~34 a2 _ b2 which can be
satisfied for the given failing state. A complete counter-example trace is composed by
a satisfying assignment of the retiming stump for generating the required reset state of
the retimed structure, a counter-example trace generated by the retimed structure, and a
satisfying assignment for the constraint imposed by the negative registers. For the given
example, this results in (a; b) = ((0; 0); (0; 0); (0; 1)).
3 Previous Work
The application of structural circuit transformations in sequential verification is a relatively new research area. Hasteer et al. [8] proposed the concepts of retiming and state
space folding for sequential equivalence checking. Their state-folding technique works
for circuits in which the number of latches contained in loops and reconverging paths
is constant modulo n. In this case n succeeding state transitions can be concatenated
for symbolic state traversal. Baumgartner et al. [9] extend the state-folding concept to
handle arbitrary registers and general CTL property checking. The idea of state space
folding is orthogonal to the retiming approach described in this paper, and the combination of both techniques is a promising subject of our future research.
For logic optimization, Leiserson and Saxe [10] describe the application of structural retiming and propose an ILP [4] formulation using a graph model. Malik et al. [7]
were the first to introduce peripheral retiming with the objective of moving a maximum
number of registers to the circuit boundaries. This makes the combinational circuit core
as large as possible for providing maximum freedom for conventional combinational
optimizations. They also introduced the concept of negative registers as a method of
temporarily “borrowing” registers from inputs and outputs. After finishing the combinational optimization, these registers are “legalized” by retiming them back to positive
registers. In contrast, our paper describes the direct application of negative registers for
verification and gives formal algorithms to fully handle them.
The problem of generating valid initial states for the retimed circuit has been addressed in several publications. Touati and Brayton [11] proposed a method for adding
reset circuitry which forces an equivalent initial state. Even et al. [12] described a mod-
ified retiming algorithm that favors forward retiming, allowing a simple computation
of the initial states. All previous work on reset state computation assumes input-output
equivalence. In this paper we propose a method of eliminating that limitation for verification and describe how a more generalized reset state can be obtained.
Gupta et al. [6] were first to propose the application of maximal peripheral retiming
in the context of simulation-based verification. They showed that peripheral registers
can be omitted during test generation without compromising the coverage of the resulting transition tour. Still, their approach is focused on test generation and does not
consider full reachability. Further, the paper does not address the initialization problem and does not use the concept of negative registers. The work of Cabodi et al. [13],
which uses retiming to enhance symbolic reachability analysis, is the closest to ours.
However, they use an original synthesis retiming algorithm with the above-mentioned
limitations regarding enforced reset state equivalence and non-negative registers. Further, the applied retiming grid is based on next-state functions which significantly reduces the optimization freedom. Consequently, the reported results show mostly modest
improvements over existing techniques.
4 Generalized Retiming for Verification
Let C = (G; E ) denote a circuit where G represents a set of combinational gates, primary inputs, and primary outputs, and E G G is a set of arcs connecting the gates.
Each arc (u; v ) 2 E is associated with a non-negative weight w(u; v ) representing the
number of registers at this arc. Clearly, for all hardware designs we can assume that the
initial register count of all arcs is non-negative: i.e., w(u; v ) 0. Further, without loss
of generality, we assume that the circuit does not contain combinational loops.
i
; 1 i w (u; v ) denote the initial value of register i along arc (u; v )
Let Iuv
and gu (fju ; : : : ; fku ) be the function of gate u using the functions fju ; : : : ; fku of arcs
(j; u); : : : ; (k; u) at its inputs. If u represents a primary input, gu denotes the sampled
input value at a given time. The state of C at time t 0 is computed recursively as:
(
t
fuv
t
gu
=
=
w (u;v )
Iuv
t
w (u;v )
t
gu
if
t < w (u; v );
otherwise;
t
t
gu (fju ; : : : ; fku ):
(1)
This definition of f can be used to express the function of any internal net of the design
modeled by C . For example, the value at time t of the net connecting the output of
t+w (u;v ) i
.
register i with the input of register i + 1 of arc (u; v ) is fuv
A retiming of C is defined as a gate labeling r : G ! Z, where r(u) is the lag of
gate u denoting the number of registers that are moved backward through it. The new
~ are computed as follows:
~ of the retimed circuit C
arc weights w
w
~ (u; v )
=
w (u; v )
+ r (v )
r (u):
(2)
In this context we are interested in minimizing the total number of registers of C~ :
X
8(
2
u;v )
E
j ~(
j ! min
w u; v )
:
(3)
Note that due to the missing host vertex, the formulation aims at maximal peripheral
retiming which removes registers from the primary inputs and outputs. The given modeling does not take into account that the registers of the outgoing arcs from a gate can
be shared and must be counted only once in the objective function. A correct ILP modeling of “register sharing” can be achieved by a slightly modified problem formulation
for which the details are presented in [4]. In contrast to retiming for synthesis, we do not
~ . Therefore, the new circuit may have negative
impose a non-negative constraint on w
arc weights, representing negative registers.
Equation (2) imposes an equivalence relation on the set of retimings. Two retimings
r1 and r2 result in identical circuits and are said to be equivalent if and only if r1 =
0
r2 + , where denotes an integer constant. We define a normalized retiming r as:
r
0 =r
max
8
u
r (u):
(4)
In the following we will use the term retiming to denote normalized retimings. Similar
to formula (1), for a given retiming r the state of C~ at time t can be computed as:
(~
w
~ (u;v )
~t
f
Iuv
=
uv
t
g
~u
=
t
if
w
~ (u;v )
t
g
~u
t < w
~ (u; v );
otherwise;
~ ; : : : ; f~ );
g u (f
ju
ku
t
t
(5)
i
represent the initial states of C~ . In contrast to formula (1), it is not obvious
where the I~uv
~ (u; v ) can assume negative values.
that this formula is well formed, because the w
Theorem 1. Let C be a circuit containing a finite number of gates, arcs, and nonnegative registers without combinational loops, and r be a retiming resulting in circuit
~ . The evaluation of formula (5) for computing the state of C
~ at time t will terminate
C
for any finite t 0.
Proof. First, it is obvious that t remains non-negative during the evaluation of (5). Second, since C and therefore C~ contain a finite number of gates, any non-terminating evaluation of formula (5) must involve an infinite recursion on at least one gate. Let u be one
of those gates and p = u ! u1 ! : : : ! u be the circular path in C~ corresponding to the recursion. The difference between t and t0 of two suceeding recursions
is then t t0 = w
~ (u; u1 ) + w
~ (u1 ; u2 ) + : : : + w
~ (un ; u). A substitution using (2) leads to
0
t
t = w (u; u1 ) + w (u1 ; u2 ) + : : : + w (un ; u). All registers are positive (w (ui ; uj ) 0), and there are no combinational loops (9(ui ; uj ) 2 p with w (ui ; uj ) > 0). Therefore t strictly decreases after each recursion which causes the evaluation to terminate
~ (ui ; vj ) for some arc (ui ; uj ) 2 p.
ut
once t < w
(u;u1 )
(u1 ;u2 )
(un ;u)
The retiming stump of a retiming r is a partial unrolling of C and is defined as:
S
=
f
t
suv
j
t
suv
=
t
fuv
^
(u; v )
2 ^ 0
E
t < w
~ (u; v )
g
r (v ) :
(6)
The new verification structure is composed of S and C~ , where S provides the arc functions for the first cycles and the initial states for the positive registers of C~ as follows:
~
I
uv
i
=
w
~ (u;v )
suv
i
r (v )
;
0
< i
~(
w u; v ):
Note that this formula is well formed for normalized retimings because r(v )
(7)
0.
Theorem 2. Let C be a circuit containing a finite number of gates, arcs, and nonnegative registers without combinational loops and r be a retiming resulting in circuit C~
and the retiming stump S . The following relations provide a bijective mapping between
each arc function of fC~ ; S g to the corresponding arc function of C and vise versa:
t
if t < w
~ (u; v )
r (v ) ;
suv
t
fuv =
(8)
t+r (v )
~
otherwise;
f
uv
(
t
suv
t
f~
uv
t
=
fuv
=
t r (v )
fuv
:
if
t < w
~ (u; v )
r (v );
(9)
~ (u; v )
Proof. First we show that function (8) correcly maps fC~ ; S g to C : For t < w
r (v ), (8) reflects the definition of s given in (6). For t w
~uv
r (v ), after substitution
t+r (v ) w
~ (u;v )
t+r (v ) w
~ (u;v )
t
using (5), we must show that fuv
) which is
; : : : ; f~
= gu (f~iu
ju
t+r (v )
w
~ (u;v )
w (u;v )
= fiu
.
done by inductively proving for the arguments of gu that f~iu
t+r (v ) w
~ (u;v )
~
~ (u; v ) < w
~ (i; u)): Using (5) and (7) we get fiu
Base case (t + r(v ) w
=
t+r (v ) w
~ (u;v ) r (u)
~ (i;u) t r (v )+w
~ (u;v )
~w
which,
after
applying
(2),
shows
the
re=
f
I
iu
iu
quired equality. Inductive step (t + r(v ) w
~ (u; v ) w
~ (i; u)): A substitution using (5)
t+r (v ) w
~ (u;v ) w
~ (i;u)
~ (u;v ) w
~ (i;u)
t+r (v ) w
~ (u;v )
~
~t+r (v ) w
~
= gi (fhi
;:::;f
). If
results in fiu
ki
w
~ (i; u) > 0 we can immediately reduce the arguments of gi by induction which results
t w (u;v ) w (i;u)
~t+w (u;v ) w (i;u) ) = f t w (u;v ) and show equivalence. If
in gi (fhi
;:::; f
iu
ki
w
~ (i; u) 0, then the right hand side needs to be further expanded until an inductive reduction can be performed. A termination analysis similar to the proof of theorem 1 can
be applied showing the superscript value of f will eventually decrease and therefore the
expansion will terminate after a finite number of steps. Next, showing that (9) correctly
maps fC~ ; S g to C is straight forward by using the definition for s for the first part and
an inductive proof identical to the one used in the first theorem for the second part. u
t
t
Corollary 1. Let C~ be derived from C by retiming and be a Boolean constant, then
8
t
t:(fuv
),8
t:[(0
t < w
~ (u; v )
r (v ))
)(
t
suv
)℄
^8 0 (~ 0
t
t : fuv
):
(10)
In other words, generalized retiming provides a circuit transformation that is sound
and complete for verifying properties of the form AG(p), where the primary circuit
inputs are non-deterministic and p is a predicate on any net of the circuit. Its application
for more complex safety properties requires that the property formula be expressed as
a circuit which is composed with the actual design before retiming can be applied.
Similarly, in order to handle constrained circuit inputs, the verification environment
must be composed with the circuit before retiming can be applied.
Corollary 2. Let C~ be a circuit derived from C by retiming and S be the corresponding
retiming stump. Further, let AG(p) be a property that fails for C~ for an initial state I~
resulting in a counter-example trace T~. The counter example T for the original circuit
~ and S .
C can be obtained by applying formula (8) on T
In essence, formula (8) provides the mechanism for trace lifting that back-translates any
counter example from the retimed circuit to the original circuit.
5 Transformation-based Verification
We implemented the retiming transformation as a re-entrant reduction engine with a
“push” interface similar to a BDD package. The engine consumes a circuit from a
higher-level engine, performs retiming, and then passes the resulting circuit down to a
lower-level engine. For debugging of failing properties, the engine implements a backtranslation mechanism that passes counter-example traces from the lower-level engine
back to the higher-level. This setting allows an iterative usage of retiming and other
reduction algorithms until the circuit can be passed to a “terminal” decision engine.
As an internal data structure we use a two-input A ND /I NVERTER graph similar
to the one presented in [14] except that registers are modeled as edge attributes. This
representation allows the application of several on-the-fly reduction algorithms, including inverter “dragging” and forward retiming of latches, both enabling a generalized
identification of functionally identical structures by hashing. As an ILP solver we utilized the primal network simplex algorithm from IBM’s Optimization Solutions Library
(OSL) [15] to solve the register minimization problem.
As a second simplification engine, we implemented an algorithm for combinational
redundancy removal which was adopted from an equivalence checking application [14].
This engine uses BDD sweeping and a SAT procedure to identify and eliminate functionally equivalent circuit structures, including the removal of redundant registers. As
a terminal reachability engine we adapted VIS [16] version 1.4 (beta) for our experiments. In addition to the partitioned transition relation algorithm, VIS 1.4 incorporates
a robust hybrid image computation approach.
6 Experimental Results
We performed a number of experiments to evaluate the impact of retiming on symbolic
reachability analysis, using 31 sequential circuits from the ISCAS89 benchmarks and
27 circuits randomly selected from IBM’s Gigahertz Processor (GP) design. All experiments were done on an IBM RS/6000 Model 260, with a 256 MBytes memory limit.
In the first set of experiments we assessed the potential of generalized retiming
for reducing register count. In particular, we evaluated an iterative scheme where the
retiming engine (RET) and the combinational reduction engine (COM) are called in an
interleaved manner. The results for the ISCAS and GP circuits are given in Table 1.
For the ISCAS benchmarks, we list only the circuits with more than 16 registers since
smaller designs are of less interest for these experiments. Columns 2, 3, and 4 report
the number of registers of the original circuit, after applying COM only, and RET only,
respectively. The following columns give the register counts after performing various
numbers of iterations of COM followed by RET. The number of negative registers, if
non-zero, is given in parentheses. For brevity, we report only up to three iterations;
more iterations provided only marginal improvements. The reported maximum lag in
column 9 gives an indication of the size of the retiming stump.
Overall, the results indicate that generalized retiming has a significant potential for
reducing the number of registers for verification. For the ISCAS benchmarks we obtained a maximum register reduction of 79% with an average of 27%. For the processor
circuits we achieved an average reduction of 62%.
Design
PROLOG
S1196
S1238
S1269
S13207 1
S1423
S1512
S15850 1
S3271
S3330
S3384
S35932
S382
S38584 1
S400
S444
S4863
S499
S526N
S5378
S635
S641
S6669
S713
S838 1
S9234 1
S938
S953
S967
S991
C RAS
D DASA
D DCLA
D DUDD
I IBBC
I IFAR
I IFEC
I IFPF
L EMQ
L EXEC
L FLUSH
L LMQ
L LRU
L PNTR
L TBWK
M CIU
S SCU1
S SCU2
V CACH
V DIR
V L2FB
V SCR1
V SCR2
V SNPC
V SNPM
W GAR
W SFA
Number of Registers (negative)
Relative
Original COM RET COM-RET COM-RET COM-RET Reduction
Only Only 1 Iteration 2 Iterations 3 Iterations (Best)
136
81 45 (1)
45 (1)
45 (3)
44 (2)
67.6%
18
16
16
14
14
14
22.2%
18
17
16
15
14
14
22.2%
37
37
36
36
36
36
2.7%
638
513 390
343
292 (1)
289
54.7%
74
74
72
72
72
72
2.7%
57
57
57
57
57
57
0.0%
534
518 498
488
485
485
9.2%
116
116 110
110
110
110
5.2%
132
81 44 (2)
44 (3)
44 (2)
44 (2)
66.7%
183
183
72
72
72
72
60.7%
1728 1728 1728
1728
1728
1728
0.0%
21
21
15
15
15
15
28.6%
1426 1415 1375
1375
1374
1374
3.6%
21
21
15
15
15
15
28.6%
21
21
15
15
15
15
28.6%
104
88
37
37
37
37
64.4%
22
22
22
22
20
20
9.1%
21
21
21
21
21
21
0.0%
179
164 112 (6) 112 (6)
111 (6)
111 (6)
38.0%
32
32
32
32
32
32
0.0%
19
17
15
15
15
15
21.1%
239
231
92
75
75
75
68.6%
19
17
15
15
15
15
21.1%
32
32
32
32
32
32
0.0%
211
193 172
172
165
131
37.9%
32
32
32
32
32
32
0.0%
29
29
6
6
6
6
79.3%
29
29
6
6
6
6
79.3%
19
19
19
19
19
19
0.0%
431
431 378
370
348
348
19.3%
115
115 100
100
100
100
13.0%
1137 1137 771
750
750
750
34.0%
129
129 100
100
100
100
22.5%
195
195
40
40
38
36
81.5%
413
413 142
139
136
136
67.1%
182
182
45
45
45
45
75.3%
1546 1356 673 (4) 661 (4)
449 (2)
442 (2)
71.4%
220
220
87
88
74
74
66.4%
535
535 163
137
135
134
75.0%
159
159
1
1
1
1
99.4%
1876 1831 1190
1185
433 (3)
425 (3)
77.3%
237
237
94
94
94
94
60.3%
541
541 245
245
245
245
54.7%
307
307 124
124
40
40
87.0%
777
686 415
415
411
387 (1)
50.2%
373
373 204
200
192
192
48.5%
1368 1368 566
565
426
423
69.1%
173
155 104 (2) 96 (3)
96 (2)
95 (1)
45.1%
178
151
87
83
43
42 (1)
76.4%
75
75
26
26
26
26
65.3%
150
128
52
48 (1)
48 (1)
48
68.0%
551
551
86
82
82
82
85.1%
93
93
21
21
21
21
77.4%
1421 1216 233 (7) 233 (7)
231 (11)
227 (8)
84.0%
242
232 91 (1)
90
90
79 (1)
67.4%
64
64
42
42
41
41
35.9%
Max. Time (s) /
Results of
Lag Memory (MB) [6]/ [13]
(Best)
(Registers)
2
1.4 / 22.4
-/1
0.6 / 10.7
16 / 1
0.9 / 21.1
17 / 1
0.4 / 6.2
-/11
3.8 / 34.7
-/1
0.5 / 6.2
72 / 74
1
0.5 / 6.2
- / 57
6
5.3 / 31.8
-/5
0.7 / 7.0
- / 116
3
0.7 / 7.0
-/6
0.7 / 7.1
- / 147
1
7.2 / 38.0
-/1
0.3 / 5.9
15 / 5
29.4 / 127.4
-/0
0.3 / 5.9
15 / 1
0.3 / 5.9
15 / 4
0.9 / 7.3
- / 96
1
0.6 / 15.1
-/2
0.4 / 5.9
-/5
1.6 / 18.4
- / 144
1
0.4 / 5.9
-/2
0.4 / 5.9
18 / 5
1.6 / 14.1
-/2
0.4 / 5.9
-/0
0.5 / 6.1
-/3
2.5 / 26.2
-/0
0.4 / 6.1
-/0
0.4 / 6.1
-/0
0.4 / 6.1
-/2
0.4 / 6.0
-/3
6.0 / 22.6
-/2
0.9 / 7.1
-/1
35.4 / 36.2
-/3
0.9 / 7.0
-/2
1.6 / 21.6
-/4
3.1 / 19.5
-/6
0.7 / 7.0
-/10
46.5 / 127.9
-/4
3.4 / 18.5
-/6
9.8 / 28.1
-/3
0.8 / 7.0
-/3
50.7 / 139.1
-/2
1.1 / 7.1
-/3
1.8 / 8.8
-/3
2.7 / 18.0
-/15
26.3 / 76.6
-/3
9.0 / 20.6
-/5
102.2 / 67.4
-/9
1.1 / 24.0
-/5
0.9 / 22.3
-/2
0.5 / 5.9
-/4
0.7 / 10.9
-/4
4.4 / 15.0
-/4
0.5 / 6.8
-/15
14.7 / 65.2
-/2
3.2 / 25.4
-/1
1.0 / 16.0
-/-
Table 1: Retiming results for ISCAS circuits (upper part) and GP circuits (lower part).
Design
PROLOG
S1196
S1238
S1269
S3330
S382
S400
S444
S4863
S499
S641
S713
S953
S967
C RAS
D DASA
D DUDD
I IBBC
I IFAR
I IFEC
L EMQ
L EXEC
L FLUSH
L PNTR
L TBWK
S SCU1
V CACH
V DIR
V L2FB
V SCR1
V SCR2
V SNPC
W GAR
W SFA
Original Circuit
Reduced Circuit
Number of Reachability Time (sec) / Number of Reachability BDDi nit
Registers Steps, Algo Memory(MB) Registers Steps, Algo Nodes
136
17 C I
2285 / 134.5
45
16 C H
611
18
4CI
1.1 / 6.5
14
2CI
122
18
4CI
1.2 / 6.5
14
2CI
159
37
11 C H 13194 / 185.5
36
11 C H
901
132
17 C H
668.0 / 35.3
45
16 C I
194
21
13 C I
< 0:1 / 6.2
15
11 C I
17
21
10 C I
< 0:1 / 6.2
15
10 C H
16
21
4CI
< 0:1 / 6.1
15
3CH
27
104
3I
14400 / 174.2
37
4CI
199
22
1CH
0.2 / 6.2
20
1CH
21
19
6CI
0.8 / 6.4
15
5CI
15
19
6CI
0.9 / 6.3
15
5CI
15
29
6CI
0.8 / 6.4
6
5CH
7
29
4CI
1.1 / 6.3
6
3CH
7
431
1028 C I
724.3 / 57.2
370
1026 C I
415
115
6CI
19.7 / 7.8
100
5CI
200
129
13 C I
953.3 / 112.8
100
11 C H
2568
195
5CH
145.3 / 11.4
40
3CH
41
413
5I
14400 / 87.0
139
22 C I
719
182
6CI
66.3 / 8.4
45
2CH
151
220
8CH
323.7 / 17.0
88
5CH
5519
535
5H
14400 / 63.2
137
9CI
1856
159
4CI
37.4 / 7.7
1
2CH
2
541
6CI
6687 / 138.5
245
3CI
242
307
6CH
184.1 / 9.1
124
4CH
123
373
14 C H
8934 / 165.8
200
12 C H
755
173
11 C H
92.1 / 17.2
97
8CI
910
178
8CH
57.9 / 8.3
83
2CI
95
75
4CI
2.9 / 6.3
26
2CH
27
150
20 C H
250.0 / 17.7
48
17 C I
90
551
22 C I
1201 / 105.0
82
20 C I
220
93
4CH
4.9 / 6.6
21
1CH
17
242
11 C I
109.8 / 25.0
90
9CH
191
64
7CI
3.7 / 6.8
42
6CI
14
Time (sec) /
Memory(MB)
81.6 / 27.5
0.5 / 6.3
0.1 / 6.3
13395 / 187.5
35.8 / 15.6
< 0:1 / 6.1
< 0:1 / 6.1
< 0:1 / 6.1
14.8 / 16.6
< 0:1 / 6.2
1.0 / 6.4
0.6 / 6.4
< 0:1 / 6.1
< 0:1 / 6.1
424.0 / 51.8
33.0 / 11.6
359.1 / 33.7
4.4 / 6.4
2302 / 102.0
28.0 / 6.9
205.6 / 33.0
593.6 / 103.2
< 0:1 / 6.2
2423 / 51.2
74.0 / 7.4
1195 / 118.1
20.0 / 8.9
11.1 / 7.0
< 0:1 / 6.1
5.0 / 15.5
260.0 / 36.7
< 0:1 / 6.2
82.5 / 13.0
3.6 / 6.9
Relative
Improvement
Time / Memory
96.4% / 79.6%
54.5% / 3.1%
91.7% / 3.1%
-1.5% / -1.1%
94.6% / 55.8%
0.0% / 1.6%
0.0% / 1.6%
0.0% / 0.0%
99.9% / 90.5%
100% / 0.0%
-25.0% / 0.0%
33.3% / -1.6%
100% / 4.7%
100% / 3.2%
41.5% / 9.4%
-67.5% / -48.7%
62.3% / 70.1%
97.0% / 43.9%
84.0% / -17.2%
57.8% / 17.9%
36.5% / -94.1%
95.9% / -63.3%
100% / 19.5%
63.8% / 63.0%
59.8% / 18.7%
86.6% / 28.8%
78.3% / 48.3%
80.8% / 15.7%
100% / 3.2%
98.0% / 12.4%
78.4% / 65.0%
100% / 6.1%
24.9% / 48.0%
2.7% / -1.5%
Table 2: Effect of retiming on reachability analysis (C = completed within the time limit of four
hours, = hybrid image computation, = IWLS95 image computation).
H
I
The number of negative registers generated by retiming is surprisingly small. This
can be explained by the two-input A ND /I NVERTER data structure used as circuit representation. One can show that within each strongly connected component (SCC) of
such circuits, there exists an optimal retiming with only positive registers. Only paths
between the SCCs may require negative registers for an optimal solution.
Table 2 gives the performance results for symbolic reachability analysis. We report
results for all circuits of Table 1 for which retiming resulted in a register reduction and
reachability analysis could be completed. We ran each experiment with two options for
the VIS image computation: the IWLS95 partitioned transition relation method and the
hybrid approach. The best of the two results on a per-example basis are then reported.
Although after reduction we can complete traversal for only three additional circuits, the
results clearly show that retiming significantly improves the overall performance. The
CPU time is decreased by an average of 53.1% for ISCAS and 64.0% for GP circuits,
respectively. The corresponding memory reductions are 17.2% and 12.3%, respectively.
The cumulative run time speedup is 55.7% for the ISCAS benchmarks and 83.5% for
Symbolic Reachability Profile for S3330
100000
Number of BDD Nodes
80000
No Reduction
COM Only
Retiming Only
COM + Retiming
60000
40000
20000
0
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Time-step
Fig. 4: BDD size profile for traversing S3330 with method IWLS95 after various transformations.
the GP circuits. To illustrate the complexity of the retiming stump, we report the BDD
sizes for the initial states in column 7. As shown, these BDDs remain fairly small and
do not impact the complexity of the reachability analysis.
Figure 4 shows the profile of the BDD size while traversing benchmark S3330 for
the original circuit and after applying various reduction steps. This example demonstrates how retiming typically benefits the performance of the traversal. To further illustrate the effect of retiming on the correlation of the state encoding, we analyzed the
traversal of circuit S4863. Reachability timed out during the third traversal step of the
original circuit. Using retiming, the correlation between the remaining registers was
completely removed resulting in full reachability of all 237 states. While such a profound result is likely atypical, this is strong evidence of the power of both structural
simplification and retiming to reduce register correlation.
7 Conclusions and Future Work
We presented the application of generalized retiming for enhancing symbolic reachability analysis. We discussed three extensions of the classical retiming approach which
include: (1) eliminating the need for equivalent reset states by introducing the concept
of an initialization stump, (2) supporting negative registers, handled as general functional relations to future time frames, and (3) removing peripheral registers by converting them into simple temporal offsets. We implemented the presented algorithm in a
transformation-engine-based tool architecture that allows an efficient iteration between
multiple reduction engines before the model is passed to a terminal reachability algorithm. Our experiments based on standard benchmarks and industrial circuits indicate
that the presented approach significantly increases the capacity of standard reachability
algorithms. In particular, we demonstrated that the repeated interleaved application of
retiming and other restructuring algorithms in a transformation-based setting can yield
reduction results that cannot be achieved with a monolithic approach.
In this paper the application of retiming is focused on minimizing the total number
of registers as an approximate method for enhancing reachability analysis. It does not
take into account that the actual register placement can have a significant impact on
other algorithms used for improving symbolic state traversal. An interesting problem
for future research is to extend the formulation of structural transformations beyond
simple retiming to obtain a more global approach for improving reachability analysis.
References
1. O. Coudert, C. Berthet, and J. C. Madre, “Verification of synchronous sequential machines
based on symbolic execution,” in International Workshop on Automatic Verification Methods
for Finite State Systems, Springer-Verlag, June 1989.
2. J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang, “Symbolic model
checking: 1020 states and beyond,” in IEEE Symposium on Logic in Computer Science,
pp. 428–439, IEEE, June 1990.
3. T. Niermann and J. H. Patel, “HITEC: A test generation package for sequential circuits,” in
The European Conference on Design Automation, pp. 214–218, IEEE, February 1991.
4. C. Leiserson and J. Saxe, “Retiming synchronous circuitry,” Algorithmica, vol. 6, pp. 5–35,
1991.
5. J. A. Darringer, D. Brand, J. V. Gerbi, W. H. Joyner, and L. H. Trevillyan, “Logic synthesis through local transformations,” IBM Journal on Research and Development, vol. 25,
pp. 272–280, July 1981.
6. A. Gupta, P. Ashar, and S. Malik, “Exploiting retiming in a guided simulation based validation methodology,” in Correct Hardware Design and Verification Methods (CHARME’99),
pp. 350–353, September 1999.
7. S. Malik, E. M. Sentovich, R. K. Brayton, and A. Sangiovanni-Vincentelli, “Retiming and
resynthesis: Optimizing sequential networks with combinational techniques,” IEEE Transactions on Computer-Aided Design, vol. 10, pp. 74–84, January 1991.
8. G. Hasteer, A. Mathur, and P. Banerjee, “Efficient equivalance checking of multi-phase
designs using retiming,” in IEEE International Conference on Computer-Aided Design,
pp. 557–561, November 1998.
9. J. Baumgartner, A. Tripp, A. Aziz, V. Singhal, and F. Andersen, “An abtraction algorithm for
the verification of generalized C-slow designs,” in Conference on Computer Aided Verification (CAV’00), pp. 5–19, July 2000.
10. C. Leiserson and J. Saxe, “Optimizing synchronous systems,” Journal of VLSI and Computer
Systems, vol. 1, pp. 41–67, January 1983.
11. H. J. Touati and R. K. Brayton, “Computing the initial states of retimed circuits,” IEEE
Transactions on Computer-Aided Design, vol. 12, pp. 157–162, January 1993.
12. G. Even, I. Y. Spillinger, and L. Stok, “Retiming revisited and reversed,” IEEE Transactions
on Computer-Aided Design, vol. 15, pp. 348–357, March 1996.
13. G. Cabodi, S. Quer, and F. Somenzi, “Optimizing sequential verification by retiming transformations,” in 37th ACM/IEEE Design Automation Conference, pp. 601–606, June 2000.
14. A. Kuehlmann, M. K. Ganai, and V. Paruthi, “Circuit-based Boolean reasoning,” in Proceedings of the 38th ACM/IEEE Design Automation Conference, ACM/IEEE, June 2001.
15. M. S. Hung, W. O. Rom, and A. Waren, Optimization with IBM OSL. Scientific Press, 1993.
16. The VIS Group, “VIS: A system for verification and synthesis,” in Conference on Computer
Aided Verification (CAV’96), pp. 428–432, Springer-Verlag, July 1996.