Enhanced Diameter Bounding via Structural Transformation Jason Baumgartner1 Andreas Kuehlmann2 1 2 IBM Server Group, Austin, TX Cadence Berkeley Labs, Berkeley, CA Abstract relies upon quantified Boolean formulae (QBF) thus is PSPACEcomplete [2]. Sufficient heuristics for solving such problems are not yet available, though some are beginning to emerge [3, 4]. Due to the computational complexity of diameter calculation, various overapproximate techniques have been proposed. For example, the recurrence diameter [2] of a design is its maximumlength irredundant state sequence, and may be calculated by a series of propositional satisfiability problems. This approach is NPcomplete as long as the depth is polynomial with respect to design size. However, the recurrence diameter may be exponentially larger than the diameter, hence it is of limited practical utility and may require exorbitant resources to compute. The technique of [7] performs a fast diameter overapproximation based upon purely structural analysis, and enables compositional diameter bounding. This approach may yield tight bounds for certain designs (primarily acyclic and memory-based) for which the recurrence diameter is loose, though may also result in exponentially-loose bounds for other designs. Other approaches, such as [8], have proposed the use of incomplete algorithms to estimate diameter, though are not guaranteed to yield an upper-bound. In some cases the use of diameter to bound BMC depth is more conservative than necessary; i.e., a smaller bound is sufficient for completeness. For example, we may ignore any vertices outside of the cone of influence of the property, which may decrease the diameter [7]. Additionally, a BMC application for the maximum distance from any initial state, rather than from any reachable state, suffices for property checking [6]. Furthermore, to assess if a given design signal t may ever be asserted, we need to perform a search only deep enough to assess whether signal t may toggle from 0 to 1 relative to an initial state; the amount of time necessary to toggle t from a 1 to a 0 from any state may be exponentially greater. These concepts are revisited in Definition 3, which generalizes the traditional state-based diameter definition, and are used in Theorem 4. In this paper we demonstrate how structural transformations may be used to enhance arbitrary diameter approximation techniques. This research enables the use of a diameter bound obtained upon a transformed design to yield a tight bound for the original, untransformed design via a constant-time calculation. Due to the reduction potential of these transformations, this theory may enable overapproximate techniques to yield exponentially tighter diameter bounds. Furthermore, the diameter bounding process itself may attain significant speedups by operating on the smaller, transformed designs. Numerous experimental results are provided for benchmark and industrial designs to illustrate our approach. There are several motivations for this research. Bounded model checking (BMC) has gained widespread industrial use due to its relative scalability. Its exhaustiveness over all valid input vectors allows it to expose arbitrarily complex design flaws. However, BMC is limited to analyzing only a specific time window, hence will only expose those flaws which manifest within that window and thus cannot readily prove correctness. The diameter of a design has thus become an important concept — a bounded check of depth equal to the diameter constitutes a complete proof. While the diameter of a design may be exponential in the number of its state elements, in practice it often ranges from tens to a few hundred regardless of design size. Therefore, a powerful diameter overapproximation technique may enable automatic proofs that otherwise would be infeasible. Unfortunately, exact diameter calculation requires exponential resources, and overapproximation techniques may yield exponentially loose bounds. In this paper, we provide a general approach for enabling the use of structural transformations, such as redundancy removal, retiming, and target enlargement, to tighten the bounds obtained by arbitrary diameter approximation techniques. Numerous experiments demonstrate that this approach may significantly increase the set of designs for which practically useful diameter bounds may be obtained. 1 Introduction Due to the complexity of modern hardware designs, formal verification methods are gaining widespread use to augment the coverage shortcomings of simulation-based validation approaches. Unfortunately, formal methods generally require exponential resources with respect to design size. General unbounded approaches, such as symbolic reachability analysis, are PSPACEcomplete whereas bounded approaches are NP-complete for a given bound [1]. Due to the lower resource requirements, and the wider variety of available algorithms for discharging a bounded verification problem, BMC has gained significant industrial utilization in recent years. BMC [2] attempts to find a property violation within k time-steps from the initial state(s) of a design. Though exhaustive, the bounded nature of this approach implies incompleteness; performing a check of depth 0 through k does not necessarily imply that no violation will occur at depths greater than k. The incompleteness of BMC has resulted in a set of research activities to discern a minimal k for which a bounded check is complete. The diameter [2] of a design is typically defined as the maximum distance between any two of its states si and s j such that s j is reachable from si . The distance from si to s j is the minimum number of time-steps to transition the design from si to s j . Clearly a bounded check of depth greater or equal to the diameter of the design is complete. However, a general diameter calculation A hybrid between a QBF and a recurrence diameter approach is proposed in [5] to partially alleviate the respective shortcomings of these techniques. Bounded coneof-influence analysis is proposed in [6] to tighten recurrence-diameter-based bounds. 1 1. To our knowledge, this paper is the first to discuss the theoretical impact of these transformations upon design diameter. the following example provides a distinction. Consider a netlist comprising a primary input i0 , which fans out to register r1 (whose initial value is primary input i1 ), which in turn fans out to register r2 (whose initial value is primary input i2 ). The diameter of r1 r2 is two, since it will take two time-steps to view 1 1 after 0 0 . However, the diameter of r2 is one since r2 may produce any valuation at any time-step independently of other time-steps, being trace-equivalent (refer to Definition 4) to any primary input. 2. These transformations often reduce design size, hence often reduce verification resource requirements as demonstrated by the cited papers. Therefore, one may question the utility of back-translating a diameter to the original design vs. attempting a proof solely upon the transformed design. However, note that in some cases, such a transformation may actually hurt verification. E.g., retiming may increase input count while reducing register count [9]; retiming and phase-abstraction [10] may increase the size of next-state functions. These transformations may vary both resource requirements and tightness of the obtained approximation using diameter approximation techniques; this variance may not correlate to their impact upon verification. Therefore, this research constitutes yet another practical mechanism which may be attempted to discharge difficult verification problems. 2 3 Diameter Bounding Transformed Netlists In this section, we discuss the impact of various types of transformations upon the diameter of a design. This theory will enable a diameter bound dˆ U obtained upon vertex set U of a transformed netlist N to immediately imply a diameter bound dˆ U on the corresponding vertex set of the original, untransformed netlist N via a constant-time calculation. In many cases, this theory enables that a tight dˆ U can yield a tight dˆ U . In the worst case, for all of the semantics-preserving transformations to be discussed, the resultant dˆ U will be at most a linear factor of dˆ U with respect to < R < . Recall that diameter overapproximation techniques may yield exponentially loose bounds with respect to register count, and may require exponential resources with respect to design size. Structural transformations are capable of yielding arbitrarily large reductions in design size, and often require polynomial resources. This collectively implies that the theory of this section may enhance both the resource requirements and the tightness of the diameter overapproximation obtained via any technique. We also discuss how diameter bounds obtained through approximate transformations (which are not both sound and complete) cannot be used in general to bound the diameter of the original netlist. Design Semantics: The Netlist Definition 1. A netlist is a tuple N V E G Z comprising a directed graph with vertices V and edges E V V . Function G : V types represents a semantic mapping from vertices to gate types, including constants, primary inputs (i.e., nondeterministic bits), registers (denoted as R), and combinational gates with various functions. A state is a mapping from registers to 0 1 values; Z represents the set of initial states. Definition 2. The semantics of a netlist N are defined in terms of semantic traces: 0 1 valuations to gates over time. We denote the set of all legal traces associated with a netlist by P V 0 1 , defining P as the subset of all possible functions from V to 0 1 which are consistent with G. The value of gate v at time i in trace p is denoted by p v i . 3.1 Trace-Equivalence-Preserving Transformations The following definition and theorem illustrates that traceequivalence-preserving transformations do not alter diameter. Our verification problem consists of a set of targets T V correlating to a set of properties AG t t T . We say that target t is hit in trace p at time i iff p t i 1. Due to the ability to synthesize safety properties into automata [11], this invariant-checking model is rarely a practical limitation. Definition 4. Vertex sets A and A of netlists N and N , respectively, are said to be trace equivalent iff there exists a bijective mapping ψ : A A which satisfies the following conditions. = = Definition 3. The diameter d U of vertex set U is the minimum number such that for any trace p and any increasing succession† k1 kc , there exists another trace p and another increasing succession l1 lc such that ! cj" 1 l j # k j and $ lc # lc % 1 & d U ' , taking l0 )( 1, which satisfies u U *! cj " 1 $ p u k j + p u l j ' . p P > p p P > i ?@ a A p a i . p $ ψ a i ' p P i ?@ a A p a i . p $ ψ a i ' P Theorem 1. Let N and N be netlists which are trace-equivalent with respect to vertex sets A and A and bijective mapping ψ : A A . The diameter of A is equal to that of A . Note that our definition of diameter is generally one greater than the standard definition for graphs; this matches the number of timesteps necessary to ensure completeness of BMC, and is always greater than zero. Definition 3 may be viewed as a generalization of a traditional state-based diameter definition because the diameter of vertices U actually need not correlate to that of coi U -, R. This definition provides an opportunity to bound diameter without a need to analyze the underlying state space representation. For example, if a vertex u encodes an X OR function of a primary input and a sequentially-driven signal A, then d u . 1 regardless of d A since any valuation to u will be producible at any time-step. From this example, one may be tempted to view this definition as using a set of vertices U as a filter through which the state space of the netlist is observed; while this is an intuitive interpretation, Proof. This theorem follows immediately from Definitions 3 and 4. It is well-known that bisimilarity, relating state transition graphs of netlists, is more restrictive than trace equivalence – bisimilarity implies trace equivalence, though the latter does not imply the former. Numerous approaches have been proposed for reducing the size of a design while preserving bisimilarity (thus also preserving trace equivalence); for example, those of [12, 13]. Such techniques often require the analysis of the state space of the design, which is computationally too expensive to consistently offer benefits for invariant checking as noted in [13]. However, the technique of redundancy removal [14, 15] is widely applied for enhancing verification. The idea of this approach is to attempt to identify two semanticallyequivalent vertices u and v; when two such vertices are found, all fanout edges from v are moved to u and v is made a simple buffer † An increasing succession is an ordered set of natural numbers k1 /1020203/ kc for c 4 1 which satisfies the relation ki 5 ki6 1 / 7 i 8:9 1 / c % 1; . 2 be bounded at i & 1 by Theorem 2, which is likely to be exact. As demonstrated in [9], redundancy removal plus retiming yields average reductions in register count of 27% for ISCAS89 benchmarks, and of 62% for IBM Gigahertz processor netlists. (one-input A ND gate) sourced by u. Clearly, redundancy removal does not alter the semantics of any vertex. Such merging is beneficial since it reduces the number of vertices in the cone of influence of a target, and may enable further reductions of the fanout cone of the merged vertex v. Identification of semantically-equivalent vertices may be performed efficiently by structural analysis or by BDD and SAT sweeping [14, 15] with no need to analyze the state space of the netlist. Note also that a cone-of-influence reduction preserves trace-equivalence of all vertices in the cone. The technique of parametric re-encoding of a netlist [16, 17] replaces the fanin cone C of a cut C C V A C of a netlist with a trace-equivalent cone C . Such re-encoding preserves traceequivalence of any vertex set in C . 3.3 State-Folding Abstractions In this section we discuss two structural transformations which entail state folding: phase abstraction and c-slow abstraction. Phase abstraction [10, 17] is a technique to yield a register-based netlist from one composed of level-sensitive latches: gates with two inputs (data and clock), which act as buffers when their clock is active and hold their last-sampled data values otherwise. C-slow abstraction [21, 17] is directly applicable to register-based netlists. Both abstractions are applicable to netlists in which the state elements may be c-colored such that state elements of color i may only combinationally fan out to state elements of color i & 1 mod c.‡ By eliminating all but one color of state elements (transforming others into combinational logic), both abstractions reduce the number of state elements of a netlist by a factor of 1 H c or greater. The semantic effect of these abstractions is to temporally fold the resulting netlist modulo-c. 3.2 Retiming Leiserson and Saxe [18] proposed the technique of retiming as a synthesis optimization for reducing the number of registers of a design (and thereby its size), or for reducing the clock period of a design by decreasing its longest purely-combinational path. More recently, several researchers have proposed the use of retiming to enhance verification [19, 20, 9]. Definition 5. A retiming of netlist N is a gate labeling r : V CB , where r v is the lag of vertex v, denoting the number of registers that are moved backward through v. A normalized retiming r may be obtained from an arbitrary retiming r, and is defined as r r ( maxv 8 V r v . Theorem 3. If the diameter of a set of identically-colored vertices Ũ of phase abstracted or c-slow abstracted netlist Ñ is d Ũ , then the diameter of the corresponding vertex set U of the original netlist N is at most c I d Ũ . Proof. By Definition 3, if the diameter of the phase- or c-slowabstracted vertex set Ũ is d Ũ , then the longest required duration to witness a particular valuation to Ũ is d Ũ time-steps. From [10, 21, 17], we know that phase abstraction and c-slow abstraction fold time modulo-c. Therefore, any transition of states in Ñ correlates to c transitions in N, and the corresponding valuation to U must occur within c I d Ũ time-steps. In [9], the use of normalized retiming is proposed for enhanced invariant checking. The retimed netlist Ñ has two components: 1) a sequential recurrence structure Ñ which has a unique representative in the original netlist N for each combinational vertex, and whose registers are placed according to Definition 5, and 2) a combinational retiming stump Ñ obtained through unfolding, representing retimed initial values as well as the functions of all gates for prefix time-steps that were effectively discarded from the recurrence structure. It is demonstrated that each gate ũ within Ñ is trace-equivalent to the corresponding u within N, modulo a temporal skew of ( r ũ2 time-steps. Furthermore, there will be ( r ũ2 correspondents to this u within Ñ , each being trace-equivalent to ũ for one time-step during this temporal skew. 3.4 Target Enlargement Target enlargement is a technique to render a target which may be hit at a shallower depth from the initial states of a netlist, and with a higher probability, than the original target [22, 23]. Target enlargement is based upon preimage computation to calculate the set of states which may hit the target within k time-steps. Inductive simplification may be performed upon the i-th preimage to eliminate states which hit the target in fewer than i time-steps. The result of this calculation is the characteristic function of the set of states S which is a subset of all states that can hit the target in 0 k steps, and a superset of those that can hit the target in exactly k steps minus those that can hit the target in 0 J k ( 1 steps. Prior research has noted the benefit of representing the enlarged target structurally to enable better synergy with simulation and SAT-based analysis [24], and to enable a reduction in the size of the cone-of-influence of the enlarged target [7]. Theorem 2. If the diameter of a set of vertices Ũ of the recurrence structure of a normalized-retimed netlist is d Ũ 1 , and > i ũ+ Ũ DE( r ũ1F i, then the diameter of the original set of vertices U satisfies d U # d Ũ & i. Proof. In [7], it is shown that for a given set of vertices u1 un with diameter d, the diameter of a set of acyclic registers v1 vn whose input edges are sourced by u1 un , respectively, will be at most d & 1. Note that we may compose a series of i registers to each ũ , whose initial values are determined by corresponding values from the retiming stump. This yields a set of vertices U which are trace-equivalent to U. Each stage of this pipeline is an acyclic register, hence increments diameter by at most 1. This proof therefore follows from Theorem 1 and the results of [9, 7]. Theorem 4. If the diameter of a k-step enlarged target t is d t , then the original target t is hittable within d t & k time-steps, if at all. Each lag will be between (G< R < and 0 for any optimal normalized retiming, thus guaranteeing that d U is a linear factor of d Ũ . Retiming is potentially capable of eliminating all registers from a netlist; e.g., for a pipeline with a primary input driving a set of registers connected in series. Because the diameter of a combinational netlist is 1, the diameter of the i-th register of this pipeline will Proof. If d t K i, then t must be hittable at time 0 J i ( 1 if at all, as per Definition 3. Because the enlarged target is a subset of the states that hit the target within k time-steps, and a superset ‡ Note that c may readily be bounded by L R L . In [17], a netlist preprocessing technique is formalized to allow c-slow abstraction to be applied to any netlist where each directed cycle comprises a factor of c M 1 registers. 3 Therefore, diameter bounds obtained upon an underapproximated netlist cannot generally be used to bound the original netlist. of those that hit the target in exactly k time-steps minus those that hit the target in fewer than k time-steps, the original target must be hittable within 0 k & i ( 1 time-steps, if at all. 4 Due to the temporal union and input quantification inherent in target enlargement, it may be the case that a transition of the enlarged target t from 1 to 0 may be skewed and possibly eliminated with respect to such a transition of the original target t. For example, target t may be an O R of signals A and B, where B encodes the function counter O N 0 for a mod -c counter. The first hit of t via A may cause the counter to unconditionally begin counting, such that t may thereafter only be deasserted one time-step of every c time-steps. Target enlargement may obscure this deasserted timestep such that once hit, t will never be deasserted. The number of time-steps necessary to drive a binary 1 to t from any reachable state of N may be exponentially smaller than the number necessary to subsequently drive a binary 0 onto t. Therefore, target enlargement does not entail as clean of an impact on diameter as we may hope; we cannot use a target enlargement approach to bound the diameter of an intermediate component of a partitioned netlist [7], for example. However, the result of Theorem 4 is sufficient to allow a bound derived from the target-enlarged netlist to imply an upperbound on the number of time-steps sufficient to perform BMC in a complete manner for the original target. In [7], it is demonstrated that target enlargement may yield average register count reductions of 14% for ISCAS89 benchmarks and of 71% for IBM Gigahertz processor netlists. Experimental Results In this section we provide the experimental results of our transformation-based diameter overapproximation theory. All experiments were run on an IBM ThinkPad model T21 running RedHat Linux 6.2, with an 800 MHz Pentium III and 256 MB main memory. As transformations, we utilized a redundancy removal engine COM [27], a retiming engine RET [9], and for the IBM Gigahertz Processor examples, a phase abstraction engine [10]. The diameter overapproximation algorithm utilized for our results is the fast structural technique presented in [7]. To briefly summarize this technique, the approach first partitions the netlist into an acyclic sequence of components, and then compositionally derives an overapproximate diameter bound dˆ from this partition. While the diameter of each component is generally exponential in its register count, it is demonstrated that certain commonlyoccurring types of components have a much smaller impact on diameter. Four classes of components are considered: constant components (CCs) — these do not increase the diameter; acyclic components (ACs), which represent a one-stage pipeline of arbitrary width — these increment the diameter by one regardless of how many bits wide the pipeline stage is; memory/queue components (MCs/QCs) — these multiply the diameter by the number of atomically-updated rows plus one regardless of the bit-width of each row; and general components (GCs) — these are the catchall category for all other strongly-connected components, and may have a diameter exponential in their register count. Rather than using more expensive diameter bounding techniques such as QBF or recurrence diameter for GCs, we assume an exponential diameter increase in these experiments for speed. Our first set of experiments summarized in Table 1 are on the ISCAS89 benchmarks, using each primary output as a target for lack of any more meaningful available targets. We categorized the registers in the netlist into the various component types. We additionally ran the diameter overapproximation algorithm of [7] on all targets; any with a diameter of less than 50 were enumerated in set T T and the average of these corresponding diameters is reported. The bound of 50 was arbitrarily chosen as being a reasonable cut-off size for discharging with BMC. In the bottom row we report the cumulative sum of registers of the corresponding types, and the cumulative sum of < T < and < T < . We performed these experiments on the original netlists; on redundancy-removed netlists (COM); and on netlists after redundancy removal and retiming (COM,RET,COM), using Theorems 1 and 2. We perform the identical set of experiments on randomly-selected phase-abstracted GP netlists in Table 2. We do not report per-line computational resources for these experiments; the structural diameter overapproximation algorithms consume less than 1 second and 1 MB per target. In contrast to the results reported in [7] which do not reflect the results of retiming nor redundancy removal, these experiments are intended to illustrate precisely the effects of these transformations. Many registers are initially non-complex for the ISCAS netlists: 21% are acyclic registers, and 5% are table cells (elements of MCs/QCs). A total of 477 original targets (30%) have a diameter of less than 50. After redundancy removal, 24% of the registers are acyclic, and 10% are table cells; 556 of the targets (34%) have a diameter of less than 50. After redundancy removal and retiming, 10% of the registers are acyclic and 11% are table cells. This drop 3.5 Overapproximate Abstraction Various techniques have been developed for attempting to reduce netlist size while providing an overapproximation of the behavior of that netlist. In other words, any target assessed to be unreachable after overapproximation is guaranteed to be unreachable before the overapproximation. However, a target hit obtained for the overapproximated netlist may not imply that the target is hittable in the original netlist. For example, cut-point insertion [25] is commonly used in equivalence checking, and consists of replacing an internal netlist vertex by a primary input. Localization [26] is another common overapproximation technique which replaces all vertices sourcing crossing edges of a cut of the netlist by primary inputs. There are two byproducts of overapproximate abstractions. First, states which are not truly reachable may become reachable, which may clearly increase the diameter. Second, state transitions which are not reachable may become reachable, which may decrease the diameter. Therefore, diameter bounds obtained upon an overapproximated netlist cannot be used in general to obtain a bound for the original netlist. 3.6 Underapproximate Abstraction An underapproximate abstraction attempts to reduce the netlist size while providing an underapproximation of its behavior. In other words, any trace hitting a target of the abstracted netlist is valid to illustrate a hit of the unabstracted target. However, target unreachable results obtained upon the underapproximate netlist may not imply that the target is unreachable in the original netlist. For example, case splitting may replace certain primary inputs by constant values. While a special case of case splitting is semanticspreserving (if the image of a netlist cut is preserved as described in Section 3.1), underapproximate abstractions may generally cause two byproducts. First, reachable states may become unreachable, which may decrease the diameter. Second, reachable state transitions may become unreachable, which may increase diameter. 4 in acyclic registers is due primarily to their elimination by retiming. A total of 639 targets (40%) have a diameter of less than 50. A larger fraction of the registers is originally non-complex for the GP netlists, as is intuitive for highly-pipelined gigahertz designs: 1% are constants, 57% are acyclic, and 13% are table cells. A total of 95 targets (33%) have a diameter of less than 50. After redundancy removal, 0.5% of the registers are constants, 58% are acyclic, and 15% are table cells. A total of 111 targets (39%) have a diameter of less than 50. After retiming and redundancy removal, 1% of the registers are constants, 19% are acyclic, and 34% are table cells. A total of 126 (44%) of these targets have a diameter of less than 50. Overall, the results clearly indicate that the use of structural transformations, coupled with the theories presented in this paper, are capable of significantly increasing the set of targets for which a practically useful overapproximate diameter bound may be obtained. We increase the percentage by 10% or more of such targets in both ISCAS89 and GP netlists. This result is particularly profound noting that we did not employ any (possibly costly) techniques to attempt to tighten GC diameter bounds; our experiments thus reflect a very fine line between being able to attain a small diameter bound and a huge bound. As techniques emerge for efficiently improving diameter bounding for GCs, the transformationbased theory we have developed should prove even more useful to obtain superior results with lesser resources. In some cases, the diameter bound computed for a retimed netlist is slightly larger than that of the original netlist – for example, with S1196 and S15850 1. This is partially due to the inequality in Theorem 2; we must add the negated target lag to its diameter bound, even though retiming may not have reduced register count for that target. Use of a normalized retiming helps minimize this potential increase, as would retiming and normalizing a single target cone at a time (instead of the entire netlist). However, the potential for increase tends to be very small (since most lags tend to be very small and are bounded by (P< R < ), whereas the potential for decrease is exponentially greater. Transformations also impact table identification and clustering heuristics for MCs/QCs. Due to the speed of these heuristic algorithms, it may be beneficial to run them on every possible netlist representation to enable the best possible result. 5 [2] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu, “Symbolic model checking without BDDs,” in Tools and Algorithms for Construction and Analysis of Systems, March 1999. [3] L. Zhang and S. Malik, “Conflict driven learning in a quantified Boolean satisfiability solver,” in Int’l Conference on Computer Design, Nov. 2002. [4] M. N. Mneimneh and K. A. Sakallah, “SAT-based sequential depth computation,” in ASP Design Automation Conference, January 2003. [5] M. Sheeran, S. Singh, and G. Stalmarck, “Checking safety properties using induction and a SAT-solver,” in Formal Methods in ComputerAided Design, Nov. 2000. [6] D. Kroening and O. Strichman, “Efficient computation of recurrence diameters,” in Int’l Conference on Verification, Model Checking, and Abstract Interpretation, Jan. 2003. [7] J. Baumgartner, A. Kuehlmann, and J. Abraham, “Property checking via structural analysis,” in Computer-Aided Verification, July 2002. [8] C.-C. Yen, K.-C. Chen, and J.-Y. Jou, “A practical approach to cycle bound estimation,” in Int’l Workshop on Logic & Synthesis, 2002. [9] A. Kuehlmann and J. Baumgartner, “Transformation-based verification using generalized retiming,” in Computer-Aided Verification, July 2001. [10] J. Baumgartner, T. Heyman, V. Singhal, and A. Aziz, “Model checking the IBM Gigahertz Processor: An abstraction algorithm for highperformance netlists,” in Computer-Aided Verification, July 1999. [11] E. A. Emerson, “Temporal and modal logic,” Handbook of Theoretical Computer Science, vol. B, 1990. [12] A. Aziz, V. Singhal, G. M. Swamy, and R. K. Brayton, “Minimizing interacting finite state machines: A compositional approach to language containment,” in Int’l Conference on Computer Design, 1994. [13] K. Fisler and M. Vardi, “Bisimulation and model checking,” in Correct Hardware Design and Verification Methods, Sept. 1999. [14] A. Kuehlmann, V. Paruthi, F. Krohm, and M. Ganai, “Robust Boolean reasoning for equivalence checking and functional property verification,” IEEE Transactions on Computer-Aided Design, vol. 21, Dec. 2002. [15] J. Baumgartner and A. Kuehlmann, “Min-area retiming on flexible circuit structures,” in Int’l Conference on Computer-Aided Design, 2001. [16] I.-H. Moon, H. H. Kwak, J. Kukula, T. Shiple, and C. Pixley, “Simplifying circuits for formal verification using parametric representation,” in Formal Methods in Computer-Aided Design, Nov. 2002. [17] J. Baumgartner, Automatic Structural Abstraction Techniques for Enhanced Verification. PhD thesis, University of Texas, Dec. 2002. [18] C. Leiserson and J. Saxe, “Retiming synchronous circuitry,” Algorithmica, vol. 6, 1991. [19] A. Gupta, P. Ashar, and S. Malik, “Exploiting retiming in a guided simulation based validation methodology,” in Correct Hardware Design and Verification Methods, Sept. 1999. [20] G. Cabodi, S. Quer, and F. Somenzi, “Optimizing sequential verification by retiming transformations,” in Design Automation Conference, June 2000. Conclusion We have presented a practical approach with supporting theory to enable the use of structural transformations (such as retiming, redundancy removal, and target enlargement) to enhance diameter overapproximation techniques, and thus to help enable completeness of bounded verification approaches. The theory enables a diameter bound obtained upon a transformed design to backwardimply a tight bound for the original, untransformed design. Due to the reduction potential inherent in these transformations, this theory is capable of significantly increasing the set of designs for which practically useful diameter bounds may be calculated efficiently. This capability is validated by numerous experimental results. A promising future research direction is to apply this theory for speeding up quantified-Boolean-formulae-based diameter calculation, and to exponentially tighten alternate overapproximate techniques such as ones based upon recurrence diameter. [21] J. Baumgartner, A. Tripp, A. Aziz, V. Singhal, and F. Andersen, “An abstraction algorithm for the verification of generalized C-slow designs,” in Computer-Aided Verification, July 2000. [22] J. Yuan, J. Shen, J. Abraham, and A. Aziz, “On combining formal and informal verification,” in Computer-Aided Verification, June 1997. [23] C. H. Yang and D. L. Dill, “Validation with guided search of the state space,” in Design Automation Conference, June 1998. [24] M. Ganai, Algorithms for Efficient State Space Search. PhD thesis, University of Texas, May 2001. [25] A. Kuehlmann and F. Krohm, “Equivalence checking using cuts and heaps,” in Design Automation Conference, June 1997. [26] R. P. Kurshan, Computer-Aided Verification of Coordinating Processes. Princeton University Press, 1994. References [1] A. Aziz, V. Singhal, and R. K. Brayton, “Verifying interacting finite state machines: Complexity issues,” Tech. Rep. UCB/ERL M93/52, University of California at Berkeley, July 1993. [27] A. Kuehlmann, M. Ganai, and V. Paruthi, “Circuit-based Boolean reasoning,” in Design Automation Conference, June 2001. 5 Design Original Netlist Q R Q1R CC; AC; Q T STQ / Q T Q ; Avg. dˆV t S W Q R Q1R CC; AC; COM Q T STQ / Q T Q ; Avg. dˆV t S W PROLOG S1196 S1238 S1269 S13207 1 S1423 S1488 S1494 S1512 S15850 1 S208 1 S27 S298 S3271 S3330 S3384 S344 S349 S35932 S382 S38584 1 S386 S400 S420 1 S444 S4863 S499 S510 S526N S5378 S635 S641 S6669 S713 S820 S832 S838 1 S9234 1 S938 S953 S967 S991 MC+QC; GC 0 ; 107 ; 1 ; 28 0 ; 18 ; 0 ; 0 0 ; 18 ; 0 ; 0 0 ; 9 ; 17 ; 11 0 ; 314 ; 128 ; 196 0 ; 3 ; 16 ; 55 0;0;0;6 0;0;0;6 0 ; 0 ; 1 ; 56 0 ; 99 ; 124 ; 311 0;0;0;8 0;1;2;0 0 ; 0 ; 1 ; 13 0 ; 6 ; 0 ; 110 0 ; 103 ; 1 ; 28 0 ; 111 ; 0 ; 72 0 ; 0 ; 4 ; 11 0 ; 0 ; 4 ; 11 0 ; 0 ; 0 ; 1728 0 ; 6 ; 0 ; 15 0 ; 47 ; 4 ; 1375 0;0;0;6 0 ; 6 ; 0 ; 15 0 ; 0 ; 0 ; 16 0 ; 6 ; 0 ; 15 0 ; 62 ; 0 ; 42 0 ; 0 ; 0 ; 22 0;0;0;6 0 ; 0 ; 1 ; 20 0 ; 115 ; 0 ; 64 0 ; 0 ; 0 ; 32 0 ; 7 ; 0 ; 12 0 ; 181 ; 0 ; 58 0 ; 7 ; 0 ; 12 0;0;0;5 0;0;0;5 0 ; 0 ; 0 ; 32 0 ; 45 ; 9 ; 157 0 ; 0 ; 0 ; 32 0 ; 23 ; 0 ; 6 0 ; 23 ; 0 ; 6 0 ; 0 ; 0 ; 19 14 / 73 ; 8.9 14 / 14 ; 3.3 14 / 14 ; 3.3 2 / 10 ; 10.0 49 / 152 ; 2.0 1 / 5 ; 1.0 19 / 19 ; 33.0 19 / 19 ; 33.0 0 / 21 ; 0.0 115 / 150 ; 2.7 0 / 1 ; 0.0 1 / 1 ; 4.0 0 / 6 ; 0.0 1 / 14 ; 7.0 16 / 73 ; 11.9 6 / 26 ; 16.5 3 / 11 ; 5.0 3 / 11 ; 5.0 0 / 320 ; 0.0 0 / 6 ; 0.0 56 / 304 ; 1.0 7 / 7 ; 33.0 0 / 6 ; 0.0 0 / 1 ; 0.0 0 / 6 ; 0.0 0 / 16 ; 0.0 0 / 22 ; 0.0 7 / 7 ; 33.0 0 / 6 ; 0.0 4 / 49 ; 1.5 0 / 1 ; 0.0 3 / 24 ; 1.0 37 / 55 ; 3.4 3 / 23 ; 1.0 19 / 19 ; 17.0 19 / 19 ; 17.0 0 / 1 ; 0.0 22 / 39 ; 1.2 0 / 1 ; 0.0 3 / 23 ; 2.0 3 / 23 ; 2.0 17 / 17 ; 8.8 MC+QC; GC 0 ; 103 ; 1 ; 28 0 ; 18 ; 0 ; 0 0 ; 18 ; 0 ; 0 0 ; 9 ; 17 ; 11 0 ; 315 ; 128 ; 195 0 ; 3 ; 16 ; 55 0;0;0;6 0;0;0;6 0 ; 0 ; 0 ; 57 0 ; 96 ; 107 ; 328 0;0;0;8 0;1;2;0 0 ; 0 ; 1 ; 13 0 ; 6 ; 0 ; 110 0 ; 103 ; 1 ; 28 0 ; 111 ; 0 ; 72 0 ; 0 ; 4 ; 11 0 ; 0 ; 4 ; 11 0 ; 0 ; 0 ; 1728 0 ; 6 ; 0 ; 15 1 ; 203 ; 366 ; 854 0;0;0;6 0 ; 6 ; 0 ; 15 0 ; 0 ; 0 ; 16 0 ; 6 ; 0 ; 15 0 ; 83 ; 0 ; 21 0 ; 0 ; 0 ; 22 0;0;0;6 0 ; 0 ; 1 ; 20 0 ; 126 ; 0 ; 53 0 ; 0 ; 0 ; 32 0 ; 7 ; 0 ; 12 0 ; 181 ; 0 ; 58 0 ; 7 ; 0 ; 12 0;0;0;5 0;0;0;5 0 ; 0 ; 0 ; 32 0 ; 49 ; 5 ; 157 0 ; 0 ; 0 ; 32 0 ; 23 ; 0 ; 6 0 ; 23 ; 0 ; 6 0 ; 0 ; 0 ; 19 16 / 73 ; 11.9 14 / 14 ; 3.3 14 / 14 ; 3.3 2 / 10 ; 10.0 49 / 152 ; 2.1 1 / 5 ; 1.0 19 / 19 ; 33.0 19 / 19 ; 33.0 0 / 21 ; 0.0 115 / 150 ; 2.7 0 / 1 ; 0.0 1 / 1 ; 4.0 0 / 6 ; 0.0 1 / 14 ; 7.0 16 / 73 ; 11.9 6 / 26 ; 16.5 3 / 11 ; 5.0 3 / 11 ; 5.0 0 / 320 ; 0.0 0 / 6 ; 0.0 133 / 304 ; 14.9 7 / 7 ; 33.0 0 / 6 ; 0.0 0 / 1 ; 0.0 0 / 6 ; 0.0 0 / 16 ; 0.0 0 / 22 ; 0.0 7 / 7 ; 33.0 0 / 6 ; 0.0 4 / 49 ; 1.5 0 / 1 ; 0.0 3 / 24 ; 1.0 37 / 55 ; 3.4 3 / 23 ; 1.0 19 / 19 ; 17.0 19 / 19 ; 17.0 0 / 1 ; 0.0 22 / 39 ; 1.2 0 / 1 ; 0.0 3 / 23 ; 2.0 3 / 23 ; 2.0 17 / 17 ; 8.8 ∑ 0; 1317; 313; 4622 477 / 1615 1; 1503; 653; 4086 556 / 1615 COM,RET,COM Q T SUQ / Q T Q ; MC+QC; GC Avg. dˆV t S W 0 ; 16 ; 1 ; 28 24 / 73 ; 21.0 0 ; 16 ; 0 ; 0 14 / 14 ; 4.3 0 ; 16 ; 0 ; 0 14 / 14 ; 4.3 0 ; 8 ; 17 ; 11 2 / 10 ; 10.0 0 ; 77 ; 89 ; 183 79 / 152 ; 6.4 0 ; 1 ; 12 ; 59 1 / 5 ; 2.0 0;0;0;6 19 / 19 ; 33.0 0;0;0;6 19 / 19 ; 33.0 0 ; 0 ; 0 ; 57 0 / 21 ; 0.0 0 ; 73 ; 81 ; 292 115 / 150 ; 4.7 0;0;0;8 0 / 1 ; 0.0 0;1;2;0 1 / 1 ; 4.0 0 ; 0 ; 1 ; 13 0 / 6 ; 0.0 0 ; 0 ; 0 ; 110 1 / 14 ; 7.0 0 ; 16 ; 1 ; 28 33 / 73 ; 25.3 0 ; 0 ; 0 ; 72 6 / 26 ; 16.5 0 ; 0 ; 4 ; 11 3 / 11 ; 5.0 0 ; 0 ; 4 ; 11 3 / 11 ; 5.0 0 ; 0 ; 0 ; 1728 0 / 320 ; 0.0 0 ; 0 ; 0 ; 15 0 / 6 ; 0.0 0 ; 170 ; 345 ; 832 110 / 304 ; 16.7 0;0;0;6 7 / 7 ; 33.0 0 ; 0 ; 0 ; 15 0 / 6 ; 0.0 0 ; 0 ; 0 ; 16 0 / 1 ; 0.0 0 ; 0 ; 0 ; 15 0 / 6 ; 0.0 0 ; 16 ; 0 ; 21 0 / 16 ; 0.0 0 ; 0 ; 0 ; 22 0 / 22 ; 0.0 0;0;0;6 7 / 7 ; 33.0 0 ; 0 ; 1 ; 20 0 / 6 ; 0.0 0 ; 56 ; 0 ; 56 7 / 49 ; 3.9 0 ; 0 ; 0 ; 32 0 / 1 ; 0.0 0 ; 4 ; 0 ; 10 7 / 24 ; 2.0 0 ; 18 ; 0 ; 58 37 / 55 ; 4.0 0;7;0;7 7 / 23 ; 2.3 0;0;0;5 19 / 19 ; 17.0 0;0;0;5 19 / 19 ; 17.0 0 ; 0 ; 0 ; 32 0 / 1 ; 0.0 0 ; 14 ; 25 ; 133 22 / 39 ; 2.0 0 ; 0 ; 0 ; 32 0 / 1 ; 0.0 0;0;0;6 23 / 23 ; 29.8 0;0;0;6 23 / 23 ; 29.8 0 ; 0 ; 0 ; 19 17 / 17 ; 8.8 Q R Q3R CC; AC; 0; 509; 583; 3992 639 / 1615 Table 1: Diameter bounding experiments for ISCAS89 benchmarks. Design CP RAS CLB CNTL CR RAS D DASA D DCLA D DUDD I IBBQn I IFAR I IFPF L3 SNP1 L EMQn L EXEC L FLUSHn L INTRo L LMQo L LRU L PFQo L PNTRn L PRQn L SLB L TBWKn M CIU SIDECAR4 S SCU1 V CACH V DIR V SNPM W GAR W SFA ∑ Original Netlist Q R Q1R CC; AC; MC+QC; GC 0 ; 279 ; 66 ; 315 0 ; 29 ; 2 ; 19 0 ; 96 ; 6 ; 329 0 ; 16 ; 81 ; 18 0 ; 382 ; 1 ; 754 0 ; 30 ; 28 ; 71 0 ; 623 ; 1488 ; 0 0 ; 303 ; 11 ; 99 11 ; 893 ; 44 ; 598 25 ; 529 ; 39 ; 82 5 ; 146 ; 6 ; 66 12 ; 421 ; 0 ; 102 6 ; 198 ; 0 ; 4 14 ; 143 ; 12 ; 5 28 ; 690 ; 4 ; 133 0 ; 142 ; 20 ; 75 14 ; 1936 ; 17 ; 84 3 ; 228 ; 10 ; 11 34 ; 366 ; 106 ; 265 3 ; 135 ; 6 ; 27 0 ; 202 ; 117 ; 14 0 ; 343 ; 10 ; 424 3 ; 109 ; 32 ; 455 1 ; 232 ; 4 ; 136 5 ; 94 ; 15 ; 59 6 ; 91 ; 13 ; 68 65; 846; 134; 376 0 ; 159 ; 0 ; 83 0 ; 22 ; 0 ; 42 235; 9683; 2272; 4714 Q TS Q / Q T Q ; Avg. dˆV t S W COM Q R Q1R CC; AC; QTS Q / QT Q; Avg. dˆV t S W COM,RET,COM Q R Q1R CC; AC; QTS Q / QT Q; Avg. dˆV t S W 0 / 2 ; 0.0 0 / 2 ; 0.0 0 / 1 ; 0.0 2 / 2 ; 28.0 0 / 2 ; 0.0 7 / 22 ; 11.0 15 / 15 ; 4.7 0 / 2 ; 0.0 0 / 1 ; 0.0 1 / 5 ; 1.0 1 / 1 ; 1.0 0 / 2 ; 0.0 7 / 7 ; 4.0 30 / 30 ; 3.6 0 / 16 ; 0.0 12 / 12 ; 15.0 1 / 67 ; 1.0 23 / 31 ; 4.0 10 / 10 ; 8.0 2 / 3 ; 1.0 1 / 21 ; 1.0 6 / 6 ; 1.0 0 / 1 ; 0.0 2 / 3 ; 2.0 1 / 1 ; 1.0 2 / 2 ; 8.0 2 / 2 ; 1.5 1 / 7 ; 1.0 0 / 8 ; 0.0 126 / 284 0 / 2 ; 0.0 0 / 2 ; 0.0 0 / 1 ; 0.0 1 / 2 ; 35.0 0 / 2 ; 0.0 4 / 22 ; 9.2 15 / 15 ; 4.7 0 / 2 ; 0.0 0 / 1 ; 0.0 0 / 5 ; 0.0 0 / 1 ; 0.0 0 / 2 ; 0.0 7 / 7 ; 3.7 30 / 30 ; 3.8 0 / 16 ; 0.0 0 / 12 ; 0.0 1 / 67 ; 1.0 23 / 31 ; 2.0 10 / 10 ; 15.2 2 / 3 ; 1.0 0 / 21 ; 0.0 0 / 6 ; 0.0 0 / 1 ; 0.0 0 / 3 ; 0.0 0 / 1 ; 0.0 0 / 2 ; 0.0 1 / 2 ; 2.0 1 / 7 ; 1.0 0 / 8 ; 0.0 MC+QC; GC 0 ; 286 ; 66 ; 307 0 ; 25 ; 2 ; 19 0 ; 100 ; 7 ; 321 0 ; 10 ; 86 ; 13 0 ; 387 ; 1 ; 748 0 ; 21 ; 28 ; 71 0 ; 623 ; 1488 ; 0 0 ; 257 ; 11 ; 93 1 ; 923 ; 35 ; 525 6 ; 400 ; 41 ; 62 5 ; 136 ; 6 ; 66 0 ; 430 ; 0 ; 58 0 ; 194 ; 0 ; 4 0 ; 135 ; 12 ; 5 24 ; 682 ; 4 ; 141 0 ; 127 ; 86 ; 9 8 ; 1929 ; 82 ; 20 0 ; 211 ; 10 ; 11 30; 367; 108; 260 0 ; 126 ; 6 ; 26 0 ; 186 ; 119 ; 12 0 ; 321 ; 5 ; 417 0 ; 60 ; 34 ; 453 0 ; 220 ; 6 ; 124 0 ; 93 ; 14 ; 52 0 ; 100 ; 13 ; 55 3 ; 762 ; 97 ; 401 0 ; 158 ; 0 ; 82 0 ; 22 ; 0 ; 42 0 / 2 ; 0.0 0 / 2 ; 0.0 0 / 1 ; 0.0 2 / 2 ; 27.0 0 / 2 ; 0.0 4 / 22 ; 10.8 15 / 15 ; 4.7 0 / 2 ; 0.0 0 / 1 ; 0.0 0 / 5 ; 0.0 1 / 1 ; 1.0 0 / 2 ; 0.0 7 / 7 ; 3.7 30 / 30 ; 3.8 0 / 16 ; 0.0 12 / 12 ; 15.0 1 / 67 ; 1.0 23 / 31 ; 2.0 10 / 10 ; 15.2 2 / 3 ; 1.0 1 / 21 ; 1.0 0 / 6 ; 0.0 0 / 1 ; 0.0 0 / 3 ; 0.0 0 / 1 ; 0.0 0 / 2 ; 0.0 2 / 2 ; 1.5 1 / 7 ; 1.0 0 / 8 ; 0.0 MC+QC; GC 0 ; 179 ; 65 ; 238 0 ; 15 ; 2 ; 20 0 ; 52 ; 10 ; 284 0 ; 1 ; 86 ; 13 0 ; 14 ; 0 ; 736 0 ; 1 ; 21 ; 71 0 ; 0 ; 1488 ; 0 0 ; 41 ; 18 ; 79 0 ; 191 ; 4 ; 218 0 ; 31 ; 30 ; 41 5 ; 20 ; 14 ; 57 0 ; 88 ; 0 ; 57 0 ; 12 ; 0 ; 4 0 ; 3 ; 12 ; 4 24 ; 114 ; 2 ; 132 0 ; 0 ; 86 ; 8 8 ; 192 ; 83 ; 17 0 ; 1 ; 10 ; 11 30 ; 12 ; 64 ; 302 0 ; 15 ; 6 ; 23 0 ; 1 ; 78 ; 53 0 ; 63 ; 60 ; 286 0 ; 24 ; 34 ; 67 0 ; 75 ; 4 ; 70 1 ; 22 ; 15 ; 27 0 ; 13 ; 10 ; 20 0 ; 51 ; 26 ; 46 0 ; 10 ; 0 ; 81 0 ; 0 ; 0 ; 42 95 / 284 77; 9291; 2367; 4397 111 / 284 68; 1241; 2228; 3007 Table 2: Diameter bounding experiments for phase-abstracted GP netlists. 6