ORCA Series 4 I/O Tuning via PLL
August 2002                                                  Technical Note TN1011

Introduction

This technical note describes how to use the Series 4 phase-locked loops (PLLs) to solve several classic timing issues that face FPGA designers. Series 4 FPGAs and FPSCs provide the designer with up to six general-purpose programmable PLLs (PPLLs) capable of operating at speeds of 15-420 MHz.

Note: This technical note assumes the reader’s familiarity with the concepts covered in technical note number TN1014, ORCA Series 4 FPGA PLL Elements.

Description of I/O Timing Issues

Three common timing problems are discussed below:

1. When an FPGA is communicating with an off-chip agent, the designer usually must meet the timing requirements of an interface definition that specifies timing at the FPGA’s pin boundary. Internally, however, timing is defined in terms of relationships between the clock and data signals at the ports of the registers. Because the two sets of timing are defined at different points, the designer must reconcile these two timing domains with one another.

2. A second problem arises because the external interface specification defines input setup and hold requirements without regard to the FPGA’s internal capabilities. Although the external specification may provide an adequate data window (setup + hold), the position of this window relative to the clock’s active edge is often less than optimum from the FPGA’s perspective. It would be helpful if the designer could “borrow” from a loose input setup requirement in order to “lend” to a tight input hold requirement, or vice versa.

3. A third issue involves the interplay between the clock-to-out requirement of the driving device and the input setup specification of the receiving device. Some protocols specify a zero or negative value for input setup (meaning that the data window begins at or after its clock). This makes it easier for the system to avoid “shoot-through” problems, since the driving chip can’t change output data until after its reference clock edge occurs. Conversely, a protocol can specify a negative value for clock-to-out, meaning that the driving device must begin sending valid data before it receives the associated input clock edge.

All of these problems can be addressed by conditioning the clock with a PLL.

What a PLL can do:
• Null out clock tree delay or otherwise shift the clock to adjust its delay
• Provide clock phase shifting in increments of 1/8 of a clock period
• Perform clock frequency multiplication/division (not discussed here)
• Perform clock duty cycle conditioning (not discussed here)

What a PLL cannot do:
• Handle clocks having a varying frequency
• Handle clocks having a frequency outside prescribed limits
• Handle clocks that stop
• Null out net delay of the portion of the clock net from the device’s clock input pin to the PLL input (although an equivalent value can be nulled out, as outlined below)
• Perform two or all of the following in a single PLL:
  – Frequency multiplication/division
  – Phase shifting between clock outputs MCLK and NCLK
  – Duty cycle conditioning

Lattice Semiconductor    www.latticesemi.com    tn1011_02

Solution #1: Reconciling Internal Timing to an External Specification

Case I: PLL With Internal Feedback

A PLL can be used to “null out” the delay introduced by a clock net. In so doing, the clock loads on that net are effectively brought closer in timing to the device’s external clock input pin, allowing timing to be referenced to that pin. This section explains how that is accomplished.

Figure 1 illustrates a clock net that includes a PLL. The PLL is in Delay Mode and the FB (feedback) input is driven by the PLL’s INTFB (internal feedback) output. The resulting timing is shown in Figure 2, which contains an excerpt from the actual Trace report (*.twr) of this design.
Table 1 correlates the paths in Figure 1 with the delays in Figure 2. When the feedback is internal, only a small portion of the PLL’s delay is nulled out, and the resultant delay [A to EX] is large.

Figure 1. Clock Tree with PLL, No Nulling
[Schematic: clock input pin A drives the input buffer BUF (output B), which is routed to the PPLL CLKIN input (C). The PPLL MCLK output (D) drives the clock tree, whose loads are the CLK inputs of registers E1 and E2. The PPLL INTFB output (H) drives the PPLL FB input (G).]

Table 1. Correlation of Schematic and Trace Reports

Path in Figure 1    Line in Figure 2    Delay
A to B              Line 130            1.480 ns
B to C              Line 131            1.730 ns
C to D              Line 132            0.385 ns
D to EX             Line 133            2.863 ns
C to H              Line 147            0.000 ns
H to G              Line 148            0.097 ns

Figure 2. PLL with Internal Feedback

Case I: Trace Report, PLL with Internal Feedback (Refer to Figure 1)
(The report spans lines 101 to 155; the numbers at left mark the rows cited in Table 1 and in the notes.)

      ================================================================================
      Preference: CLOCK_TO_OUT PORT "d_out" 7.000000 ns CLKNET "clk_c" ;
                  1 item scored, 1 timing error detected.
      --------------------------------------------------------------------------------

      Error: The following path exceeds requirements by 4.657ns

       Logical Details:  Cell type  Pin type  Cell name  (clock net +/-)

         Source:         IO-FF Out  Q         d_out_0io  (from mclk +)
         Destination:    Port       Pad       d_out

         Data Path Delay:    5.296ns (100.0% logic, 0.0% route), 1 logic levels.
         Clock Path Delay:   6.458ns (28.9% logic, 71.1% route), 2 logic levels.

       Constraint Details:

            6.458ns delay clk to d_out less
121:        0.097ns feedback compensation
122:        5.296ns delay d_out to d_out (totaling 11.657ns) exceeds
            7.000ns offset clk to d_out by 4.657ns

       Physical Path Details:

         Clock path clk to d_out:

            Name        Fanout  Delay (ns)  Site                          Resource
130:        IN_DEL      ---     1.480       C6.PAD to C6.INDD             clk
131:        ROUTE       2       1.730       C6.INDD to ULPPLL.CLKIN       clk_c
132:        MCLK_DEL    ---     0.385       ULPPLL.CLKIN to ULPPLL.MCLK   pll_macro_inst/pll_macro_0_0
133:        ROUTE       1       2.863       ULPPLL.MCLK to E8.SC          mclk
                                --------
                                6.458       (28.9% logic, 71.1% route), 2 logic levels.
         Data path d_out to d_out:

            Name        Fanout  Delay (ns)  Site                          Resource
            OUTREGSL_D  ---     5.296       E8.SC to E8.PAD               d_out (from mclk)
                                --------
                                5.296       (100.0% logic, 0.0% route), 1 logic levels.

         Feedback path:

            Name        Fanout  Delay (ns)  Site                          Resource
147:        INTFB_DEL   ---     0.000       ULPPLL.CLKIN to ULPPLL.INTFB  pll_macro_inst/pll_macro_0_0
148:        ROUTE       1       0.097       ULPPLL.INTFB to ULPPLL.FB     pll_macro_inst/fb
                                --------
                                0.097       (0.0% logic, 100.0% route), 1 logic levels.

      Warning: 11.657ns is the minimum offset for this preference.

      1 preference not met.

Notes:
• The clock frequency is 100.000 MHz (10.000 ns period).
• The feedback compensation of 0.097 ns, shown in line 121, is the delay of the PLL (actually only a portion of it, since the internal feedback path is faster than the MCLK and NCLK outputs).
• The clock-to-out delay of 11.657 ns, shown in line 122, exceeds (fails) the specified 7.000 ns requirement.

Case II: PLL With External Feedback From MCLK

Figure 3 illustrates the same circuit as Figure 1, but with the PLL’s FB input driven by the output of the clock tree (i.e., the PLL’s FB input is just another load on the clock tree). In this case, the PLL nulls out the delay through the PLL [C to D], as well as the clock tree itself [D to EX]. This represents the best that can be achieved by direct nulling, since the PLL can only null out delay that is injected after the PLL’s input ports. Figure 4 shows the actual timing from a Trace run. Table 2 correlates the paths in Figure 3 with the delays in Figure 4.

IMPORTANT: refer to item #4 under the section Tips For Successful PLL Usage for information on ensuring that the correct delay is nulled out.

Figure 3. Clock Tree with PLL, PLL Nulled Out
[Schematic: same circuit as Figure 1, except that the PPLL FB input (G) is driven from the clock tree fed by the MCLK output (D) rather than from INTFB.]

Table 2.
Correlation of Schematic and Trace Report

Path in Figure 3    Line in Figure 4    Delay
A to B              Line 230            1.480 ns
B to C              Line 231            1.730 ns
C to D              Line 232            0.385 ns
D to EX             Line 233            2.863 ns
C to D              Line 247            0.385 ns
D to G              Line 248            2.878 ns

Figure 4. PLL with External Feedback from MCLK

Case II: Trace Report, PLL with External Feedback from MCLK (Refer to Figure 3)
(The report spans lines 201 to 255; the numbers at left mark the rows cited in Table 2 and in the notes.)

      ================================================================================
      Preference: CLOCK_TO_OUT PORT "d_out" 7.000000 ns CLKNET "clk_c" ;
                  1 item scored, 1 timing error detected.
      --------------------------------------------------------------------------------

      Error: The following path exceeds requirements by 1.491ns

       Logical Details:  Cell type  Pin type  Cell name  (clock net +/-)

         Source:         IO-FF Out  Q         d_out_0io  (from mclk +)
         Destination:    Port       Pad       d_out

         Data Path Delay:    5.296ns (100.0% logic, 0.0% route), 1 logic levels.
         Clock Path Delay:   6.458ns (28.9% logic, 71.1% route), 2 logic levels.

       Constraint Details:

            6.458ns delay clk to d_out less
221:        3.263ns feedback compensation
222:        5.296ns delay d_out to d_out (totaling 8.491ns) exceeds
            7.000ns offset clk to d_out by 1.491ns

       Physical Path Details:

         Clock path clk to d_out:

            Name        Fanout  Delay (ns)  Site                          Resource
230:        IN_DEL      ---     1.480       C6.PAD to C6.INDD             clk
231:        ROUTE       2       1.730       C6.INDD to ULPPLL.CLKIN       clk_c
232:        MCLK_DEL    ---     0.385       ULPPLL.CLKIN to ULPPLL.MCLK   pll_macro_inst/pll_macro_0_0
233:        ROUTE       2       2.863       ULPPLL.MCLK to E8.SC          mclk
                                --------
                                6.458       (28.9% logic, 71.1% route), 2 logic levels.

         Data path d_out to d_out:

            Name        Fanout  Delay (ns)  Site                          Resource
            OUTREGSL_D  ---     5.296       E8.SC to E8.PAD               d_out (from mclk)
                                --------
                                5.296       (100.0% logic, 0.0% route), 1 logic levels.
         Feedback path:

            Name        Fanout  Delay (ns)  Site                          Resource
247:        MCLK_DEL    ---     0.385       ULPPLL.CLKIN to ULPPLL.MCLK   pll_macro_inst/pll_macro_0_0
248:        ROUTE       2       2.878       ULPPLL.MCLK to ULPPLL.FB      mclk
                                --------
                                3.263       (11.8% logic, 88.2% route), 1 logic levels.

      Warning: 8.491ns is the minimum offset for this preference.

      1 preference not met.

Notes:
• The clock frequency is 100.000 MHz (10.000 ns period).
• The feedback compensation of 3.263 ns, shown in line 221, is the delay from the input of the PLL to the outputs of the clock tree.
• The clock-to-out delay of 8.491 ns, shown in line 222, exceeds (fails) the specified 7.000 ns requirement.

Case III: PLL With External Feedback Through a PIO

It is possible to achieve a closer approximation to the goal of nulling out the entire clock tree than was achieved in Case II. Figure 5 illustrates a technique for accomplishing this. Here, the PLL’s FB input is driven by a buffer that is in turn driven by the clock tree. The key assumption is that the buffer’s delay [E3 to F] plus routing [F to G] approximates the combined delay of the clock’s input buffer [A to B] and associated routing [B to C]. Thus, when the PLL nulls out [C to D], [D to E3], [E3 to F] and [F to G], it is approximately the same as nulling out the entire clock tree [A to B], [B to C], [C to D] and [D to EX]. Figure 6 shows the actual timing from a Trace run. Table 3 correlates the paths in Figure 5 with the delays in Figure 6.

Figure 5. Clock Tree with PLL, Clock Tree Nulled Out
[Schematic: same circuit as Figure 3, except that the clock tree also drives a buffer BUF (input E3, output F), and the buffer output is routed to the PPLL FB input (G).]

Table 3. Correlation of Schematic and Trace Report

Path in Figure 5    Line in Figure 6    Delay
A to B              Line 330            1.480 ns
B to C              Line 331            1.730 ns
C to D              Line 332            0.385 ns
D to E1             Line 333            2.891 ns
C to D              Line 347            0.385 ns
D to E3             Line 348            1.700 ns
E3 to F             Line 349            0.312 ns
F to G              Line 350            1.273 ns

Figure 6.
PLL with External Feedback through a PIO

Case III: Trace Report, PLL with External Feedback from Clock Tree Through a Buffer (Refer to Figure 5)
(The report spans lines 301 to 357; the numbers at left mark the rows cited in Table 3 and in the notes.)

      ================================================================================
      Preference: CLOCK_TO_OUT PORT "d_out" 7.000000 ns CLKNET "clk_c" ;
                  1 item scored, 1 timing error detected.
      --------------------------------------------------------------------------------

      Error: The following path exceeds requirements by 1.112ns

       Logical Details:  Cell type  Pin type  Cell name  (clock net +/-)

         Source:         IO-FF Out  Q         d_out_0io  (from mclk +)
         Destination:    Port       Pad       d_out

         Data Path Delay:    5.296ns (100.0% logic, 0.0% route), 1 logic levels.
         Clock Path Delay:   6.486ns (28.8% logic, 71.2% route), 2 logic levels.

       Constraint Details:

            6.486ns delay clk to d_out less
321:        3.670ns feedback compensation
322:        5.296ns delay d_out to d_out (totaling 8.112ns) exceeds
            7.000ns offset clk to d_out by 1.112ns

       Physical Path Details:

         Clock path clk to d_out:

            Name        Fanout  Delay (ns)  Site                              Resource
330:        IN_DEL      ---     1.480       C6.PAD to C6.INDD                 clk
331:        ROUTE       2       1.730       C6.INDD to ULPPLL.CLKIN           clk_c
332:        MCLK_DEL    ---     0.385       ULPPLL.CLKIN to ULPPLL.MCLK       pll_macro_inst/pll_macro_0_0
333:        ROUTE       2       2.891       ULPPLL.MCLK to D6.SC              mclk
                                --------
                                6.486       (28.8% logic, 71.2% route), 2 logic levels.

         Data path d_out to d_out:

            Name        Fanout  Delay (ns)  Site                              Resource
            OUTREGSL_D  ---     5.296       D6.SC to D6.PAD                   d_out (from mclk)
                                --------
                                5.296       (100.0% logic, 0.0% route), 1 logic levels.

         Feedback path:

            Name        Fanout  Delay (ns)  Site                              Resource
347:        MCLK_DEL    ---     0.385       ULPPLL.CLKIN to ULPPLL.MCLK       pll_macro_inst/pll_macro_0_0
348:        ROUTE       2       1.700       ULPPLL.MCLK to SLIC_R4C2.SIN0     mclk
349:        BUF_DEL     ---     0.312       SLIC_R4C2.SIN0 to SLIC_R4C2.SOUT0 SLIC_0
350:        ROUTE       1       1.273       SLIC_R4C2.SOUT0 to ULPPLL.FB      mclk_d
                                --------
                                3.670       (19.0% logic, 81.0% route), 2 logic levels.

      Warning: 8.112ns is the minimum offset for this preference.

      1 preference not met.
Notes:
• The clock frequency is 100.000 MHz (10.000 ns period).
• The feedback compensation of 3.670 ns, shown in line 321, is the delay from the input to the PLL to the outputs of the clock tree, plus the delay to and through a tristate buffer, which altogether approximates the total delay from the clock input pin to the output of the clock tree.
• The clock-to-out delay of 8.112 ns, shown in line 322, exceeds (fails) the specified 7.000 ns requirement.

Solution #2: Adjusting Input Setup and Hold Times to Match External Constraints

Frequently, a designer will find that an external timing specification allows for a generous data sampling window (tW = tS + tH = input setup time + input hold time) but that one of the two components, either the setup or the hold time, is too small to achieve. Figure 7 illustrates an example of this. Here, the external specification defines the clock frequency to be 100 MHz, the input setup time to be 3.75 ns, and the input hold time to be -1.25 ns (a negative hold time means that the data sampling window ends before its corresponding clock edge occurs). Thus the sampling window is large (tW = tS + tH = 3.75 ns + (-1.25 ns) = 2.50 ns), larger than the sampling window for a typical ORCA Series 4 register (for our example, tW = tS + tH = 0.25 ns + 0.01 ns = 0.26 ns). Nevertheless, the external specification can’t be met, because the register’s sampling window does not fall inside the sampling window of the specification. Here, a PLL can be employed to shift the clock edge the register sees, so that the incoming data covers the register’s sampling window.

Figure 7.
Clock/Data Timing Relationships
[Timing diagram: time axis from -5.00 ns to 1.25 ns in 1.25 ns steps, with the active clk edge at 0.00 ns. The traces and their annotated values are listed below.]

Trace                          Setup            Hold              Window
external data timing           tS = 3.75 ns     tH = -1.25 ns     tW = 2.50 ns
register timing (original)     tS1 = 0.25 ns    tH1 = 0.01 ns     tW1 = 0.26 ns
register timing (adjusted)     tS2 = 2.75 ns    tH2 = -2.49 ns    tW2 = 0.26 ns

tS = input setup time
tH = input hold time
tW = data sampling window

In this case, we need to move the clock tree output earlier by 2.50 ns (tS2 - tS1 = 2.75 ns - 0.25 ns), so that it will transition in the middle of the data. This shift is shown in the bottom two traces of Figure 7 as the effective shift in the data sampling window at the device’s pin interface. Note that after the shift, the register’s sampling window is comfortably inside the external specification’s sampling window.

There are two methods that can be employed to shift the clock. The first is to adjust the delay that exists in the feedback path to the PLL, as was done in the previous section. As delay is inserted in the feedback path, the clock output of the clock tree shifts earlier in time. The resulting shift does not depend on clock frequency, but it does depend on device propagation delay characteristics. As such, it will vary with supply voltage, device temperature and speed grade. This was beneficial in the previous example, since it caused the shift to track the delay that it was nulling. In this example, the desired shift is a fixed 2.50 ns; therefore we will find the second method more appropriate.

The second method is to shift the clock using the PLL’s phase shift mode. Here, the shift is specified as a fraction of a clock period. The PLL will continuously and dynamically adjust for variables such as supply voltage, device temperature and speed grade.

For our example, we will use the circuit of Figure 8, which is a modification of the circuit in Figure 5.
The modification is necessary because, if the phase-shifted output of MCLK were fed back, the phase adjustment would be nulled out. The circuit in Figure 8 feeds back the NCLK output of the PLL, which is not phase-shifted in PHSIFT Mode (caution: both outputs are phase-shifted in DELAY Mode).

To determine the phase adjustment required, the desired shift (2.5 ns) is divided by the period of the 100 MHz clock (10 ns), resulting in a required phase shift of 1/4 of a period. We need to shift the clock earlier by 1/4 period (90º), but the phase shifts that are listed for the PLL shift the output later. Therefore, the specified phase shift would actually be 3/4 period (360º - 90º = 270º), which corresponds to VCOTAP = 6. Refer to technical note number TN1014, ORCA Series 4 FPGA PLL Elements, for information on using the PLL_PHASE_BACK attribute in the preference files in this situation.

Figure 8. Clock Tree with PLL, Delay and Phase Adjusted
[Schematic: same clock path as Figure 5 (A, B, C, D, registers E1 and E2 on the clock tree fed by MCLK), except that the PPLL NCLK output (J) drives the buffer BUF (input E3, output F), whose output is routed to the PPLL FB input (G). PPLL settings: Mode = PHSIFT, VCOTAP = 6.]

Figure 9. PLL Phase Adjusted by +315º

Case IV: Trace Report, PLL with External Feedback from NCLK through a Buffer (Refer to Figure 8)
(The report spans lines 401 to 457; the numbers at left mark the rows cited in Table 4 and in the notes.)

      ================================================================================
      Preference: CLOCK_TO_OUT PORT "d_out" 7.000000 ns CLKNET "clk_c" ;
                  1 item scored, 1 timing error detected.
      --------------------------------------------------------------------------------

      Error: The following path exceeds requirements by 9.902ns

       Logical Details:  Cell type  Pin type  Cell name  (clock net +/-)

         Source:         IO-FF Out  Q         d_out_0io  (from mclk +)
         Destination:    Port       Pad       d_out

         Data Path Delay:    5.296ns (100.0% logic, 0.0% route), 1 logic levels.
         Clock Path Delay:   14.823ns (69.0% logic, 31.0% route), 2 logic levels.
       Constraint Details:

420:       14.823ns delay clk to d_out less
421:        3.217ns feedback compensation
422:        5.296ns delay d_out to d_out (totaling 16.902ns) exceeds
            7.000ns offset clk to d_out by 9.902ns

       Physical Path Details:

         Clock path clk to d_out:

            Name        Fanout  Delay (ns)  Site                              Resource
430:        IN_DEL      ---     1.480       C6.PAD to C6.INDD                 clk
431:        ROUTE       2       1.730       C6.INDD to ULPPLL.CLKIN           clk_c
432:        MCLK_DEL    ---     8.750       ULPPLL.CLKIN to ULPPLL.MCLK       pll_macro_inst/pll_macro_0_0
433:        ROUTE       1       2.863       ULPPLL.MCLK to E8.SC              mclk
                                --------
435:                            14.823      (69.0% logic, 31.0% route), 2 logic levels.

         Data path d_out to d_out:

            Name        Fanout  Delay (ns)  Site                              Resource
            OUTREGSL_D  ---     5.296       E8.SC to E8.PAD                   d_out (from mclk)
                                --------
                                5.296       (100.0% logic, 0.0% route), 1 logic levels.

         Feedback path:

            Name        Fanout  Delay (ns)  Site                              Resource
447:        NCLK_DEL    ---     0.000       ULPPLL.CLKIN to ULPPLL.NCLK       pll_macro_inst/pll_macro_0_0
448:        ROUTE       1       1.632       ULPPLL.NCLK to SLIC_R4C2.SIN0     nclk
449:        BUF_DEL     ---     0.312       SLIC_R4C2.SIN0 to SLIC_R4C2.SOUT0 SLIC_0
450:        ROUTE       1       1.273       SLIC_R4C2.SOUT0 to ULPPLL.FB      nclk_d
                                --------
                                3.217       (9.7% logic, 90.3% route), 2 logic levels.

      Warning: 16.902ns is the minimum offset for this preference.

      1 preference not met.

Notes:
• The clock frequency is 100.000 MHz (10.000 ns period).
• The feedback compensation of 3.217 ns, shown in line 421, is the delay from the input to the PLL, out the PLL’s NCLK output, and through a tristate buffer, which altogether approximates the total delay from the clock input pin to the output of the clock tree.
• The clock-to-out delay of 16.902 ns, shown in line 422, exceeds (fails) the specified 7.000 ns requirement.
• The 14.823 ns (delay from clk to d_out) shown in line 420 is taken directly from line 435, since there is no PLL_PHASE_BACK attribute on this CLOCK_TO_OUT preference (compare with the same lines in Figure 11).
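The arithmetic behind the Constraint Details of the Trace reports, and the VCOTAP selection described for the phase shift mode, can be checked with a short Python sketch. This is only an illustration and is not part of the Lattice tools: the function names are hypothetical, the rule that VCOTAP = n delays the output by n/8 of a period is inferred from the two settings quoted in this note (VCOTAP = 6 for 270º, VCOTAP = 7 for 315º), and PLL_PHASE_BACK is modeled simply as subtracting one clock period, as the Figure 11 notes describe.

```python
from decimal import Decimal
from fractions import Fraction


def min_offset_ns(clock_path, feedback_comp, data_path, period=None, phase_back=False):
    """Minimum clock-to-out offset, following the Constraint Details arithmetic.

    offset = clock path delay - feedback compensation + data path delay
    Values are passed as strings so Decimal keeps them exact (no float rounding).
    With phase_back=True, one clock period is first subtracted from the clock
    path delay (assumed model of the PLL_PHASE_BACK attribute).
    """
    clk = Decimal(clock_path)
    if phase_back:
        if period is None:
            raise ValueError("period is required when phase_back is set")
        clk -= Decimal(period)
    return clk - Decimal(feedback_comp) + Decimal(data_path)


def vcotap_for_earlier_shift(shift_ns, period_ns):
    """VCOTAP setting that moves the clock EARLIER by shift_ns.

    Assumes taps step in 1/8-period increments and that each tap shifts the
    output later, so an earlier shift of k/8 period is requested as (8 - k)/8.
    """
    eighths = Fraction(shift_ns) / Fraction(period_ns) * 8
    if eighths.denominator != 1 or not 0 <= eighths < 8:
        raise ValueError("shift must be a whole number of 1/8 periods, below one period")
    return (8 - int(eighths)) % 8


if __name__ == "__main__":
    # Figures taken from the Trace reports for Cases I through V.
    print(min_offset_ns("6.458", "0.097", "5.296"))    # Case I:   11.657
    print(min_offset_ns("6.458", "3.263", "5.296"))    # Case II:  8.491
    print(min_offset_ns("6.486", "3.670", "5.296"))    # Case III: 8.112
    print(min_offset_ns("14.823", "3.217", "5.296"))   # Case IV:  16.902
    print(min_offset_ns("14.823", "3.217", "5.296",
                        period="10.000", phase_back=True))  # Case V: 6.902
    print(vcotap_for_earlier_shift("2.5", "10"))       # 90 degrees earlier: 6
    print(vcotap_for_earlier_shift("1.25", "10"))      # 45 degrees earlier: 7
```

Comparing each result with the 7.000 ns requirement reproduces the pass/fail outcome of the corresponding report: only the Case V figure, which includes the one-period correction, falls below it.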
Solution #3: Matching Clock-to-Out of Driver Device with Input Setup of Receiver Device

Another common design problem involves the need to provide maximum time for a signal to pass from its driver device to its receiver. If the clock-to-out time tP can be reduced, the impact on inter-device delay can be minimized. Once again, the PLL can perform this job. Just as the PLL “borrowed” from the input hold time in order to provide greater input setup time in the previous example, here a PLL allows the driver device to “borrow” from the input setup time of its output register in order to reduce tP on that register’s output.

Refer to Figure 10. If the delay from C to D is large so that the input setup time at D is not met, but the delay from A to B is small so that the input setup time at B is met with time to spare, the PLL in the driving device can be configured to shift the clock at F earlier with respect to the clocks at E and G. As in the above examples, the clock can be shifted earlier either by adding delay to the PLL’s FB input or by using the PLL in Phase Shift mode. Once again, the shifts are negative, so a shift of 1/8 of a clock period (45º) requires a VCOTAP setting of 7 (315º).

This clock adjustment technique is illustrated in the Trace runs in Figure 9 (no PLL_PHASE_BACK attribute and therefore +315º phase shift) and Figure 11 (PLL_PHASE_BACK attribute and therefore -45º phase shift). Table 4 correlates the paths in Figure 8 with the delays in Figure 9 and Figure 11.

Figure 10. Phase Adjustment to Compensate for Large Path Delays
[Block diagram: a driving device and a receiving device share the system clock. The labeled points are the data path A to B (through logic) in the driving device, the inter-device path C to D, and the clocks at E, F and G; a PLL in the driving device supplies the clock at F.]

Table 4.
Correlation of Schematic and Trace Reports

Path in Figure 8    Line in Figure 9    Line in Figure 11    Delay
A to B              Line 430            Line 530             1.480 ns
B to C              Line 431            Line 531             1.730 ns
C to D              Line 432            Line 532             8.750 ns
D to E              Line 433            Line 533             2.863 ns
C to J              Line 447            Line 547             0.000 ns
J to E3             Line 448            Line 548             1.632 ns
E3 to F             Line 449            Line 549             0.312 ns
F to G              Line 450            Line 550             1.273 ns

Figure 11. PLL Phase Adjusted by -45º

Case V: Trace Report, PLL with External Feedback from NCLK through a Buffer (Refer to Figure 8)
(The report spans lines 501 to 557; the numbers at left mark the rows cited in Table 4 and in the notes.)

      ================================================================================
      Preference: CLOCK_TO_OUT PORT "d_out" 7.000000 ns CLKNET "clk_c" PLL_PHASE_BACK ;
                  1 item scored, 0 timing errors detected.
      --------------------------------------------------------------------------------

      Passed: The following path meets requirements by 0.098ns

       Logical Details:  Cell type  Pin type  Cell name  (clock net +/-)

         Source:         IO-FF Out  Q         d_out_0io  (from mclk +)
         Destination:    Port       Pad       d_out

         Data Path Delay:    5.296ns (100.0% logic, 0.0% route), 1 logic levels.
         Clock Path Delay:   14.823ns (69.0% logic, 31.0% route), 2 logic levels.

       Constraint Details:

520:        4.823ns delay clk to d_out less
521:        3.217ns feedback compensation
522:        5.296ns delay d_out to d_out (totaling 6.902ns) meets
            7.000ns offset clk to d_out by 0.098ns

       Physical Path Details:

         Clock path clk to d_out:

            Name        Fanout  Delay (ns)  Site                              Resource
530:        IN_DEL      ---     1.480       C6.PAD to C6.INDD                 clk
531:        ROUTE       2       1.730       C6.INDD to ULPPLL.CLKIN           clk_c
532:        MCLK_DEL    ---     8.750       ULPPLL.CLKIN to ULPPLL.MCLK       pll_macro_inst/pll_macro_0_0
533:        ROUTE       1       2.863       ULPPLL.MCLK to E8.SC              mclk
                                --------
535:                            14.823      (69.0% logic, 31.0% route), 2 logic levels.

         Data path d_out to d_out:

            Name        Fanout  Delay (ns)  Site                              Resource
            OUTREGSL_D  ---     5.296       E8.SC to E8.PAD                   d_out (from mclk)
                                --------
                                5.296       (100.0% logic, 0.0% route), 1 logic levels.
         Feedback path:

            Name        Fanout  Delay (ns)  Site                              Resource
547:        NCLK_DEL    ---     0.000       ULPPLL.CLKIN to ULPPLL.NCLK       pll_macro_inst/pll_macro_0_0
548:        ROUTE       1       1.632       ULPPLL.NCLK to SLIC_R4C2.SIN0     nclk
549:        BUF_DEL     ---     0.312       SLIC_R4C2.SIN0 to SLIC_R4C2.SOUT0 SLIC_0
550:        ROUTE       1       1.273       SLIC_R4C2.SOUT0 to ULPPLL.FB      nclk_d
                                --------
                                3.217       (9.7% logic, 90.3% route), 2 logic levels.

      Report: 6.902ns is the minimum offset for this preference.

      All preferences were met.

Notes:
• The clock frequency is 100.000 MHz (10.000 ns period).
• The feedback compensation of 3.217 ns, shown in line 521, is the delay from the input to the PLL, out the PLL’s NCLK output, and through a tristate buffer, which altogether approximates the total delay from the clock input pin to the output of the clock tree.
• The clock-to-out delay of 6.902 ns, shown in line 522, meets (passes) the specified 7.000 ns requirement.
• The 4.823 ns (delay from clk to d_out) shown in line 520 is equal to the 14.823 ns from line 535 minus the 10.000 ns clock period, since there is a PLL_PHASE_BACK attribute on this CLOCK_TO_OUT preference (compare with the same lines in Figure 9).

Summary

Table 5 summarizes the five cases examined, and lists the effects of the configurations on the feedback compensation and clock-to-out delay.

Table 5. Summary of Cases I Through V

Case  Schematic  Trace Report  Description                           Feedback Compensation (ns)  Clock-to-Out Delay (ns)
I     Figure 1   Figure 2      PLL with internal feedback            0.097                       11.657
II    Figure 3   Figure 4      PLL with clock tree feedback          3.263                       8.491
III   Figure 5   Figure 6      Case II, with added buffer (1)        3.670                       8.112
IV    Figure 8   Figure 9      Case III, with 315º phase shift (2)   3.217                       16.902
V     Figure 8   Figure 11     Case III, with -45º phase shift (2)   3.217                       6.902

1. Case III feeds the output of the clock tree through a buffer to the PLL’s FB input.
2. Cases IV and V feed the PLL’s NCLK output through a buffer to the PLL’s FB input (the clock tree is fed by the PLL’s MCLK output). This is done because, if the actual clock tree were used, the phase shift would be nulled out.

Tips For Successful PLL Usage:

1. Always use the ispLEVER Module/IP Manager HDL generator to produce the code for a PLL. The ispLEVER Module/IP Manager generator will ensure that timing requirements are met and calculate the proper values for all parameters.

2. The ispLEVER Module/IP Manager HDL generator cannot presently generate designs that utilize external feedback. To generate an externally linked PLL, use ispLEVER Module/IP Manager to generate the equivalent design with internal feedback, and then modify the output to utilize external feedback. The necessary modifications involve adding ports or logic to the module for FB (and INTFB if needed), and connecting them to the corresponding ports on the PPLL element that is instantiated within the module. Reference #2 provides additional information to assist in this modification.

3. When using a PLL to alter frequency, keep in mind that the frequencies at all points in the PLL must remain within the PLL’s range of operation. This means that, in addition to the input and output signals, the feedback and some internal signals must also remain in range. Once again, ispLEVER Module/IP Manager will ensure that these requirements are met.

4. When nulling out the clock tree by feeding the clock back into the PPLL’s FB input through a buffer, be aware that ispLEVER Project Navigator will use any available copy of the clock net to feed the buffer. It does not recognize the input to the buffer as a clock load, so it makes no attempt to minimize skew on that connection. Therefore, after automatic routing is complete, check the buffer’s input to make sure that it is connected as a “leaf” on the clock tree, similar to the way any other clock load is connected. Otherwise, the delay of the clock tree will not be properly represented.

5. If clocks are shifted in phase relation to each other, it may be necessary to place multi-cycle constraints on the clocks in order to properly model the intended behavior.

6. Warning: exercise extreme caution if you are using the PLL to insert large phase shifts into your design in order to accommodate paths exhibiting large delay. Bear in mind that delay values vary widely with voltage, temperature and device speed. If a path has a worst-case (greatest) delay of more than one clock period, it is still very likely that the best-case (least) delay for that path is very small. If the receiver’s clock is shifted later, the signal may shortpath (“shoot-through”) and the design may fail to operate over the full range of conditions.

7. The TRACE program in ispLEVER Project Navigator can be used to verify that input hold time requirements are being met; however, a separate Trace run must be made that specifies the “-hld” option, or “check hold times” from the ispLEVER Project Navigator. This run must be made in addition to the regular run, since it checks only hold times (shortpaths) and not setup times (longpaths). Place preferences on all affected inputs, using one of the following formats:

   INPUT_SETUP port_name time_spec HOLD time_spec CLK_NET clk_netname ;

   or

   INPUT port_name SETUP time_spec HOLD time_spec CLK_NET clk_netname ;

   Be sure to evaluate your hold-time requirements early in the design cycle.

8. When using a PPLL for phase shifting, the phase-shifted output cannot be used as the PPLL’s feedback input, since the PPLL would then dutifully null out the inserted phase shift. Either feed back the internal feedback output (INTFB), or, if additional delay compensation is desired, use the NCLK output to drive a delay network and then drive the PPLL’s FB input with the output of that delay network.

References

1. ORCA Series 4 FPGAs Data Sheet
2. ORCA Series 4 FPGA PLL Elements Technical Note (technical note number TN1014)

Technical Support Assistance

Hotline: 1-800-LATTICE (Domestic)
         1-408-826-6002 (International)
e-mail: [email protected]