

Advanced Strategies and Recipes for High-Speed Non-Rectilinear Partition Convergence

See Eng Heng, Wong Ji Kai, Intel Corporation

#### Challenges in High-Speed Non-Rectilinear Partition Convergence

- High-speed inter-block timing convergence within a tight timing window
- Routing congestion caused by an extreme aspect ratio and a non-rectilinear shape
- Timing convergence for a high-speed module with a 2 GHz clock
- Timing correlation from the placement database to the post-route database (timing degrades throughout the R2G implementation due to routing congestion, noise, etc.)



Figure 1: A non-rectilinear partition with an extreme aspect ratio




#### Convergence Strategy and Recipe

Seven key strategies to enable convergence by construction.

## 1. Partition Interface Port Placement Optimization



- Group the interface ports by functionality, criticality, or specific interface requirements.
- Grouping mitigates timing challenges, reduces the risk of overlooking crucial paths, and allows partition interface optimization strategies to be carried out effectively.
- Grouped ports make the high-speed frequency target easier to meet.
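The grouping idea above can be sketched in a few lines. This is a minimal illustration, not the authors' flow: the port names, the slack values, and the `group_ports` helper are all hypothetical, and grouping by bus name is only one possible notion of "functionality".

```python
import re
from collections import defaultdict

def group_ports(ports):
    """Group interface ports by bus name, most critical group first.

    `ports` maps a port name such as "req_data[3]" to its slack in ns
    (hypothetical input; a real flow would read this from a timing report).
    """
    groups = defaultdict(list)
    for name in ports:
        bus = re.sub(r"\[\d+\]$", "", name)  # strip a trailing bit index
        groups[bus].append(name)
    # Order the groups by their worst (smallest) slack so critical buses come first.
    return sorted(groups.items(), key=lambda kv: min(ports[p] for p in kv[1]))

# Hypothetical ports and slacks, for illustration only.
ports = {"req_data[0]": -0.02, "req_data[1]": 0.01, "ack": 0.05, "cfg[0]": 0.20}
for bus, members in group_ports(ports):
    print(bus, members)
```

Ordering the groups by worst slack is a design choice for this sketch; it simply makes the most critical bus the first one to receive a port location.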


## 2. Partition Interface Flop Bounding





Figure 3: Interface Flop Bounding

- Hard-bound the interface flops at a desired distance from the interface ports.
- Interface timing constraints alone may not be satisfied, especially when internal paths have a large WNS.
- Bounding keeps the interface clean by construction at full-chip integration.
- It is most effective when the inter-partition timing window is tight.
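As a rough illustration of what "a desired distance from the interface ports" means geometrically, the sketch below derives a bound rectangle around a port. The `interface_bound` helper, the coordinates, and the 15-unit distance are hypothetical, and the partition is approximated by a plain bounding box, which is a simplification for a non-rectilinear shape.

```python
def interface_bound(port_xy, distance, partition_bbox):
    """Return (llx, lly, urx, ury) of a square bound centered on the port,
    clipped to the partition bounding box. Units are arbitrary (e.g. microns)."""
    px, py = port_xy
    llx, lly, urx, ury = partition_bbox
    return (
        max(llx, px - distance),
        max(lly, py - distance),
        min(urx, px + distance),
        min(ury, py + distance),
    )

# Hypothetical port on the left edge of a wide, short partition.
print(interface_bound((0.0, 40.0), 15.0, (0.0, 0.0, 500.0, 60.0)))
# (0.0, 25.0, 15.0, 55.0)
```

The resulting rectangle would then be handed to whatever hard-bound mechanism the implementation tool provides.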

## 3. Placement $\rightarrow$ PostRoute Timing Correlation Recipe

- **Placement Stage** (Initial Placement & Incremental Placement Optimization): Set Signal Max Routing layer to M11
- **CTS Stage** (CTS & Clock Routing Optimization): Set Signal Max CTS layer M8-M13
- **Post CTS and Routing Stage** (Post CTS Routing & Optimization Stage): Set Signal Max Routing layer to M13; Enable layer promoting

Figure 6: Recipe to enable correlation from placement to post routing stage

- The compile stage is unrealistically optimistic about the routing resources available on the higher metal layers.
- Clock routes take up some of the higher routing layers, causing miscorrelation between pre-CTS and post-CTS delay estimation.
- Disabling some routing layers before CTS adds pessimism, which makes the tool more realistic and increases optimization effort.
- Keep 2 layers in reserve for layer promotion at a later stage of optimization.
- After clock tree synthesis, enable the maximum routing layer and allow layer promotion to match pre-CTS timing quality and to recover the RC degradation caused by congestion.
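The layer-reservation rule above reduces to simple arithmetic, sketched here. The stage names and the `max_signal_layer` helper are hypothetical; only the M13 top layer and the 2 reserved layers come from the recipe.

```python
TOP_SIGNAL_LAYER = 13        # M13, the full signal routing stack in the recipe
RESERVED_FOR_PROMOTION = 2   # layers held back before CTS

def max_signal_layer(stage):
    """Return the maximum signal routing layer index for a flow stage.

    Stage names are hypothetical labels for this sketch.
    """
    if stage == "placement":
        # Pre-CTS: hide the reserved layers so timing estimates are pessimistic.
        return TOP_SIGNAL_LAYER - RESERVED_FOR_PROMOTION
    if stage == "post_cts":
        # Post-CTS: release the full stack so layer promotion can use it.
        return TOP_SIGNAL_LAYER
    raise ValueError(f"unknown stage: {stage}")

for stage in ("placement", "post_cts"):
    print(stage, f"M{max_signal_layer(stage)}")
```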



## 4. Route Guide and Placement Blockage to Resolve Congestion

#### Global route congestion map before and after



Figure 5: Route guide & placement blockage to resolve congestion

- As illustrated in Figure 5, to relieve horizontal routing congestion, use route guides to block the horizontal layers below the leftmost high-density port region.
- This directs signals towards the middle of the horizontal channel before they progress to the right.
- Use placement blockages with 50% blockage and route guides with a 50% utilization ratio in the high-density corner regions to control congestion.
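To see why a utilization cap relieves a dense corner, consider the toy model below. It is not how any particular router computes cost: `gcell_overflow`, the track counts, and the idea of a single demand/capacity pair per routing cell are assumptions made for illustration.

```python
def gcell_overflow(demand, capacity, max_utilization=1.0):
    """Tracks requested beyond what the utilization cap allows in one routing cell."""
    allowed = capacity * max_utilization
    return max(0.0, demand - allowed)

# Hypothetical routing cell with 20 tracks and 14 nets wanting to cross it.
print(gcell_overflow(demand=14, capacity=20))                       # 0.0
print(gcell_overflow(demand=14, capacity=20, max_utilization=0.5))  # 4.0
```

With the cap at 50%, the same cell reports overflow earlier, so a congestion-aware router has a reason to push the extra nets into neighboring, less crowded cells.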

\*\*\* The recipe for macro-region congestion resolution is not discussed in this presentation, which focuses on high-speed blocks without macros.

#### Congestion map results before and after



## 5. Concurrent Clock Data (CCD) Implementation



- Enabling CCD during compile and placement is the most critical step and yields the best results.
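CCD relies on useful skew: shifting a flop's clock arrival to trade slack between the path that ends at the flop and the path that starts from it. The sketch below shows that trade for a single flop. The `balance_with_skew` helper and the slack values are hypothetical, and hold timing is ignored for brevity.

```python
def balance_with_skew(slack_in, slack_out):
    """Shift a flop's clock arrival to balance the paths ending and starting at it.

    Delaying the flop's clock by `shift` adds `shift` to the setup slack of the
    incoming path and removes it from the outgoing path.
    Returns (shift, new_slack_in, new_slack_out), all in ns.
    """
    shift = (slack_out - slack_in) / 2.0
    return shift, slack_in + shift, slack_out - shift

# Hypothetical slacks: the incoming path violates, the outgoing path has margin.
shift, s_in, s_out = balance_with_skew(slack_in=-0.050, slack_out=0.090)
print(round(shift, 3), round(s_in, 3), round(s_out, 3))
```

In this example a 0.07 ns shift turns a violating path and a comfortable path into two paths with equal positive slack, which is why applying the technique early, while placement can still react, tends to pay off.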

## 6. Group Path Implementation for Critical Path

#### **Common Mistakes**

- Defining excessive group paths, which increases optimization runtime.
- Defining generic group paths. When everything is a priority, nothing is.
- Assigning insufficient weight, so the results are not obvious.

#### **Recommendation**

- Be very specific and selective with group paths. Ensure the critical path gets very high priority.
- Use a high weight for the critical group path.
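The effect of weight can be seen in a toy cost model where each path group contributes its weight times its worst violation. This is an assumption made for illustration, not the documented cost function of any specific tool; the group names, weights, and slacks are hypothetical.

```python
def timing_cost(groups):
    """Per-group cost contribution: weight * worst violation (0 if the group meets timing).

    `groups` maps a group name to (weight, worst_slack_ns).
    """
    return {name: weight * max(0.0, -slack) for name, (weight, slack) in groups.items()}

# Same slacks, different weight on the hypothetical "critical_if" group.
low = timing_cost({"default": (1, -0.20), "critical_if": (1, -0.05)})
high = timing_cost({"default": (1, -0.20), "critical_if": (10, -0.05)})
print(low)
print(high)
```

With equal weights the interface group is a minor term next to the default group, so the optimizer has little incentive to work on it; raising its weight makes it the dominant term.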







Figure 7: Group path Recipe Case Study

## 7. Level of Logic Analysis and Pipestage Request



- Analyze the design early to give feedback on RTL quality.
- Request architectural improvements, such as an additional pipestage, if the level of logic on a path is unreasonably long.
- The maximum level of logic in a path depends on the process, voltage, temperature, frequency, and routability of the design, so it is crucial to understand the design specifications.
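A first-order logic-level budget can be estimated as below. Only the 2 GHz clock (a 500 ps period) comes from this presentation; the clock-to-Q, setup, uncertainty, and per-stage delay numbers are hypothetical placeholders that depend on the actual process and corner, and the `max_logic_levels` helper is illustrative.

```python
def max_logic_levels(period_ps, clk_to_q_ps, setup_ps, uncertainty_ps, stage_delay_ps):
    """Estimate how many logic stages fit in one clock cycle.

    All arguments are integers in picoseconds; `stage_delay_ps` is an average
    gate-plus-wire delay per logic level.
    """
    budget_ps = period_ps - clk_to_q_ps - setup_ps - uncertainty_ps
    return max(0, budget_ps // stage_delay_ps)

# 2 GHz clock -> 500 ps period; the remaining values are assumed for illustration.
print(max_logic_levels(period_ps=500, clk_to_q_ps=40, setup_ps=20,
                       uncertainty_ps=40, stage_delay_ps=20))  # 20
```

Paths whose level of logic exceeds such a budget are candidates for a pipestage request rather than further physical optimization.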



# **Results & Conclusion**

#### **Correlation & Worst Negative Slack Improvement**



| Run   | Compile Final Opto (CFO)<br>WNS (ns) | Clock Route Opt (CRO)<br>WNS (ns) | Route_Opt<br>WNS (ns) |
|-------|-----------------------------|--------------------------|-----------|
| Run5  | -0.18                       | -0.541                   | -0.456    |
| Run6  | 0                           | -0.373                   | -0.297    |
| Run7  | -0.027                      | -0.528                   | -0.455    |
| Run8  | -0.026                      | -0.45                    | -0.392    |
| Run9  | -0.052                      | -0.316                   | -0.478    |
| Run10 | -0.091                      | -0.351                   | -0.551    |
| Run11 | -0.119                      | -0.191                   | -0.412    |
| Run13 | -0.132                      | -0.202                   | -0.342    |
| Run14 | -0.193                      | -0.135                   | -0.55     |
| Run15 | -0.202                      | -0.259                   | -0.384    |
| Run16 | -0.201                      | -0.221                   | -0.467    |
| Run17 | -0.156                      | -0.149                   | -0.34     |
| Run18 | -0.079                      | -0.081                   | -0.19     |
| Run19 | -0.159                      | -0.134                   | -0.216    |
| Run20 | -0.055                      | -0.071                   | -0.204    |
| Run21 | -0.053                      | -0.114                   | -0.193    |
| Run22 | -0.116                      | -0.109                   | -0.202    |
| Run23 | -0.032                      | -0.066                   | -0.152    |
| Run24 | -0.07                       | -0.083                   | -0.088    |
| Run28 | -0.056                      | -0.131                   | -0.157    |
| Run29 | -0.032                      | -0.132                   | -0.157    |
| Run30 | -0.048                      | -0.045                   | -0.066    |
| Run32 | -0.155                      | -0.057                   | -0.183    |
| Run33 | -0.077                      | -0.007                   | -0.045    |
| Run34 | -0.036                      | -0.054                   | -0.033    |



Figure: WNS Convergence Progress
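One simple way to read the correlation trend is to subtract the CFO WNS from the Route_Opt WNS for each run. The sketch below does this for three runs copied from the table; the `correlation_gap` helper and the choice of this metric are illustrative assumptions, not a metric defined by the authors.

```python
# (CFO, CRO, Route_Opt) WNS in ns, copied from the table above.
wns = {
    "Run5": (-0.180, -0.541, -0.456),
    "Run18": (-0.079, -0.081, -0.190),
    "Run34": (-0.036, -0.054, -0.033),
}

def correlation_gap(stages):
    """Route_Opt WNS minus CFO WNS; negative means timing degraded after compile."""
    return round(stages[-1] - stages[0], 3)

for run, stages in wns.items():
    print(run, correlation_gap(stages))
```

The gap shrinks from -0.276 ns at Run5 to -0.111 ns at Run18 and is essentially closed (+0.003 ns) at Run34.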


#### **Correlation & Negative Violated Path Improvement**

The table and figure below show the total NVP (Negative Violated Path) convergence progress. The results finally converge at Run34, with only 140 violating paths left at Route_Opt, all of small magnitude.

| Run   | Compile Final Opto (CFO)<br>Total NVP | Clock Route Opt (CRO)<br>Total NVP | Route_Opt<br>Total NVP |
|-------|-----------------------------|--------------------------|-----------|
| Run5  | 728                         | 12112                    | 15397     |
| Run6  | 20                          | 3539                     | 8008      |
| Run7  | 44                          | 7897                     | 13427     |
| Run8  | 22                          | 2029                     | 5584      |
| Run9  | 703                         | 4756                     | 9670      |
| Run10 | 1232                        | 5117                     | 10614     |
| Run11 | 1509                        | 1298                     | 7404      |
| Run13 | 873                         | 1494                     | 6023      |
| Run14 | 5426                        | 1920                     | 9341      |
| Run15 | 3480                        | 2429                     | 8329      |
| Run16 | 5324                        | 2609                     | 9350      |
| Run17 | 4312                        | 1168                     | 8728      |
| Run18 | 328                         | 344                      | 3939      |
| Run19 | 320                         | 807                      | 4530      |
| Run20 | 158                         | 3157                     | 3142      |
| Run21 | 168                         | 2594                     | 3403      |
| Run22 | 135                         | 760                      | 2674      |
| Run23 | 310                         | 338                      | 1776      |
| Run24 | 348                         | 596                      | 790       |
| Run28 | 177                         | 727                      | 1293      |
| Run29 | 154                         | 703                      | 1703      |
| Run30 | 162                         | 96                       | 3181      |
| Run32 | 137                         | 87                       | 2697      |
| Run33 | 118                         | 31                       | 463       |
| Run34 | 94                          | 23                       | 140       |



Figure: Total NVP Convergence Progress


