

# Multi-Die Distributed Simulation -Next Generation Validation Framework SNUG INDIA 2024

Ravishankar Ramaswamy Principal Engineer Intel



- Problem Statement
- Requirement
- Solution VCS Distributed Simulation
- Results & Summary
- Future enhancements

2



# **Problem Statement**

### **Problem Statement**

# snug

### **Evolution of Designs**



- Rapid increase in design sizes and impact on PPA goals
- Reducing yield due to reticle limit of manufacturing equipment – Multi-die solutions, independent evolution of dies(process/nodes)

### Validation challenges

Traditional Monolithic approach:

- Increase in compute requirements(>128G machines)
- Significant TAT for Build+Simulation(~24hrs)
- TB size explosion with newer topologies

Independent evolution of dies(process/nodes) leads to

- Name collisions IP/TB
- Different versions VIP, TB packages etc.

Incremental updates to design

• Reintegration at SoC

### Evolve a nimble solution





### Requirements

### Disaggregation of dies

- Think design evolution Divide monolithic setup into multiple chiplets
- Make use of existing TB at Subsystem/IP level
- Occupy less memory compute machines
- "Connect" dies through a simple IPC glue





snu

# Disaggregated Simulation – Use Cases

Monolithic VAL – Disaggregated RTL





## Disaggregated Simulation – Use Cases

Die VAL – Reuse at SoC (Typical Usage)





# Challenges in Disaggregated Simulation

Typical challenges in custom/inhouse solution

### **RTL Synchronization TB Data Transfer** Connectivity should replicate true die- Flexible like Monolithic stimulus to-die signal connectivity Deterministic data transfer Support for multi clock synchronization Maintenance of Val code **TB Phase Synchronization Regression and Debug Productivity** Synchronization of UVM/User phases Low Impact to regression across dies methodology Less debug impact Deterministic phase synchronization • Faster turnaround – Bug fix val

Penalty for morphing a monolithic to disaggregated simulation < 5%







# VCS Distributed Simulation

snug

**Use Model** 

- Compilation
  - -distsim needs to be used during elaboration.



### Simulation

 Primary simv should be launched with switch (-distsim=launch\_server) which invokes separate process to control the communication





### VCS Distributed Simulation RTL Synchronization

Simple RTL synchronization with connectivity file

• Connectivity file:

sync\_interval: 100ps
s0.soc\_tb.cpu.sig1 = s1:soc\_tb.io.sig1
s1.soc\_tb.io.sig2 = s0:soc\_tb.cpu.sig2
s0.soc\_tb.cpu.sig3 = s1:soc\_tb.io.sig3

sync\_signal: s0:soc\_tb.cpu.clk
s0.soc\_tb.cpu.sig1 = s1:soc\_tb.io.sig1
s1.soc\_tb.io.sig2 = s0:soc\_tb.cpu.sig2
s0.soc\_tb.cpu.sig3 = s1:soc\_tb.io.sig3

- Synchronization can be clock based(sync\_signal:<clock signal>) or time based(sync\_interval)
- Communication between the simv's occurs during sample/drive phases
  - All the loads would get sampled together in Sample Phase and driven in Drive Phase
- Support of multiple clock sync signals in the connectivity file





# VCS Distributed Simulation TB Phase Sync

Synchronization for User defined and UVM phases

### • IPC for Test Bench Phase Sync: User defined phases

- Synchronization is done through code extensions by adding macro
- Macro `VCSDISTSIM\_PHASE\_SYNC(PHASE\_NAME) needs to be added at each sync point
- Waits until all client simv's reach same phase
- TB resumes when all client simv's are synced

### • IPC for Test Bench Phase Sync: UVM Runtime default phases

- Implicit UVM phase synchronization across simv's with predefined UVM component class **dist\_tbsync\_comp**
- TB phase order should be identical in client TBs
- There can be additional Client specific Phases :-
  - Sync not available for phases unavailable in other Clients
  - Sync for common phases can be achieved using runtime flag or connectivity file

# VCS Distributed Simulation TB Data Transfer

Val data transfer across clients

- Test Bench Data Transfer of Class Objects is through bit stream
- Send/receive TB data is through VCS distsim API's(`VCSDISTSIM\_TB\_SEND/ `VCSDISTSIM\_TB\_RECV)
- At sender, generate bit array from the transaction class object
- At receiver, bit array is used to fill the transaction class object





# VCS Distributed Simulation Save Replay

Debug Productivity

- Enables simulation of single Simv with stimulus captured from other clients during save run
- Helpful for debugging if issue is present in only one die
- Save mode is used to capture all RTL/TB receive calls with respect to sync point and logged into replay database
- During Replay Mode, one can force the RTL signals which are part of connectivity file







# **Results & Summary**

## VCS Distributed Simulation

**Results & Summary** 

- Ease of integration
  - Bringup of VCS distributed simulation in <2days</li>
  - Complete regression results < 1week</p>
- Performance improvement over existing solution
  - Gains for both Simulation time and memory
- Deployment
- Debug Productivity
  - 1.5X Runtime gain with replay mode.
- Scalability Easily scalable to next projects





# Future Enhancements

### VCS Distributed Simulation

### **Challenges/Next steps**

- Increase in sync time penalty with increase in number of simv's
- Support of interface modports connections is not available; add individual signals manually
- Only bit stream is supported for TB data transfer and needs force cap on entire design

### **Future Enhancements**

- Support of interface modport connection
- Multiple master clock synchronizations
- User assisted partitioning
- Unified Debug with Verdi
- Consolidated logs/coverage i.e merge across vdb's
- Cloud Support





# THANK YOU

Our Technology, Your Innovation™

## Sample code for TB Sync and TB data transfer



| <b>Driver:</b> send transaction from one-die to other-die and for receiving response.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | Sequencer: receive transaction from driver and send response to driver                                                                                                                                                                                                                                                                                                                                                   | Task based Phase Sync                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <pre>int dieid;<br/>bit rsp_bit_array_d0[];<br/>bit rsp_bit_array_d2[];<br/>mp_tb_seq_item local_rsp;<br/>bit rsp_bit_array[];<br/>\$value\$plusargs("DIE_ID=%d",dieid);<br/>if (dieid==0)<br/>begin<br/>fork<br/>begin<br/>forever begin<br/>`VCSDISTSIM_TB_RECV("RECV_RSP_D0",<br/>rsp_bit_array_d0);<br/>end<br/>end<br/>begin<br/>forever begin<br/>`VCSDISTSIM_TB_RECV("RECV_RSP_D1",<br/>rsp_bit_array_d1);<br/>end<br/>end<br/>begin<br/>forever begin<br/>`VCSDISTSIM_TB_RECV("RECV_RSP_D1",<br/>rsp_bit_array_d1);<br/>end<br/>end<br/>begin<br/>forever begin<br/>`VCSDISTSIM_TB_RECV("RECV_RSP",<br/>rsp_bit_array_d2);<br/>end<br/>end<br/>join_none //fork<br/>end //dieid=0</pre> | <pre>\$value\$plusargs("DIE_ID=%d",dieid);<br/>if (dieid==1)<br/>begin<br/>fork<br/>begin<br/>forever begin<br/>`VCSDISTSIM_TB_RECV("RECV_REQ1",<br/>req_bit_array_n);<br/>end<br/>end<br/>begin<br/>forever begin<br/>`VCSDISTSIM_TB_RECV("RECV_REQ2",<br/>req_bit_array_s);<br/>end<br/>end<br/>join_none //fork<br/>end<br/>if (dieid==1) begin<br/>`VCSDISTSIM_TB_SEND("SEND_RSP",<br/>req_bit_array);<br/>end</pre> | <pre>virtual task body();<br/>super.body();<br/><br/>`uvm_info(get_type_name(),"&gt;&gt; reset distsim_phase_sync<br/>sequence", UVM_NONE);<br/>\$display("SNPS VCSDISTSIM Phase Sync Before CP0");<br/>`VCSDISTSIM_PHASE_SYNC(CP0);<br/>\$display("SNPS VCSDISTSIM Phase Sync After CP0");<br/>if (get_current_test_phase() == "reset_phase")<br/>begin<br/>\$display("SNPS VCSDISTSIM Phase Sync Before CP1");<br/>`VCSDISTSIM_PHASE_SYNC(CP1);<br/>\$display("SNPS VCSDISTSIM Phase Sync After CP1");<br/><br/>endtask</pre> |