Update sub-modules to latest version #39

Open
tancheng opened this issue Nov 3, 2024 · 21 comments · Fixed by #46

Comments

@tancheng
Owner

tancheng commented Nov 3, 2024

Hi @yuqisun, I'm posting the TODOs here to better track progress.

  1. Readme:
    Required (recommended) #CPUs, RAM?
    How many hours for the 2x2?
  2. Git stash Yuqi's modifications in CGRAFlow should still work?
  3. Git stash Yufei's modifications in VectorCGRA should still work?
  4. Update the layout.png location.
  5. Latest version of sub-modules.
    Pull latest other sub-modules.
    Pull latest CGRA-Mapper.
    Pull latest VectorCGRA and Mapper to checkout a 2x2 synthesis for layout.
    (venv) Need a specific PyMTL3 version: pip install -U git+https://github.com/tancheng/pymtl3.1@yo-struct-list-fix
  6. Make asap7 parameterizable.
  7. If it works, update the CGRA-Flow repo and push another (clean) docker image.
  8. Clean up docker_run.sh in CGRA-Flow.
  9. 1x1 is okay now if updated to the latest version of VectorCGRA. Mapping is not okay; I assigned it to @MeowMJ.
  10. Try timing: 1000 -> 1 GHz; 2000 -> 500 MHz; 4000 -> 250 MHz; 10000 -> 100 MHz; 20000 -> 50 MHz.
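
A quick sanity check of the timing targets in item 10, assuming the left-hand numbers are clock periods in picoseconds (the flow logs later in this thread report a default time unit of 1ps). The helper name `period_ps_to_mhz` is only illustrative and not part of CGRA-Flow:

```python
def period_ps_to_mhz(period_ps: float) -> float:
    """Convert a clock period in picoseconds to a frequency in MHz."""
    if period_ps <= 0:
        raise ValueError("clock period must be positive")
    # f [Hz] = 1 / (T [ps] * 1e-12), hence f [MHz] = 1e6 / T [ps]
    return 1e6 / period_ps


# Periods listed in the TODO item (assumed to be in ps).
for period in (1000, 2000, 4000, 10000, 20000):
    print(f"{period} ps -> {period_ps_to_mhz(period):g} MHz")
```

Under that assumption the list is self-consistent: doubling the period halves the frequency.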
@yuqisun
Collaborator

yuqisun commented Nov 16, 2024

Hi,

Progress update and some issues:
Pulling the other sub-modules works fine; only VectorCGRA causes problems.

After pulling VectorCGRA:

  1. SVerilog --- works.

  2. Generate DFG
    File not found - /WORK_REPO/CGRA-Flow/CGRA-Mapper/build/src/libmapperPass.so
    It works after copying the file manually. Do we need to add this to the code repo?

  3. Run tests
    After pulling the latest VectorCGRA (the last sub-module I updated to the latest version), I got the errors below. Can we ignore them until the test code is fixed?

collected 83 items / 4 errors
=========================== short test summary info ============================

ERROR ../../VectorCGRA/cgra/test/CGRA_custom_test.py

ERROR ../../VectorCGRA/fu/float/test/FpAddRTL_test.py

ERROR ../../VectorCGRA/fu/float/test/FpMulRTL_test.py

ERROR ../../VectorCGRA/tile/translate/TileRTL_test.py

!!!!!!!!!!!!!!!!!!! Interrupted: 4 errors during collection !!!!!!!!!!!!!!!!!!!!

============================== 4 errors in 1.94s ===============================
  4. Synthesize
    This also fails after pulling the latest VectorCGRA. The error appears when running make step 3 of synthesis. Any idea?
Generating RTLIL representation for module `\ChannelRTL__f8f08fb4571b20cf'.
Warning: Replacing memory \queues__deq_en with list of registers. See design.name_mangled.v:572
Warning: Replacing memory \queues__enq_en with list of registers. See design.name_mangled.v:568
Warning: Replacing memory \queues__enq_msg with list of registers. See design.name_mangled.v:566
design.name_mangled.v:733: ERROR: Unsupported expression on dynamic range select on signal `\recv_data__rdy'!
make: *** [Makefile:264: 3-open-yosys-synthesis/.execstamp] Error 1
Exception in thread Thread-8:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "../mode_dark_light.py", line 1012, in runYosys
FileNotFoundError: [Errno 2] No such file or directory: '3-open-yosys-synthesis/stats.txt'

@tancheng
Owner Author

Thanks Yuqi,

  1. It should be generated automatically after running make in the compiler's build folder. Where did you copy it from? Maybe there is a path issue.
  2. It should just work, as we tested them before with @yo96. Can you do some initial investigation (just run it in the VectorCGRA repo without using the GUI)? I will also take a look.
  3. It seems the generated Verilog is not synthesizable due to the updated ChannelRTL. Similarly, let's try to reproduce the issue without the GUI (model a 1x2 CGRA in the translateCGRARTL_test.py or some other *_test.py) and perform the Yosys synthesis manually. It would be great if @yyan7223 can help. Let's discuss this during our 1:1.

@yuqisun
Collaborator

yuqisun commented Nov 16, 2024

Sure. I copied the .so file from the previous image.

@tancheng
Owner Author

> Sure, and I copy the .so file from previous image.

That means we are still using the old code. It needs to be rebuilt from the latest pulled repo.

@yuqisun
Collaborator

yuqisun commented Nov 16, 2024

> > Sure, and I copy the .so file from previous image.
>
> That means we still use the old code. It needs to be rebuilt based on the latest/pulled repo.

Got it. The .so file is there after rebuilding the repo.

@yuqisun
Collaborator

yuqisun commented Nov 16, 2024

No need to worry about 'Run tests'; it is fixed. The cause was that I pulled the latest pymtl3-hardfloat repo, but the folder name must be pymtl3_hardfloat (an underscore instead of a hyphen).

======================== 167 passed in 83.23s (0:01:23) ========================

Also, 'Synthesize' works now. Testing 'Layout'.
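
On the folder-name issue: a hyphen is not valid in a Python identifier, so a directory named pymtl3-hardfloat cannot be imported with a plain import statement, while pymtl3_hardfloat can. The sketch below checks this with the standard library only. It assumes the tests import the package by its directory name, and `is_importable_name` is a hypothetical helper:

```python
import keyword


def is_importable_name(name: str) -> bool:
    """Return True if `name` can appear in a plain `import name` statement."""
    # A module name must be a valid identifier and must not be a reserved keyword.
    return name.isidentifier() and not keyword.iskeyword(name)


for folder in ("pymtl3-hardfloat", "pymtl3_hardfloat"):
    status = "importable" if is_importable_name(folder) else "NOT importable"
    print(f"{folder}: {status}")
```

A check like this could run at setup time to catch a mis-named checkout before test collection fails.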

@tancheng
Owner Author

Thanks a lot Yuqi,

  • How about we fork pymtl3-hardfloat into coredac/pymtl3_hardfloat and make CGRA-Flow/VectorCGRA rely on coredac/pymtl3_hardfloat? Then there is no _ vs. - issue, and we are fine even if pymtl3-hardfloat no longer exists one day.
  • How was the design.name_mangled.v:566 issue fixed?

@yuqisun
Collaborator

yuqisun commented Nov 17, 2024

  1. Got it.
  2. Yes, it is fixed, but I'm not sure which step made it work...

Currently, only the layout run is pending.

@tancheng
Owner Author

Thanks, let's see.

@yyan7223
Collaborator

> Thanks Yuqi,
>
>   1. it should be automatically generated after run make in the compiler's build folder. Where you copy it from? Maybe some path issue.
>   2. It should just work as we tested them before with @yo96. Can you do some initial investigation (just run it in VectorCGRA repo without using GUI) and I will also take a look.
>   3. Seems like the generated verilog is not synthesizable due to the updated ChannelRTL. Similar, let's try to reproduce the issue without GUI (model a 1x2 CGRA in the translateCGRARTL_test.py or some other *_test.py) and manually perform the yosys synthesis. It would be great if @yyan7223 can help. Let's discuss this during our 1:1.

Yeah I will try to find the problem this week.

@yuqisun
Collaborator

yuqisun commented Nov 17, 2024

Hi Yufei @yyan7223,

Synthesize works now, but I need your help with another question about Layout. I updated all submodules and kept the OpenROAD script repo at commit f1bc7bc5891dc29ad94f248ec7ceaed892402958. SVerilog generation and synthesis are good, but the Floorplan check_setup step keeps emitting warnings (it ran for more than 10 hours with the 100k clock period config). Have you ever seen this? By comparison, the previous log shows no such warnings.

I'm pushing a new image with fewer layers and a smaller size, and I have attached the design files.

docker image: yuqisun/cgra-flow:20241117, decreased from 30 layers to 21.

design.v.txt
design.sv.txt

Log with all latest submodules:

==========================================================================
Floorplan check_setup
--------------------------------------------------------------------------
Warning: There are 20 unclocked register/latch pins.
Warning: There are 64 unconstrained endpoints.
Warning: There are 12435 combinational loops in the design.
number instances in verilog is 450768
[WARNING IFP-0028] Core area lower left (5.000, 5.000) snapped to (5.022, 5.130).
[INFO IFP-0001] Added 2034 rows of 10172 site asap7sc7p5t.
repair_timing -verbose -repair_tns 100
[WARNING RSZ-0021] no estimated parasitics. Using wire load models.
[INFO RSZ-0094] Found 4611 endpoints with setup violations.
[INFO RSZ-0099] Repairing 4611 out of 4611 (100.00%) violating endpoints...
Iteration | Removed | Resized | Inserted | Cloned Gates | Pin Swaps |   WNS   |   TNS   | Endpoint
          | Buffers |         | Buffers  |              |           |         |         |
-------------------------------------------------------------------------------------------------
        0 |       0 |       0 |        0 |            0 |         0 | -107878.164 | -433906336.000 | data_mem/reg_file/regs\[1\]\[11\]$_DFFE_PP_/D
       10 |       0 |       9 |        0 |            0 |         0 | -107633.469 | -432820832.000 | data_mem/reg_file/regs\[7\]\[9\]$_DFFE_PP_/D
       20 |       0 |      19 |        0 |            0 |         0 | -107383.727 | -431672832.000 | data_mem/reg_file/regs\[7\]\[9\]$_DFFE_PP_/D
       30 |       0 |      29 |        0 |            0 |         0 | -107154.477 | -430618720.000 | data_mem/reg_file/regs\[7\]\[9\]$_DFFE_PP_/D
       40 |       0 |      39 |        0 |            0 |         0 | -106942.875 | -429642560.000 | data_mem/reg_file/regs\[7\]\[9\]$_DFFE_PP_/D
       50 |       0 |      49 |        0 |            0 |         0 | -106755.836 | -428787072.000 | data_mem/reg_file/regs\[1\]\[16\]$_DFFE_PP_/D
       60 |       0 |      59 |        0 |            0 |         0 | -106588.234 | -428028032.000 | data_mem/reg_file/regs\[1\]\[0\]$_DFFE_PP_/D
       70 |       0 |      69 |        0 |            0 |         0 | -106402.328 | -427173728.000 | data_mem/reg_file/regs\[1\]\[0\]$_DFFE_PP_/D
       80 |       0 |      79 |        0 |            0 |         0 | -106246.984 | -426507712.000 | data_mem/reg_file/regs\[1\]\[14\]$_DFFE_PP_/D
       90 |       0 |      89 |        0 |            0 |         0 | -106067.898 | -425685664.000 | data_mem/reg_file/regs\[1\]\[14\]$_DFFE_PP_/D
      100 |       0 |      97 |        0 |            2 |         0 | -105905.102 | -424935488.000 | data_mem/reg_file/regs\[1\]\[14\]$_DFFE_PP_/D
      110 |       0 |     107 |        0 |            2 |         0 | -105771.234 | -424346784.000 | data_mem/reg_file/regs\[1\]\[11\]$_DFFE_PP_/D
      120 |       0 |     114 |        0 |            5 |         0 | -105613.352 | -423619104.000 | data_mem/reg_file/regs\[1\]\[11\]$_DFFE_PP_/D
      130 |       0 |     122 |        0 |            7 |         0 | -104614.484 | -418663680.000 | tile__1/channel__1.queues__0.dpath.queue.regs\[1\]\[24\]$_DFFE_PP_/D
      140 |       0 |     128 |        0 |           11 |         0 | -104567.531 | -418447168.000 | tile__1/channel__1.queues__0.dpath.queue.regs\[1\]\[24\]$_DFFE_PP_/D
      150 |       0 |     132 |        0 |           17 |         0 | -104427.070 | -417804896.000 | tile__1/channel__1.queues__0.dpath.queue.regs\[1\]\[24\]$_DFFE_PP_/D
      160 |       0 |     135 |        0 |           24 |         0 | -104298.836 | -417218080.000 | tile__1/channel__1.queues__0.dpath.queue.regs\[1\]\[24\]$_DFFE_PP_/D
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu1/_6783_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu1/_6783_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu1/_6783_/SN

Previous log:

==========================================================================
Floorplan check_setup
--------------------------------------------------------------------------
Warning: There are 5068 unclocked register/latch pins.
Warning: There are 5076 unconstrained endpoints.
Warning: There are 126 combinational loops in the design.
number instances in verilog is 426116
[WARNING IFP-0028] Core area lower left (5.000, 5.000) snapped to (5.022, 5.130).
[INFO IFP-0001] Added 2007 rows of 10041 site asap7sc7p5t.
repair_timing -verbose -repair_tns 100
[WARNING RSZ-0021] no estimated parasitics. Using wire load models.
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0033] No hold violations found.
Default units for flow
 time 1ps
 capacitance 1fF
 resistance 1kohm
 voltage 1v
 current 1mA
 power 1pW
 distance 1um
Report metrics stage 2, floorplan final...

==========================================================================
floorplan final report_design_area
--------------------------------------------------------------------------
Design area 44111 u^2 15% utilization.
Elapsed time: 3:16.37[h:]min:sec. CPU time: user 195.35 sys 0.98 (99%). Peak memory: 1827604KB.

@tancheng
Owner Author

The 4611 out of 4611 (100.00%) violating endpoints is scary. I think the negative TNS is fine, while the negative WNS indicates a timing violation (@dppatil3@asu.edu) that cannot easily be solved during the floorplan stage.

  • @yuqisun Can you also try a 1x1 CGRA?
  • @yyan7223 If floorplan cannot solve them, let's try to figure out the combinational loops at an earlier stage and optimize the pymtl RTL.

Thanks!
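
To see how slowly repair_timing is converging, the WNS column of the table can be pulled out of the log. This is a minimal sketch, assuming the nine-column `|`-separated layout shown in the log above and a time unit of ps; `parse_repair_rows` is a hypothetical helper, and the linear extrapolation is only a rough indication:

```python
def parse_repair_rows(log_text):
    """Return (iteration, WNS, TNS) tuples from a repair_timing progress table."""
    rows = []
    for line in log_text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Data rows have 9 columns and start with an integer iteration count;
        # header and warning lines are skipped.
        if len(cells) == 9 and cells[0].isdigit():
            rows.append((int(cells[0]), float(cells[6]), float(cells[7])))
    return rows


# First and last rows copied from the 2x2 log in this thread.
SAMPLE = r"""
        0 |       0 |       0 |        0 |            0 |         0 | -107878.164 | -433906336.000 | data_mem/reg_file/regs\[1\]\[11\]$_DFFE_PP_/D
      160 |       0 |     135 |        0 |           24 |         0 | -104298.836 | -417218080.000 | tile__1/channel__1.queues__0.dpath.queue.regs\[1\]\[24\]$_DFFE_PP_/D
"""

rows = parse_repair_rows(SAMPLE)
(first_iter, first_wns, _), (last_iter, last_wns, _) = rows[0], rows[-1]
gain = last_wns - first_wns                   # WNS recovered so far
per_iter = gain / (last_iter - first_iter)    # average recovery per iteration
print(f"WNS gain: {gain:.3f} over {last_iter - first_iter} iterations")
print(f"Iterations still needed at this rate: {abs(last_wns) / per_iter:.0f}")
```

At that rate the resizer recovers only a few percent of the negative slack in 160 iterations, which is consistent with the step running for many hours without finishing.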

@yuqisun
Collaborator

yuqisun commented Nov 17, 2024

1x1 completes the floorplan stage quickly (2-3 minutes), although 1x1 cannot map the DFG. The other stages are still running; logs below:

(venv) root@9beb6213afe2 base# more 2_1_floorplan.log
OpenROAD 9b28074f5d04c68a9ddcc5c571583c1950e326a2
Features included (+) or not (-): +Charts +GPU +GUI +Python : None
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
[INFO ORD-0030] Using 6 thread(s).
[INFO ODB-0227] LEF file: /WORK_REPO/CGRA-Flow/tools/OpenROAD-flow-scripts/flow/platforms/asap7/lef/asap7_tech_1x_201209.lef, created 24 layers, 9 vias
[INFO ODB-0227] LEF file: /WORK_REPO/CGRA-Flow/tools/OpenROAD-flow-scripts/flow/platforms/asap7/lef/asap7sc7p5t_28_R_1x_220121a.lef, created 212 library cells

==========================================================================
Floorplan check_setup
--------------------------------------------------------------------------
Warning: There are 5 unclocked register/latch pins.
Warning: There are 16 unconstrained endpoints.
Warning: There are 1497 combinational loops in the design.
number instances in verilog is 115496
[WARNING IFP-0028] Core area lower left (5.000, 5.000) snapped to (5.022, 5.130).
[INFO IFP-0001] Added 1032 rows of 5163 site asap7sc7p5t.
repair_timing -verbose -repair_tns 100
[WARNING RSZ-0021] no estimated parasitics. Using wire load models.
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0033] No hold violations found.
Default units for flow
 time 1ps
 capacitance 1fF
 resistance 1kohm
 voltage 1v
 current 1mA
 power 1pW
 distance 1um
Report metrics stage 2, floorplan final...

==========================================================================
floorplan final report_design_area
--------------------------------------------------------------------------
Design area 11664 u^2 15% utilization.
Elapsed time: 0:56.79[h:]min:sec. CPU time: user 56.33 sys 0.41 (99%). Peak memory: 796188KB.

@yuqisun
Collaborator

yuqisun commented Nov 18, 2024

1x1 hit the same error as before, [ERROR GRT-0119] Routing congestion too high, even though I'm using the same WSL memory: 26G + 10G swap. Did I miss something?

[ERROR GRT-0119] Routing congestion too high. Check the congestion heatmap in the GUI and load ./reports/asap7/CGRATemplateRTL/base/congestion.rpt in the DRC viewer.
Error: global_route.tcl, 111 GRT-0119
Command exited with non-zero status 1
Elapsed time: 4:14:54[h:]min:sec. CPU time: user 15293.05 sys 0.84 (99%). Peak memory: 2446392KB.
make[1]: *** [Makefile:827: do-5_1_grt] Error 1
make: *** [Makefile:825: results/asap7/CGRATemplateRTL/base/5_1_grt.odb] Error 2
OpenROAD 9b28074f5d04c68a9ddcc5c571583c1950e326a2
Features included (+) or not (-): +Charts +GPU +GUI +Python : None
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
[ERROR ORD-0007] /WORK_REPO/CGRA-Flow/tools/OpenROAD-flow-scripts/flow/results/asap7/CGRATemplateRTL/base/6_final.odb does not exist.
Error: cmd.tcl, 1 ORD-0007
openroad>
Exception in Tkinter callback
Traceback (most recent call last):
  File "/usr/lib/python3.7/tkinter/__init__.py", line 1708, in __call__
    return self.func(*args)
  File "/WORK_REPO/venv/lib/python3.7/site-packages/customtkinter/windows/widgets/ctk_button.py", line 554, in _clicked
    self._command()
  File "../mode_dark_light.py", line 2266, in clickRTL2Layout
  File "../mode_dark_light.py", line 2283, in display_layout_image
  File "/WORK_REPO/venv/lib/python3.7/site-packages/PIL/Image.py", line 3227, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/WORK_REPO/CGRA-Flow/build/layout.png'
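
Both tracebacks in this thread (stats.txt and layout.png) are secondary failures: the flow step had already exited with an error, and the GUI then tried to open an output file that was never produced. A minimal sketch of a guard follows, assuming the GUI launches each step as a subprocess. `run_step` is a hypothetical helper and is not taken from mode_dark_light.py:

```python
import subprocess
import sys
from pathlib import Path


def run_step(cmd, artifact):
    """Run one flow step; return the artifact path only if the step succeeded and produced it."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Report the real failure instead of a later FileNotFoundError.
        print(f"step failed (exit {result.returncode}): {result.stderr.strip()}")
        return None
    path = Path(artifact)
    if not path.is_file():
        print(f"step succeeded but {path} is missing")
        return None
    return path


# Stand-in for a failing `make` step; it uses the current interpreter so the sketch runs anywhere.
failing_step = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(run_step(failing_step, "build/layout.png"))
```

With such a guard the caller can skip displaying the image or statistics when the step returns None.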

yuqisun added a commit to yuqisun/CGRA-Flow that referenced this issue Nov 18, 2024
@deepsz

deepsz commented Nov 18, 2024

For 1x1, [ERROR GRT-0119] Routing congestion too high indicates that the core area is too small for OpenROAD to route the design inside it. In config.mk, lowering the core utilization from 0.2 to 0.1 (30 > 20 > 10 if it is defined in non-decimal form) and lowering export PLACE_DENSITY from 0.6 to 0.4 to 0.2 might resolve the congestion problem. If the issue persists, we need to define a larger CORE_AREA in config.mk.

For the timing violations, we can try changing export CTS_BUF_CELL in config.mk from BUFx8_ASAP7_75t_R to BUFx16_ASAP7_75t_R, i.e. changing the buffer drive strength, to improve WNS.
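
To make the utilization suggestion concrete, the sketch below shows how the core area scales with the utilization setting. It assumes utilization is defined as cell area divided by core area, and it uses the 1x1 figure reported earlier in this thread (design area 11664 u^2 at 15% utilization). `core_area_um2` is a hypothetical helper:

```python
def core_area_um2(design_area_um2: float, utilization_percent: float) -> float:
    """Core area implied by a cell (design) area and a target utilization in percent."""
    if not 0 < utilization_percent <= 100:
        raise ValueError("utilization must be in (0, 100]")
    # utilization = design_area / core_area  =>  core_area = design_area / utilization
    return design_area_um2 * 100.0 / utilization_percent


DESIGN_AREA = 11664.0  # u^2, from the 1x1 floorplan report in this thread

for util in (30, 20, 15, 10):
    print(f"utilization {util:>2}% -> core area {core_area_um2(DESIGN_AREA, util):,.0f} u^2")
```

Lower utilization gives the router more room, which is why reducing it is a common first step against routing congestion.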

@tancheng
Owner Author

Thanks @deepsz for the valuable suggestions/analysis. @yuqisun plz try the suggestions mentioned above.

@yuqisun
Collaborator

yuqisun commented Nov 19, 2024

Hi,

1x1 now shows the same setup violation issue after the config.mk change. The file is attached below; could you help take a look?
config.mk.txt

==========================================================================
Floorplan check_setup
--------------------------------------------------------------------------
Warning: There are 5 unclocked register/latch pins.
Warning: There are 16 unconstrained endpoints.
Warning: There are 1497 combinational loops in the design.
number instances in verilog is 115496
[WARNING IFP-0028] Core area lower left (5.000, 5.000) snapped to (5.022, 5.130).
[INFO IFP-0001] Added 729 rows of 3651 site asap7sc7p5t.
repair_timing -verbose -setup_margin 0 -repair_tns 100
[WARNING RSZ-0021] no estimated parasitics. Using wire load models.
[INFO RSZ-0094] Found 1429 endpoints with setup violations.
[INFO RSZ-0099] Repairing 1429 out of 1429 (100.00%) violating endpoints...
Iteration | Removed | Resized | Inserted | Cloned Gates | Pin Swaps |   WNS   |   TNS   | Endpoint
          | Buffers |         | Buffers  |              |           |         |         |
-------------------------------------------------------------------------------------------------
        0 |       0 |       0 |        0 |            0 |         0 | -12582.932 | -17113398.000 | data_mem/reg_file/regs\[2\]\[14\]$_DFFE_PP_/D
       10 |       0 |       9 |        0 |            0 |         0 | -12369.227 | -16828164.000 | data_mem/reg_file/regs\[2\]\[17\]$_DFFE_PP_/D
       20 |       0 |      19 |        0 |            0 |         0 | -12197.269 | -16591699.000 | data_mem/reg_file/regs\[2\]\[15\]$_DFFE_PP_/D
       30 |       0 |      29 |        0 |            0 |         0 | -12016.222 | -16336782.000 | data_mem/reg_file/regs\[2\]\[15\]$_DFFE_PP_/D
       40 |       0 |      38 |        0 |            1 |         0 | -11874.815 | -16139147.000 | data_mem/reg_file/regs\[2\]\[10\]$_DFFE_PP_/D
       50 |       0 |      46 |        0 |            3 |         0 | -11972.719 | -16264887.000 | data_mem/reg_file/regs\[2\]\[8\]$_DFFE_PP_/D
       60 |       0 |      54 |        0 |            5 |         0 | -11927.944 | -16208871.000 | data_mem/reg_file/regs\[2\]\[18\]$_DFFE_PP_/D
       70 |       0 |      58 |        0 |           11 |         0 | -11843.477 | -16092403.000 | data_mem/reg_file/regs\[2\]\[1\]$_DFFE_PP_/D
       80 |       0 |      60 |        0 |           19 |         0 | -11744.750 | -15952175.000 | data_mem/reg_file/regs\[2\]\[1\]$_DFFE_PP_/D
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/ctrl_mem/_67_/Y
       90 |       0 |      65 |        0 |           22 |         2 | -15557.930 | -21335432.000 | data_mem/reg_file/regs\[2\]\[10\]$_DFFE_PP_/D
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_13229_/Y
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22065_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22012_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22065_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22012_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22065_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22012_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22065_/SN
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_22012_/SN
      100 |       0 |      69 |        0 |           24 |         6 | -16210.379 | -22240114.000 | data_mem/reg_file/regs\[2\]\[10\]$_DFFE_PP_/D
[WARNING RSZ-0075] makeBufferedNet failed for driver tile__0/element/fu__6/Fu0/_21966_/SN

Thanks,

@MeowMJ
Collaborator

MeowMJ commented Nov 20, 2024

The problem now is that 1x1 takes too long to map, e.g. 360 cycles. Do I understand correctly? I tried the DFG of the main function (it contains only two nodes), and it maps successfully on 1x1. The mapping log is attached below.

mainMap.log

@tancheng
Owner Author

Hi @MeowMJ, do you mean the fir kernel requires 360 cycles?

  • We need to make it compilable/mappable and show it in the GUI.
  • Then let's try to optimize/fix the 360 cycles to a reasonable II.

@yyan7223
Collaborator

Regarding the two Floorplan check_setup logs in @yuqisun's comment above: the first log (all latest submodules, 12435 combinational loops) can be reproduced using VectorCGRA commit 7714ef5 (Nov 3, 2024). The second, previous log (126 combinational loops) can be reproduced using VectorCGRA commit b9859b7 (Jan 8, 2023).

I haven't fixed this issue yet, but here are some debugging notes:

  • I first tried to compare the SVerilog generated by the two commits, but there were too many differences, so I gave up.
  • Then I compared the Yosys logs from synthesizing the respective SVerilog. The key difference is that VectorCGRA b9859b7 reports 75 combinational loops in the design, while VectorCGRA 7714ef5 doesn't report any. However, in the Floorplan check_setup logs pasted above, VectorCGRA b9859b7 reports 126 combinational loops while VectorCGRA 7714ef5 reports 12435. I'm not sure where the 12435 combinational loops come from, because Yosys synthesis doesn't report any.
  • Finally, I switched the technology from asap7 to nangate45 and used VectorCGRA 7714ef5 to generate the SVerilog. Although there are still 14627 combinational loops, no setup or hold violations are found and the floorplan finishes normally. So I cannot be sure whether the issue is caused by the large number of combinational loops.

In any case, a stable circuit should not contain any combinational loops. So next I may try to synthesize each module in the design separately, find which module contributes most of the combinational loops, and modify the pymtl3 code accordingly. What do you think? @tancheng @yuqisun @deepsz
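
When comparing runs module by module, the check_setup counts can be extracted from each floorplan log automatically instead of by eye. This is a minimal sketch, assuming the warning wording shown in the logs in this thread; `check_setup_counts` is a hypothetical helper:

```python
import re

# Matches lines such as "Warning: There are 126 combinational loops in the design."
CHECK_SETUP_RE = re.compile(r"Warning: There are (\d+) (.+?)\.$")


def check_setup_counts(log_text):
    """Map each check_setup warning category to its reported count."""
    counts = {}
    for line in log_text.splitlines():
        match = CHECK_SETUP_RE.match(line.strip())
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


# Warning lines copied from the two floorplan logs in this thread.
LATEST = """Warning: There are 20 unclocked register/latch pins.
Warning: There are 64 unconstrained endpoints.
Warning: There are 12435 combinational loops in the design."""
PREVIOUS = """Warning: There are 5068 unclocked register/latch pins.
Warning: There are 5076 unconstrained endpoints.
Warning: There are 126 combinational loops in the design."""

latest, previous = check_setup_counts(LATEST), check_setup_counts(PREVIOUS)
for category in latest:
    print(f"{category}: {previous.get(category)} -> {latest[category]}")
```

Running the same helper on per-module floorplan logs would show which module contributes most of the loops.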

Regards,
Yufei

@tancheng
Owner Author

Thanks @yyan7223 for the investigation. I am working on tancheng/VectorCGRA#13 to try to minimize the number of combinational loops across tiles and tancheng/VectorCGRA#26 to include data-dependent as another dimension in terms of reconfigurability.

  • However, the number of combinational loops is supposed to be reported by Yosys; otherwise, I cannot tell whether the updated design is an improvement. @yo96 Yosys would report all the combinational loops from the Verilog, right?
  • I created https://github.com/users/tancheng/projects/1 to track the progress for making the design more synthesis-friendly.
