[J-core] Adding J1 to the roadmap.
Geoff Salmon
gsalmon at se-instruments.com
Wed May 18 11:29:39 EDT 2016
On 16-05-18 08:07 AM, D. Jeff Dionne wrote:
> Which reminds me, we should try switching the decoder from random logic
> (ASIC style) to FPGA BRAM style and see what happens to the size again
> for the logic constrained platforms, unless Geoff already has done.
I just tried building mimas_v2 with the two different decode_table
implementations.
Here's the output in .mrp using the reverse_logic architecture of
decode_table:
Slice Logic Utilization:
  Number of Slice Registers: 3,331 out of 11,440 29%
    Number used as Flip Flops: 3,327
    Number used as Latches: 3
    Number used as Latch-thrus: 0
    Number used as AND/OR logics: 1
  Number of Slice LUTs: 5,307 out of 5,720 92%
    Number used as logic: 5,190 out of 5,720 90%
      Number using O6 output only: 4,519
      Number using O5 output only: 162
      Number using O5 and O6: 509
      Number used as ROM: 0
    Number used as Memory: 60 out of 1,440 4%
      Number used as Dual Port RAM: 60
        Number using O6 output only: 8
        Number using O5 output only: 0
        Number using O5 and O6: 52
      Number used as Single Port RAM: 0
      Number used as Shift Register: 0
    Number used exclusively as route-thrus: 57
      Number with same-slice register load: 52
      Number with same-slice carry load: 5
      Number with other load: 0
Slice Logic Distribution:
  Number of occupied Slices: 1,427 out of 1,430 99%
  Number of MUXCYs used: 496 out of 2,860 17%
  Number of LUT Flip Flop pairs used: 5,536
    Number with an unused Flip Flop: 2,343 out of 5,536 42%
    Number with an unused LUT: 229 out of 5,536 4%
    Number of fully used LUT-FF pairs: 2,964 out of 5,536 53%
    Number of slice register sites lost
      to control set restrictions: 0 out of 11,440 0%
And here's the output in .mrp using the rom architecture of decode_table:
Slice Logic Utilization:
  Number of Slice Registers: 3,323 out of 11,440 29%
    Number used as Flip Flops: 3,319
    Number used as Latches: 3
    Number used as Latch-thrus: 0
    Number used as AND/OR logics: 1
  Number of Slice LUTs: 4,471 out of 5,720 78%
    Number used as logic: 4,387 out of 5,720 76%
      Number using O6 output only: 3,684
      Number using O5 output only: 162
      Number using O5 and O6: 541
      Number used as ROM: 0
    Number used as Memory: 60 out of 1,440 4%
      Number used as Dual Port RAM: 60
        Number using O6 output only: 8
        Number using O5 output only: 0
        Number using O5 and O6: 52
      Number used as Single Port RAM: 0
      Number used as Shift Register: 0
    Number used exclusively as route-thrus: 24
      Number with same-slice register load: 19
      Number with same-slice carry load: 5
      Number with other load: 0
Slice Logic Distribution:
  Number of occupied Slices: 1,400 out of 1,430 97%
  Number of MUXCYs used: 496 out of 2,860 17%
  Number of LUT Flip Flop pairs used: 4,968
    Number with an unused Flip Flop: 1,785 out of 4,968 35%
    Number with an unused LUT: 497 out of 4,968 10%
    Number of fully used LUT-FF pairs: 2,686 out of 4,968 54%
    Number of slice register sites lost
      to control set restrictions: 0 out of 11,440 0%
Only a small reduction in occupied slices (1,427 to 1,400) but about a
16% reduction in Slice LUTs used (5,307 to 4,471). The timing score goes
from 20840 (reverse_logic) to 21516 (rom).
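For anyone who wants to redo the arithmetic on other builds, here is a
minimal sketch of the comparison. The counts are copied from the two .mrp
excerpts above; the dictionary layout and the function names
(relative_reduction, compare) are made up for illustration and are not
part of any J-core or Xilinx tooling.

```python
# Counts copied by hand from the two .mrp reports quoted in this message.
# The structure and names below are illustrative assumptions only.
REPORTS = {
    "reverse_logic": {"slice_luts": 5307, "occupied_slices": 1427},
    "rom": {"slice_luts": 4471, "occupied_slices": 1400},
}


def relative_reduction(before, after):
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return 100.0 * (before - after) / before


def compare(reports, baseline="reverse_logic", candidate="rom"):
    """Compute the relative reduction for every resource in the baseline."""
    base = reports[baseline]
    cand = reports[candidate]
    return {name: relative_reduction(base[name], cand[name]) for name in base}


if __name__ == "__main__":
    for name, pct in compare(REPORTS).items():
        print(f"{name}: {pct:.1f}% fewer with the rom architecture")
```

Note this is the reduction relative to the reverse_logic build, which is
a different number from the change in device utilization (92% to 78% of
the 5,720 available LUTs).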
> On the other hand, I think the selective stripping or stripping down of
> pipeline appendages... barrel shifter, MAC unit, etc is a useful
> exercise in making the implementation as clean as possible. The
> compiler support could be tweaked, but some instructions removed might
> require support in libgcc.a
Removing groups of instructions and the associated hardware will be
interesting. Will need to revisit how cpu_gen works. Will J1 have SH-2's
rotate and shift instructions (ROTL/ROTR, ROTCL/ROTCR, SHAL/SHAR,
SHLL*/SHLR*)?
- Geoff