[J-core] Porting J1 to LiteX

William D. Jones wjones at wdj-consulting.com
Wed Jun 16 03:36:32 UTC 2021


 >Yup.  GMail is becoming unusable for business also.  It's now blocking 
anything with 'investment' or other often 'misused' business and 
financial terms etc.  Useless.

Today I added SPF, DMARC, and DKIM. GMail (and Microsoft!) is still 
blocking my messages. I guess we'll have to wait and see...
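For reference, the three records look roughly like this in a zone file (domain, selector, IP, and key are placeholders here, not my actual records):

```
; SPF: authorize the mail server's address to send for the domain
example.com.                 IN TXT "v=spf1 mx ip4:203.0.113.10 -all"
; DKIM: public key published under a selector (key truncated)
mail._domainkey.example.com. IN TXT "v=DKIM1; k=rsa; p=MIIBIjANBg..."
; DMARC: ask receivers to quarantine failures and mail aggregate reports
_dmarc.example.com.          IN TXT "v=DMARC1; p=quarantine; rua=mailto:postmaster@example.com"
```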

 >Ok, I'll have a look, thanks :)

If my changes get accepted back into the main repo, would you consider 
making jcore-j1-ghdl capable of supporting CPU variants (no cache, 
icache, icache/dcache, multi-cycle multiplier)? VHDL doesn't seem to 
have an equivalent of Verilog `ifdef, so conditional compilation for 
variants seems more difficult.
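The usual VHDL workaround is an "if ... generate" block driven by a top-level generic, with the build wrapper picking the generic values (and file list) per variant. A hypothetical sketch of the wrapper side in Python (names and file lists are made up, not the actual jcore-j1-ghdl build; GHDL really does accept -gNAME=VALUE at elaboration):

```python
# Hypothetical variant table: each variant selects which sources get
# compiled and which generics the top-level "if ... generate" blocks see.
VARIANTS = {
    "nocache":   {"sources": ["j1.vhd"],
                  "generics": {"HAVE_ICACHE": 0, "HAVE_DCACHE": 0}},
    "icache":    {"sources": ["j1.vhd", "icache.vhd"],
                  "generics": {"HAVE_ICACHE": 1, "HAVE_DCACHE": 0}},
    "fullcache": {"sources": ["j1.vhd", "icache.vhd", "dcache.vhd"],
                  "generics": {"HAVE_ICACHE": 1, "HAVE_DCACHE": 1}},
}

def ghdl_elab_args(variant):
    """Build GHDL elaboration arguments overriding top-level generics."""
    cfg = VARIANTS[variant]
    args = []
    for name, val in sorted(cfg["generics"].items()):
        args.append("-g{}={}".format(name, val))
    return args

print(ghdl_elab_args("icache"))  # ['-gHAVE_DCACHE=0', '-gHAVE_ICACHE=1']
```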

 >This is the script that it's using 
https://github.com/j-core/j-core-ice40/blob/master/ram.sh

Excellent, I can also use this script/similar to add VHDL support to 
icebram a bit later, which will make iterating SoCs quicker.
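The bin-to-VHDL-array step that ram.sh does could also be mirrored in a few lines of Python. A hypothetical sketch (not the actual script; it assumes big-endian 32-bit words and a zero-padded image, which may not match how ram.sh packs things):

```python
import struct

def rom_to_vhdl(data, name="rom"):
    """Emit a VHDL constant array initializer for a ROM image.
    Assumes big-endian 32-bit words, zero-padded to a word boundary."""
    data = data + b"\x00" * (-len(data) % 4)      # pad to whole words
    words = struct.unpack(">%dI" % (len(data) // 4), data)
    lines = ['x"%08x"' % w for w in words]
    return ("constant %s : rom_t := (\n    " % name
            + ",\n    ".join(lines)
            + "\n);")

print(rom_to_vhdl(b"\xde\xad\xbe\xef\x00\x01"))
```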

 >Hmmm.  IMHO don't do that, just merge them in the wrapper. Far less logic.

Contrary to what I thought (this _might've_ not been true of LiteX in 
the past, with support added recently), LiteX can handle either case: 
buses merged by the CPU wrapper, or left separate and merged by LiteX 
itself: 
https://github.com/enjoy-digital/litex/search?p=2&q=periph_buses

The `interconnect_shared_cls` in the linked code will either create a 
round-robin arbiter if there are two (or more!) buses, or do a direct 
connection with no logic if there's only one master: 
https://github.com/enjoy-digital/litex/blob/master/litex/soc/integration/soc.py#L986-L993

The two-bus case covers CPUs that didn't merge their buses; the one-bus 
case, which I'll use here, covers CPUs that merged the buses themselves.
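The decision in the linked code boils down to something like this (a simplified Python sketch of the dispatch, not LiteX's actual classes):

```python
class DirectConnect:
    """Zero-logic case: a single master is wired straight through."""
    def __init__(self, master):
        self.master = master
    def grant(self):
        return self.master

class RoundRobinArbiter:
    """Two or more masters take turns; each grant() advances the pointer."""
    def __init__(self, masters):
        self.masters = list(masters)
        self.idx = 0
    def grant(self):
        m = self.masters[self.idx]
        self.idx = (self.idx + 1) % len(self.masters)
        return m

def interconnect_for(masters):
    # Mirrors the choice in soc.py: one master needs no arbitration
    # logic at all, while several masters get a round-robin arbiter.
    if len(masters) == 1:
        return DirectConnect(masters[0])
    return RoundRobinArbiter(masters)

arb = interconnect_for(["ibus", "dbus"])
print(arb.grant(), arb.grant(), arb.grant())  # ibus dbus ibus
```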

 >5 min is lightning fast good :)  I guess we're just used to huge 
multicore, DSP, GPS and signal processing accelerator PnR runs with 
J2smp.  Those can take up to 1 hour ;)

Fair, but this is still unusually long compared to, say, lm32. Will be 
fun to benchmark the CPU performance tho :P.

Sincerely,

On 6/15/2021 5:24 PM, D. Jeff Dionne wrote:
> On Jun 15, 2021, at 16:23, William D. Jones <wjones at wdj-consulting.com> wrote:
>
>> Hi Jeff,
>>
>> Gmail has conveniently decided to block any attempts from me to send email from my mail server, even though it has worked fine for the past 5 years up to this point. I'm resending the email I sent to the mailing list directly to you using my ISP's relay. This seems to work, but I'm pretty angry on principle that I have to resort to this...
> Yup.  GMail is becoming unusable for business also.  It's now blocking anything with 'investment' or other often 'misused' business and financial terms etc.  Useless.
>
>>>   But does it fit reasonably?  I had found that J1 was impractical in HX, and so did not investigate further.
>> I guess it depends on what you mean by "reasonably"- LUT usage or RAM usage.
>>
>> The entire SoC uses ~5300 LUTs on HX8K vs ~4300 LUTs on UP5K. This leaves room for some peripherals like LEDs, GPIO, and UART, XIP from SPI flash, and cache.
>>
>> However, since 8K doesn't have SPRAM, I reduced the amount of "bulk" RAM to 1Kb. This is still enough to run the bootrom, and there's still a good 25% of the EBR (4kB) left unused. The smallest e.g. ARM microcontrollers had like 4kB flash and 1kB of RAM (LPC810), and I am partial to msp430 (some of them have 128 bytes of RAM). So I'm very tolerant of microcontrollers w/ limited memory :D!
> Sure, that is certainly a useful configuration for a lot of controller applications.
> ...
>
>> If you want to duplicate my results and see LUT/EBR usages, use my copy of j1 (https://github.com/cr1901/jcore-j1-ghdl/tree/hx8k) and run "make TARGET=ice40hx8k_b_evn". LiteX is using my own copy of j1 for now, just in case I need to make changes and experiment; this is temporary.
> Ok, I'll have a look, thanks :)
>
>> Ack. Where is the script/program you use to convert the testrom to a VHDL array? I think I'd rather reuse yours for now than write my own.
> This is the script that it's using: https://github.com/j-core/j-core-ice40/blob/master/ram.sh
>
>>> keep in mind that J1 is still a full Harvard machine, so you'll need to mux it down to a single master.
>> LiteX provides its own mux on the Wishbone bus, so I would adapt both the D and I buses to Wishbone before the mux.
> Hmmm.  IMHO don't do that, just merge them in the wrapper.  Far less logic.
>
>>> I don't think timing closure, on FPGA multipliers tend to be very fast.  There are a few critical paths, the one that irks me is the T bit feeding into the microcode sequencer.  But when I wrote the MAC unit, it was very clean, even if it's picked up a bit of cruft since.
>> Ack. One thing I'd like to add: nextpnr has trouble routing the up5k version of the SoC, even with the DSPs and SPRAM relieving about 1k LUTs for other use. nextpnr can take upwards of 5 minutes to route on up5k, and by changing the PCF, I could get nextpnr to take over 10 minutes to route before I cancelled it. I'll ask one of the nextpnr devs for some insight.
> 5 min is lightning fast good :)  I guess we're just used to huge multicore, DSP, GPS and signal processing accelerator PnR runs with J2smp.  Those can take up to 1 hour ;)
>
>>> IIRC, some are instruction chewers.  J1 is a highly encoded and more complex operation per instruction ISA, and pipelined machine with parallel ALU, MAC and shift units.  The throughput might be comparable, even at a slower clock :)
>> The time it takes to checksum the main payload in LiteX BIOS may be a good benchmark.
> Excellent.
>
> Cheers,
> J.
>
>> Sincerely,
>>
>> -- 
>> William D. Jones
>> wjones at wdj-consulting.com
>>
>> _______________________________________________
>> J-core mailing list
>> J-core at lists.j-core.org
>> https://lists.j-core.org/mailman/listinfo/j-core
>>

-- 
William D. Jones
wjones at wdj-consulting.com


