From w4m4ck at gmail.com Sat Apr 10 10:56:25 2021
From: w4m4ck at gmail.com (a l3x)
Date: Sat, 10 Apr 2021 12:56:25 +0200
Subject: [J-core] Turtle Board Documentation
Message-ID:

sorry for jumping in...

rob wrote:
> P.S. We sometimes talk about doing a 2.0 design someday with an ICE40 J1 instead
> of Atmel, maybe Artix-7 and DDR3, possibly USB-C and an SDIO wifi chip... dunno
> when Pi 3 form factor stops being useful and we'd need pi 4 to get cases? Except
> what we REALLY need to make is a Lattice ECP5 board that can run J2 (since that
> can use the open GHDL toolchain, although the result's probably like 12mhz
> because ECP5 is _slow_). But again... that doesn't solve the problem that we're
> not set up for retail sales. We're pretty good at designing new boards in house
> and doing a run of a few dozen prototypes, we've already made multiple "hat"
> boards that plug into the 1v1 Turtles for various development projects. And we
> can set up high volume manufacturing to hand off to a B2B hardware partner that
> can deploy a zillion boards into some supply chain. It's the "sell small
> quantities retail" part in between that's... not what we do. And things like
> kickstarter or "amazon fulfillment" turn out not to actually do the bits that
> are missing.

TL;DR i have been looking into giving (after 20yrs back @univ) fpga a second
"chance" because of all the awesome open source achievements regarding
yosys, nextpnr, ghdl, ....

there are a couple ECP5 boards out there which could be used as "base" for
a sh2/j2/j1 core:

- https://github.com/gregdavill/OrangeCrab
- https://github.com/butterstick-fpga/butterstick-hardware
- https://www.crowdsupply.com/radiona/ulx3s
- https://shop.lambdaconcept.com/home/46-1-ecpix-5.html#/1-ecpix_5_fpga-ecpix_5_45f

i bought a lambdaconcept ecpix-5 with an ecp5 45F couple days ago. 99€.
(also, almost none of the projects mentioned above have boards ready to buy ...)

besides the j* core there is a 20yrs old sh2 verilog implementation available at

- https://github.com/freecores/Aquarius

and

- https://www.patreon.com/srg320 <- this guy is working on a sh2 fpga
  implementation in order to be able to simulate a sega saturn system in its
  entirety using the MiSTer-FPGA platform:
  https://github.com/MiSTer-devel/Main_MiSTer/wiki
  (basically Cyclone V in form of de10 nano plus sandwich pcb(s))
  requires intel quartus which was a show stopper for me.
  that guy doesn't care about j-core. he implements from scratch
  according to a twitter post. (understandable, because for emulation the
  result needs to be cycle accurate wrt orig sh2).

actually i'm more kind of a low-level c++ with inline asm hacker than
hardware engineer. however, the sh2 is an awesome very compact isa. much more
accessible than riscv. i think because i'm used to x86 2-address code more
than 3-address code of arm or risc*.

so i tried to get a softcore working on the ecp5. on the road picking up a
"design" i got kind of overwhelmed by the various different board variants
software and $stuff.

what would be an awesome "mini" project: porting the fpga fantasy console to sh2:
https://github.com/dan-rodrigues/icestation-32
first to use an sh2 core instead of the picorv32 / vexriscv. then get it
running on my ecpix (which is not compatible to the ulx3s which is supported
out of the box).

regarding your speed concerns on an ecp5: the microwatt powerpc including fpu
synthesizes successfully for an 85F using like 80-90% resources at 40Mhz.
(via ghdl/yosys/nextpnr). my 45F variant is too small for this.

during the synthetization trial and errors i realized that the j1-core-ghdl
uses twice as much resources on the fpga as the full icestation-32 soc.
(well, j1 core took like 15-20% resources afair).
i expected the opposite because of (almost) 16bit "only" opcodes.

rounding up: i hope i can get the j1 or aquarius running on my ecp5 board.
i don't care about linux i want bare metal for starters. unfortunately, my
vhdl/verilog skillz are basically non-existent...

it would be awesome if the j-core team could partner up having a low cost
"easy" to use fully open source solution available. i understand that low
volume productions are not your business, yet it seems that there is an
enormous momentum building up.

all the best, al3x

From rob at landley.net Tue Apr 13 02:48:11 2021
From: rob at landley.net (Rob Landley)
Date: Mon, 12 Apr 2021 21:48:11 -0500
Subject: [J-core] Turtle Board Documentation
In-Reply-To:
References:
Message-ID: <9288ae72-1d98-31f0-40cd-cef2b40bf3b0@landley.net>

On 4/10/21 5:56 AM, a l3x wrote:
> TL;DR i have been looking into giving (after 20yrs back @univ) fpga a second
> "chance" because of all the awesome open source achievements regarding
> yosys, nextpnr, ghdl, ....

Having a proper open source VHDL toolchain that can actually build real world
things is exciting, yes. (Pity it can't seriously target xilinx yet...)

> there are a couple ECP5 boards out there which could be used as "base" for
> a sh2/j2/j1 core:
>
> - https://github.com/gregdavill/OrangeCrab
> - https://github.com/butterstick-fpga/butterstick-hardware
> - https://www.crowdsupply.com/radiona/ulx3s
> - https://shop.lambdaconcept.com/home/46-1-ecpix-5.html#/1-ecpix_5_fpga-ecpix_5_45f

Interesting. I'll wave that at people and see what they think.

> i bought a lambdaconcept ecpix-5 with an ecp5 45F couple days ago. 99€.
> (also, almost none of the projects mentioned above have boards ready
> to buy ...)
>
> besides the j* core there is a 20yrs old sh2 verilog implementation available at
>
> - https://github.com/freecores/Aquarius

Sounds familiar. I think they used it as a reference implementation to assemble
a CPU test suite against back in 2012?

> and
>
> - https://www.patreon.com/srg320 <- this guy is working on a sh2 fpga
> implementation in order to be able to simulate a sega saturn system in its
> entirety using the MiSTer-FPGA platform:
> https://github.com/MiSTer-devel/Main_MiSTer/wiki
> (basically Cyclone V in form of de10 nano plus sandwich pcb(s))
> requires intel quartus which was a show stopper for me.
> that guy doesn't care about j-core. he implements from scratch
> according to a twitter post.
> (understandable, because for emulation the result needs to be cycle
> accurate wrt orig sh2).

Indeed. We looked into doing a saturn implementation for the 20th(?) anniversary
a few years back, but gave up both because "too busy with other things" and
because we're _not_ cycle accurate. (There's places we take fewer clocks to do
stuff that we'd have to slow _down_ in order to match sh2, or fetch/cache works
differently...)

And unfortunately saturn basically sticks two sh2 chips on the same bus and lets
them fight it out with no coordination. (There was a two player fighting game
that did collision detection in the graphics buffer, for example.) Getting the
same behavior is _entirely_ timing butterfly effects and cycle counting.

We liked the instruction set and wanted enough compatibility to reuse existing
toolchains and Linux support and such, but never intended to be cycle accurate.

> actually i'm more kind of a low-level c++ with inline asm hacker than
> hardware engineer.
> however, the sh2 is an awesome very compact isa. much more accessible
> than riscv. i think because i'm used to x86 2-address code more than
> 3-address code of arm or risc*.

It _is_ possible to do 3 address instructions in superh (there's precedent for
making r0 magic), but usually you only do that on multi-clock instructions
because the plumbing is designed/balanced for 2 instructions.

We've got a couple different register file implementations, but I believe the
one we're using in xilinx is 2 read ports and one write port, so we can read 2
registers and write back 1 register result per clock. We can get a little ahead
on the writes: there's a queue of 2 or 3 pending writes into the register file
so "x=y++" sort of things can happen in one clock. (If you fill up that queue
it'll stall a clock to flush it, but that almost never happens.)

This came up for some DSP style instructions we've been thinking of adding to do
fourier transforms and such faster. Hasn't gotten past the design stage because
the project we need it for got reshuffled down the todo list, but we did confirm
it's feasible...

> so i tried to get a softcore working on the ecp5. on the road picking
> up a "design" i got kind of overwhelmed by the various different board
> variants software and $stuff.
>
> what would be an awesome "mini" project: porting the fpga fantasy
> console to sh2:
> https://github.com/dan-rodrigues/icestation-32

We've gotten jcore up and running on ice40, using 3 different boards. There's a
separate ice40 repo on github with a much simpler build. Support for the j1 CPU
config ice40 used has been merged back upstream into our main j2 build, but the
SOC configuration system isn't flexible enough to strip the busses and such all
the way down yet. (Hence the external build which was JUST cpu and some I/O
pins...)

> first to use an sh2 core instead of the picorv32 / vexriscv. then get
> it running on my ecpix
> (which is not compatible to the ulx3s which is supported out of the box).

The problem with all of those is the memory controller. We have an lpddr2
controller but haven't done a ddr3 or sdram controller yet. (Not a huge deal,
just time consuming.)

> regarding your speed concerns on an ecp5: the microwatt powerpc
> including fpu synthesizes successfully
> for an 85F using like 80-90% resources at 40Mhz. (via ghdl/yosys/nextpnr).
> my 45F variant is too small for this.
>
> during the synthetization trial and errors i realized that the
> j1-core-ghdl uses twice as much resources on the fpga as the full
> icestation-32 soc. (well, j1 core took like 15-20% resources afair).
> i expected the opposite because of (almost) 16bit "only" opcodes.

We have plans to slim it down (to make a "j0"), we just haven't needed it for a
customer project yet. It's on the nearish-term todo list though.

> rounding up: i hope i can get the j1 or aquarius running on my ecp5
> board. i don't care about linux i want bare metal for starters.
> unfortunately, my vhdl/verilog skillz are basically non-existent...

We do bare metal stuff too. Only option on ice40. There's a bare metal ELF
toolchain, or the linux toolchain can be hit with -nostartfiles and -mno-fdpic
and such to create bare metal code (although libgcc.a becomes an issue if you
need anything out of that).

The "hello world" kernel posted a couple months back was an example of that.
Um... Here, slightly cleaned up into actual HTML:

https://j-core.org/docs/hello.html

> it would be awesome if the j-core team could partner up having a low
> cost "easy" to use fully open source solution available. i understand
> that low volume productions are not your business, yet it seems that
> there is an enormous momentum building up.

We'd like to do retail business, but might need to partner with somebody. Mostly
we've just been distracted by customer projects...

> all the best, al3x

Rob
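
Illustration of the register file arrangement described in Rob's reply above
(2 read ports, 1 write port, a short queue of pending writes): the toy Python
model below is only a sketch of that idea. The class name RegFile, the queue
depth of 2, the forwarding in read() and the "stall until the results fit"
rule are all assumptions made for this example. None of it is taken from the
J-core VHDL.

```python
from collections import deque


class RegFile:
    """Toy model (hypothetical): 1 write port fed by a small pending-write queue."""

    def __init__(self, nregs=16, queue_depth=2):
        self.regs = [0] * nregs
        self.pending = deque()          # (register, value) pairs waiting for the write port
        self.queue_depth = queue_depth  # assumed; the thread only says "2 or 3"
        self.clocks = 0
        self.stalls = 0

    def read(self, r):
        # Forward the newest queued value so pending writes are visible to readers.
        for reg, val in reversed(self.pending):
            if reg == r:
                return val
        return self.regs[r]

    def _tick(self):
        # One clock: the single write port retires at most one queued write.
        if self.pending:
            reg, val = self.pending.popleft()
            self.regs[reg] = val
        self.clocks += 1

    def issue(self, writes):
        """Issue one instruction that produces a list of (register, value) results."""
        if len(writes) > self.queue_depth:
            raise ValueError("more results than the queue can ever hold")
        while len(self.pending) + len(writes) > self.queue_depth:
            self._tick()                # stall a clock so the queue can drain
            self.stalls += 1
        self.pending.extend(writes)
        self._tick()


# "x = y++" with x in r1 and y in r2: two results from one instruction.
rf = RegFile()
rf.regs[2] = 5
y = rf.read(2)
rf.issue([(1, y), (2, y + 1)])
print(rf.clocks, rf.stalls)   # 1 0: both results accepted in a single clock
```

In this model a two-result instruction only costs a stall when it arrives
while the queue is still holding earlier results, which matches the "almost
never happens" remark as long as most instructions write one register or none.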