[J-core] Hi - more info on Super-H advantages

Rob Landley rob at landley.net
Fri Jan 12 15:37:47 EST 2018


On 01/11/2018 03:58 PM, BGB wrote:
> (outsider perspective / thoughts; my efforts are independent from those
> of the J-Core project, so sentiments may differ).

It's an open source project, and the original poster was asking
questions about superh, which was developed ~20 years ago and has been
out in public for a while; we're by no means the only people who know
about it. :)

I don't remotely speak for the whole of the j-core project either; I'm
mostly working on Linux code that runs on it and a bit of documentation.
I've helped debug deep into the guts of the hardware, worked with the
devs to get SMP working, and been to Japan to hang out with the hardware
guys a half-dozen times over the past 3 years. So I can pass on a lot of
stuff I learned from other people (some of whom do not speak English as
a first language). But they're the experts...

> On 1/11/2018 5:06 AM, Dru Nelson wrote:
>>
>> Hi,
>>
>> I saw a video of the announcement of the j-core project a while back.
>> The project impressed me, and I made a note to learn more.
>>
>> Without that presentation, I would not have known about the advantages
>> of the SH architecture.
>> For example, I would not have known about the code-density of the SH.
>> I would have just assumed it was like all the other RISCs. (32 bit
>> instructions)
> 
> luckily, code density is pretty good compared with 32-bit RISCs.
> 
> but, it comes with a drawback in frequently needing to execute longer
> sequences of instructions for many operations, which partly works as a
> counter-balance to the shorter instructions in some cases.

That was the downside of conventional RISC. SuperH used microcode, so a
single instruction could take multiple clock cycles when it had
something complicated to do. That's built into the code density metrics
we use, which are mostly "compile this, what does the resulting binary
look like?"
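(Concretely, that measurement is nothing fancier than building the same
translation unit for each target with -Os and comparing the text
segment. A quick sketch; the cross-compiler prefixes in the comment are
made up, substitute whatever your toolchains are actually called:)

    /* density.c: a representative inner loop for eyeballing code
     * density. Build it per-target and compare .text sizes, e.g.:
     *
     *   sh2eb-linux-gcc -Os -c density.c && sh2eb-linux-size density.o
     *   arm-linux-gcc   -Os -c density.c && arm-linux-size   density.o
     */
    unsigned crc32_update(unsigned crc, const unsigned char *p,
                          unsigned len)
    {
        while (len--) {
            crc ^= *p++;
            for (int i = 0; i < 8; i++)
                crc = (crc >> 1) ^ (0xEDB88320 & -(crc & 1));
        }
        return crc;
    }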

> in my own testing, it is loosely comparable to x86 (but varies a fair
> bit by compiler and other factors though;

Um, so does x86?

> and comparison requires
> compensating for the relative sizes of the C libraries and similar).

Is there an architecture this isn't true for?

> Thumb2 is a fair bit more formidable in code density though, and

He asked about the development of superh. Thumb came after superh (and
licensed superh patents). Thumb2 was a second iteration developed years
after the original thumb, and it still has at least a decade to go on
its patents.

(I'm still waiting for a cortex-m with mmu. Last I checked thumb2 still
hadn't shipped as an independent instruction set with mmu, only as an
extension of arm chips with mmu.)

> generally gives a better performance relative to the total number of
> instructions executed (this is a weak area for the basic SH ISA vs
> Thumb2 or RISC-V's RVC coding or similar).

Things developed after superh do other things, sure. J-core added
cmpxchg so we could do proper modern SMP and futexes, for example.
Should I talk about j64 or our proposed vector extensions here? I didn't
think that was relevant to the original question...
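(Context for why cas matters: it's the primitive you build locks and
futexes out of without having to disable interrupts. A minimal sketch
in C11 stdatomic; how the compare-exchange actually gets lowered, to
cas.l or otherwise, depends on your target and toolchain:)

    #include <stdatomic.h>

    /* Toy spinlock on top of compare-and-swap, the shape of primitive
     * SMP kernels and futexes are built from. 0 = free, 1 = held. */
    typedef struct { atomic_int v; } spinlock;

    static void spin_lock(spinlock *l)
    {
        int expect = 0;
        /* Atomically turn 0 into 1. If another cpu got there first,
         * the exchange fails (and rewrites expect), so reset and spin. */
        while (!atomic_compare_exchange_weak(&l->v, &expect, 1))
            expect = 0;
    }

    static void spin_unlock(spinlock *l)
    {
        atomic_store(&l->v, 0);
    }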

(We have a "j0" project under consideration that involves yanking
unused/underused instructions to get a really _tiny_ chip. Our general
idea is "if j2 runs j0 code then it's the same family". But all
optimizing is optimizing _for_ something. For example, power consumption
to performance tradeoffs have been a big thing for a while, readahead
and speculative execution and register renaming and such do work that's
sometimes discarded, gaining performance at the expense of power
efficiency. There are other tradeoffs of die size, auditability, what's
involved in fabbing an asic (on which process with which manufacturing
partners), an entire landscape of IP claims... heck, the j-core build
has three versions of some files optimized for ghdl, fpga, or asic,
depending on what you configure it to build.)

> experimentally, it is possible to improve the ISA on both factors
> (making it both smaller and getting more work done in fewer
> instructions),

I.E. he's been posting on here about his own home-grown architecture
that's not related to j-core. It's a bit like posting about llvm on the
gcc list because they're both C compilers, or netbsd on the linux kernel
mailing list because they're both unixes. (Well, more like a developer
going "I wrote my own kernel from scratch, it also implements posix
system calls so you can compile some linux software for it, I'm the
only contributor and I haven't got a mailing list for this project but
let me tell you what I did instead of procfs...")

Alas, there isn't a "superh" mailing list the same way there isn't a
"unix" mailing list where linux and darwin and openbsd and solaris are
all on an equal footing. (Well, not since the usenet days.) So he posts
here. It kind of annoys the developers at SEI (the company that
sponsored j-core development through its open source release), who have
been shipping real products to customers for years. (Not in nearly
the _volumes_ they'd like, but still.)

I suppose if he could get Linux to run on his project he could start
talking about his thing on the linux-sh kernel mailing list instead? It
would be slightly less off-topic than posting about it here...

It's nice that other people are interested in the superh instruction set
again. Even Renesas has started investing in it again (as far as I can
tell, on the logic that if the j-core guys see value there, Renesas must
have missed something, but everything I've heard there is thirdhand).

But j-core is a specific development project. (Which really really
REALLY needs to get the code up on github already, the last tarball's
over a year out of date. It's our fault we haven't been able to take
proper advantage of the community contributions...)

> decided mostly to leave out specifics for 32 vs 64-bit ISA variants
> (as-is there are several 64b variants, still TBD which will be
> "canonical"),

Jeff has had it written down in a folder in his office for something
like 2 years now?

Oh, you're still referring to your emulator.

> in my case, the project to do an FPGA implementation of these is still
> ongoing, but is going slowly; basically, the amount of work required to
> make something work plausibly, and synthesize with a plausible resource
> cost, is a fair bit harder than what one may experience writing code in
> C or similar (combined with my relative inexperience here, and being new
> to CPU design, ..., is making this project go a bit slowly).

And what would work in an fpga can be very different from what would
work in an asic, and that's before you get to "floorplan" work to
optimize for specific fabs or FPGA models...

Rob

