[J-core] [RFC] SIMD extension for J-Core

Thu Oct 26 21:00:47 EDT 2017

On Oct 26, 2017 16:14, "Rob Landley" <rob at landley.net> wrote:

On 10/25/2017 06:59 PM, Ken Phillis Jr wrote:
> I figure I will include my Idea for an easy to use SIMD Extension for
> the J-Core. In general this extension will be used for Three Types of
> number formats...

One of the blue sky J64 proposals Jeff mentioned (let me see if I can
remember the details) was 2 control bits per register letting you
specify that register contains 8/16/32/64 bit SIMD, meaning a single 32
bit control register could control SIMD state for 16 general purpose
registers. Then you use the normal operations to deal with
signed/unsigned, multiply, and so on.
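
To make the 2-bits-per-register idea concrete, here is a minimal sketch in
C of how such a control word could be decoded. The field encoding
(00 = 8-bit, 01 = 16-bit, 10 = 32-bit, 11 = 64-bit lanes) and the names
lane_bits and set_lane_mode are assumptions for illustration, not part of
any J-core specification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits [2r+1:2r] of the 32-bit control word hold the
 * lane mode for general purpose register r (0..15).
 * Assumed encoding: 0 = 8-bit, 1 = 16-bit, 2 = 32-bit, 3 = 64-bit lanes. */
static unsigned lane_bits(uint32_t simd_ctrl, unsigned reg)
{
    assert(reg < 16);
    unsigned field = (simd_ctrl >> (2 * reg)) & 0x3u;
    return 8u << field;              /* 8, 16, 32 or 64 */
}

/* Return a copy of the control word with the mode of one register changed. */
static uint32_t set_lane_mode(uint32_t simd_ctrl, unsigned reg, unsigned field)
{
    assert(reg < 16 && field < 4);
    simd_ctrl &= ~(UINT32_C(3) << (2 * reg));
    return simd_ctrl | ((uint32_t)field << (2 * reg));
}
```

Under this assumed encoding, the 64-bit setting on a 64-bit register is a
single lane, so it would behave like plain scalar operation.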

That does sound like a clever trick. But I am not sure how you would
tell whether SIMD is on or off for a given register (you could have a
256-bit register with operations on 32 bits). Still, the idea of a
single control register is neat and could make context switches easy.

The next question was how to deal with source/target size mismatches: in
theory a 64 bit source iterating over 8 bit targets could apply a single
64 bit source value to all 8 targets, a 32 bit source could go
1-2-3-4-1-2-3-4, etc. In practice what that makes the ALU look like is a
question that needs some code prototyping I think...
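
One possible reading of the 1-2-3-4-1-2-3-4 pattern is that source lanes
are reused cyclically across the target lanes. The sketch below models
that in C for eight 8-bit target lanes. It assumes the source values have
already been narrowed to the target lane width, and the function name
add_cyclic_u8 is made up for illustration; the actual proposal may have
worked differently.

```c
#include <assert.h>
#include <stdint.h>

#define DST_LANES 8   /* eight 8-bit target lanes in one 64-bit register */

/* Add source lanes into the 8-bit lanes of dst, reusing the source lanes
 * cyclically. n_src is the number of source lanes and must divide 8:
 * n_src == 1 applies one value to all targets, n_src == 4 gives the
 * 1-2-3-4-1-2-3-4 pattern. */
static uint64_t add_cyclic_u8(uint64_t dst, const uint8_t *src, unsigned n_src)
{
    assert(n_src > 0 && DST_LANES % n_src == 0);
    uint64_t out = 0;
    for (unsigned i = 0; i < DST_LANES; i++) {
        uint8_t d = (uint8_t)(dst >> (8 * i));
        uint8_t s = src[i % n_src];
        uint8_t sum = (uint8_t)(d + s);   /* wraps modulo 256 within the lane */
        out |= (uint64_t)sum << (8 * i);
    }
    return out;
}
```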

I am guessing the microcode will also look crazy complex. Is there a limit
on this side too?

> New Registers: simd0 to simd15
> These Registers are 128-bits in size, and are used to perform a bulk
> of the SIMD Math.

Is there a way to do that and _not_ quadruple the size of the processor?

If you are not looking for a massive performance improvement, just a more
compact core loop, you would have to pay for the larger register bank, but
the ALU would not need to be any wider than the largest word you operate
on (the microcode would iterate over each word in a register). That would
keep things in check, no?
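
As a rough software model of that trade-off, the C sketch below steps one
32-bit ALU operation over the four words of a 128-bit register, the way a
microcoded loop might. The types and names (vreg128, alu_op, v_apply,
alu_add32) are hypothetical, and the 32-bit word size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_VREG 4   /* 128-bit register seen as four 32-bit words */

typedef struct { uint32_t w[WORDS_PER_VREG]; } vreg128;

/* Signature of the single 32-bit ALU operation being reused. */
typedef uint32_t (*alu_op)(uint32_t, uint32_t);

/* Stand-in for the existing scalar ALU add (wraps modulo 2^32). */
static uint32_t alu_add32(uint32_t a, uint32_t b) { return a + b; }

/* Apply one narrow ALU operation to each word in turn: four passes
 * through one ALU rather than four ALUs side by side. */
static vreg128 v_apply(alu_op op, vreg128 a, vreg128 b)
{
    assert(op != NULL);
    vreg128 r;
    for (int i = 0; i < WORDS_PER_VREG; i++)
        r.w[i] = op(a.w[i], b.w[i]);
    return r;
}
```

The cost shows up as extra cycles per vector instruction instead of extra
ALU area, which is the compact-loop-over-throughput trade described above.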

> SIMD Configuration Instructions:
>
> SIMD.IMODE - This configures the Integer Math mode of the Integer SIMD
> Operations. The accepted modes should include the following:
> * Integer Carry Mode - See ADDC and SUBC for example.
> * Value UnderFlow and OverFlow Mode - See ADDV and SUBV  for examples.
> * Integer Type: Signed and Unsigned values with sizes of 8-bit,
> 16-bit, 32-bit, and 64-bits.

We're really short on instruction space. We fit j64 in, but had to
repurpose several existing instructions in 64 bit mode to do it.

For the needed additions, since they will not be used very often, a prefix
that temporarily enables a new instruction set would limit the problem.
Even then, the instruction space is quite limited, and adding an
instruction requires careful thought and testing of different scenarios.

> Data Loading/Conversion Instructions:
> Bulk Conversion From Integers to Floats, and Floats to integers is a
> must. That said, I'm not exactly sure how many Instructions are needed
> for this, but It would be reasonable to say that four to seven
> instructions may be required.

In Jeff's design you'd copy from register in one mode to register in
another mode, then do a rotate by enough bits you could do the next one,
and a rotate at the end if you needed to restore the register? (Hmmm,
that sounds like it would need an xor on the mode thing between those so
the rotate wasn't within the simd division? Possibly I misunderstood, it
was a while ago...)

One of the things that killed the performance of early SIMD CPUs was the
lack of an efficient way to shuffle bits around. I think there may still
be a patent on the PowerPC shuffle instruction, but something that allows
rotation in word-sized steps would go a long way (again, that can live in
the vector instruction space).
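
For reference, a rotation in word-sized steps could behave like the C
sketch below, which moves whole 32-bit words around a 128-bit vector
without touching the bits inside each word. The name rot_words, the
rotation direction, and the 4 x 32-bit layout are assumptions for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define NWORDS 4   /* 128-bit vector seen as four 32-bit words */

typedef struct { uint32_t w[NWORDS]; } vec128;

/* Rotate whole words toward higher indices by `steps` positions.
 * Word i of the input ends up at index (i + steps) mod 4. */
static vec128 rot_words(vec128 v, unsigned steps)
{
    assert(steps < NWORDS);   /* a 2-bit step field would be enough */
    vec128 r;
    for (unsigned i = 0; i < NWORDS; i++)
        r.w[(i + steps) % NWORDS] = v.w[i];
    return r;
}
```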

Cedric

Rob
_______________________________________________
J-core mailing list
J-core at lists.j-core.org
http://lists.j-core.org/mailman/listinfo/j-core