[J-core] [RFC] SIMD extension for J-Core
Ken Phillis Jr
kphillisjr at gmail.com
Thu Oct 26 16:12:42 EDT 2017
On Thu, Oct 26, 2017 at 8:59 AM, Christopher Friedt
<chrisfriedt at gmail.com> wrote:
> On Oct 26, 2017 9:04 AM, "emanuel stiebler" <emu at e-bbes.com> wrote:
> On 2017-10-25 17:59, Ken Phillis Jr wrote:
>> New CPUID Flags:
> Just a short one,
> for the floating points, I would prefer
> I would suggest shortening those further.
> * dot notation is a bit nicer
> * u := unsigned
> * s := signed
> * f32 := IEEE 754 32-bit float
> * f16 := half precision
> * similarly, f64, u64, ...
I was realistically thinking the items I mentioned would work like the
CPUID instruction on x86: each of the seven items maps to a single bit
in a built-in, read-only on-chip table that is queried the same way.
Table id: TBD - This Table entry covers SIMD Information.
bits 0 to 7 - Integer size support, where bit 0 is for 8-bit integers,
bit 1 is for 16-bit integers, etc.
bit 8 - Reserved.
bit 9 to 15 - Floating Point Size support - Bit 9 is 16-bit floats,
bit 10 is 32-bit floats, etc.
bit 16 to 31 - Reserved for future use.
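
To make the layout above concrete, here is a rough C sketch of how software
might decode such a table entry. Nothing here is final: the macro and
function names are made up for illustration, the table id is still TBD, and
the way the 32-bit word is read from the chip is not shown because it has not
been defined. Only the bit positions spelled out above are named.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks for the proposed SIMD information word. */
#define SIMD_INT_MASK   0x000000FFu  /* bits 0-7: integer size support */
#define SIMD_FLOAT_MASK 0x0000FE00u  /* bits 9-15: float size support  */
#define SIMD_RSVD_MASK  0xFFFF0100u  /* bit 8 and bits 16-31: reserved */

/* Individual flags that the proposal names explicitly. */
#define SIMD_INT8 (1u << 0)   /* 8-bit integers */
#define SIMD_F16  (1u << 9)   /* 16-bit floats  */
#define SIMD_F32  (1u << 10)  /* 32-bit floats  */

/* Returns 1 if the given capability flag is set in the word. */
static int simd_has(uint32_t word, uint32_t flag)
{
    return (word & flag) != 0;
}

/* Returns 1 if no reserved bit is set (assumes reserved bits read as 0). */
static int simd_word_valid(uint32_t word)
{
    return (word & SIMD_RSVD_MASK) == 0;
}

/* Usage example with a made-up word: 8-bit integers and 32-bit floats. */
static void simd_example(void)
{
    uint32_t word = SIMD_INT8 | SIMD_F32;

    assert(simd_word_valid(word));
    assert(simd_has(word, SIMD_INT8));
    assert(!simd_has(word, SIMD_F16));
}
```

Keeping one bit per element size means a feature test is a single AND, and
the reserved bits leave room for later sizes without changing the table id.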
> BGB - could you mention on the list how the FPU design differs between SH
> and x86/mmx ?
> Having ported FFTW over to ARM neon I'm extensively familiar with it and
> know it is strikingly similar to mmx. I know that (at least on ARM) the
> contention you've mentioned is quite significant. Pipeline stalls must be
> precisely inserted to ensure correct results for simd instructions are
> obtained at the correct times, etc. Vector loads and stores,
> cache-prefetching, and i/o alignment were critical.
> For A8 that meant fine-tuning the instructions generated by the compiler.
> There were also some memory barriers, iirc. Mostly hand-written assembly.
> Intrinsics were ~ meh.
> I did work on it before the A9 OOO pipeline was introduced, but also know
> that having an out-of-order unit helped simd on ARM.
> Again, I'm really curious how the FPU design differs, because if SH / J-Core
> can avoid that mess, it would be better off.