[J-core] [RFC] SIMD extension for J-Core
cr88192 at gmail.com
Tue Oct 31 14:05:13 EDT 2017
On 10/31/2017 1:34 AM, Ken Phillis Jr wrote:
> On Mon, Oct 30, 2017 at 10:11 PM, Rob Landley <rob at landley.net> wrote:
> > On 10/29/2017 10:37 PM, Ken Phillis Jr wrote:
> >> Information on the SH-3 Is not exactly sparse,
> > We've noticed.
> >> This processor has 3 main versions:
> >> SH-3: The Feature added in this is MMU Instructions, and these are
> >> generally in the System Control Instructions
> > Yeah, that. Except Jeff decided not to use their MMU design because it
> > takes up an unreasonable amount of space in an FPGA (more than doubling
> > the size of the SOC). And we already backported sh3's barrel shift
> > instructions, most of the rest of the instructions they added were for
> > fiddling with the TLB or doing the DSP/FPU stuff.
> > So j3 is adding _an_ MMU, but not necessarily the sh3 mmu. I'll see if
> > we can get more detail posted about the j3 mmu design next month.
> >> SH-3E: the SH-3 Instructions with 32-bit Floating Point Instructions
> >> and registers added.
> > The problem is 32 bit floating point instructions aren't hugely useful,
> > everything interesting's a double. (Even printf("%f") is specced to take
> > a double argument, not a float.) So when we do an FPU, we're most likely
> > to jump straight to the 64 bit version (with maybe a compile time option
> > to strip it down to 32 bits, but we'll see).
> Not everything is interested in doubles (64-bit floats). I know the
> C/C++ Library specification is geared to 64-bit floating point, but
> they do also include definitions for 32-bit floating point values as
> Anyways, it's important to offer 32-bit floating point for developers
> to use since 95% of games and most major 3d specs use it...
yes, basically true from what I have seen (namely, 32-bit single
precision is the more dominant type in practice).
it is possible to have an FPU that does both, e.g., by up-converting to
double internally and then down-converting the result. the cost of this
is modest if one skips supporting denormals (it is mostly bit-twiddling,
which is fairly cheap). an alternative is to give the main "unit" a
single/double flag, and have it unpack/repack directly in the desired
format (sparing some details here).
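A minimal sketch of the "up-convert, operate in double, down-convert" idea, written as C bit manipulation rather than hardware. The function names are made up for illustration, and several things are assumptions: denormals are flushed to zero, narrowing truncates instead of rounding to nearest, and the host `double` is taken to be IEEE binary64 (it stands in for the shared double datapath).

```c
#include <stdint.h>
#include <string.h>

/* Widen binary32 bits to binary64 bits. Denormal inputs are flushed
   to signed zero (assumption: no denormal support). */
uint64_t f32_to_f64_bits(uint32_t f)
{
    uint64_t sign = (uint64_t)(f >> 31) << 63;
    uint32_t exp  = (f >> 23) & 0xFF;
    uint64_t frac = (uint64_t)(f & 0x7FFFFF) << 29;   /* 23 -> 52 bits */

    if (exp == 0)                     /* zero or denormal */
        return sign;
    if (exp == 0xFF)                  /* infinity or NaN */
        return sign | (0x7FFULL << 52) | frac;
    return sign | ((uint64_t)(exp + 896) << 52) | frac; /* rebias 127 -> 1023 */
}

/* Narrow binary64 bits to binary32 bits. The fraction is truncated
   (assumption: a real FPU would round), underflow flushes to zero,
   overflow goes to infinity. */
uint32_t f64_to_f32_bits(uint64_t d)
{
    uint32_t sign   = (uint32_t)(d >> 63) << 31;
    int32_t  exp    = (int32_t)((d >> 52) & 0x7FF);
    uint64_t frac52 = d & 0xFFFFFFFFFFFFFULL;
    uint32_t frac   = (uint32_t)(frac52 >> 29);       /* 52 -> 23 bits */

    if (exp == 0x7FF) {               /* infinity or NaN */
        if (frac52 != 0)
            frac |= 0x400000;         /* keep a NaN a NaN after truncation */
        return sign | 0x7F800000u | frac;
    }
    exp -= 896;                       /* rebias 1023 -> 127 */
    if (exp <= 0)
        return sign;                  /* too small: flush to zero */
    if (exp >= 0xFF)
        return sign | 0x7F800000u;    /* too large: infinity */
    return sign | ((uint32_t)exp << 23) | frac;
}

/* Single-precision add done on the double path. */
uint32_t f32_add_via_f64(uint32_t a, uint32_t b)
{
    uint64_t wa = f32_to_f64_bits(a), wb = f32_to_f64_bits(b), wr;
    double da, db, dr;

    memcpy(&da, &wa, sizeof da);
    memcpy(&db, &wb, sizeof db);
    dr = da + db;                     /* stand-in for the double datapath */
    memcpy(&wr, &dr, sizeof wr);
    return f64_to_f32_bits(wr);
}
```

Both conversions are only shifts, masks, and a constant exponent rebias, which is why this is cheap once denormals are out of the picture.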
the usual advantage of single is that it allows working with a lot more
numbers without eating as much memory.
for similar reasons, half-float can often be pretty useful (even when
not supported by hardware), though a case could (possibly) be made for
an instruction to handle F16 <-> F32 conversion (rather than doing it
using function calls and explicit bit-twiddling; newer x86 and ARM have
special instructions for this).
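For reference, a rough sketch of what the software F16 <-> F32 conversion looks like when done with explicit bit-twiddling in C. The function names are hypothetical, and this is a simplified version under stated assumptions: half denormals are flushed to zero and the narrowing direction truncates rather than rounding to nearest.

```c
#include <stdint.h>

/* binary16 bits -> binary32 bits. Half denormals are flushed to
   signed zero (assumption, to keep the sketch short). */
uint32_t f16_to_f32_bits(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t frac = (uint32_t)(h & 0x3FFu) << 13;     /* 10 -> 23 bits */

    if (exp == 0)                     /* zero or denormal */
        return sign;
    if (exp == 0x1F)                  /* infinity or NaN */
        return sign | 0x7F800000u | frac;
    return sign | ((exp + 112) << 23) | frac;         /* rebias 15 -> 127 */
}

/* binary32 bits -> binary16 bits. The fraction is truncated
   (assumption), underflow flushes to zero, overflow gives infinity. */
uint16_t f32_to_f16_bits(uint32_t f)
{
    uint32_t sign = (f >> 16) & 0x8000u;
    uint32_t e32  = (f >> 23) & 0xFFu;
    uint32_t frac = (f >> 13) & 0x3FFu;               /* 23 -> 10 bits */
    int32_t  exp  = (int32_t)e32 - 112;               /* rebias 127 -> 15 */

    if (e32 == 0xFF) {                /* infinity or NaN */
        if (f & 0x7FFFFFu)
            frac |= 0x200u;           /* keep a NaN a NaN after truncation */
        return (uint16_t)(sign | 0x7C00u | frac);
    }
    if (exp <= 0)
        return (uint16_t)sign;        /* too small for half: zero */
    if (exp >= 0x1F)
        return (uint16_t)(sign | 0x7C00u);            /* too large: infinity */
    return (uint16_t)(sign | ((uint32_t)exp << 10) | frac);
}
```

Each direction is a handful of shifts, masks, and compares, so a dedicated instruction would mostly save the call overhead and branches rather than any deep arithmetic.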
> OpenCL v1.2 - double precision floating point is optional.
> 3D graphics api - generally 32 bit floats are used by programs even
> though the spec says 64 bit floats.
> OpenGL ES 3.2 - essentially single precision floating point only. The
> word double is only used once.
> OpenGL ES 2.0 - all 32-bit floating point
> OpenVG 1.1 - uses 32 bit floats
> Bullet physics - this defaults to single precision floats
> Box2D physics - I am fairly sure this also uses 32 bit floats.
supporting double probably wouldn't mean a lacking single (in any sane
OTOH: not supporting double could mean either a performance hit (due to
emulating it in cases where it is used), or potentially unacceptable
loss of precision or other issues (in cases where a double is actually
there is some uncertainty, for example, when mixing precision in
expressions. the normal C rules specify a slower but more precise route
(namely, promoting the float operand to double), whereas the cheaper
route (sometimes taken by compilers in practice) is to quietly demote
the double to float in cases where the result will need to be float
(such as assigning the result to a float variable).
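A small C illustration of how the two routes can give different answers. The function names are made up for this example, and the `volatile` temporary is only there to force the intermediate to be rounded to float even on targets that evaluate float expressions in a wider format.

```c
/* Standard C route: the float operand is converted to double, the
   arithmetic is done in double, and the result is narrowed once. */
float diff_promoted(float a, double b)
{
    return (float)((a + b) - a);
}

/* Cheaper route: the double is demoted to float first, so every
   intermediate is rounded to single precision. */
float diff_demoted(float a, double b)
{
    volatile float t = a + (float)b;  /* forced to float precision */
    return t - a;
}
```

With a = 16777216.0f (2^24) and b = 1.0, the promoted version returns 1.0f, while the demoted version returns 0.0f, because 2^24 + 1 is not representable in single precision and rounds back to 2^24.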
> >> SH3-DSP Core: SH-3 Instructions with Arithmetic DSP Instructions
> >> added. In general these are mostly for Integer and fixed point math.
> > I'm pretty sure we're not doing that. (I vaguely recall looking at that
> > trying to find j64 instruction space, but those had _already_ been
> > repurposed by later superh processors. I.E. even later superh didn't
> > respect that, and we needed to support the "not that" uses of that
> > instruction space in j64...)
> > I think. Ask me again next week, the notes and people who wrote them are
> > in tokyo. :)
> > (That said, we may wind up adding another simple DSP to the DMA engine.
> > Maybe something 8-bit and capable of driving ethernet checksumming, PTP,
> > handling the mmc bus state engine... But that's not part of historical
> > superh.)
> >> Also, To see a comparison of the SH1, SH2, SH3, and SH4 lines of
> >> chips, you can find the instruction set summary for these at:
> >> HTML: http://www.shared-ptr.com/sh_insns.html
> > Which is the second link at the top of the j-core.org page, and looking
> > through a printout of that is how I was finding instructions to
> > potentially repurpose for j64 last year. (Which I then pointed the
> > actual engineers at so they could do the real research, I was just
> > finding candidates.)
> >> Github: https://github.com/shared-ptr/sh_insns
> >> Also, You can find the Programmers manual for the SH3 by searching the
> >> Renesas Website for the SH7705 chip, and looking for the following
> >> Document:
> >> SH-3/SH-3E/SH3-DSP Software Manual
> > See the older japanese gentleman standing next to me in the second
> > picture in https://lwn.net/Articles/647636/ wearing a red shirt? A
> > couple decades back, he was the SuperH platform architect. He had
> > _stories_ about SH3 development, and answered a lot of "why did they do
> > X" questions. (Not recently, he's moved on to other things. Retired to
> > California I think? I remember he still considered Microsoft Windows CE
> > compatibility to be important because it was a big customer back when
> > sh2 and sh3 were originally developed, so must still be relevant today
> > because reasons. He made darn sure j-core ran a lot of old Windows CE
> > binaries circa 2014 or so. *shrug* Mostly before my time. Yay
> > compatibility testing I suppose.)
> > Yes, back in the day, wince ran on sh:
> > https://msdn.microsoft.com/en-us/library/ms882059.aspx
> > Not currently a development focus of SEI, but if somebody else wanted to
> > do stuff, it's open...
interestingly, although VS2015 seems to lack an SH cross compiler (and
WinCE+SH is basically EOL'ed), its other tools are apparently still able
to work with WinCE SH objects/binaries.
I was also able to build binutils for WinCE before, but GCC proper
seems to have since dropped support (it could probably be revived if
someone wanted to beat on it enough).
though, it is possible a lot of this still exists somewhere "in the
> > Rob
> > P.S. I'm not the expert on any of this, I'm just chatty. I try to keep
> > up with the mailing list and with what everybody else is doing. We're
> > trying to resurface from the last 18 months of crazy, and when we do
> > I'll see if I can... I dunno, get Jeff to drop into the #j-core irc
> > channel on freenode for half an hour each week or something. Keep in
> > mind he's usually in japan so his day is US night.