[J-core] json instructions set

Tue Jun 27 00:35:17 EDT 2017

On 6/26/2017 7:12 PM, Cedric BAIL wrote:
> Hello,
>
> So I have been working on making
> http://www.shared-ptr.com/sh_insns.html more usable for developing
> new instructions, as I think we are taking the road of adding more of
> them in the future and we also need an easier way to discuss future
> proposals.
>
> I have put my work on https://github.com/Bluebugs/sh-insns . It
> needs to be installed either locally or on a webserver. It doesn't
> require any database or anything; it uses client-side JavaScript for
> everything (which makes it a bit heavy on your computer, but it isn't
> too bad considering the benefit). This new webpage allows sorting,
> filtering and regexp-based search on the entire instruction set (and
> you can combine these together). The instructions themselves are
> actually described in a separate JSON file (thus the title of the
> email) that would allow pull-request discussion for expanding the
> J-Core instruction set.
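
as a rough illustration of the idea (not taken from the sh-insns repo; the field names "mnemonic", "encoding" and "note" are assumptions made up for this sketch, and the real page does this in client-side JavaScript), regexp search plus filtering plus sorting over a JSON instruction list could look something like:

```python
import json
import re

# Hypothetical JSON layout; the entries reuse opcodes listed later in this mail.
INSNS_JSON = """
[
  {"mnemonic": "SHLR32 Rn", "encoding": "4n39", "note": "64-bit mode"},
  {"mnemonic": "SHLL4 Rn",  "encoding": "4n34", "note": ""},
  {"mnemonic": "SHLL32 Rn", "encoding": "4n38", "note": "64-bit mode"},
  {"mnemonic": "SHLR4 Rn",  "encoding": "4n35", "note": ""}
]
"""


def search(insns, pattern, note=None):
    """Regexp search on the mnemonic, optional exact filter on the note,
    result sorted by encoding string."""
    rx = re.compile(pattern)
    hits = [i for i in insns if rx.search(i["mnemonic"])]
    if note is not None:
        hits = [i for i in hits if i["note"] == note]
    return sorted(hits, key=lambda i: i["encoding"])


insns = json.loads(INSNS_JSON)
for insn in search(insns, r"^SHLL", note="64-bit mode"):
    print(insn["encoding"], insn["mnemonic"])
```

keeping the data in one JSON file means a proposed instruction is just a small diff to that file, which is what makes pull-request discussion practical.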

for now I have mostly been using text files (in MediaWiki format).

a problem that arises is that there is relatively little space left in 
the 16-bit range, so any non-trivial extensions will need to be careful 
to avoid conflicts (hardly any 2-register opcodes and a relatively small 
number of single-register forms remain unused).

it could make more sense to have a place to track who is using which 
opcode values.
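
a minimal sketch of what such tracking could check, assuming opcode patterns are written as 4 nibbles where digits and uppercase A-F are fixed bits and any other character (n, m, x) is an operand field. the function names and the registry layout are hypothetical, not an existing tool:

```python
FIXED = "0123456789ABCDEF"  # assumed convention: uppercase hex digit = fixed nibble


def mask_value(pattern):
    """Convert a 4-character pattern such as '4n34' into (mask, value)."""
    if len(pattern) != 4:
        raise ValueError("expected 4 nibbles: %r" % (pattern,))
    mask = value = 0
    for ch in pattern:
        mask <<= 4
        value <<= 4
        if ch in FIXED:
            mask |= 0xF
            value |= int(ch, 16)
        # any other character is an operand field: mask nibble stays 0
    return mask, value


def overlaps(a, b):
    """Two patterns collide if they agree on every bit that both of them fix."""
    ma, va = mask_value(a)
    mb, vb = mask_value(b)
    common = ma & mb
    return (va & common) == (vb & common)


def claim(registry, pattern, owner):
    """Record pattern for owner unless it collides; return the list of collisions."""
    clashes = [(p, o) for p, o in registry.items() if overlaps(p, pattern)]
    if not clashes:
        registry[pattern] = owner
    return clashes


registry = {}
claim(registry, "4n34", "BJX1")
claim(registry, "8Exx", "BJX1")
print(claim(registry, "8E12", "someone else"))  # collides with the 8Exx escape range
```

the overlap test only looks at bits fixed in both patterns, so a wide range like 8Exx correctly collides with any more specific encoding inside it.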

I reclaimed some space used originally for SH-DSP for my BJX1 
extensions, figuring that BJX1 and SH-DSP are rather unlikely to need to 
exist on the same core.

the reclaimed space is used mostly for escape codes leading into 32-bit 
instruction forms (similar in premise to SH2A), though some of the 
remaining 16-bit 1-register I-forms were also used for this.

main ranges (prefix):
* 8Axx (10001010xxxxxxxx)  MOVI24
* 8Cxx (10001100xxxxxxxx)  LDSH8, may repurpose (as more 32-bit I-forms)
* 8Exx (10001110xxxxxxxx)  main set of 32-bit I-forms
** 8Exx-xxxx (10001110xxxxxxxx-xxxxxxxxxxxxxxxx)
* also:
** 82xx (10000010xxxxxxxx)  possibly more 32-bit I-forms
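
a sketch of how a decoder could use these prefixes to find instruction boundaries. this assumes that 8Axx (MOVI24) and 8Exx are each followed by one more 16-bit word, and treats 8Cxx and 82xx as ordinary 16-bit ops since their reuse is only tentative; the helper names are made up for the example:

```python
# Assumption for this sketch: these high bytes mark a 32-bit instruction.
ESCAPE_PREFIXES = {0x8A: "MOVI24", 0x8E: "32-bit I-form"}


def insn_length(first_word):
    """Return the instruction length in bytes, given its first 16-bit word."""
    return 4 if (first_word >> 8) in ESCAPE_PREFIXES else 2


def split_stream(words):
    """Group a list of 16-bit words into per-instruction tuples."""
    out = []
    i = 0
    while i < len(words):
        n = insn_length(words[i]) // 2  # number of 16-bit words
        if i + n > len(words):
            raise ValueError("truncated 32-bit instruction at word %d" % i)
        out.append(tuple(words[i:i + n]))
        i += n
    return out


# 4334 = SHLL4 R3 (16-bit), then an 8Exx-xxxx form, then an 8Axx-xxxx form
print(split_stream([0x4334, 0x8E00, 0x1234, 0x8A01, 0x0203]))
```

the length is decided from the high byte of the first word alone, which is what keeps mixed 16/32-bit fetch simple.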

current extended ops in 16-bit space:
* 4--4
** 4n14 ROT32 Rn   //64-bit mode
** 4n34 SHLL4 Rn
** 4n44 EXTU.L Rn   //64-bit mode (zero-extend to 64-bits)
** 4n54 EXTS.L Rn   //64-bit mode (sign-extend to 64-bits)
** 4n65 MOV.W @R0, Rn   //64-bit mode (because most MOV.W ops become MOV.Q)
** 4n75 MOV.W Rn, @R0   //64-bit mode (1)
* 4--5
** 4n35 SHLR4 Rn
** 4n45 MOVHI R0, Rn   //? 64-bit mode
** 4n55 MOVHI Rn, R0   //? 64-bit mode
* 4n38 SHLL32 Rn   //64-bit mode
* 4n39 SHLR32 Rn   //64-bit mode

1: also allows a word memory-load as:
MOVI24 #disp, R0 //(or MOV #disp, R0)
ADD Rm, R0
MOV.W @R0, Rn
EXTS.L Rn //(sign extending to 64 bits)

not ideal, but probably acceptable...

I partly considered another extension (which I called "BJX32+") which 
would handle "64-bit mode" in a different way: basically, rather than 
extending directly to 64 bits, it would instead use 48-bit addressing and 
an operating mode more analogous to 8086 real-mode (far pointers and 
segmented addressing), mostly as a possibly simpler extension path for 
my existing emulator.

ended up mostly dropping the idea, but did sort of incorporate some 
ideas/concepts into my working design for BJX1 (mostly the notion that 
the 64-bit GPRs are actually register pairs). (this doesn't really 
affect the ISA all that much, but does have some effect on the 
implementation and instruction semantics, 2).

2: for example, the high bits of the GPRs don't just appear/disappear when 
switching modes, but exist as a set of additional albeit normally-hidden 
registers. the high bits would be nominally undefined in 32-bit mode, 
but more likely they will simply hold whatever value they previously 
held.

( it is also likely that loads via MOV.L, MOV.W, or MOV.B would require 
an EXTU.L or EXTS.L before operating on the result as a 64-bit value, at 
least if I go the lazy route and simply ignore the high 32 bits for MOV.L 
and friends in 64-bit mode. some other specifics are still TBD... )
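
a small model of the register-pair idea under the "lazy route" assumption, where a 32-bit load writes only the low half and leaves the hidden high half stale until an explicit extend. the class and method names are invented for the sketch and this is not the emulator's actual code:

```python
MASK32 = 0xFFFFFFFF


class Gpr64:
    """One 64-bit GPR modelled as a (hi, lo) pair of 32-bit registers."""

    def __init__(self):
        self.lo = 0
        self.hi = 0  # normally hidden in 32-bit mode

    def mov_l(self, value):
        """32-bit load, lazy route: only the low half is written."""
        self.lo = value & MASK32

    def exts_l(self):
        """EXTS.L: sign-extend the low 32 bits into the high half."""
        self.hi = MASK32 if self.lo & 0x80000000 else 0

    def extu_l(self):
        """EXTU.L: zero-extend the low 32 bits into the high half."""
        self.hi = 0

    def value64(self):
        return (self.hi << 32) | self.lo


r = Gpr64()
r.hi = 0xDEADBEEF          # stale high bits left over from earlier code
r.mov_l(0xFFFFFFFE)        # load -2 as a 32-bit value
print(hex(r.value64()))    # high half is still stale here
r.exts_l()
print(hex(r.value64()))    # now a proper 64-bit -2
```

this is why the explicit EXTS.L / EXTU.L would be needed after a load: without it the 64-bit view of the register still contains whatever the high half held before.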

I have my C compiler semi-quickly approaching being "usable", so I might 
soon-ish begin experimenting with some of this stuff.

priorities for the C compiler effort:
* get C compiler working to a more satisfactory level.
** I now have it able to build a "mostly working" version of Quake 1
** as I have been focusing mostly on debugging, generated code is still pretty bad
** OTOH: compile times are currently around 8-10x faster here than the GCC 4.2.1 build...
*** namely, rebuilding Quake 1 in ~4s vs ~35s.
* get basic SH2 and SH4 modes working (in both BE and LE modes)
** testing thus far limited mostly to SH4 LE
** currently SH2 and SH4 are higher priorities than BJX1
** TODO: correct handling of volatile and __packed pointers and similar.
* produce binary output in a format other than static-linked ELF
** note that the compiler neither produces nor consumes traditional object files.
** static libraries are currently handled by compiling them to a stack-based IR.
** a hope is to later support a DLL/SO use-case.
** the output format will probably differ depending on how the binaries are to be used.
*** bare metal: raw binary or flat ELF
*** Linux: PIC or FDPIC
*** testkern: probably either PE or a custom "WEX" format or similar
**** where WEX is sort of like a hybrid of PE and the Doom WAD format.
**** designed partly to address some annoyances with both PE and ELF; aims to be fairly simplistic.
**** not really intended for widespread use.

> As I know there will be discussion regarding the license, it is
> currently under GPLv3, but it can be relicensed to whatever anyone
> really prefers. I don't think this work is impacted by the original
> license of the code that generates
> http://www.shared-ptr.com/sh_insns.html (GPLv3 doesn't apply to the
> output of the program and you can run that program locally). I have
> also not extracted the SVG that the original site contains, so as not
> to be affected by the GPLv3. So all in all, I think this is fine to be
> relicensed (I have checked in the node.js script I used to
> extract the information and I made sure that all manual modifications
> were visible in the git history).

my stuff is currently under the MIT license (excluding code I don't own, 
most of which is GPLv2).

my thinking is that people could use my stuff for whatever under pretty 
much any terms.
though, there is always the possibility of people being a-holes with 
patents or similar...

> My hope here would be that the J-Core project accepts this as the first
> J-Core github project and that we can use this for future discussion
> on new instructions. Maybe it could also become a documentation
> repository for the J-Core in general (this file only covers
> instructions for the moment and there is clearly more to a CPU than
> just its instruction set).

yeah, would like to know what others are doing.
pointless mutual incompatibility is preferably avoided.