[J-core] PC-relative loads and delay slots

Robert Ou rqou at robertou.com
Mon Jul 18 23:55:36 EDT 2016

On Mon, Jul 18, 2016 at 8:23 PM, Rich Felker <dalias at libc.org> wrote:
> On Mon, Jul 18, 2016 at 09:52:16AM -0400, Rich Felker wrote:
>> On Mon, Jul 18, 2016 at 02:37:55AM -0700, Robert Ou wrote:
>> > Hi,
>> >
>> > What is the correct behavior of PC-relative instructions such as
>> > "mov.l @(disp, PC), Rn" in a branch delay slot? Is this even allowed?
>> > From my testing, GAS seems to think it is "disp is multiplied by 4 and
>> > added to the address of the mov.l opcode + 2" but J-core seems to
>> > execute it as "disp is multiplied by 4 and added to the address of the
>> > branch target + 2". I discovered this while working on my MyHDL
>> > demonstration, and you can compare the difference in my demonstration
>> > by running the master branch and the branch_delay_test branch.
>>
>> If true I think it's a bug. The original SH ISA documentation
>> specifies the behavior for "PC-relative" mov instructions as:
>>
>>   "The PC points to the starting address of the second instruction
>>   after this MOV instruction"
>>
>> (as opposed to the actual current value of the program counter). This
>> text is found on page 202 of document REJ09B0171-0500O.
>>
>> I'm quite surprised we haven't run into this bug, since I would expect
>> gcc to generate code with immediate loads in branch delay slots (e.g.
>> when making function calls with constant arguments).
>
> Some further info:
>
> mova is documented to produce a result relative to the branch
> destination, but pc-relative mov.l seems to be documented to behave as
> I described above. However, this is only valid for sh1/2/3. On sh4,
> both mova and pc-relative mov.l (and mov.w) are illegal in delay slots
> and result in a trap (so the kernel can emulate them very slowly if
> you really want them). This is presumably why gcc never generates
> the pc-relative mov.l in delay slots and thus why the bug has never
> affected us.
>
> Rich
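
To make the two readings above concrete, here is a rough Python sketch
of the address arithmetic. The function names are made up for
illustration, and the two delay-slot PC values are taken straight from
the descriptions earlier in this thread (opcode address + 2 versus
branch target + 2), not from the J-core sources or from GAS. The
masking of the low two PC bits for a longword load is how I understand
the SH documentation; treat all of it as an assumption-laden sketch
rather than a reference model.

```python
def movl_pc_rel_address(pc, disp):
    """Effective address of mov.l @(disp, PC), Rn for a given PC value.

    disp is the 8-bit unsigned field from the opcode. It is scaled by 4
    and added to the PC with its low two bits cleared.
    """
    if not 0 <= disp <= 0xFF:
        raise ValueError("disp is an 8-bit unsigned field")
    return ((pc & ~0x3) + disp * 4) & 0xFFFFFFFF


def pc_outside_delay_slot(insn_addr):
    # "Second instruction after this MOV": two 16-bit opcodes further on.
    return insn_addr + 4


def pc_in_delay_slot_own_address(insn_addr):
    # Reading attributed to GAS in this thread (assumption):
    # PC is the address of the mov.l opcode + 2.
    return insn_addr + 2


def pc_in_delay_slot_branch_target(branch_target):
    # Behavior observed on J-core in this thread (assumption):
    # PC is the address of the branch target + 2.
    return branch_target + 2


if __name__ == "__main__":
    slot_addr = 0x1002      # hypothetical address of the mov.l in the slot
    target = 0x2000         # hypothetical branch target
    disp = 1
    own = movl_pc_rel_address(pc_in_delay_slot_own_address(slot_addr), disp)
    tgt = movl_pc_rel_address(pc_in_delay_slot_branch_target(target), disp)
    print(hex(own), hex(tgt))
```

With these made-up addresses the two readings fetch the literal from
completely different places, which is why a constant pool placed next
to the branch is not found when the load is computed relative to the
branch target.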

In the meantime, I have written a (very ugly) patch that makes both
mova and mov.l behave "as you would intuitively expect." It is
attached. It survived some quick smoke testing (booting the kernel and
running the test program posted earlier), but I haven't tested it
extensively. The code quality is also terrible, and it violates a
bunch of abstraction barriers. Since it turns out that the "weird"
behavior is the correct behavior, this patch is probably only useful
for reference if someone wants to make a not-quite-sh2-compatible
-------------- next part --------------
A non-text attachment was scrubbed...
Name: possible-delay-slot-fix.patch
Type: text/x-patch
Size: 10514 bytes
Desc: not available
URL: <http://lists.j-core.org/pipermail/j-core/attachments/20160718/d5494fef/attachment.bin>
