[J-core] Delay slots 2: Electric Boogaloo (Illegal instructions in delay slots)

Rich Felker dalias at libc.org
Thu Jul 21 15:21:13 EDT 2016


On Wed, Jul 20, 2016 at 03:10:45PM -0700, Robert Ou wrote:
> What value of PC should be saved on the stack when an illegal
> instruction is encountered in each of the following?
> 
> a) an unconditional branch
> b) a conditional branch that is taken
> c) a conditional branch that is not taken
> 
> The behavior of J-core WITHOUT my patch seems to be:
> a) the address of the branch target
> b) the address of the branch target
> c) the address of the illegal instruction (the address of the delay slot)
> 
> The behavior of J-core WITH my patch seems to be:
> a) the address of the illegal instruction + 4
> b) the address of the illegal instruction + 4
> c) the address of the illegal instruction (the address of the delay slot)
> 
> The expectation of the Linux kernel with FPU emulation enabled
> (do_illegal_slot_inst when CONFIG_SH_FPU_EMU is defined) seems to be:
> a) the address of the branch (the address of the delay slot - 2)
> b) the address of the branch (the address of the delay slot - 2)
> c) the address of the branch (the address of the delay slot - 2)
> 
> The expectation of the Linux kernel on fixing alignment errors
> (handle_unaligned_access, not actually used for J2) seems to be
> (according to the comment above the code for SH3):
> a) the address of the branch (the address of the delay slot - 2)
> b) the address of the branch (the address of the delay slot - 2)
> c) the address of the illegal instruction (the address of the delay slot)
> 
> Which set of behaviors (if any) is correct?

I'm not basing this on any SH documentation, but from the standpoint
of being useful to software, the trap handler wants/needs to know the
address of the branch instruction. This is because, if the trap is for
the sake of emulating an instruction that's only present on
later/different versions of the ISA or a feature the hardware doesn't
support at all, the handler needs to be able to first emulate the
delay-slot instruction, then emulate the branch to adjust the program
counter before resuming.
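
A rough sketch of that order of operations in C, assuming the saved PC
is the address of the branch. Nothing here comes from the J-core or
kernel sources: struct trap_regs, slot_emulator, emulate_nop_only and
emulate_bra_with_slot are names made up for illustration, and only BRA
is handled to keep it short. The branch target is computed before the
slot is emulated so that it depends only on the state at the branch.

```c
#include <stdint.h>

/* Hypothetical saved register state; field names are illustrative. */
struct trap_regs {
    uint32_t pc;    /* assumed: address of the branch instruction */
    uint32_t sr;
    uint32_t r[16];
};

/* Hypothetical hook that emulates one non-branch instruction.
 * Returns 0 on success, nonzero if it cannot handle the opcode. */
typedef int (*slot_emulator)(struct trap_regs *regs, uint16_t insn);

/* Toy emulator for the example: only understands NOP (0x0009). */
static int emulate_nop_only(struct trap_regs *regs, uint16_t insn)
{
    (void)regs;
    return insn == 0x0009 ? 0 : -1;
}

/* Sign-extend the 12-bit displacement field of BRA. */
static int32_t bra_disp(uint16_t insn)
{
    int32_t d = insn & 0x0fff;
    return (d & 0x0800) ? d - 0x1000 : d;
}

/* Emulate "BRA disp" plus its delay slot. Returns 0 on success;
 * on failure regs->pc is left pointing at the branch. */
static int emulate_bra_with_slot(struct trap_regs *regs, uint16_t branch,
                                 uint16_t slot, slot_emulator emulate)
{
    if ((branch & 0xf000) != 0xa000)
        return -1;                  /* not BRA: outside this sketch */

    /* BRA target is branch address + 4 + disp * 2. */
    uint32_t target = regs->pc + 4 + (uint32_t)(bra_disp(branch) * 2);

    int err = emulate(regs, slot);  /* 1. the delay-slot instruction */
    if (err)
        return err;

    regs->pc = target;              /* 2. the branch itself */
    return 0;
}
```

With regs.pc = 0x1000, emulate_bra_with_slot(&regs, 0xa002, 0x0009,
emulate_nop_only) leaves regs.pc at 0x1008. None of this is possible
if the saved PC is the branch target, since the handler then has no
way to find either opcode.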

Note that in the special case where the branch is not taken, though,
there is no need for the emulator to be aware that a branch was even
present. Logically the effect is the same as if the untaken branch did
not execute its delay slot, but instead jumped to the instruction
address immediately after it (i.e. its delay slot). So where the
kernel comment reads:

/*
 * handle an instruction that does an unaligned memory access
 * - have to be careful of branch delay-slot instructions that fault
 *  SH3:
 *   - if the branch would be taken PC points to the branch
 *   - if the branch would not be taken, PC points to delay-slot
 *  SH4:
 *   - PC always points to delayed branch
 * - return 0 if handled, -EFAULT if failed (may not return if in kernel)
 */

The only thing it has to "be careful" about is that it can't assume
that, when the saved PC points to a branch, the branch will be taken
(this assumption would be safe for SH3 but not for SH4). Thus it
actually has to compute whether the branch would be taken.
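
Computing that is cheap. A minimal sketch in C, assuming the saved PC
points at the branch and the saved SR is available. The function names
are hypothetical; the opcode patterns are the SH BT/S (0x8Dxx) and BF/S
(0x8Fxx) encodings, and T is bit 0 of SR. Since the delay slot faulted
before it could run, T in the saved SR is still the value the branch
tested.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_T 0x1u   /* T bit: bit 0 of SR */

/* Would the delayed branch be taken? The caller is assumed to have
 * checked already that insn is a delayed branch; anything that is not
 * BT/S or BF/S is treated as unconditional. */
static bool delayed_branch_taken(uint16_t insn, uint32_t sr)
{
    switch (insn & 0xff00) {
    case 0x8d00: return (sr & SR_T) != 0;   /* BT/S: taken if T set */
    case 0x8f00: return (sr & SR_T) == 0;   /* BF/S: taken if T clear */
    default:     return true;
    }
}

/* Where to resume after emulating the delay slot of a BT/S or BF/S
 * located at branch_pc. */
static uint32_t cond_branch_resume_pc(uint32_t branch_pc, uint16_t insn,
                                      uint32_t sr)
{
    int32_t disp = insn & 0xff;     /* 8-bit signed displacement */
    if (disp & 0x80)
        disp -= 0x100;

    if (delayed_branch_taken(insn, sr))
        return branch_pc + 4 + (uint32_t)(disp * 2);

    return branch_pc + 4;           /* skip the branch and its slot */
}
```

For example, cond_branch_resume_pc(0x2000, 0x8d05, SR_T) gives 0x200e,
and the same call with T clear gives 0x2004.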

I'm not sure which behavior SH2 is supposed to have, but my guess
would be that it matches SH3.

In any case, both the current J2 behavior and the behavior with your
patch are clearly wrong. A kernel/trap-handler that's aware of the
unusual behavior after your patch could work around it, but there's
utterly no way to do anything useful with the current J2 behavior; the
address of the instructions the trap needs to process has been
completely lost.

Rich

