[J-core] working on SH-2 emulator.
BGB
cr88192 at gmail.com
Wed Sep 7 17:02:41 EDT 2016
On 9/7/2016 1:20 PM, Rich Felker wrote:
> On Tue, Sep 06, 2016 at 02:04:48AM -0500, Rob Landley wrote:
>> On 09/06/2016 12:53 AM, D. Jeff Dionne wrote:
>>> On Sep 6, 2016, at 2:46 PM, BGB <cr88192 at gmail.com> wrote:
>>>
>>> Oh, excellent. This means you have passed all the CPU tests, and dropped into
>>> a GDB stub. Did you implement and test the J2 CAS instruction, or
>>> patch it out?
>>
>> CAS and the two bit-shifts from sh3. (All mentioned on http://j-core.org
>> but I really should have better docs.)
>>
>> Alas, our arch/sh/configs/j2_defconfig does not appear to have
>> EARLY_PRINTK set, most likely because we haven't got an early printk
>> driver for our serial device. So you've got to get pretty far along in
>> the kernel boot before it dumps the printk buffer out to the serial device.
>>
>> (I remember Rich poking at that area before, but I don't remember how it
>> turned out. Possibly it's just not enabled in the config?)
> EARLY_PRINTK is deprecated and requires arch-specific hooks. The
> modern replacement is EARLYCON, and these Kconfig settings:
>
> CONFIG_CMDLINE="console=ttyUL0 earlycon"
> CONFIG_SERIAL_EARLYCON=y
> CONFIG_SERIAL_UARTLITE=y
> CONFIG_SERIAL_UARTLITE_CONSOLE=y
>
> plus an appropriate node in the device tree, like:
>
>     chosen {
>         stdout-path = "serial0";
>     };
>
> where "serial0" is an alias assigning a name for the uartlite node,
> should make it work.
>
> It looks like I somehow omitted CONFIG_SERIAL_EARLYCON=y from
> j2_defconfig; I'll make a note to add it.
yeah.
I still sort-of want a working Linux, but ran into a problem where
during building it says:
sh2elf-ld: target elf32-shbig-linux not found
granted, I may have been trying to use the wrong compiler for this, but
sh2eb doesn't build...
grep only finds one occurrence of this (in a ".S" file), but commenting
it out doesn't work.
trying to set to little endian (just to see if it builds):
sh2elf-ld: target elf32-sh-linux not found
this was well after the point where the main aboriginal build process
blows up, and I was trying to get the kernel built more manually.
on-off battling with this for several days now has left me a bit
frustrated...
trying to get the Linux kernel built is proving somewhat more
frustrating than it was to pull an SH-2 emulator out of thin air. I
think maybe this says something...
in other news, I am considering adding a subset of the FPU and MMU
facilities to my emulator.
for reference, I would want to be able to look at the SH-3 MMU and see
how it compares with the SH-4 MMU, but a spec for SH-3 is proving elusive.
I am not sure what the standing is for the 32-bit SH-DSP/SH-2A
instructions (e.g., whether these are "safe" to implement).
FPU would probably be a scalar only subset (no vector ops for now).
MMU would probably be a hack using an x86-like design. the TTB register
would be interpreted as holding a page-table, which would be interpreted
similarly to how x86 and ARM page-tables work.
some specifics would need to be TBD, as this wouldn't map up exactly
with how the SH4 does it.
in particular, the SH-4 TLB effectively uses 64-bit entries, whereas a
normal page-table would need 32-bit entries.
skimming through Linux source: looks like they do it with 32-bit entries
and a 2-level table.
PTEH: set with address from PTE and a faked ASID;
PTEL: just copies low-order bits from PTE.
if done this way, it is possible that this hacked MMU design could "just
work" if running Linux, or another OS which does basically the same thing.
the main alternative, though, is correctly implementing the SH-4 MMU, at
a likely performance cost.
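as a rough sketch of the page-table variant described above: the walk below treats TTB as the physical address of a first-level table and uses 32-bit entries in a 2-level layout. the 10/10/12 address split, the 4 KiB page size, the "present" bit, and all names (mmu_t, mmu_translate, read_phys32) are assumptions made for illustration; none of them come from the SH-4 manual or the Linux sources. PTEH is filled with the virtual page address plus the faked ASID, which is one reading of the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: x86-like 10/10/12 split, 4 KiB pages, bit 0 = present. */
#define PAGE_SHIFT   12u
#define PAGE_MASK    0xFFFu
#define PTE_PRESENT  0x1u

/* Hypothetical emulator state: guest RAM as a flat array of 32-bit words
 * starting at guest physical address 0. */
typedef struct {
    uint32_t *ram;        /* guest RAM */
    uint32_t  ram_words;  /* number of 32-bit words in ram */
    uint32_t  ttb;        /* physical address of the first-level table */
    uint32_t  pteh, ptel; /* faked TLB-load registers */
} mmu_t;

/* Read one aligned 32-bit word of guest physical memory; -1 if out of range. */
static int read_phys32(const mmu_t *m, uint32_t pa, uint32_t *out)
{
    if ((pa & 3u) || (pa >> 2) >= m->ram_words)
        return -1;
    *out = m->ram[pa >> 2];
    return 0;
}

/* Walk the 2-level table. Returns 0 and stores the physical address on
 * success, -1 if either level is missing (where a real emulator would
 * raise a TLB-miss exception). */
static int mmu_translate(mmu_t *m, uint32_t va, uint8_t asid, uint32_t *pa)
{
    uint32_t pde, pte;
    uint32_t l1 = va >> 22;                      /* top 10 bits */
    uint32_t l2 = (va >> PAGE_SHIFT) & 0x3FFu;   /* next 10 bits */

    assert(m && pa);
    if (read_phys32(m, (m->ttb & ~PAGE_MASK) + l1 * 4u, &pde) ||
        !(pde & PTE_PRESENT))
        return -1;
    if (read_phys32(m, (pde & ~PAGE_MASK) + l2 * 4u, &pte) ||
        !(pte & PTE_PRESENT))
        return -1;

    m->pteh = (va & ~PAGE_MASK) | asid;  /* page address + faked ASID */
    m->ptel = pte;                       /* PTE copied as-is; no bit masking */
    *pa = (pte & ~PAGE_MASK) | (va & PAGE_MASK);
    return 0;
}
```

a correct SH-4 implementation would instead search a software-loaded TLB and leave the table walk to the guest's miss handler; the sketch only covers the "hacked" variant.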
some details would need to be decided WRT the handling of the
trace-cache and SMC detection bitmaps in relation to the MMU. done
naively, swapping page-tables or other things would also imply flushing
the trace-cache and zeroing the SMC bitmap.
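the naive scheme can be sketched roughly as follows, assuming a page-granular bitmap with one bit per guest page that holds translated code. the page size, the RAM size, and all names here (smc_t, smc_on_write, smc_on_ttb_write) are hypothetical; the flush counter stands in for actually discarding traces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12u     /* assumed 4 KiB pages */
#define NUM_PAGES  4096u   /* assumed 16 MiB of guest RAM */

typedef struct {
    uint8_t  code_pages[NUM_PAGES / 8]; /* 1 bit per page with translated code */
    unsigned flushes;                   /* number of trace-cache flushes so far */
} smc_t;

static uint32_t smc_page(uint32_t pa)
{
    return (pa >> PAGE_SHIFT) % NUM_PAGES;  /* wrap to stay inside the bitmap */
}

/* Called when the translator builds a trace from code at pa. */
static void smc_mark_code(smc_t *s, uint32_t pa)
{
    uint32_t pg = smc_page(pa);
    assert(s);
    s->code_pages[pg >> 3] |= (uint8_t)(1u << (pg & 7u));
}

static int smc_is_code(const smc_t *s, uint32_t pa)
{
    uint32_t pg = smc_page(pa);
    return (s->code_pages[pg >> 3] >> (pg & 7u)) & 1;
}

/* Naive flush: zero the bitmap; a real emulator would also drop all traces. */
static void smc_flush_all(smc_t *s)
{
    memset(s->code_pages, 0, sizeof s->code_pages);
    s->flushes++;
}

/* Called on every guest store: a write into a code page invalidates traces. */
static void smc_on_write(smc_t *s, uint32_t pa)
{
    if (smc_is_code(s, pa))
        smc_flush_all(s);
}

/* Called when the guest changes TTB: the naive version flushes everything. */
static void smc_on_ttb_write(smc_t *s)
{
    smc_flush_all(s);
}
```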
possible tweak could be (quietly) keying TTB with a separate internal
context holding TLB and trace-cache state, so that changing address
spaces doesn't necessarily flush the caches (but, instead swaps to a
different set of caches).
however, this would be rendered ineffective (and detrimental to
performance) if the number of active address spaces (non-sleeping
processes) exceeds the number of caches (if done, would need to be
fairly small to avoid potentially excessive memory use).
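a minimal sketch of the keyed-context idea, assuming a small fixed pool of contexts with least-recently-used replacement. the pool size of 4, the LRU policy, and the names (ctx_pool_t, ctx_switch) are all hypothetical; the generation counter stands in for the per-context TLB and trace-cache that would be flushed when a slot is reused.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CTX 4u   /* assumed: kept small to bound memory use */

typedef struct {
    uint32_t ttb;        /* key: the TTB value this context belongs to */
    int      valid;
    uint64_t last_use;   /* tick of the most recent switch to this context */
    unsigned generation; /* bumped whenever the slot's caches are discarded */
} mmu_ctx_t;

typedef struct {
    mmu_ctx_t ctx[NUM_CTX];
    uint64_t  tick;
    unsigned  evictions;  /* how often a live context had to be thrown away */
} ctx_pool_t;

/* Called when the guest writes TTB. Returns the context to use from now on:
 * an existing one if this TTB was seen recently (caches stay warm),
 * otherwise the least recently used slot, recycled with empty caches. */
static mmu_ctx_t *ctx_switch(ctx_pool_t *p, uint32_t ttb)
{
    mmu_ctx_t *victim;
    unsigned i;

    assert(p);
    victim = &p->ctx[0];
    p->tick++;
    for (i = 0; i < NUM_CTX; i++) {
        mmu_ctx_t *c = &p->ctx[i];
        if (c->valid && c->ttb == ttb) {
            c->last_use = p->tick;   /* hit: nothing is flushed */
            return c;
        }
        if (c->last_use < victim->last_use)
            victim = c;              /* unused slots have last_use == 0 */
    }
    if (victim->valid)
        p->evictions++;
    victim->ttb = ttb;
    victim->valid = 1;
    victim->last_use = p->tick;
    victim->generation++;            /* stands in for flushing its caches */
    return victim;
}
```

with more active address spaces than slots, every switch becomes a miss plus an eviction, which is the degenerate case described above.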
I am thinking I may also add a MOV.L constant-load optimization for the
case where the source address and the instruction fall within the same
page. the SMC handling mechanism is page-granular, so writing into a page
in which code is executing triggers SMC handling; the invalidation
effectively comes "for free", in contrast to the case where the constant
and instruction would fall into different pages.
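the same-page test could look roughly like this. MOV.L @(disp,PC),Rn is encoded as 1101nnnndddddddd and loads from ((PC + 4) & ~3) + disp*4, where PC is the address of the instruction itself. the 4 KiB page size and the function names are assumptions, and the sketch ignores the case where the instruction sits in a delay slot.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u   /* assumed 4 KiB pages, matching the SMC granularity */

/* MOV.L @(disp,PC),Rn: top nibble 0xD, then Rn, then an 8-bit displacement. */
static int is_movl_pcrel(uint16_t op)
{
    return (op & 0xF000u) == 0xD000u;
}

/* Address the constant is loaded from (delay-slot placement not handled). */
static uint32_t movl_pcrel_addr(uint32_t insn_addr, uint16_t op)
{
    uint32_t disp = op & 0xFFu;
    assert((insn_addr & 1u) == 0);   /* SH instructions are 16-bit aligned */
    return ((insn_addr + 4u) & ~3u) + disp * 4u;
}

/* Nonzero if the load may be folded into an immediate at translation time:
 * the constant is in the same page as the instruction, so any later write
 * to it already invalidates the trace via the page-granular SMC check. */
static int movl_can_fold(uint32_t insn_addr, uint16_t op)
{
    if (!is_movl_pcrel(op))
        return 0;
    return (movl_pcrel_addr(insn_addr, op) >> PAGE_SHIFT) ==
           (insn_addr >> PAGE_SHIFT);
}
```

since the constant is a 4-byte aligned longword, it cannot straddle a page boundary, so checking its start address is enough.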
or such...