Pinocchio is a cancelled 1996 game for the 32X, a hardware add-on for the Sega Mega Drive / Genesis, adding two Hitachi SH-2 32-bit RISC processors to the mix.

A cheat menu had been found in a prototype’s ROM, but without a way to activate it. We will see that with the right tooling, we can figure it out just by following references.

System architecture

We need to consider that the Mega Drive’s Motorola 68000 CPU is also being used. The workload is distributed across CPUs depending on how games implement their pipelines.

Each CPU architecture has its own memory map, with the Mega Drive’s being partially overwritten to allow accessing and controlling the 32X. The best source to understand these is the official documentation, in particular “32X Hardware Manual” (MAR-32-R8-010995), Section 3.1 “Mapping”:

Tooling

  • Ghidra loader: A fork of Kononovich’s Mega Drive loader with additional 32X segments;
    • There was only a single segment for the Mega Drive’s addressed range 0x00800000..0x00a00000, while we need the SH-2 addressed ranges to follow SH-2 code:
      • ROM: 0x02000000..0x02400000;
      • DRAM / frame buffer: 0x04000000..0x04040000;
      • SDRAM / work RAM: 0x06000000..0x06040000;
    • With these additional segments, Ghidra’s auto analysis will be able to find and disassemble functions, due to references falling inside defined memory maps, which would not be the case if we simply imported the ROM at offset 0;
  • Emulator: I’ve picked ares, which not only has a configurable instruction trace log, but also a memory view, making it easier to see live changes without having to patch printfs in e.g. PicoDrive;
    • The log appears to only report part of the hit instructions, similar to what I previously encountered with PicoDrive, but I ended up not modifying this implementation, since it turned out to be sufficient to identify basic blocks of interest;
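
To make the segment layout concrete, here is a minimal Python sketch that classifies an SH-2 address and converts a ROM file offset into its SH-2 address. The ranges are copied from the list above; treating the end address as exclusive, as well as the helper names `sh2_segment` and `rom_offset_to_sh2`, are assumptions made for illustration and are not part of the loader.

```python
# Segment boundaries copied from the list above.
# Assumption: the end address of each range is exclusive.
SH2_SEGMENTS = [
    ("ROM", 0x02000000, 0x02400000),
    ("DRAM / frame buffer", 0x04000000, 0x04040000),
    ("SDRAM / work RAM", 0x06000000, 0x06040000),
]


def sh2_segment(address):
    """Return (segment name, offset inside it), or None if unmapped."""
    for name, start, end in SH2_SEGMENTS:
        if start <= address < end:
            return name, address - start
    return None


def rom_offset_to_sh2(file_offset):
    """Translate a ROM file offset into the SH-2 view of the cartridge."""
    _, start, end = SH2_SEGMENTS[0]
    if not 0 <= file_offset < end - start:
        raise ValueError("offset outside the ROM segment")
    return start + file_offset


if __name__ == "__main__":
    print(sh2_segment(0x0208C598))          # an address inside the ROM segment
    print(sh2_segment(0x060006F8))          # an address inside SDRAM
    print(hex(rom_offset_to_sh2(0x8C598)))  # 0x208c598
```

An address imported at offset 0 would fall outside all three ranges, which is why references would not resolve without the extra segments.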

For disassembling, it seems the common approach is to have distinct databases, one for each CPU architecture. In our case, an SH-2 disassembly was enough.

Analysis

Right from the beginning, we get a promising result by searching for the string “Cheat menu”, which appears to be part of a table (named cheat_menu_labels) referenced by function 0x0208c598:

0208c598 d0 08         mov.l  @(PTR_DAT_0208c5bc,pc),r0 = 06000834
0208c59a 22 2a         xor    r2,r2
0208c59c 20 22         mov.l  r2,@r0=>DAT_06000834 = ??
0208c59e d3 06         mov.l  @(PTR_DAT_0208c5b8,pc),r3 = cheat_menu_labels
0208c5a0 2f a6         mov.l  r10,@-r15=>local_4
0208c5a2 4f 22         sts.l  pr,@-r15=>local_8
0208c5a4 da 03         mov.l  @(->FUN_0208c686,pc),r10 = 0208c686
0208c5a6 4a 0b         jsr    @r10=>FUN_0208c686
0208c5a8 00 09         _nop
0208c5aa 4f 26         lds.l  @r15=>local_8+,pr
0208c5ac 6a f6         mov.l  @r15=>local_4+,r10
0208c5ae 00 0b         rts
0208c5b0 00 09         _nop
0208c5b2 00            ??     00h
0208c5b3 00            ??     00h
                     PTR_FUN_0208c5b4
                     XREF[1]: cheat_menu:0208c5a4(R)
0208c5b4 02 08 c6 86   addr   FUN_0208c686
                     PTR_DAT_0208c5b8
                     XREF[1]: cheat_menu:0208c59e(R)
0208c5b8 02 08 c5 c0   addr   DAT_0208c5c0
                     PTR_DAT_0208c5bc
                     XREF[1]: cheat_menu:0208c598(R)
0208c5bc 06 00 08 34   addr   DAT_06000834
                     cheat_menu_labels
                     XREF[1]: 0208c5b8(*)
0208c5c0 ff            ??     FFh
0208c5c1 30            ??     30h    0
0208c5c2 20 20 20      ds     "      Pinocchio"
         20 20 20
         50 69 6e
0208c5d1 ff            ??     FFh
0208c5d2 07            ??     07h
0208c5d3 20 20 2d      ds     "  -  "
         20 20
0208c5d8 ff            ??     FFh
0208c5d9 28            ??     28h    (
0208c5da 43 68 65      ds     "Cheat menu\n\n\n\n"
         61 74 20
         6d 65 6e

Having referenced addresses and data after each function body is common in SH-2, similar to what you would find in ARM as well.

The function is called by 0x0208c4c6 (named cheat_menu_pre1), which in turn is called by 0x0208b564 (named cheat_menu_pre2), which has the following conditional logic:

if (*(short *)PTR_DAT_0208b6fc != 0) {
  *(undefined2 *)PTR_DAT_0208c140 = 0;
  // ...
  (*(code *)PTR_cheat_menu_pre1_0208c12c)();
  // ...
}

The decompilation doesn’t show which addresses these pointers resolve to, but it’s clearer in the disassembly:

0208b6ee d1 03         mov.l  @(PTR_DAT_0208b6fc,pc),r1 = 060006f8
0208b6f0 60 11         mov.w  @r1=>DAT_060006f8,r0 = ??
0208b6f2 20 08         tst    r0,r0
0208b6f4 8b 18         bf     LAB_0208b728
0208b6f6 00 0b         rts
0208b6f8 00 09         _nop
; ...
0208b728 a4 40         bra    LAB_0208bfac
0208b72a 00 09         _nop
0208b72c a0 95         bra    LAB_0208b85a
0208b72e 00 09         _nop
; ...
0208bfac d3 64         mov.l  @(PTR_DAT_0208c140,pc),r3 = 060007a8
0208bfae 20 0a         xor    r0,r0
; ...
0208bfec da 4f         mov.l  @(->cheat_menu_pre1,pc),r10 = 0208c4c6
0208bfee 4a 0b         jsr    @r10=>cheat_menu_pre1

At some point, the SDRAM address 0x060006f8 is assigned and later on checked to conditionally call what we assume is the cheat menu handler.

But first, let’s see if any of this code is reached in normal gameplay. We can start by taking the typical instruction trace log with ares. We perform actions such as opening the options screen from the main menu screen, changing some options, then starting the first level, pausing the game, moving around…

To start tracing at the beginning of execution, we check “Settings > Boot Options > Launch Debugger”. Then, we configure the tracer to include both SH-2 CPUs:

The emulator output is filtered into file “f”, which is then read by the trace logger script from a previous writeup:

./ares/desktop-ui/out/ares | awk '/SHM|SHS/{print $2}' > f

Resulting in ./trace.out, which indeed includes hit addresses right up to the condition we saw above! One of these addresses was the branch at 0x0208b72c, right next to the other branch we are interested in. Hmm, how about a simple patch to the target address of that hit branch:

 0208b728 a4 40        bra    LAB_0208bfac
 0208b72a 00 09        _nop
-0208b72c a0 95        bra    LAB_0208b85a
+0208b72c a4 3e        bra    LAB_0208bfac
 0208b72e 00 09        _nop
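
The patched bytes follow from how the SH-2 encodes “bra”: the opcode is 0xA000 plus a signed 12-bit displacement, and the target is the instruction address plus 4 plus twice that displacement. The following Python sketch recomputes the encodings seen above; the function names are made up for illustration.

```python
def encode_bra(pc, target):
    """Encode an SH-2 'bra' located at pc that jumps to target."""
    delta = target - (pc + 4)
    if delta % 2:
        raise ValueError("branch target must be 2-byte aligned")
    disp = delta // 2
    if not -2048 <= disp <= 2047:
        raise ValueError("target out of range for a 12-bit displacement")
    return 0xA000 | (disp & 0xFFF)


def decode_bra(pc, opcode):
    """Return the target of the SH-2 'bra' opcode located at pc."""
    if opcode >> 12 != 0xA:
        raise ValueError("not a bra instruction")
    disp = opcode & 0xFFF
    if disp & 0x800:  # sign-extend the 12-bit field
        disp -= 0x1000
    return pc + 4 + disp * 2


if __name__ == "__main__":
    # Original branch at 0x0208b72c.
    print(hex(decode_bra(0x0208B72C, 0xA095)))      # 0x208b85a
    # Patched branch: same address, new target.
    print(hex(encode_bra(0x0208B72C, 0x0208BFAC)))  # 0xa43e
```

The neighbouring branch at 0x0208b728 targets the same label with displacement 0x440; the patched one sits two instructions later, hence 0x43e.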

Now we run the game, up to the main menu screen:

Then we open that options screen, and just like that…

It works! But wait, isn’t there some intended way to activate it? Well, let’s undo that patch, then follow the only write cross-reference for address 0x060006f8, bringing us to the following function, which is sequentially checking 9 entries at 0x02082a38 against values at 0x06000772:

                     XREF[2]: FUN_02082820:020828b8(c),
                              02082974(*)
02082a10 d1 08         mov.l   @(PTR_DAT_02082a34,pc),r1 = 06000772
02082a12 d2 07         mov.l   @(PTR_WORD_02082a30,pc),r2 = 02082a38
02082a14 e4 09         mov     #0x9,r4
                     LAB_02082a16
                     XREF[1]: 02082a20(j)
02082a16 60 15         mov.w   @r1=>DAT_06000772+,r0 = ??
02082a18 65 25         mov.w   @r2=>button_codes+,r5 = 40h
02082a1a 30 50         cmp/eq  r5,r0
02082a1c 8b 04         bf      LAB_02082a28
02082a1e 44 10         dt      r4
02082a20 8b f9         bf      LAB_02082a16
02082a22 d1 02         mov.l   @(PTR_DAT_02082a2c,pc),r1 = 060006f8
02082a24 e0 ff         mov     #-0x1,r0
02082a26 21 01         mov.w   r0,@r1=>DAT_060006f8 = ??
                     LAB_02082a28
                     XREF[1]: 02082a1c(j)
02082a28 00 0b         rts
02082a2a 00 09         _nop
                     PTR_DAT_02082a2c
                     XREF[1]: cheat_check_input:02082a22(R)
02082a2c 06 00 06 f8   addr    DAT_060006f8
                     PTR_WORD_02082a30
                     XREF[1]: cheat_check_input:02082a12(R)
02082a30 02 08 2a 38   addr    button_codes
                     PTR_DAT_02082a34
                     XREF[1]: cheat_check_input:02082a10(R)
02082a34 06 00 07 72   addr    DAT_06000772
                     button_codes
                     XREF[2]: cheat_check_input:02082a18(R),
                              02082a30(*)
02082a38 00 20         dw      20h
02082a3a 00 40         dw      40h
02082a3c 00 10         dw      10h
02082a3e 00 20         dw      20h
02082a40 00 40         dw      40h
02082a42 00 10         dw      10h
02082a44 00 20         dw      20h
02082a46 00 40         dw      40h
02082a48 00 10         dw      10h
02082a4a 00 00         dw      0h

If all 9 values are matched, then 0x060006f8 = 0xffff, which happens to be different from zero, which is exactly what was being checked in cheat_menu_pre2:

0208b6ee d1 03         mov.l   @(PTR_DAT_0208b6fc,pc),r1 = 060006f8
0208b6f0 60 11         mov.w   @r1=>DAT_060006f8,r0 = ??
0208b6f2 20 08         tst     r0,r0
0208b6f4 8b 18         bf      LAB_0208b728 ; take branch to cheat menu handler

We can quickly confirm that these are button codes input during the pause screen:

We know that 0x06000000 is the base address for the 32X SDRAM, so let’s open the memory view for this region:

After pressing button “C”, we get 0x06000772 = 0x20:

After pressing button “A”, we get 0x06000772+2 = 0x40:

The full sequence is “C, A, B, C, A, B, C, A, B”. After the remaining entries are matched, we get 0x060006f8 = 0xffff:
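
As a sanity check, the comparison loop can be mirrored in a few lines of Python. This is only a sketch of the logic read from the disassembly: the nine expected words come from the button_codes table, 0x20 and 0x40 are the values observed for “C” and “A”, and 0x10 being “B” is inferred from the sequence. The function and variable names are illustrative.

```python
# Nine expected words at button_codes (0x02082a38), copied from the listing.
BUTTON_CODES = [0x20, 0x40, 0x10] * 3

# 0x20 = C and 0x40 = A were observed in the memory view;
# 0x10 = B is inferred from the full sequence.
BUTTON_NAMES = {0x20: "C", 0x40: "A", 0x10: "B"}


def cheat_check_input(input_words, flag=0):
    """Mirror of the SH-2 loop: return the new value of the flag word.

    input_words stands for the nine words starting at 0x06000772,
    flag for the word at 0x060006f8.
    """
    if len(input_words) < len(BUTTON_CODES):
        raise ValueError("expected at least nine input words")
    for pressed, expected in zip(input_words, BUTTON_CODES):
        if pressed != expected:
            return flag  # early 'rts': the flag is left untouched
    return 0xFFFF  # 'mov #-1,r0' stored as a 16-bit word


if __name__ == "__main__":
    print(", ".join(BUTTON_NAMES[code] for code in BUTTON_CODES))
    print(hex(cheat_check_input(BUTTON_CODES)))  # 0xffff
    print(hex(cheat_check_input([0x20] * 9)))    # 0x0
```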

Finally, we press any of the “A, B, C” buttons, and the game is reset, bringing us back to the main menu screen:

But when we press “Start”, we now get the cheat menu!

So there you have it, we didn’t even need to dive into any messy M68K/SH-2 synchronization…

Well, alright, let’s take a peek

You might be wondering about that choice of words over at 0x060006d8. To explain that, we’ll also need an M68K disassembly, so that we can check both sides of the CPU synchronization. We will also take ./trace.3cpus.logo.out, which has instructions executed by all 3 CPUs up until the SEGA logo is rendered on-screen. This log helps us see when some entry points are called, and also when some synchronization busy waits are exited.

Just like in regular Mega Drive games, we start with the Reset interrupt vector, which in this case is located at 0x3f0. However, we’ll find some reads and writes being done over the “32X SYS REG” address range (0xa15100..0xa15180), in particular, the communication registers (0xa15120..0xa15130). For example, here we are reading the “M_OK” and “S_OK” bytes sent from SH-2 CPUs:

                     LAB_00000d54                                    XREF[1]:     00000d5c(j)
00000d54 0c ad 4d        cmpi.l     #0x4d5f4f4b,(offset DAT_00a15120,A5) ; "M_OK"
         5f 4f 4b
         00 20
00000d5c 66 00 ff f6     bne.w      LAB_00000d54
                     LAB_00000d60                                    XREF[1]:     00000d68(j)
00000d60 0c ad 53        cmpi.l     #0x535f4f4b,(offset DAT_00a15124,A5) ; "S_OK"
         5f 4f 4b
         00 24
00000d68 66 00 ff f6     bne.w      LAB_00000d60
00000d6c 23 fc 49        move.l     #0x494e4954,(LAB_00ff0000).l ; "INIT"
         4e 49 54
         00 ff 00 00

Those first 2 words are written by the SH-2 boot ROMs, as illustrated in the following diagrams from the “32X Hardware Manual”, Section 5.1 “Boot ROM”:

Afterwards, we see more busy waits:

00000d8e 20 3c 46        move.l     #0x4655434b,D0 ; "FUCK"
         55 43 4b
00000d94 2b 40 00 20     move.l     D0,(offset DAT_00a15120,A5)
00000d98 2b 40 00 24     move.l     D0,(offset DAT_00a15124,A5)
00000d9c 2b 7c 12        move.l     #0x12345678,(offset DAT_00a15128,A5)
         34 56 78
         00 28
                     LAB_00000da4
                     XREF[2]:     00000dac(j), 00000db6(j)
00000da4 0c ad 46        cmpi.l     #0x46495348,(offset DAT_00a15120,A5) ; "FISH"
         49 53 48
         00 20
00000dac 66 f6           bne.b      LAB_00000da4
00000dae 0c ad 46        cmpi.l     #0x46495348,(offset DAT_00a15124,A5) ; "FISH"
         49 53 48
         00 24
00000db6 66 ec           bne.b      LAB_00000da4
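
The 32-bit immediates in these busy waits are simply four ASCII characters packed in big-endian order, which is how the comments in the listings were obtained. A small Python sketch (the helper name `fourcc` is made up) decodes the constants seen so far:

```python
def fourcc(value):
    """Decode a 32-bit big-endian constant into its four ASCII characters."""
    return value.to_bytes(4, "big").decode("ascii")


if __name__ == "__main__":
    # Constants copied from the M68K listings.
    for constant in (0x4D5F4F4B, 0x535F4F4B, 0x494E4954, 0x4655434B, 0x46495348):
        print(f"{constant:#010x} -> {fourcc(constant)!r}")
```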

At this point, it would be convenient to also look at the SH-2 entrypoints. So… where are they defined? I just looked at the instruction trace, which for each SH-2 CPU started at boot ROM addresses, which then jumped to a… SDRAM address?

SHM  00000298  mov.l   @r8+,r1
...
SHM  06000454  mov.l   @(0x064,pc),r1

Likely some code is copied from the ROM, but how is this mapping done? We can load the SH-2 Master Boot ROM as an overlay in Ghidra, and we see it reading address 0x06000454 from 0x220003e0:

000002a8 68 d2        mov.l   @r13=>PTR_220003e0,r8 = 06000454
000002aa d0 0a        mov.l   @(DAT_SHM_BOOT_ROM__000002d4,pc),r0 = 4D5F4F4Bh
000002ac c2 08        mov.l   r0,@(0x20,gbr=>DAT_20004020)
000002ae 48 2b        jmp     @r8=>DAT_06000454

This maps to file offset 0x3e0, and is preceded by the string “MARS CHECK MODE”, which we can find in the documentation as part of the “MARS User Header”, also described in Section 5.1 “Boot ROM”, which in our case corresponds to these entries:

000003c0 4d 41 52     ds      "MARS CHECK MODE " ; module name
         53 20 43
         48 45 43
000003d0 00 00 00 00  ddw     0h           ; version
000003d4 00 3f f9 14  addr    DAT_003ff914 ; source address
000003d8 00 00 00 00  addr    00000000     ; destination address
000003dc 00 00 06 ec  ddw     6ECh         ; size
000003e0 06 00 04 54  addr    DAT_06000454 ; SH-2 Master start address
000003e4 06 00 01 20  addr    DAT_06000120 ; SH-2 Slave start address
000003e8 06 00 03 34  addr    DAT_06000334 ; SH-2 Master vector base address
000003ec 06 00 00 00  addr    DAT_06000000 ; SH-2 Slave vector base address

But what bytes are copied to 0x06000454? The answer is right in front of us, but I only got it after taking some bytes at 0x06000454 from the emulator’s memory view and searching for them in the ROM image. These bytes can be found at file offset 0x3ffd68, so we get 0x3ffd68 - 0x454 = 0x3ff914, which is the “source address” we see above. We just need to add the SH-2 ROM base address to get the effective address 0x02000000 + 0x3ff914 = 0x023ff914.
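
The same arithmetic can be redone directly from the header fields. The Python sketch below unpacks the 48 bytes shown above with the standard struct module and derives where the SH-2 Master entrypoint lives in the ROM. It assumes that the destination field is an offset from the SDRAM base 0x06000000, which is consistent with the bytes found in the emulator; the variable names are illustrative.

```python
import struct

# The 48 header bytes at file offset 0x3c0, copied from the listing.
HEADER = b"MARS CHECK MODE " + bytes.fromhex(
    "00000000"  # version
    "003ff914"  # source address (ROM offset)
    "00000000"  # destination address
    "000006ec"  # size
    "06000454"  # SH-2 Master start address
    "06000120"  # SH-2 Slave start address
    "06000334"  # SH-2 Master vector base address
    "06000000"  # SH-2 Slave vector base address
)

SH2_ROM_BASE = 0x02000000
SH2_SDRAM_BASE = 0x06000000  # assumption: destination is relative to this

(name, version, source, destination, size,
 master_start, slave_start, master_vbr, slave_vbr) = struct.unpack(">16s8I", HEADER)

# Offset of the Master entrypoint inside the copied block.
offset_in_block = master_start - (SH2_SDRAM_BASE + destination)
assert 0 <= offset_in_block < size

master_rom_offset = source + offset_in_block
master_rom_address = SH2_ROM_BASE + master_rom_offset

if __name__ == "__main__":
    print(name.decode("ascii"))
    print(hex(master_rom_offset))   # 0x3ffd68
    print(hex(master_rom_address))  # 0x23ffd68
```

With these values, source plus size lands at 0x400000, so the copied block occupies the last 0x6ec bytes of a 4 MiB image.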

Great, now we can follow the SH-2 Master entrypoint at 0x023ffd68, which eventually arrives at 0x06000690, where we see it reading from 0x20004020, the address of the communication registers on the SH-2 memory map, which corresponds to the Mega Drive’s 0xa15120. Indeed, it’s checking for the Mega Drive’s write at 0x00000d94 from the Reset interrupt vector:

060006a0 d1 0e        mov.l      @(060006dc,pc),r1 = "FUCK"
                  check_if_m68k_gave_a_fuck
                  XREF[1]:     060006a6(j)
060006a2 c6 08        mov.l      @(0x20,gbr),r0=>DAT_20004020
060006a4 30 10        cmp/eq     r1,r0
060006a6 8b fc        bf         check_if_m68k_gave_a_fuck ; branch back if not matched
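
Since both sides poll the same registers through different addresses, a tiny translation helper makes cross-referencing the two disassemblies easier. This Python sketch relies on the pairing stated above (0xa15120 on the Mega Drive side is 0x20004020 on the SH-2 side) and on the “32X SYS REG” range 0xa15100..0xa15180; the function names are illustrative.

```python
M68K_SYSREG_BASE = 0x00A15100
M68K_SYSREG_END = 0x00A15180   # exclusive
SH2_SYSREG_BASE = 0x20004000   # derived from 0x20004020 <-> 0xa15120
SYSREG_SIZE = M68K_SYSREG_END - M68K_SYSREG_BASE


def m68k_to_sh2(address):
    """Map a Mega Drive 32X SYS REG address to the SH-2 view."""
    offset = address - M68K_SYSREG_BASE
    if not 0 <= offset < SYSREG_SIZE:
        raise ValueError("not a 32X SYS REG address on the M68K side")
    return SH2_SYSREG_BASE + offset


def sh2_to_m68k(address):
    """Map an SH-2 32X SYS REG address to the Mega Drive view."""
    offset = address - SH2_SYSREG_BASE
    if not 0 <= offset < SYSREG_SIZE:
        raise ValueError("not a 32X SYS REG address on the SH-2 side")
    return M68K_SYSREG_BASE + offset


if __name__ == "__main__":
    print(hex(m68k_to_sh2(0x00A15120)))  # 0x20004020
    print(hex(sh2_to_m68k(0x20004024)))  # 0xa15124
```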

Finally, we can see that the words at 0x060006d8 happen to be located sequentially in memory:

                     XREF[1]:     FUN_SH2_SDRAM__06000690:060006a8
060006d8 46 49 53 48     ds         "FISH"
                     XREF[1]:     FUN_SH2_SDRAM__06000690:060006a0
060006dc 46 55 43 4b     ds         "FUCK"

Mystery solved. Next time you need a pair of words, don’t settle for less with some boring “ping” and “pong”.