SimCity 2000 was ported to the PS1 in 1996. The file “DATA\THREED\3D_FACES.DAT” contains some pictures of children, which could be an undiscovered easter egg.

For a game that has been pretty much covered in terms of cheats, I wonder how stuff like this can go unnoticed. Would it have radically different activation steps? Would it just be unreferenced in code? Well, only one way to find out…

Following the references

With a quick grep on the US release, we find “3D_FACES.DAT” in another file, which is loaded as an overlay by the main executable “slus_001.13”:

void FUN_800a230c(void) {
  load_group("cdrom:\\DATA\\OVERLAYS\\GROUP4R.BIN;1",&DAT_8010fd68);
  return;
}

If we follow the call chain of FUN_800a230c, we arrive at FUN_8004b5fc (named load_group4_pre2), which is passed as a function pointer, similar to several other callbacks:

void UndefinedFunction_800888e8(void) {
    // ...
    new_cb_obj(&DAT_800e95f8,0x19,0x87,0,0,0,8,0x13,0,0);
    new_cb_obj(&DAT_800e9630,0x9c,0x82,0x30e,0x30e,0,0,0xe,load_group4_pre2,0);
    new_cb_obj(&DAT_800e9668,0x71,0x61,0x1a6,0x9a,0,0,10,&LAB_800a1dc0,0);
    // ...
}

In Mednafen’s debugger, we can put a breakpoint at 0x8004b5fc, but it’s not hit after starting a new game and interacting with a few menus.

It doesn’t hurt to shotgun our way through the call chain. We step into whatever is the next instruction in game code (the main executable’s “.text” section covers 0x8001a71c..0x800e6633), then continue until we return from whatever function we are in, repeating until we are close to the main game loop. Then, on the next function call, we set the program counter to the address of our function of interest. Most of the time the game crashes, as expected, but in some cases the overlay does get loaded, and the background changes:

After looking at a longplay video, we see that this background is rendered when we switch to the 3D driver mode, where we go around the city’s streets in a car.


We can add “GROUP4R.BIN” in Ghidra at 0x8010fd68, then find “3D_FACES.DAT” loaded after a set of .DAT files:

void FUN_80119958(void) {
  iVar4 = load_data("cdrom:/DATA/THREED/3D_CHP1.DAT;1");
  // ...
  iVar4 = load_data("cdrom:/DATA/THREED/3D_CHP2.DAT;1");
  // ...
  iVar4 = load_data("cdrom:/DATA/THREED/3D_FACES.DAT;1");
  // ...
}

I used psx-asset-viewer, an online tool that parses TIM format textures, which happen to be contained in these .DAT files. “3D_CHP*” contains textures for a chopper, which are used in one of the known cheats, where the aerial view is rendered with a chopper cockpit. So, is this cheat related to the children’s pictures? Well, if we check Mednafen’s VRAM viewer, all the textures are loaded, which is a good first sign:

But how do they end up being displayed? Let’s start by following references for the chopper textures, which will likely be under some conditional logic that checks if the corresponding cheat was activated.

Most of the textures in these .DAT files share a similar initialization, which includes a common call (named load_gfx_bind3):

  DAT_80120cac = 0xf9;
  DAT_80120cae = 0x8e;
  DAT_80120cb0 = 0x40;
  DAT_80120cb1 = 0x80;
  DAT_80120ca8 = 0x80;
  DAT_80120ca9 = 0x80;
  DAT_80120caa = 0x80;
  DAT_80120cb2 = 0x7bc0;
  DAT_80120cb4 = 0x40;
  DAT_80120cb6 = 0x40;
  DAT_80120ea4 = 0x40;
  DAT_80120ea6 = 0x80;
  DAT_80120ea8 = 0x40;
  DAT_80120eaa = 0x40;
  iVar4 = get_800ef708();
  if (iVar4 == 1) {
LAB_8011a6fc:
    bVar3 = true;
  }
  else {
    iVar4 = get_800ef708();
    bVar3 = false;
    if (iVar4 == 2) goto LAB_8011a6fc;
  }
  uVar5 = 0xab;
  if (bVar3) {
    uVar5 = 0x28b;
  }
  load_gfx_bind3(&DAT_80120de4,0,0,uVar5,&DAT_80120ea4);

If we check the references for 0x80120de4, there are 2 additional ones in FUN_8011b5ec (named render_chopper).

Some of the load_gfx_bind3() calls received arguments that didn’t have any additional references. It turns out that those references are indirect, computed with 0x80120de4 as a base:

iVar2 = DAT_8011f71c; // texture index
DAT_8011f714 = DAT_8011f714 + DAT_8011f718;
iVar5 = DAT_8011f71c * 0x14;
*(undefined2 *)(&DAT_80120cae + iVar5) = (undefined2)DAT_8011f714;
puVar6 = (uint *)(&DAT_80120ca4 + iVar5);
*puVar6 = *puVar6 & 0xff000000 | DAT_8011f760 & 0xffffff;
puVar3 = (uint *)(&DAT_80120de4 + iVar2 * 0xc); // indirect reference
DAT_8011f760 = DAT_8011f760 & 0xff000000;
*puVar3 = *puVar3 & 0xff000000 | (uint)puVar6 & 0xffffff;
DAT_8011f760 = DAT_8011f760 & 0xff000000 | (uint)puVar3 & 0xffffff;
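
To make the indexing concrete, here is a minimal Python sketch of the address arithmetic in the snippet above. The base addresses and the 0x14 / 0xc strides come straight from the decompilation; the function name and the dictionary keys are hypothetical labels of mine, not names from the game.

```python
# Base addresses and strides taken from the decompiled snippet above.
PRIM_BASE = 0x80120CA4   # &DAT_80120ca4, indexed with a 0x14-byte stride
PRIM_FIELD = 0x80120CAE  # &DAT_80120cae, same stride
BIND_BASE = 0x80120DE4   # &DAT_80120de4, indexed with a 0xc-byte stride


def indirect_addresses(texture_index):
    """Return the addresses touched for a texture index (hypothetical helper)."""
    return {
        "prim": PRIM_BASE + texture_index * 0x14,
        "prim_field": PRIM_FIELD + texture_index * 0x14,
        "bind": BIND_BASE + texture_index * 0xC,
    }


for idx in range(3):
    addrs = indirect_addresses(idx)
    print(idx, {name: hex(addr) for name, addr in addrs.items()})
```

Only index 0 resolves to 0x80120de4 itself, which would explain why the entries for the other indices never show up as direct references in Ghidra.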

render_chopper() only gets called when 0x80120610 != 0 (named w_egg_chopper). However, the only direct references to this variable set it to 0…

On the topic of “Continuously press R2, L2, R2, L2, R2, L2, etc”

Let’s put a memory write breakpoint on w_egg_chopper. While inputting the sequence for the chopper (which I had quite a bit of trouble activating… more on that later), we hit an indirect reference in FUN_80110bac (named check_input). Here’s where it gets interesting:

if (DAT_8011f6bc != 0) {
  puVar5 = &DAT_80120260;
  piVar9 = &DAT_801202b0;
  i = 0;
  do {
    if (*(int *)((int)&DAT_80120254 + i) != 0) {
      if (*piVar9 == 1) {
        uVar6 = 1;
      }
      else {
        uVar6 = (uint)(*(int *)((int)&DAT_8012025c + i) == *(int *)((int)&DAT_80120258 + i));
      }
      *(uint *)((int)&DAT_801202b0 + i) = uVar6;
      uVar6 = *(uint *)((int)&DAT_8012025c + i) & -(uint)((int)*(uint *)((int)&DAT_8012025c + i) < 0x14);
      *(uint *)((int)&DAT_8012025c + i) = uVar6;
      iVar7 = 0;
      if (*(uint *)(puVar5 + uVar6 * 4) == DAT_8011f6bc) {
        iVar7 = uVar6 + 1;
      }
      *(int *)((int)&DAT_8012025c + i) = iVar7;
      if (*piVar9 == 1) {
        uVar6 = 1;
      }
      else {
        uVar6 = (uint)(iVar7 == *(int *)((int)&DAT_80120258 + i));
      }
      *(uint *)((int)&DAT_801202b0 + i) = uVar6;
    }
    piVar9 = piVar9 + 0x18;
    i = i + 0x60;
    puVar5 = puVar5 + 0x60;
  } while ((int)piVar9 < 0x80120790);
}

While the logic involved isn’t immediately clear, we can already see some patterns:

  • These input checks are guarded by DAT_8011f6bc, which is set whenever we press a button;
  • If the value at DAT_80120254 + i is 0, we skip the checks as well;
  • A few indirect accesses are done sequentially (DAT_80120254, DAT_80120258, DAT_8012025c), which together with the common i offset, suggest this is an array of structures with (at least) 3 fields;

If we go check the values around DAT_80120254, they are all zeroed. It turns out that they are only initialized at runtime by FUN_8011884c (named init_cheats), so the overlay loaded in Ghidra had to be replaced with a memory dump.

Afterwards, I wrote a Ghidra script to parse these structures, so that we can have a summarized view of these values:

import ghidra.app.script.GhidraScript;
import ghidra.app.util.bin.BinaryReader;
import ghidra.app.util.bin.ByteProvider;
import ghidra.app.util.bin.MemoryByteProvider;

public class SC2KCheats extends GhidraScript {
    public void run() throws Exception {
        println(String.format("base=0x%X", currentProgram.getImageBase().getUnsignedOffset()));
        ByteProvider provider = new MemoryByteProvider(currentProgram.getMemory(), currentProgram.getImageBase());
        BinaryReader reader = new BinaryReader(provider, true);

        long flags = 0x001202b0;
        long cheat_base = 0x00120254;
        long cheat_btns = 0x00120260;
        long i = 0;
        do {
            println(String.format("cheat_base+i=0x%08X, flag+i=0x%08X", cheat_base + i, flags + i));

            long has      = reader.readUnsignedInt(cheat_base + i);
            long expected = reader.readUnsignedInt(cheat_base + i + 0x4);
            long target   = reader.readUnsignedInt(cheat_base + i + 0x8);
            println(String.format("0x%08X 0x%08X 0x%08X", has, expected, target));

            // In the decompilation, `piVar9 + 0x18` is int* arithmetic (0x18 * 4 = 0x60 bytes).
            // Here `i` already carries that byte stride, so `flags` stays a fixed base.
            i = i + 0x60;
            cheat_btns = cheat_btns + 0x60;
        } while (flags + i < 0x00120790);
    }
}

Output:

base=0x80000000
cheat_base+i=0x00120254, flag+i=0x001202B0
0x00000001 0x00000007 0x00000000
cheat_base+i=0x001202B4, flag+i=0x00120310
0x00000001 0x00000007 0x00000000
cheat_base+i=0x00120314, flag+i=0x00120370
0x00000001 0x00000007 0x00000000
cheat_base+i=0x00120374, flag+i=0x001203D0
0x00000001 0x00000008 0x00000000
cheat_base+i=0x001203D4, flag+i=0x00120430
0x00000001 0x00000008 0x00000000
cheat_base+i=0x00120434, flag+i=0x00120490
0x00000001 0x00000008 0x00000000
cheat_base+i=0x00120494, flag+i=0x001204F0
0x00000001 0x00000008 0x00000000
cheat_base+i=0x001204F4, flag+i=0x00120550
0x00000001 0x00000008 0x00000000
cheat_base+i=0x00120554, flag+i=0x001205B0
0x00000001 0x00000008 0x00000001
cheat_base+i=0x001205B4, flag+i=0x00120610 # w_egg_chopper
0x00000001 0x00000008 0x00000000
cheat_base+i=0x00120614, flag+i=0x00120670 # w_egg2_pre1
0x00000001 0x00000008 0x00000000
cheat_base+i=0x00120674, flag+i=0x001206D0 # w_egg1
0x00000001 0x00000008 0x00000000
cheat_base+i=0x001206D4, flag+i=0x00120730
0x00000000 0x00000008 0x00000000
# ...

I’ve commented a few variables that seemed to be referenced along with the chopper cheat. A lot of them seem to be ready to parse by default (cheat_base + i = 1), but not much else we can say about it beyond some recurring values.

You might have noticed the cheat_btns above. In init_cheats(), it marks the start of what appeared to be button values:

DAT_80120260 = 0x2000;
DAT_80120264 = 0x8000;
DAT_80120268 = 0x2000;
DAT_8012026c = 0x8000;
DAT_80120270 = 0x2000;
DAT_80120274 = 1;
DAT_80120278 = 2;
DAT_801202b4 = 1;
DAT_80120310 = 0;
DAT_801202bc = 0;

Which gets more interesting further down:

w_egg_chopper = 0;
DAT_801205b8 = 8;
DAT_801205bc = 0;
DAT_801205c0 = 1;
DAT_801205c4 = 2;
DAT_801205c8 = 1;
DAT_801205cc = 2;
DAT_801205d0 = 1;
DAT_801205d4 = 2;
DAT_801205d8 = 1;
DAT_801205dc = 2;
DAT_80120614 = 1;

If we inspect the values of DAT_8011f6bc in Mednafen’s memory view, we can figure out that it’s a bitmask with the following button mapping (listed in the byte order shown in the memory view; read as a little-endian value, L2 becomes 0x0001, R2 becomes 0x0002, and so on):

     △ = 0x1000
     ○ = 0x2000
     × = 0x4000
     □ = 0x8000
    L2 = 0x0100
    R2 = 0x0200
    L1 = 0x0400
    R1 = 0x0800
     ↑ = 0x0010
     → = 0x0020
     ↓ = 0x0040
     ← = 0x0080
Select = 0x0001
 Start = 0x0002
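
As a quick sanity check, here is a small Python sketch that turns the stored sequence values from init_cheats() back into button names. It assumes, as described above, that the stored values are the byte-swapped form of the table entries; the helper names are mine.

```python
# Button table exactly as listed above (byte order as seen in the memory view).
AS_LISTED = {
    "△": 0x1000, "○": 0x2000, "×": 0x4000, "□": 0x8000,
    "L2": 0x0100, "R2": 0x0200, "L1": 0x0400, "R1": 0x0800,
    "↑": 0x0010, "→": 0x0020, "↓": 0x0040, "←": 0x0080,
    "Select": 0x0001, "Start": 0x0002,
}


def swap16(value):
    """Swap the two bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | (value >> 8)


# Assumption: the values stored by init_cheats() are the little-endian
# reading of the table entries, so each listed value is byte-swapped.
BY_VALUE = {swap16(listed): name for name, listed in AS_LISTED.items()}


def decode(sequence):
    """Map stored button values to names, leaving unknown values as hex."""
    return [BY_VALUE.get(value, hex(value)) for value in sequence]


# Values stored right after w_egg_chopper in init_cheats().
print(decode([1, 2, 1, 2, 1, 2, 1, 2]))
```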

Right after the assignment to the chopper cheat’s variable (w_egg_chopper) come DAT_801205b8 (cheat_base + 4) and DAT_801205bc (cheat_base + 8), followed by the input sequence… or is it? Most sources list it as R2, L2, R2, L2, R2, L2, but here it’s L2, R2, L2, R2, L2, R2, L2, R2. If we look at DAT_801205bc in Mednafen’s memory view, we see that it gets incremented when we match each button in the sequence (with very short presses, so that each doesn’t count more than once, which resets the counter). Furthermore, the cockpit only shows up when the values at 0x801205bc and 0x801205b8 are both 8:

So yeah, this cheat is wrong on all those sources! Same goes for the cheat that hides the cockpit, which should be L1, R1, L1, R1, L1, R1, L1, R1.

Let’s focus on another input sequence, which is L1, R2, L1, R2, L1, R2, L1, R2:

_w_egg1 = 0;
_DAT_80120678 = 8;
_DAT_8012067c = 0;
_DAT_80120680 = 4;
_DAT_80120684 = 2;
_DAT_80120688 = 4;
_DAT_8012068c = 2;
_DAT_80120694 = 2;
_DAT_8012069c = 2;
_DAT_80120690 = 4;
_DAT_80120698 = 4;

If we go back to where render_chopper() is called, there’s this conditional logic:

  if (_w_egg_chopper == 0) {
    _DAT_80120730 = 0;
  }
  else {
    if (w_is_chopper != 0) {
      render_chopper();
    }
    if (w_egg2 == 7) {
      _DAT_801206d4 = 1;
      goto LAB_80119198;
    }
  }
  _DAT_801206d4 = 0;
LAB_80119198:
  if (_w_egg2_pre1 != 0) {
    if (_w_egg_chopper != 0) {
      w_egg2 = w_egg2 + 1;
    }
    _w_egg2_pre1 = 0;
    _w_egg_chopper = 0;
    _w_egg1 = 0;
  }

That w_egg2 == 7 grabbed my attention, since there’s another cheat that requires opening and hiding the cockpit 8 times. We do see it being incremented when w_egg2_pre1 != 0, which then sets w_egg_chopper = 0, so it seems w_egg2_pre1 encodes the cockpit being hidden.

Remember the textures we saw being indirectly set in render_chopper()? They are guarded by these variables:

if (_w_egg1 != 0 && w_egg2 == 3) {
    // ...
}

We got all the puzzle pieces! Those textures are shown when we:

  • Open and hide the chopper cockpit 4 times (for each time, 0x801205bc == 8 and 0x8012061c == 8), leaving it open on the last time (0x8011f720 == 3);
  • Afterwards, press L1, R2, L1, R2, L1, R2, L1, R2 (0x8012067c == 8);

Surprise, a slideshow of those kids on top of the minimap:

Now with glasses on

With the benefit of dynamic analysis, we can go back to the input checks and give some more clarifying names:

if (w_pad1_btn != 0) {
  cheat_btns = &DAT_80120260;
  flags = (uint *)&DAT_801202b0;
  i = 0;
  do {
    if (*(int *)((int)&cheat_80120254.enabled + i) != 0) {
      if (*flags == 1) {
        cheat_i = 1;
      }
      else {
        cheat_i = (uint)(*(int *)((int)&cheat_80120254.match_count + i) == *(int *)((int)&cheat_80120254.expected_count + i));
      }
      *(uint *)(&DAT_801202b0 + i) = cheat_i;
      cheat_i = *(uint *)((int)&cheat_80120254.match_count + i);
      cheat_i = cheat_i & -(uint)((int)cheat_i < 0x14);
      *(uint *)((int)&cheat_80120254.match_count + i) = cheat_i;
      match = 0;
      if (*(uint *)(cheat_btns + cheat_i * 4) == w_pad1_btn) {
        match = cheat_i + 1;
      }
      *(int *)((int)&cheat_80120254.match_count + i) = match;
      if (*flags == 1) {
        cheat_i = 1;
      }
      else {
        cheat_i = (uint)(match == *(int *)((int)&cheat_80120254.expected_count + i));
      }
      *(uint *)(&DAT_801202b0 + i) = cheat_i;
    }
    flags = flags + 0x18;
    i = i + 0x60;
    cheat_btns = cheat_btns + 0x60;
  } while ((int)flags < 0x80120790);
}

Overall:

  • We iterate through all input sequences (i = i + 0x60), checking one at a time;
  • cheat_i encodes the next button in the sequence to check; if it matches, match_count is incremented, otherwise it’s reset;
  • Each flag (DAT_801202b0 + i) is updated with the result of comparing expected_count == match_count;
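
To double check that reading, here is a minimal Python model of the loop. The field names follow the renamed decompilation; the Cheat class, make_cheat() and the 20-slot padding (inferred from the 0x14 bound) are assumptions of mine, and the button values use the stored form (L2 = 1, R2 = 2).

```python
from dataclasses import dataclass

MAX_BUTTONS = 0x14  # bound used by `cheat_i < 0x14`


@dataclass
class Cheat:
    buttons: list        # padded to MAX_BUTTONS entries
    expected_count: int
    enabled: int = 1
    match_count: int = 0
    flag: int = 0


def make_cheat(sequence):
    """Build a cheat entry for a button sequence (hypothetical helper)."""
    padded = list(sequence) + [0] * (MAX_BUTTONS - len(sequence))
    return Cheat(buttons=padded, expected_count=len(sequence))


def check_input(cheats, pressed):
    """Model of the loop in check_input() for one button press."""
    if pressed == 0:
        return
    for cheat in cheats:
        if not cheat.enabled:
            continue
        # First flag update, based on the count left by the previous press.
        if cheat.flag != 1:
            cheat.flag = int(cheat.match_count == cheat.expected_count)
        # `cheat_i & -(cheat_i < 0x14)`: restart at 0 once past the table.
        idx = cheat.match_count if cheat.match_count < MAX_BUTTONS else 0
        cheat.match_count = idx + 1 if cheat.buttons[idx] == pressed else 0
        # Second flag update, based on the new count; the flag stays set.
        if cheat.flag != 1:
            cheat.flag = int(cheat.match_count == cheat.expected_count)


L2, R2 = 1, 2

as_listed_online = make_cheat([L2, R2] * 4)
for button in [R2, L2, R2, L2, R2, L2]:
    check_input([as_listed_online], button)
print("R2, L2 x3:", as_listed_online.match_count, as_listed_online.flag)

as_stored = make_cheat([L2, R2] * 4)
for button in [L2, R2] * 4:
    check_input([as_stored], button)
print("L2, R2 x4:", as_stored.match_count, as_stored.flag)
```

In this model a wrong or repeated button drops match_count back to 0 rather than to 1, which fits the observation that a press counted twice resets the counter.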

In the end, even this undiscovered easter egg had input sequences similar to the other known cheats. Combined with the fact that one of the cheats is listed wrong pretty much everywhere, it seems to me that those listings weren’t extracted from a disassembly but were copied verbatim from Maxis themselves, so I guess the developers intentionally left the easter egg out.

Funnily enough, this is not the first time that I bump into this kind of an omission, and surely won’t be the last…

“But Wait, There’s More!”

Speaking of oversights, the developers also left one little bug in parsing directory entries, resulting in the last 2 scenarios being unused.

When the player selects “Load Scenario” in the main menu, “slus_001.13” will traverse the filesystem hierarchy under cdrom:/DATA/SCENARIO/. This is done in FUN_8003d2e4 (named load_city_dir), which calls FUN_8003d7e8 (named read_city_entries_pre1) whether we are loading a city or a scenario:

if (param_1 == '\0' && DAT_800e6940 == '\0') {
  read_city_entries_pre1(&w_city_names,&w_city_data,&w_city_display_names,"cdrom:/DATA/CITY/",0x23);
  DAT_800e6940 = '\x01';
}
if (param_1 != '\0' && DAT_800e6941 == '\0') {
  read_city_entries_pre1(&w_scn_names,&w_scn_data,&w_scn_display_names,"cdrom:/DATA/SCENARIO/",0x12);
  DAT_800e6941 = '\x01';
}

read_city_entries_pre1() takes these arguments:

  • List for filenames;
  • Pointer table for each file’s parsed data;
  • List for names displayed in the menu;
  • Path to traverse;
  • Number of file entries to parse;

At first glance, it looks right: we have 0x12 = 18 scenarios to load. So why does the menu only show 16?

If we dig deeper in callee FUN_800c84d8 (named read_city_entries), we see that it directly reads sectors of data off the disc. It’s worth pointing out that, in ISO9660:

Every directory will start with 2 special entries: an empty string, describing the “.” entry, and the string “\1” describing the “..” entry.

But the code seems to have this in mind. The following loop goes through each directory entry, starting at DAT_800ebfb8 (named w_load_city_data):

offset = 0;
buf_names = local_50;
while( true ) {
  bVar1 = false;
  if (offset < 0x800) {
    bVar1 = *file_i < file_cnt;
  }
  uVar6 = 0;
  if (!bVar1) break;
  next_delta = (&w_load_city_data)[offset];
  if (next_delta == 0) break;
  *(undefined4 *)(buf_names + 0x10) = *(undefined4 *)((int)&DAT_800ebfc2 + offset);
  uVar6 = *(undefined4 *)((int)&DAT_800ebfce + offset);
  *(undefined4 *)(buf_names + 0x14) = *(undefined4 *)((int)&DAT_800ebfca + offset);
  *(undefined4 *)(buf_names + 0x18) = uVar6;
  uVar7 = (uint)(byte)(&DAT_800ebfd8)[offset];
  if (0xf < uVar7) {
    uVar7 = 0xf;
  }
  read_file(buf_names,&DAT_800ebfd9 + offset,uVar7);
  if (*buf_names == '\0') {
    *buf_names = '.';
    buf_names[1] = '\0';
  }
  else if (*buf_names == '\x01') {
    *buf_names = '.';
    buf_names[1] = '.';
    buf_names[2] = '\0';
  }
  else {
    buf_names[uVar7] = '\0';
  }
  buf_names = buf_names + 0x1c;
  offset = (uint)next_delta + offset;
  *file_i = *file_i + 1;
}

The first byte of the entry is the length, stored in next_delta, which is used to advance to the next entry. offset points to the current entry. The array of structures pointed to by buf_names (at 0x801076b0) will, among other metadata, contain the parsed name, which is “.” and “..” for the special entries.

That’s the bug: this loop should not be storing special entries, or the supplied entry count should have been 0x14 (18 scenarios plus the 2 special entries). How come this doesn’t affect “Load City”? It does, but DATA/CITY/ only contains 18 entries, and the entry count passed as argument is 0x23, so the bug goes unnoticed.
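
The off-by-two is easy to reproduce outside the game. Below is a Python sketch that builds a fake directory sector and walks it the same way as the loop above, using the same offsets (record length at 0x00, name length at 0x20, name at 0x21). The scenario filenames are made up, and make_record() fills in only the fields this loop reads.

```python
def make_record(name):
    """Build a minimal directory record holding only length and name."""
    length = 33 + len(name)
    length += length % 2              # records are padded to an even length
    record = bytearray(length)
    record[0] = length                # read as next_delta by the loop
    record[0x20] = len(name)
    record[0x21:0x21 + len(name)] = name
    return bytes(record)


def read_entries(sector, file_cnt):
    """Mirror of the decompiled loop: special entries count as entries."""
    names, offset = [], 0
    while offset < 0x800 and len(names) < file_cnt:
        next_delta = sector[offset]
        if next_delta == 0:
            break
        name_len = min(sector[offset + 0x20], 0xF)
        raw = sector[offset + 0x21:offset + 0x21 + name_len]
        if raw[:1] == b"\x00":
            names.append(".")
        elif raw[:1] == b"\x01":
            names.append("..")
        else:
            names.append(raw.decode("ascii"))
        offset += next_delta
    return names


# Hypothetical filenames; only the count (18) matches the game.
files = [b"SCN%02d.SSS;1" % i for i in range(18)]
records = [make_record(name) for name in [b"\x00", b"\x01"] + files]
sector = b"".join(records).ljust(0x800, b"\x00")

for count in (0x12, 0x14):
    real = [n for n in read_entries(sector, count) if n not in (".", "..")]
    print(hex(count), "->", len(real), "scenarios")
```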

By placing a breakpoint at 0x800c8a24, where read_file() is called, we spot the entries stored in w_load_city_data (the first entry’s empty string is highlighted):

After we hit the breakpoint 3 times (i.e. 3 entries parsed), 2 elements in buf_names are for the special entries (the first entry’s parsed name “.” is highlighted):

All of these are then processed by caller read_city_entries_pre1(), to build the menu with the display names:

read_city_entries(buf_names,local_30,path);
i = 0;
while (i < local_30[0]) {
  i = i + 1;
  if ((*(byte *)(buf_names + 0x1b) >> 1 & 1) != 1) {
    printf(auStack112,"%s%s",path,buf_names);
    iVar2 = FUN_801107b8(auStack112,&DAT_800122d4);
    if (iVar2 != 0) {
      uVar3 = *(undefined4 *)(iVar2 + 5);
      uVar5 = *(undefined4 *)(iVar2 + 9);
      uVar6 = *(undefined4 *)(iVar2 + 0xd);
      *display_names = *(undefined4 *)(iVar2 + 1);
      display_names[1] = uVar3;
      display_names[2] = uVar5;
      display_names[3] = uVar6;
      uVar3 = *(undefined4 *)(iVar2 + 0x15);
      uVar5 = *(undefined4 *)(iVar2 + 0x19);
      uVar1 = *(undefined *)(iVar2 + 0x1d);
      display_names[4] = *(undefined4 *)(iVar2 + 0x11);
      display_names[5] = uVar3;
      display_names[6] = uVar5;
      *(undefined *)(display_names + 7) = uVar1;
      uVar1 = *(undefined *)(iVar2 + 0x1f);
      *(undefined *)((int)display_names + 0x1d) = *(undefined *)(iVar2 + 0x1e);
      *(undefined *)((int)display_names + 0x1e) = uVar1;
      *(undefined *)((int)display_names + 0x1f) = 0;
      FUN_800bfe8c(iVar2);
      *data = display_names;
      data = data + 1;
      display_names = display_names + 8;
    }
  }
  buf_names = buf_names + 0x1c;
}

Ok, the fix seems straightforward: just patch the count to be 0x14, right? Of course, it’s never that simple… If we make that change, the game crashes while building the menu. The issue is illustrated below:

“SANFRAN.SSS” is the scenario filename that comes before the 2 unused scenarios. Their filenames were actually stored right after it, but were then overwritten by the pointer table w_scn_data, which starts at 0x801078a8!
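
The overlap follows directly from the addresses. Here is a short Python check, using the buf_names start (0x801076b0), the 0x1c entry size and the w_scn_data start quoted in this post; the function names are mine.

```python
BUF_NAMES = 0x801076B0   # start of the parsed-entry array
W_SCN_DATA = 0x801078A8  # start of the pointer table
ENTRY_SIZE = 0x1C        # stride used by `buf_names + 0x1c`


def entries_before_table():
    """How many whole entries fit between buf_names and w_scn_data."""
    return (W_SCN_DATA - BUF_NAMES) // ENTRY_SIZE


def overlapping(entry_count):
    """Indices of entries whose bytes extend past the start of w_scn_data."""
    return [i for i in range(entry_count)
            if BUF_NAMES + (i + 1) * ENTRY_SIZE > W_SCN_DATA]


print(entries_before_table())
print(overlapping(0x12), overlapping(0x14))
```

Exactly 0x12 entries fit before the table, so the original count never collides with it, while entries 18 and 19 of a 0x14-entry listing land on top of it.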

It’s not a huge deal, we can just update references to w_scn_data to instead use some unallocated memory. A bit before w_scn_names is address 0x80107600, which seems to be filled with nulls, even if we load cities. Well, good enough for now, we update both the entry count and those references with the following Gameshark codes:

d00c1694 ffe8
8003d3f4 0014
8003e9d8 7600
8003d404 7600
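
For readers unfamiliar with the format, here is a small Python sketch that decodes these lines. It assumes the commonly documented PS1 GameShark code types, where 80 is a 16-bit write and D0 runs the following line only if the 16-bit value at the address equals the given one; the function name and the labels it returns are mine.

```python
# Assumed code types: 0x80 = 16-bit write, 0xD0 = 16-bit "equal" condition.
CODE_TYPES = {0x80: "write16", 0xD0: "if_equal16"}


def decode_gameshark(line):
    """Split a code line into (kind, RAM address, 16-bit value)."""
    addr_part, value_part = line.split()
    code_type = int(addr_part[:2], 16)
    address = 0x80000000 | int(addr_part[2:], 16)
    value = int(value_part, 16)
    return CODE_TYPES.get(code_type, "unknown"), address, value


codes = [
    "d00c1694 ffe8",
    "8003d3f4 0014",
    "8003e9d8 7600",
    "8003d404 7600",
]
for code in codes:
    kind, address, value = decode_gameshark(code)
    print(f"{kind:<10} 0x{address:08X} 0x{value:04X}")
```

Read this way, the second line patches the entry count to 0x14, and the last two replace the low half of the w_scn_data address (0x78a8) with 0x7600.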

Finally, we get to experience these scenarios over a quarter century later: