During my holidays last October I picked up my emulator project again (hardware and software) and have worked on it on and off in my free time since.
The emulator still runs on the STM32F429 discovery board with a 180MHz ARM Cortex-M4 with 192+64KB RAM and 2MB flash, a 320x240 LCD and 8MB external RAM.
I added a carrier board with an STM32F103 that can read gb cartridges as well as SNES and Sega gamepads, and optimized the emulator to run at reasonable speeds.
The board also has a MicroSD slot, audio amplifier, speaker, headphone jack, voltage regulators and a battery holder.
The STM32F1 also has a clock crystal and a coin cell for the RTC, which I want to use for MBC3 emulation.
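For the MBC3 part, here is a rough sketch of what the mapping could look like. It is not code from the emulator. It assumes the RTC delivers a plain count of elapsed seconds, and the struct and function names are made up for illustration. The register layout (seconds, minutes, hours, a 9-bit day counter, plus halt and carry flags in the upper day register) is the one the MBC3 presents to games.

```c
#include <stdint.h>

/* MBC3 RTC register image: seconds, minutes, hours and a 9-bit day counter. */
typedef struct {
    uint8_t seconds;   /* 0-59 */
    uint8_t minutes;   /* 0-59 */
    uint8_t hours;     /* 0-23 */
    uint8_t day_low;   /* lower 8 bits of the day counter */
    uint8_t day_high;  /* bit 0: day counter bit 8, bit 6: halt, bit 7: carry */
} mbc3_rtc_regs;

/* Hypothetical helper: convert elapsed seconds into MBC3 register values. */
static mbc3_rtc_regs mbc3_rtc_from_elapsed(uint32_t elapsed_seconds)
{
    mbc3_rtc_regs r;
    uint32_t days = elapsed_seconds / 86400u;
    uint32_t rem  = elapsed_seconds % 86400u;

    r.seconds  = (uint8_t)(rem % 60u);
    r.minutes  = (uint8_t)((rem / 60u) % 60u);
    r.hours    = (uint8_t)(rem / 3600u);
    r.day_low  = (uint8_t)(days & 0xFFu);
    r.day_high = (uint8_t)((days >> 8) & 0x01u);
    if (days > 511u) {
        r.day_high |= 0x80u;  /* day counter overflowed: set the carry flag */
    }
    return r;
}
```

The halt flag (bit 6) is left clear here. A real implementation would also have to remember the offset a game writes when it sets the clock.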
I replaced the horribly slow pixel-by-pixel renderer and got a nice ~3x speedup. Moving as much as possible into the internal RAM gave another ~3x speedup.
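To give an idea of what rendering in bigger units than single pixels can look like, here is a minimal sketch that decodes a whole 8-pixel tile row in one call. This is only an illustration of the general technique, not the renderer described above. The function name and the 16-bit palette format are assumptions. The tile format itself is the standard gb one: two bytes per row, one per bit plane, with bit 7 as the leftmost pixel.

```c
#include <stdint.h>

/* Decode one 8-pixel tile row.
 * lo and hi are the two bit planes of the row; bit 7 is the leftmost pixel.
 * palette maps the 2-bit color index to an output pixel (format assumed). */
static void decode_tile_row(uint8_t lo, uint8_t hi,
                            const uint16_t palette[4], uint16_t out[8])
{
    for (int x = 0; x < 8; x++) {
        int bit = 7 - x;
        unsigned color = ((lo >> bit) & 1u) | (((hi >> bit) & 1u) << 1);
        out[x] = palette[color];
    }
}
```

Working per tile row means the tile map and tile data are fetched once per 8 pixels instead of once per pixel, which is where most of the gain of this kind of approach comes from.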
Only the cartridge ROM and the framebuffers remain in the external memory, as they are simply too large.
Some improvements to the CPU template code, some tweaks to the blitter and caching sprite lookups probably increased the frame rate by another 40-50%, which is enough to run (almost?) all gb games above 60fps, some even close to 100fps.
Color games are a different issue. Even with the fast dmg/sgb renderer, they don't reach more than 40-50fps. I guess the renderer is now so fast that the additional CPU cycles in double speed mode have a noticeable impact. And adding the gbc specifics back to the renderer will likely make it even slower :down:
The graphics controller hardware has some useful features, including two independent layers with blending. The background layer is set to 160x144 and displays the game, while the foreground layer has 320x240 pixels and displays the SGB border and everything else. This also allows the border to overlap the game without special handling or additional overhead. It also does vsync with double buffering.
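Placing the small game layer inside the full-screen layer is just a bit of arithmetic. The sketch below assumes the game layer is centered and has the gb's native 160x144 size. Both are assumptions for illustration, and the names are made up. The actual layer window registers also include the display's sync and porch offsets, which are left out here.

```c
#include <stdint.h>

/* Window of a layer in screen coordinates: [x0, x1) by [y0, y1). */
typedef struct {
    uint16_t x0, y0, x1, y1;
} layer_window;

/* Hypothetical helper: center a layer on the screen.
 * Assumes the layer is not larger than the screen. */
static layer_window center_layer(uint16_t screen_w, uint16_t screen_h,
                                 uint16_t layer_w, uint16_t layer_h)
{
    layer_window w;
    w.x0 = (uint16_t)((screen_w - layer_w) / 2);
    w.y0 = (uint16_t)((screen_h - layer_h) / 2);
    w.x1 = (uint16_t)(w.x0 + layer_w);
    w.y1 = (uint16_t)(w.y0 + layer_h);
    return w;
}
```

For a 320x240 screen and a 160x144 layer this gives a window from (80, 48) to (240, 192).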
The STM32F1 has enough free pins to connect to a gb cartridge and still have some pins left for the gamepad and the SPI bus from the STM32F4.
The STM32F4 has a 12-bit DAC that I use for sound. It has two channels, but unfortunately one of the outputs is occupied by the LCD interface, so I'm limited to mono. It is connected to a TDA7052A amplifier which drives a small speaker, similar to the one in the gameboy, and a headphone jack. I still have to write my own gb sound emulation, but the hardware side is fully operational and playback of raw data from the SD card works.
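Feeding a mono 12-bit DAC from stereo audio comes down to two small conversions. The sketch below assumes signed 16-bit stereo samples as input, which is an assumption and not necessarily the raw format used on the SD card. The function names are made up for illustration.

```c
#include <stdint.h>

/* Mix a stereo pair down to mono by averaging; the sum is done in
 * 32 bits so it cannot overflow. */
static int16_t mix_to_mono(int16_t left, int16_t right)
{
    return (int16_t)(((int32_t)left + (int32_t)right) / 2);
}

/* Convert a signed 16-bit sample to an unsigned 12-bit DAC value:
 * shift the range [-32768, 32767] up to [0, 65535], then drop 4 bits. */
static uint16_t pcm16_to_dac12(int16_t sample)
{
    uint16_t unsigned_sample = (uint16_t)((int32_t)sample + 32768);
    return (uint16_t)(unsigned_sample >> 4);
}
```

Silence (sample value 0) ends up at 2048, the middle of the DAC range.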
The board is powered either over USB or from 2x1.5V AA batteries with a 3V and a 5V step-up regulator. I had a 3.3V regulator module from China with abysmal efficiency, but even my self-built module only manages 60%-80% efficiency depending on the battery voltage. Measuring the exact currents and power consumption is not easy, but it seems that at low battery voltages I'm close to the maximum output current, where the efficiency drops significantly. I lowered the output voltage from 3.3V to 3V, which helped with the overall consumption but not much in terms of efficiency.
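The relation behind these numbers is simple: efficiency is output power divided by input power, so for a fixed load the battery current rises as the battery voltage falls. A small sketch of that arithmetic, with made-up function names and no measured values:

```c
/* Regulator efficiency: output power divided by input power (0.0 to 1.0). */
static double regulator_efficiency(double v_in, double i_in,
                                   double v_out, double i_out)
{
    return (v_out * i_out) / (v_in * i_in);
}

/* Battery current needed to supply a given load at a given efficiency. */
static double battery_current(double v_in, double v_out,
                              double i_out, double efficiency)
{
    return (v_out * i_out) / (v_in * efficiency);
}
```

This also shows why lowering the output voltage reduces consumption even if the efficiency stays about the same: the output power term gets smaller.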