Difference between revisions of "Gemini Wing Repair Logs"
Revision as of 13:43, 21 September 2016
This page is intentionally not yet linked to the index of the wiki. This will change in a few hours. Meanwhile please leave it hidden. Thanks.
Symptom: Missing drum sounds
This repair effort started with a LOL, and I'm no longer sure whether I should call it a “repair” or something else. I had found a Gemini Wing board at a nice price, advertised as working, and decided to buy it. When I fired it up and started playing the game I felt there was “something missing” in the audio, but couldn't really point out what it was. Puzzled, I fired up MAME for a comparison and realized that the digital drums were the missing part. A quick look at the audio section of the board, compared with the list of audio chips reported by MAME, made me laugh: the entire digital audio section was not populated, and I hadn't noticed until then!
Ok, bootleggers need to save money, but omitting an entire section of the circuit is pushing it a little bit too far. This dark omen could only mean one thing: the bootleggers had made a mistake on the PCB big enough to render the entire area useless. Unfortunately this later turned out to be correct. As I have a friend who owns a similar bootleg board where that area is populated, I asked him to shoot a photo from which I could see what parts were supposed to be there. But that wasn't enough of course: I needed to find out what was wrong with the design of my board before I could assemble the missing parts. Time for reverse-engineering! Luckily the total absence of components or sockets made it easy to follow the traces and figure out the schematics.
The way this section works is very straightforward. The heart of the system is the MSM5205 at W8. It generates a sample clock from the 400kHz ceramic resonator, whose frequency is divided by 64. As a result, VCK is about 6kHz (400kHz / 64 = 6.25kHz), which in turn means the hardware can only play back audio with around 3kHz of bandwidth; very low fidelity indeed, but enough for the drums and for the thunder sound in level 4. The LS157 mux at X8 forwards sample data from the 27C256 EPROM at W7 to the OKI DAC 4 bits at a time. The flip-flop at U8 takes care of alternately selecting the high nibble or the low nibble. The addressing of the EPROM is split in two parts: the lower 8 bits come from the LS393 counter at U6, while the higher 7 bits come from the LS191 counters at X6 and W6. The two LS85 at X5 and W5 compare the higher 7 address bits to a constant value provided by the LS373 latch at U5. Once the two values match, the LS74 flip-flop at U8 flips from the play state to the stop state. The rest of the logic allows the audio CPU to set an 8 bit end address into the latch at U5, an 8 bit start address into the counters at X6 and W6, and a 4 bit volume level, which controls a 4066 that switches different resistors into the feedback path of an operational amplifier to set the volume of the sample playback. Samples therefore always start and end on memory addresses that are multiples of 256. In order to prevent audio glitches, the CPU must first program an end address and then a start address, because the latter operation also triggers the start of the playback. Reasonable and cost-effective.
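The arithmetic and addressing described above can be put into a few lines of Python. This is only a rough sketch: the resonator frequency, the divider and the 256-byte granularity come from the description above, while the function names, the "end page is exclusive" rule and the high-nibble-first order are my own assumptions for illustration and have not been verified on the board.

```python
RESONATOR_HZ = 400_000    # ceramic resonator feeding the MSM5205 (from the text)
DIVIDER = 64              # divider that yields VCK (from the text)
PAGE_BYTES = 256          # start/end addresses are multiples of 256 (from the text)
NIBBLES_PER_BYTE = 2      # each EPROM byte carries two 4-bit samples


def vck_hz():
    """Sample clock presented to the DAC."""
    return RESONATOR_HZ / DIVIDER


def bandwidth_hz():
    """Highest reproducible frequency (half the sample rate)."""
    return vck_hz() / 2


def sample_duration_s(start_page, end_page):
    """Playback time of a sample spanning [start_page, end_page).

    ASSUMPTION: playback stops as soon as the upper address bits
    equal end_page, so the end page itself is never played.
    """
    if end_page <= start_page:
        raise ValueError("end_page must be greater than start_page")
    nibbles = (end_page - start_page) * PAGE_BYTES * NIBBLES_PER_BYTE
    return nibbles / vck_hz()


def nibble_stream(rom, start_page, end_page, high_first=True):
    """Yield the 4-bit values in the order the mux would forward them.

    ASSUMPTION: the high nibble goes first; flip high_first otherwise.
    """
    for addr in range(start_page * PAGE_BYTES, end_page * PAGE_BYTES):
        hi, lo = rom[addr] >> 4, rom[addr] & 0x0F
        yield from ((hi, lo) if high_first else (lo, hi))


print(vck_hz())                  # 6250.0
print(bandwidth_hz())            # 3125.0
print(sample_duration_s(0, 1))   # one 256-byte page lasts about 82 ms
```

With these numbers, one 256-byte page holds 512 nibbles, so the shortest possible sample lasts roughly 82 milliseconds.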
As you can see from the schematic above, the bootleggers made two big mistakes: first of all they shorted together the address lines A9 and A10 of the EPROM. Once I had completed the schematic of this area it was no problem to cut the trace in the right place and fix the issue with a wire. Second: they wired pin 3 of the DAC to ground; doing so makes the DAC ignore the 4th bit coming from the LS157 mux, which is very important as it is the sign bit. Just for the sake of curiosity I tried leaving pin 3 of the DAC as it was and the audio samples were garbled and very low in amplitude, confirming that the right setting is to pull this pin high. A little more cutting and wiring fixed this as well. Here are two photos showing the patchwork and the assembled sockets.
My board was now finally playing samples as Tecmo intended. ...or maybe not. While playing the game I ran into a very weird issue: the samples would occasionally play out garbled and “farty” for no apparent reason, while they played fine most of the time throughout the game. The issue appeared almost exclusively in level 4, so I tried dumping and checking the romset: everything matched the gemini set documented by MAME except two bytes of the main CPU's program code. Searching on the internet, it turned out that someone else had already reported the same differences on his own board.
This confirmed that my EPROMs were fine and I was just using an undumped version of this game. By the way, the difference between the two sets seems to be just the missing territorial warning screen. A careful investigation of the sample player was needed. With the scope I could see the DAC emitting saturated waveforms, probably caused by wrong input data. As this DAC is ADPCM, the input data is added to an internal 4 bit accumulator to determine the output value. Input data is therefore signed and represents the positive or negative offset from the current voltage level to the next one. If random data reaches the input, the accumulator will likely get out of control, overflow and roll back to 0000 or underflow and roll up to 1111, thereby producing very high amplitude square waves which sound like digital farts. The cause of this behaviour is yet another design mistake of the bootleggers, much harder to spot than the other two. You see, when the sample player has finished playing a sample, the play/stop flip-flop switches to the stop state and makes the player ready to accept a new sample. But what happens if the CPU tries to start a new sample while the previous one hasn't finished playing yet? The flip-flop ignores the start command, because it's already playing, and as a result no reset pulse informs the rest of the circuitry about the sudden change of plans. Everything keeps running happily, except that the start and end addresses for the player have changed. To make things worse, the new start pulse from the CPU arrives completely asynchronously to the clock of the DAC. This results in the output data bits of the EPROM changing value while the DAC is acquiring them. No wonder the DAC was misbehaving!
In order to fix this, I needed to present a reset pulse to the circuitry every time a new play request came in; but I also needed to preserve the important start/stop logic and to comply with the timing requirements of the DAC, which needs a reset pulse at least 4 VCK pulses long in order to successfully reset its internal circuitry. The solution I came up with is the circuitry below, based on an LS161 and an LS00. The basic idea is that this small patch circuit replaces the LS74 at U8: I built a patch board which plugs in place of the IC at U8, hosts the LS74 which I removed from the main board and combines it with the LS161 counter and the LS00 gate.
As you can see, the LS74 is almost directly connected to its original socket. The main difference is that its signal is helped by a stretched reset signal coming from the LS161. The LS161 starts counting from 0 up to 4 at the rate of one LSB per VCK pulse whenever a new start pulse arrives. Its output is NANDed with the output of U8. This way U8 can enforce the long standing reset that is required by the stop state, but the LS161 dictates that any reset pulse must be at least 4 VCK pulses long to ensure that the DAC's internal register gets cleared before accepting new data. Furthermore, the LS161 always reacts to all start pulses and can therefore enforce a reset even if a sample is still playing. This certainly delays the start of the playback, but with a clock frequency of around 6kHz the whole delay will be less than 0.7 milliseconds and therefore not detectable by the human ear. Here is a picture of the patch board and the completed work.
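To make the difference between the original logic and the patch easier to follow, here is a small behavioural model in Python. It is not a gate-level simulation of the LS74, LS161 and LS00: it only assumes, as described above, that the original circuit holds reset while stopped and ignores a start request while playing, whereas the patched circuit stretches a reset over 4 VCK ticks after every start request. The names `reset_trace` and `start_delay_s` are made up for this sketch, and the sample is assumed to be long enough never to reach its end address within the trace.

```python
MIN_RESET_TICKS = 4          # reset length required by the DAC (from the text)
VCK_HZ = 400_000 / 64        # about 6kHz sample clock (from the text)


def reset_trace(start_ticks, n_ticks, patched):
    """Return, per VCK tick, whether the DAC reset is asserted.

    start_ticks: ticks at which the CPU writes a new start address.
    patched:     False models the bootleg logic, True the patch board.
    """
    starts = set(start_ticks)
    playing = False          # play/stop flip-flop, stopped at power-up
    stretch = 0              # remaining ticks of the stretched reset
    trace = []
    for tick in range(n_ticks):
        if tick in starts:
            playing = True   # a start request always leads to "play"
            if patched:
                stretch = MIN_RESET_TICKS   # counter restarts on EVERY request
        # Reset is asserted in the stop state or while the stretch is running.
        trace.append((not playing) or stretch > 0)
        if stretch > 0:
            stretch -= 1
    return trace


def start_delay_s():
    """Worst-case playback delay introduced by the stretched reset."""
    return MIN_RESET_TICKS / VCK_HZ


# A second start request arrives at tick 8 while the first sample is still playing.
print(reset_trace([2, 8], 14, patched=False))   # no reset at all after tick 1
print(reset_trace([2, 8], 14, patched=True))    # reset on ticks 2-5 and 8-11
print(start_delay_s())                          # about 0.64 ms
```

In the unpatched trace the second request produces no reset at all, which is exactly the situation that lets the DAC run into garbage data.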
One last very important step: Pin 1 of the IC at T7 should be cut from the trace it is connected to, and connected to the STARTADD# signal instead (pin 13 of the U8 socket). Without this important change, the now longer reset pulse generated by the patch board would keep the LS174 latch at T7 cleared while the CPU writes the new volume value to it for the upcoming sample; as a result all samples would be muted. You can see this last change wired on the above photo.
My fix was successful, and my bootleg Gemini Wing board is now always playing samples correctly, just as the original designers intended. And this game's awesome soundtrack is now fully enjoyable. I know what you are thinking: this was totally overkill for a bootleg and not worth investing time or money into. You are definitely right; yet I had a lot of fun doing this, and I am sure this article will prove useful to all owners of Gemini Wing bootleg boards with farting or absent sample playback. ;)
Needless to say, the endeavour wasn't over yet. While enjoying the awesome music of this shmup over a few credits, I got so carried away that I actually completed the game and reached the ending sequence. And that's where the actual surprise came in. Every single part and detail of this game was now beautifully perfect up until the final bit of the ending sequence, where I got something I really wasn't expecting. See this video.
OH NO! How am I going to debug this now? MAME to the rescue! Except MAME was displaying something so incredibly different that I feared the effect wasn't emulated correctly. You can see MAME's version in this video, as well as in many others; I couldn't find any video of the ending sequence shot on real hardware.
No other choice but to start studying the code of the game. MAME has been a precious help here, because its powerful debugger makes it the ideal tool to investigate these kinds of issues. By displaying the isolated contents of each and every single layer I was able to establish the first fact: this effect only occurs on the background layer. The foreground layer is completely transparent, and the text layer is used to frame the display to the same small rectangle used for the horizontal flight scene where the credits roll. No sprites are used, therefore what I was looking for had to do with the background layer. As MAME states, the background layer has no special features apart from scrolling; there is a 16bit register for horizontal scrolling and an 8bit register for vertical scrolling. But as you can see from my video a zooming effect is clearly visible.
Note that the game has vertical orientation, therefore the monitor is rotated to the right by 90°. This means that from the user perspective the raster beam runs vertically from top to bottom and the screen is scanned column by column from right to left. It's not so easy to wrap one's mind around this concept, so I will describe this pretending the monitor is in the wrong non-rotated position. The play field is therefore scrolling from left to right instead of from top to bottom.
The use of a 16 bit register for horizontal scrolling is more than justified: it is used to scroll the quite long play field from the start to the end of each level. However, there is no place in the game where the play field scrolls vertically. Tecmo released two other games which run on similar hardware: Rygar in 1986 and Silkworm in 1988. Vertical scrolling is never used in Silkworm, and Rygar only has a few places where vertical scrolling is used (namely the areas where you climb a rope up or down, with frogs trying to hit you with their tongue). Gemini Wing is from 1987, which means its hardware is very likely to match Rygar's to a great degree, perhaps with slight improvements.
Vertical scrolling is controlled by the register at 0xF805. With MAME set to break execution whenever that register is written, the game runs just fine up until the ending sequence where the faulty effect starts; that's when the code first writes a value other than zero there. The code that implements the effect starts at 0x1391 and is triggered through a flag by the interrupt that reaches the CPU at the end of each frame, when the vertical blanking interval starts. The code does not use any timing reference other than this single interrupt, which makes it extremely time critical. Internally it computes two rough half sine waves and pokes the horizontal and vertical scrolling registers with the computed values every few raster lines. This is called a “raster effect”; I remember doing a lot of this stuff when I was a kid coding intros on the C64... The goal is basically to trick the video hardware into behaving differently on a line by line basis in order to achieve visual effects that would otherwise be impossible on the platform. Gemini Wing uses the horizontal scrolling register to tear the picture on the screen. You can clearly see from the video that odd slices go up and even slices go down. The vertical scrolling register is instead used to impose a different vertical position on every line. By choosing the vertical position wisely, it's possible to display the same graphics on two adjacent lines, thereby stretching the image. It is also possible to skip graphic data from one line to the next, thereby making the picture shrink with respect to its original size. The real hardware responds to register changes on the fly, line by line, while MAME was only updating the screen once per frame, which kills the effect as you can see in the above video recording.
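The stretching and shrinking trick can be illustrated with a toy model in Python. This is a sketch under simplifying assumptions, not the game's code: the background is reduced to a list of row numbers, each displayed line simply shows source row (line + scroll) wrapped to the picture height, and the function name `render_lines` is invented for the example.

```python
def render_lines(source_rows, scroll_per_line):
    """Return the source row shown on each displayed line.

    source_rows:     the background, one entry per row.
    scroll_per_line: value of the vertical scroll register while each
                     line is drawn (ASSUMPTION: it can change per line).
    """
    height = len(source_rows)
    return [
        source_rows[(line + scroll) % height]
        for line, scroll in enumerate(scroll_per_line)
    ]


background = list(range(8))   # rows 0..7 stand in for real graphics

# Lowering the scroll value every other line repeats rows: the image stretches.
print(render_lines(background, [0, -1, -1, -2, -2, -3, -3, -4]))
# [0, 0, 1, 1, 2, 2, 3, 3]

# Raising the scroll value on every line skips rows: the image shrinks.
print(render_lines(background, [0, 1, 2, 3]))
# [0, 2, 4, 6]

# A renderer that samples the register once per frame sees a single value,
# so the picture is merely shifted and the effect is lost.
print(render_lines(background, [-4] * 8))
# [4, 5, 6, 7, 0, 1, 2, 3]
```

The last call mimics a once-per-frame update (which single value such a renderer would pick is an assumption here): every line gets the same offset, so nothing is stretched or shrunk.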
That's the second proven fact: this raster effect is created solely by using the scrolling registers, and the interrupt pulse is of paramount importance to its timing. This explains the difference between my video and MAME, but doesn't solve my issue: the effect on my board still looks very broken, whereas I would expect some clever zooming-stretching-tearing effect.
By tracing the address decoding circuitry I managed to find the 74LS273 which responds to the address 0xF805, removed it from the board and put a socket in there. Then by tracing the data bits from the CPU to the register I mapped all eight bits to the input pins of the latch where they belonged. Then, with the latch removed, I stimulated each and every (now missing) output. By imposing a level 1 or 0 one pin at a time I could make the background shift by 1, 2, 4, 8, 16, 32, 64 and 128 pixels. Against all odds, the hardware is behaving exactly as MAME describes, scrolling the background vertically by the given amount of pixels, therefore my issue obviously was in the amount of time the CPU took from one register write to the next. Trying to figure the issue out, I also noticed that MAME was running the main processor of the game at 6MHz, while the CPU on my board runs at 8MHz. MAME also says that the video circuitry is clocked at 24.000MHz, while my board has a 24.180MHz oscillator. I patched my board to run the main CPU at 6MHz (24MHz divided by 4), which resulted in the audio CPU and the sound generator running at 3MHz instead of 4MHz because the board simply divides the main CPU clock by 2 and uses this to clock the audio area. I have shot another video with the board patched this way. The audio is way slower due to the clock frequency change.
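The clock relationships mentioned above are simple enough to check with a few lines of Python. The divide-by-4 and divide-by-2 ratios are taken from the description; the function names are hypothetical, and the assumption that the audio pitch scales linearly with the audio clock is mine.

```python
def main_clock_from_video(video_hz, divider=4):
    """Main CPU clock derived from the video oscillator (24MHz / 4 in the patch)."""
    return video_hz / divider


def audio_clock_hz(main_cpu_hz):
    """The board clocks the audio area at half the main CPU clock."""
    return main_cpu_hz / 2


def pitch_ratio(patched_main_hz, stock_main_hz):
    """Relative audio speed after the patch.

    ASSUMPTION: pitch and tempo scale linearly with the audio clock.
    """
    return audio_clock_hz(patched_main_hz) / audio_clock_hz(stock_main_hz)


print(main_clock_from_video(24_000_000))      # 6000000.0
print(audio_clock_hz(8_000_000))              # 4000000.0
print(audio_clock_hz(6_000_000))              # 3000000.0
print(pitch_ratio(6_000_000, 8_000_000))      # 0.75
```

Under that assumption the patched board plays its music at three quarters of the stock speed, which matches what can be heard in the video.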
I also tried an 8MHz CPU clock with a 24.000MHz video clock, but that didn't improve the situation over my initial setup. Video here.
No cigar. I couldn't proceed any further without knowing the exact clock frequencies I needed to use, and the only way to know them is to look at an original board which unfortunately I don't own.
I would like to thank Leonard Oliveira and Angelo Salese, who helped out without hesitation. Leonard asked his friends and found a person who owns an original Gemini Wing board and is willing to provide information about the clock frequencies of the original board, as well as a video showing what the ending sequence is meant to look like. Angelo needed no explanations; he figured everything out on his own starting from my videos, implemented raster effects in MAME's Tecmo driver, added my undumped romset to MAME and corrected the clock frequency and display timings for the game according to the ones my board is generating. Surprise surprise, with these changes implemented, MAME started behaving exactly the same as my board does. This confirmed that my hardware was actually fine, but this is as far as one can get without verified data. As soon as Leonard's friend posts back the video clock frequency and the main CPU clock frequency used on his original board, I will be able to patch mine with the right crystal and oscillator to obtain the same result as the original; provided my board doesn't hide other issues, that is. And of course this data will also end up in MAME for the good of the community.
NOTE: the changes were applied to MAME's source code just a week ago. If you want to try them out immediately, you need to get the source code from Git and recompile MAME yourself. Otherwise you need to wait until version 0.178 is out.