Archive for April, 2008
USB Action
BMOW is now USB-enabled for PC communication, for both reads and writes. Yahoo!

I wrote a simple program to log when the USB interface goes up and down, and to echo all bytes received from the PC to BMOW’s LCD. The program also adds one to each received byte before transmitting it back, so I could prove it was transmitting real data and I wasn’t seeing some local echo. The result was that whatever letter I typed on the PC, the next letter of the alphabet got sent back in reply. Here’s the Windows HyperTerminal session:

The whole exercise went very smoothly, and I didn’t have to do any debugging at all. Once I got the USBMOD4 in place, it all worked on the first attempt you see here.
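The echo-plus-one trick described above can be sketched in a few lines of Python. This is only an illustration of the byte transformation, not the actual BMOW program; the function name `echo_plus_one` and the wrap-around from 0xFF to 0x00 are assumptions.

```python
def echo_plus_one(received: bytes) -> bytes:
    """Return each received byte incremented by one.

    Wrapping modulo 256 is an assumption; it is simply what an
    8-bit add would do on overflow.
    """
    return bytes((b + 1) & 0xFF for b in received)


if __name__ == "__main__":
    # Each typed letter comes back as the next letter of the alphabet.
    print(echo_plus_one(b"HAL"))  # prints b'IBM'
```

Because the reply differs from what was typed, a local echo on the PC side can't be mistaken for a real round trip.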
The only wrinkle was with the USBMOD4 itself. I bought one months ago that was damaged in shipping: some pins were broken off. Futurlec sent me a replacement, but I didn’t open it until yesterday, only to discover it was the wrong part! After scouring the USBMOD4 data sheet, I worked out a modified solution that let me use the damaged module, which was only possible because all but one of the broken pins were ground pins. I ended up needing to desolder an SMD component on the USBMOD4 body, which was ugly, and I was also forced to use the integrated USB jack rather than the external USB jack I’d planned on. But I can live with the little flaws… and hey, now I’ve got USB!
Validation Test
Remember that validation test suite I wrote last December? Back then, I ran it using a Verilog hardware simulation of BMOW, and it exercised every variant of every machine instruction: adds, jumps, xors, loads, stores, branches taken, branches not taken, branches forward, branches backward, branches across a page boundary, and on and on in mind-numbing variation. Now I have that same test running and passing on the real BMOW hardware! OK, I had to cheat a little and disable the stack-related tests, since I still haven’t wired up the stack pointer register. But the other 95% of the tests run perfectly and the LCD says “pass”. Things are really looking good!
The stack pointer is the next obvious step. I’m a little hesitant to do the wiring, since the last two pieces of hardware that I added each made the machine stop working, and led to frustrating debug sessions. Hopefully this time will go better. Then the core “computing” part of the computer’s hardware will be done, and I can begin on I/O (keyboard, USB) and the real-time clock.
RAM Test
Egads! The RAM works. This thing is starting to look like a real computer now. I wrote an improved version of my fibonacci program that operates on 16-bit values, using 4 bytes of RAM as temporary storage. I then created a super-program by combining the 16-bit RAM-based fibonacci with my earlier 8-bit fibonacci that uses only the A, X, and Y registers, as well as the “BMOW is alive!” program. At bootup, the super-program checks the value in the A register and uses it to choose one of the three demo programs to run. Each demo program writes a new value into A before halting. Since A isn’t cleared when the machine is reset, every time I hit the reset switch it runs the next program in the demo loop. Woohoo, user interactivity! Here’s a photo after running all three:

I’m using the 20×4 LCD now instead of the 16×2 one from the earlier photos. The fibonacci results are written in hexadecimal, since it was easiest for me to generate. The text is a little hard to interpret, because the LCD is mapped as two logical lines, where the first logical line covers the first and third physical lines of the LCD, and the second logical line covers the second and fourth physical lines. It reads “BMOW is alive! fib(13)=0xE9 fib(21)=0x2AC2 BMOW is alive!”
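To make the line mapping concrete, here is a rough Python sketch of how two logical lines land on four physical rows. The 40-character logical line length is an assumption (twice the 20-column width), and `physical_rows` is a made-up helper name, not anything from the BMOW code.

```python
from typing import List

COLS = 20  # physical columns on the 20x4 LCD


def physical_rows(logical1: str, logical2: str) -> List[str]:
    """Split two logical lines into the four physical rows of the LCD.

    Logical line 1 fills physical rows 1 and 3; logical line 2 fills
    physical rows 2 and 4. Lines are padded or cut to 2 * COLS characters
    (an assumed logical line length).
    """
    l1 = logical1.ljust(2 * COLS)[: 2 * COLS]
    l2 = logical2.ljust(2 * COLS)[: 2 * COLS]
    return [l1[:COLS], l2[:COLS], l1[COLS:], l2[COLS:]]


if __name__ == "__main__":
    for row in physical_rows("A" * 20 + "C" * 20, "B" * 20 + "D" * 20):
        print(row)
```

Text that runs past column 20 of a logical line skips a physical row, which is why the messages in the photo appear interleaved.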
Some system facts:
- Current clock speed is 1MHz. It should be able to go faster ultimately.
- 512 KBytes of RAM, 16 KBytes of ROM.
- Power draw is 8 Watts, implying 1.6A at 5V.
- There are presently 704 connecting wires, so 1408 individual wire wraps.
It’s been a while since I posted any photos of the overall construction progress, so an update is overdue.

Here’s a close-up of the wire-wrap side of the board. Things are getting a little crazy. The photo doesn’t do justice to the fine detail of the dense wiring. Wires are stacked 10 deep in some channels! Big Mess o’ Wires indeed.

The component side is well-populated now. Eventually the rest of the bottom and right side of the board will be filled. The left side will be available for any possible future expansion, like audio or video.

Here you can see the three micro-ROMs at the top, and boot/program ROM below. That’s the newly-installed RAM immediately to the right of the boot ROM. All of the narrow chips with white labels are GALs. The rest of the unmarked chips are various 74LS series logic parts. Since there’s no fan or other moving parts to make an obvious noise, I added the yellow LED below the clock oscillator to remind me when BMOW is on.
More Hardware Woes
After my earlier hardware glitches were seemingly resolved, I added the RAM to BMOW, which is the last of the basic hardware components needed. Unfortunately, this created a huge new set of hardware problems that I can only describe as “everything’s broken.” I haven’t even attempted to use the RAM yet, but its mere presence in the system (and the wires connected to it) seems to have caused everything to go haywire, and now the old “BMOW is alive!” program no longer works at all. I haven’t had time to really nail down what’s going wrong yet, but there are so many signals that suddenly seem to intermittently get the wrong values, I feel like my whole house of cards has just collapsed.
Update: A ROM was loose. D’oh! BMOW is alive, again.
Hardware Glitches
Just after I wrote last week’s entry, the hardware went from proudly proclaiming “BMOW is alive!” to merely stating “BMOW is aliv”. I was able to track this down to a problem with the program counter chip. Often, but not always, the low byte of the PC would roll over from FF to 01 instead of 00, and continue counting from there. The lowest bit of the address, A0, seemed to get stuck at 1 during a rollover. That wasn’t too painful to diagnose, but understanding *why* it was skipping 00 and how to fix it proved to be a much bigger problem.
The PC is implemented as a GAL, so the first thing I did was try replacing it with a new GAL, on the theory that the chip was bad. No help. Then I double-checked my GAL equations, and the raw fuse map produced by the GAL assembler, looking for errors. I found none. It seemed that the problem wasn’t simply bad hardware, nor a flaw in the logic, but some kind of electrical/noise/timing problem. Exactly the sort of problem I fear the most.
I checked every pin on the chip with my oscilloscope, looking for obvious spikes, noise, or power sags, but everything looked pretty good. There was some noise on some of the data load inputs, but nothing egregious, and those inputs aren’t used when incrementing the PC anyway. The scope showed that during a rollover, when A0 should have transitioned from a high voltage to a low one (1 to 0), it would start to dip low for about 5ns, then suddenly pop back up to a high voltage.
I tried slowing the system clock all the way from 1MHz down to 250kHz without success. I modified the GAL programming to make the PC increment every clock cycle, ignoring the count enable input. Of course this made the machine totally non-functional, but the rollover bug still occurred, as demonstrated with the scope. I tried removing all the other chips connected to A0, but the problem still persisted, and now the machine was even more non-functional.
Finally I tried rewriting the GAL equations to move A0 to another output pin, and the problem followed A0 to its new pin. This seemed a key bit of evidence, suggesting that the problem was not with the pin itself, nor the wires connected to the pin, but with the logical quantity A0. That led me to ask what was different about A0 versus A1-A7, which didn’t exhibit any problems. The answer is that when counting is enabled, A0 always changes state: a 0 becomes 1, or a 1 becomes 0. The other bits only change state depending on the values of the lower bits. In short, the PC was acting exactly as if it were being clocked twice in rapid succession.
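The double-clock theory is easy to check with a toy model. The Python below is just an 8-bit counter, not the GAL logic; `clock_pc` and its `edges` parameter are names invented for this sketch.

```python
def clock_pc(pc_low: int, edges: int = 1) -> int:
    """Advance an 8-bit counter by the given number of clock edges."""
    for _ in range(edges):
        pc_low = (pc_low + 1) & 0xFF
    return pc_low


if __name__ == "__main__":
    print(hex(clock_pc(0xFF)))           # one clean edge: 0x0
    print(hex(clock_pc(0xFF, edges=2)))  # two edges in quick succession: 0x1
```

Two edges at the rollover take FF to 00 and then straight on to 01. If the 00 state lasts only a few nanoseconds, what you see is FF followed by 01, with A0 briefly dipping low in between.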
I jumped back to the oscilloscope, looking carefully at the clock input to the PC at the moment of rollover. The clock looked fine; in fact, it looked very clean. My scope only has 5ns timing resolution, though, so if there were a glitch of less than 5ns on the clock line, it might not show up. I wondered if there was a way to avoid a double-count in the case of a double-clock. I came up with the dangerous-sounding idea of including the clock itself in the product term used to compute the new A0 value at a clock edge. This certainly feels strange: by definition the A0 value will change exactly when the clock transitions from low to high, at which point the clock-as-data value will be undefined. The GAL program change also involved switching the A0 equation from negative logic to positive. Here it is (note the /clk0 term in the new equation):
| ; old equation |
| /q0 := /_reset + _reset*_cnt_in*_ld*/q0 + _reset*/_ld*/d0 + _reset*_ld*/_cnt_in*q0 |
| ; new equation |
| q0 := _reset*_cnt_in*_ld*q0 + _reset*/_ld*d0 + _reset*_ld*/_cnt_in*/q0*/clk0 |
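As a sanity check on the two equations, here is a small Python sketch that evaluates them as plain Boolean expressions. It reads `/` as NOT, `*` as AND, and `+` as OR, and treats each name as the logic level on that signal (so `_reset`, `_cnt_in`, and `_ld` are active low). The function names are made up, and modelling a spurious second edge as "the clock is already seen high" is an assumption about what the hardware does.

```python
from itertools import product


def old_q0_next(reset_n, cnt_n, ld_n, d0, q0):
    """Next q0 from the old equation, which is written for /q0."""
    not_q0 = (
        (not reset_n)
        or (reset_n and cnt_n and ld_n and not q0)
        or (reset_n and not ld_n and not d0)
        or (reset_n and ld_n and not cnt_n and q0)
    )
    return not not_q0


def new_q0_next(reset_n, cnt_n, ld_n, d0, q0, clk0):
    """Next q0 from the new equation, including the /clk0 term."""
    return bool(
        (reset_n and cnt_n and ld_n and q0)
        or (reset_n and not ld_n and d0)
        or (reset_n and ld_n and not cnt_n and not q0 and not clk0)
    )


if __name__ == "__main__":
    # With clk0 seen low, the two equations agree for every input combination.
    for bits in product([False, True], repeat=5):
        assert old_q0_next(*bits) == new_q0_next(*bits, clk0=False)

    # Counting (reset off, count on, load off) with q0 = 0 after a rollover:
    counting = dict(reset_n=True, cnt_n=False, ld_n=True, d0=False, q0=False)
    print(old_q0_next(**counting))              # True: a second edge flips A0 back to 1
    print(new_q0_next(**counting, clk0=True))   # False: A0 stays at 0
```

That matches the intent of the new equation: a second, spurious edge arriving while the clock is already high cannot toggle A0 from 0 back to 1.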
This worked. So the problem has been successfully papered over, but not really solved. There are a couple of other experiments I’d like to try, in order to better understand what’s happening:
- Try a quarter-power GAL instead of my low-power ones. Perhaps the surge in power when all the address lines simultaneously switch from 1 to 0 is causing the clock glitch I can’t see. If so, a more power-efficient GAL might help. I’ve got one on order.
- Try a 15ns GAL instead of my 25ns ones. I don’t think the propagation delay has anything to do with the problem directly, but perhaps the different internal structure of the 15ns GAL would exhibit different symptoms. I’ve ordered one of these too.
- Experiment with various methods of terminating the clock line.
Termination seems to be something of an inexact process, from what I’ve read. I tried connecting a pin somewhere midway along the clock line to ground, through a 220 Ohm resistor, and it made the clocking problems worse. I’ve seen other designs that use an in-series resistor of around 40-80 Ohms, rather than a resistor tied to ground. I’ve been unable to find much good discussion of the need and method of termination for TTL circuits running in the 1-4 MHz range, and most of what I have read talks about terminating signals on the bus or backplane, which I don’t have. If anyone reading this knows more about this and could offer some advice, I’d love to hear it.
Update: Some more details on the PC clock may be useful for termination analysis. The low byte of the PC is computed by the GAL called PCLO in the schematics. It’s using the clock line Q0B, which is output from a 74LS244. Q0B is transmitted along a chain of wires about 21 inches in total length. The ’244 that outputs the clock is at the beginning of the chain, and PCLO is about 9.5 inches down the chain from the ’244, and about 11.5 inches from the end of the chain.
The clock signal propagates past PCLO, 11.5 inches to the end of the chain, reflects off the end, and propagates 11.5 inches back. So the reflected signal must travel 23 inches, or about 0.6 meters. Assuming 5 ns per meter signal propagation in copper wire, the reflected clock signal will arrive back at PCLO in 5 * 0.6 = 3 ns after the original signal. Maybe that causes the double-clocking?
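The same back-of-the-envelope estimate, written as a short Python sketch. It uses the exact inch-to-meter conversion instead of rounding to 0.6 meters; the 5 ns per meter figure is the assumption stated above, and `reflection_delay_ns` is a name invented for this sketch.

```python
INCH_TO_M = 0.0254  # exact conversion factor


def reflection_delay_ns(stub_inches: float, ns_per_meter: float = 5.0) -> float:
    """Round-trip delay for a signal that travels to the end of the wire and back."""
    round_trip_m = 2 * stub_inches * INCH_TO_M
    return round_trip_m * ns_per_meter


if __name__ == "__main__":
    # PCLO sits 11.5 inches from the end of the clock chain.
    print(f"{reflection_delay_ns(11.5):.2f} ns")
```

This comes out at roughly 2.9 ns, in line with the 3 ns estimate and below the scope’s 5ns timing resolution.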
BMOW Is Alive!
Everything seems to be falling into place now. I think a picture says it best:

The LCD is up and running, displaying messages from the ROM program. I wasn’t lucky enough to have it work on the first try this time, though. I had to debug some accidentally swapped address lines and some uninitialized registers, but after a few hours of fiddling, I was rewarded with the greeting in the photo.
I’m starting to realize a few bad points about my physical setup. The biggest headache is getting chips in and out of the wire wrap board to reprogram them. They’re always hard to remove, and even with a chip puller, I’m afraid I’m going to damage or break a pin when the chip suddenly dislodges. I have my boot ROM in a ZIF socket, but all the microroms are in standard (non-ZIF) sockets, and the GALs aren’t socketed at all. What’s worse, the microrom sockets don’t seem to be making a consistent contact with the board. In one case, I was able to toggle the machine between working and not working just by pushing down on one of the microrom sockets a little. This is the sort of random, elusive electrical problem that worries me far more than any design problems.
Things are getting to the point where I really need a case or cabinet of some sort. I don’t necessarily want to put it in the PC case yet, since then I’ll have to be constantly removing it to work on the hardware. But the naked wirewrap board with power, reset, and LCD cables hanging off in random directions is getting pretty unwieldy. I’m thinking about constructing a temporary “development” case that’s more like a frame. It would provide something to easily grip everything by, and a place to anchor the cables and connectors, but still be totally open at the top and bottom. I’m hopeless with machine tools, but maybe I can cobble something half-respectable together with some scrap wood and screws.
First Bootup!
It works!!! Eureka! And on the very first attempt, no less. I have achieved computation from a big mess o’ wires, and a couple of dozen basic logic chips. I can now say confidently that fibonacci(12) = 144. Check out the last line of the logic analyzer data listing:

Each line shows the state for a single clock cycle, with RESET (active low, so 1 means normal operation), OPCODE (in hex), and the X register (in decimal). Opcode FF means the machine has halted. You can see the last few terms of the fibonacci sequence on the preceding lines, although they’re not in order, and there’s a random value of 110 there too.
Did I mention that it works? Holy cow. The best part was that mere moments after the successful bootup, a friend called to ask me about something else, so I talked his ear off about the machine.
Since the hardware is still far from complete, running the fibonacci program required quite a bit of chicanery. At the moment there are only two 8-bit registers, and no RAM. The T register was intended to be used for temporary storage by the microcode, and wasn’t meant to be user-visible at all, so I had to add some additional instructions to expose it temporarily. I also added an instruction to add the X and T registers. Then I had to write microcode for a conditional absolute jump instruction, since the hardware needed for a relative branch isn’t finished yet. I had to modify the absolute jump instruction to work only within the first 256 bytes of memory, to avoid disturbing the T register. And finally, since I didn’t have any place to store a running count of how many fibonacci sequence terms had been generated, I resorted to cheating: the program terminates when the sign bit of the X register (bit 7) is 1. So it’s not really computing fibonacci(12), but rather the first fibonacci number >= 128, which happens to be fibonacci(12).
Here is BMOW’s first program:
| * = $0 |
| nop ; let’s hope we can execute a no-op, at least |
| ; load X and T with the first two terms of the fibonacci sequence |
| ldx #1 |
| sxt ; swap X and T, uses XOR swap since there’s no other temporary register! |
| ldx #0 |
| loop: |
| clc ; clear the carry flag |
| axt ; add x + t |
| jmi done ; if the result is “minus” (sign bit is 1), exit the loop |
| sxt ; swap X and T |
| jpl loop ; jump back to the start of the loop |
| done: halt |
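For anyone who wants to trace the program by hand, here is a rough Python model of it. It is a sketch under assumptions, not a BMOW emulator: `axt` is assumed to leave the sum in X, registers are assumed to wrap at 8 bits, T is assumed to start at 0 (its real power-on value is unknown and doesn’t matter, since `ldx #0` overwrites it after the swap), and `jpl` is assumed to be taken whenever it is reached.

```python
def run_first_program():
    """Model the program above; return the final X and every sum produced by axt."""
    x, t = 1, 0              # ldx #1 (T's starting value is an assumption)
    x, t = t, x              # sxt: swap X and T, so T = 1
    x = 0                    # ldx #0
    sums = []
    while True:
        x = (x + t) & 0xFF   # clc ; axt (sum assumed to land in X)
        sums.append(x)
        if x & 0x80:         # jmi done: exit when the sign bit (bit 7) is set
            break
        x, t = t, x          # sxt
        # jpl loop: assumed taken, since the last sum was positive
    return x, sums


if __name__ == "__main__":
    final_x, sums = run_first_program()
    print(final_x)  # 144
    print(sums)     # [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
```

The loop exits on the first sum with bit 7 set, which is 144, matching the last line of the logic analyzer listing.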
Honestly, I’m fairly amazed that it worked on the first try. Yes, I’d been testing the subsystems as much as I could as I built them, but this was the first real integration test. What’s more, it was the first test of any kind that tried to modify the program counter, or use the ALU, condition codes, data registers, databus, or memory bus to data bus interface. I fully expected to spend a long time working out all sorts of problems before getting to the first successful program run. Heck, I must have run into 10 different logic and microcode bugs while testing the fibonacci program on the simulator, and the potential for errors in the impenetrable mass of wires that composes the BMOW hardware is far greater.
Here’s a look at the testing setup for my moment of glory:

So now I’ve got a very rudimentary computer, with two 8-bit registers and no RAM, running at a blazing 470 kHz. What’s next? I’ll probably write a few more test programs to exercise the hardware in its current state, to make sure everything’s really working as it ought to. Next, I think I’ll try to tackle integrating the LCD module. Checking the progress of the computer with all those logic analyzer probes is not much fun, so it would be great to display “fib(12) = 144” instead. Once I’m able to check the machine’s health without connecting up the logic analyzer every time, I’ll probably move on to the remaining data registers, RAM, stack pointer, and other hardware devices. There’s still a tremendous amount left to do, but as of today, I can finally say I’ve built a working homebrew CPU.
Almost There
I’m getting very close to the first real BMOW bootup. I’ve got two data registers wired up, along with enough of the data bus to use them. Now I just need to double-check the current wiring, add the rest of the registers, and see what kind of test program I can devise.
My instruction set is copied from the 6502, with some minor additions and changes. In the course of thinking about a good example program for first bootup, a couple of oddities occurred to me for the first time. The biggest one is that there’s no way to add values from two registers. Instead, the ADC instruction always adds the value in the accumulator register to a value in RAM. That’s a problem for me, since I haven’t yet wired up the RAM, but it also seems like a deeper problem. Why wouldn’t you want something like an ADX instruction, to add A + X? Wouldn’t it be faster than a memory access, if the values you want to add are already in registers?
The second oddity is the very non-symmetric nature of the 6502 instruction set. It’s something I was certainly aware of before, but never really thought much about. With these instructions, each register has different capabilities. The results of an add are always stored in A, never X or Y. Indexed memory accesses use X or Y, not A. Only the accumulator can be bit shifted. It all seems arbitrary and awkward, although I’m sure there were good reasons for those limitations in the original 6502 hardware.
All this has got me thinking that maybe I ought to pattern the BMOW instruction set after the 68000, or MIPS, or something else that’s a little more rational. There’s nothing specifically tying BMOW to the 6502, and with a reprogramming of the microcode, I can implement any instruction set I want, as long as it can be realized on the BMOW hardware. For the time being I plan to stick with the current instruction set, though, since I’ve already written most of the microcode, and have a working assembler too. Writing a new assembler is something I particularly don’t relish.
Musings on instruction sets aside, the BMOW construction and wiring is proceeding smoothly, and I probably have more than 50% of it finished now. A lot of things like devices and the stack pointer aren’t strictly necessary in order to run simple programs, so I hope to have some good news about my first bootup with a real program very soon.
Straight Line Code
Believe it or not, I think this pig may actually fly! Things are starting to get interesting. After several more days of wiring, I’ve reached the point where I can execute straight line code (no branches or jumps), with no RAM, and no registers. If you think about it for a moment, you’ll realize that given those restrictions, you can’t really do anything with the computer at all. There’s only one piece of state (the current program address), and there’s no way to change that state other than by sequentially executing instructions. It may even be a stretch to say that it’s “executing” instructions, when they don’t change any state. All the instructions might as well be NOPs.
Despite the outwardly boring appearance, I’m actually very happy with this result. The ability to execute straight line code means:
- The program counter works, because it steps through the program instructions correctly.
- The address bus is wired correctly.
- The address decoder works, because it enables the boot ROM for addresses that are mapped to it.
- The boot ROM is set up correctly, because program instructions are being read from it.
- The external data bus (memory bus) is wired correctly, because instructions are transferred on it to the opcode register.
- The microcode and control system work (although I knew this already from earlier tests).
I didn’t exactly stress test it, but everything seemed very solid and reliable during my experiments, with no weird glitchy behavior at all. The couple of signals I examined with the oscilloscope looked pretty clean.
From here, it’s only a couple more steps before I have something interesting working. I think it’s time to start thinking about my minimal definition of a computer, so I can pinpoint the date of the first successful boot. If I had conditional branching, and two registers, that would be enough to write a simple program to compute a factorial or some such. The result would need to be read out with the logic analyzer, since there’s still no human-readable output, but it would be good enough for me to declare BMOW officially up and running. If all goes well, I should be less than a week away from that goal!

Mmmm, 24 bits of delicious address bus.

I also managed to cram in a ZIF socket for the boot/program ROM, and a shiny red reset button.