Designing an ISA
19 Oct 23
I stumbled across Devine Lu Ator's uxn project. uxn is an instruction-set architecture (ISA), the lowest level at which a computer can be programmed. Devine's philosophy behind the project is that the ISA should be simple enough that an emulator can be written in a weekend. This means the specification can be included with any software written for it, so that software is safe from its target architecture being lost. Designing an ISA purely for a software implementation is an interesting take: it removes some of the limitations that hardware comes with and lets you make some unusual choices. For fun, I decided to design an ISA of my own without concern for a hardware implementation.
----------------
Outline
----------------
I've taken a few university classes on assembly and ISAs, so I'm sure several of my design decisions come from what we did in those. I started out with a memory size of 0xFFFF, an arithmetic logic unit (ALU) with 2 input registers and 1 output register, a return stack of 256 entries, a user-data stack with 256 entries, and 8 IO ports that are 2 bytes wide. I also decided to have 8 interrupt vectors: some for timers, keyboards, etc. As I wrote out the instructions and some sample assembly programs, I modified these specs.
I started the detailed design with the instructions since these are the part of the architecture that people interact with the most.
Opcode | Instruction | Operands (byte1-byte3) | Description |
LOAD | load | op1, op2 | copy op1 into op2 |
SAVE | save | op1, op2 | copy op1 into op2 |
ADD | add | | ACC = R1 + R2 |
SUB | subtract | | ACC = R1 - R2 |
MUL | multiply | | ACC = R1 * R2 |
DIV | divide | | ACC = R1 / R2 |
LSHIFT | left-shift | op | shift [op] 1 bit left |
RSHIFT | right-shift | op | shift [op] 1 bit right |
AND | AND | | ACC = R1 AND R2 |
OR | OR | | ACC = R1 OR R2 |
XOR | XOR | | ACC = R1 XOR R2 |
NAND | NAND | | ACC = R1 NAND R2 |
PUSH | push | op | increment SP, store op in [SP] |
POP | pop | op | copy [SP] into op, decrement SP |
SADD | stack add | | consume the top 2 items, push their sum |
SSUB | stack subtract | | consume the top 2 items, push their difference |
SMUL | stack multiply | | consume the top 2 items, push their product |
SDIV | stack divide | | consume the top 2 items, push their quotient |
SLSHIFT | stack left-shift | | bit shift [SP] 1 bit left |
SRSHIFT | stack right-shift | | bit shift [SP] 1 bit right |
SAND | stack AND | | consume/AND the top 2 items, push result |
SOR | stack OR | | consume/OR the top 2 items, push result |
SXOR | stack XOR | | consume/XOR the top 2 items, push result |
SNAND | stack NAND | | consume/NAND the top 2 items, push result |
READ | read | IO, op | copy IO into op |
WRITE | write | op, IO | copy op into IO |
IENABLE | interrupt-enable | op | copy op into IE |
ISETUP | interrupt-setup | op1, op2 | copy op2 into IV entries specified by op1 |
T1DIV | timer1-divider | op | copy op into TD1 |
T2DIV | timer2-divider | op | copy op into TD2 |
T3DIV | timer3-divider | op | copy op into TD3 |
HALT | system halt | | stop clock |
JMP | jump | op | copy op into PC |
JMPEQ | jump-if-equal | op | copy op into PC if ACC zero flag set |
JMPNE | jump-if-not-equal | op | copy op into PC if ACC zero flag not set |
JMPPOS | jump-if-positive | op | copy op into PC if ACC negative flag not set |
JMPNEG | jump-if-negative | op | copy op into PC if ACC negative flag set |
CALL | call | op | increment RP, copy (PC+2) into [RP], copy op into PC |
RETURN | return | | copy [RP] into PC, decrement RP |
INC | increment | op | op = op + 1 |
DEC | decrement | op | op = op - 1 |
JSEQ | jump if stack = 0 | op | copy op into PC if stack zero flag set |
JSNE | jump if stack != 0 | op | copy op into PC if stack zero flag not set |
JSPOS | jump if stack > 0 | op | copy op into PC if stack negative flag not set |
JSNEG | jump if stack < 0 | op | copy op into PC if stack negative flag set |
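To make the CALL and RETURN rows concrete, here is a minimal Python sketch of how an emulator might handle them. The names Machine, call, and ret are made up for illustration, and two details are assumptions rather than part of the spec: RP starts just below the first entry, and PC still points at the CALL instruction when (PC+2) is computed.

```python
RETURN_STACK_SIZE = 256  # from the outline: a return stack of 256 entries


class Machine:
    """Hypothetical emulator state: just the PC and the return stack."""

    def __init__(self):
        self.pc = 0
        self.rp = -1  # assumption: RP starts just below entry 0
        self.return_stack = [0] * RETURN_STACK_SIZE

    def call(self, target):
        # CALL: increment RP, copy (PC+2) into [RP], copy op into PC
        if self.rp + 1 >= RETURN_STACK_SIZE:
            raise OverflowError("return stack overflow")
        self.rp += 1
        self.return_stack[self.rp] = (self.pc + 2) & 0xFFFF
        self.pc = target & 0xFFFF

    def ret(self):
        # RETURN: copy [RP] into PC, decrement RP
        if self.rp < 0:
            raise IndexError("return stack underflow")
        self.pc = self.return_stack[self.rp]
        self.rp -= 1


m = Machine()
m.pc = 0x0010          # pretend a CALL instruction sits at 0x0010
m.call(0x0200)
pc_inside = m.pc       # execution continues at the subroutine
m.ret()
pc_after = m.pc        # back at the instruction after the CALL
```

Saving PC+2 rather than PC means RETURN lands on the instruction after the CALL, since each instruction occupies 2 addresses.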
----------------
Memory
----------------
Initially, I wanted each memory address to contain 4 bytes, a full instruction. The problem was that when using memory to store variables, I had to specify if I wanted the lower 2 bytes or the upper 2 bytes. If the variable was a pointer, I had to specify lower/upper again. Potential solutions include:
- changing the memory layout
- make variables only refer to the lower bytes
- don't allow pointers directly from memory; load them into a register, then point
I decided to make each memory entry/address 2 bytes wide. But since each instruction plus its operands is 4 bytes wide, each instruction takes up 2 memory addresses. To make this work, the PC is incremented by 2 each time. This makes programming a little harder since we're counting by 2's, but it allows for much more elegant use of pointer variables.
0x0000 | 0x0001 |
0x0002 | 0x0003 |
0x0004 | 0x0005 |
... |
You can still use odd-numbered memory for storage, but instructions will take up 2 slots. This solution allows for full self-modification. The program can write to any space in memory and replace parts of itself. I don't know what this would be useful for, but it's a thing.
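The fetch step this layout implies can be sketched in a few lines of Python. The function name fetch and the byte order inside each word are assumptions for illustration: here the first word holds the opcode in its high byte and byte1 in its low byte, and the second word holds byte2 and byte3.

```python
MEMORY_WORDS = 0x10000  # assumption: addresses 0x0000 through 0xFFFF


def fetch(memory, pc):
    """Return (opcode, byte1, byte2, byte3, next_pc) for the instruction at pc."""
    first = memory[pc]
    second = memory[(pc + 1) & 0xFFFF]
    opcode, byte1 = first >> 8, first & 0xFF    # assumed packing, high byte first
    byte2, byte3 = second >> 8, second & 0xFF
    return opcode, byte1, byte2, byte3, (pc + 2) & 0xFFFF


memory = [0] * MEMORY_WORDS   # every entry is one 2-byte word
memory[0x0000] = 0x12AB
memory[0x0001] = 0xCDEF
decoded = fetch(memory, 0x0000)

# Self-modification: overwriting a word changes what the next fetch sees.
memory[0x0001] = 0x0000
decoded_again = fetch(memory, 0x0000)
```

Because code and data share the same word-addressed memory, nothing special is needed for a program to rewrite itself; a store to an instruction's address is enough.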
----------------
Registers & Stack
----------------
In my classes and in Ben Eater's "building a breadboard computer" series, the architecture used a couple of registers: they get loaded with values, an operation is performed, and the result is stored in memory. For this machine, I decided to make a couple of memory locations take the role of registers. The register that holds the result is also a memory location. This means the result slot can be loaded directly with a value and then checked immediately with JMPEQ, JMPPOS, etc.
... | |
0xFDFC | 0xFDFD |
0xFDFE | 0xFDFF |
0xFDFC is input1, 0xFDFD is input2, and 0xFDFE is the 4-byte output, which spans 0xFDFE and 0xFDFF since each address holds 2 bytes. Having a large output will help prevent overflows.
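As a rough Python sketch of these memory-mapped registers, ADD can read its inputs straight from memory and spread the 4-byte result over the two output addresses. The function names are made up, and storing the high word at 0xFDFE and the low word at 0xFDFF is an assumption.

```python
IN1, IN2, OUT = 0xFDFC, 0xFDFD, 0xFDFE  # addresses from the text


def alu_add(memory):
    """ADD: ACC = R1 + R2, with ACC stored across two 2-byte words."""
    result = (memory[IN1] + memory[IN2]) & 0xFFFFFFFF
    memory[OUT] = result >> 16          # assumption: high word first
    memory[OUT + 1] = result & 0xFFFF   # low word


def acc_is_zero(memory):
    """What JMPEQ would test: is the whole 4-byte output zero?"""
    return memory[OUT] == 0 and memory[OUT + 1] == 0


memory = [0] * 0x10000
memory[IN1] = 0xFFFF
memory[IN2] = 0x0001
alu_add(memory)
high, low = memory[OUT], memory[OUT + 1]   # 0x0001 and 0x0000: the carry is kept
zero_after_add = acc_is_zero(memory)

# The output is ordinary memory, so a program can load it directly
# and then branch on it without doing any arithmetic first.
memory[OUT] = 0
memory[OUT + 1] = 0
zero_after_load = acc_is_zero(memory)
```

The sum 0xFFFF + 0x0001 would not fit in 2 bytes, but the wide output keeps the carry in the upper word.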
In addition to the register-oriented operations I'm used to, I decided to devote some instructions to stack operations, including all the math and bitwise operators that the "registers" get. This gives flexibility to the programmer at the cost of some implementation complexity. I may also add some instructions for uniquely stack-focused operations, like swapping the top two stack items, swapping the 1st and 3rd items, etc.
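The stack instructions could be modeled roughly like this in Python. The class and method names are made up, and several details are assumptions: entries are 2 bytes wide, SP starts just below the first entry, SSUB computes second-from-top minus top, and swap is one of the possible extra instructions rather than something in the table.

```python
class DataStack:
    """Hypothetical model of the 256-entry user-data stack."""

    def __init__(self, size=256):
        self.cells = [0] * size
        self.sp = -1  # assumption: SP starts just below entry 0

    def push(self, value):
        # PUSH: increment SP, store op in [SP]
        if self.sp + 1 >= len(self.cells):
            raise OverflowError("data stack overflow")
        self.sp += 1
        self.cells[self.sp] = value & 0xFFFF  # assumption: 2-byte entries

    def pop(self):
        # POP: copy [SP] out, decrement SP
        if self.sp < 0:
            raise IndexError("data stack underflow")
        value = self.cells[self.sp]
        self.sp -= 1
        return value

    def sadd(self):
        # SADD: consume the top 2 items, push their sum
        b, a = self.pop(), self.pop()
        self.push(a + b)

    def ssub(self):
        # SSUB: consume the top 2 items, push their difference
        b, a = self.pop(), self.pop()
        self.push(a - b)  # assumption: second-from-top minus top

    def swap(self):
        # Possible extra instruction: swap the top two items
        b, a = self.pop(), self.pop()
        self.push(b)
        self.push(a)


s = DataStack()
s.push(7)
s.push(5)
s.ssub()            # stack now holds 2
s.push(3)
s.sadd()            # stack now holds 5
result = s.pop()

t = DataStack()
t.push(1)
t.push(2)
t.swap()
swapped = (t.pop(), t.pop())
```

Building every arithmetic instruction from pop and push keeps the underflow and overflow checks in one place.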
There are some traditional registers for system values like the Program Counter, stack pointers, etc:
Register | Bytes | Description |
PC | 2 | program counter |
RP | 2 | return stack pointer |
SP | 2 | data stack pointer |
IE | 1 | interrupt enable |
IV | 16 | interrupt vectors - 2 bytes each, holding the address each interrupt should jump to |
AF | 1 | accumulator status flags (zero, negative, overflow, etc.) |
SF | 1 | stack status flags |
----------------
Addressing Modes
----------------
Two bits of the opcode are reserved for specifying the addressing mode. I defined 3 modes: literal, direct, and indirect. Literal is where the operand is included in the instruction, direct is where a memory location is specified, and indirect is where a memory location is specified and that location contains another memory address. Indirect addressing is like using pointers in C. It is useful for looping over memory, e.g. when a section of memory contains a string of characters and the pointer/index is incremented on each pass through the loop.
Symbol | Example | Description |
none | 0xFF | literal |
& | &0FFF | direct (hex only) |
> | >0FFF | indirect/pointer (hex only) |
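The three modes amount to zero, one, or two memory lookups. A small Python sketch follows; the function name resolve and the numeric mode values are made up for illustration, since the actual 2-bit encoding isn't fixed here. The example also assumes one character per memory entry and a zero terminator.

```python
LITERAL, DIRECT, INDIRECT = 0, 1, 2  # hypothetical values for the 2 mode bits


def resolve(memory, mode, operand):
    """Return the value an operand refers to under the given addressing mode."""
    if mode == LITERAL:
        return operand                   # the operand is the value
    if mode == DIRECT:
        return memory[operand]           # the operand is an address
    if mode == INDIRECT:
        return memory[memory[operand]]   # the operand is the address of a pointer
    raise ValueError(f"unknown addressing mode {mode}")


memory = [0] * 0x10000
for offset, ch in enumerate("Hi!"):
    memory[0xF000 + offset] = ord(ch)    # string data, zero-terminated
memory[0xFD00] = 0xF000                  # pointer variable holding the string address

chars = []
while resolve(memory, INDIRECT, 0xFD00) != 0:
    chars.append(chr(resolve(memory, INDIRECT, 0xFD00)))
    memory[0xFD00] += 1                  # advance the pointer, like inc on a variable
text = "".join(chars)
```

The loop never touches the string's address directly; it only increments the pointer stored at 0xFD00, which is the pattern indirect addressing is meant for.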
-----------------------------
Assembler Directives & Syntax
-----------------------------
Along with the instructions, there need to be a few additional features to make the language complete. The language needs to support assembler directives that the CPU won't execute, but will do things like pre-populate memory with certain values, replace variable names with their actual memory address, etc. These are outlined here:
Symbol | Example | Description |
@ | @label | marks a place in the code that can be jumped to |
# | #name 0x0FFF | any occurrences of "name" in the code will be replaced with the address |
$ | $address value | starting at the address, memory is populated with "value". Useful for defining strings |
// | // this is a comment | ignored by assembler |
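A tiny two-pass assembler front end gives a feel for how these directives might be handled. This Python sketch is only illustrative: the function names are made up, code is assumed to start at address 0, labels are assumed to be referenced by their bare name, and the $ directive is recognized but skipped.

```python
def first_pass(lines):
    """Strip comments and collect symbols; return (symbols, instructions)."""
    symbols = {}
    instructions = []
    address = 0  # assumption: code is assembled starting at address 0
    for raw in lines:
        line = raw.split("//", 1)[0].strip()   # drop comments
        if not line:
            continue
        if line.startswith("#"):               # #name address
            name, value = line[1:].split()
            symbols[name] = int(value, 16)
        elif line.startswith("@"):             # @label marks the current address
            symbols[line[1:]] = address
        elif line.startswith("$"):
            continue                           # memory pre-population not modeled here
        else:
            instructions.append((address, line))
            address += 2                       # each instruction fills 2 addresses
    return symbols, instructions


def substitute(symbols, instructions):
    """Replace symbol names with 4-digit hex addresses, keeping & and > prefixes."""
    resolved = []
    for address, line in instructions:
        words = []
        for word in line.split():
            prefix = word[0] if word[0] in "&>" else ""
            name = word[len(prefix):]
            if name in symbols:
                word = f"{prefix}{symbols[name]:04X}"
            words.append(word)
        resolved.append((address, " ".join(words)))
    return resolved


source = [
    "#index FD00        // declare variable",
    "@loop",
    "inc index          // bump the pointer",
    "write >index 0x00",
    "jmp loop",
]
symbols, instructions = first_pass(source)
program = substitute(symbols, instructions)
```

Collecting all symbols before substituting means a jump can refer to a label defined later in the file.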
----------------
Example Programs
----------------
Hello World:
#string F000            // declare variable
$string "Hello World!"  // populate memory with string
#index FD00             // declare variable
push string             // push address onto stack
pop index               // pop address into pointer variable
@loop                   // start of loop
write >index 0x00       // write contents of [index] to cout
inc index               // increment index
load >index 0xFE        // load contents of [index] to accumulator
jmpne /loop             // if accumulator != 0, jump to loop start