Assembler
We'll the high-level language Python to describe the assembler, and Hack assembly language for the example programs. The workflow for now will then be to write a code in Hack, convert it into binary code using our Assembler written in Python, and execute that binary code with our computer.
Writing an Assembler in Python
The assembler I lay out here does two passes. First, we'll record label declarations, then we'll generate binary code, handling both A-instructions and C-instructions. I provide the code for the assembler in the code repository on GitHub. But I encourage you to look at these functions and try to fill it in yourself. It's a simple idea but can be a bit tricky to not make a mistake. In more detail:
- Initializes symbol tables and instruction code tables.
- Removes comments and whitespace from the input.
- Performs a first pass to record label declarations.
- Performs a second pass to generate binary code:
- Handles A-instructions (starting with @)
- Handles C-instructions (dest=comp;jump)
- Writes the resulting binary code to a
.hack
file.
import re
import sys
class Assembler:
def __init__(self):
self.symbol_table = {
'SP': 0, 'LCL': 1, 'ARG': 2, 'THIS': 3, 'THAT': 4,
'R0': 0, 'R1': 1, 'R2': 2, 'R3': 3, 'R4': 4, 'R5': 5, 'R6': 6, 'R7': 7,
'R8': 8, 'R9': 9, 'R10': 10, 'R11': 11, 'R12': 12, 'R13': 13, 'R14': 14, 'R15': 15,
'SCREEN': 16384, 'KBD': 24576
}
self.next_variable_address = 16
self.comp_table = {
'0': '0101010', '1': '0111111', '-1': '0111010', 'D': '0001100',
'A': '0110000', '!D': '0001101', '!A': '0110001', '-D': '0001111',
'-A': '0110011', 'D+1': '0011111', 'A+1': '0110111', 'D-1': '0001110',
'A-1': '0110010', 'D+A': '0000010', 'D-A': '0010011', 'A-D': '0000111',
'D&A': '0000000', 'D|A': '0010101',
'M': '1110000', '!M': '1110001', '-M': '1110011', 'M+1': '1110111',
'M-1': '1110010', 'D+M': '1000010', 'D-M': '1010011', 'M-D': '1000111',
'D&M': '1000000', 'D|M': '1010101'
}
self.dest_table = {
'': '000', 'M': '001', 'D': '010', 'MD': '011',
'A': '100', 'AM': '101', 'AD': '110', 'AMD': '111'
}
self.jump_table = {
'': '000', 'JGT': '001', 'JEQ': '010', 'JGE': '011',
'JLT': '100', 'JNE': '101', 'JLE': '110', 'JMP': '111'
}
def remove_comments_and_whitespace(self, line):
pass
def first_pass(self, assembly_code):
pass
def get_address(self, symbol):
pass
def assemble_a_instruction(self, instruction):
pass
def assemble_c_instruction(self, instruction):
pass
def second_pass(self, assembly_code):
pass
def assemble(self, filename):
pass
if __name__ == '__main__':
if len(sys.argv) != 2:
print("Usage: python assembler.py <input_file.asm>")
sys.exit(1)
input_file = sys.argv[1]
assembler = Assembler()
assembler.assemble(input_file)
If you use the provided assembler.py
script, you can use it from the command line, providing the input .asm
file as an argument:
> python assembler.py input_file.asm
The assembler will generate an output file with the same name but with a .hack
extension, containing the binary machine code. So let's go back to writing some assembly programs and see what we can make our computer do!
Add
This is one of our two basic arithmetic operations our ALU can do.
// Add.asm
// Computes R0 = 2 + 3
@2
D=A
@3
D=D+A
@0
M=D
Feeding this add.asm
file through our assembler, you'll see Hack binary code like add.hack
this
0000000000000010
1110110000010000
0000000000000011
1110000010010000
0000000000000000
1110001100001000
created.
In the Hack computer architecture, every line of generated binary code represents one operation or instruction. This is known as a "one instruction per line" or "single instruction per line" architecture.
Note how the assembly code and the generate binary code thus have roughly the same number of lines.
... Testing ...
Max
Compared to the add.asm
program, max.asm
finding the maximum number of two numbers is a bit longer:
// Max.asm
// Computes R2 = max(R0, R1)
@R0
D=M // D = first number
@R1
D=D-M // D = first number - second number
@OUTPUT_FIRST
D;JGT // if D>0 (first is greater) goto output_first
@R1
D=M // D = second number
@OUTPUT_D
0;JMP // goto output_d
(OUTPUT_FIRST)
@R0
D=M // D = first number
(OUTPUT_D)
@R2
M=D // M[2] = D (greatest number)
(INFINITE_LOOP)
@INFINITE_LOOP
0;JMP // infinite loop
... Testing ...
Rect
Now, the next big goal is to be able to play a game of Pong on our computer. For that, we need to draw images to the screen! The code below shows how to draw a simple rectangle.
// Rect.asm
// Draws a rectangle at the top-left corner of the screen.
// The rectangle is 16 pixels wide and R0 pixels high.
@0
D=M
@INFINITE_LOOP
D;JLE
@counter
M=D
@SCREEN
D=A
@address
M=D
(LOOP)
@address
A=M
M=-1
@address
D=M
@32
D=D+A
@address
M=D
@counter
MD=M-1
@LOOP
D;JGT
(INFINITE_LOOP)
@INFINITE_LOOP
0;JMP
... Testing ...