Klive Z80 Assembly Language Structure
Each line of the source code is a declaration unit and is parsed in its context. Such a source code line can be one of these constructs:
- A Z80 instruction, which can be directly compiled to binary code (such as
ld bc,#12AC
) - A directive that is used by the compiler's preprocessor (e.g.
#include
,#if
, etc.) - A pragma that emits binary output or instructs the compiler about code emission (
.org
,.defb
, etc.) - A compiler statement (or shortly, a statement) that implements control flow operations for the compiler (e.g.,
.loop
,.repeat
...until
,.if
...elif
...else
...endif
) - A comment that helps the understanding of the code.
Syntax Basics
The assembler language uses a unique way of case sensitivity. You can write reserved words (such as assembly instructions, pragmas, or directives) with lowercase or uppercase letters, but you cannot mix these cases. For example, these instructions use the proper syntax:
LD c,A
JP #12ac
ldir
djnz MyLabel
However, in these samples, character cases are mixed, and the compiler will refuse them:
Ld c,A
Jp #12ac
ldIR
djNZ MyLabel
In symbolic names (labels, identifiers, etc.), you can mix lowercase and uppercase letters. Nonetheless, the compiler applies case-insensitive comparison when matching symbolic names. So, these statement pairs are equivalent to each other:
jp MainEx
jp MAINEX
djnz mylabel
djnz MyLabel
ld hl,ErrNo
ld hl,errNo
Comments
The language supports two types of comments: end-of-line and block comments.
En-of-line comments start with a semicolon (;
) or double forward slash (//
). The compiler takes the rest of the line into account as the body of the comment. This sample illustrates this concept:
; This line is a comment-only line
Wait: ld b,8 ; Set the counter
Wait1: djnz Wait1 // wait while the counter reaches zero
Block comments can be put anywhere within an instruction line between /*
and */
tokens until they do not break other tokens. Nonetheless, block comments cannot span multiple lines; they must start and end within the same source code line. All of the block comments in this code snippet are correct:
SetAttr:
ld b,32
fill:
/* block */
/* b2 */ ld (hl),a
inc /* b3 */ hl
djnz /* b4 */ fill /* b5 */
ret
However, this will result in a syntax error:
/*
This block comment spans multiple lines,
and thus, it is invalid
*/
SetAttr:
ld b,32
Note: If you need multi-line comments, you can add single-line comments after each other. The Z80 assembly does not have separate multi-line comment syntax.
Literals
The language syntax provides these types of literals:
- Boolean values. The following tokens represent Booleans:
.false
,false
,.true
, andtrue
. - Decimal numbers. You can use up to 5 digits (0..9) to declare a decimal number. For example:
16
,32768
,2354
. - Floating point numbers. You can use the same notation for floating point numbers as in C/C++/Java/C#. Here are a few samples:
.25
123.456
12.45E34
12.45e-12
3e+4
- Hexadecimal numbers. You can use up to 4 hexadecimal digits (0..9, a..f or A..F) to declare a hexadecimal literal. The compiler looks for a
#
,0x
, or$
prefix or one of theh
orH
suffixes to recognize them as hexadecimal. If you use theh
orH
suffixes, the hexadecimal number should start with a decimal digit0
...9
; otherwise, the assembler interprets it as an identifier (label). Here are a few samples:
#12AC
0x12ac
$12Ac
12ACh
12acH
0AC34H
- Binary numbers. Literals starting with one of the
%
, or0b
prefixes (or with theb
orB
suffix) are considered binary literals. You can follow the prefix with up to 160
or1
digits. To make them more readable, you can separate adjacent digits with the underscore (_
) or single quote ('
) character. These are all valid binary literals:
%01011111
0b01011111
0b_0101_1111
0101_1111b
0b'0101'1111
- Octal numbers. You can use up to 6 digits (0..7) with an
o
,O
(letter O),q
, orQ
suffix to declare an octal number. Examples:16o
,327q
,2354Q
.
Note: You can use negative numbers with the minus sign in front of them. The sign is not part of the numeric literal; it is an operator.
- Characters. You can put a character between single quotes (for example:
'Q'
). - Strings. You can put a series of characters between double quotes (for example:
"Sinclair"
).
Note: You can use escape sequences to define non-visible or control characters, as you will learn soon.
- The
$
,*
or.
tokens. These literals are equivalent; all represent the current assembly address.
Identifiers
You can use identifiers to refer to labels and other constants. Identifiers must start with a letter (a...z or A...Z) or with one of these characters: `
(backtick), _
(underscore), @
, !
, ?
, or #
. The subsequent ones can be digits and any start characters except backtick. Here are a few examples:
MyCycle
ERR_NO
Cycle_4_Wait
`MyTemp
@ModLocal
IsLastLine?
Note: Some strings can be identifiers or hexadecimal literals with the
H
orh
suffix, likeAC0Fh
, orFADH
. The assembler considers such strings as identifiers. To sign a hexadecimal literal, use a0
prefix:0FADH
is a hexadecimal literal, whileFADH
is an identifier.
Note: Theoretically, you can use arbitrary long identifiers. I suggest you make them no longer than 32 characters so readers can read your code easily.
Scoped Identifiers
As you will later learn, the Klive Assembler supports modules like namespaces in other languages (Java, C#, C++, etc.) to encapsulate labels and symbols. To access symbols within modules, you can use scoped identifiers with this syntax:
::
? identifier (.
identifier)*
The optional ::
token means the name should start in the outermost (global) scope. The module and identifier segments are separated with a dot. Examples:
::FirstLevelModule.Routine1
NestedModule.ClearScreen
FirstLevelModule.NestedModule.ClearScreen
Characters and Strings
You have already learned that you can utilize character and string literals (wrapped into single or double quotes, respectively), such as in these samples:
"This is a string. The next sample is a single character:"
'c'
ZX Spectrum has a character set with special control characters such as AT, INK, PAPER, etc. The Assembler allows you to define these with special escape sequences:
Escape | Code | Character |
---|---|---|
\i | 0x10 | INK |
\p | 0x11 | PAPER |
\f | 0x12 | FLASH |
\b | 0x13 | BRIGHT |
\I | 0x14 | INVERSE |
\o | 0x15 | OVER |
\a | 0x16 | AT |
\t | 0x17 | TAB |
\P | 0x60 | pound sign |
\C | 0x7F | copyright sign |
\\ | 0x5C | backslash |
\' | 0x27 | single quote |
\" | 0x22 | double quote |
\0 | 0x00 | binary zero |
Note: Some of these sequences have different values than their corresponding pairs in other languages, such as C, C++, C#, or Java.
To declare a character by its binary code, you can use the \xH
or
\xHH
sequences (H
is a hexadecimal digit). For example, these
escape sequence pairs are equivalent:
"\i"
"\x10"
"\C by me"
"\x7f \x62y me"
Labels and Symbols
In Klive Z80 Assembly, you can define labels and symbols. Both constructs are syntactically the same, but there is some difference in their semantics. While we define labels to mark addresses (code points) in the program so that we can jump to those addresses and read or write their contents, symbols are not as specific; they just store values we intend to use.
From now on, I will mention "label" for both constructs and do otherwise only when the context requires it.
When you write a Klive Assembly instruction, you can start the line with a label:
MyStart: ld hl,0
Here, in this sample, MyStart
is a label. The assembler allows you to omit the colon after the label name, so this line is valid:
MyStart ld hl,0
Some developers like to put a label in a separate line from the instruction to which it belongs. You can use the same hanging label style within Klive. In this case, the label should go before its instruction. Take a look at this code snippet:
MyStart:
ld hl,0
MyNext
; Use B as a counter
ld b,32
This code is entirely correct. Note the ld b,32
instruction belongs to the MyNext
label. As you see from the sample, the colon character is optional for hanging labels, too. You can have multiple line breaks between a label and its instruction, and the space can include comments.
Label and Symbol Declarations
As you will learn later, you can define symbols with the .EQU
or .VAR
pragmas. While .EQU
allows you to assign a constant value to a symbol, it cannot change its value after the declaration. .VAR
lets you re-assign the initial value.
Klive supports the idea of lexical scopes. When you create the program, it starts with a global (outermost) lexical scope. Particular language elements, such a statements create their nested lexical scope. Labels and symbols are always created within the current lexical scope. Nonetheless, when resolving them, the assembler starts with the innermost scope and goes through all outer scopes until it finds the label declaration.
This mechanism means that you can declare labels within a nested scope so that those hide labels and symbols in outer scopes.
Klive also supports modules, which allow you to use namespace-like constructs.
Temporary Labels
The assembler considers labels that start with a backtick (`) character as temporary labels. Their scope is the area between the last persistent label preceding the temporary one and the first persistent label following the temporary one.
This code snippet demonstrates this concept:
SetPixels: ; Persistent label
ld hl, #4000
ld a,#AA
ld b,#20
`loop: ; Temporary label (scope #1)
ld (hl),a
inc hl
djnz `loop
SetAttr: ; Persistent label, scope #1 disposed here
ld hl,#5800
ld a,#32
ld b,#20
`loop: ; Temporary label (scope #2)
ld (hl),a
inc hl
djnz `loop
ret
; scope #2 still lives here
; ...
Another: ; Persistent label, scope #2 disposed here
ld a,b
As you see, the two occurrences of `loop
belong to two separate temporary scopes. The first scope is the one between SetPixels
and SetAttr
, the second one between SetAttr
and Another
.