Klive Z80 Assembler
Language Structure

Klive Z80 Assembly Language Structure

Each line of the source code is a declaration unit and is parsed in its context. Such a source code line can be one of these constructs:

  • A Z80 instruction, which can be directly compiled to binary code (such as ld bc,#12AC)
  • A directive that is used by the compiler's preprocessor (e.g. #include, #if, etc.)
  • A pragma that emits binary output or instructs the compiler about code emission (.org, .defb, etc.)
  • A compiler statement (or shortly, a statement) that implements control flow operations for the compiler (e.g., .loop, .repeat...until, .if...elif...else...endif)
  • A comment that helps the understanding of the code.

Syntax Basics

The assembler language uses a unique way of case sensitivity. You can write reserved words (such as assembly instructions, pragmas, or directives) with lowercase or uppercase letters, but you cannot mix these cases. For example, these instructions use the proper syntax:

LD c,A
JP #12ac
ldir
djnz MyLabel

However, in these samples, character cases are mixed, and the compiler will refuse them:

Ld c,A
Jp #12ac
ldIR
djNZ MyLabel

In symbolic names (labels, identifiers, etc.), you can mix lowercase and uppercase letters. Nonetheless, the compiler applies case-insensitive comparison when matching symbolic names. So, these statement pairs are equivalent to each other:

jp MainEx
jp MAINEX

djnz mylabel
djnz MyLabel

ld hl,ErrNo
ld hl,errNo

Comments

The language supports two types of comments: end-of-line and block comments.

En-of-line comments start with a semicolon (;) or double forward slash (//). The compiler takes the rest of the line into account as the body of the comment. This sample illustrates this concept:

; This line is a comment-only line
Wait:   ld b,8     ; Set the counter
Wait1:  djnz Wait1 // wait while the counter reaches zero

Block comments can be put anywhere within an instruction line between /* and */ tokens until they do not break other tokens. Nonetheless, block comments cannot span multiple lines; they must start and end within the same source code line. All of the block comments in this code snippet are correct:


SetAttr:
	ld b,32
fill:
  /* block */
  /* b2 */ ld (hl),a
  inc /* b3 */ hl
  djnz /* b4 */ fill /* b5 */
  ret

However, this will result in a syntax error:

/* 
  This block comment spans multiple lines,
  and thus, it is invalid
*/
SetAttr:
	ld b,32

Note: If you need multi-line comments, you can add single-line comments after each other. The Z80 assembly does not have separate multi-line comment syntax.

Literals

The language syntax provides these types of literals:

  • Boolean values. The following tokens represent Booleans: .false, false, .true, and true.
  • Decimal numbers. You can use up to 5 digits (0..9) to declare a decimal number. For example: 16, 32768, 2354.
  • Floating point numbers. You can use the same notation for floating point numbers as in C/C++/Java/C#. Here are a few samples:
.25
123.456
12.45E34
12.45e-12
3e+4
  • Hexadecimal numbers. You can use up to 4 hexadecimal digits (0..9, a..f or A..F) to declare a hexadecimal literal. The compiler looks for a #, 0x, or $ prefix or one of the h or H suffixes to recognize them as hexadecimal. If you use the h or H suffixes, the hexadecimal number should start with a decimal digit 0...9; otherwise, the assembler interprets it as an identifier (label). Here are a few samples:
#12AC
0x12ac
$12Ac
12ACh
12acH
0AC34H
  • Binary numbers. Literals starting with one of the %, or 0b prefixes (or with the b or B suffix) are considered binary literals. You can follow the prefix with up to 16 0 or 1 digits. To make them more readable, you can separate adjacent digits with the underscore (_) or single quote (') character. These are all valid binary literals:
%01011111
0b01011111
0b_0101_1111
0101_1111b
0b'0101'1111
  • Octal numbers. You can use up to 6 digits (0..7) with an o, O (letter O), q, or Q suffix to declare an octal number. Examples: 16o, 327q, 2354Q.

Note: You can use negative numbers with the minus sign in front of them. The sign is not part of the numeric literal; it is an operator.

  • Characters. You can put a character between single quotes (for example: 'Q').
  • Strings. You can put a series of characters between double quotes (for example: "Sinclair").

Note: You can use escape sequences to define non-visible or control characters, as you will learn soon.

  • The $, * or . tokens. These literals are equivalent; all represent the current assembly address.

Identifiers

You can use identifiers to refer to labels and other constants. Identifiers must start with a letter (a...z or A...Z) or with one of these characters: ` (backtick), _ (underscore), @, !, ?, or #. The subsequent ones can be digits and any start characters except backtick. Here are a few examples:

MyCycle
ERR_NO
Cycle_4_Wait
`MyTemp
@ModLocal
IsLastLine?

Note: Some strings can be identifiers or hexadecimal literals with the H or h suffix, like AC0Fh, or FADH. The assembler considers such strings as identifiers. To sign a hexadecimal literal, use a 0 prefix: 0FADH is a hexadecimal literal, while FADH is an identifier.

Note: Theoretically, you can use arbitrary long identifiers. I suggest you make them no longer than 32 characters so readers can read your code easily.

Scoped Identifiers

As you will later learn, the Klive Assembler supports modules like namespaces in other languages (Java, C#, C++, etc.) to encapsulate labels and symbols. To access symbols within modules, you can use scoped identifiers with this syntax:

::? identifier (. identifier)*

The optional :: token means the name should start in the outermost (global) scope. The module and identifier segments are separated with a dot. Examples:

::FirstLevelModule.Routine1
NestedModule.ClearScreen
FirstLevelModule.NestedModule.ClearScreen

Characters and Strings

You have already learned that you can utilize character and string literals (wrapped into single or double quotes, respectively), such as in these samples:

"This is a string. The next sample is a single character:"
'c'

ZX Spectrum has a character set with special control characters such as AT, INK, PAPER, etc. The Assembler allows you to define these with special escape sequences:

EscapeCodeCharacter
\i0x10INK
\p0x11PAPER
\f0x12FLASH
\b0x13BRIGHT
\I0x14INVERSE
\o0x15OVER
\a0x16AT
\t0x17TAB
\P0x60pound sign
\C0x7Fcopyright sign
\\0x5Cbackslash
\'0x27single quote
\"0x22double quote
\00x00binary zero

Note: Some of these sequences have different values than their corresponding pairs in other languages, such as C, C++, C#, or Java.

To declare a character by its binary code, you can use the \xH or
\xHH sequences (H is a hexadecimal digit). For example, these escape sequence pairs are equivalent:

"\i"
"\x10"

"\C by me"
"\x7f \x62y me"

Labels and Symbols

In Klive Z80 Assembly, you can define labels and symbols. Both constructs are syntactically the same, but there is some difference in their semantics. While we define labels to mark addresses (code points) in the program so that we can jump to those addresses and read or write their contents, symbols are not as specific; they just store values we intend to use.

From now on, I will mention "label" for both constructs and do otherwise only when the context requires it.

When you write a Klive Assembly instruction, you can start the line with a label:

MyStart: ld hl,0

Here, in this sample, MyStart is a label. The assembler allows you to omit the colon after the label name, so this line is valid:

MyStart ld hl,0

Some developers like to put a label in a separate line from the instruction to which it belongs. You can use the same hanging label style within Klive. In this case, the label should go before its instruction. Take a look at this code snippet:

MyStart:
  ld hl,0
MyNext
  ; Use B as a counter
  ld b,32

This code is entirely correct. Note the ld b,32 instruction belongs to the MyNext label. As you see from the sample, the colon character is optional for hanging labels, too. You can have multiple line breaks between a label and its instruction, and the space can include comments.

Label and Symbol Declarations

As you will learn later, you can define symbols with the .EQU or .VAR pragmas. While .EQU allows you to assign a constant value to a symbol, it cannot change its value after the declaration. .VAR lets you re-assign the initial value.

Klive supports the idea of lexical scopes. When you create the program, it starts with a global (outermost) lexical scope. Particular language elements, such a statements create their nested lexical scope. Labels and symbols are always created within the current lexical scope. Nonetheless, when resolving them, the assembler starts with the innermost scope and goes through all outer scopes until it finds the label declaration.

This mechanism means that you can declare labels within a nested scope so that those hide labels and symbols in outer scopes.

Klive also supports modules, which allow you to use namespace-like constructs.

Temporary Labels

The assembler considers labels that start with a backtick (`) character as temporary labels. Their scope is the area between the last persistent label preceding the temporary one and the first persistent label following the temporary one.

This code snippet demonstrates this concept:

SetPixels:        ; Persistent label
  ld hl, #4000
  ld a,#AA
  ld b,#20
`loop:            ; Temporary label (scope #1)
  ld (hl),a
  inc hl
  djnz `loop
SetAttr:          ; Persistent label, scope #1 disposed here
  ld hl,#5800
  ld a,#32
  ld b,#20
`loop:            ; Temporary label (scope #2)
  ld (hl),a
  inc hl
  djnz `loop
  ret

; scope #2 still lives here
; ...
Another: ; Persistent label, scope #2 disposed here
  ld a,b

As you see, the two occurrences of `loop belong to two separate temporary scopes. The first scope is the one between SetPixels and SetAttr, the second one between SetAttr and Another.