Darren Kulp edited this page Apr 26, 2012 · 41 revisions

The tenyr assembly language is case-insensitive. It is also algebraic rather than mnemonic : unlike most common assembly languages, it does not use keywords like mov or xor to signal operations to the assembler. An example will clarify :

_start:
    b <- 10         // set up loop constraint
    // comments can appear anywhere on a line
    d <- -5         // load multiplier
top:
    b <- b - 1      // decrement loop variable
    c <- b > a      // compare b to 0
    c <- c & d
    p <- p + c + 1  // jump back
done:
    illegal

The operation "load the immediate value 10 into register B" is effected with b <- 10.

Registers

There are 16 general-purpose registers, all 32 bits wide, named A through P. Register A, when read from, is always 0 ; it can be written to, but with no effect. Register P is the program counter / instruction pointer ; it can be written to directly, but otherwise will increment by 1 with every executed instruction. Its current value is the address of the executing instruction.
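
The register rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not part of tenyr : the RegisterFile class and its method names are hypothetical, and the program counter's auto-increment is left out.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide

class RegisterFile:
    """Toy model of the 16 registers A..P (hypothetical helper, not tenyr code)."""

    def __init__(self):
        self.values = {name: 0 for name in "ABCDEFGHIJKLMNOP"}

    def read(self, name):
        name = name.upper()              # the language is case-insensitive
        return 0 if name == "A" else self.values[name]

    def write(self, name, value):
        name = name.upper()
        if name != "A":                  # writes to A are accepted but have no effect
            self.values[name] = value & MASK32

regs = RegisterFile()
regs.write("a", 123)                     # silently ignored
regs.write("b", 10)
print(regs.read("A"), regs.read("B"))    # prints: 0 10
```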

Syntax

All operations in tenyr take one of two basic forms :

  1. Z arrow X op Y + I
  2. Z arrow X op I + Y

where arrow is <- or -> ; op is one of the supported operations ; Z, X, and Y are named registers ; and I is a 12-bit sign-extended immediate value. Regardless of the op, X op _ is always evaluated before the addition of I or Y.

Dereferencing (using a value as a pointer into memory, and loading or storing through that pointer) is also possible ; however, only one side of the arrow can be dereferenced per instruction.

Any one or two of the elements on the right side of the arrow can be left out ; missing operands are replaced with A (which is always zero) and missing operations with bitwise-or. For example, using real register names this time :

 b  <-  c + d + 1   # load b with the value c+d+1
 b  <- [c + d + 1]  # load b with the value at address c+d+1
[b] <-  c + d + 1   # store to address b the value c+d+1
 b  <-  c * 3 + d   # load b with the value c*3+d
 b  <- [c * 3 + d]  # load b with the value at address c*3+d
[b] <-  c * 3 + d   # store to address b the value c*3+d

When the right-hand side is dereferenced, the arrow can also be reversed. The immediate on the right-hand side is still only 12 bits wide, because the right-hand side is technically always in the X op Y + I form :

 b  -> [c + d + 1]  # store b to address c+d+1
 b  -> [0x333]      # store b to address 0x333
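
The evaluation order and the defaults for omitted elements can be sketched in Python. This is a rough model under stated assumptions, not the assembler's implementation : form1 and form2 are hypothetical names, registers are held in a plain dict, only a handful of operations are included, and dereferencing is not modelled.

```python
import operator

MASK32 = 0xFFFFFFFF

# A few operations only; the full list appears under "Operations".
OPS = {"|": operator.or_, "&": operator.and_, "+": operator.add,
       "*": operator.mul, "^": operator.xor}

def read(regs, name):
    """Register A always reads as zero; other registers default to 0 here."""
    name = name.upper()
    return 0 if name == "A" else regs.get(name, 0)

def form1(regs, x="A", op="|", y="A", imm=0):
    """Z <- X op Y + I : the op is applied first, then the immediate is added."""
    return (OPS[op](read(regs, x), read(regs, y)) + imm) & MASK32

def form2(regs, x="A", op="|", imm=0, y="A"):
    """Z <- X op I + Y : the op is applied first, then register Y is added."""
    return (OPS[op](read(regs, x), imm) + read(regs, y)) & MASK32

regs = {"C": 4, "D": 5}
print(form1(regs, "C", "+", "D", 1))   # b <- c + d + 1  prints 10
print(form2(regs, "C", "*", 3, "D"))   # b <- c * 3 + d  prints 17
print(form1(regs, imm=10))             # b <- 10, i.e. A | A + 10, prints 10
```

The last call shows why b <- 10 fits the general pattern : with both registers defaulting to A and the operation defaulting to bitwise-or, only the immediate contributes to the result.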

Operations

Currently the following operations are supported. These operations have not yet been finalised, since some of them are not as friendly to synthesis. Operations that are guaranteed to remain appear in bold font. Keep in mind that operations take registers for both operands ; the only operations permitted with immediate values (or constant expressions composed of immediate values and references to labels) are addition and subtraction, due to the fixed instruction format.

  • X | Y : bitwise or (unsigned)
  • X & Y : bitwise and (unsigned)
  • X + Y : add (signed)
  • X * Y : multiply (signed)
  • X << Y : shift left (unsigned)
  • X <= Y : compare less than or equal (signed)
  • X == Y : compare equal (signed)
  • X ~| Y : bitwise nor (unsigned)
  • X ~& Y : bitwise nand (unsigned)
  • X ^ Y : bitwise xor (unsigned)
  • X +- Y : add two's complement (signed)
  • X ^~ Y : xor ones' complement (unsigned)
  • X >> Y : shift right logical (unsigned)
  • X > Y : compare greater than (signed)
  • X <> Y : compare not equal (signed)

When an immediate value appears as the second operand in an operation, it will be sign-extended before the operation if used in a signed operation, and zero-extended otherwise. If it appears alone on the right side of the arrow, it can be preceded by a dollar sign $ to indicate that it should not be sign-extended ; otherwise it will be sign-extended.

B <-  0xfff # B now contains 0xffffffff
C <- $0xfff # C now contains 0x00000fff
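
The difference between the two loads comes down to how the 12-bit immediate is widened to 32 bits. A small Python sketch of that arithmetic follows ; extend_immediate is a hypothetical helper name used only for illustration.

```python
def extend_immediate(imm12, signed=True):
    """Widen a 12-bit immediate to 32 bits, sign- or zero-extending it."""
    imm12 &= 0xFFF
    if signed and imm12 & 0x800:         # bit 11 set: fill the upper 20 bits with ones
        return imm12 | 0xFFFFF000
    return imm12                         # otherwise the upper bits stay zero

print(hex(extend_immediate(0xfff)))                 # 0xffffffff, like B <-  0xfff
print(hex(extend_immediate(0xfff, signed=False)))   # 0xfff,      like C <- $0xfff
```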

Labels

Labels are strings of 2 to 31 characters that are used to mark and refer to addresses in code symbolically. They must match the regular expression /[A-Z_][A-Z0-9_]{1,30}/. A label is created by suffixing it with a colon, and is referred to by prefixing it with an at-sign :

    b <- @foo
foo:
    .word 0xbeef

After the first instruction executes, register B contains the address represented by foo, which points to a 32-bit word containing the value 0x0000beef. .word is a directive.

The . character can be used as a special label reference to the address of the current instruction or directive. For example, if bar in the following example represents address 6, then the value at bar will be 11.

bar: .word . + 5
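
The label rule can be checked mechanically with the regular expression given above. The Python sketch below assumes that matching is case-insensitive, in line with the language as a whole ; is_valid_label is a hypothetical helper, not part of the tenyr tools.

```python
import re

# The pattern from this section; IGNORECASE is an assumption based on
# tenyr being case-insensitive.
LABEL_RE = re.compile(r"[A-Z_][A-Z0-9_]{1,30}", re.IGNORECASE)

def is_valid_label(name):
    """Return True if the whole string is a legal label (2 to 31 characters)."""
    return LABEL_RE.fullmatch(name) is not None

for candidate in ["foo", "_start", "x", "9lives", "a" * 32]:
    print(candidate[:12], is_valid_label(candidate))
```

Here foo and _start are accepted, while x (too short), 9lives (leading digit), and the 32-character name (too long) are rejected.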

Directives

The directives currently supported are

  • .word — This directive creates 32-bit words containing the values of the expressions following it (multiple expressions must be separated by commas).
  • .ascii — This directive packs the double-quoted string following it at 8 bits per character into 32-bit words, little end first.

.word

For example :

.word 0
.word 2 * 3, 3 * 4, 0x2
.word (@bar + 7) * @foo - .

.ascii

For example :

.ascii "hello, world"
.ascii "this" " " "is a series of " "concatenations"
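
The packing performed by .ascii can be approximated in Python. This sketch rests on two assumptions that the description above does not spell out : "little end first" is read as placing the first character in the least significant byte of each word, and a final partial word is padded with zero bytes. pack_ascii is a hypothetical name.

```python
def pack_ascii(text):
    """Pack a string into 32-bit words, 8 bits per character, little end first."""
    data = text.encode("ascii")
    words = []
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4].ljust(4, b"\0")    # assumption: zero-pad the last word
        words.append(int.from_bytes(chunk, "little"))
    return words

print([hex(w) for w in pack_ascii("hello, world")])
# prints ['0x6c6c6568', '0x77202c6f', '0x646c726f']
```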

Comments

Two kinds of comments are supported : C99-style comments starting with // and shell-style comments starting with #. Both types of comments extend only to the end of line, and as there is no line continuation character, every commented line must have its own // or # character. As long as the comment does not turn one token or lexeme into two, however, comments can come anywhere in the program text.
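
As an illustration of the rule, a line-oriented comment stripper might look like the Python sketch below. It is not the tenyr lexer : strip_comment is a hypothetical helper, and it assumes that // and # inside a double-quoted string are not comment starts and that strings contain no escaped quotes.

```python
def strip_comment(line):
    """Remove a trailing // or # comment from one line of source text."""
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_string = not in_string            # assumption: no escaped quotes
        elif not in_string and (ch == "#" or line.startswith("//", i)):
            return line[:i].rstrip()
    return line.rstrip()

print(strip_comment("    b <- 10         // set up loop constraint"))
print(strip_comment(" b  <- [c + d + 1]  # load b"))
print(strip_comment('.ascii "a#b" # trailing comment'))
```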
