A minimalistic 64-bit virtual machine inspired by Forth and Factor. The BlueVM aims to:
- Have a reasonably small, simplistic and hackable codebase
- Support execution of any previously compiled machine code or bytecode
- Provide a minimal set of opcodes that can be extended by the host
- Serve as the basis for minimalistic handcrafted applications
- Allow the host full control
- Bring some fun back into the world
By convention, BlueVM bytecode files have a bs0 (BlueVM Stage 0) extension.
To build the BlueVM, tools, and examples, run ./build.sh
The BlueVM has the following layout in rwx memory:
- Opcode Map (4096 bytes)
  - BlueVM Opcode Map: 0x00 - 0x7F (2048 bytes)
  - Extended Opcode Map: 0x80 - 0xFF (2048 bytes)
- Input buffer (2048 bytes)
- Return stack (1024 bytes)
- Data stack (1024 bytes)
- Runtime Buffer (4096 bytes)
  - BlueVM addresses (64 bytes)
  - Code Buffer (4032 bytes)
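As a quick sanity check of the sizes above, the following sketch computes the offset of each top-level region. It assumes the regions are laid out back to back, in the order listed, starting at offset 0 of the rwx block; the source does not state this explicitly, and the names `REGIONS` and `layout` are made up for illustration.

```python
# Region sizes in bytes, in the order the layout lists them.
# Assumption: regions are contiguous and start at offset 0 of the rwx block.
REGIONS = [
    ("opcode_map", 4096),      # 256 entries covering opcodes 0x00-0xFF
    ("input_buffer", 2048),
    ("return_stack", 1024),
    ("data_stack", 1024),
    ("runtime_buffer", 4096),  # 64 bytes of BlueVM addresses + 4032 bytes of code buffer
]


def layout(regions):
    """Return ({name: (offset, size)}, total_size) for back-to-back regions."""
    offsets = {}
    cursor = 0
    for name, size in regions:
        offsets[name] = (cursor, size)
        cursor += size
    return offsets, cursor


OFFSETS, TOTAL_SIZE = layout(REGIONS)

for name, (offset, size) in OFFSETS.items():
    print(f"{name:15} offset=0x{offset:04X} size={size}")
print("total bytes:", TOTAL_SIZE)
```

Under these assumptions the whole block is 12288 bytes (three 4 KiB pages), which may or may not match how the VM actually allocates it.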
BlueVM will set entries in the BlueVM opcode map and read 2048 bytes from stdin into the input buffer. This initial read will serve as the bootstrap for the host and is interpreted until the host stops execution.
BlueVM follows a simple byte-oriented execution strategy. Using the values from the BlueVM data portion of memory, it begins the outer interpreter:
1. Read a byte from the instruction pointer and increment the instruction pointer
2. Locate the opcode entry in the opcode map
3. If the code address is 0, push the opcode on the data stack and call the invalid opcode handler
4. Push the code address and flags on the data stack and call the opcode handler
5. Go to 1
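The loop above can be modelled in a few lines of Python. This is only a sketch of the control flow, not the VM's actual implementation: Python callables stand in for code addresses (`None` models address 0), the opcode numbers `0x00` for exit and `0x01` for litb are hypothetical (the opcode table lists them as XX), and skipping step 4 after an invalid opcode is an assumption.

```python
class VM:
    """Toy model of the outer interpreter; all names here are illustrative."""

    def __init__(self, bytecode):
        self.code = bytes(bytecode)
        self.ip = 0                         # instruction pointer (index into code)
        self.data = []                      # data stack
        self.entries = [(0, None)] * 256    # (flags, code); None models code address 0
        self.handler = self.interpret       # late bound: the host may replace it
        self.status = None                  # set by exit to stop the loop

    def interpret(self):
        # Default opcode handler: pop flags and code address, then call the code.
        self.data.pop()                     # flags (unused in this sketch)
        code = self.data.pop()
        code(self)

    def invalid_opcode(self):
        opcode = self.data.pop()
        raise ValueError(f"invalid opcode 0x{opcode:02X}")

    def run(self):
        while self.status is None:
            opcode = self.code[self.ip]     # read byte from the instruction pointer
            self.ip += 1                    # and increment it
            flags, code = self.entries[opcode]  # locate the opcode entry
            if code is None:                # code address is 0
                self.data.append(opcode)
                self.invalid_opcode()
                continue                    # assumption: skip the normal handler
            self.data.append(code)          # push code address and flags
            self.data.append(flags)
            self.handler()                  # call the current opcode handler
        return self.status


def op_exit(vm):
    vm.status = vm.data.pop()               # ( b -- )


def op_litb(vm):
    vm.data.append(vm.code[vm.ip])          # ( -- b ), next byte from the ip
    vm.ip += 1


vm = VM([0x01, 42, 0x00])                   # hypothetical encoding: litb 42, exit
vm.entries[0x00] = (0, op_exit)
vm.entries[0x01] = (0, op_litb)
status = vm.run()
print("exit status:", status)
```

Because `run` looks up `self.handler` and `self.entries` on every iteration, replacing either one while the loop is running takes effect on the next opcode.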
Because of this "late binding" approach, the host can change the values that the BlueVM uses to execute. The outer interpreter requires that the host terminates execution properly. One way to do this is with the exit opcode.
Each entry in the opcode map is 16 bytes:
- Flags (1 byte)
- Size (1 byte)
- Code (14 bytes)
  - Either 8 bytes for addr and 6 empty
  - Or 14 bytes available for byte or machine code
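For illustration, the 16-byte entry can be described with Python's `struct` module. This is a sketch under assumptions: the fields are taken to be packed in the order listed, the address is taken to be little-endian, and the helper names are invented here. The flag and size values used are placeholders, since their meaning is not defined in this section.

```python
import struct

# flags (1 byte), size (1 byte), code (14 bytes); "<" disables padding.
ENTRY = struct.Struct("<BB14s")


def pack_addr_entry(flags, size, addr):
    """Entry whose code field holds an 8-byte address followed by 6 empty bytes."""
    code = struct.pack("<Q", addr) + bytes(6)   # assumption: little-endian address
    return ENTRY.pack(flags, size, code)


def pack_inline_entry(flags, size, code):
    """Entry whose code field holds up to 14 bytes of bytecode or machine code."""
    if len(code) > 14:
        raise ValueError("inline code must fit in 14 bytes")
    return ENTRY.pack(flags, size, code.ljust(14, b"\x00"))


def entry_offset(opcode):
    """Byte offset of an opcode's entry, relative to the start of the opcode map."""
    return opcode * ENTRY.size


entry = pack_addr_entry(flags=0, size=1, addr=0x1122334455667788)
flags, size, code = ENTRY.unpack(entry)
addr = struct.unpack("<Q", code[:8])[0]
print(len(entry), hex(addr), entry_offset(0x80))
```

With 16 bytes per entry, opcode 0x80 lands 2048 bytes into the map, which lines up with the 2048-byte split between the BlueVM and extended opcode maps.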
Once an opcode entry is located in the opcode map the opcode handler is called. The BlueVM has two opcode handlers, interpret (default) and compile. The interpret handler calls the code address specified in the opcode entry. The compile handler finds the opcode length in the opcode entry and copies that many bytes, including the opcode, into the code buffer and advances the code buffer's here.
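The compile handler's copy step might look like the sketch below. It assumes the instruction pointer has already moved past the opcode byte when the handler runs, and that the handler then advances it past any remaining bytes it copied; the second point is a guess, and `compile_opcode` and its parameters are hypothetical names.

```python
def compile_opcode(bytecode, ip, size, code_buffer, here):
    """Copy one opcode (plus any inline bytes) into the code buffer.

    bytecode    -- the bytes being executed
    ip          -- index just past the opcode byte (already incremented)
    size        -- opcode length from the opcode entry, including the opcode
    code_buffer -- bytearray modelling the code buffer
    here        -- current write position in the code buffer
    Returns the new (ip, here).
    """
    start = ip - 1                           # step back to the opcode itself
    chunk = bytecode[start:start + size]
    if len(chunk) != size:
        raise ValueError("bytecode ends before the opcode is complete")
    if here + size > len(code_buffer):
        raise ValueError("code buffer is full")
    code_buffer[here:here + size] = chunk    # copy size bytes, opcode included
    return start + size, here + size         # advance ip (assumed) and here


buf = bytearray(8)
# Hypothetical encoding: opcode 0x01 takes one inline byte, so its size is 2.
new_ip, new_here = compile_opcode(bytes([0x01, 0x2A, 0x00]), 1, 2, buf, 0)
print(new_ip, new_here, buf[:new_here].hex())
```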
Opcodes start at 00 and are subject to change.
Opcode | Name | Stack Effect | Description |
---|---|---|---|
XX | exit | ( b -- ) | Exit with status from top of stack |
XX | true | ( -- t ) | Push true value |
XX | false | ( -- f ) | Push false value |
XX | if-else | ( t/f ta fa -- ? ) | Call ta if t/f is true else call fa |
XX | mccall | ( a -- ? ) | Call machine code at address |
XX | call | ( a -- ? ) | Call bytecode located at address |
XX | >r | ( a -- ) | Move top of data stack to return stack |
XX | r> | ( -- a ) | Move top of return stack to data stack |
XX | ret | ( -- ) | Pops value from return stack and sets the instruction pointer |
XX | [ | ( -- ) | Begin compiling bytecode |
XX | ] | ( -- a ) | Append ret and end compilation, push addr where compilation started |
XX | ip | ( -- a ) | Push location of the instruction pointer |
XX | ip! | ( a -- ) | Sets the location of the instruction pointer |
XX | entry | ( b -- a ) | Push addr of the entry for opcode |
XX | start | ( -- a ) | Push the code buffer location |
XX | here | ( -- a ) | Push location of code buffer's here |
XX | here! | ( a -- ) | Sets the location of code buffer's here |
XX | b@ | ( a -- b ) | Push byte value found at addr |
XX | @ | ( a -- q ) | Push qword value found at addr |
XX | b!+ | ( a b -- 'a ) | Write byte value to, increment and push addr |
XX | w!+ | ( a w -- 'a ) | Write word value to, increment and push addr |
XX | d!+ | ( a d -- 'a ) | Write dword value to, increment and push addr |
XX | !+ | ( a q -- 'a ) | Write qword value to, increment and push addr |
XX | b, | ( b -- ) | Write byte value to, and increment, here |
XX | w, | ( w -- ) | Write word value to, and increment, here |
XX | d, | ( d -- ) | Write dword value to, and increment, here |
XX | , | ( q -- ) | Write qword value to, and increment, here |
XX | litb | ( -- b ) | Push next byte from, and increment, instruction pointer |
XX | litw | ( -- w ) | Push next word from, and increment, instruction pointer |
XX | litd | ( -- d ) | Push next dword from, and increment, instruction pointer |
XX | lit | ( -- q ) | Push next qword from, and increment, instruction pointer |
XX | depth | ( -- n ) | Push depth of the data stack |
XX | drop | ( x -- ) | Drops top of the data stack |
XX | dup | ( a -- a a ) | Duplicate top of stack |
XX | swap | ( a b -- b a ) | Swap top two values on the data stack |
XX | not | ( x -- 'x ) | Bitwise not top of the data stack |
XX | = | ( a b -- t/f ) | Check top two items for equality and push result |
XX | + | ( a b -- n ) | Push a + b |
XX | - | ( a b -- n ) | Push a - b |
XX | shl | ( x n -- 'x ) | Push x shl n |
XX | shr | ( x n -- 'x ) | Push x shr n |
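
To make the stack effect notation concrete, here is a small Python model of four of the opcodes above. In `( a b -- n )` the rightmost item is the top of the stack, so `b` is popped first. The functions are illustrative only; wrapping results to 64 bits is an assumption based on the VM being 64-bit.

```python
MASK64 = (1 << 64) - 1   # assumption: values wrap to 64 bits


def op_swap(stack):
    """swap ( a b -- b a )"""
    b = stack.pop()
    a = stack.pop()
    stack.extend([b, a])


def op_sub(stack):
    """- ( a b -- n ): push a - b"""
    b = stack.pop()
    a = stack.pop()
    stack.append((a - b) & MASK64)


def op_shl(stack):
    """shl ( x n -- 'x ): push x shl n"""
    n = stack.pop()
    x = stack.pop()
    stack.append((x << n) & MASK64)


def op_store_byte_inc(stack, memory):
    """b!+ ( a b -- 'a ): write byte b at addr a, push the incremented addr"""
    b = stack.pop()
    a = stack.pop()
    memory[a] = b & 0xFF
    stack.append(a + 1)


stack = [10, 3]
op_sub(stack)                 # stack is now [7]
stack.append(2)
op_swap(stack)                # stack is now [2, 7]
op_shl(stack)                 # 2 shl 7, stack is now [256]

memory = bytearray(4)
cells = [0, 0xAB]
op_store_byte_inc(cells, memory)   # memory[0] is 0xAB, cells is now [1]
print(stack, cells, memory.hex())
```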
Along with the code for BlueVM this repository also contains some tools and examples that can be used as reference:
Name | Description | Location |
---|---|---|
blang | Quick and dirty frontend for a textual representation of the BlueVM bytecode | lang/blang |
blue | Frontend for a language with a Forth-like syntax where you are the assembler and linker | lang/blue |
- Write a bs0->blang (gnalb) decompiler by overwriting opcode map/handler
- Rename entry
- Add ! flavors
- Use in Blue opcodes
- See about re-arranging >r order in op_compile_begin to simplify it and op_compile_end
- Add more ops to make defining a custom op less verbose/brittle
  - opN[ ]op
  - opNI[ ]op
  - opB[ ]op
  - opBI[ ]op
- Dockerize and get a CI job that runs ./build.sh
- Add bytecode op if/if-not
- Add bytecode opcodes for litb, etc
- Print error messages to disambiguate exit status
- Move blang into lang like blue