The matrix extension adds 9 unprivileged CSRs and 16 matrix registers to the base scalar RISC-V ISA.
The read-only XLEN-wide matrix type CSR, mtype
, provides the default type used to interpret the contents of the matrix register file, and can only be updated by msettype{i|hi}
and field-set instructions. The matrix type determines the organization of elements in each matrix register.
Note
|
Allowing updates only via type-set or field-set instructions simplifies the maintenance of mtype register state.
|
The mtype
register has an mill
field, an msew
field, an mba
field and several type fields. Bits mtype[XLEN-2:16]
should be written with zero, and non-zero values of this field are reserved.
The msew
field is used to specify the element width of source operands. It is used to calculate the maximum values of matrix size.
For each type field, a value 0 means the corresponding type is disabled. Write non-zero value to enable matrix multiplication operation of the specified type. 0 will be returned and mill
will be set if the type is not supported.
For mint4
field, write 1 to enable 4-bit integer where a 8-bit integer will be treated as a pair of 4-bit integers. 1’b0 will be returned if 4-bit integer is not supported.
For mint8
field, write 1 to enable 8-bit integer.
For mint16
field, write 1 to enable 16-bit integer.
For mint64
field, write 1 to enable 64-bit integer.
For mfp8
field, write 2’b01 to enable E4M3, 2’b10 to enable E5M2, and 2’b11 to enable E3M4. mfp8[1:0]
always returns 2’b00 if FP8 is not supported.
For mfp16
field, write 2’b01 to enable IEEE-754 half-precision float point (E5M10), and write 2’b10 to enable BFloat16 (E8M7). 2’b11 is reserved.
For mfp32
field, write 2’b01 to enalbe IEEE-754 single-precison float point (E8M23), and write 2’b10 to enable TensorFloat32 (E8M10). 2’b11 is reserved.
For mfp64
field, write 1 to enable 64-bit double-precision float point. To support FP64 format, the implementation should support "D" extension at the same time. 0 will be returned if FP64 is not supported.
The mba
field indicates that the out-of-bound elements is undisturbed or agnostic. When mba
is marked undisturbed (mba=0
), the out-of-bound elements in a matrix register retain the value it previously held. Otherwise, the out-of-bound elements can be overwritten with any values.
The XLEN-bit-wide read-only mtilem/mtilek/mtilen
CSRs can only be updated by the msettile{m|k|n}{i}
instructions. The registers holds 3 unsigned integers specifying the tile shapes for tiled matrix.
The mstart
read-write CSR specifies the index of the first element to be executed by load/store and element-wise arithmetic instructions. The CSR can be written by hardware on a trap, and its value represents the element on which the trap was taken. The value is the sequential number in row order.
Any legal matrix instruction can reset the mstart
to zero at the end of excution.
The mcsr
register has 2 fields, and other bits with non-zero value are reserved.
mcsr
register layout
Bits | Name | Description |
---|---|---|
XLEN-1:1 |
0 |
Reserved if non-zero. |
0 |
msat |
Integer/FP8 arithmetic instruction accrued saturation flag. |
If msat
bit is set, the results of integer and FP8 will be saturated to the border values of corresponding type.