Skip to content

Sanitiser

Matt Windsor edited this page Jan 4, 2019 · 2 revisions

The sanitiser is the part of act that reduces assembly listings from compilers to a form suitable for conversion to litmus tests. Since it performs direct manipulations on the assembly, it is arguably the most complex and potentially error-prone part of act.

Specific sanitiser passes

Deref chain elimination

The Sanitiser_deref module, and its corresponding simplify-deref-chains pass flag, simplifies chains of move instructions that start with loading a variable's symbol into a location and end with dereferencing that symbol in some manner. These chains tend to get generated by non-optimising compilers when they translate atomic_int usages.

Warning: While this pass is implemented in a language-abstract manner, it specifically operates on an x86 idiom, and may need to be generalised in future when we target other architectures.

Read chains

A read chain starts by moving a variable symbol, and ends by using the dereference of that symbol as the source of a move instruction. Between the two is a sequence of zero or more moves that pass the symbol around. For example, in (x86 intel):

mov EAX    , x       ; move value of symbol x to register
mov [EBP-4], EAX     ; spill symbol onto stack
mov EDX    , [EBP-4] ; place symbol back into register
mov EDX    , [EDX]   ; dereference it

The simplify-deref-chains pass simplifies this chain into a single move from the memory pointed to by the symbol to the final destination at the end of the chain:

mov EDX, [x]

Write chains

A write chain starts like a read chain, but includes a movement of an immediate value into another location, and ends with the writing of that value into the dereferenced variable. Between the two ends of the chain is an interleaving of movements of the variable symbol (the target) and the value to write (the value).

For example, again in (x86 intel):

mov EAX    , x       ; move value of symbol x to register
mov [EBP-4], 1       ; load a value into memory
mov [EBP-8], EAX     ; spill symbol onto stack
mov EDX    , [EBP-4] ; place value into register
mov EAX    , [EBP-8] ; place symbol back into register
mov [EAX]  , EDX     ; store into variable

The simplify-deref-chains pass simplifies this chain into a single move of the immediate value into the final destination at the end of the chain:

mov [x], 1

Sanitiser internals

The Ctx monad

The sanitiser holds, at any given time, a large amount of state about the program (or programs) on which it is operating. As with most things in act, we handle this in a functional manner, by enclosing this state in a state monad.

Clone this wiki locally