feat[dace][next]: Added CopyChainRemover
#1901
Conversation
… but this time in multiple states. Note that this transformation does not look at the subsets when it removes a copy. This is in accordance with ADR-18.
…) -> (T) -> (G)`, which is handled by the previous self-copy elimination transformation. However, it is something that we should do. The main problem is that deleting the (unnecessary) writes to `G` is much harder.
…fication pipeline.
Merge branch 'distributed-self-copy-remover' into redundant-array-remover-chain.
Co-authored-by: edopao <edoardo16@gmail.com>
… that we need to handle concat_where expressions.
I do not fully understand it; it is a DaCe issue. It is, however, interesting that it worked for me yesterday but not today, while for Edoardo it worked yesterday.
The funny thing is that `can_be_applied()` can handle it, but the `apply()` method cannot. The case is not handled at all, and even the retrieval of the edge cannot be handled.
It now also handles the case that `a1` has other outputs. There are some tests missing, which I will now add.
There is one case in the cycle detection that is not handled correctly, but it is a rather rare edge case.
Because then it would be cyclic anyway.
The issue was that we also must propagate the strides into scopes, i.e. the case when a NestedSDFG is located inside a nested SDFG.
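The recursive nature of the fix can be illustrated with a toy sketch. This is not the actual gt4py/DaCe code; `Array`, `SDFG`, and `propagate_strides` are stand-in names assumed here purely to show why propagation must recurse through every nesting level.

```python
# Hypothetical sketch (not the real DaCe API): stride propagation must
# recurse into nested SDFGs, not stop after the first level.

class Array:
    def __init__(self, strides):
        self.strides = strides

class SDFG:
    def __init__(self, arrays, nested=None):
        self.arrays = arrays          # name -> Array
        self.nested = nested or []    # (inner array name, inner SDFG) pairs

def propagate_strides(sdfg, name, strides):
    """Set `strides` on `name` and recurse into every nested SDFG that
    maps an inner array onto it."""
    sdfg.arrays[name].strides = strides
    for inner_name, inner_sdfg in sdfg.nested:
        # Without this recursion, an SDFG nested two levels deep would
        # keep its stale strides -- the bug described in the comment above.
        propagate_strides(inner_sdfg, inner_name, strides)
```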
I do not want it, but I will allow it for now.
Great job! I imagined it would have been much simpler.
    # If the subset we care about, which is always on the `a1` side, was not
    # specified we assume that the whole `a1` has been written.
    # TODO(edopao): Fix lowering that this does not happens, it happens for example
Does it happen in docstring tests?
I have not really looked into it; I just know that in that file and that test there is a Memlet without `other_subset`.
It is not a high priority, but validation complains, and this case only has three access nodes, so it should be simple to fix, once we have time.
I've found the place; that's because I call `sdfg.make_array_memlet(field.dc_node.data)`, which only sets `subset`, not `other_subset`.
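The fallback discussed here can be sketched in isolation. The `Memlet` class below is a stand-in, not DaCe's real `Memlet`; it only models the two fields the discussion is about, and `written_subset` is a hypothetical helper illustrating the assumption "if `other_subset` is missing, treat the whole destination as written".

```python
# Illustrative stand-in only; not DaCe's Memlet class.
from dataclasses import dataclass
from typing import Optional, Tuple

Subset = Tuple[Tuple[int, int], ...]  # per-dimension (begin, end) ranges

@dataclass
class Memlet:
    subset: Subset                        # subset on the source side
    other_subset: Optional[Subset] = None # subset on the destination side

def written_subset(memlet: Memlet, dst_shape: Tuple[int, ...]) -> Subset:
    """If `other_subset` was never set (as with a memlet created from only
    the source array), fall back to assuming the whole destination array
    has been written."""
    if memlet.other_subset is not None:
        return memlet.other_subset
    return tuple((0, n - 1) for n in dst_shape)
```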
Co-authored-by: edopao <edoardo16@gmail.com>
This PR adds the `CopyChainRemover` transformation.

The introduction of `concat_where` introduced a new pattern, which is essentially a chain of copies. As an example, imagine that a domain is split into 3 subdomains. The result on the first subdomain is stored in `T1`, the one of the second in `T2`, and the one of the third in `T3`. `T1` and `T2` are then copied into `T4`; finally, `T4` together with `T3` is copied into `T5`. This transformation will remove `T1`, `T2`, `T3` and `T4`, so the results are written into `T5` directly.

There are some limitations. If we have the pattern `(A1) -> (A2)`, then we eliminate `A1` only if:
- `A1` is fully read; this is to avoid some nasty adjustments of map bounds.
- `A1` and `A2`.

The transformation was added twice to the simplify pass, which allows us to mitigate DaCe issue #1959.
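The `T1`..`T5` example above can be sketched with a toy model. This is not the actual `CopyChainRemover` implementation; `remove_copy_chains` is a hypothetical helper over plain dictionaries, illustrating only the core idea that producers are redirected past pure-copy intermediates to the final target.

```python
# Toy illustration of the copy-chain pattern (assumed names, not the
# real gt4py/DaCe implementation).

def remove_copy_chains(writes, copies):
    """`writes` maps each producer to the temporary it writes; `copies`
    maps each temporary to the array it is copied into.  Each producer is
    redirected to the end of its copy chain, making the temporaries dead."""
    def chain_end(node):
        while node in copies:   # follow e.g. T1 -> T4 -> T5
            node = copies[node]
        return node
    return {producer: chain_end(tmp) for producer, tmp in writes.items()}
```

With the chain from the description (`T1`, `T2` into `T4`; `T4`, `T3` into `T5`), all three producers end up writing `T5` directly.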