
Upgrade test_action_dim. Refer to #2767 #2802

Merged: 12 commits into openai:master on May 16, 2022

Conversation

@gianlucadecola (Contributor)

For one of the proposals of #2767
This PR aims at adding more comprehensive tests to ensure that the correct action is performed in the environment.
While adding tests I found that action validation is performed in different ways across environments: sometimes a check is done before the step computation; other times, step is left to fail on subsequent invalid operations caused by the wrong action.

I've tried to add more consistency across action validation.

For Discrete environments I'm raising an exception. However, continuous action spaces are typically allowed to take out-of-bound actions and clip them internally (inputs with the wrong NumPy shape are also accepted). Since I didn't want to break backward compatibility, I've decided to just raise a warning in these situations.
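A minimal sketch of the kind of test this describes; the environment id and test name are illustrative assumptions, and a generic `Exception` is used since, as noted above, the raised type is not consistent across envs:

```python
import gym
import pytest


def test_discrete_out_of_bound_action():
    env = gym.make("CartPole-v1")  # Discrete(2) action space
    env.reset()
    invalid_action = env.action_space.n  # first index outside the space
    with pytest.raises(Exception):  # generic, since envs raise different types
        env.step(invalid_action)
```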

@pseudo-rnd-thoughts (Contributor) left a comment


So my idea behind this PR was more on the tests side, to check that all environments work as intended, but the idea of validating environment inputs is interesting.
I guess this is a more generalised version of environments checking their input instead of each environment checking it individually.

If this is the approach we want to take, then could we generalise the function to work with every known Space? The contains function is, I believe, implemented by every Space, so it shouldn't be too bad.
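A sketch of the generalised check suggested here, written as a hypothetical free-standing helper rather than any existing gym function; `Space.contains` is part of the public `Space` interface:

```python
from gym.spaces import Space


def validate_action(space: Space, action) -> None:
    # Works for any Space subclass, since each one implements contains()
    if not space.contains(action):
        raise ValueError(
            f"Action {action!r} is not contained in the action space {space}"
        )
```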

@gianlucadecola (Contributor, Author)

Maybe we should defer this to another PR, since it is also a design change, and keep only the tests here.
The drawback is that for discrete actions, when an out-of-bound action is performed, I have to check for a generic Exception instead of a specific one, since the behaviour is not consistent across all envs.

For Box action spaces I think I could instead check that an "out-of-bound" action is clipped to the bounds of the action_space (see the sketch below).
What do you think?
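A sketch of that Box check: stepping with an out-of-bound action should behave as if the action had been clipped to the bounds. The environment id, seeding style, and test name are illustrative assumptions:

```python
import gym
import numpy as np


def test_box_out_of_bound_action_is_clipped():
    # Two identically seeded copies of the same environment
    env_a = gym.make("MountainCarContinuous-v0")
    env_b = gym.make("MountainCarContinuous-v0")
    env_a.reset(seed=42)
    env_b.reset(seed=42)

    space = env_a.action_space
    oob_action = space.high + 1.0  # above the upper bound
    clipped = np.clip(oob_action, space.low, space.high)

    obs_a = env_a.step(oob_action)[0]
    obs_b = env_b.step(clipped)[0]
    assert np.allclose(obs_a, obs_b)  # env clipped the action internally
```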

> I guess this is a more generalised version of environments checking their input instead of each environment checking it individually.

Yes, while looking at the code when writing the tests, I thought that more consistent error handling for out-of-action-space actions would be nice.

@pseudo-rnd-thoughts (Contributor)

@RedTachyon What are your thoughts on this PR?
I think that it is reasonable to raise an error if an action outside of the action space is used with env.step, for all space types. However, with the other breaking changes that we are making, I don't want to make too many that are unnecessary. This would help spot errors that users don't check for, so I'm unsure where the balance lies.

@RedTachyon (Contributor)

This seems to be like 3 PRs combined into one, no? I see (1) automatic action validation for built-in envs (which is something to consider, but far from an obvious feature for me personally), (2) modifications to car racing (which at a glance seem reasonable), and (3) actual tests?

I'd very strongly prefer those to be split up, with a space to discuss each of them separately. The order of priority imo would be (3) [tests for current code] -> (2) [refactoring without changing functionality] -> (1) [new functionality/deprecating certain functionalities]

As a quick counterpoint about (1), it often might be good enough to just clip the actions to the correct range, which I think already happens in some environments. I don't know if that's a definitive argument against this change, but it's an argument, and something like that needs a stand-alone discussion.

@gianlucadecola (Contributor, Author)

> This seems to be like 3 PRs combined into one, no? I see (1) automatic action validation for built-in envs (which is something to consider, but far from an obvious feature for me personally), (2) modifications to car racing (which at a glance seem reasonable), and (3) actual tests?

Definitely right, I went a little too far with the refactoring.

> I'd very strongly prefer those to be split up, with a space to discuss each of them separately. The order of priority imo would be (3) [tests for current code] -> (2) [refactoring without changing functionality] -> (1) [new functionality/deprecating certain functionalities]
>
> As a quick counterpoint about (1), it often might be good enough to just clip the actions to the correct range, which I think already happens in some environments. I don't know if that's a definitive argument against this change, but it's an argument, and something like that needs a stand-alone discussion.

I'm going to revert those changes and leave this PR only for the actual tests 👍

@gianlucadecola (Contributor, Author)

It should be ok now :)

@jkterry1 merged commit 6112b0d into openai:master on May 16, 2022