-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug Report] VectorEnv action space seed problem #2680
Comments
An additional concern is that this managed to get through CI tests |
Off the top of my head I don't really know how to fix this, but I think it's pretty low severity, right? Since it's a relatively niche feature, and there's a pretty simple way of doing this "correctly". It might be enough to just mention it in the documentation that batched spaces don't preserve their RNG. Also, what exactly is the intended behavior here? After seeding a space and using it in a vectorized env, do we expect all environments's spaces to have the same RNG? |
Yeah, this is a weird one. Something I can think of is the seed the batched space with the seed from the sub environment space if the seed is set. If there are multiple different seeds from the sub environment spaces probably just take the first one’s seed…? |
The second option (seeding the As @RedTachyon said, there is no easy way to fix this unfortunately. The only way I could see seeding being enforced is if we had a way to know the seed of an instantiated space, which is impossible with the current API; the only way to have access to seeding information is when we call As a side note (this was extensively discussed in #2422), if reproducibility is critical, I would suggest defining (and seeding) a random policy instead, using PyTorch/Jax/Numpy/TF. You'd be losing the ease to sample from any kind of space (barely, it should be easy to write code that works for any space), but you would gain a lot of control over what you want to be fully reproducible. |
Should we include a disclaimer like "This method is meant to serve as a quick and dirty way of getting a valid action, and should not be used in situations where reproducibility is desirable" around |
I like that disclaimer |
Looking at creating a solution for this, the problem is in gym.vector.utils.spaces as when a batch space is created then the new space, a seed is not provided based on the original space. Therefore, I have modified the space.init seed parameter to allow a raw seeding.RandomNumberGenerator object I will create a PR with this solution to allow review and comments Edit: PR is #2727
There doesn't appear to be any tests for the batch space seeds, Im adding that as part of the PR |
… 2680 (#2727) * Add a case for the Box shape where the low and high values are both scalars * Add seeding.RandomNumberGenerator parameter to Dict seed. Modify __repr__ for the dictionary space string looks similar to an actual dictionary * Add seeding.RandomNumberGenerator parameter to Multi Binary seed * Add seeding.RandomNumberGenerator parameter to Multi Binary seed. Modify nvec typing to include np.ndarray * Space seed typing can be a seeding.RandomNumberGenerator. If a seeding.RNG is provided then it is assigned to _np_random and .seed is not run * Fixed the tuple seeding type as List[int] is not a valid Space seed type * Added typing to batch_space. The batch_space seed is equal to the space's seeding * Fixed the seeding type * Add test for batch space seeds are identical to the original space's seeding * Add equivalence function for RandomNumberGenerator comparing the bit_generator.state * The batch_space functions uses a copy of the seed for the original space * Set the action space seed for sync_vector_env seed testing * Add test for the seeding of the sync vector environment * Update the test_batch_space_seed to check the resulting sampling are equivalent for testing * Revert representation back to the original version * Remove additional Box shape initialisation * Remove additional typing of MultiDiscrete * Fixed bug of Space batch space where the original space's np_random is not a complete copy of the original space * Add CustomSpace to the batched space seed test * Modify the CustomSpace sample to produce a random number not a static value * Fix CustomSpace to reflect the sample function * Copy the space.np_random for the batched_space seed to ensure that the original space doesn't sampling doesn't effect the batched_space * Parameterized the batch_space_seed, added testing for rng_different_at_each_index and test_deterministic * Black and isort pre-commit changes * Pre-commit fix * MacOS, test_read_from_shared_memory throws an error that the inner _process_write function was unpicklable. Making the function a top-level function solves this error * Fixed typing of seed where a space's seed function differs from Space.seed's typing * Added check that the sample lengths are equal and explicitly provided the number of batched spaces n=1 * Removed relative imports for absolute imports * Use deepcopy instead of copy * Replaces `from numpy.testing._private.utils import assert_array_equal` with `from numpy.testing import assert_array_equal` * Using the seeding `__eq__` function, replace `np_random.bit_generator.state` with `np_random` * Added docstrings and comments to the tests to explain their purpose * Remove __eq__ from RandomNumberGenerator and add to tests/vector/utils * Add sync vector determinism test for issue #2680 * Fixed bug for https://github.com/openai/gym/pull/2727/files/462101d3846bc35ff3fad9f65979c693472a93a8#r850740825 * Made the new seeds a list of integers
Describe the bug
envs.action_space.sample()
is not reproducible after seeds.Code example
I think the problem is #2280 batches a new unseeded action space... So if I seed this action space again it will work:
But this may seem unintuitive to the users. I am not sure how to resolve this, @tristandeleu What do you think?
Checklist
The text was updated successfully, but these errors were encountered: