[Bug Report] VectorEnv action space seed problem #2680

vwxyzjn · 2022-03-09T16:04:18Z

Describe the bug
envs.action_space.sample() is not reproducible after seeds.

Code example

import gym
import numpy as np
def make_env(env_id, seed):
    def thunk():
        env = gym.make(env_id)
        env.seed(seed)
        env.action_space.seed(seed)
        env.observation_space.seed(seed)
        return env

    return thunk
seed = 1
envs = gym.vector.SyncVectorEnv([make_env("CartPole-v1", seed)])
sampled_actions = []
for _ in range(20):
    sampled_actions += [envs.action_space.sample()]
print(np.array(sampled_actions).flatten())

(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[1 0 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 0 0]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[0 1 1 1 0 0 1 0 1 0 0 0 1 1 0 0 1 1 1 1]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[0 1 1 0 1 1 1 1 1 0 1 0 0 1 1 1 1 0 1 1]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[0 0 0 1 1 1 1 1 0 1 1 0 1 1 0 1 0 0 0 0]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[0 1 0 1 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 0]

I think the problem is #2280 batches a new unseeded action space... So if I seed this action space again it will work:

import gym
import numpy as np
def make_env(env_id, seed):
    def thunk():
        env = gym.make(env_id)
        env.seed(seed)
        env.action_space.seed(seed)
        env.observation_space.seed(seed)
        return env

    return thunk
seed = 1
envs = gym.vector.SyncVectorEnv([make_env("CartPole-v1", seed)])
envs.action_space.seed(seed)
envs.observation_space.seed(seed)
sampled_actions = []
for _ in range(20):
    sampled_actions += [envs.action_space.sample()]
print(np.array(sampled_actions).flatten())

(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[1 1 0 1 0 0 1 0 1 0 1 1 0 1 0 0 0 0 0 0]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[1 1 0 1 0 0 1 0 1 0 1 1 0 1 0 0 0 0 0 0]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[1 1 0 1 0 0 1 0 1 0 1 1 0 1 0 0 0 0 0 0]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[1 1 0 1 0 0 1 0 1 0 1 1 0 1 0 0 0 0 0 0]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗ python w.py
[1 1 0 1 0 0 1 0 1 0 1 1 0 1 0 0 0 0 0 0]
(gym-vSAllIMi-py3.9) ➜  gym git:(master) ✗

But this may seem unintuitive to the users. I am not sure how to resolve this, @tristandeleu What do you think?

Checklist

I have checked that there is no similar issue in the repo (required)

The text was updated successfully, but these errors were encountered:

jkterry1 · 2022-03-11T03:14:09Z

An additional concern is that this managed to get through CI tests

RedTachyon · 2022-03-13T22:19:56Z

Off the top of my head I don't really know how to fix this, but I think it's pretty low severity, right? Since it's a relatively niche feature, and there's a pretty simple way of doing this "correctly". It might be enough to just mention it in the documentation that batched spaces don't preserve their RNG.

Also, what exactly is the intended behavior here? After seeding a space and using it in a vectorized env, do we expect all environments's spaces to have the same RNG?

vwxyzjn · 2022-03-13T22:21:37Z

Yeah, this is a weird one. Something I can think of is the seed the batched space with the seed from the sub environment space if the seed is set. If there are multiple different seeds from the sub environment spaces probably just take the first one’s seed…?

tristandeleu · 2022-03-24T23:32:06Z

The second option (seeding the observation_space and action_space of VectorEnv, instead of the individual environments) should be the preferred one, since a VectorEnv really is just like any environment. This could be made clearer in the documentation.

As @RedTachyon said, there is no easy way to fix this unfortunately. The only way I could see seeding being enforced is if we had a way to know the seed of an instantiated space, which is impossible with the current API; the only way to have access to seeding information is when we call space.seed().

As a side note (this was extensively discussed in #2422), if reproducibility is critical, I would suggest defining (and seeding) a random policy instead, using PyTorch/Jax/Numpy/TF. You'd be losing the ease to sample from any kind of space (barely, it should be easy to write code that works for any space), but you would gain a lot of control over what you want to be fully reproducible.

RedTachyon · 2022-03-24T23:43:35Z

Should we include a disclaimer like "This method is meant to serve as a quick and dirty way of getting a valid action, and should not be used in situations where reproducibility is desirable" around space.sample()?

jkterry1 · 2022-03-31T20:59:54Z

I like that disclaimer

pseudo-rnd-thoughts · 2022-04-02T16:19:58Z

Looking at creating a solution for this, the problem is in gym.vector.utils.spaces as when a batch space is created then the new space, a seed is not provided based on the original space.
However, the problem is this seed value needs to be an integer value that can't be reversed

Therefore, I have modified the space.init seed parameter to allow a raw seeding.RandomNumberGenerator object
Using this modification, the first env provided to a sync vector env is used for the action space seeding

I will create a PR with this solution to allow review and comments

Edit: PR is #2727

An additional concern is that this managed to get through CI tests

There doesn't appear to be any tests for the batch space seeds, Im adding that as part of the PR

… 2680 (#2727) * Add a case for the Box shape where the low and high values are both scalars * Add seeding.RandomNumberGenerator parameter to Dict seed. Modify __repr__ for the dictionary space string looks similar to an actual dictionary * Add seeding.RandomNumberGenerator parameter to Multi Binary seed * Add seeding.RandomNumberGenerator parameter to Multi Binary seed. Modify nvec typing to include np.ndarray * Space seed typing can be a seeding.RandomNumberGenerator. If a seeding.RNG is provided then it is assigned to _np_random and .seed is not run * Fixed the tuple seeding type as List[int] is not a valid Space seed type * Added typing to batch_space. The batch_space seed is equal to the space's seeding * Fixed the seeding type * Add test for batch space seeds are identical to the original space's seeding * Add equivalence function for RandomNumberGenerator comparing the bit_generator.state * The batch_space functions uses a copy of the seed for the original space * Set the action space seed for sync_vector_env seed testing * Add test for the seeding of the sync vector environment * Update the test_batch_space_seed to check the resulting sampling are equivalent for testing * Revert representation back to the original version * Remove additional Box shape initialisation * Remove additional typing of MultiDiscrete * Fixed bug of Space batch space where the original space's np_random is not a complete copy of the original space * Add CustomSpace to the batched space seed test * Modify the CustomSpace sample to produce a random number not a static value * Fix CustomSpace to reflect the sample function * Copy the space.np_random for the batched_space seed to ensure that the original space doesn't sampling doesn't effect the batched_space * Parameterized the batch_space_seed, added testing for rng_different_at_each_index and test_deterministic * Black and isort pre-commit changes * Pre-commit fix * MacOS, test_read_from_shared_memory throws an error that the inner _process_write function was unpicklable. Making the function a top-level function solves this error * Fixed typing of seed where a space's seed function differs from Space.seed's typing * Added check that the sample lengths are equal and explicitly provided the number of batched spaces n=1 * Removed relative imports for absolute imports * Use deepcopy instead of copy * Replaces `from numpy.testing._private.utils import assert_array_equal` with `from numpy.testing import assert_array_equal` * Using the seeding `__eq__` function, replace `np_random.bit_generator.state` with `np_random` * Added docstrings and comments to the tests to explain their purpose * Remove __eq__ from RandomNumberGenerator and add to tests/vector/utils * Add sync vector determinism test for issue #2680 * Fixed bug for https://github.com/openai/gym/pull/2727/files/462101d3846bc35ff3fad9f65979c693472a93a8#r850740825 * Made the new seeds a list of integers

pseudo-rnd-thoughts · 2022-04-28T14:37:58Z

@vwxyzjn Thanks has been solved #2727, could you close please

vwxyzjn mentioned this issue Mar 9, 2022

dqn.py does not respect seed vwxyzjn/cleanrl#133

Closed

jkterry1 added the PR Needed label Mar 13, 2022

pseudo-rnd-thoughts added a commit to pseudo-rnd-thoughts/gym that referenced this issue Apr 19, 2022

Add sync vector determinism test for issue openai#2680

b7fc60f

jkterry1 closed this as completed Apr 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug Report] VectorEnv action space seed problem #2680

[Bug Report] VectorEnv action space seed problem #2680

vwxyzjn commented Mar 9, 2022 •

edited

Loading

jkterry1 commented Mar 11, 2022

RedTachyon commented Mar 13, 2022

vwxyzjn commented Mar 13, 2022

tristandeleu commented Mar 24, 2022

RedTachyon commented Mar 24, 2022

jkterry1 commented Mar 31, 2022

pseudo-rnd-thoughts commented Apr 2, 2022 •

edited

Loading

pseudo-rnd-thoughts commented Apr 28, 2022

[Bug Report] VectorEnv action space seed problem #2680

[Bug Report] VectorEnv action space seed problem #2680

Comments

vwxyzjn commented Mar 9, 2022 • edited Loading

Checklist

jkterry1 commented Mar 11, 2022

RedTachyon commented Mar 13, 2022

vwxyzjn commented Mar 13, 2022

tristandeleu commented Mar 24, 2022

RedTachyon commented Mar 24, 2022

jkterry1 commented Mar 31, 2022

pseudo-rnd-thoughts commented Apr 2, 2022 • edited Loading

pseudo-rnd-thoughts commented Apr 28, 2022

vwxyzjn commented Mar 9, 2022 •

edited

Loading

pseudo-rnd-thoughts commented Apr 2, 2022 •

edited

Loading