Gym Wrappers

Additional wrappers that extend Gymnasium environments.

TimeFeatureWrapper

class sb3_contrib.common.wrappers.TimeFeatureWrapper(env, max_steps=1000, test_mode=False)

Adds the normalized remaining time to the observation for fixed-length episodes. See https://arxiv.org/abs/1712.00378 and https://github.com/aravindr93/mjrl/issues/13.

Note

Only 1D gym.spaces.Box and gym.spaces.Dict (gym.GoalEnv) observation spaces are currently supported.

Parameters:

env – Gym env to wrap.
max_steps – Maximum number of steps in an episode, used if the env is not wrapped in a TimeLimit object.
test_mode – In test mode, the time feature is constant, equal to zero. This makes it possible to check that the agent did not overfit to this feature by learning a deterministic, pre-defined sequence of actions.
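The idea of a normalized remaining-time feature can be illustrated with a small, dependency-free sketch. This is not the sb3_contrib implementation: the helper name append_time_feature is hypothetical, observations are plain Python lists rather than NumPy arrays, and the sketch assumes the feature starts at 1.0 and decreases linearly to 0.0 over max_steps steps.

```python
from typing import List


def append_time_feature(
    obs: List[float],
    current_step: int,
    max_steps: int = 1000,
) -> List[float]:
    """Return a copy of obs with the normalized remaining time appended.

    Assumption (not taken from the library source): the feature is 1.0 at
    the start of an episode and decreases linearly to 0.0 at max_steps.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    remaining = 1.0 - current_step / max_steps
    return list(obs) + [remaining]


obs = [0.5, -0.2]
start = append_time_feature(obs, current_step=0, max_steps=4)
halfway = append_time_feature(obs, current_step=2, max_steps=4)
end = append_time_feature(obs, current_step=4, max_steps=4)
```

Under these assumptions the appended value is 1.0 at the first step, 0.5 halfway through, and 0.0 at the final step, so the agent can tell how much of the episode is left.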

reset(**kwargs)

Calls the reset() of the wrapped env. Can be overridden to change the returned data.

Return type: Tuple[ndarray | Dict[str, ndarray], Dict[str, Any]]

step(action)

Calls the step() of the wrapped env. Can be overridden to change the returned data.

Parameters: action (ActType) – Action passed to the step() of the wrapped env.
Return type: Tuple[ndarray | Dict[str, ndarray], SupportsFloat, bool, bool, Dict[str, Any]]
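To show how the return shapes of reset() and step() fit together, here is a minimal sketch of a wrapper that augments the observation in both methods. CountingEnv and TimeFeatureSketch are hypothetical stand-ins written in plain Python. They are not the Gymnasium or sb3_contrib classes, and the linear 1.0-to-0.0 time feature is an assumption.

```python
from typing import Any, Dict, List, Tuple


class CountingEnv:
    """Hypothetical toy env that mimics the reset()/step() return shapes."""

    def reset(self, **kwargs: Any) -> Tuple[List[float], Dict[str, Any]]:
        self.state = 0.0
        return [self.state], {}

    def step(
        self, action: float
    ) -> Tuple[List[float], float, bool, bool, Dict[str, Any]]:
        self.state += action
        # Order: observation, reward, terminated, truncated, info
        return [self.state], 1.0, False, False, {}


class TimeFeatureSketch:
    """Illustrative stand-in, not the sb3_contrib TimeFeatureWrapper."""

    def __init__(self, env: CountingEnv, max_steps: int = 1000) -> None:
        self.env = env
        self.max_steps = max_steps
        self.current_step = 0

    def _with_time(self, obs: List[float]) -> List[float]:
        # Assumed feature: 1.0 at episode start, 0.0 after max_steps steps.
        return obs + [1.0 - self.current_step / self.max_steps]

    def reset(self, **kwargs: Any) -> Tuple[List[float], Dict[str, Any]]:
        self.current_step = 0
        obs, info = self.env.reset(**kwargs)
        return self._with_time(obs), info

    def step(
        self, action: float
    ) -> Tuple[List[float], float, bool, bool, Dict[str, Any]]:
        self.current_step += 1
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self._with_time(obs), reward, terminated, truncated, info


env = TimeFeatureSketch(CountingEnv(), max_steps=2)
first_obs, info = env.reset()
next_obs, reward, terminated, truncated, info = env.step(0.5)
```

reset() yields a two-element tuple and step() a five-element tuple. In this sketch only the observation is modified; the reward, both end-of-episode flags, and the info dict pass through unchanged.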