garage.experiment

Experiment functions.
-
class MetaEvaluator(*, test_task_sampler, n_exploration_eps=10, n_test_tasks=None, n_test_episodes=1, prefix='MetaTest', test_task_names=None, worker_class=DefaultWorker, worker_args=None)

Evaluates Meta-RL algorithms on test environments.
- Parameters
test_task_sampler (TaskSampler) – Sampler for test tasks. To demonstrate the effectiveness of a meta-learning method, these should be different from the training tasks.
n_test_tasks (int or None) – Number of test tasks to sample each time evaluation is performed. Note that tasks are sampled “without replacement”. If None, is set to test_task_sampler.n_tasks.
n_exploration_eps (int) – Number of episodes to gather from the exploration policy before requesting the meta algorithm to produce an adapted policy.
n_test_episodes (int) – Number of episodes to use for each adapted policy. The adapted policy should forget previous episodes when .reset() is called.
prefix (str) – Prefix to use when logging. Defaults to MetaTest. For example, this results in logging the key ‘MetaTest/SuccessRate’. If not set to MetaTest, it should probably be set to MetaTrain.
test_task_names (list[str]) – List of task names to test. Should be in an order consistent with the task_id env_info, if that is present.
worker_class (type) – Type of worker the Sampler should use.
worker_args (dict or None) – Additional arguments that should be passed to the worker.
-
evaluate(self, algo, test_episodes_per_task=None)

Evaluate the Meta-RL algorithm on the test tasks.
- Parameters
algo (MetaRLAlgorithm) – The algorithm to evaluate.
test_episodes_per_task (int or None) – Number of episodes per task.
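The control flow of evaluate can be illustrated with a minimal pure-Python sketch: for each test task, gather exploration episodes, ask the algorithm for an adapted policy, then score that policy. All names here (ToyMetaAlgo, run_episode, the dict-based tasks) are hypothetical stand-ins for illustration, not the garage API.

```python
# Hypothetical stand-ins for garage objects; not the real garage API.
class ToyMetaAlgo:
    """Meta-algorithm stub: exposes an exploration policy and adaptation."""
    def get_exploration_policy(self):
        return lambda obs: 0  # always pick action 0 while exploring

    def adapt_policy(self, exploration_episodes):
        # A real algorithm would update from the exploration episodes;
        # here we just return a fixed "adapted" policy.
        return lambda obs: 1

def run_episode(policy, task):
    """Toy episode: reward 1.0 when the policy matches the task's goal action."""
    return 1.0 if policy(obs=None) == task['goal_action'] else 0.0

def meta_evaluate(algo, test_tasks, n_exploration_eps=10, n_test_episodes=1):
    """Simplified mirror of the evaluate control flow described above."""
    returns = []
    for task in test_tasks:
        explore = algo.get_exploration_policy()
        exploration_eps = [run_episode(explore, task)
                           for _ in range(n_exploration_eps)]
        adapted = algo.adapt_policy(exploration_eps)
        returns.extend(run_episode(adapted, task)
                       for _ in range(n_test_episodes))
    return sum(returns) / len(returns)

tasks = [{'goal_action': 1}, {'goal_action': 0}]
avg_return = meta_evaluate(ToyMetaAlgo(), tasks)
```

The toy adapted policy always picks action 1, so it succeeds on exactly one of the two tasks, giving an average return of 0.5.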
-
SnapshotConfig

Named tuple describing a snapshot configuration: snapshot_dir, snapshot_mode, and snapshot_gap.
-
class Snapshotter(snapshot_dir=os.path.join(os.getcwd(), 'data/local/experiment'), snapshot_mode='last', snapshot_gap=1)

Snapshotter snapshots training data.
When training, it saves data to binary files. When resuming, it loads from saved data.
- Parameters
snapshot_dir (str) – Path to save the log and iteration snapshot.
snapshot_mode (str) – Mode to save the snapshot. Can be either “all” (all iterations will be saved), “last” (only the last iteration will be saved), “gap” (every snapshot_gap iterations are saved), “gap_and_last” (save the last iteration as ‘params.pkl’ and save every snapshot_gap iteration separately), “gap_overwrite” (same as gap but overwrites the last saved snapshot), or “none” (do not save snapshots).
snapshot_gap (int) – Gap between snapshot iterations. Wait this number of iterations before taking another snapshot.
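The snapshot_mode options above can be summarized as a small decision function. This is an illustrative sketch of the documented mode semantics, not garage's actual implementation; the file names are assumptions.

```python
def snapshot_files(itr, snapshot_mode, snapshot_gap=1):
    """Return the file names a snapshot of iteration `itr` would be written
    to, following the snapshot_mode semantics documented above. File naming
    ('params.pkl', 'itr_N.pkl') is an assumption for illustration."""
    if snapshot_mode == 'all':
        return [f'itr_{itr}.pkl']                   # every iteration kept
    if snapshot_mode == 'last':
        return ['params.pkl']                       # overwrite a single file
    if snapshot_mode == 'gap':
        return [f'itr_{itr}.pkl'] if itr % snapshot_gap == 0 else []
    if snapshot_mode == 'gap_overwrite':
        return ['params.pkl'] if itr % snapshot_gap == 0 else []
    if snapshot_mode == 'gap_and_last':
        files = ['params.pkl']                      # always refresh the last
        if itr % snapshot_gap == 0:
            files.append(f'itr_{itr}.pkl')          # plus a per-gap copy
        return files
    if snapshot_mode == 'none':
        return []
    raise ValueError(f'Invalid snapshot mode {snapshot_mode!r}')
```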
-
property snapshot_dir(self)

Return the snapshot directory.
- Returns
The snapshot directory.
- Return type
str
-
property snapshot_mode(self)

Return the type of snapshot.
- Returns
The type of snapshot. Can be “all”, “last”, “gap”, “gap_overwrite”, “gap_and_last”, or “none”.
- Return type
str
-
property snapshot_gap(self)

Return the gap between snapshot iterations.
- Returns
The gap between snapshot iterations.
- Return type
int
-
save_itr_params(self, itr, params)

Save the parameters if at the right iteration.
- Parameters
itr (int) – Number of iterations. Used as the index of snapshot.
params (obj) – Content of snapshot to be saved.
- Raises
ValueError – If snapshot_mode is not one of “all”, “last”, “gap”, “gap_overwrite”, “gap_and_last”, or “none”.
-
load(self, load_dir, itr='last')

Load one snapshot of parameters from disk.
- Parameters
load_dir (str) – Directory of the saved snapshot.
itr (int or str) – Iteration to load. Can be an integer, ‘last’, or ‘first’.
- Returns
Loaded snapshot.
- Return type
dict
- Raises
ValueError – If itr is neither an integer nor one of (“last”, “first”).
FileNotFoundError – If the snapshot file is not found in load_dir.
NotAFileError – If the snapshot exists but is not a file.
-
class ConstructEnvsSampler(env_constructors)

Bases:
garage.experiment.task_sampler.TaskSampler
TaskSampler where each task has its own constructor.
Generally, this is used when the different tasks are completely different environments.
- Parameters
env_constructors (list[Callable[Environment]]) – Callables that produce environments (for example, environment types).
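A minimal sketch of the pattern: each task is its own environment constructor. For simplicity this stand-in returns the constructors themselves as the “updates” and cycles through them when more are requested than exist; garage's real sampler returns EnvUpdate objects, so treat the class below as purely illustrative.

```python
import itertools

class ToyConstructEnvsSampler:
    """Simplified stand-in: each task has its own environment constructor."""
    def __init__(self, env_constructors):
        self._env_constructors = env_constructors

    @property
    def n_tasks(self):
        return len(self._env_constructors)

    def sample(self, n_tasks, with_replacement=False):
        # Cycle through the constructors so callers may request more
        # updates than there are constructors (an assumption made for
        # this sketch, not documented garage behavior).
        return list(itertools.islice(
            itertools.cycle(self._env_constructors), n_tasks))

sampler = ToyConstructEnvsSampler([lambda: 'cartpole', lambda: 'pendulum'])
envs = [ctor() for ctor in sampler.sample(4)]
```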
-
property n_tasks(self)

int: the number of tasks.
-
sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.
- Parameters
n_tasks (int) – Number of updates to sample.
with_replacement (bool) – Whether tasks can repeat when sampled.
- Returns
Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.
- Return type
list[EnvUpdate]
-
class EnvPoolSampler(envs)

Bases:
garage.experiment.task_sampler.TaskSampler
TaskSampler that samples from a finite pool of environments.
This can be used with any environments, but is generally best when using in-process samplers with environments that are expensive to construct.
- Parameters
envs (list[Environment]) – List of environments to use as a pool.
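The without-replacement constraint on a finite pool can be illustrated with a simplified sampler. This is a plain-Python stand-in, not the garage class: it hands out the pooled objects directly rather than EnvUpdate objects.

```python
class ToyEnvPoolSampler:
    """Simplified stand-in for sampling from a finite environment pool."""
    def __init__(self, envs):
        self._envs = list(envs)

    @property
    def n_tasks(self):
        return len(self._envs)

    def sample(self, n_tasks, with_replacement=False):
        if with_replacement:
            # A pool of pre-constructed environments cannot hand out the
            # same instance twice, so sampling with replacement is refused.
            raise ValueError('Cannot sample with replacement from a pool '
                             'of pre-constructed environments.')
        if n_tasks > len(self._envs):
            raise ValueError(f'Requested {n_tasks} tasks from a pool of '
                             f'{len(self._envs)} environments.')
        return self._envs[:n_tasks]
```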
-
property n_tasks(self)

int: the number of tasks.
-
sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.
- Parameters
n_tasks (int) – Number of updates to sample.
with_replacement (bool) – Whether tasks can repeat when sampled. Since this cannot be easily implemented for an object pool, setting this to True results in ValueError.
- Raises
ValueError – If the number of requested tasks is larger than the pool, or with_replacement is set.
- Returns
Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.
- Return type
list[EnvUpdate]
-
class MetaWorldTaskSampler(benchmark, kind, wrapper=None, add_env_onehot=False)

Bases:
garage.experiment.task_sampler.TaskSampler
TaskSampler that distributes a Meta-World benchmark across workers.
- Parameters
benchmark (metaworld.Benchmark) – Benchmark to sample tasks from.
kind (str) – Must be either ‘test’ or ‘train’. Determines whether to sample training or test tasks from the Benchmark.
wrapper (Callable[garage.Env, garage.Env] or None) – Wrapper to apply to env instances.
add_env_onehot (bool) – If true, a one-hot representing the current environment name will be added to the environments. Should only be used with multi-task benchmarks.
- Raises
ValueError – If kind is not ‘train’ or ‘test’. Also raised if add_env_onehot is used on a Meta-World meta-learning (not multi-task) benchmark.
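The “multiple of the number of env classes, grouped adjacent” behavior of this sampler's sample method can be sketched as follows, using plain strings in place of Meta-World task objects (the task names are made up for illustration):

```python
def group_tasks(class_names, n_tasks):
    """Distribute n_tasks evenly across env classes, keeping tasks for the
    same class adjacent. Illustrative sketch only; real Meta-World tasks
    are task objects, not strings."""
    if n_tasks % len(class_names) != 0:
        raise ValueError(f'{n_tasks} is not a multiple of '
                         f'{len(class_names)} env classes.')
    per_class = n_tasks // len(class_names)
    return [f'{name}-task{i}'
            for name in class_names
            for i in range(per_class)]
```

For example, sampling 4 tasks over the classes ['reach', 'push'] yields both reach tasks first, then both push tasks, so workers assigned contiguous slices each see a single environment class.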
-
property n_tasks(self)

int: the number of tasks.
-
sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.
Note that this will always return environments in the same order, to make parallel sampling across workers efficient. If randomizing the environment order is required, shuffle the result of this method.
- Parameters
n_tasks (int) – Number of updates to sample. Must be a multiple of the number of env classes in the benchmark (e.g. 1 for MT/ML1, 10 for MT10, 50 for MT50). Tasks for each environment will be grouped to be adjacent to each other.
with_replacement (bool) – Whether tasks can repeat when sampled. Since this cannot be easily implemented for an object pool, setting this to True results in ValueError.
- Raises
ValueError – If the number of requested tasks is not equal to the number of classes or the number of total tasks.
- Returns
Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.
- Return type
list[EnvUpdate]
-
class SetTaskSampler(env_constructor, *, env=None, wrapper=None)

Bases:
garage.experiment.task_sampler.TaskSampler
TaskSampler where the environment can sample “task objects”.
This is used for environments that implement sample_tasks and set_task. For example, HalfCheetahVelEnv, as implemented in Garage.
- Parameters
env_constructor (type) – Type of the environment.
env (Environment or None) – Environment instance to sample from. Constructed from env_constructor if not provided.
wrapper (Callable[Environment, Environment] or None) – Wrapper to apply to env instances.
-
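The sample_tasks/set_task protocol this sampler relies on can be sketched with a toy environment. Except for the protocol methods sample_tasks and set_task named above, every name here (ToyVelEnv, the goal-velocity tasks) is a hypothetical stand-in, not a garage class.

```python
import random

class ToyVelEnv:
    """Toy environment implementing the sample_tasks/set_task protocol."""
    def __init__(self):
        self._goal_velocity = 0.0

    def sample_tasks(self, num_tasks):
        # Each "task object" here is just a goal velocity.
        rng = random.Random(0)  # seeded for reproducibility
        return [rng.uniform(0.0, 2.0) for _ in range(num_tasks)]

    def set_task(self, task):
        self._goal_velocity = task

env = ToyVelEnv()
tasks = env.sample_tasks(3)   # sample "task objects"
env.set_task(tasks[0])        # configure the env with one of them
```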
property n_tasks(self)

int or None: The number of tasks if known and finite.
-
sample(self, n_tasks, with_replacement=False)

Sample a list of environment updates.
- Parameters
n_tasks (int) – Number of updates to sample.
with_replacement (bool) – Whether tasks can repeat when sampled.
- Returns
Batch of sampled environment updates, which, when invoked on environments, will configure them with new tasks. See EnvUpdate for more information.
- Return type
list[EnvUpdate]
-
class TaskSampler

Bases:
abc.ABC
Class for sampling batches of tasks, represented as EnvUpdate objects.
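A concrete sampler implements the abstract interface by overriding sample (and usually n_tasks). The sketch below defines a local mirror of the two members documented here rather than importing garage, and the RangeTaskSampler subclass is a made-up example whose “updates” are plain integers.

```python
import abc

class TaskSamplerSketch(abc.ABC):
    """Local mirror of the TaskSampler interface, for illustration only."""
    @abc.abstractmethod
    def sample(self, n_tasks, with_replacement=False):
        """Sample a list of environment updates."""

    @property
    def n_tasks(self):
        """int or None: The number of tasks if known and finite."""
        return None

class RangeTaskSampler(TaskSamplerSketch):
    """Trivial concrete sampler whose 'updates' are integers 0..total-1."""
    def __init__(self, total):
        self._total = total

    @property
    def n_tasks(self):
        return self._total

    def sample(self, n_tasks, with_replacement=False):
        if not with_replacement and n_tasks > self._total:
            raise ValueError('Not enough tasks to sample without replacement.')
        return [i % self._total for i in range(n_tasks)]
```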
-
abstract
sample
(self, n_tasks, with_replacement=False)¶ Sample a list of environment updates.
- Parameters
- Returns
- Batch of sampled environment updates, which, when
invoked on environments, will configure them with new tasks. See
EnvUpdate
for more information.
- Return type
-
property n_tasks(self)

int or None: The number of tasks if known and finite.