Model Card

Model Details

Model Description

The pretrained world models released here are the result of our paper "Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models", published at NeurIPS 2023.

We release the models so the community can replicate the results in our paper, and we encourage people to study the transferability of the pretrained world models. The models can be accessed here.

  • Developed by: Xingyuan Zhang during his Ph.D. at the Machine Learning Research Lab at Volkswagen AG.
  • Model type: latent-variable world model (RSSM).
  • License: © 2023. This work is licensed under a CC BY 4.0 license.

Model Sources

Uses

Direct Use

The model can:

  1. estimate the states of the environment from observations and actions, i.e. $s_{1:t} \sim q(s_{1:t} | o_{1:t}, a_{0:t-1})$;
  2. predict future states and observations from a given state and an action sequence, i.e. $s_{1:t} \sim p(s_{1:t} | s_0, a_{0:t-1})$ and $o_t \sim p(o_t | s_t)$;
  3. evaluate a lower bound on the log-likelihood of an observation sequence given an action sequence, i.e. $\log p(o_{1:t} | a_{0:t-1})$ (a usage sketch follows this list).
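For RSSM-style models, the bound in 3 is the standard evidence lower bound (ELBO):

$$\log p(o_{1:t} \mid a_{0:t-1}) \ge \sum_{\tau=1}^{t} \mathbb{E}_{q}\big[\log p(o_\tau \mid s_\tau)\big] - \sum_{\tau=1}^{t} \mathbb{E}_{q}\big[\mathrm{KL}\big(q(s_\tau \mid o_{1:\tau}, a_{0:\tau-1}) \,\|\, p(s_\tau \mid s_{\tau-1}, a_{\tau-1})\big)\big]$$

Below is a minimal sketch of how the three abilities might be invoked, assuming `model` has been loaded as described in "How to Get Started with the Model" below. The method names `filter`, `rollout`, and `elbo` are assumptions for illustration only, not the actual API of this repository; please refer to the code for the real interfaces.

```python
import torch

# Dummy inputs; shapes are illustrative only.
obs = torch.randn(10, 3, 64, 64)  # o_{1:t}: t = 10 observations (64x64 RGB)
actions = torch.randn(10, 6)      # a_{0:t-1}: the corresponding actions

# 1. State estimation: s_{1:t} ~ q(s_{1:t} | o_{1:t}, a_{0:t-1})
states = model.filter(obs, actions)  # hypothetical method name

# 2. Prediction: s_{1:t} ~ p(s_{1:t} | s_0, a_{0:t-1}) and o_t ~ p(o_t | s_t)
future_states, future_obs = model.rollout(states[-1], actions)  # hypothetical

# 3. Lower bound on log p(o_{1:t} | a_{0:t-1})
elbo = model.elbo(obs, actions)  # hypothetical method name
```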

Downstream Use

Each of these abilities enables a different application:

  • with 1, the model provides a pretrained representation for the task you want to tackle;
  • with 2, the model can serve as a virtual environment that replaces the real environment for interaction, which is typically useful for model-based reinforcement learning (see the sketch after this list);
  • with 3, the model can be used to infer actions from observations with the AIME algorithm proposed in our paper.
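As an illustration of the second use, the sketch below treats the world model as a virtual environment inside a simple random-shooting planner. This is a generic model-based RL recipe, not the method of our paper, and `model.rollout` and `reward_fn` are assumed interfaces:

```python
import torch

def plan_first_action(model, state, reward_fn, horizon=12,
                      num_candidates=256, action_dim=6):
    """Random-shooting planner: sample candidate action sequences, imagine
    their outcomes with the world model, and return the first action of the
    best sequence. All model interfaces here are hypothetical."""
    best_return, best_action = -float("inf"), None
    for _ in range(num_candidates):
        action_seq = torch.randn(horizon, action_dim)  # candidate plan
        imagined_states, _ = model.rollout(state, action_seq)  # imagined rollout
        total = sum(reward_fn(s) for s in imagined_states)  # imagined return
        if total > best_return:
            best_return, best_action = total, action_seq[0]
    return best_action
```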

Out-of-Scope Use

  • Although the model can predict future observations by rolling out latent states, it should not be used for high-quality video generation.
  • The model is not expected to work well on embodiments other than the one it was trained on without fine-tuning.

How to Get Started with the Model

Use the code below to get started with the model:

```python
from aime.utils import load_pretrained_model

model_root = ...  # path to the folder containing the downloaded pretrained model

model = load_pretrained_model(model_root)
```

Training Details

Training Data

The datasets we used to train the models are also released here. For more details about the datasets, please check out the data card.

Training Procedure

The model is pretrained on each dataset by running the train_model_only.py script. The walker models are trained with the RSSM architecture, while the cheetah models are trained with the RSSMO architecture; you can find their implementation details in the code. For example, the walker-mix-visual model can be reproduced with:
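```sh
python scripts/train_model_only.py env=walker environment_setup=visual embodiment_dataset_name=walker-mix world_model=rssm
```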

Training Hyperparameters

Please check out the general config here and the model configs for RSSM and RSSMO.

Citation

If you find the models useful, please cite our paper:

```bibtex
@inproceedings{zhang2023aime,
  title={Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models},
  author={Xingyuan Zhang and Philip Becker-Ehmck and Patrick van der Smagt and Maximilian Karl},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=WjlCQxpuxU}
}
```

Model Card Authors and Contact

Xingyuan Zhang, wizardicarus@gmail.com.