First commit
Jing Yuan Thong committed Jun 19, 2024
0 parents · commit fae6a5b
Showing 16 changed files with 1,684 additions and 0 deletions.
162 changes: 162 additions & 0 deletions .gitignore
@@ -0,0 +1,162 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Panasonic Connect

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
27 changes: 27 additions & 0 deletions README.md
@@ -0,0 +1,27 @@
# RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents
This repository is the implementation of [RAP](https://arxiv.org/abs/2402.03610).

![RAP figure](./figure/figure.png)

# Get Started
Please refer to the following READMEs for each benchmark:
* [ALFWorld](./alfworld/README.md)
* [WebShop](./webshop/README.md)
* Franka Kitchen
* MetaWorld

# Citation
If you find RAP helpful in your research, please consider citing it:
```bibtex
@misc{kagaya2024rap,
  title={RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents},
  author={Tomoyuki Kagaya and Thong Jing Yuan and Yuxuan Lou and Jayashree Karlekar and Sugiri Pranata and Akira Kinose and Koki Oguri and Felix Wick and Yang You},
  year={2024},
  eprint={2402.03610},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```

# License
MIT license
25 changes: 25 additions & 0 deletions alfworld/README.md
@@ -0,0 +1,25 @@
# Get Started
1. Move to the `alfworld` directory and install the requirements
```bash
cd ./alfworld
pip install -r requirements.txt
```

2. Following the [instructions](https://github.com/alfworld/alfworld?tab=readme-ov-file#quickstart), download the data and set the environment variables
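A typical setup, assuming the quickstart's `alfworld-download` script and a storage path of your choice, might look like:
```bash
export ALFWORLD_DATA=~/alfworld-data   # any storage path works
alfworld-download                      # fetch the ALFWorld data
```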

3. Prepare an OpenAI API key and put it in `OpenAI_api_key.txt`
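For instance (any way of writing the file works; keep it out of version control):
```bash
echo "sk-..." > OpenAI_api_key.txt   # replace sk-... with your actual API key
```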

4. Run RAP on ALFWorld
```bash
python main.py
```

The following are the hyperparameters for RAP:

* `num_trials`: Number of recursive trials. Default is 3.
* `num_steps`: Maximum number of steps per task. Default is 50.
* `model`: Model to be used in evaluation. Default is `gpt-3.5-turbo-instruct`.
* `output`: Folder path for output logs and memory.
* `emb_model`: Embedding model to be used in evaluation. Default is `sentence-transformers/all-MiniLM-L6-v2`.

Also, Python 3.11 is recommended for evaluation.
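As an illustration, a run that overrides the defaults might look like the following. The flag names assume the hyperparameters above are exposed as command-line options of `main.py`, which is an assumption here; check `python main.py --help` for the actual interface.
```bash
# Hypothetical flags, mirroring the hyperparameters listed above.
python main.py \
  --num_trials 3 \
  --num_steps 50 \
  --model gpt-3.5-turbo-instruct \
  --emb_model sentence-transformers/all-MiniLM-L6-v2 \
  --output ./logs
```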
145 changes: 145 additions & 0 deletions alfworld/configs/base_config.yaml
@@ -0,0 +1,145 @@
dataset:
  data_path: '$ALFWORLD_DATA/json_2.1.1/train'
  eval_id_data_path: '$ALFWORLD_DATA/json_2.1.1/valid_seen' # null/None to disable
  eval_ood_data_path: '$ALFWORLD_DATA/json_2.1.1/valid_unseen' # null/None to disable
  num_train_games: -1 # max training games (<=0 indicates full dataset)
  num_eval_games: -1 # max evaluation games (<=0 indicates full dataset)

logic:
  domain: '$ALFWORLD_DATA/logic/alfred.pddl' # PDDL domain file that defines the world dynamics
  grammar: '$ALFWORLD_DATA/logic/alfred.twl2' # Grammar file that defines the text feedbacks

env:
  type: 'AlfredTWEnv' # 'AlfredTWEnv' or 'AlfredThorEnv' or 'AlfredHybrid'
  regen_game_files: False # check if game is solvable by expert and save to game.tw-pddl file
  domain_randomization: False # shuffle Textworld print order and object id nums
  task_types: [1, 2, 3, 4, 5, 6] # task-type ids: 1 - Pick & Place, 2 - Examine in Light, 3 - Clean & Place, 4 - Heat & Place, 5 - Cool & Place, 6 - Pick Two & Place
  expert_timeout_steps: 150 # max steps before timeout for expert to solve the task
  expert_type: "handcoded" # 'handcoded' or 'downward'. Note: the downward planner is very slow for real-time use
  goal_desc_human_anns_prob: 0.0 # prob of using human-annotated goal language instead of templated goals (1.0 indicates all human annotations from ALFRED)

  hybrid:
    start_eps: 100000 # starting episode of hybrid training, tw-only training up to this point
    thor_prob: 0.5 # prob of AlfredThorEnv during hybrid training
    eval_mode: "tw" # 'tw' or 'thor' - env used for evaluation during hybrid training

  thor:
    screen_width: 300 # width of THOR window
    screen_height: 300 # height of THOR window
    smooth_nav: False # smooth rotations, looks, and translations during navigation (very slow)
    save_frames_to_disk: False # save frame PNGs to disk (useful for making videos)
    save_frames_path: './videos/' # path to save frame PNGs

controller:
  type: 'oracle' # 'oracle' or 'oracle_astar' or 'mrcnn' or 'mrcnn_astar' (aka BUTLER)
  debug: False
  load_receps: True # load receptacle locations from precomputed dict (if available)

mask_rcnn:
  pretrained_model_path: '$ALFWORLD_DATA/detectors/mrcnn.pth'

general:
  random_seed: 42
  use_cuda: True # disable this when running on a machine without CUDA
  visdom: False # plot training/eval curves, run with visdom server
  task: 'alfred'
  training_method: 'dagger' # 'dqn' or 'dagger'
  save_path: './training/' # path to save pytorch models
  observation_pool_capacity: 3 # k-size queue, 0 indicates no observation
  hide_init_receptacles: False # remove initial observation containing navigable receptacles

  training:
    batch_size: 10
    max_episode: 50000
    smoothing_eps: 0.1
    optimizer:
      learning_rate: 0.001
      clip_grad_norm: 5

  evaluate:
    run_eval: True
    batch_size: 10
    env:
      type: "AlfredTWEnv"

  checkpoint:
    report_frequency: 1000 # report every N episodes
    experiment_tag: 'test' # name of experiment
    load_pretrained: False # during test, enable this so that the agent loads your pretrained model
    load_from_tag: 'not loading anything' # name of pre-trained model to load in save_path

  model:
    encoder_layers: 1
    decoder_layers: 1
    encoder_conv_num: 5
    block_hidden_dim: 64
    n_heads: 1
    dropout: 0.1
    block_dropout: 0.1
    recurrent: True

rl:
  action_space: "admissible" # 'admissible' (candidates from text engine) or 'generation' (seq2seq-style generation) or 'beam_search_choice' or 'exhaustive' (not working)
  max_target_length: 20 # max token length for seq2seq generation
  beam_width: 10 # 1 means greedy
  generate_top_k: 3

  training:
    max_nb_steps_per_episode: 50 # terminate after this many steps
    learn_start_from_this_episode: 0 # delay updates until this episode
    target_net_update_frequency: 500 # sync target net with online net per this many epochs

  replay:
    accumulate_reward_from_final: True
    count_reward_lambda: 0.0 # 0 to disable
    novel_object_reward_lambda: 0.0 # 0 to disable
    discount_gamma_game_reward: 0.9
    discount_gamma_count_reward: 0.5
    discount_gamma_novel_object_reward: 0.5
    replay_memory_capacity: 500000 # adjust this depending on your RAM size
    replay_memory_priority_fraction: 0.5
    update_per_k_game_steps: 5
    replay_batch_size: 64
    multi_step: 3
    replay_sample_history_length: 4
    replay_sample_update_from: 2

  epsilon_greedy:
    noisy_net: False # if this is true, then epsilon greedy is disabled
    epsilon_anneal_episodes: 1000 # -1 if not annealing
    epsilon_anneal_from: 0.3
    epsilon_anneal_to: 0.1

dagger:
  action_space: "generation" # 'admissible' (candidates from text engine) or 'generation' (seq2seq-style generation) or 'exhaustive' (not working)
  max_target_length: 20 # max token length for seq2seq generation
  beam_width: 10 # 1 means greedy
  generate_top_k: 5
  unstick_by_beam_search: False # use beam-search for failed actions, set True during evaluation

  training:
    max_nb_steps_per_episode: 50 # terminate after this many steps

  fraction_assist:
    fraction_assist_anneal_episodes: 50000
    fraction_assist_anneal_from: 1.0
    fraction_assist_anneal_to: 0.01

  fraction_random:
    fraction_random_anneal_episodes: 0
    fraction_random_anneal_from: 0.0
    fraction_random_anneal_to: 0.0

  replay:
    replay_memory_capacity: 500000
    update_per_k_game_steps: 5
    replay_batch_size: 64
    replay_sample_history_length: 4
    replay_sample_update_from: 2

vision_dagger:
  model_type: "resnet" # 'resnet' (whole image features) or 'maskrcnn_whole' (whole image MaskRCNN feats) or 'maskrcnn' (top k MaskRCNN detection feats) or 'no_vision' (zero vision input)
  resnet_fc_dim: 64
  maskrcnn_top_k_boxes: 10 # top k box features
  use_exploration_frame_feats: False # append feats from initial exploration (memory intensive!)
  sequence_aggregation_method: "average" # 'sum' or 'average' or 'rnn'
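For reference, here is a minimal sketch of loading this config in Python with the `$ALFWORLD_DATA` references in path values expanded. This is an illustration, not the repository's actual loading code, and assumes PyYAML is available:
```python
# Illustrative sketch only; the repository's real config loader may differ.
import os
import yaml  # PyYAML


def expand_env_vars(node):
    """Recursively expand $VAR references (e.g. $ALFWORLD_DATA) in string values."""
    if isinstance(node, dict):
        return {key: expand_env_vars(value) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_env_vars(item) for item in node]
    if isinstance(node, str):
        return os.path.expandvars(node)
    return node


with open("alfworld/configs/base_config.yaml") as f:
    config = expand_env_vars(yaml.safe_load(f))

print(config["dataset"]["data_path"])  # e.g. /home/user/alfworld-data/json_2.1.1/train
```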