Think Global, Act Local: Dual-scale GraphTransformer for Vision-and-Language Navigation
This repository is the official implementation of Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation.
Requirements
- Install the Matterport3D simulator: follow the instructions here. We use the latest version rather than v0.1.
export PYTHONPATH=Matterport3DSimulator/build:$PYTHONPATH
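As a quick sanity check (a minimal sketch, not part of this repository; it assumes the simulator build produces a `MatterSim` Python module in `Matterport3DSimulator/build`, matching the export above), you can verify that Python can locate the simulator module:

```python
import importlib.util
import sys


def simulator_available(build_dir="Matterport3DSimulator/build"):
    """Return True if the MatterSim module can be found after adding build_dir to the path."""
    sys.path.append(build_dir)
    return importlib.util.find_spec("MatterSim") is not None


if __name__ == "__main__":
    if simulator_available():
        print("MatterSim found")
    else:
        print("MatterSim not found: check the simulator build and PYTHONPATH")
```

If the module is not found, rebuild the simulator and confirm the build directory on `PYTHONPATH`.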
- Install requirements:
conda create --name vlnduet python=3.8.5
conda activate vlnduet
pip install -r requirements.txt
- Download data from Dropbox, including processed annotations, features and pretrained models. Put the data in the `datasets` directory.
- Download the pretrained LXMERT model:
mkdir -p datasets/pretrained
wget https://nlp.cs.unc.edu/data/model_LXRT.pth -P datasets/pretrained
Pretraining
Combine behavior cloning and auxiliary proxy tasks in pretraining:
cd pretrain_src
bash run_reverie.sh # (run_soon.sh, run_r2r.sh)
Fine-tuning & Evaluation
Fine-tune and evaluate the pretrained model:
cd map_nav_src
bash scripts/run_reverie.sh # (run_soon.sh, run_r2r.sh)
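Evaluation in Vision-and-Language Navigation is typically reported with success rate and SPL (Success weighted by Path Length). The repository's scripts compute these for you; the sketch below is illustrative only (not taken from this codebase) and shows how SPL averages success weighted by path efficiency over episodes:

```python
def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length.

    successes        -- 1 if the episode succeeded, else 0
    shortest_lengths -- shortest-path distance from start to goal per episode
    path_lengths     -- length of the path the agent actually took per episode
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        # A successful episode contributes l / max(p, l): 1.0 for an optimal
        # path, less for a longer one; failures contribute 0.
        total += s * l / max(p, l)
    return total / len(successes)
```

For example, one optimal success and one failure over two episodes yields an SPL of 0.5.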
Examples
Video examples can be found here.