
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation

This repository is the official implementation of Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation.

Requirements

  1. Install the Matterport3D simulator: follow the instructions here. We use the latest version instead of v0.1. Make sure the built simulator is on your Python path (see the sanity check after this list):
export PYTHONPATH=Matterport3DSimulator/build:$PYTHONPATH
  2. Install requirements:
conda create --name vlnduet python=3.8.5
conda activate vlnduet
pip install -r requirements.txt
  3. Download data from Dropbox, including processed annotations, features, and pretrained models. Put the data in the `datasets` directory.

  4. Download the pretrained LXMERT model:

mkdir -p datasets/pretrained 
wget https://nlp.cs.unc.edu/data/model_LXRT.pth -P datasets/pretrained
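If step 1 succeeded, a short sanity check such as the following should run without errors. This is a minimal sketch, not part of the repository; the camera settings are arbitrary.

```python
# Sanity check: confirm the Matterport3D simulator build is importable.
# Requires PYTHONPATH to include Matterport3DSimulator/build (see step 1).
import MatterSim

sim = MatterSim.Simulator()
sim.setCameraResolution(640, 480)   # arbitrary width/height in pixels
sim.setCameraVFOV(1.0)              # vertical field of view in radians
sim.setRenderingEnabled(False)      # skip RGB rendering for a quick check
sim.initialize()
print("MatterSim imported and initialized.")
```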

Pretraining

Combine behavior cloning and auxiliary proxy tasks in pretraining:

cd pretrain_src
bash run_reverie.sh # (run_soon.sh, run_r2r.sh)
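Concretely, the pretraining objective sums an action-prediction (behavior cloning) loss with weighted auxiliary proxy-task losses such as masked language modeling. The sketch below is illustrative only: the model interface, batch keys, and loss weight are hypothetical, and the actual proxy tasks live in pretrain_src.

```python
import torch.nn.functional as F

def pretraining_loss(model, batch, mlm_weight=1.0):
    # Behavior cloning: imitate the demonstrated next action.
    action_logits = model.predict_action(batch["obs"], batch["instr"])
    bc_loss = F.cross_entropy(action_logits, batch["gt_action"])

    # Auxiliary proxy task: masked language modeling over instruction tokens.
    mlm_logits = model.predict_masked_tokens(batch["masked_instr"])
    mlm_loss = F.cross_entropy(
        mlm_logits.view(-1, mlm_logits.size(-1)),
        batch["mlm_labels"].view(-1),
        ignore_index=-100,   # unmasked positions carry no label
    )
    return bc_loss + mlm_weight * mlm_loss
```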

Fine-tuning & Evaluation

Fine-tune the pretrained model with pseudo-interactive demonstration, then evaluate on the downstream benchmarks:

cd map_nav_src
bash scripts/run_reverie.sh # (run_soon.sh, run_r2r.sh)
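For intuition, fine-tuning lets the agent follow its own sampled actions while taking supervision from a pseudo demonstrator that labels each visited state with a good next action. The sketch below is a generic DAgger-style loop with placeholder names (agent, env, shortest_path_action); it is not the repository's API.

```python
import torch
import torch.nn.functional as F

def rollout_loss(agent, env, max_steps=15):
    """One student-forced rollout supervised by a pseudo demonstrator."""
    obs = env.reset()
    losses = []
    for _ in range(max_steps):
        logits = agent(obs)                     # scores over next actions
        target = env.shortest_path_action(obs)  # pseudo demonstrator label
        losses.append(F.cross_entropy(logits.unsqueeze(0),
                                      torch.tensor([target])))
        action = torch.multinomial(logits.softmax(-1), 1).item()
        obs, done = env.step(action)            # follow the agent's choice
        if done:
            break
    return torch.stack(losses).mean()
```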

Examples

Video examples can be found here.