update readme

Yicong Hong 2021-01-14 16:35:46 +11:00
parent d95d8943f0
commit e46f8dcb61


@@ -1,4 +1,4 @@
-# Recurrent-VLN-BERT
+# Recurrent VLN-BERT
 Code of the Recurrent-VLN-BERT paper:
 **A Recurrent Vision-and-Language BERT for Navigation**<br>
@@ -53,45 +53,28 @@ Our code is based on the code structure of the [EnvDrop](https://github.com/airsplay/R2R-EnvDrop).
 To replicate the performance reported in our paper, load the trained network weights and run validation:
 ```bash
-bash run/agent.bash
+bash run/test_agent.bash
 ```
+You can simply switch between the OSCAR-based and the PREVALENT-based VLN models by changing the arguments `vlnmodel` (oscar or prevalent) and `load` (trained model paths).
 ### Training
 #### Navigator
-To train the network from scratch, first train a Navigator on the R2R training split:
-Modify `run/agent.bash`, remove the argument for `--load` and set `--train listener`. Then,
+To train the network from scratch, simply run:
 ```bash
-bash run/agent.bash
+bash run/train_agent.bash
 ```
 The trained Navigator will be saved under `snap/`.
-#### Speaker
-You also need to train a [Speaker](https://github.com/airsplay/R2R-EnvDrop) for augmented training:
-```bash
-bash run/speak.bash
-```
-The trained Speaker will be saved under `snap/`.
-#### Augmented Navigator
-Finally, keep training the Navigator with the mixture of original data and [augmented data](http://www.cs.unc.edu/~airsplay/aug_paths.json):
-```bash
-bash run/bt_envdrop.bash
-```
-We apply a one-step learning rate decay to 1e-5 when training saturates.
 ## Citation
-If you use or discuss our Entity Relationship Graph, please cite our paper:
+If you use or discuss our Recurrent VLN-BERT, please cite our paper:
 ```
-@article{hong2020language,
-  title={Language and Visual Entity Relationship Graph for Agent Navigation},
-  author={Hong, Yicong and Rodriguez, Cristian and Qi, Yuankai and Wu, Qi and Gould, Stephen},
-  journal={Advances in Neural Information Processing Systems},
-  volume={33},
+@article{hong2020recurrent,
+  title={A Recurrent Vision-and-Language BERT for Navigation},
+  author={Hong, Yicong and Wu, Qi and Qi, Yuankai and Rodriguez-Opazo, Cristian and Gould, Stephen},
+  journal={arXiv preprint arXiv:2011.13922},
   year={2020}
 }
 ```
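
The line added above notes that you can switch between the OSCAR-based and the PREVALENT-based VLN models via the `vlnmodel` and `load` arguments. A minimal sketch of how those two arguments might be supplied is shown below; whether `run/test_agent.bash` forwards command-line flags (rather than hard-coding them inside the script) is an assumption, and the checkpoint path is a placeholder, not a path from the repository.

```bash
# Sketch only: the two arguments named in the README line about model switching.
# Assumes run/test_agent.bash passes extra flags through to the underlying
# evaluation script; otherwise, edit the same two values inside the script.
VLNMODEL=prevalent                                 # or: oscar
LOAD="snap/<your-trained-model>/best_val_unseen"   # placeholder path to trained weights

bash run/test_agent.bash --vlnmodel "$VLNMODEL" --load "$LOAD"
```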