VALAN, short for Vision and Language Agent Navigation, is a scalable reinforcement learning framework for developing embodied agents, particularly for Vision-Language Navigation (VLN) problems. VLN tasks require an agent to interpret natural language instructions and navigate a photo-realistic environment to reach a prescribed goal. VALAN is designed to be the common learning infrastructure for all VLN problems. It is based on IMPALA (https://arxiv.org/abs/1802.01561) and uses IMPALA's off-policy correction method, V-trace, for reinforcement learning.
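
For readers unfamiliar with V-trace, the following is a minimal NumPy sketch of the V-trace value targets as defined in the IMPALA paper, for a single trajectory. It is an illustration only and not VALAN's implementation: the function name `vtrace_targets`, its parameter names, and the default clipping thresholds are assumptions made for this example.

```python
import numpy as np


def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, discount=0.99, rho_bar=1.0, c_bar=1.0):
    """Illustrative V-trace targets for one trajectory of length T.

    All names here are hypothetical; this is a sketch, not VALAN's API.
    behaviour_logp / target_logp: log-probabilities of the taken actions
    under the behaviour policy (actor) and target policy (learner).
    values: V(x_t) for t = 0..T-1; bootstrap_value: V(x_T).
    """
    behaviour_logp = np.asarray(behaviour_logp, dtype=float)
    target_logp = np.asarray(target_logp, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)

    # Importance weights pi/mu, truncated at rho_bar and c_bar.
    ratios = np.exp(target_logp - behaviour_logp)
    rhos = np.minimum(rho_bar, ratios)
    cs = np.minimum(c_bar, ratios)

    # Temporal differences weighted by the truncated importance weights.
    next_values = np.append(values[1:], bootstrap_value)
    deltas = rhos * (rewards + discount * next_values - values)

    # Backward recursion:
    # v_t - V(x_t) = delta_t + discount * c_t * (v_{t+1} - V(x_{t+1})).
    corrections = np.zeros_like(values)
    acc = 0.0
    for t in reversed(range(len(values))):
        acc = deltas[t] + discount * cs[t] * acc
        corrections[t] = acc
    return values + corrections
```

When the behaviour and target policies agree (all importance weights equal 1), these targets reduce to the usual n-step bootstrapped returns, which makes a convenient sanity check for the sketch.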
This is not an official Google product.