Abstract: We consider the problem of Vision-and-Language Navigation (VLN). The majority of current methods for VLN are trained end-to-end using either unstructured memory such as LSTM, or using ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results