Abstract: Since the rise of vision-language navigation (VLN), great progress has been made in instruction following: building a follower that navigates environments under the guidance of instructions.