Vision and Language Navigation

Papers


Showing 201–223 of 223 papers

Title	Date	Tasks	Status
Narrowing the Gap between Vision and Action in Navigation	Aug 19, 2024	Decoder, Spatial Reasoning	Code Available
GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation	May 26, 2023	Vision and Language Navigation	Code Available
Augmented Commonsense Knowledge for Remote Object Grounding	Jun 3, 2024	Decision Making, Object	Code Available
MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation	Mar 2, 2023	Navigate, Vision and Language Navigation	Code Available
A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues	Jul 24, 2022	cross-modal alignment, Trajectory Planning	Code Available
ULN: Towards Underspecified Vision-and-Language Navigation	Oct 18, 2022	Vision and Language Navigation	Code Available
LOViS: Learning Orientation and Visual Signals for Vision and Language Navigation	Sep 26, 2022	Spatial Reasoning, Vision and Language Navigation	Code Available
Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation	Mar 21, 2018	Deep Reinforcement Learning, model	Code Available
Local Slot Attention for Vision-and-Language Navigation	Jun 17, 2022	Navigate, Vision and Language Navigation	Code Available
VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation	Aug 20, 2023	Transfer Learning, Vision and Language Navigation	Code Available
FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation	Jun 9, 2022	Vision and Language Navigation	Code Available
Explicit Object Relation Alignment for Vision and Language Navigation	May 1, 2022	Object, Relation	Code Available
Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation	Sep 9, 2024	Vision and Language Navigation	Code Available
Speaker-Follower Models for Vision-and-Language Navigation	Jun 7, 2018	Data Augmentation, Vision and Language Navigation	Code Available
Chasing Ghosts: Instruction Following as Bayesian State Tracking	Jul 3, 2019	Instruction Following, Vision and Language Navigation	Code Available
Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters	Jul 5, 2019	Vision and Language Navigation	Code Available
A Navigation Framework Utilizing Vision-Language Models	Jun 11, 2025	Navigate, Prompt Engineering	Code Available
Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation	Jul 25, 2023	Vision and Language Navigation	Code Available
Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation	Mar 6, 2019	Vision and Language Navigation, Vision-Language Navigation	Code Available
Diagnosing Vision-and-Language Navigation: What Really Matters	Mar 30, 2021	Diagnostic, Object	Code Available
Behavioral Analysis of Vision-and-Language Navigation Agents	Jul 20, 2023	Vision and Language Navigation	Code Available
DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning	Apr 2, 2024	Contrastive Learning, Decision Making	Code Available
Into the Unknown: Generating Geospatial Descriptions for New Environments	Jun 28, 2024	Language Modelling, Large Language Model	Code Available



Datasets: VLN Challenge, Touchdown Dataset, RxR, map2seq, Room2Room, robo-vln

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	human	success	0.86	—	Unverified
2	Lily	success	0.79	—	Unverified
3	Airbert	success	0.78	—	Unverified
4	explore@40 beam-search	success	0.74	—	Unverified
5	Global Normalization	success	0.74	—	Unverified
6	VLN-Bert	success	0.73	—	Unverified
7	BEVBert	success	0.73	—	Unverified
8	GMap	success	0.73	—	Unverified
9	Global Normalization pre-explore	success	0.73	—	Unverified
10	FOAM-Beam Search	success	0.72	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	FLAME	Task Completion (TC)	40.2	—	Unverified
2	ORAR + junction type + heading delta	Task Completion (TC)	29.1	—	Unverified
3	ORAR	Task Completion (TC)	24.2	—	Unverified
4	ARC + L2STOP	Task Completion (TC)	16.68	—	Unverified
5	VLN Transformer +M-50 +style	Task Completion (TC)	16.2	—	Unverified
6	VLN Transformer	Task Completion (TC)	14.9	—	Unverified
7	ARC	Task Completion (TC)	14.13	—	Unverified
8	Retouch-RConcat	Task Completion (TC)	12.8	—	Unverified
9	Gated Attention (GA)	Task Completion (TC)	11.9	—	Unverified
10	RConcat	Task Completion (TC)	11.8	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MARVAL	ndtw	66.76	—	Unverified
2	EnvEdit-PT	ndtw	64.61	—	Unverified
3	HAMT	ndtw	59.94	—	Unverified
4	CLEAR-CLIP	ndtw	53.69	—	Unverified
5	Monolingual Baseline	ndtw	41.05	—	Unverified
6	Multilingual Baseline	ndtw	36.81	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	FLAME	Task Completion (TC)	52.44	—	Unverified
2	ORAR + junction type + heading delta	Task Completion (TC)	46.7	—	Unverified
3	ORAR	Task Completion (TC)	45.1	—	Unverified
4	Gated Attention	Task Completion (TC)	17	—	Unverified
5	RConcat	Task Completion (TC)	14.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	R2R+EnvDrop	spl	0.61	—	Unverified
2	RCM + SIL	spl	0.59	—	Unverified
3	Tactical Rewind - short	spl	0.41	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Hierarchical Cross-Modal Agent	SPL (Success Weighted by Path Length)	0.4	—	Unverified