SOTAVerified

Autonomous Driving

Autonomous driving is the task of driving a vehicle without human intervention.

Many of the state-of-the-art results can be found at more general task pages such as 3D Object Detection and Semantic Segmentation.

(Image credit: Exploring the Limitations of Behavior Cloning for Autonomous Driving)

Papers

Showing 451-500 of 6092 papers

Title | Status | Hype
ChatStitch: Visualizing Through Structures via Surround-View Unsupervised Deep Image Stitching with Collaborative LLM-Agents | - | 0
GASP: Unifying Geometric and Semantic Self-Supervised Pre-training for Autonomous Driving | - | 0
DRoPE: Directional Rotary Position Embedding for Efficient Agent Interaction Modeling | - | 0
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models | - | 0
SemanticFlow: A Self-Supervised Framework for Joint Scene Flow Prediction and Instance Segmentation in Dynamic Environments | - | 0
CP-NCBF: A Conformal Prediction-based Approach to Synthesize Verified Neural Control Barrier Functions | - | 0
SuperPC: A Single Diffusion Model for Point Cloud Completion, Upsampling, Denoising, and Colorization | - | 0
Driving behavior recognition via self-discovery learning | - | 0
ChatBEV: A Visual Language Model that Understands BEV Maps | - | 0
PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds | Code | 0
SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model | Code | 1
RAD: Retrieval-Augmented Decision-Making of Meta-Actions with Vision-Language Models in Autonomous Driving | - | 0
Robust3D-CIL: Robust Class-Incremental Learning for 3D Perception | - | 0
Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning | Code | 2
Advances in 4D Generation: A Survey | Code | 2
MamBEV: Enabling State Space Models to Learn Birds-Eye-View Representations | Code | 1
Tracking Meets Large Multimodal Models for Driving Scenario Understanding | Code | 1
TriLiteNet: Lightweight Model for Multi-Task Visual Perception | Code | 1
SparseAlign: A Fully Sparse Framework for Cooperative Object Detection | - | 0
Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey | - | 0
SAM2 for Image and Video Segmentation: A Comprehensive Survey | - | 0
OptiPMB: Enhancing 3D Multi-Object Tracking with Optimized Poisson Multi-Bernoulli Filtering | - | 0
AugMapNet: Improving Spatial Latent Structure via BEV Grid Augmentation for Enhanced Vectorized Online HD Map Construction | - | 0
GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching | Code | 2
A Comprehensive Survey on Multi-Agent Cooperative Decision-Making: Scenarios, Approaches, Challenges and Perspectives | - | 0
L2COcc: Lightweight Camera-Centric Semantic Scene Completion via Distillation of LiDAR Model | - | 0
Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding | Code | 1
Point Cloud Based Scene Segmentation: A Survey | - | 0
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey | Code | 4
Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training | Code | 1
Bench2FreeAD: A Benchmark for Vision-based End-to-end Navigation in Unstructured Robotic Environments | Code | 1
DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving | - | 0
3D Gaussian Splatting against Moving Objects for High-Fidelity Street Scene Reconstruction | Code | 1
Industrial-Grade Sensor Simulation via Gaussian Splatting: A Modular Framework for Scalable Editing and Full-Stack Validation | - | 0
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation | Code | 1
Active Learning from Scene Embeddings for End-to-End Autonomous Driving | - | 0
DynRsl-VLM: Enhancing Autonomous Driving Perception with Dynamic Resolution Vision-Language Models | - | 0
BEVDiffLoc: End-to-End LiDAR Global Localization in BEV View based on Diffusion Model | Code | 1
Centaur: Robust End-to-End Autonomous Driving with Test-Time Training | - | 0
A Framework for a Capability-driven Evaluation of Scenario Understanding for Multimodal Large Language Models in Autonomous Driving | - | 0
Learning-Based MPC for Fuel Efficient Control of Autonomous Vehicles with Discrete Gear Selection | Code | 0
Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM | Code | 1
TAIJI: Textual Anchoring for Immunizing Jailbreak Images in Vision Language Models | - | 0
OCCUQ: Exploring Efficient Uncertainty Quantification for 3D Occupancy Prediction | Code | 1
Mamba-VA: A Mamba-based Approach for Continuous Emotion Recognition in Valence-Arousal Space | Code | 0
Unlock the Power of Unlabeled Data in Language Driving Model | - | 0
MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction | - | 0
TGP: Two-modal occupancy prediction with 3D Gaussian and sparse points for 3D Environment Awareness | - | 0
Unlocking Generalization Power in LiDAR Point Cloud Registration | Code | 2
TARS: Traffic-Aware Radar Scene Flow Estimation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ReasonNet | Driving Score | 79.95 | - | Unverified
2 | InterFuser | Driving Score | 76.18 | - | Unverified
3 | TCP | Driving Score | 75.14 | - | Unverified
4 | TF++ WP | Driving Score | 66.32 | - | Unverified
5 | Learning From All Vehicles (LAV) | Driving Score | 61.85 | - | Unverified
6 | TransFuser | Driving Score | 61.18 | - | Unverified
7 | TransFuser (Reproduced) | Driving Score | 55.04 | - | Unverified
8 | TCP (Reproduced) | Driving Score | 47.91 | - | Unverified
9 | Latent TransFuser | Driving Score | 45.2 | - | Unverified
10 | GRIAD | Driving Score | 36.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Geometric Fusion | RC | 69.17 | - | Unverified
2 | TransFuser | RC | 56.36 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Geometric Fusion | RC | 86.91 | - | Unverified
2 | TransFuser | RC | 78.41 | - | Unverified