| STICKERCONV: Generating Multimodal Empathetic Responses from Scratch | Jan 20, 2024 | 2k, Empathetic Response Generation | Code Available | 2 | 5 |
| Revealing data leakage in protein interaction benchmarks | Apr 16, 2024 | Benchmarking | Code Available | 2 | 5 |
| Improving CLIP Training with Language Rewrites | May 31, 2023 | In-Context Learning, Sentence | Code Available | 2 | 5 |
| Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model | Dec 30, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph | Jan 4, 2025 | TextVQA | Code Available | 2 | 5 |
| Dynamic 3D Point Cloud Sequences as 2D Videos | Mar 2, 2024 | Action Recognition, Self-Supervised Learning | Code Available | 2 | 5 |
| PILOT: A Pre-Trained Model-Based Continual Learning Toolbox | Sep 13, 2023 | class-incremental learning, Class Incremental Learning | Code Available | 2 | 5 |
| ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe | Dec 28, 2023 | Object, Object Tracking | Code Available | 2 | 5 |
| PL-EVIO: Robust Monocular Event-based Visual Inertial Odometry with Point and Line Features | Sep 25, 2022 | Management | Code Available | 2 | 5 |
| EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories | Mar 31, 2024 | Code Generation | Code Available | 2 | 5 |
| DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World | Jun 30, 2025 | Caption Generation, Object | Code Available | 2 | 5 |
| RangeUDF: Semantic Surface Reconstruction from 3D Point Clouds | Apr 19, 2022 | Semantic Segmentation, Surface Reconstruction | Code Available | 2 | 5 |
| Finetuning Large Language Models for Vulnerability Detection | Jan 30, 2024 | Transfer Learning, Vulnerability Detection | Code Available | 2 | 5 |
| Beyond Any-Shot Adaptation: Predicting Optimization Outcome for Robustness Gains without Extra Pay | Jan 19, 2025 | | Code Available | 2 | 5 |
| PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models | Dec 21, 2023 | Image Animation | Code Available | 2 | 5 |
| Retrieval with Learned Similarities | Jul 22, 2024 | Question Answering, Recommendation Systems | Code Available | 2 | 5 |
| Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis | Mar 12, 2024 | Graph Representation Learning, Representation Learning | Code Available | 2 | 5 |
| LATR: 3D Lane Detection from Monocular Images with Transformer | Aug 8, 2023 | 3D Lane Detection, Autonomous Driving | Code Available | 2 | 5 |
| SkyGPT: Probabilistic Short-term Solar Forecasting Using Synthetic Sky Videos from Physics-constrained VideoGPT | Jun 20, 2023 | Prediction, Video Prediction | Code Available | 2 | 5 |
| ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO | Jun 17, 2024 | Language Modelling, Question Answering | Code Available | 2 | 5 |
| Face2Diffusion for Fast and Editable Face Personalization | Mar 8, 2024 | Diffusion Personalization, Diversity | Code Available | 2 | 5 |
| Hibou: A Family of Foundational Vision Transformers for Pathology | Jun 7, 2024 | Diagnostic, whole slide images | Code Available | 2 | 5 |
| Diffusion models as plug-and-play priors | Jun 17, 2022 | Combinatorial Optimization, Denoising | Code Available | 2 | 5 |
| Does Refusal Training in LLMs Generalize to the Past Tense? | Jul 16, 2024 | | Code Available | 2 | 5 |
| LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models | Jan 5, 2025 | Decision Making, RAG | Code Available | 2 | 5 |
| Introducing Visual Perception Token into Multimodal Large Language Model | Feb 24, 2025 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis | Dec 9, 2022 | Attribute, Image Generation | Code Available | 2 | 5 |
| SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving | Jul 1, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 | 5 |
| RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering | Jul 19, 2024 | Domain Generalization, Form | Code Available | 2 | 5 |
| FocalFormer3D : Focusing on Hard Instance for 3D Object Detection | Aug 8, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| DualAnoDiff: Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation | Aug 24, 2024 | Anomaly Classification, Anomaly Detection | Code Available | 2 | 5 |
| Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication | Nov 16, 2023 | Quantization | Code Available | 2 | 5 |
| GaussianAD: Gaussian-Centric End-to-End Autonomous Driving | Dec 13, 2024 | Autonomous Driving, Decision Making | Code Available | 2 | 5 |
| AvatarGen: a 3D Generative Model for Animatable Human Avatars | Aug 1, 2022 | 3D Human Reconstruction | Code Available | 2 | 5 |
| LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition | Jul 25, 2023 | In-Context Learning | Code Available | 2 | 5 |
| Machine Learning Coarse-Grained Potentials of Protein Thermodynamics | Dec 14, 2022 | | Code Available | 2 | 5 |
| MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected Training | Jul 31, 2024 | RAG, Reranking | Code Available | 2 | 5 |
| Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation | May 22, 2025 | Autonomous Driving, Out-of-Distribution Detection | Code Available | 2 | 5 |
| JAX-FLUIDS: A fully-differentiable high-order computational fluid dynamics solver for compressible two-phase flows | Mar 25, 2022 | | Code Available | 2 | 5 |
| QuadricFormer: Scene as Superquadrics for 3D Semantic Occupancy Prediction | Jun 12, 2025 | 3D Semantic Occupancy Prediction, Autonomous Driving | Code Available | 2 | 5 |
| DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Nov 4, 2024 | GPU, Robot Manipulation | Code Available | 2 | 5 |
| SPD Learning for Covariance-Based Neuroimaging Analysis: Perspectives, Methods, and Challenges | Apr 26, 2025 | | Code Available | 2 | 5 |
| X-Ray: A Sequential 3D Representation For Generation | Apr 22, 2024 | 3D Generation, Object | Code Available | 2 | 5 |
| BridgeData V2: A Dataset for Robot Learning at Scale | Aug 24, 2023 | Imitation Learning, Multi-Task Learning | Code Available | 2 | 5 |
| Controlled Text Generation via Language Model Arithmetic | Nov 24, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Diff-BGM: A Diffusion Model for Video Background Music Generation | May 20, 2024 | Diversity, Music Generation | Code Available | 2 | 5 |
| Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent | Jul 31, 2024 | Translation, valid | Code Available | 2 | 5 |
| DVMSR: Distillated Vision Mamba for Efficient Super-Resolution | May 5, 2024 | Image Super-Resolution, Long-range modeling | Code Available | 2 | 5 |
| Open-Set Domain Adaptation for Semantic Segmentation | May 30, 2024 | Domain Adaptation, Semantic Segmentation | Code Available | 2 | 5 |
| EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection | Feb 23, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |