| GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling | Sep 19, 2024 | Novel View Synthesis | Code Available | 2 |
| VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis | Mar 29, 2024 | Hallucination, Image Captioning | Code Available | 2 |
| Constrained Decision Transformer for Offline Safe Reinforcement Learning | Feb 14, 2023 | reinforcement-learning, Reinforcement Learning | Code Available | 2 |
| TSELM: Target Speaker Extraction using Discrete Tokens and Language Models | Sep 12, 2024 | Audio Generation, Target Speaker Extraction | Code Available | 2 |
| End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering | Nov 8, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| All-in-One Image Restoration for Unknown Corruption | Jan 1, 2022 | 5-Degradation Blind All-in-One Image Restoration, All | Code Available | 2 |
| Long-Form Speech Generation with Spoken Language Models | Dec 24, 2024 | Form, Language Modeling | Code Available | 2 |
| Calibration and Option Pricing with Stochastic Volatility and Double Exponential Jumps | Feb 19, 2025 | Articles, Econometrics | Code Available | 2 |
| DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention | May 28, 2024 | GPU, Mamba | Code Available | 2 |
| Direct Multi-Turn Preference Optimization for Language Agents | Jun 21, 2024 | Reinforcement Learning (RL) | Code Available | 2 |
| Multi-Task Learning as Multi-Objective Optimization | Oct 10, 2018 | Depth Estimation, General Classification | Code Available | 2 |
| MonoForce: Learnable Image-conditioned Physics Engine | Feb 14, 2025 | Model Predictive Control, Trajectory Prediction | Code Available | 2 |
| Efficient LLM Scheduling by Learning to Rank | Aug 28, 2024 | Blocking, Chatbot | Code Available | 2 |
| MedCalc-Bench: Evaluating Large Language Models for Medical Calculations | Jun 17, 2024 | Descriptive, Medical Diagnosis | Code Available | 2 |
| PIXIU: A Comprehensive Benchmark, Instruction Dataset and Large Language Model for Finance | Sep 26, 2023 | | Code Available | 2 |
| Rethinking Few-shot 3D Point Cloud Semantic Segmentation | Mar 1, 2024 | Few-shot 3D Point Cloud Semantic Segmentation, Segmentation | Code Available | 2 |
| Event-guided Multi-patch Network with Self-supervision for Non-uniform Motion Deblurring | Feb 14, 2023 | Deblurring | Code Available | 2 |
| An Embarrassingly Simple Approach for LLM with Strong ASR Capacity | Feb 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| MedFMC: A Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification | Jun 16, 2023 | Diabetic Retinopathy Grading, image-classification | Code Available | 2 |
| Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future | Dec 6, 2023 | Autonomous Driving | Code Available | 2 |
| AutoPSV: Automated Process-Supervised Verifier | May 27, 2024 | | Code Available | 2 |
| Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning | Jun 17, 2024 | Data Augmentation, Mathematical Reasoning | Code Available | 2 |
| Inter-X: Towards Versatile Human-Human Interaction Analysis | Dec 26, 2023 | Motion Synthesis | Code Available | 2 |
| GraphAD: Interaction Scene Graph for End-to-end Autonomous Driving | Mar 28, 2024 | Autonomous Driving | Code Available | 2 |
| UVDoc: Neural Grid-based Document Unwarping | Feb 6, 2023 | distortion correction, MS-SSIM | Code Available | 2 |
| SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting | Aug 25, 2024 | 3DGS, Image Generation | Code Available | 2 |
| Low-Light Image Enhancement with Wavelet-based Diffusion Models | Jun 1, 2023 | Denoising, Face Detection | Code Available | 2 |
| Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution | Jan 5, 2023 | Decoder, Time Series | Code Available | 2 |
| Improving Feature-based Visual Localization by Geometry-Aided Matching | Nov 16, 2022 | 3D Feature Matching, Pose Estimation | Code Available | 2 |
| DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | May 16, 2024 | Data Augmentation, Diversity | Code Available | 2 |
| Det-SAM2: Technical Report on the Self-Prompting Segmentation Framework Based on Segment Anything Model 2 | Nov 28, 2024 | Video Segmentation, Video Semantic Segmentation | Code Available | 2 |
| Mustango: Toward Controllable Text-to-Music Generation | Nov 14, 2023 | Data Augmentation, Denoising | Code Available | 2 |
| Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning | Jun 11, 2024 | Contrastive Learning | Code Available | 2 |
| Learning to Fly in Seconds | Nov 22, 2023 | GPU, Reinforcement Learning (RL) | Code Available | 2 |
| SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction | Nov 21, 2023 | Autonomous Driving, Depth Estimation | Code Available | 2 |
| JORLDY: a fully customizable open source framework for reinforcement learning | Apr 11, 2022 | MuJoCo, OpenAI Gym | Code Available | 2 |
| Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System | Apr 17, 2024 | All, Collaborative Filtering | Code Available | 2 |
| D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS | Mar 7, 2025 | Denoising, Quantization | Code Available | 2 |
| RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding | Apr 3, 2023 | Contrastive Learning, Instance Segmentation | Code Available | 2 |
| ScanTalk: 3D Talking Heads from Unregistered Scans | Mar 16, 2024 | | Code Available | 2 |
| ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers | Sep 28, 2023 | GPU, Instruction Following | Code Available | 2 |
| FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything | Feb 29, 2024 | 3D Object Reconstruction, Instance Segmentation | Code Available | 2 |
| SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow | Jul 17, 2024 | | Code Available | 2 |
| HHAvatar: Gaussian Head Avatar with Dynamic Hairs | Dec 5, 2023 | 2k | Code Available | 2 |
| ControlVAR: Exploring Controllable Visual Autoregressive Modeling | Jun 14, 2024 | Image Generation | Code Available | 2 |
| L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection | Aug 7, 2024 | 3D Object Detection, Autonomous Navigation | Code Available | 2 |
| Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic | Feb 19, 2024 | Instruction Following, Math | Code Available | 2 |
| Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent | Feb 21, 2024 | Incremental Learning | Code Available | 2 |
| IDGenRec: LLM-RecSys Alignment with Textual ID Learning | Mar 27, 2024 | Sequential Recommendation, Text Generation | Code Available | 2 |
| Self-regulating Prompts: Foundational Model Adaptation without Forgetting | Jul 13, 2023 | Diversity, model | Code Available | 2 |