| R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning | Jun 27, 2025 | Object Tracking, Template Matching | Code Available | 2 | 5 |
| Cross Language Image Matching for Weakly Supervised Semantic Segmentation | Mar 5, 2022 | Object, Semantic Segmentation | Code Available | 2 | 5 |
| Discovering Latent Knowledge in Language Models Without Supervision | Dec 7, 2022 | Imitation Learning, Language Modelling | Code Available | 2 | 5 |
| QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search | Feb 4, 2025 | | Code Available | 2 | 5 |
| DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome | Jun 26, 2023 | Computational Efficiency, Core Promoter Detection | Code Available | 2 | 5 |
| ProteinInvBench: Benchmarking Protein Inverse Folding on Diverse Tasks, Models, and Metrics | Sep 26, 2023 | | Code Available | 2 | 5 |
| Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning | Mar 3, 2025 | Reinforcement Learning (RL) | Code Available | 2 | 5 |
| Equivariant Graph Neural Operator for Modeling 3D Dynamics | Jan 19, 2024 | Operator learning | Code Available | 2 | 5 |
| Positional Encoder Graph Quantile Neural Networks for Geographic Data | Sep 27, 2024 | Density Estimation, Uncertainty Quantification | Code Available | 2 | 5 |
| Using the IBM Analog In-Memory Hardware Acceleration Kit for Neural Network Training and Inference | Jul 18, 2023 | | Code Available | 2 | 5 |
| FloorSet -- a VLSI Floorplanning Dataset with Design Constraints of Real-World SoCs | May 9, 2024 | Combinatorial Optimization | Code Available | 2 | 5 |
| PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change | Jun 21, 2022 | Common Sense Reasoning, Diversity | Code Available | 2 | 5 |
| SuperPoint-SLAM3: Augmenting ORB-SLAM3 with Deep Features, Adaptive NMS, and Learning-Based Loop Closure | Jun 16, 2025 | Simultaneous Localization and Mapping | Code Available | 2 | 5 |
| Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers | Jul 9, 2023 | Object Tracking | Code Available | 2 | 5 |
| Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? | Mar 11, 2024 | Prompt Engineering | Code Available | 2 | 5 |
| Idiosyncrasies in Large Language Models | Feb 17, 2025 | | Code Available | 2 | 5 |
| Longitudinal Segmentation of MS Lesions via Temporal Difference Weighting | Sep 20, 2024 | Inductive Bias, Lesion Detection | Code Available | 2 | 5 |
| Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts | Jul 7, 2025 | Inductive Bias, Mixture-of-Experts | Code Available | 2 | 5 |
| ICASSP 2022 Acoustic Echo Cancellation Challenge | Feb 27, 2022 | Acoustic echo cancellation, Speech Enhancement | Code Available | 2 | 5 |
| EASI-Tex: Edge-Aware Mesh Texturing from Single Image | May 27, 2024 | | Code Available | 2 | 5 |
| Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models | Apr 7, 2024 | Denoising | Code Available | 2 | 5 |
| Accurate Leukocyte Detection Based on Deformable-DETR and Multi-Level Feature Fusion for Aiding Diagnosis of Blood Diseases | Jan 1, 2024 | | Code Available | 2 | 5 |
| HCF-Net: Hierarchical Context Fusion Network for Infrared Small Object Detection | Mar 16, 2024 | channel selection, object-detection | Code Available | 2 | 5 |
| Attention-based CNN-LSTM and XGBoost hybrid model for stock prediction | Apr 6, 2022 | Prediction, Stock Prediction | Code Available | 2 | 5 |
| IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS | Sep 9, 2024 | Denoising, Speech Enhancement | Code Available | 2 | 5 |
| Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? | May 27, 2025 | Multimodal Reasoning | Code Available | 2 | 5 |
| SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds | Jul 16, 2024 | LIDAR Semantic Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models | Mar 14, 2025 | Image Super-Resolution, Super-Resolution | Code Available | 2 | 5 |
| GinAR: An End-To-End Multivariate Time Series Forecasting Model Suitable for Variable Missing | May 18, 2024 | Multivariate Time Series Forecasting, Time Series | Code Available | 2 | 5 |
| FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching | May 26, 2025 | Quantization, Speech Enhancement | Code Available | 2 | 5 |
| EVOR: Evolving Retrieval for Code Generation | Feb 19, 2024 | Code Generation, RAG | Code Available | 2 | 5 |
| CenterFormer: Center-based Transformer for 3D Object Detection | Sep 12, 2022 | 3D Object Detection, Object | Code Available | 2 | 5 |
| Natural Language Fine-Tuning | Dec 29, 2024 | GSM8K, Large Language Model | Code Available | 2 | 5 |
| Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal | Feb 14, 2025 | Denoising, Image Restoration | Code Available | 2 | 5 |
| OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems | Feb 21, 2024 | Logical Fallacies | Code Available | 2 | 5 |
| Implicit Neural Representation in Medical Imaging: A Comparative Survey | Jul 30, 2023 | Domain Adaptation, Image Reconstruction | Code Available | 2 | 5 |
| LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models | Jun 7, 2024 | | Code Available | 2 | 5 |
| DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion | May 25, 2023 | Denoising, Style Transfer | Code Available | 2 | 5 |
| Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph | Jul 15, 2023 | Hallucination, Knowledge Graphs | Code Available | 2 | 5 |
| Quantifying the Plausibility of Context Reliance in Neural Machine Translation | Oct 2, 2023 | Machine Translation, Translation | Code Available | 2 | 5 |
| Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving | Feb 11, 2025 | Attribute, Autonomous Driving | Code Available | 2 | 5 |
| DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation | Mar 24, 2025 | 3D Semantic Segmentation, LIDAR Semantic Segmentation | Code Available | 2 | 5 |
| Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning | May 31, 2023 | Decision Making, General Knowledge | Code Available | 2 | 5 |
| LLM-PySC2: Starcraft II learning environment for Large Language Models | Nov 8, 2024 | Decision Making, Language Modelling | Code Available | 2 | 5 |
| Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment | Dec 26, 2024 | | Code Available | 2 | 5 |
| Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional Representation | Jul 30, 2024 | Disentanglement, Music Generation | Code Available | 2 | 5 |
| DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision | Dec 26, 2023 | Deep Learning, NeRF | Code Available | 2 | 5 |
| Gaussian Shell Maps for Efficient 3D Human Generation | Nov 29, 2023 | | Code Available | 2 | 5 |
| Holodeck: Language Guided Generation of 3D Embodied AI Environments | Dec 14, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 2 | 5 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning, Instruction Following | Code Available | 2 | 5 |