| Recurrent Diffusion for Large-Scale Parameter Generation | Jan 20, 2025 | GPU | Code Available | 2 |
| A generalizable 3D framework and model for self-supervised learning in medical imaging | Jan 20, 2025 | Medical Image Segmentation, Self-Supervised Learning | Code Available | 2 |
| Avoiding Shortcuts: Enhancing Channel-Robust Specific Emitter Identification via Single-Source Domain Generalization | Jan 20, 2025 | Contrastive Learning, Domain Generalization | Code Available | 2 |
| Investigating the Scalability of Approximate Sparse Retrieval Algorithms to Massive Datasets | Jan 20, 2025 | Retrieval | Code Available | 2 |
| A Survey on Diffusion Models for Anomaly Detection | Jan 20, 2025 | Anomaly Detection, Computational Efficiency | Code Available | 2 |
| Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training | Jan 20, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Reasoning Language Models: A Blueprint | Jan 20, 2025 | Reinforcement Learning (RL), Retrieval-augmented Generation | Code Available | 2 |
| Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling | Jan 20, 2025 | Imitation Learning, Language Modeling | Code Available | 2 |
| Beyond Any-Shot Adaptation: Predicting Optimization Outcome for Robustness Gains without Extra Pay | Jan 19, 2025 | | Code Available | 2 |
| Diffusion Models in Recommendation Systems: A Survey | Jan 17, 2025 | Collaborative Filtering, Recommendation Systems | Code Available | 2 |
| LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks | Jan 17, 2025 | Change Detection, Image Classification | Code Available | 2 |
| Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems | Jan 17, 2025 | Response Generation | Code Available | 2 |
| Discrete Prior-based Temporal-coherent Content Prediction for Blind Face Video Restoration | Jan 17, 2025 | Video Restoration | Code Available | 2 |
| FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization | Jan 17, 2025 | Anomaly Detection, Image-text matching | Code Available | 2 |
| ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario | Jan 17, 2025 | | Code Available | 2 |
| Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key | Jan 16, 2025 | 16k, Hallucination | Code Available | 2 |
| Prompt-CAM: A Simpler Interpretable Transformer for Fine-Grained Analysis | Jan 16, 2025 | Explainable Artificial Intelligence (XAI), Explainable Models | Code Available | 2 |
| Practical Continual Forgetting for Pre-trained Vision Models | Jan 16, 2025 | Continual Forgetting, Face Recognition | Code Available | 2 |
| Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search | Jan 16, 2025 | Quantization | Code Available | 2 |
| A Simple Aerial Detection Baseline of Multimodal Language Models | Jan 16, 2025 | object-detection, Object Detection | Code Available | 2 |
| Scaling up self-supervised learning for improved surgical foundation models | Jan 16, 2025 | Self-Supervised Learning, Semantic Segmentation | Code Available | 2 |
| CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Jan 16, 2025 | 3D Generation, 4k | Code Available | 2 |
| AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation | Jan 16, 2025 | Image Generation, Text to Image Generation | Code Available | 2 |
| Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design | Jan 15, 2025 | Combinatorial Optimization, Language Modeling | Code Available | 2 |
| Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | Jan 15, 2025 | Image Segmentation, Referring Expression Segmentation | Code Available | 2 |
| What Limits LLM-based Human Simulation: LLMs or Our Design? | Jan 15, 2025 | | Code Available | 2 |
| The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | Jan 15, 2025 | Reasoning Segmentation, Referring Expression Segmentation | Code Available | 2 |
| GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge | Jan 15, 2025 | Text Detection | Code Available | 2 |
| Vision Foundation Models for Computed Tomography | Jan 15, 2025 | Computed Tomography (CT), Contrastive Learning | Code Available | 2 |
| CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities | Jan 15, 2025 | Scene Generation | Code Available | 2 |
| Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving | Jan 15, 2025 | Autonomous Driving, Trajectory Planning | Code Available | 2 |
| Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding | Jan 14, 2025 | image-classification, Image Classification | Code Available | 2 |
| LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding | Jan 14, 2025 | Feature Compression, Language Modeling | Code Available | 2 |
| PokerBench: Training Large Language Models to become Professional Poker Players | Jan 14, 2025 | | Code Available | 2 |
| LeapVAD: A Leap in Autonomous Driving via Cognitive Perception and Dual-Process Thinking | Jan 14, 2025 | Autonomous Driving, Decision Making | Code Available | 2 |
| OptiChat: Bridging Optimization Models and Practitioners with Large Language Models | Jan 14, 2025 | Code Generation, counterfactual | Code Available | 2 |
| Flow: Modularized Agentic Workflow Automation | Jan 14, 2025 | | Code Available | 2 |
| RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation | Jan 14, 2025 | Computational Efficiency, Image Segmentation | Code Available | 2 |
| Enhancing Retrieval-Augmented Generation: A Study of Best Practices | Jan 13, 2025 | In-Context Learning, RAG | Code Available | 2 |
| BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Jan 13, 2025 | Articles, Image-text Retrieval | Code Available | 2 |
| Imagine while Reasoning in Space: Multimodal Visualization-of-Thought | Jan 13, 2025 | Spatial Reasoning | Code Available | 2 |
| SynthSoM: A synthetic intelligent multi-modal sensing-communication dataset for Synesthesia of Machines (SoM) | Jan 13, 2025 | | Code Available | 2 |
| Leveraging ASIC AI Chips for Homomorphic Encryption | Jan 13, 2025 | | Code Available | 2 |
| AlphaNet: Scaling Up Local-frame-based Atomistic Interatomic Potential | Jan 13, 2025 | Computational Efficiency | Code Available | 2 |
| A User's Guide to KSig: GPU-Accelerated Computation of the Signature Kernel | Jan 13, 2025 | GPU | Code Available | 2 |
| Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution | Jan 12, 2025 | Computational Efficiency, GPU | Code Available | 2 |
| Deep Learning and Foundation Models for Weather Prediction: A Survey | Jan 12, 2025 | Deep Learning, Prediction | Code Available | 2 |
| F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Consistent Gaussian Splatting | Jan 12, 2025 | | Code Available | 2 |
| RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models | Jan 12, 2025 | Image Segmentation, Segmentation | Code Available | 2 |
| ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning | Jan 11, 2025 | Drug Discovery | Code Available | 2 |