| Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? | Oct 2, 2024 | Code Completion, Code Generation | Code Available | 2 |
| PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection | Oct 1, 2024 | 3D Anomaly Detection, Anomaly Detection | Code Available | 2 |
| End-to-end Piano Performance-MIDI to Score Conversion with Transformers | Sep 30, 2024 | | Code Available | 2 |
| Mamba in Vision: A Comprehensive Survey of Techniques and Applications | Oct 4, 2024 | Mamba, State Space Models | Code Available | 2 |
| Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates | Oct 9, 2024 | | Code Available | 2 |
| Prompting DirectSAM for Semantic Contour Extraction in Remote Sensing Images | Oct 8, 2024 | | Code Available | 2 |
| Reversible Decoupling Network for Single Image Reflection Removal | Oct 10, 2024 | Reflection Removal | Code Available | 2 |
| Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation | Oct 10, 2024 | | Code Available | 2 |
| TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control | Oct 14, 2024 | Disentanglement, Image Generation | Code Available | 2 |
| LLM-Based Multi-Agent Systems are Scalable Graph Generative Models | Oct 13, 2024 | Benchmarking, Graph Generation | Code Available | 2 |
| GS^3: Efficient Relighting with Triple Gaussian Splatting | Oct 15, 2024 | GPU | Code Available | 2 |
| WeatherDG: LLM-assisted Diffusion Model for Procedural Weather Generation in Domain-Generalized Semantic Segmentation | Oct 15, 2024 | Autonomous Driving, Language Modeling | Code Available | 2 |
| Batch and match: black-box variational inference with a score-based divergence | Feb 22, 2024 | Variational Inference | Code Available | 2 |
| Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines | Oct 28, 2024 | Retrieval, Retrieval-augmented Generation | Code Available | 2 |
| Retrieval-Enhanced Mutation Mastery: Augmenting Zero-Shot Prediction of Protein Language Model | Oct 28, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Accelerating Direct Preference Optimization with Prefix Sharing | Oct 27, 2024 | Computational Efficiency | Code Available | 2 |
| Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance | Oct 29, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| MassSpecGym: A benchmark for the discovery and identification of molecules | Oct 30, 2024 | De novo molecule generation from MS/MS spectrum, De novo molecule generation from MS/MS spectrum (bonus chemical formulae) | Code Available | 2 |
| PC-Gym: Benchmark Environments For Process Control Problems | Oct 29, 2024 | Benchmarking, Chemical Process | Code Available | 2 |
| CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs | Jun 26, 2024 | Chart Understanding | Code Available | 2 |
| Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis | Nov 4, 2024 | Contrastive Learning, Diversity | Code Available | 2 |
| A Modular and Robust Physics-Based Approach for Lensless Image Reconstruction | Mar 1, 2024 | Image Reconstruction | Code Available | 2 |
| PoseX: AI Defeats Physics Approaches on Protein-Ligand Cross Docking | May 3, 2025 | Blind Docking, Molecular Docking | Code Available | 2 |
| GTA: Global Tracklet Association for Multi-Object Tracking in Sports | Nov 12, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 2 |
| Golden Noise for Diffusion Models: A Learning Framework | Nov 14, 2024 | Prompt Learning | Code Available | 2 |
| SymphonyQG: Towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search | Nov 19, 2024 | Quantization, Re-Ranking | Code Available | 2 |
| Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing | Nov 25, 2024 | Privacy Preserving | Code Available | 2 |
| GaussianSpeech: Audio-Driven Gaussian Avatars | Nov 27, 2024 | 3DGS | Code Available | 2 |
| PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation | Nov 30, 2024 | Text-to-Video Generation, Video Generation | Code Available | 2 |
| RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors | Apr 8, 2023 | Image Dehazing, Vocal Bursts Intensity Prediction | Code Available | 2 |
| Volumetrically Consistent 3D Gaussian Rasterization | Dec 4, 2024 | 3DGS, SSIM | Code Available | 2 |
| Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images | Dec 9, 2024 | 3DGS, NeRF | Code Available | 2 |
| FlashRNN: Optimizing Traditional RNNs on Modern Hardware | Dec 10, 2024 | GPU, Logical Reasoning | Code Available | 2 |
| Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning | Dec 12, 2024 | Decision Making | Code Available | 2 |
| Gramian Multimodal Representation Learning and Alignment | Dec 16, 2024 | Contrastive Learning, Representation Learning | Code Available | 2 |
| Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection | Dec 14, 2024 | Denoising | Code Available | 2 |
| EvalGIM: A Library for Evaluating Generative Image Models | Dec 13, 2024 | Benchmarking, Diversity | Code Available | 2 |
| Tracr: Compiled Transformers as a Laboratory for Interpretability | Jan 12, 2023 | Decoder | Code Available | 2 |
| Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization | Dec 18, 2024 | Image Manipulation | Code Available | 2 |
| Joint Perception and Prediction for Autonomous Driving: A Survey | Dec 18, 2024 | Autonomous Driving, motion prediction | Code Available | 2 |
| FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching | Dec 19, 2024 | Image Generation, Prediction | Code Available | 2 |
| SoftPatch+: Fully Unsupervised Anomaly Classification and Segmentation | Dec 30, 2024 | Anomaly Classification, Anomaly Detection | Code Available | 2 |
| Superposition in Transformers: A Novel Way of Building Mixture of Experts | Dec 31, 2024 | Mixture-of-Experts | Code Available | 2 |
| TCPFormer: Learning Temporal Correlation with Implicit Pose Proxy for 3D Human Pose Estimation | Jan 3, 2025 | 3D Human Pose Estimation, Monocular 3D Human Pose Estimation | Code Available | 2 |
| M-SENA: An Integrated Platform for Multimodal Sentiment Analysis | Mar 23, 2022 | Management, Multimodal Sentiment Analysis | Code Available | 2 |
| RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning | May 21, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation | Jan 9, 2025 | Decision Making, Language Modeling | Code Available | 2 |
| TinyLLaVA-Video: A Simple Framework of Small-scale Large Multimodal Models for Video Understanding | Jan 26, 2025 | Video Understanding | Code Available | 2 |
| Leveraging ASIC AI Chips for Homomorphic Encryption | Jan 13, 2025 | | Code Available | 2 |
| A Simple Aerial Detection Baseline of Multimodal Language Models | Jan 16, 2025 | object-detection, Object Detection | Code Available | 2 |