| DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction | May 27, 2025 | Image Generation | Code Available | 2 |
| Play to Generalize: Learning to Reason Through Game Play | Jun 9, 2025 | Domain Generalization, Math | Code Available | 2 |
| ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark | Jun 12, 2025 | | Code Available | 2 |
| Curve-Aware Gaussian Splatting for 3D Parametric Curve Reconstruction | Jun 26, 2025 | Point cloud reconstruction | Code Available | 2 |
| Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster | Jun 22, 2025 | Decoder, Image Segmentation | Code Available | 2 |
| LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation | Jun 16, 2025 | Collaborative Filtering, Sequential Recommendation | Code Available | 2 |
| TESS 2: A Large-Scale Generalist Diffusion Language Model | Feb 19, 2025 | Instruction Following, Language Modeling | Code Available | 2 |
| Learning a Decision Tree Algorithm with Transformers | Feb 6, 2024 | Meta-Learning | Code Available | 2 |
| On the Arbitrary-Oriented Object Detection: Classification based Approaches Revisited | Mar 12, 2020 | Classification, General Classification | Code Available | 2 |
| Speaker-change Aware CRF for Dialogue Act Classification | Apr 6, 2020 | Classification, Dialogue Act Classification | Code Available | 2 |
| MMFashion: An Open-Source Toolbox for Visual Fashion Analysis | May 18, 2020 | Attribute, Retrieval | Code Available | 2 |
| Point2Mesh: A Self-Prior for Deformable Meshes | May 22, 2020 | | Code Available | 2 |
| DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | Jun 11, 2021 | Card Games, Deep Reinforcement Learning | Code Available | 2 |
| Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support | Feb 25, 2025 | Decision Making, Diagnostic | Code Available | 2 |
| GreaseLM: Graph REASoning Enhanced Language Models for Question Answering | Jan 21, 2022 | Knowledge Graphs, Medical Question Answering | Code Available | 2 |
| EvoJAX: Hardware-Accelerated Neuroevolution | Feb 10, 2022 | CPU | Code Available | 2 |
| LCCDE: A Decision-Based Ensemble Framework for Intrusion Detection in The Internet of Vehicles | Aug 5, 2022 | Autonomous Vehicles, Intrusion Detection | Code Available | 2 |
| Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion | Nov 21, 2022 | 3D Reconstruction, NeRF | Code Available | 2 |
| Desbordante: from benchmarking suite to high-performance science-intensive data profiler (preprint) | Jan 14, 2023 | Benchmarking | Code Available | 2 |
| DETR Does Not Need Multi-Scale or Locality Design | Jan 1, 2023 | Decoder, Object Detection | Code Available | 2 |
| Reconstructing Animatable Categories from Videos | May 10, 2023 | 3D Shape Reconstruction from Videos, Dynamic Reconstruction | Code Available | 2 |
| Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT | May 15, 2022 | Representation Learning, Speaker Verification | Code Available | 2 |
| SE(3) diffusion model with application to protein backbone generation | Feb 5, 2023 | Protein Structure Prediction | Code Available | 2 |
| GLAP: General contrastive audio-text pretraining across domains and languages | Jun 12, 2025 | AudioCaps, Keyword Spotting | Code Available | 2 |
| TimeZero: Temporal Video Grounding with Reasoning-Guided LVLM | Mar 17, 2025 | Video Grounding | Code Available | 2 |
| Rethinking Benchmark and Contamination for Language Models with Rephrased Samples | Nov 8, 2023 | HumanEval, MMLU | Code Available | 2 |
| PG-Video-LLaVA: Pixel Grounding Large Video-Language Models | Nov 22, 2023 | Benchmarking, Phrase Grounding | Code Available | 2 |
| QuIP: 2-Bit Quantization of Large Language Models With Guarantees | Jul 25, 2023 | Quantization | Code Available | 2 |
| Machine Mindset: An MBTI Exploration of Large Language Models | Dec 20, 2023 | Large Language Model, Personality Alignment | Code Available | 2 |
| Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers | Feb 7, 2024 | Drug Discovery, Graph Learning | Code Available | 2 |
| Subobject-level Image Tokenization | Feb 22, 2024 | Attribute, Language Modeling | Code Available | 2 |
| The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio | May 8, 2024 | Audio Deepfake Detection, Audio Generation | Code Available | 2 |
| Large Language Models Must Be Taught to Know What They Don't Know | Jun 12, 2024 | | Code Available | 2 |
| Text2Robot: Evolutionary Robot Design from Text Descriptions | Jun 28, 2024 | Navigate, Text to 3D | Code Available | 2 |
| Towards Reasoning in Large Language Models: A Survey | Dec 20, 2022 | Decision Making, Survey | Code Available | 2 |
| Shadow Generation for Composite Image Using Diffusion model | Mar 22, 2024 | Image-to-Image Translation | Code Available | 2 |
| Universal Narrative Model: an Author-centric Storytelling Framework for Generative AI | Mar 5, 2025 | | Code Available | 2 |
| REBEL: Reinforcement Learning via Regressing Relative Rewards | Apr 25, 2024 | continuous-control, Continuous Control | Code Available | 2 |
| Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity | Jan 27, 2025 | Computational Efficiency, Mamba | Code Available | 2 |
| QuEST: Stable Training of LLMs with 1-Bit Weights and Activations | Feb 7, 2025 | GPU, Quantization | Code Available | 2 |
| Without Paired Labeled Data: An End-to-End Self-Supervised Paradigm for UAV-View Geo-Localization | Feb 17, 2025 | Computational Efficiency, Contrastive Learning | Code Available | 2 |
| MC-LLaVA: Multi-Concept Personalized Vision-Language Model | Mar 24, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| TC-RAG: Turing-Complete RAG's Case study on Medical LLM Systems | Aug 17, 2024 | RAG, Retrieval | Code Available | 2 |
| Hacking CTFs with Plain Agents | Dec 3, 2024 | | Code Available | 2 |
| Are Vision xLSTM Embedded UNet More Reliable in Medical 3D Image Segmentation? | Jun 24, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation | Jul 16, 2024 | 3DGS, 3D scene Editing | Code Available | 2 |
| VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset | Jul 25, 2024 | Head Detection, Keypoint Estimation | Code Available | 2 |
| VerilogEval: Evaluating Large Language Models for Verilog Code Generation | Sep 14, 2023 | Benchmarking, Code Generation | Code Available | 2 |
| Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch | Nov 6, 2023 | Decoder, GSM8K | Code Available | 2 |
| Finding Transformer Circuits with Edge Pruning | Jun 24, 2024 | In-Context Learning, Language Modelling | Code Available | 2 |