| Grad: Guided Relation Diffusion Generation for Graph Augmentation in Graph Fraud Detection | Apr 22, 2025 | Contrastive Learning, Fraud Detection | Code Available | 3 |
| ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization | Feb 4, 2025 | Quantization | Code Available | 3 |
| Large Language Model based Long-tail Query Rewriting in Taobao Search | Nov 7, 2023 | Contrastive Learning, Language Modeling | Code Available | 3 |
| SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models | Jun 1, 2025 | | Code Available | 3 |
| MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost | Dec 2, 2024 | Image Generation | Code Available | 3 |
| View Selection for 3D Captioning via Diffusion Ranking | Apr 11, 2024 | 3D Object Captioning, Hallucination | Code Available | 3 |
| Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates | Sep 27, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation | Feb 25, 2025 | Image Generation | Code Available | 3 |
| Lossless and Near-Lossless Compression for Foundation Models | Apr 5, 2024 | | Code Available | 3 |
| StarWhisper Telescope: Agent-Based Observation Assistant System to Approach AI Astrophysicist | Dec 9, 2024 | | Code Available | 3 |
| CTNet: A Convolutional Transformer Network for EEG-Based Motor Imagery Classification | Aug 30, 2024 | Brain Computer Interface, EEG | Code Available | 3 |
| Affordable AI Assistants with Knowledge Graph of Thoughts | Apr 3, 2025 | Knowledge Graphs, LLM real-life tasks | Code Available | 3 |
| The Elephant in the Room: Towards A Reliable Time-Series Anomaly Detection Benchmark | Sep 26, 2024 | Anomaly Detection, Benchmarking | Code Available | 3 |
| DDT: Decoupled Diffusion Transformer | Apr 8, 2025 | Denoising, Image Generation | Code Available | 3 |
| PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos | Mar 23, 2025 | 4D reconstruction, Deformable Object Manipulation | Code Available | 3 |
| Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification | Apr 16, 2024 | Feature Engineering, Language Modeling | Code Available | 3 |
| Detecting hallucinations in large language models using semantic entropy | Jun 19, 2024 | Large Language Model, Question Answering | Code Available | 3 |
| The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio | Oct 16, 2024 | Hallucination | Code Available | 3 |
| BoostTrack: boosting the similarity measure and detection confidence for improved multiple object tracking | Apr 12, 2024 | Motion Compensation, Multi-Object Tracking | Code Available | 3 |
| SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling | Dec 23, 2023 | Instruction Following, Language Modeling | Code Available | 3 |
| PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Mar 23, 2023 | Image Generation, Image Segmentation | Code Available | 3 |
| VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning | Apr 9, 2025 | MVBench, Object Tracking | Code Available | 3 |
| Improving Dictionary Learning with Gated Sparse Autoencoders | Apr 24, 2024 | Dictionary Learning | Code Available | 3 |
| Open3D: A Modern Library for 3D Data Processing | Jan 30, 2018 | Point Cloud Registration | Code Available | 3 |
| ATPrompt: Textual Prompt Learning with Embedded Attributes | Dec 12, 2024 | Attribute, Large Language Model | Code Available | 3 |
| N-BEATS: Neural basis expansion analysis for interpretable time series forecasting | May 24, 2019 | Time Series, Time Series Analysis | Code Available | 3 |
| Mip-Splatting: Alias-free 3D Gaussian Splatting | Nov 27, 2023 | Novel View Synthesis | Code Available | 3 |
| Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations | Dec 19, 2024 | Contrastive Learning, Image Reconstruction | Code Available | 3 |
| A Vision-Language Foundation Model to Enhance Efficiency of Chest X-ray Interpretation | Jan 22, 2024 | Benchmarking, Diagnostic | Code Available | 3 |
| Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Mar 5, 2024 | Image Generation | Code Available | 3 |
| MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval | Dec 19, 2024 | Image Retrieval, Retrieval | Code Available | 3 |
| BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond | Dec 3, 2020 | Super-Resolution, Video deraining | Code Available | 3 |
| Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs | Aug 23, 2023 | counterfactual, Question Answering | Code Available | 3 |
| WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion | Dec 12, 2023 | 3D Human Pose Estimation | Code Available | 3 |
| Block-NeRF: Scalable Large Scene Neural View Synthesis | Feb 10, 2022 | NeRF | Code Available | 3 |
| SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound | Apr 30, 2024 | Decoder, Language Modelling | Code Available | 3 |
| Vision Transformers for Dense Prediction | Mar 24, 2021 | Decoder, Depth Estimation | Code Available | 3 |
| RepViT: Revisiting Mobile CNN From ViT Perspective | Jul 18, 2023 | | Code Available | 3 |
| MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model | Nov 27, 2023 | Image Animation | Code Available | 3 |
| CRAG -- Comprehensive RAG Benchmark | Jun 7, 2024 | Hallucination, Language Modelling | Code Available | 3 |
| Major TOM: Expandable Datasets for Earth Observation | Feb 19, 2024 | Earth Observation | Code Available | 3 |
| Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction | Apr 24, 2023 | Drug Discovery, Model Selection | Code Available | 3 |
| Optimal Variable Speed Limit Control Strategy on Freeway Segments under Fog Conditions | Jul 30, 2021 | | Code Available | 3 |
| Towards General-purpose Infrastructure for Protecting Scientific Data Under Study | Oct 4, 2021 | | Code Available | 3 |
| L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning | Mar 6, 2025 | | Code Available | 3 |
| Genie: Generative Interactive Environments | Feb 23, 2024 | | Code Available | 3 |
| Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation | Jan 1, 2024 | Segmentation, Semantic Segmentation | Code Available | 3 |
| Efficiently Serving LLM Reasoning Programs with Certaindex | Dec 30, 2024 | Code Generation, Mathematical Problem-Solving | Code Available | 3 |
| SPO: Sequential Monte Carlo Policy Optimisation | Feb 12, 2024 | Decision Making, Model-based Reinforcement Learning | Code Available | 3 |
| AgentStudio: A Toolkit for Building General Virtual Agents | Mar 26, 2024 | Visual Grounding | Code Available | 3 |