| Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models | May 25, 2023 | Conditional Text-to-Image Synthesis, Image Generation | Code Available | 2 |
| ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization | Feb 6, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| SalM2: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention | Feb 22, 2025 | Mamba | Code Available | 2 |
| TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators | Feb 20, 2025 | Benchmarking, Code Generation | Code Available | 2 |
| A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations | Feb 14, 2025 | Survey | Code Available | 2 |
| Sanity Checking Causal Representation Learning on a Simple Real-World System | Feb 27, 2025 | Representation Learning | Code Available | 2 |
| Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation | Feb 27, 2025 | Contrastive Learning, Diagnostic | Code Available | 2 |
| A Training-free LLM-based Approach to General Chinese Character Error Correction | Feb 21, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| SemiSAM+: Rethinking Semi-Supervised Medical Image Segmentation in the Era of Foundation Models | Feb 28, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Neural Posterior Estimation for Cataloging Astronomical Images with Spatially Varying Backgrounds and Point Spread Functions | Feb 28, 2025 | Variational Inference | Code Available | 2 |
| AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies | Feb 28, 2025 | | Code Available | 2 |
| Patch-wise Structural Loss for Time Series Forecasting | Mar 2, 2025 | Time Series, Time Series Forecasting | Code Available | 2 |
| Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Mar 5, 2025 | Object, Referring Video Object Segmentation | Code Available | 2 |
| MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments | Mar 4, 2025 | 2D Panoptic Segmentation, Graph Generation | Code Available | 2 |
| PromptPex: Automatic Test Generation for Language Model Prompts | Mar 7, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| MPA: MultiPath++ Based Architecture for Motion Prediction | Jun 20, 2022 | Autonomous Driving, motion prediction | Code Available | 2 |
| Real-time Spatial-temporal Traversability Assessment via Feature-based Sparse Gaussian Process | Mar 6, 2025 | Autonomous Navigation, Computational Efficiency | Code Available | 2 |
| DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding | Mar 13, 2025 | 4k, Autonomous Driving | Code Available | 2 |
| Bayesian Prompt Flow Learning for Zero-Shot Anomaly Detection | Mar 13, 2025 | Anomaly Detection, zero-shot anomaly detection | Code Available | 2 |
| Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting | Mar 18, 2025 | Instance Segmentation, Object | Code Available | 2 |
| Rapid patient-specific neural networks for intraoperative X-ray to volume registration | Mar 20, 2025 | | Code Available | 2 |
| Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model | Mar 20, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Tokenize Image as a Set | Mar 20, 2025 | Image Generation | Code Available | 2 |
| Hahaha | Jul 10, 2020 | | Code Available | 2 |
| WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching | Mar 20, 2025 | Speech Synthesis | Code Available | 2 |
| MaSS13K: A Matting-level Semantic Segmentation Benchmark | Mar 24, 2025 | 4k, Image Matting | Code Available | 2 |
| STEVE: A Step Verification Pipeline for Computer-use Agent Training | Mar 16, 2025 | | Code Available | 2 |
| COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting | Mar 25, 2025 | 3DGS, Object | Code Available | 2 |
| SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding | Jun 14, 2024 | Graph Generation, Relation | Code Available | 2 |
| OntologyRAG: Better and Faster Biomedical Code Mapping with Retrieval-Augmented Generation (RAG) Leveraging Ontology Knowledge Graphs and Large Language Models | Feb 26, 2025 | In-Context Learning, Knowledge Graphs | Code Available | 2 |
| SALT: A Flexible Semi-Automatic Labeling Tool for General LiDAR Point Clouds with Cross-Scene Adaptability and 4D Consistency | Mar 31, 2025 | Zero-Shot Learning | Code Available | 2 |
| TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes | Mar 30, 2025 | 2k, Image Generation | Code Available | 2 |
| Force-Free Molecular Dynamics Through Autoregressive Equivariant Networks | Mar 31, 2025 | Numerical Integration | Code Available | 2 |
| Graph ODEs and Beyond: A Comprehensive Survey on Integrating Differential Equations with Graph Neural Networks | Mar 29, 2025 | Survey, Traffic Prediction | Code Available | 2 |
| GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning | Apr 3, 2025 | Reinforcement Learning (RL) | Code Available | 2 |
| CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models | Apr 1, 2025 | Large Language Model, Translation | Code Available | 2 |
| SpaceR: Reinforcing MLLMs in Video Spatial Reasoning | Apr 2, 2025 | MME, Spatial Reasoning | Code Available | 2 |
| RWKVTTS: Yet another TTS based on RWKV-7 | Apr 4, 2025 | Computational Efficiency, text-to-speech | Code Available | 2 |
| Sleep-time Compute: Beyond Inference Scaling at Test-time | Apr 17, 2025 | | Code Available | 2 |
| Seurat: From Moving Points to Depth | Apr 20, 2025 | Depth Estimation, Point Tracking | Code Available | 2 |
| RWKV-X: A Linear Complexity Hybrid Language Model | Apr 30, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Representation Learning for Tabular Data: A Comprehensive Survey | Apr 17, 2025 | Representation Learning, Survey | Code Available | 2 |
| Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation | Mar 17, 2025 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry | Mar 17, 2025 | valid | Code Available | 2 |
| Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt | May 14, 2025 | Anomaly Detection, Anomaly Segmentation | Code Available | 2 |
| Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models | May 15, 2025 | Math, reinforcement-learning | Code Available | 2 |
| Relational Graph Transformer | May 16, 2025 | Graph Neural Network | Code Available | 2 |
| AdaptThink: Reasoning Models Can Learn When to Think | May 19, 2025 | Math | Code Available | 2 |
| AD-AGENT: A Multi-agent Framework for End-to-end Anomaly Detection | May 19, 2025 | Anomaly Detection, Code Generation | Code Available | 2 |
| FlightGPT: Towards Generalizable and Interpretable UAV Vision-and-Language Navigation with Vision-Language Models | May 19, 2025 | Disaster Response, Vision and Language Navigation | Code Available | 2 |