| Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning | Oct 18, 2024 | | Code Available | 2 |
| DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech | Jul 3, 2022 | text-to-speech, Text to Speech | Code Available | 2 |
| GPTScore: Evaluate as You Desire | Feb 8, 2023 | Text Generation | Code Available | 2 |
| Towards Learning Universal Hyperparameter Optimizers with Transformers | May 26, 2022 | Hyperparameter Optimization, Meta-Learning | Code Available | 2 |
| Towards Building the Federated GPT: Federated Instruction Tuning | May 9, 2023 | Federated Learning | Code Available | 2 |
| GaussRender: Learning 3D Occupancy with Gaussian Rendering | Feb 7, 2025 | 3D geometry, Autonomous Vehicles | Code Available | 2 |
| Prompting for Numerical Sequences: A Case Study on Market Comment Generation | Apr 3, 2024 | Comment Generation, Data-to-Text Generation | Code Available | 2 |
| Learning Efficient Convolutional Networks through Network Slimming | Aug 22, 2017 | image-classification, Image Classification | Code Available | 2 |
| SEAL: Steerable Reasoning Calibration of Large Language Models for Free | Apr 7, 2025 | GSM8K | Code Available | 2 |
| I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders | Mar 24, 2025 | All | Code Available | 2 |
| Multi-Modal Fusion Transformer for End-to-End Autonomous Driving | Apr 19, 2021 | Autonomous Driving | Code Available | 2 |
| Harnessing Administrative Data Inventories to Create a Reliable Transnational Reference Database for Crop Type Monitoring | Oct 10, 2023 | Earth Observation | Code Available | 2 |
| Online Decision Transformer | Feb 11, 2022 | D4RL, Efficient Exploration | Code Available | 2 |
| Synthesizing Anyone, Anywhere, in Any Pose | Apr 6, 2023 | | Code Available | 2 |
| Learning Transferable Visual Models From Natural Language Supervision | Feb 26, 2021 | Action Recognition, Benchmarking | Code Available | 2 |
| When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation | Jan 1, 2024 | Attribute, Disentanglement | Code Available | 2 |
| Adapting Frechet Audio Distance for Generative Music Evaluation | Nov 2, 2023 | FAD | Code Available | 2 |
| Linearizing Large Language Models | May 10, 2024 | In-Context Learning, Mamba | Code Available | 2 |
| UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models | Jun 27, 2024 | Attribute, Benchmarking | Code Available | 2 |
| Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models | Jun 3, 2024 | Chunking, Mamba | Code Available | 2 |
| SpA-Former: Transformer image shadow detection and removal via spatial attention | Jun 22, 2022 | Shadow Detection, Shadow Detection And Removal | Code Available | 2 |
| Layer-Condensed KV Cache for Efficient Inference of Large Language Models | May 17, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| DYffusion: A Dynamics-informed Diffusion Model for Spatiotemporal Forecasting | Jun 3, 2023 | Computational Efficiency, Inductive Bias | Code Available | 2 |
| Kernel Neural Optimal Transport | May 30, 2022 | Image-to-Image Translation, Translation | Code Available | 2 |
| Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction | Apr 5, 2024 | graph construction, Open Information Extraction | Code Available | 2 |
| WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings | Jun 15, 2023 | Navigate | Code Available | 2 |
| Actuarial Applications of Natural Language Processing Using Transformers: Case Studies for Using Text Features in an Actuarial Context | Jun 4, 2022 | Transfer Learning | Code Available | 2 |
| Graph-based Neural Weather Prediction for Limited Area Modeling | Sep 29, 2023 | Weather Forecasting | Code Available | 2 |
| MOROCCO: Model Resource Comparison Framework | Apr 29, 2021 | model | Code Available | 2 |
| LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging | Jan 1, 2025 | Lesion Segmentation, Segmentation | Code Available | 2 |
| Dilated Neighborhood Attention Transformer | Sep 29, 2022 | Image Classification, Instance Segmentation | Code Available | 2 |
| Vakyansh: ASR Toolkit for Low Resource Indic languages | Mar 30, 2022 | Punctuation Restoration, speech-recognition | Code Available | 2 |
| Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention | Jan 1, 2025 | Hallucination, Response Generation | Code Available | 2 |
| Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings | Mar 25, 2025 | 4k, Action Recognition | Code Available | 2 |
| Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration | May 26, 2025 | Domain Generalization, Hallucination | Code Available | 2 |
| KaLM-Embedding-V2: Superior Training Techniques and Data Inspire A Versatile Embedding Model | Jun 26, 2025 | Representation Learning, Retrieval | Code Available | 2 |
| Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark | Jun 21, 2024 | Anomaly Detection, Out-of-Distribution Detection | Code Available | 2 |
| CodeS: Towards Building Open-source Language Models for Text-to-SQL | Feb 26, 2024 | Data Augmentation, Diagnostic | Code Available | 2 |
| TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer | Jul 27, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping | Apr 9, 2024 | Image Retrieval, Object | Code Available | 2 |
| DiffDock-PP: Rigid Protein-Protein Docking with Diffusion Models | Apr 8, 2023 | Drug Discovery, Protein Design | Code Available | 2 |
| Number it: Temporal Grounding Videos like Flipping Manga | Nov 15, 2024 | Highlight Detection, Moment Retrieval | Code Available | 2 |
| Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning | Apr 14, 2025 | Mathematical Reasoning, mbpp | Code Available | 2 |
| Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement | Oct 15, 2024 | Disentanglement, Inductive Bias | Code Available | 2 |
| Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation | Dec 1, 2022 | 3D Generation, Text to 3D | Code Available | 2 |
| RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models | Jan 12, 2025 | Image Segmentation, Segmentation | Code Available | 2 |
| A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation | Oct 2, 2024 | Image Generation, Quantization | Code Available | 2 |
| Photoreal Scene Reconstruction from an Egocentric Device | Jun 4, 2025 | | Code Available | 2 |
| How Much are Large Language Models Contaminated? A Comprehensive Survey and the LLMSanitize Library | Mar 31, 2024 | Question Answering | Code Available | 2 |
| TabLLM: Few-shot Classification of Tabular Data with Large Language Models | Oct 19, 2022 | Classification, Deep Learning | Code Available | 2 |