| ADELIE: Aligning Large Language Models on Information Extraction | May 8, 2024 | | Code Available | 2 |
| Web-Shepherd: Advancing PRMs for Reinforcing Web Agents | May 21, 2025 | Large Language Model, Multimodal Large Language Model | Code Available | 2 |
| DeepPrivacy2: Towards Realistic Full-Body Anonymization | Nov 17, 2022 | Diversity, Face Anonymization | Code Available | 2 |
| Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting | Jun 18, 2022 | Graph Neural Network, Multivariate Time Series Forecasting | Code Available | 2 |
| EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering | May 30, 2025 | Denoising | Code Available | 2 |
| Graph Condensation: A Survey | Jan 22, 2024 | Fairness, Graph Generation | Code Available | 2 |
| Large Language Model Enhanced Recommender Systems: A Survey | Dec 18, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| SF-Loc: A Visual Mapping and Geo-Localization System based on Sparse Visual Structure Frames | Dec 2, 2024 | geo-localization, Pose Estimation | Code Available | 2 |
| FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation | May 30, 2025 | Hallucination | Code Available | 2 |
| LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers | Mar 18, 2025 | Automated Feature Engineering, Feature Engineering | Code Available | 2 |
| DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving | Mar 7, 2025 | Autonomous Driving, Bench2Drive | Code Available | 2 |
| LLark: A Multimodal Instruction-Following Language Model for Music | Oct 11, 2023 | Instruction Following, Language Modeling | Code Available | 2 |
| The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning | Sep 7, 2021 | reinforcement-learning, Reinforcement Learning (RL) | Code Available | 2 |
| Heterogeneity-Informed Meta-Parameter Learning for Spatiotemporal Time Series Forecasting | May 17, 2024 | Time Series, Time Series Forecasting | Code Available | 2 |
| OMLT: Optimization & Machine Learning Toolkit | Feb 4, 2022 | Bayesian Optimisation, BIG-bench Machine Learning | Code Available | 2 |
| HiGPT: Heterogeneous Graph Language Model | Feb 25, 2024 | Graph Learning, Language Modeling | Code Available | 2 |
| Objects With Lighting: A Real-World Dataset for Evaluating Reconstruction and Rendering for Object Relighting | Jan 17, 2024 | Inverse Rendering, Novel View Synthesis | Code Available | 2 |
| Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action | Dec 28, 2023 | Decoder, Image Generation | Code Available | 2 |
| Rethinking Diverse Human Preference Learning through Principal Component Analysis | Feb 18, 2025 | | Code Available | 2 |
| Generative Inbetweening through Frame-wise Conditions-Driven Video Generation | Dec 16, 2024 | Video Generation | Code Available | 2 |
| See More Details: Efficient Image Super-Resolution by Experts Mining | Feb 5, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| TAVA: Template-free Animatable Volumetric Actors | Jun 17, 2022 | | Code Available | 2 |
| Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning | May 20, 2025 | Domain Generalization, Multimodal Reasoning | Code Available | 2 |
| GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Mar 8, 2024 | Quantization | Code Available | 2 |
| Cross-View Referring Multi-Object Tracking | Dec 23, 2024 | Cross-view Referring Multi-Object Tracking, Multi-Object Tracking | Code Available | 2 |
| Gaussian Mixture Flow Matching Models | Apr 7, 2025 | Denoising, Image Generation | Code Available | 2 |
| Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond | Feb 14, 2022 | Explainable Artificial Intelligence (XAI) | Code Available | 2 |
| Searching Latent Program Spaces | Nov 13, 2024 | ARC, Program induction | Code Available | 2 |
| MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation | Aug 25, 2024 | Image Segmentation, Mamba | Code Available | 2 |
| Equalized Focal Loss for Dense Long-Tailed Object Detection | Jan 7, 2022 | Long-tailed Object Detection, Object | Code Available | 2 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| A Touch, Vision, and Language Dataset for Multimodal Alignment | Feb 20, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Latent Ewald summation for machine learning of long-range interactions | Aug 27, 2024 | | Code Available | 2 |
| Exploring Contrastive Learning for Multimodal Detection of Misogynistic Memes | Jul 1, 2022 | Contrastive Learning | Code Available | 2 |
| VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | Oct 28, 2022 | Referring Expression Segmentation, Referring Video Object Segmentation | Code Available | 2 |
| DeepDPM: Deep Clustering With an Unknown Number of Clusters | Mar 27, 2022 | Clustering, Deep Clustering | Code Available | 2 |
| LayoutPrompter: Awaken the Design Ability of Large Language Models | Nov 11, 2023 | In-Context Learning, Layout Generation | Code Available | 2 |
| Behind Maya: Building a Multilingual Vision Language Model | May 13, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Alignment faking in large language models | Dec 18, 2024 | Large Language Model | Code Available | 2 |
| Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization | Mar 21, 2024 | geo-localization, Re-Ranking | Code Available | 2 |
| CelebV-HQ: A Large-Scale Video Facial Attributes Dataset | Jul 25, 2022 | Attribute, Diversity | Code Available | 2 |
| Jodi: Unification of Visual Generation and Understanding via Joint Modeling | May 25, 2025 | | Code Available | 2 |
| Dress Code: High-Resolution Multi-Category Virtual Try-On | Apr 18, 2022 | Virtual Try-on, Vocal Bursts Intensity Prediction | Code Available | 2 |
| Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning | Apr 17, 2025 | Multimodal Reasoning, Reinforcement Learning (RL) | Code Available | 2 |
| AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers | Nov 17, 2024 | In-Context Learning, Meta-Learning | Code Available | 2 |
| 3D Student Splatting and Scooping | Mar 13, 2025 | 3DGS, Neural Rendering | Code Available | 2 |
| TJ4DRadSet: A 4D Radar Dataset for Autonomous Driving | Apr 28, 2022 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Delta Decompression for MoE-based LLMs Compression | Feb 24, 2025 | Diversity, Mixture-of-Experts | Code Available | 2 |
| CMMLU: Measuring massive multitask language understanding in Chinese | Jun 15, 2023 | Large Language Model | Code Available | 2 |
| SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking | Mar 2, 2025 | Fact Checking, Fact Verification | Code Available | 2 |