| VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model | Jan 21, 2025 | Image Generation, Instruction Following | Code Available | 3 |
| Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning | Oct 10, 2024 | 3D Parameter-Efficient Fine-Tuning for Classification, 3D Point Cloud Classification | Code Available | 3 |
| GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting | Nov 24, 2023 | NeRF | Code Available | 3 |
| GraphStorm: all-in-one graph machine learning framework for industry applications | Jun 10, 2024 | All, graph construction | Code Available | 3 |
| TokenPacker: Efficient Visual Projector for Multimodal LLM | Jul 2, 2024 | Language Modelling, Large Language Model | Code Available | 3 |
| WeatherMesh-3: Fast and accurate operational global weather forecasting | Mar 28, 2025 | Computational Efficiency, GPU | Code Available | 3 |
| NdLinear Is All You Need for Representation Learning | Mar 21, 2025 | All, Representation Learning | Code Available | 3 |
| Bake off redux: a review and experimental evaluation of recent time series classification algorithms | Apr 25, 2023 | Dynamic Time Warping, Time Series | Code Available | 3 |
| TrafficLLM: Enhancing Large Language Models for Network Traffic Analysis with Generic Traffic Representation | Apr 5, 2025 | | Code Available | 3 |
| CameraHMR: Aligning People with Perspective | Nov 12, 2024 | 3D human pose and shape estimation | Code Available | 3 |
| DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge | Jul 6, 2025 | Image Generation, Multimodal Reasoning | Code Available | 3 |
| DEFOM-Stereo: Depth Foundation Model Based Stereo Matching | Jan 16, 2025 | Depth Estimation, Disparity Estimation | Code Available | 3 |
| Rainbow: Combining Improvements in Deep Reinforcement Learning | Oct 6, 2017 | Atari Games, Deep Reinforcement Learning | Code Available | 3 |
| Mambular: A Sequential Model for Tabular Deep Learning | Aug 12, 2024 | Deep Learning, Mamba | Code Available | 3 |
| Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization | Jun 6, 2024 | Denoising, Image Generation | Code Available | 3 |
| WHAC: World-grounded Humans and Cameras | Mar 19, 2024 | Camera Pose Estimation, Pose Estimation | Code Available | 3 |
| GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations | Feb 19, 2024 | Card Games, Logical Reasoning | Code Available | 3 |
| Generative AI Act II: Test Time Scaling Drives Cognition Engineering | Apr 18, 2025 | Prompt Engineering | Code Available | 3 |
| ArxivDIGESTables: Synthesizing Scientific Literature into Tables using Language Models | Oct 25, 2024 | | Code Available | 3 |
| Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning | Feb 12, 2025 | RAG, Text to SQL | Code Available | 3 |
| Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI | Jan 25, 2024 | | Code Available | 3 |
| Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | Oct 7, 2024 | Natural Language Visual Grounding, Navigate | Code Available | 3 |
| AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models | May 22, 2025 | Benchmarking, Fairness | Code Available | 3 |
| From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models | Mar 18, 2024 | Chart Understanding, Data Visualization | Code Available | 3 |
| DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models | Jun 17, 2024 | Document Classification, Visual Grounding | Code Available | 3 |
| Chain of Draft: Thinking Faster by Writing Less | Feb 25, 2025 | | Code Available | 3 |
| Data Augmentation for Sequential Recommendation: A Survey | Sep 20, 2024 | Data Augmentation, Recommendation Systems | Code Available | 3 |
| Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale | Sep 25, 2024 | Large Language Model | Code Available | 3 |
| MLVU: Benchmarking Multi-task Long Video Understanding | Jun 6, 2024 | Benchmarking, Video Understanding | Code Available | 3 |
| UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | Nov 27, 2023 | Image Classification, Object Detection | Code Available | 3 |
| ECON: Explicit Clothed humans Optimized via Normal integration | Dec 14, 2022 | 3D Human Reconstruction, Surface Reconstruction | Code Available | 3 |
| Partially Rewriting a Transformer in Natural Language | Jan 31, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| A Clean Slate for Offline Reinforcement Learning | Apr 15, 2025 | Offline RL, reinforcement-learning | Code Available | 3 |
| MarioGPT: Open-Ended Text2Level Generation through Large Language Models | Feb 12, 2023 | | Code Available | 3 |
| PINGS: Gaussian Splatting Meets Distance Fields within a Point-Based Implicit Neural Map | Feb 9, 2025 | | Code Available | 3 |
| VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models | Jun 19, 2024 | GPU, Language Modeling | Code Available | 3 |
| OS-ATLAS: A Foundation Action Model for Generalist GUI Agents | Oct 30, 2024 | Natural Language Visual Grounding | Code Available | 3 |
| HadaCore: Tensor Core Accelerated Hadamard Transform Kernel | Dec 12, 2024 | GPU, MMLU | Code Available | 3 |
| Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling | Aug 16, 2024 | Retrieval | Code Available | 3 |
| Description Boosting for Zero-Shot Entity and Relation Classification | Jun 4, 2024 | Relation, Relation Classification | Code Available | 3 |
| LibCity: A Unified Library Towards Efficient and Comprehensive Urban Spatial-Temporal Prediction | Apr 27, 2023 | Prediction | Code Available | 3 |
| Bird-Eye Transformers for Text Generation Models | Oct 8, 2022 | Attribute, Inductive Bias | Code Available | 3 |
| Lightplane: Highly-Scalable Components for Neural 3D Fields | Apr 30, 2024 | 3D Reconstruction | Code Available | 3 |
| Apollo: Band-sequence Modeling for High-Quality Audio Restoration | Sep 13, 2024 | Computational Efficiency, Speech Enhancement | Code Available | 3 |
| ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning | May 19, 2025 | Machine Translation, reinforcement-learning | Code Available | 3 |
| Image Quality Assessment for Magnetic Resonance Imaging | Mar 15, 2022 | Denoising, Image Enhancement | Code Available | 3 |
| RoadBEV: Road Surface Reconstruction in Bird's Eye View | Apr 9, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 3 |
| MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse | Mar 24, 2025 | Layout Generation, Reinforcement Learning (RL) | Code Available | 3 |
| Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks | Nov 22, 2022 | Math | Code Available | 3 |
| XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | Jul 14, 2022 | 2D Human Pose Estimation, 2D Object Detection | Code Available | 3 |