| Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Feb 10, 2024 | CPU, GPU | Code Available | 3 |
| FaceXFormer: A Unified Transformer for Facial Analysis | Mar 19, 2024 | Age and Gender Estimation, Age Estimation | Code Available | 3 |
| Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset | May 17, 2024 | 16k, Benchmarking | Code Available | 3 |
| Vaporetto: Efficient Japanese Tokenization Based on Improved Pointwise Linear Classification | Jun 24, 2024 | | Code Available | 3 |
| HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors | Nov 17, 2022 | Activity Prediction, Activity Recognition | Code Available | 3 |
| A Note on the Prediction-Powered Bootstrap | May 28, 2024 | Prediction | Code Available | 3 |
| S-Graphs 2.0 -- A Hierarchical-Semantic Optimization and Loop Closure for SLAM | Feb 25, 2025 | global-optimization, Management | Code Available | 3 |
| AudioBench: A Universal Benchmark for Audio Large Language Models | Jun 23, 2024 | Audio Scene Understanding, Instruction Following | Code Available | 3 |
| Alias-Free Generative Adversarial Networks | Jun 23, 2021 | Image Generation | Code Available | 3 |
| HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting | May 24, 2024 | NeRF, Novel View Synthesis | Code Available | 3 |
| Embodied CoT Distillation From LLM To Off-the-shelf Agents | Dec 16, 2024 | Decision Making, In-Context Learning | Code Available | 3 |
| MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields | Jun 15, 2022 | Computational chemistry | Code Available | 3 |
| GLiREL -- Generalist Model for Zero-Shot Relation Extraction | Jan 6, 2025 | model, named-entity-recognition | Code Available | 3 |
| ZIM: Zero-Shot Image Matting for Anything | Nov 1, 2024 | Image Inpainting, Image Matting | Code Available | 3 |
| ivis Dimensionality Reduction Framework for Biomacromolecular Simulations | Apr 22, 2020 | Dimensionality Reduction | Code Available | 3 |
| Vulnerability Detection with Code Language Models: How Far Are We? | Mar 27, 2024 | Vulnerability Detection | Code Available | 3 |
| Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | May 26, 2023 | GSM8K, Multimodal Reasoning | Code Available | 3 |
| Vision-LSTM: xLSTM as Generic Vision Backbone | Jun 6, 2024 | | Code Available | 3 |
| A Survey on Evaluation of Large Language Models | Jul 6, 2023 | Ethics, Survey | Code Available | 3 |
| Movie Gen: A Cast of Media Foundation Models | Oct 17, 2024 | Audio Generation, Video Editing | Code Available | 3 |
| Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey | Aug 22, 2024 | scientific discovery, Symbolic Regression | Code Available | 3 |
| Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel Concepts, Object | Code Available | 3 |
| Point Transformer V3: Simpler, Faster, Stronger | Dec 15, 2023 | 3D Semantic Segmentation, LIDAR Semantic Segmentation | Code Available | 3 |
| OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale | Mar 4, 2025 | Text to SQL, Text-To-SQL | Code Available | 3 |
| OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models | Jul 5, 2023 | | Code Available | 3 |
| Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion | Mar 20, 2024 | Autonomous Vehicles, Denoising | Code Available | 3 |
| Delay-penalized CTC implemented based on Finite State Transducer | May 19, 2023 | Attribute | Code Available | 3 |
| BlackMamba: Mixture of Experts for State-Space Models | Feb 1, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Dec 30, 2024 | object-detection, Object Detection | Code Available | 3 |
| Reinforcement Learning for Reasoning in Large Language Models with One Training Example | Apr 29, 2025 | Domain Generalization, Math | Code Available | 3 |
| OneChart: Purify the Chart Structural Extraction via One Auxiliary Token | Apr 15, 2024 | Decoder | Code Available | 3 |
| AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos | Mar 30, 2025 | | Code Available | 3 |
| What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models | Jul 2, 2024 | | Code Available | 3 |
| StyleShot: A Snapshot on Any Style | Jul 1, 2024 | Image Generation, Style Transfer | Code Available | 3 |
| Theia: Distilling Diverse Vision Foundation Models for Robot Learning | Jul 29, 2024 | | Code Available | 3 |
| BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec | Sep 9, 2024 | Quantization | Code Available | 3 |
| Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning | May 25, 2023 | reinforcement-learning, Reinforcement Learning | Code Available | 3 |
| Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Mar 30, 2023 | Human Parsing, Pedestrian Attribute Recognition | Code Available | 3 |
| DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection | Apr 19, 2024 | Benchmarking, DeepFake Detection | Code Available | 3 |
| VAD: Vectorized Scene Representation for Efficient Autonomous Driving | Mar 21, 2023 | Autonomous Driving, Bench2Drive | Code Available | 3 |
| Scaling Diffusion Transformers to 16 Billion Parameters | Jul 16, 2024 | Attribute, Conditional Image Generation | Code Available | 3 |
| Ola: Pushing the Frontiers of Omni-Modal Language Model | Feb 6, 2025 | cross-modal alignment, Language Modeling | Code Available | 3 |
| SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery | Dec 15, 2023 | Contrastive Learning, Earth Observation | Code Available | 3 |
| LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning | Jun 5, 2023 | Benchmarking | Code Available | 3 |
| Matcha-TTS: A fast TTS architecture with conditional flow matching | Sep 6, 2023 | Acoustic Modelling, Decoder | Code Available | 3 |
| Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models | Jul 9, 2024 | Vision and Language Navigation | Code Available | 3 |
| Decoding-based Regression | Jan 31, 2025 | Density Estimation, regression | Code Available | 3 |
| OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis | Dec 27, 2024 | Diversity, Synthetic Data Generation | Code Available | 3 |
| Demystifying Long Chain-of-Thought Reasoning in LLMs | Feb 5, 2025 | Reinforcement Learning (RL) | Code Available | 3 |
| MAXIM: Multi-Axis MLP for Image Processing | Jan 9, 2022 | Deblurring, Denoising | Code Available | 3 |