| Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration | Jun 15, 2023 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| AgentTuning: Enabling Generalized Agent Abilities for LLMs | Oct 19, 2023 | Memorization | Code Available | 3 | 5 |
| Hawk: Learning to Understand Open-World Video Anomalies | May 27, 2024 | Anomaly Detection, Question Answering | Code Available | 3 | 5 |
| PhoWhisper: Automatic Speech Recognition for Vietnamese | Mar 27, 2024 | Automatic Speech Recognition, speech-recognition | Code Available | 3 | 5 |
| Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Jun 26, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 3 | 5 |
| How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model | Apr 15, 2024 | Decoder, Image Segmentation | Code Available | 3 | 5 |
| Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | Feb 26, 2024 | GPU, Minecraft | Code Available | 3 | 5 |
| Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection | Jun 8, 2020 | Dense Object Detection, General Classification | Code Available | 3 | 5 |
| Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs | Mar 8, 2025 | | Code Available | 3 | 5 |
| DRCT: Saving Image Super-resolution away from Information Bottleneck | Mar 31, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 3 | 5 |
| TopoX: A Suite of Python Packages for Machine Learning on Topological Domains | Feb 4, 2024 | | Code Available | 3 | 5 |
| OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia | Jan 23, 2025 | Emotion Recognition, Event Detection | Code Available | 3 | 5 |
| Emu3: Next-Token Prediction is All You Need | Sep 27, 2024 | All | Code Available | 3 | 5 |
| Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving | Apr 3, 2025 | Reinforcement Learning (RL) | Code Available | 3 | 5 |
| MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion | Nov 18, 2023 | Video Generation | Code Available | 3 | 5 |
| MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries | Jan 27, 2024 | Benchmarking, RAG | Code Available | 3 | 5 |
| NerfAcc: A General NeRF Acceleration Toolbox | Oct 10, 2022 | NeRF | Code Available | 3 | 5 |
| Llemma: An Open Language Model For Mathematics | Oct 16, 2023 | Arithmetic Reasoning, Automated Theorem Proving | Code Available | 3 | 5 |
| Datasets: A Community Library for Natural Language Processing | Sep 7, 2021 | Image Classification, Object Recognition | Code Available | 3 | 5 |
| Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction | Feb 15, 2023 | 3D Semantic Scene Completion, Autonomous Driving | Code Available | 3 | 5 |
| ResNeSt: Split-Attention Networks | Apr 19, 2020 | image-classification, Image Classification | Code Available | 3 | 5 |
| MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer | Jan 19, 2023 | Image Generation, Image Segmentation | Code Available | 3 | 5 |
| IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus | Feb 22, 2024 | Zero-shot Generalization | Code Available | 3 | 5 |
| StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs | Mar 26, 2025 | Benchmarking | Code Available | 3 | 5 |
| Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Apr 10, 2025 | Math, MMLU | Code Available | 3 | 5 |
| Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs | Jan 11, 2024 | Representation Learning, Self-Supervised Learning | Code Available | 3 | 5 |
| Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Jan 9, 2023 | 2D Object Detection, Contrastive Learning | Code Available | 3 | 5 |
| Inferring Articulated Rigid Body Dynamics from RGBD Video | Mar 20, 2022 | Contact mechanics, Inverse Rendering | Code Available | 3 | 5 |
| SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Apr 25, 2024 | Benchmarking, Multiple-choice | Code Available | 3 | 5 |
| Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Mar 18, 2024 | Continual Learning, Incremental Learning | Code Available | 3 | 5 |
| Neural Network Verification with Branch-and-Bound for General Nonlinearities | May 31, 2024 | | Code Available | 3 | 5 |
| AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation | Apr 4, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 3 | 5 |
| DrivAerNet: A Parametric Car Dataset for Data-Driven Aerodynamic Design and Prediction | Mar 12, 2024 | | Code Available | 3 | 5 |
| Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection | Mar 4, 2025 | Anomaly Detection, Multi-class Anomaly Detection | Code Available | 3 | 5 |
| Diffusion Model-Based Video Editing: A Survey | Jun 26, 2024 | model, Survey | Code Available | 3 | 5 |
| Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer | Mar 7, 2022 | | Code Available | 3 | 5 |
| BoT-SORT: Robust Associations Multi-Pedestrian Tracking | Jun 29, 2022 | Multi-Object Tracking, Object | Code Available | 3 | 5 |
| TopoBench: A Framework for Benchmarking Topological Deep Learning | Jun 9, 2024 | Benchmarking, Deep Learning | Code Available | 3 | 5 |
| InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation | Sep 12, 2023 | GPU, Image Generation | Code Available | 3 | 5 |
| Impact of architecture on robustness and interpretability of multispectral deep neural networks | Sep 21, 2023 | Deep Learning | Code Available | 3 | 5 |
| Are Language Models Actually Useful for Time Series Forecasting? | Jun 22, 2024 | Time Series, Time Series Forecasting | Code Available | 3 | 5 |
| PDEBENCH: An Extensive Benchmark for Scientific Machine Learning | Oct 13, 2022 | | Code Available | 3 | 5 |
| Activating More Pixels in Image Super-Resolution Transformer | May 9, 2022 | Image Super-Resolution, Super-Resolution | Code Available | 3 | 5 |
| The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results | Aug 18, 2024 | | Code Available | 3 | 5 |
| ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system | Jan 12, 2025 | Chatbot | Code Available | 3 | 5 |
| The Manga Whisperer: Automatically Generating Transcriptions for Comics | Jan 18, 2024 | | Code Available | 3 | 5 |
| Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis | Jun 10, 2024 | 2k, 3DGS | Code Available | 3 | 5 |
| Speedy-Splat: Fast 3D Gaussian Splatting with Sparse Pixels and Sparse Primitives | Nov 30, 2024 | 3D Scene Reconstruction, NeRF | Code Available | 3 | 5 |
| Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation | Jun 13, 2024 | Multi-agent Reinforcement Learning | Code Available | 3 | 5 |
| Deep Neural Networks for Rank-Consistent Ordinal Regression Based On Conditional Probabilities | Nov 17, 2021 | regression | Code Available | 3 | 5 |