| Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts | May 18, 2024 | Mixture-of-Experts, Visual Question Answering | Code Available | 5 |
| RLHF Workflow: From Reward Modeling to Online RLHF | May 13, 2024 | Chatbot, HumanEval | Code Available | 5 |
| Single-seed generation of Brownian paths and integrals for adaptive and high order SDE solvers | May 10, 2024 | | Code Available | 5 |
| Evaluating Real-World Robot Manipulation Policies in Simulation | May 9, 2024 | Robotic Grasping, Robot Manipulation | Code Available | 5 |
| Granite Code Models: A Family of Open Foundation Models for Code Intelligence | May 7, 2024 | Code Generation, Decoder | Code Available | 5 |
| AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding | May 6, 2024 | Metric Learning, Self-Supervised Learning | Code Available | 5 |
| When LLMs Meet Cybersecurity: A Systematic Literature Review | May 6, 2024 | Systematic Literature Review | Code Available | 5 |
| Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation | May 2, 2024 | MuJoCo, Reinforcement Learning (RL) | Code Available | 5 |
| Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models | May 2, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| XFeat: Accelerated Features for Lightweight Image Matching | Apr 30, 2024 | CPU, Keypoint detection and image matching | Code Available | 5 |
| Make Your LLM Fully Utilize the Context | Apr 25, 2024 | 4k, Information Retrieval | Code Available | 5 |
| ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving | Apr 25, 2024 | Diversity | Code Available | 5 |
| NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results | Apr 22, 2024 | 4k, Image Enhancement | Code Available | 5 |
| MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit | Apr 22, 2024 | Math | Code Available | 5 |
| Do "English" Named Entity Recognizers Work Well on Global Englishes? | Apr 20, 2024 | named-entity-recognition, Named Entity Recognition | Code Available | 5 |
| Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs | Apr 19, 2024 | Event Extraction, In-Context Learning | Code Available | 5 |
| Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean | Apr 18, 2024 | Automated Theorem Proving, Hallucination | Code Available | 5 |
| Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes | Apr 16, 2024 | 3DGS, Novel View Synthesis | Code Available | 5 |
| Magic Clothing: Controllable Garment-Driven Image Synthesis | Apr 15, 2024 | Image Generation | Code Available | 5 |
| SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks | Apr 15, 2024 | Quantization | Code Available | 5 |
| Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization | Apr 15, 2024 | Audio Generation | Code Available | 5 |
| MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts | Apr 13, 2024 | Diversity, Language Modeling | Code Available | 5 |
| The Path To Autonomous Cyber Defense | Apr 12, 2024 | | Code Available | 5 |
| LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models | Apr 10, 2024 | Decision Making | Code Available | 5 |
| LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders | Apr 9, 2024 | Contrastive Learning, Decoder | Code Available | 5 |
| SpeechAlign: Aligning Speech Generation to Human Preferences | Apr 8, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer | Apr 8, 2024 | MuJoCo, Physical Simulations | Code Available | 5 |
| MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators | Apr 7, 2024 | Text-to-Video Generation, Video Generation | Code Available | 5 |
| Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators | Apr 6, 2024 | Chatbot, counterfactual | Code Available | 5 |
| SpatialTracker: Tracking Any 2D Pixels in 3D Space | Apr 5, 2024 | | Code Available | 5 |
| ReFT: Representation Finetuning for Language Models | Apr 4, 2024 | Arithmetic Reasoning | Code Available | 5 |
| Masked Completion via Structured Diffusion with White-Box Transformers | Apr 3, 2024 | Representation Learning | Code Available | 5 |
| Long-context LLMs Struggle with Long In-context Learning | Apr 2, 2024 | 2k, In-Context Learning | Code Available | 5 |
| CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians | Apr 1, 2024 | 3DGS, 3D Scene Reconstruction | Code Available | 5 |
| Measuring Taiwanese Mandarin Language Understanding | Mar 29, 2024 | | Code Available | 5 |
| TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods | Mar 29, 2024 | Benchmarking, Multivariate Time Series Forecasting | Code Available | 5 |
| InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds | Mar 29, 2024 | 3D Reconstruction, Novel View Synthesis | Code Available | 5 |
| GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond | Mar 28, 2024 | 3DGS, Novel View Synthesis | Code Available | 5 |
| UniDepth: Universal Monocular Metric Depth Estimation | Mar 27, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available | 5 |
| ChatDBG: Augmenting Debugging with Large Language Models | Mar 25, 2024 | C++ code, Navigate | Code Available | 5 |
| StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text | Mar 21, 2024 | Text-to-Video Generation, Video Generation | Code Available | 5 |
| MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images | Mar 21, 2024 | 3D Reconstruction, Generalizable Novel View Synthesis | Code Available | 5 |
| Mora: Enabling Generalist Video Generation via A Multi-Agent Framework | Mar 20, 2024 | Image to Video Generation, Text-to-Video Generation | Code Available | 5 |
| Evolutionary Optimization of Model Merging Recipes | Mar 19, 2024 | Evolutionary Algorithms, Math | Code Available | 5 |
| FeatUp: A Model-Agnostic Framework for Features at Any Resolution | Mar 15, 2024 | Depth Estimation, Depth Prediction | Code Available | 5 |
| Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator | Mar 13, 2024 | | Code Available | 5 |
| Fundamental Components of Deep Learning: A category-theoretic approach | Mar 13, 2024 | Deep Learning, Descriptive | Code Available | 5 |
| WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? | Mar 12, 2024 | Language Modeling, Language Modelling | Code Available | 5 |
| Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Mar 12, 2024 | Image Generation, Language Modelling | Code Available | 5 |
| pyvene: A Library for Understanding and Improving PyTorch Models via Interventions | Mar 12, 2024 | Model Editing | Code Available | 5 |