| SegVol: Universal and Interactive Volumetric Medical Image Segmentation | Nov 22, 2023 | Computed Tomography (CT), Image Segmentation | Code Available | 2 |
| FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design | Nov 23, 2023 | Decision Making, Language Modelling | Code Available | 2 |
| OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving | Nov 27, 2023 | Autonomous Driving | Code Available | 2 |
| Adapter is All You Need for Tuning Visual Tasks | Nov 25, 2023 | All, image-classification | Code Available | 2 |
| Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras | Nov 28, 2023 | Neural Rendering | Code Available | 2 |
| H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models | Sep 21, 2023 | | Code Available | 2 |
| Achieving Cross Modal Generalization with Multimodal Unified Representation | Sep 21, 2023 | | Code Available | 2 |
| M^4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models | Sep 26, 2023 | | Code Available | 2 |
| Language Models can Solve Computer Tasks | Mar 30, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving | Nov 29, 2023 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 |
| Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation | Jun 29, 2023 | 3D Shape Generation, Decoder | Code Available | 2 |
| Spike-driven Transformer | Jul 4, 2023 | | Code Available | 2 |
| StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter | Dec 1, 2023 | Disentanglement, Text-to-Video Generation | Code Available | 2 |
| TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Dec 4, 2023 | Dense Captioning, Highlight Detection | Code Available | 2 |
| Aligning and Prompting Everything All at Once for Universal Visual Perception | Dec 4, 2023 | All, Object | Code Available | 2 |
| Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation | Dec 5, 2023 | Logical Reasoning | Code Available | 2 |
| GauHuman: Articulated Gaussian Splatting from Monocular Human Videos | Dec 5, 2023 | Generalizable Novel View Synthesis, NeRF | Code Available | 2 |
| DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Jun 18, 2024 | Arithmetic Reasoning, Math | Code Available | 2 |
| Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models | Jun 7, 2023 | Diversity, Image Generation | Code Available | 2 |
| Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning | Mar 29, 2023 | GPU, reinforcement-learning | Code Available | 2 |
| AnimateZero: Video Diffusion Models are Zero-Shot Image Animators | Dec 6, 2023 | Image Animation, Video Generation | Code Available | 2 |
| Mind2Web: Towards a Generalist Agent for the Web | Jun 9, 2023 | | Code Available | 2 |
| ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling | Jul 4, 2023 | Benchmarking, Weather Forecasting | Code Available | 2 |
| When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment | Jul 7, 2023 | Reinforcement Learning (RL) | Code Available | 2 |
| UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models | Dec 8, 2023 | Image Generation, Scene Text Editing | Code Available | 2 |
| UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models | Feb 9, 2023 | Denoising, Image Generation | Code Available | 2 |
| Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection | Jul 5, 2024 | Novel Object Detection, object-detection | Code Available | 2 |
| LMDrive: Closed-Loop End-to-End Driving with Large Language Models | Dec 12, 2023 | Autonomous Driving, Instruction Following | Code Available | 2 |
| CLIP in Medical Imaging: A Survey | Dec 12, 2023 | Medical Image Analysis, Survey | Code Available | 2 |
| Steering Llama 2 via Contrastive Activation Addition | Dec 9, 2023 | Multiple-choice | Code Available | 2 |
| PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning | Aug 8, 2023 | Representation Learning | Code Available | 2 |
| StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis | Dec 17, 2023 | Quantization, Singing Voice Synthesis | Code Available | 2 |
| XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX | Dec 19, 2023 | Diversity, GPU | Code Available | 2 |
| VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models | Jul 12, 2023 | Form, Language Modelling | Code Available | 2 |
| Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference | Dec 23, 2023 | GPU, High-Level Synthesis | Code Available | 2 |
| CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Sep 28, 2024 | image-classification, Image Classification | Code Available | 2 |
| OpenRL: A Unified Reinforcement Learning Framework | Dec 20, 2023 | reinforcement-learning, Reinforcement Learning | Code Available | 2 |
| ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios | Jan 1, 2024 | | Code Available | 2 |
| Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution | Dec 30, 2023 | Decoder, Image Generation | Code Available | 2 |
| Graph Neural Networks for Tabular Data Learning: A Survey with Taxonomy and Directions | Jan 4, 2024 | Representation Learning, Survey | Code Available | 2 |
| Grimoire is All You Need for Enhancing Large Language Models | Jan 7, 2024 | All, In-Context Learning | Code Available | 2 |
| ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video | Jan 10, 2024 | Video Summarization | Code Available | 2 |
| PhilEO Bench: Evaluating Geo-Spatial Foundation Models | Jan 9, 2024 | Density Estimation, Earth Observation | Code Available | 2 |
| RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale | Jan 9, 2024 | Depth Estimation, Depth Prediction | Code Available | 2 |
| Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation | Jan 9, 2024 | Image Segmentation, Segmentation | Code Available | 2 |
| EmoLLMs: A Series of Emotional Large Language Models and Annotation Tools for Comprehensive Affective Analysis | Jan 16, 2024 | Instruction Following, regression | Code Available | 2 |
| Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Jan 16, 2024 | Domain Generalization, Image Generation | Code Available | 2 |
| DiffMoog: a Differentiable Modular Synthesizer for Sound Matching | Jan 23, 2024 | Audio Synthesis | Code Available | 2 |
| A Survey on Learning from Graphs with Heterophily: Recent Advances and Future Directions | Jan 18, 2024 | Graph Learning, Survey | Code Available | 2 |
| Denoising Diffusion Probabilistic Models | Jun 19, 2020 | Denoising, Density Estimation | Code Available | 2 |