| TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models | Apr 25, 2024 | Denoising, Image to Video Generation | Code Available | 2 |
| Multimodal Information Interaction for Medical Image Segmentation | Apr 25, 2024 | Heart Segmentation, Image Segmentation | Code Available | 2 |
| The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models | Apr 24, 2024 | Diversity, Navigate | Code Available | 2 |
| A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution | Apr 24, 2024 | Blind Super-Resolution, Image Restoration | Code Available | 2 |
| Telco-RAG: Navigating the Challenges of Retrieval-Augmented Language Models for Telecommunications | Apr 24, 2024 | RAG, Retrieval | Code Available | 2 |
| Let's Think Dot by Dot: Hidden Computation in Transformer Language Models | Apr 24, 2024 | | Code Available | 2 |
| Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges | Apr 24, 2024 | Drug Design, Inductive Bias | Code Available | 2 |
| MaGGIe: Masked Guided Gradual Human Instance Matting | Apr 24, 2024 | Image Matting, Video Matting | Code Available | 2 |
| zkLLM: Zero Knowledge Proofs for Large Language Models | Apr 24, 2024 | | Code Available | 2 |
| Gradformer: Graph Transformer with Exponential Decay | Apr 24, 2024 | Graph Classification, Graph Neural Network | Code Available | 2 |
| From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models | Apr 24, 2024 | Instruction Following | Code Available | 2 |
| Facilitating Advanced Sentinel-2 Analysis Through a Simplified Computation of Nadir BRDF Adjusted Reflectance | Apr 24, 2024 | | Code Available | 2 |
| Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization | Apr 23, 2024 | global-optimization, Optical Flow Estimation | Code Available | 2 |
| GSCo: Towards Generalizable AI in Medicine via Generalist-Specialist Collaboration | Apr 23, 2024 | Collaborative Inference, In-Context Learning | Code Available | 2 |
| SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation | Apr 23, 2024 | 3D Human Pose Estimation, Pose Estimation | Code Available | 2 |
| Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model | Apr 23, 2024 | 3D Point Cloud Classification, Mamba | Code Available | 2 |
| From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation | Apr 23, 2024 | Image Generation | Code Available | 2 |
| Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering | Apr 23, 2024 | Graph Question Answering, Hallucination | Code Available | 2 |
| An empirical study of LLaMA3 quantization: from LLMs to MLLMs | Apr 22, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| X-Ray: A Sequential 3D Representation For Generation | Apr 22, 2024 | 3D Generation, Object | Code Available | 2 |
| CLIP-GS: CLIP-Informed Gaussian Splatting for Real-time and View-consistent 3D Semantic Understanding | Apr 22, 2024 | Attribute | Code Available | 2 |
| SpaceByte: Towards Deleting Tokenization from Large Language Modeling | Apr 22, 2024 | Decoder, Language Modeling | Code Available | 2 |
| SwinFuSR: an image fusion-inspired model for RGB-guided thermal image super-resolution | Apr 22, 2024 | Image Super-Resolution, SSIM | Code Available | 2 |
| UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement | Apr 22, 2024 | 4k, Image Enhancement | Code Available | 2 |
| Graphic Design with Large Multimodal Model | Apr 22, 2024 | Layout Generation, model | Code Available | 2 |