| A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library | Dec 19, 2023 | GPU | Code Available | 2 |
| XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX | Dec 19, 2023 | Diversity, GPU | Code Available | 2 |
| Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach | Dec 19, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response | Dec 18, 2023 | Contrastive Learning | Code Available | 2 |
| MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL | Dec 18, 2023 | SQL Parsing, Text to SQL | Code Available | 2 |
| SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing | Dec 18, 2023 | Decoder, Image Generation | Code Available | 2 |
| StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis | Dec 17, 2023 | Quantization, Singing Voice Synthesis | Code Available | 2 |
| VidToMe: Video Token Merging for Zero-Shot Video Editing | Dec 17, 2023 | Video Editing, Video Generation | Code Available | 2 |
| A Survey of Reasoning with Foundation Models | Dec 17, 2023 | Medical Diagnosis, Survey | Code Available | 2 |
| Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection | Dec 16, 2023 | DeepFake Detection, Face Swapping | Code Available | 2 |
| LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Dec 15, 2023 | Language Modelling, Mixture-of-Experts | Code Available | 2 |
| PathoDuet: Foundation Models for Pathological Slide Analysis of H&E and IHC Stains | Dec 15, 2023 | Self-Supervised Learning | Code Available | 2 |
| Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference | Dec 15, 2023 | Decoder, Denoising | Code Available | 2 |
| OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments | Dec 14, 2023 | Autonomous Driving, Depth Estimation | Code Available | 2 |
| 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting | Dec 14, 2023 | 3DGS, Image Generation | Code Available | 2 |
| Polyper: Boundary Sensitive Polyp Segmentation | Dec 14, 2023 | Segmentation | Code Available | 2 |
| DiffusionLight: Light Probes for Free by Painting a Chrome Ball | Dec 14, 2023 | Diversity, Lighting Estimation | Code Available | 2 |
| Holodeck: Language Guided Generation of 3D Embodied AI Environments | Dec 14, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 2 |
| ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks | Dec 14, 2023 | Abstractive Text Summarization, Code Generation | Code Available | 2 |
| MCANet: Medical Image Segmentation with Multi-Scale Cross-Axis Attention | Dec 14, 2023 | Image Segmentation, Lesion Segmentation | Code Available | 2 |
| Tokenize Anything via Prompting | Dec 14, 2023 | Decoder, Visual Prompting | Code Available | 2 |
| Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers | Dec 14, 2023 | 3D Reconstruction, Decoder | Code Available | 2 |
| UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation | Dec 14, 2023 | Motion Compensation, Multi-Object Tracking | Code Available | 2 |
| Interactive Humanoid: Online Full-Body Motion Reaction Synthesis with Social Affordance Canonicalization and Forecasting | Dec 14, 2023 | | Code Available | 2 |
| Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models | Dec 14, 2023 | Descriptive, Image Quality Assessment | Code Available | 2 |
| Agent Attention: On the Integration of Softmax and Linear Attention | Dec 14, 2023 | Computational Efficiency, image-classification | Code Available | 2 |
| Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers | Dec 13, 2023 | 3D Question Answering (3D-QA), Attribute | Code Available | 2 |
| DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes | Dec 13, 2023 | Autonomous Driving | Code Available | 2 |
| FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models | Dec 13, 2023 | 3D Face Animation, Audio Synthesis | Code Available | 2 |
| Boosting Latent Diffusion with Flow Matching | Dec 12, 2023 | Decoder, Diversity | Code Available | 2 |
| DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing | Dec 12, 2023 | Image Generation, Image Morphing | Code Available | 2 |
| FreeInit: Bridging Initialization Gap in Video Diffusion Models | Dec 12, 2023 | Denoising, Text-to-Video Generation | Code Available | 2 |
| GenDet: Towards Good Generalizations for AI-Generated Image Detection | Dec 12, 2023 | Anomaly Detection | Code Available | 2 |
| Reducing Energy Bloat in Large Model Training | Dec 12, 2023 | model | Code Available | 2 |
| BIRB: A Generalization Benchmark for Information Retrieval in Bioacoustics | Dec 12, 2023 | Information Retrieval, Representation Learning | Code Available | 2 |
| LMDrive: Closed-Loop End-to-End Driving with Large Language Models | Dec 12, 2023 | Autonomous Driving, Instruction Following | Code Available | 2 |
| COLMAP-Free 3D Gaussian Splatting | Dec 12, 2023 | 3DGS, Camera Pose Estimation | Code Available | 2 |
| CLIP in Medical Imaging: A Survey | Dec 12, 2023 | Medical Image Analysis, Survey | Code Available | 2 |
| ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image | Dec 12, 2023 | Image Segmentation, Interactive Segmentation | Code Available | 2 |
| Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer | Dec 11, 2023 | Style Transfer | Code Available | 2 |
| On Meta-Prompting | Dec 11, 2023 | In-Context Learning | Code Available | 2 |
| SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models | Dec 11, 2023 | | Code Available | 2 |
| DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection | Dec 11, 2023 | Anomaly Detection, Denoising | Code Available | 2 |
| ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems | Dec 11, 2023 | Image Generation, Semantic Segmentation | Code Available | 2 |
| Honeybee: Locality-enhanced Projector for Multimodal LLM | Dec 11, 2023 | MME, Science Question Answering | Code Available | 2 |
| EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | Dec 11, 2023 | Benchmarking, Emotional Intelligence | Code Available | 2 |
| Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering | Dec 11, 2023 | Image Registration | Code Available | 2 |
| SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction | Dec 10, 2023 | Lifelike 3D Human Generation | Code Available | 2 |
| Learning for CasADi: Data-driven Models in Numerical Optimization | Dec 10, 2023 | | Code Available | 2 |
| AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model | Dec 10, 2023 | Image Generation | Code Available | 2 |