| ARoFace: Alignment Robustness to Improve Low-Quality Face Recognition | Jul 20, 2024 | Data Augmentation, Face Alignment | Code Available | 2 |
| mdCATH: A Large-Scale MD Dataset for Data-Driven Computational Biophysics | Jul 20, 2024 | | Code Available | 2 |
| Intelligent Artistic Typography: A Comprehensive Review of Artistic Text Design and Generation | Jul 20, 2024 | Text Generation | Code Available | 2 |
| PlacidDreamer: Advancing Harmony in Text-to-3D Generation | Jul 19, 2024 | 3D Generation, Text to 3D | Code Available | 2 |
| Composer's Assistant 2: Interactive Multi-Track MIDI Infilling with Fine-Grained User Control | Jul 19, 2024 | | Code Available | 2 |
| ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation | Jul 19, 2024 | Decoder, Image Segmentation | Code Available | 2 |
| T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation | Jul 19, 2024 | Attribute, Language Modeling | Code Available | 2 |
| PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training | Jul 19, 2024 | Point Cloud Registration | Code Available | 2 |
| Mono-ViFI: A Unified Learning Framework for Self-supervised Single- and Multi-frame Monocular Depth Estimation | Jul 19, 2024 | Data Augmentation, Depth Estimation | Code Available | 2 |
| Longhorn: State Space Models are Amortized Online Learners | Jul 19, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery | Jul 19, 2024 | | Code Available | 2 |
| RealViformer: Investigating Attention for Real-World Video Super-Resolution | Jul 19, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| Internal Consistency and Self-Feedback in Large Language Models: A Survey | Jul 19, 2024 | | Code Available | 2 |
| Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion | Jul 19, 2024 | class-incremental learning, Class Incremental Learning | Code Available | 2 |
| RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering | Jul 19, 2024 | Domain Generalization, Form | Code Available | 2 |
| 6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry | Jul 19, 2024 | Head Pose Estimation, Pose Estimation | Code Available | 2 |
| FCN: Fusing Exponential and Linear Cross Network for Click-Through Rate Prediction | Jul 18, 2024 | Click-Through Rate Prediction | Code Available | 2 |
| PetFace: A Large-Scale Dataset and Benchmark for Animal Identification | Jul 18, 2024 | Face Identification, Face Verification | Code Available | 2 |
| Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies | Jul 18, 2024 | ARC | Code Available | 2 |
| Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review | Jul 18, 2024 | Reinforcement Learning (RL) | Code Available | 2 |
| CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis | Jul 18, 2024 | Decision Making, Diagnostic | Code Available | 2 |
| Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks | Jul 18, 2024 | Autonomous Driving, BEV Segmentation | Code Available | 2 |
| A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks | Jul 18, 2024 | | Code Available | 2 |
| Forecasting GPU Performance for Deep Learning Training and Inference | Jul 18, 2024 | Deep Learning, GPU | Code Available | 2 |
| Weak-to-Strong Reasoning | Jul 18, 2024 | GSM8K, Math | Code Available | 2 |
| LinSATNet: The Positive Linear Satisfiability Neural Networks | Jul 18, 2024 | Graph Matching | Code Available | 2 |
| GroupMamba: Efficient Group-Based Visual State Space Model | Jul 18, 2024 | image-classification, Image Classification | Code Available | 2 |
| Generalizable Human Gaussians for Sparse View Synthesis | Jul 17, 2024 | NeRF, Neural Rendering | Code Available | 2 |
| GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Jul 17, 2024 | Decoder, Image Enhancement | Code Available | 2 |
| Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale | Jul 17, 2024 | GPU, LAMBADA | Code Available | 2 |
| Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models | Jul 17, 2024 | Benchmarking, Red Teaming | Code Available | 2 |
| MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models | Jul 17, 2024 | | Code Available | 2 |
| GraphMuse: A Library for Symbolic Music Graph Processing | Jul 17, 2024 | | Code Available | 2 |
| SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow | Jul 17, 2024 | | Code Available | 2 |
| VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation | Jul 17, 2024 | Anomaly Detection, Anomaly Segmentation | Code Available | 2 |
| Beyond Next Token Prediction: Patch-Level Training for Large Language Models | Jul 17, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations | Jul 17, 2024 | Image Enhancement, Low-Light Image Enhancement | Code Available | 2 |
| GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features | Jul 17, 2024 | Anomaly Detection, Self-Driving Cars | Code Available | 2 |
| Fisheye-Calib-Adapter: An Easy Tool for Fisheye Camera Model Conversion | Jul 17, 2024 | Autonomous Driving | Code Available | 2 |
| TTSDS -- Text-to-Speech Distribution Score | Jul 17, 2024 | text-to-speech, Text to Speech | Code Available | 2 |
| Enhancing the Utility of Privacy-Preserving Cancer Classification using Synthetic Data | Jul 17, 2024 | Breast Cancer Detection, Cancer Classification | Code Available | 2 |
| EmoFace: Audio-driven Emotional 3D Face Animation | Jul 17, 2024 | 3D Face Animation | Code Available | 2 |
| Towards AI-Powered Video Assistant Referee System (VARS) for Association Football | Jul 17, 2024 | Fairness | Code Available | 2 |
| UrbanWorld: An Urban World Model for 3D City Generation | Jul 16, 2024 | Decision Making, Language Modelling | Code Available | 2 |
| Scientific QA System with Verifiable Answers | Jul 16, 2024 | Articles, Information Retrieval | Code Available | 2 |
| Does Refusal Training in LLMs Generalize to the Past Tense? | Jul 16, 2024 | | Code Available | 2 |
| Learning to Make Keypoints Sub-Pixel Accurate | Jul 16, 2024 | | Code Available | 2 |
| Temporally Consistent Stereo Matching | Jul 16, 2024 | Depth Estimation, Stereo Matching | Code Available | 2 |
| LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Jul 16, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| TeethDreamer: 3D Teeth Reconstruction from Five Intra-oral Photographs | Jul 16, 2024 | Surface Reconstruction | Code Available | 2 |