| MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models | Jan 30, 2024 | | Code Available | 2 |
| Diffusion Facial Forgery Detection | Jan 29, 2024 | Image Generation | Code Available | 2 |
| Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing | Jan 29, 2024 | GPU, Representation Learning | Code Available | 2 |
| SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design | Jan 29, 2024 | CPU, GPU | Code Available | 2 |
| Simple Policy Optimization | Jan 29, 2024 | MuJoCo | Code Available | 2 |
| MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection | Jan 29, 2024 | 3D Object Detection, object-detection | Code Available | 2 |
| LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection | Jan 29, 2024 | 3D Object Detection, Autonomous Vehicles | Code Available | 2 |
| Synchformer: Efficient Synchronization from Sparse Cues | Jan 29, 2024 | Audio-Visual Synchronization | Code Available | 2 |
| A Comprehensive Survey on Graph Reduction: Sparsification, Coarsening, and Condensation | Jan 29, 2024 | Survey | Code Available | 2 |
| Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes | Jan 28, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks | Jan 28, 2024 | 2D Semantic Segmentation, Decoder | Code Available | 2 |
| Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach | Jan 28, 2024 | Image Outpainting | Code Available | 2 |
| SCTransNet: Spatial-channel Cross Transformer Network for Infrared Small Target Detection | Jan 28, 2024 | | Code Available | 2 |
| FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models | Jan 28, 2024 | Decoder, Style Transfer | Code Available | 2 |
| ASCNet: Asymmetric Sampling Correction Network for Infrared Image Destriping | Jan 28, 2024 | Feature Upsampling, Image Reconstruction | Code Available | 2 |
| Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models | Jan 27, 2024 | Medical Question Answering, Multiple-choice | Code Available | 2 |
| SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks | Jan 27, 2024 | | Code Available | 2 |
| FaKnow: A Unified Library for Fake News Detection | Jan 27, 2024 | Fake News Detection | Code Available | 2 |
| L-AutoDA: Leveraging Large Language Models for Automated Decision-based Adversarial Attacks | Jan 27, 2024 | Adversarial Attack, Computational Efficiency | Code Available | 2 |
| A Survey on Neural Topic Models: Methods, Applications, and Challenges | Jan 27, 2024 | Survey, Topic Models | Code Available | 2 |
| An open dataset for oracle bone script recognition and decipherment | Jan 27, 2024 | Decipherment | Code Available | 2 |
| A Survey on Data Augmentation in Large Model Era | Jan 27, 2024 | Audio Signal Processing, Data Augmentation | Code Available | 2 |
| CascadedGaze: Efficiency in Global Context Extraction for Image Restoration | Jan 26, 2024 | Deblurring, Decoder | Code Available | 2 |
| ChemDFM: A Large Language Foundation Model for Chemistry | Jan 26, 2024 | Formmodel | Code Available | 2 |
| Residual Quantization with Implicit Neural Codebooks | Jan 26, 2024 | Data Compression, Quantization | Code Available | 2 |
| LYT-NET: Lightweight YUV Transformer-based Network for Low-light Image Enhancement | Jan 26, 2024 | Color Image Denoising, Image Enhancement | Code Available | 2 |
| The Power of Noise: Redefining Retrieval for RAG Systems | Jan 26, 2024 | Information Retrieval, RAG | Code Available | 2 |
| Text Image Inpainting via Global Structure-Guided Diffusion Models | Jan 26, 2024 | Image Inpainting, Scene Text Recognition | Code Available | 2 |
| Learning Universal Predictors | Jan 26, 2024 | Meta-Learning | Code Available | 2 |
| Sketch and Refine: Towards Fast and Accurate Lane Detection | Jan 26, 2024 | Lane Detection | Code Available | 2 |
| Airavata: Introducing Hindi Instruction-tuned LLM | Jan 26, 2024 | | Code Available | 2 |
| Macro Graph Neural Networks for Online Billion-Scale Recommender Systems | Jan 26, 2024 | Recommendation Systems | Code Available | 2 |
| Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings | Jan 26, 2024 | Face Recognition, Security Studies | Code Available | 2 |
| Rethinking Patch Dependence for Masked Autoencoders | Jan 25, 2024 | Decoder, Instance Segmentation | Code Available | 2 |
| Deconstructing Denoising Diffusion Models for Self-Supervised Learning | Jan 25, 2024 | Denoising, Image Generation | Code Available | 2 |
| Towards Goal-oriented Prompt Engineering for Large Language Models: A Survey | Jan 25, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities | Jan 25, 2024 | | Code Available | 2 |
| Domain-Independent Dynamic Programming | Jan 25, 2024 | Combinatorial Optimization, Heuristic Search | Code Available | 2 |
| MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration | Jan 25, 2024 | Computed Tomography (CT), Image Registration | Code Available | 2 |
| TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation | Jan 25, 2024 | Decoder, Language Modeling | Code Available | 2 |
| Routoo: Learning to Route to Large Language Models Effectively | Jan 25, 2024 | MMLU, Multi-task Language Understanding | Code Available | 2 |
| Towards 3D Molecule-Text Interpretation in Language Models | Jan 25, 2024 | Instruction Following, Language Modeling | Code Available | 2 |
| Vivim: a Video Vision Mamba for Medical Video Segmentation | Jan 25, 2024 | Lesion Segmentation, Mamba | Code Available | 2 |
| True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning | Jan 25, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 2 |
| Diffusion Enhancement for Cloud Removal in Ultra-Resolution Remote Sensing Imagery | Jan 25, 2024 | Cloud Removal, Image Generation | Code Available | 2 |
| ICASSP 2024 Speech Signal Improvement Challenge | Jan 25, 2024 | | Code Available | 2 |
| LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection | Jan 24, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| Graph Diffusion Transformers for Multi-Conditional Molecular Generation | Jan 24, 2024 | Decoder, Denoising | Code Available | 2 |
| InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions | Jan 24, 2024 | document understanding, Question Answering | Code Available | 2 |
| SCNet: Sparse Compression Network for Music Source Separation | Jan 24, 2024 | CPU, Music Source Separation | Code Available | 2 |