| PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery | Jan 16, 2025 | Image Generation, Prompt Engineering | Code Available | 1 |
| A Simple Graph Contrastive Learning Framework for Short Text Classification | Jan 16, 2025 | Contrastive Learning, Data Augmentation | Code Available | 1 |
| ChartInsighter: An Approach for Mitigating Hallucination in Time-series Chart Summary Generation with A Benchmark Dataset | Jan 16, 2025 | Hallucination, Sentence | Code Available | 1 |
| Neural Honeytrace: A Robust Plug-and-Play Watermarking Framework against Model Extraction Attacks | Jan 16, 2025 | Model extraction | Code Available | 1 |
| FLOL: Fast Baselines for Real-World Low-Light Enhancement | Jan 16, 2025 | Image Enhancement, Low-Light Image Enhancement | Code Available | 1 |
| Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation | Jan 16, 2025 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 1 |
| NS-Gym: Open-Source Simulation Environments and Benchmarks for Non-Stationary Markov Decision Processes | Jan 16, 2025 | Decision Making | Code Available | 1 |
| Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes | Jan 16, 2025 | NeRF | Code Available | 1 |
| LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport | Jan 16, 2025 | AudioCaps, Audio captioning | Code Available | 1 |
| Hierarchical Deep Reinforcement Learning for Adaptive Resource Management in Integrated Terrestrial and Non-Terrestrial Networks | Jan 16, 2025 | Deep Reinforcement Learning, Management | Code Available | 1 |
| BN-Pool: a Bayesian Nonparametric Approach to Graph Pooling | Jan 16, 2025 | | Code Available | 1 |
| A Study of In-Context-Learning-Based Text-to-SQL Errors | Jan 16, 2025 | In-Context Learning, Text to SQL | Code Available | 1 |
| FineMedLM-o1: Enhancing the Medical Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training | Jan 16, 2025 | Domain Adaptation | Code Available | 1 |
| Towards Robust and Realistic Human Pose Estimation via WiFi Signals | Jan 16, 2025 | 3D Human Pose Estimation, Contrastive Learning | Code Available | 1 |
| Leveraging Large Language Models as Knowledge-Driven Agents for Reliable Retrosynthesis Planning | Jan 15, 2025 | Knowledge Graphs, Retrieval | Code Available | 1 |
| GRAPPA - A Hybrid Graph Neural Network for Predicting Pure Component Vapor Pressures | Jan 15, 2025 | Graph Attention, Graph Neural Network | Code Available | 1 |
| Multimodal LLMs Can Reason about Aesthetics in Zero-Shot | Jan 15, 2025 | Benchmarking, Hallucination | Code Available | 1 |
| Efficient Traffic Prediction Through Spatio-Temporal Distillation | Jan 15, 2025 | Knowledge Distillation, Prediction | Code Available | 1 |
| CrystalGRW: Generative Modeling of Crystal Structures with Targeted Properties via Geodesic Random Walks | Jan 15, 2025 | Graph Neural Network | Code Available | 1 |
| Score-based 3D molecule generation with neural fields | Jan 15, 2025 | 3D Molecule Generation | Code Available | 1 |
| WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning | Jan 15, 2025 | cross-modal alignment, Language Modeling | Code Available | 1 |
| ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind | Jan 15, 2025 | Benchmarking, Multiple-choice | Code Available | 1 |
| MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Anticipation | Jan 15, 2025 | Mamba | Code Available | 1 |
| Enhancing Graph Representation Learning with Localized Topological Features | Jan 15, 2025 | Graph Learning, Graph Representation Learning | Code Available | 1 |
| Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians | Jan 15, 2025 | Computational chemistry, Knowledge Distillation | Code Available | 1 |
| Generative diffusion model with inverse renormalization group flows | Jan 15, 2025 | Audio Synthesis, Denoising | Code Available | 1 |
| MeshMask: Physics-Based Simulations with Masked Graph Neural Networks | Jan 15, 2025 | Decoder | Code Available | 1 |
| NeurOp-Diff: Continuous Remote Sensing Image Super-Resolution via Neural Operator Diffusion | Jan 15, 2025 | Denoising, Image Super-Resolution | Code Available | 1 |
| GOTPR: General Outdoor Text-based Place Recognition Using Scene Graph Retrieval with OpenStreetMap | Jan 15, 2025 | Retrieval | Code Available | 1 |
| Knowledge Graph-based Retrieval-Augmented Generation for Schema Matching | Jan 15, 2025 | Hallucination, Knowledge Graphs | Code Available | 1 |
| DualOpt: A Dual Divide-and-Optimize Algorithm for the Large-scale Traveling Salesman Problem | Jan 15, 2025 | Computational Efficiency, Traveling Salesman Problem | Code Available | 1 |
| SwinTExCo: Exemplar-based video colorization using Swin Transformer | Jan 15, 2025 | Colorization, Video Restoration | Code Available | 1 |
| Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation | Jan 14, 2025 | Object, object-detection | Code Available | 1 |
| CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Jan 14, 2025 | Deep Reinforcement Learning, GPU | Code Available | 1 |
| Enhancing the De-identification of Personally Identifiable Information in Educational Data | Jan 14, 2025 | De-identification | Code Available | 1 |
| CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation | Jan 14, 2025 | Code Generation | Code Available | 1 |
| 3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding | Jan 14, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers | Jan 14, 2025 | Future prediction, Prediction | Code Available | 1 |
| Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness | Jan 14, 2025 | Event Extraction, Instruction Following | Code Available | 1 |
| Poseidon: A ViT-based Architecture for Multi-Frame Pose Estimation with Adaptive Frame Weighting and Multi-Scale Feature Fusion | Jan 14, 2025 | 2D Human Pose Estimation, Computational Efficiency | Code Available | 1 |
| EmoNeXt: an Adapted ConvNeXt for Facial Emotion Recognition | Jan 14, 2025 | Deep Learning, Emotion Classification | Code Available | 1 |
| AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages | Jan 14, 2025 | Abusive Language, Keyword Spotting | Code Available | 1 |
| GDiffRetro: Retrosynthesis Prediction with Dual Graph Enhanced Molecular Representation and Diffusion Generation | Jan 14, 2025 | molecular representation, Retrosynthesis | Code Available | 1 |
| A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following | Jan 14, 2025 | Instruction Following | Code Available | 1 |
| D^2-DPM: Dual Denoising for Quantized Diffusion Probabilistic Models | Jan 14, 2025 | Denoising, Image Generation | Code Available | 1 |
| AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation | Jan 14, 2025 | Mamba, Video Understanding | Code Available | 1 |
| Gandalf the Red: Adaptive Security for LLMs | Jan 14, 2025 | Blocking, Language Modeling | Code Available | 1 |
| Optimal Classification Trees for Continuous Feature Data Using Dynamic Programming with Branch-and-Bound | Jan 14, 2025 | Binarization | Code Available | 1 |
| An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures | Jan 14, 2025 | Adversarial Robustness | Code Available | 1 |
| Enhancing Automated Interpretability with Output-Centric Feature Descriptions | Jan 14, 2025 | Sentence | Code Available | 1 |