| Causal and Local Correlations Based Network for Multivariate Time Series Classification | Nov 27, 2024 | Graph Neural Network, Time Series | Code Available | 1 |
| Vision Mamba Distillation for Low-resolution Fine-grained Image Classification | Nov 27, 2024 | Classification, Fine-Grained Image Classification | Code Available | 1 |
| SpotLight: Shadow-Guided Object Relighting via Diffusion | Nov 27, 2024 | Image Relighting, Neural Rendering | Code Available | 1 |
| Correlation-Aware Graph Convolutional Networks for Multi-Label Node Classification | Nov 26, 2024 | Classification, Graph Mining | Code Available | 1 |
| AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM | Nov 26, 2024 | Benchmarking, Text-to-Video Generation | Code Available | 1 |
| LampMark: Proactive Deepfake Detection via Training-Free Landmark Perceptual Watermarks | Nov 26, 2024 | DeepFake Detection, Face Swapping | Code Available | 1 |
| D^2-World: An Efficient World Model through Decoupled Dynamic Flow | Nov 26, 2024 | | Code Available | 1 |
| SatVision-TOA: A Geospatial Foundation Model for Coarse-Resolution All-Sky Remote Sensing Imagery | Nov 26, 2024 | All, Cloud Detection | Code Available | 1 |
| Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation | Nov 26, 2024 | | Code Available | 1 |
| RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training | Nov 26, 2024 | Pose Estimation | Code Available | 1 |
| cWDM: Conditional Wavelet Diffusion Models for Cross-Modality 3D Medical Image Synthesis | Nov 26, 2024 | Brain Tumor Segmentation, Image Generation | Code Available | 1 |
| NumGrad-Pull: Numerical Gradient Guided Tri-plane Representation for Surface Reconstruction from Point Clouds | Nov 26, 2024 | Surface Reconstruction | Code Available | 1 |
| LongKey: Keyphrase Extraction for Long Documents | Nov 26, 2024 | Keyphrase Extraction, Language Modeling | Code Available | 1 |
| Attamba: Attending To Multi-Token States | Nov 26, 2024 | Chunking, State Space Models | Code Available | 1 |
| Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search | Nov 26, 2024 | Person Search, Text based Person Search | Code Available | 1 |
| Disentangled Interpretable Representation for Efficient Long-term Time Series Forecasting | Nov 26, 2024 | Multivariate Time Series Forecasting, Time Series | Code Available | 1 |
| Distractor-free Generalizable 3D Gaussian Splatting | Nov 26, 2024 | 3DGS | Code Available | 1 |
| GrokFormer: Graph Fourier Kolmogorov-Arnold Transformers | Nov 26, 2024 | Graph Classification, Graph Representation Learning | Code Available | 1 |
| Efficient Data-aware Distance Comparison Operations for High-Dimensional Approximate Nearest Neighbor Search | Nov 26, 2024 | Information Retrieval | Code Available | 1 |
| P2DFlow: A Protein Ensemble Generative Model with SE(3) Flow Matching | Nov 26, 2024 | | Code Available | 1 |
| Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension | Nov 26, 2024 | | Code Available | 1 |
| Event-based Spiking Neural Networks for Object Detection: A Review of Datasets, Architectures, Learning Rules, and Implementation | Nov 26, 2024 | Articles, object-detection | Code Available | 1 |
| Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning | Nov 26, 2024 | Mixture-of-Experts | Code Available | 1 |
| MAT: Multi-Range Attention Transformer for Efficient Image Super-Resolution | Nov 26, 2024 | Diversity, Image Super-Resolution | Code Available | 1 |
| g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks | Nov 26, 2024 | Contrastive Learning, Question Answering | Code Available | 1 |
| Can LLMs be Good Graph Judge for Knowledge Graph Construction? | Nov 26, 2024 | Denoising, graph construction | Code Available | 1 |
| Learning Monotonic Attention in Transducer for Streaming Generation | Nov 26, 2024 | | Code Available | 1 |
| Even Sparser Graph Transformers | Nov 25, 2024 | | Code Available | 1 |
| Deformable Mamba for Wide Field of View Segmentation | Nov 25, 2024 | Decoder, Mamba | Code Available | 1 |
| Refining Focus in AI for Lung Cancer: Comparing Lesion-Centric and Chest-Region Models with Performance Insights from Internal and External Validation | Nov 25, 2024 | Cancer Classification, Lung Cancer Diagnosis | Code Available | 1 |
| All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages | Nov 25, 2024 | All, Long Question Answer | Code Available | 1 |
| Learn from Foundation Model: Fruit Detection Model without Manual Annotation | Nov 25, 2024 | Instance Segmentation, Knowledge Distillation | Code Available | 1 |
| Multi-modal Retrieval Augmented Multi-modal Generation: A Benchmark, Evaluate Metrics and Strong Baselines | Nov 25, 2024 | multimodal generation, RAG | Code Available | 1 |
| LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation | Nov 25, 2024 | Image Captioning, RAG | Code Available | 1 |
| A SAM-guided and Match-based Semi-Supervised Segmentation Framework for Medical Imaging | Nov 25, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 1 |
| ZoomLDM: Latent Diffusion Model for multi-scale image generation | Nov 25, 2024 | Image Generation, Multiple Instance Learning | Code Available | 1 |
| Image Generation Diversity Issues and How to Tame Them | Nov 25, 2024 | Diversity, Image Generation | Code Available | 1 |
| MarketGPT: Developing a Pre-trained transformer (GPT) for Modeling Financial Time Series | Nov 25, 2024 | Time Series | Code Available | 1 |
| VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction | Nov 25, 2024 | Computational Efficiency, In-Context Learning | Code Available | 1 |
| Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks | Nov 25, 2024 | Benchmarking, object-detection | Code Available | 1 |
| DiffBreak: Is Diffusion-Based Purification Robust? | Nov 25, 2024 | Face Swapping | Code Available | 1 |
| AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning | Nov 25, 2024 | Hallucination, Question Answering | Code Available | 1 |
| VidHal: Benchmarking Temporal Hallucinations in Vision LLMs | Nov 25, 2024 | Benchmarking, Hallucination | Code Available | 1 |
| InTraGen: Trajectory-controlled Video Generation for Object Interactions | Nov 25, 2024 | Object, Video Generation | Code Available | 1 |
| Edge Weight Prediction For Category-Agnostic Pose Estimation | Nov 25, 2024 | 2D Pose Estimation, Animal Pose Estimation | Code Available | 1 |
| Language Driven Occupancy Prediction | Nov 25, 2024 | Prediction | Code Available | 1 |
| Context Awareness Gate For Retrieval Augmented Generation | Nov 25, 2024 | Open-Domain Question Answering, Question Answering | Code Available | 1 |
| GeoFormer: A Multi-Polygon Segmentation Transformer | Nov 25, 2024 | | Code Available | 1 |
| FUN-AD: Fully Unsupervised Learning for Anomaly Detection with Noisy Training Data | Nov 25, 2024 | Anomaly Detection, One-Class Classification | Code Available | 1 |
| ADAF: An Artificial Intelligence Data Assimilation Framework for Weather Forecasting | Nov 25, 2024 | GPU, Weather Forecasting | Code Available | 1 |