| QAEncoder: Towards Aligned Representation Learning in Question Answering System | Sep 30, 2024 | Document Embedding, Question Answering | Code Available | 2 |
| Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation | Sep 30, 2024 | Cross-Modal Retrieval, Dynamic Time Warping | Code Available | 2 |
| Spiking Transformer with Spatial-Temporal Attention | Sep 29, 2024 | | Code Available | 2 |
| Effective Diffusion Transformer Architecture for Image Super-Resolution | Sep 29, 2024 | Image Generation, Image Super-Resolution | Code Available | 2 |
| Underwater Organism Color Enhancement via Color Code Decomposition, Adaptation and Interpolation | Sep 29, 2024 | Image Enhancement | Code Available | 2 |
| One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Sep 29, 2024 | All, Image Segmentation | Code Available | 2 |
| A Survey on Graph Neural Networks for Remaining Useful Life Prediction: Methodologies, Evaluation and Future Trends | Sep 29, 2024 | Benchmarking, graph construction | Code Available | 2 |
| CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Sep 28, 2024 | image-classification, Image Classification | Code Available | 2 |
| MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation | Sep 28, 2024 | Image Segmentation, Medical Image Analysis | Code Available | 2 |
| CycleBNN: Cyclic Precision Training in Binary Neural Networks | Sep 28, 2024 | Inference Optimization | Code Available | 2 |
| MicroFlow: An Efficient Rust-Based Inference Engine for TinyML | Sep 28, 2024 | Human Detection | Code Available | 2 |
| 1st Place Solution of Multiview Egocentric Hand Tracking Challenge ECCV2024 | Sep 28, 2024 | Position | Code Available | 2 |
| Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph | Sep 28, 2024 | Epidemiology | Code Available | 2 |
| Brain-JEPA: Brain Dynamics Foundation Model with Gradient Positioning and Spatiotemporal Masking | Sep 28, 2024 | Prognosis | Code Available | 2 |
| Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration | Sep 28, 2024 | All, Attribute | Code Available | 2 |
| Conditional Image Synthesis with Diffusion Models: A Survey | Sep 28, 2024 | Denoising, Diversity | Code Available | 2 |
| Cross-video Identity Correlating for Person Re-identification Pre-training | Sep 27, 2024 | Denoising, Person Re-Identification | Code Available | 2 |
| Positional Encoder Graph Quantile Neural Networks for Geographic Data | Sep 27, 2024 | Density Estimation, Uncertainty Quantification | Code Available | 2 |
| Do We Need Domain-Specific Embedding Models? An Empirical Investigation | Sep 27, 2024 | | Code Available | 2 |
| YOLOv8-ResCBAM: YOLOv8 Based on An Effective Attention Module for Pediatric Wrist Fracture Detection | Sep 27, 2024 | Fracture detection | Code Available | 2 |
| DualDn: Dual-domain Denoising via Differentiable ISP | Sep 27, 2024 | Denoising, Image Denoising | Code Available | 2 |
| Space-time 2D Gaussian Splatting for Accurate Surface Reconstruction under Complex Dynamic Scenes | Sep 27, 2024 | Human-Object Interaction Detection, Surface Reconstruction | Code Available | 2 |
| A Survey on the Honesty of Large Language Models | Sep 27, 2024 | Survey | Code Available | 2 |
| Rethinking the Power of Timestamps for Robust Time Series Forecasting: A Global-Local Fusion Perspective | Sep 27, 2024 | Time Series, Time Series Forecasting | Code Available | 2 |
| A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation | Sep 27, 2024 | Exemplar-Free Counting, Few-shot Object Counting and Detection | Code Available | 2 |
| SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion | Sep 26, 2024 | Descriptive, Generalized Referring Expression Comprehension | Code Available | 2 |
| Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction | Sep 26, 2024 | 4D reconstruction, Object | Code Available | 2 |
| Control Industrial Automation System with Large Language Model Agents | Sep 26, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Mamba Meets Financial Markets: A Graph-Mamba Approach for Stock Price Prediction | Sep 26, 2024 | Mamba, Prediction | Code Available | 2 |
| From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection | Sep 26, 2024 | Time Series, Time Series Forecasting | Code Available | 2 |
| PGN: The RNN's New Successor is Effective for Long-Range Time Series Forecasting | Sep 26, 2024 | Time Series, Time Series Forecasting | Code Available | 2 |
| Revisit Anything: Visual Place Recognition via Image Segment Retrieval | Sep 26, 2024 | Image Segmentation, Navigate | Code Available | 2 |
| EM-Net: Efficient Channel and Frequency Learning with Mamba for 3D Medical Image Segmentation | Sep 26, 2024 | Image Segmentation, Mamba | Code Available | 2 |
| FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner | Sep 26, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| A Survey of Spatio-Temporal EEG data Analysis: from Models to Applications | Sep 26, 2024 | EEG, Self-Supervised Learning | Code Available | 2 |
| Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation | Sep 26, 2024 | Image Generation, Object | Code Available | 2 |
| Event-based Stereo Depth Estimation: A Survey | Sep 26, 2024 | Depth Estimation, Navigate | Code Available | 2 |
| Neural Light Spheres for Implicit Image Stitching and View Synthesis | Sep 26, 2024 | Image Stitching | Code Available | 2 |
| Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection | Sep 26, 2024 | Event Detection, Representation Learning | Code Available | 2 |
| MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models | Sep 26, 2024 | Large Language Model, Model Compression | Code Available | 2 |
| E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding | Sep 26, 2024 | Question Answering, Video Understanding | Code Available | 2 |
| Source-Free Domain Adaptation for YOLO Object Detection | Sep 25, 2024 | Domain Adaptation, Model Selection | Code Available | 2 |
| ECG-Image-Database: A Dataset of ECG Images with Real-World Imaging and Scanning Artifacts; A Foundation for Computerized ECG Image Digitization and Analysis | Sep 25, 2024 | ECG Digitization, Time Series | Code Available | 2 |
| General Detection-based Text Line Recognition | Sep 25, 2024 | HTR, Optical Character Recognition (OCR) | Code Available | 2 |
| Statewide Visual Geolocalization in the Wild | Sep 25, 2024 | | Code Available | 2 |
| Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Sep 25, 2024 | GPU, Token Reduction | Code Available | 2 |
| Attention Prompting on Image for Large Vision-Language Models | Sep 25, 2024 | MM-Vet, Visual Prompting | Code Available | 2 |
| Progressive Representation Learning for Real-Time UAV Tracking | Sep 25, 2024 | Object, Object Tracking | Code Available | 2 |
| Empirical Asset Pricing with Large Language Model Agents | Sep 25, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion | Sep 25, 2024 | Text to 3D | Code Available | 2 |