| SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World | Mar 20, 2025 | | Code Available | 1 |
| MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion | Mar 20, 2025 | Data Augmentation, Mathematical Problem-Solving | Code Available | 1 |
| FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors | Mar 20, 2025 | Federated Learning, global-optimization | Code Available | 1 |
| Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models | Mar 20, 2025 | Multiple-choice, Video Understanding | Code Available | 1 |
| Agentic Keyframe Search for Video Question Answering | Mar 20, 2025 | EgoSchema, Question Answering | Code Available | 1 |
| STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding | Mar 20, 2025 | Video Understanding, Zero-shot Generalization | Code Available | 1 |
| Probabilistic Prompt Distribution Learning for Animal Pose Estimation | Mar 20, 2025 | Animal Pose Estimation, Diversity | Code Available | 1 |
| QCPINN: Quantum-Classical Physics-Informed Neural Networks for Solving PDEs | Mar 20, 2025 | Benchmarking, Physics-informed machine learning | Code Available | 1 |
| QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge | Mar 20, 2025 | Depth Estimation, Monocular Depth Estimation | Code Available | 1 |
| SALT: Singular Value Adaptation with Low-Rank Transformation | Mar 20, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 1 |
| CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners | Mar 20, 2025 | knowledge editing | Code Available | 1 |
| Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion | Mar 20, 2025 | | Code Available | 1 |
| Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models | Mar 20, 2025 | counterfactual, RAG | Code Available | 1 |
| Design and Implementation of an FPGA-Based Hardware Accelerator for Transformer | Mar 20, 2025 | CPU, High-Level Synthesis | Code Available | 1 |
| Narrative Trails: A Method for Coherent Storyline Extraction via Maximum Capacity Path Optimization | Mar 19, 2025 | Information Retrieval | Code Available | 1 |
| Performance-bounded Online Ensemble Learning Method Based on Multi-armed bandits and Its Applications in Real-time Safety Assessment | Mar 19, 2025 | Ensemble Learning, Multi-Armed Bandits | Code Available | 1 |
| A Bird Song Detector for improving bird identification through Deep Learning: a case study from Doñana | Mar 19, 2025 | | Code Available | 1 |
| SkyLadder: Better and Faster Pretraining via Context Window Scheduling | Mar 19, 2025 | 8k, Scheduling | Code Available | 1 |
| HAD-Gen: Human-like and Diverse Driving Behavior Modeling for Controllable Scenario Generation | Mar 19, 2025 | Autonomous Vehicles, Imitation Learning | Code Available | 1 |
| From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment | Mar 19, 2025 | Diversity | Code Available | 1 |
| Optimizing Retrieval Strategies for Financial Question Answering Documents in Retrieval-Augmented Generation Systems | Mar 19, 2025 | Question Answering, RAG | Code Available | 1 |
| GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation | Mar 19, 2025 | Object, Pose Estimation | Code Available | 1 |
| DeCaFlow: A Deconfounding Causal Generative Model | Mar 19, 2025 | Causal Inference, counterfactual | Code Available | 1 |
| Visual Position Prompt for MLLM based Visual Grounding | Mar 19, 2025 | Position, Visual Grounding | Code Available | 1 |
| MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer | Mar 19, 2025 | Answer Generation, Mathematical Reasoning | Code Available | 1 |
| PiEEG kit - bioscience Lab in home for your Brain and Body | Mar 19, 2025 | EEG | Code Available | 1 |
| Efficient Personalization of Quantized Diffusion Model without Backpropagation | Mar 19, 2025 | Image Generation | Code Available | 1 |
| EarthScape: A Multimodal Dataset for Surficial Geologic Mapping and Earth Surface Analysis | Mar 19, 2025 | | Code Available | 1 |
| What Makes a Reward Model a Good Teacher? An Optimization Perspective | Mar 19, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Explainable AI Components for Narrative Map Extraction | Mar 19, 2025 | Event Detection, Explainable artificial intelligence | Code Available | 1 |
| EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions? | Mar 19, 2025 | | Code Available | 1 |
| EdgeRegNet: Edge Feature-based Multimodal Registration Network between Images and LiDAR Point Clouds | Mar 19, 2025 | Autonomous Driving, Computational Efficiency | Code Available | 1 |
| Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport | Mar 19, 2025 | | Code Available | 1 |
| UltraFlwr -- An Efficient Federated Medical and Surgical Object Detection Framework | Mar 19, 2025 | Federated Learning, Object | Code Available | 1 |
| Ambient Noise Full Waveform Inversion with Neural Operators | Mar 19, 2025 | | Code Available | 1 |
| BigO(Bench) -- Can LLMs Generate Code with Controlled Time and Space Complexity? | Mar 19, 2025 | Code Generation | Code Available | 1 |
| Multi-focal Conditioned Latent Diffusion for Person Image Synthesis | Mar 19, 2025 | Image Generation | Code Available | 1 |
| Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement | Mar 19, 2025 | | Code Available | 1 |
| High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight | Mar 19, 2025 | Image Segmentation, Knowledge Distillation | Code Available | 1 |
| When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning | Mar 19, 2025 | Representation Learning, Self-Supervised Learning | Code Available | 1 |
| Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training | Mar 19, 2025 | Image Dehazing, World Knowledge | Code Available | 1 |
| MP-GUI: Modality Perception with MLLMs for GUI Understanding | Mar 18, 2025 | | Code Available | 1 |
| MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation | Mar 18, 2025 | Object, Reasoning Segmentation | Code Available | 1 |
| MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling | Mar 18, 2025 | | Code Available | 1 |
| Advancing Medical Representation Learning Through High-Quality Data | Mar 18, 2025 | Representation Learning, zero-shot-classification | Code Available | 1 |
| FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification | Mar 18, 2025 | Combinatorial Optimization, Contrastive Learning | Code Available | 1 |
| Capturing Smile Dynamics with the Quintic Volatility Model: SPX, Skew-Stickiness Ratio and VIX | Mar 18, 2025 | | Code Available | 1 |
| Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives | Mar 18, 2025 | Image Captioning | Code Available | 1 |
| VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms | Mar 18, 2025 | Decision Making | Code Available | 1 |
| Inferring Event Descriptions from Time Series with Language Models | Mar 18, 2025 | Time Series | Code Available | 1 |