| On the Feasibility of Using LLMs to Autonomously Execute Multi-host Network Attacks | Jan 27, 2025 | | Code Available | 2 |
| TopoNets: High Performing Vision and Language Models with Brain-Like Topography | Jan 27, 2025 | | Code Available | 2 |
| MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining | Jan 27, 2025 | Contrastive Learning, Transfer Learning | Code Available | 2 |
| LLM-powered Multi-agent Framework for Goal-oriented Learning in Intelligent Tutoring System | Jan 27, 2025 | | Code Available | 2 |
| Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity | Jan 27, 2025 | Computational Efficiency, Mamba | Code Available | 2 |
| LUCY: Linguistic Understanding and Control Yielding Early Stage of Her | Jan 27, 2025 | Question Answering | Code Available | 2 |
| Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution | Jan 27, 2025 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| Universal Image Restoration Pre-training via Degradation Classification | Jan 26, 2025 | 5-Degradation Blind All-in-One Image Restoration, Image Restoration | Code Available | 2 |
| Baichuan-Omni-1.5 Technical Report | Jan 26, 2025 | Audio Generation | Code Available | 2 |
| TinyLLaVA-Video: A Simple Framework of Small-scale Large Multimodal Models for Video Understanding | Jan 26, 2025 | Video Understanding | Code Available | 2 |
| iFormer: Integrating ConvNet and Transformer for Mobile Application | Jan 26, 2025 | Instance Segmentation, object-detection | Code Available | 2 |
| GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting | Jan 26, 2025 | Quantization | Code Available | 2 |
| Visual Generation Without Guidance | Jan 26, 2025 | Diversity | Code Available | 2 |
| Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning | Jan 25, 2025 | Answer Generation, Multi-agent Reinforcement Learning | Code Available | 2 |
| Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models | Jan 25, 2025 | Attribute, Contrastive Learning | Code Available | 2 |
| Uni-Sign: Toward Unified Sign Language Understanding at Scale | Jan 25, 2025 | Computational Efficiency, Gloss-free Sign Language Translation | Code Available | 2 |
| VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking | Jan 24, 2025 | Denoising, Image Generation | Code Available | 2 |
| Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding | Jan 24, 2025 | Anatomy, Contrastive Learning | Code Available | 2 |
| STAMP: Scalable Task And Model-agnostic Collaborative Perception | Jan 24, 2025 | Autonomous Driving | Code Available | 2 |
| Deeply Optimizing the SAT Solver for the IC3 Algorithm | Jan 24, 2025 | | Code Available | 2 |
| Advancing MRI Reconstruction: A Systematic Review of Deep Learning and Compressed Sensing Integration | Jan 24, 2025 | compressed sensing, Federated Learning | Code Available | 2 |
| Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph | Jan 24, 2025 | Community Detection, Hallucination | Code Available | 2 |
| Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video | Jan 24, 2025 | 3D Reconstruction, Benchmarking | Code Available | 2 |
| Bayesian Neural Networks for One-to-Many Mapping in Image Enhancement | Jan 24, 2025 | Image Enhancement | Code Available | 2 |
| OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting | Jan 23, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks | Jan 23, 2025 | GPU | Code Available | 2 |
| NUDT4MSTAR: A Large Dataset and Benchmark Towards Remote Sensing Object Recognition in the Wild | Jan 23, 2025 | Earth Observation, Object Recognition | Code Available | 2 |
| Spurious Forgetting in Continual Learning of Language Models | Jan 23, 2025 | Continual Learning | Code Available | 2 |
| PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection | Jan 23, 2025 | object-detection, Object Detection | Code Available | 2 |
| YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID | Jan 23, 2025 | Multi-Object Tracking, object-detection | Code Available | 2 |
| Parameter-Efficient Fine-Tuning for Foundation Models | Jan 23, 2025 | parameter-efficient fine-tuning, Survey | Code Available | 2 |
| Querying Databases with Function Calling | Jan 23, 2025 | | Code Available | 2 |
| GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting | Jan 23, 2025 | 3DGS, NeRF | Code Available | 2 |
| Tensor-Var: Variational Data Assimilation in Tensor Product Feature Space | Jan 23, 2025 | | Code Available | 2 |
| GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing | Jan 23, 2025 | 4k | Code Available | 2 |
| Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge | Jan 23, 2025 | Scheduling, Streaming video understanding | Code Available | 2 |
| Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback | Jan 22, 2025 | Instruction Following | Code Available | 2 |
| TimeFilter: Patch-Specific Spatial-Temporal Graph Filtration for Time Series Forecasting | Jan 22, 2025 | Clustering, Time Series | Code Available | 2 |
| A Survey on Multimodal Recommender Systems: Recent Advances and Future Directions | Jan 22, 2025 | Recommendation Systems | Code Available | 2 |
| GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting | Jan 22, 2025 | Autonomous Driving, NeRF | Code Available | 2 |
| O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning | Jan 22, 2025 | Mathematical Reasoning | Code Available | 2 |
| Towards Robust Multi-tab Website Fingerprinting | Jan 22, 2025 | Multi-Label Classification, MUlTI-LABEL-ClASSIFICATION | Code Available | 2 |
| Distillation Quantification for Large Language Models | Jan 22, 2025 | | Code Available | 2 |
| MedS^3: Towards Medical Small Language Models with Self-Evolved Slow Thinking | Jan 21, 2025 | Multiple-choice | Code Available | 2 |
| MMVU: Measuring Expert-Level Multi-Discipline Video Understanding | Jan 21, 2025 | Video Understanding | Code Available | 2 |
| Supervised Learning for Analog and RF Circuit Design: Benchmarks and Comparative Insights | Jan 21, 2025 | | Code Available | 2 |
| EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents | Jan 21, 2025 | Attribute, Question Answering | Code Available | 2 |
| Automating High Quality RT Planning at Scale | Jan 21, 2025 | | Code Available | 2 |
| Exploring Temporally-Aware Features for Point Tracking | Jan 21, 2025 | Point Tracking, Video Editing | Code Available | 2 |
| Episodic Memories Generation and Evaluation Benchmark for Large Language Models | Jan 21, 2025 | | Code Available | 2 |