| Do Language Models Understand Time? | Dec 18, 2024 | Action Recognition, Anomaly Detection | Code Available | 1 |
| Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN | Dec 18, 2024 | Model Compression | Code Available | 1 |
| MixRec: Heterogeneous Graph Collaborative Filtering | Dec 18, 2024 | Collaborative Filtering, Contrastive Learning | Code Available | 1 |
| Robust Tracking via Mamba-based Context-aware Token Learning | Dec 18, 2024 | Mamba | Code Available | 1 |
| Hybrid CNN-LSTM based Indoor Pedestrian Localization with CSI Fingerprint Maps | Dec 18, 2024 | Diversity | Code Available | 1 |
| PowerMLP: An Efficient Version of KAN | Dec 18, 2024 | | Code Available | 1 |
| Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings | Dec 18, 2024 | GPU | Code Available | 1 |
| PixelMan: Consistent Object Editing with Diffusion Models via Pixel Manipulation and Generation | Dec 18, 2024 | Object | Code Available | 1 |
| Context-DPO: Aligning Language Models for Context-Faithfulness | Dec 18, 2024 | RAG, Retrieval-augmented Generation | Code Available | 1 |
| TRecViT: A Recurrent Video Transformer | Dec 18, 2024 | | Code Available | 1 |
| SemiDFL: A Semi-Supervised Paradigm for Decentralized Federated Learning | Dec 18, 2024 | Federated Learning | Code Available | 1 |
| Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference | Dec 18, 2024 | Reinforcement Learning (RL) | Code Available | 1 |
| Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters | Dec 18, 2024 | Face Generation, Talking Face Generation | Code Available | 1 |
| GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians | Dec 18, 2024 | 3DGS, NeRF | Code Available | 1 |
| Look Inside for More: Internal Spatial Modality Perception for 3D Anomaly Detection | Dec 18, 2024 | 3D Anomaly Detection, Anomaly Detection | Code Available | 1 |
| HA-RDet: Hybrid Anchor Rotation Detector for Oriented Object Detection | Dec 18, 2024 | object-detection, Object Detection | Code Available | 1 |
| Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models | Dec 18, 2024 | Contrastive Learning, Knowledge Graphs | Code Available | 1 |
| Knowledge Editing with Dynamic Knowledge Graphs for Multi-Hop Question Answering | Dec 18, 2024 | graph construction, knowledge editing | Code Available | 1 |
| ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling | Dec 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| When Should We Prefer State-to-Visual DAgger Over Visual Reinforcement Learning? | Dec 18, 2024 | | Code Available | 1 |
| Balans: Multi-Armed Bandits-based Adaptive Large Neighborhood Search for Mixed-Integer Programming Problem | Dec 18, 2024 | Combinatorial Optimization, Multi-Armed Bandits | Code Available | 1 |
| Beyond Outcomes: Transparent Assessment of LLM Reasoning in Games | Dec 18, 2024 | | Code Available | 1 |
| Physics-Based Adversarial Attack on Near-Infrared Human Detector for Nighttime Surveillance Camera Systems | Dec 18, 2024 | Adversarial Attack | Code Available | 1 |
| EscapeBench: Pushing Language Models to Think Outside the Box | Dec 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding | Dec 18, 2024 | Highlight Detection, Moment Retrieval | Code Available | 1 |
| Neural Combinatorial Optimization for Stochastic Flexible Job Shop Scheduling Problems | Dec 18, 2024 | Combinatorial Optimization, Job Shop Scheduling | Code Available | 1 |
| QueryCDR: Query-Based Controllable Distortion Rectification Network for Fisheye Images | Dec 18, 2024 | | Code Available | 1 |
| Event-based Photometric Bundle Adjustment | Dec 18, 2024 | | Code Available | 1 |
| Adaptive Calibration: A Unified Conversion Framework of Spiking Neural Network | Dec 18, 2024 | object-detection, Object Detection | Code Available | 1 |
| TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks | Dec 18, 2024 | Benchmarking | Code Available | 1 |
| Generative AI Toolkit -- a framework for increasing the quality of LLM-based applications over their whole life cycle | Dec 18, 2024 | | Code Available | 1 |
| Autonomous Microscopy Experiments through Large Language Model Agents | Dec 18, 2024 | Benchmarking, Experimental Design | Code Available | 1 |
| Real-time One-Step Diffusion-based Expressive Portrait Videos Generation | Dec 18, 2024 | Video Generation | Code Available | 1 |
| G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o | Dec 18, 2024 | Image Captioning, Video Captioning | Code Available | 1 |
| Exploring Multi-Modal Data with Tool-Augmented LLM Agents for Precise Causal Discovery | Dec 18, 2024 | Causal Discovery, Causal Inference | Code Available | 1 |
| Plug-and-Play Tri-Branch Invertible Block for Image Rescaling | Dec 18, 2024 | Compressed Image Super-resolution, Image Reconstruction | Code Available | 1 |
| I0T: Embedding Standardization Method Towards Zero Modality Gap | Dec 18, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 1 |
| Physics Reasoner: Knowledge-Augmented Reasoning for Solving Physics Problems with Large Language Models | Dec 18, 2024 | | Code Available | 1 |
| M^3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation | Dec 18, 2024 | Object, Semantic Segmentation | Code Available | 1 |
| CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing | Dec 18, 2024 | Attribute | Code Available | 1 |
| Boosting Fine-Grained Visual Anomaly Detection with Coarse-Knowledge-Aware Adversarial Learning | Dec 17, 2024 | Anomaly Detection | Code Available | 1 |
| XPath Agent: An Efficient XPath Programming Agent Based on LLM for Web Crawler | Dec 17, 2024 | | Code Available | 1 |
| Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training | Dec 17, 2024 | Mamba, Token Reduction | Code Available | 1 |
| DocFusion: A Unified Framework for Document Parsing Tasks | Dec 17, 2024 | | Code Available | 1 |
| SnakModel: Lessons Learned from Training an Open Danish Large Language Model | Dec 17, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation | Dec 17, 2024 | Question Answering, RAG | Code Available | 1 |
| ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding | Dec 17, 2024 | cross-modal alignment | Code Available | 1 |
| Human-in-the-Loop Generation of Adversarial Texts: A Case Study on Tibetan Script | Dec 17, 2024 | Adversarial Attack, Adversarial Robustness | Code Available | 1 |
| 4DRGS: 4D Radiative Gaussian Splatting for Efficient 3D Vessel Reconstruction from Sparse-View Dynamic DSA Images | Dec 17, 2024 | | Code Available | 1 |
| ZoRI: Towards Discriminative Zero-Shot Remote Sensing Instance Segmentation | Dec 17, 2024 | Instance Segmentation, Segmentation | Code Available | 1 |