| Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding | Nov 12, 2023 | Object, Position | Code Available | 1 |
| MultiSPANS: A Multi-range Spatial-Temporal Transformer Network for Traffic Forecast via Structural Entropy Optimization | Nov 6, 2023 | Management, Position | Code Available | 1 |
| Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio | Nov 1, 2023 | Position | Code Available | 1 |
| Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models | Oct 30, 2023 | Position, Theory of Mind Modeling | Code Available | 1 |
| NLP Evaluation in trouble: On the Need to Measure LLM Data Contamination for each Benchmark | Oct 27, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| CLEX: Continuous Length Extrapolation for Large Language Models | Oct 25, 2023 | 4k, Position | Code Available | 1 |
| Semi-Supervised End-to-End Learning for Integrated Sensing and Communications | Oct 15, 2023 | ISAC, Position | Code Available | 1 |
| Generative Modeling with Phase Stochastic Bridges | Oct 11, 2023 | Image Generation, Position | Code Available | 1 |
| Fast, Expressive SE(n) Equivariant Networks through Weight-Sharing in Position-Orientation Space | Oct 4, 2023 | Computational Efficiency, Position | Code Available | 1 |
| CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending | Sep 15, 2023 | 2k, Position | Code Available | 1 |
| Mutation-based Fault Localization of Deep Neural Networks | Sep 10, 2023 | Fault localization, Position | Code Available | 1 |
| DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions | Sep 7, 2023 | Position, Spatial Reasoning | Code Available | 1 |
| Mask-Attention-Free Transformer for 3D Instance Segmentation | Sep 4, 2023 | 3D Instance Segmentation, Instance Segmentation | Code Available | 1 |
| A lightweight 3D dense facial landmark estimation model from position map data | Aug 29, 2023 | Keypoint Detection, Position | Code Available | 1 |
| Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models | Aug 25, 2023 | cross-modal alignment, Position | Code Available | 1 |
| Relighting Neural Radiance Fields with Shadow and Highlight Hints | Aug 25, 2023 | Position | Code Available | 1 |
| Instruction Position Matters in Sequence Generation with Large Language Models | Aug 23, 2023 | Instruction Following, Position | Code Available | 1 |
| DALNet: A Rail Detection Network Based on Dynamic Anchor Line | Aug 22, 2023 | Diversity, Lane Detection | Code Available | 1 |
| Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning | Aug 18, 2023 | 8k, Position | Code Available | 1 |
| DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting | Aug 16, 2023 | Graph Neural Network, Graph Regression | Code Available | 1 |
| Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking | Aug 14, 2023 | Position, Visual Tracking | Code Available | 1 |
| V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection | Aug 8, 2023 | 3D Object Detection, Decoder | Code Available | 1 |
| Point Anywhere: Directed Object Estimation from Omnidirectional Images | Aug 2, 2023 | Object, object-detection | Code Available | 1 |
| Advancing Beyond Identification: Multi-bit Watermark for Large Language Models | Aug 1, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Differentiable short-time Fourier transform with respect to the hop length | Jul 26, 2023 | Position | Code Available | 1 |
| Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition | Jul 21, 2023 | Facial Expression Recognition, Facial Expression Recognition (FER) | Code Available | 1 |
| DSSE: a drone swarm search environment | Jul 12, 2023 | Position, reinforcement-learning | Code Available | 1 |
| 2-D SSM: A General Spatial Layer for Visual Transformers | Jun 11, 2023 | Inductive Bias, Position | Code Available | 1 |
| Everybody Compose: Deep Beats To Music | Jun 9, 2023 | Position | Code Available | 1 |
| DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles | Jun 9, 2023 | Object, Position | Code Available | 1 |
| ColdNAS: Search to Modulate for User Cold-Start Recommendation | Jun 6, 2023 | Neural Architecture Search, Position | Code Available | 1 |
| 3rd Place Solution for PVUW2023 VSS Track: A Large Model for Semantic Segmentation on VSPW | Jun 4, 2023 | Position, Segmentation | Code Available | 1 |
| Collect-and-Distribute Transformer for 3D Point Cloud Analysis | Jun 2, 2023 | Point Cloud Classification, Position | Code Available | 1 |
| The Impact of Positional Encoding on Length Generalization in Transformers | May 31, 2023 | Decoder, Position | Code Available | 1 |
| Large Language Models are not Fair Evaluators | May 29, 2023 | Language Modelling, Large Language Model | Code Available | 1 |
| Improving Position Encoding of Transformers for Multivariate Time Series Classification | May 26, 2023 | Anomaly Detection, Position | Code Available | 1 |
| Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization | May 19, 2023 | Image Generation, Position | Code Available | 1 |
| Toeplitz Neural Network for Sequence Modeling | May 8, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| A Vision Transformer Approach for Efficient Near-Field Irregular SAR Super-Resolution | May 3, 2023 | Image Enhancement, Image Super-Resolution | Code Available | 1 |
| Exploiting Inductive Bias in Transformer for Point Cloud Classification and Segmentation | Apr 27, 2023 | 3D Object Classification, 3D Part Segmentation | Code Available | 1 |
| Optimal Robust Network Design: Formulations and Algorithms for Maximizing Algebraic Connectivity | Apr 17, 2023 | Autonomous Vehicles, Position | Code Available | 1 |
| Towards Flexible Multi-modal Document Models | Mar 31, 2023 | Multi-Task Learning, Position | Code Available | 1 |
| Diffusion Action Segmentation | Mar 31, 2023 | Action Segmentation, Denoising | Code Available | 1 |
| A Closer Look at Parameter-Efficient Tuning in Diffusion Models | Mar 31, 2023 | Efficient Diffusion Personalization, Position | Code Available | 1 |
| Searching for long faint astronomical high energy transients: a data driven approach | Mar 28, 2023 | Anomaly Detection, Pathfinder | Code Available | 1 |
| Position-Guided Point Cloud Panoptic Segmentation Transformer | Mar 23, 2023 | Instance Segmentation, Panoptic Segmentation | Code Available | 1 |
| Influencer Backdoor Attack on Semantic Segmentation | Mar 21, 2023 | Backdoor Attack, Position | Code Available | 1 |
| CAPE: Camera View Position Embedding for Multi-View 3D Object Detection | Mar 17, 2023 | 3D Object Detection, object-detection | Code Available | 1 |
| Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module | Mar 8, 2023 | 2D Human Pose Estimation, Pose Estimation | Code Available | 1 |
| Deep Momentum Multi-Marginal Schrödinger Bridge | Mar 3, 2023 | Position | Code Available | 1 |