| Position: What Can Large Language Models Tell Us about Time Series Analysis | Feb 5, 2024 | Decision Making, Position | Code Available | 2 |
| How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning | Feb 5, 2024 | In-Context Learning, Metric Learning | Code Available | 2 |
| Robot Trajectron: Trajectory Prediction-based Shared Control for Robot Manipulation | Feb 4, 2024 | Position, Robot Manipulation | Code Available | 2 |
| Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Jan 17, 2024 | GPU, Image Classification | Code Available | 2 |
| Extending LLMs' Context Window with 100 Samples | Jan 13, 2024 | Position | Code Available | 2 |
| Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training | Nov 15, 2023 | Passage Retrieval, Position | Code Available | 2 |
| Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster | Nov 14, 2023 | GPU, Position | Code Available | 2 |
| Position Interpolation Improves ALiBi Extrapolation | Oct 18, 2023 | Language Modelling, Position | Code Available | 2 |
| ProbTS: Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons | Oct 11, 2023 | Benchmarking, Position | Code Available | 2 |
| PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training | Sep 19, 2023 | 2k, Position | Code Available | 2 |
| Lost in the Middle: How Language Models Use Long Contexts | Jul 6, 2023 | Language Modelling, Position | Code Available | 2 |
| Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving | May 10, 2023 | Autonomous Driving, Bench2Drive | Code Available | 2 |
| Detection Transformer with Stable Matching | Apr 10, 2023 | Decoder, Position | Code Available | 2 |
| LayoutDM: Discrete Diffusion Model for Controllable Layout Generation | Mar 14, 2023 | Layout Generation, model | Code Available | 2 |
| A Length-Extrapolatable Transformer | Dec 20, 2022 | Language Modeling, Language Modelling | Code Available | 2 |
| CroCo v2: Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow | Nov 18, 2022 | Optical Flow Estimation, Position | Code Available | 2 |
| Point Transformer V2: Grouped Vector Attention and Partition-based Pooling | Oct 11, 2022 | 3D Point Cloud Classification, 3D Semantic Segmentation | Code Available | 2 |
| Mega: Moving Average Equipped Gated Attention | Sep 21, 2022 | Image Classification, Inductive Bias | Code Available | 2 |
| DeepInteraction: 3D Object Detection via Modality Interaction | Aug 23, 2022 | 3D Object Detection, Decoder | Code Available | 2 |
| Stratified Transformer for 3D Point Cloud Segmentation | Mar 28, 2022 | Point Cloud Segmentation, Position | Code Available | 2 |
| ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer | Mar 8, 2022 | Image Classification, object-detection | Code Available | 2 |
| Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation | Aug 27, 2021 | Inductive Bias, Playing the Game of 2048 | Code Available | 2 |
| FLAT: Chinese NER Using Flat-Lattice Transformer | Apr 24, 2020 | Chinese Named Entity Recognition, named-entity-recognition | Code Available | 2 |
| MPNet: Masked and Permuted Pre-training for Language Understanding | Apr 20, 2020 | Language Modeling, Language Modelling | Code Available | 2 |
| Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation | Mar 17, 2020 | image-classification, Image Classification | Code Available | 2 |