| FateZero: Fusing Attentions for Zero-shot Text-based Video Editing | Mar 16, 2023 | Attribute, Text-to-Video Editing | Code Available | 3 | 5 |
| A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts | Mar 27, 2023 | Domain Adaptation, Source-Free Domain Adaptation | Code Available | 3 | 5 |
| Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos | Apr 3, 2023 | Image Generation, Text to Image Generation | Code Available | 3 | 5 |
| Prompting with Pseudo-Code Instructions | May 19, 2023 | | Code Available | 3 | 5 |
| Hierarchical Prompting Assists Large Language Model on Web Navigation | May 23, 2023 | Decision Making, Language Modeling | Code Available | 3 | 5 |
| Taming 3DGS: High-Quality Radiance Fields with Limited Resources | Jun 21, 2024 | 3DGS, Attribute | Code Available | 3 | 5 |
| Improving visual image reconstruction from human brain activity using latent diffusion models via multiple decoded inputs | Jun 20, 2023 | Deep Learning, Image Reconstruction | Code Available | 3 | 5 |
| GlyphNet: Homoglyph domains dataset and detection using attention-based Convolutional Neural Networks | Jun 17, 2023 | Binary Classification | Code Available | 3 | 5 |
| Segment Anything Meets Point Tracking | Jul 3, 2023 | Interactive Video Object Segmentation, Object | Code Available | 3 | 5 |
| EEGPT: Pretrained Transformer for Universal and Reliable Representation of EEG Signals | Jan 1, 2024 | EEG, Representation Learning | Code Available | 3 | 5 |
| WebArena: A Realistic Web Environment for Building Autonomous Agents | Jul 25, 2023 | | Code Available | 3 | 5 |
| Pixel-Aware Stable Diffusion for Realistic Image Super-resolution and Personalized Stylization | Aug 28, 2023 | Image Enhancement, Image Generation | Code Available | 3 | 5 |
| nanoT5: A PyTorch Framework for Pre-training and Fine-tuning T5-style Models with Limited Resources | Sep 5, 2023 | Decoder, GPU | Code Available | 3 | 5 |
| LSNet: See Large, Focus Small | Mar 29, 2025 | | Code Available | 3 | 5 |
| Sparse Autoencoders Find Highly Interpretable Features in Language Models | Sep 15, 2023 | counterfactual, Language Modelling | Code Available | 3 | 5 |
| FreeU: Free Lunch in Diffusion U-Net | Sep 20, 2023 | Decoder, Denoising | Code Available | 3 | 5 |
| Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition | Sep 21, 2023 | Speaker Recognition | Code Available | 3 | 5 |
| AutoAgents: A Framework for Automatic Agent Generation | Sep 29, 2023 | | Code Available | 3 | 5 |
| Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting | Oct 12, 2023 | Decoder, Probabilistic Time Series Forecasting | Code Available | 3 | 5 |
| Putting the Object Back into Video Object Segmentation | Oct 19, 2023 | Object, Segmentation | Code Available | 3 | 5 |
| Skywork: A More Open Bilingual Foundation Model | Oct 30, 2023 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| PixelFlow: Pixel-Space Generative Models with Flow | Apr 10, 2025 | Conditional Image Generation, Image Generation | Code Available | 3 | 5 |
| Class Symbolic Regression: Gotta Fit 'Em All | Dec 4, 2023 | All, Deep Reinforcement Learning | Code Available | 3 | 5 |
| An LLM Compiler for Parallel Function Calling | Dec 7, 2023 | | Code Available | 3 | 5 |
| EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM | Dec 11, 2023 | Decoder | Code Available | 3 | 5 |
| General Object Foundation Model for Images and Videos at Scale | Dec 14, 2023 | Instance Segmentation, Long-tail Video Object Segmentation | Code Available | 3 | 5 |
| DreamTalk: When Emotional Talking Head Generation Meets Diffusion Probabilistic Models | Dec 15, 2023 | Denoising, Talking Head Generation | Code Available | 3 | 5 |
| Generative Multimodal Models are In-Context Learners | Dec 20, 2023 | In-Context Learning, Personalized Image Generation | Code Available | 3 | 5 |
| Attention is not not Explanation | Aug 13, 2019 | Decision Making, Diagnostic | Code Available | 3 | 5 |
| Evaluating Language Model Agency through Negotiations | Jan 9, 2024 | Decision Making, Language Modeling | Code Available | 3 | 5 |
| DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection | Jan 4, 2024 | Decoder, Denoising | Code Available | 3 | 5 |
| Pheme: Efficient and Conversational Speech Generation | Jan 5, 2024 | | Code Available | 3 | 5 |
| Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models | Jan 17, 2024 | Task Planning | Code Available | 3 | 5 |
| VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks | Jan 24, 2024 | | Code Available | 3 | 5 |
| FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Jan 25, 2024 | GPU, Quantization | Code Available | 3 | 5 |
| SliceGPT: Compress Large Language Models by Deleting Rows and Columns | Jan 26, 2024 | | Code Available | 3 | 5 |
| Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Jan 31, 2024 | Hierarchical Text Segmentation, parameter-efficient fine-tuning | Code Available | 3 | 5 |
| LongAlign: A Recipe for Long Context Alignment of Large Language Models | Jan 31, 2024 | Diversity, Instruction Following | Code Available | 3 | 5 |
| Noise Contrastive Alignment of Language Models with Explicit Rewards | Feb 8, 2024 | Language Modelling, Math | Code Available | 3 | 5 |
| HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting | Feb 9, 2024 | | Code Available | 3 | 5 |
| Pathformer: Multi-scale Transformers with Adaptive Pathways for Time Series Forecasting | Feb 4, 2024 | Time Series, Time Series Forecasting | Code Available | 3 | 5 |
| PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models | Feb 12, 2024 | Answer Generation, Hallucination | Code Available | 3 | 5 |
| Magic-Me: Identity-Specific Video Customized Diffusion | Feb 14, 2024 | Image Generation, Text to Image Generation | Code Available | 3 | 5 |
| BitDelta: Your Fine-Tune May Only Be Worth One Bit | Feb 15, 2024 | GPU | Code Available | 3 | 5 |
| QuRating: Selecting High-Quality Data for Training Language Models | Feb 15, 2024 | In-Context Learning | Code Available | 3 | 5 |
| LLMDFA: Analyzing Dataflow in Code with Large Language Models | Feb 16, 2024 | Hallucination | Code Available | 3 | 5 |
| Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive | Feb 20, 2024 | | Code Available | 3 | 5 |
| Codec-SUPERB: An In-Depth Analysis of Sound Codec Models | Feb 20, 2024 | | Code Available | 3 | 5 |
| Towards Building Multilingual Language Model for Medicine | Feb 21, 2024 | Domain Adaptation, Language Modeling | Code Available | 3 | 5 |
| ChatMusician: Understanding and Generating Music Intrinsically with LLM | Feb 25, 2024 | MMLU, Text Generation | Code Available | 3 | 5 |