| ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs | Sep 22, 2023 | Math | Code Available | 2 | 5 |
| CATT: Character-based Arabic Tashkeel Transformer | Jul 3, 2024 | Arabic Text Diacritization, Decoder | Code Available | 2 | 5 |
| Monocular Occupancy Prediction for Scalable Indoor Scenes | Jul 16, 2024 | 3D Semantic Scene Completion from a single RGB image, Prediction | Code Available | 2 | 5 |
| Very fast Bayesian Additive Regression Trees on GPU | Oct 30, 2024 | CPU, GPU | Code Available | 2 | 5 |
| Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal | Jul 24, 2024 | Raindrop Removal, Rain Removal | Code Available | 2 | 5 |
| MARG: Multi-Agent Review Generation for Scientific Papers | Jan 8, 2024 | Review Generation, Specificity | Code Available | 2 | 5 |
| YOLOv8-ResCBAM: YOLOv8 Based on An Effective Attention Module for Pediatric Wrist Fracture Detection | Sep 27, 2024 | Fracture detection | Code Available | 2 | 5 |
| Ignore Previous Prompt: Attack Techniques For Language Models | Nov 17, 2022 | Adversarial Attack, Adversarial Text | Code Available | 2 | 5 |
| IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation | Jul 15, 2024 | Denoising, Depth Estimation | Code Available | 2 | 5 |
| GPT Can Solve Mathematical Problems Without a Calculator | Sep 6, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review | Jul 18, 2024 | Reinforcement Learning (RL) | Code Available | 2 | 5 |
| VSSD: Vision Mamba with Non-Causal State Space Duality | Jul 26, 2024 | image-classification, Image Classification | Code Available | 2 | 5 |
| SwapAnyone: Consistent and Realistic Video Synthesis for Swapping Any Person into Any Video | Mar 12, 2025 | Video Inpainting | Code Available | 2 | 5 |
| Dynamic Factor Allocation Leveraging Regime-Switching Signals | Oct 18, 2024 | | Code Available | 2 | 5 |
| BoQ: A Place is Worth a Bag of Learnable Queries | May 12, 2024 | Image Similarity Search, Retrieval | Code Available | 2 | 5 |
| Alpha-CLIP: A CLIP Model Focusing on Wherever You Want | Dec 6, 2023 | 3D Generation | Code Available | 2 | 5 |
| Dual Vision Transformer | Jul 11, 2022 | | Code Available | 2 | 5 |
| DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting | Nov 26, 2024 | Attribute, Diversity | Code Available | 2 | 5 |
| Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression | Jul 21, 2022 | Hallucination, Image Enhancement | Code Available | 2 | 5 |
| X-IL: Exploring the Design Space of Imitation Learning Policies | Feb 17, 2025 | Imitation Learning, Mamba | Code Available | 2 | 5 |
| Exploring Enhanced Contextual Information for Video-Level Object Tracking | Dec 15, 2024 | Object, Object Tracking | Code Available | 2 | 5 |
| CARTE: Pretraining and Transfer for Tabular Learning | Feb 26, 2024 | Data Integration, Transfer Learning | Code Available | 2 | 5 |
| ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors | Oct 12, 2022 | Image Inpainting | Code Available | 2 | 5 |
| Automating High Quality RT Planning at Scale | Jan 21, 2025 | | Code Available | 2 | 5 |
| BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains | Feb 15, 2024 | Few-Shot Learning, Medical Question Answering | Code Available | 2 | 5 |
| MAGVIT: Masked Generative Video Transformer | Dec 10, 2022 | Multi-Task Learning, Text-to-Video Generation | Code Available | 2 | 5 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2 | 5 |
| GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | Jun 3, 2024 | 3D Object Detection, Image-to-Image Translation | Code Available | 2 | 5 |
| Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs | Jun 14, 2024 | Memorization | Code Available | 2 | 5 |
| H3WB: Human3.6M 3D WholeBody Dataset and Benchmark | Nov 28, 2022 | 3D Facial Landmark Localization, 3D Hand Pose Estimation | Code Available | 2 | 5 |
| PointPillars: Fast Encoders for Object Detection from Point Clouds | Dec 14, 2018 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| VTimeLLM: Empower LLM to Grasp Video Moments | Nov 30, 2023 | Dense Video Captioning, Temporal Relation Extraction | Code Available | 2 | 5 |
| SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories | Mar 11, 2025 | Decision Making, Interactive Segmentation | Code Available | 2 | 5 |
| Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | Nov 15, 2023 | Highlight Detection, Moment Retrieval | Code Available | 2 | 5 |
| Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | Mar 24, 2022 | Image Generation, Semantic Segmentation | Code Available | 2 | 5 |
| GRES: Generalized Referring Expression Segmentation | Jun 1, 2023 | Generalized Referring Expression Segmentation, Referring Expression | Code Available | 2 | 5 |
| SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization | Jun 18, 2024 | Landmark-based Lipreading, Lipreading | Code Available | 2 | 5 |
| ConvLoRA and AdaBN based Domain Adaptation via Self-Training | Feb 7, 2024 | Domain Adaptation, Multi-target Domain Adaptation | Code Available | 2 | 5 |
| A Scalable Communication Protocol for Networks of Large Language Models | Oct 14, 2024 | | Code Available | 2 | 5 |
| Recurrent Diffusion for Large-Scale Parameter Generation | Jan 20, 2025 | GPU | Code Available | 2 | 5 |
| Universal Image Restoration Pre-training via Degradation Classification | Jan 26, 2025 | 5-Degradation Blind All-in-One Image Restoration, Image Restoration | Code Available | 2 | 5 |
| DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis | Mar 19, 2025 | | Code Available | 2 | 5 |
| MultiBooth: Towards Generating All Your Concepts in an Image from Text | Apr 22, 2024 | All, Computational Efficiency | Code Available | 2 | 5 |
| Causal Reasoning and Large Language Models: Opening a New Frontier for Causality | Apr 28, 2023 | Causal Discovery, Common Sense Reasoning | Code Available | 2 | 5 |
| AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss | May 14, 2019 | Style Transfer, Voice Conversion | Code Available | 2 | 5 |
| Graph-of-Thought: Utilizing Large Language Models to Solve Complex and Dynamic Business Problems | Jan 10, 2024 | Decision Making | Code Available | 2 | 5 |
| A Large-Scale Chinese Short-Text Conversation Dataset | Aug 10, 2020 | Dialogue Generation, Short-Text Conversation | Code Available | 2 | 5 |
| Advbox: a toolbox to generate adversarial examples that fool neural networks | Jan 13, 2020 | BIG-bench Machine Learning, Face Recognition | Code Available | 2 | 5 |
| Enhancing Fine-grained Sentiment Classification Exploiting Local Context Embedding | Oct 2, 2020 | Aspect-Based Sentiment Analysis (ABSA), Classification | Code Available | 2 | 5 |
| Good practices for evaluation of machine learning systems | Dec 4, 2024 | | Code Available | 2 | 5 |