| SR-LIVO: LiDAR-Inertial-Visual Odometry and Mapping with Sweep Reconstruction | Dec 28, 2023 | Pose Estimation, Visual Odometry | Code Available | 2 |
| Can Language Models Solve Olympiad Programming? | Apr 16, 2024 | | Code Available | 2 |
| Improving Autoformalization using Type Checking | Jun 11, 2024 | Informal-to-formal Style Transfer | Code Available | 2 |
| Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection | Sep 26, 2024 | Event Detection, Representation Learning | Code Available | 2 |
| Masked Autoencoders for Point Cloud Self-supervised Learning | Mar 13, 2022 | 3D Part Segmentation, 3D Point Cloud Classification | Code Available | 2 |
| MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning | May 4, 2024 | Earth Observation, image-classification | Code Available | 2 |
| Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Jun 25, 2024 | cross-modal alignment, Image Classification | Code Available | 2 |
| SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction | Mar 18, 2024 | Autonomous Vehicles, motion prediction | Code Available | 2 |
| ZooPFL: Exploring Black-box Foundation Models for Personalized Federated Learning | Oct 8, 2023 | Federated Learning, Personalized Federated Learning | Code Available | 2 |
| Fancy123: One Image to High-Quality 3D Mesh Generation via Plug-and-Play Deformation | Nov 25, 2024 | Image to 3D | Code Available | 2 |
| Fast-Poly: A Fast Polyhedral Framework For 3D Multi-Object Tracking | Mar 20, 2024 | 3D Multi-Object Tracking, CPU | Code Available | 2 |
| Attention Concatenation Volume for Accurate and Efficient Stereo Matching | Mar 4, 2022 | Patch Matching, Stereo Depth Estimation | Code Available | 2 |
| Crafting Interpretable Embeddings by Asking LLMs Questions | May 26, 2024 | Question Answering | Code Available | 2 |
| PodAgent: A Comprehensive Framework for Podcast Generation | Mar 1, 2025 | Audio Generation, Speech Synthesis | Code Available | 2 |
| FLAME: Financial Large-Language Model Assessment and Metrics Evaluation | Jan 3, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Octopus: Embodied Vision-Language Programmer from Environmental Feedback | Oct 12, 2023 | Benchmarking, Code Generation | Code Available | 2 |
| Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation | Feb 28, 2024 | Code Generation, In-Context Learning | Code Available | 2 |
| Learning Human-Inspired Force Strategies for Robotic Assembly | Mar 22, 2023 | | Code Available | 2 |
| Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations | May 3, 2024 | Optical Flow Estimation, Reference-based Super-Resolution | Code Available | 2 |
| MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning | Nov 4, 2023 | Multi-Task Learning | Code Available | 2 |
| When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | Feb 16, 2024 | Mathematical Reasoning, Re-Ranking | Code Available | 2 |
| ScreenAI: A Vision-Language Model for UI and Infographics Understanding | Feb 7, 2024 | Chart Question Answering, Language Modeling | Code Available | 2 |
| Learning to Prompt for Vision-Language Models | Sep 2, 2021 | Domain Generalization, Few-shot Age Estimation | Code Available | 2 |
| EmoFace: Audio-driven Emotional 3D Face Animation | Jul 17, 2024 | 3D Face Animation | Code Available | 2 |
| OmniBench: Towards The Future of Universal Omni-Language Models | Sep 23, 2024 | Instruction Following | Code Available | 2 |
| ADATIME: A Benchmarking Suite for Domain Adaptation on Time Series Data | Mar 15, 2022 | Benchmarking, Domain Adaptation | Code Available | 2 |
| ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction | Jul 9, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| InteractRank: Personalized Web-Scale Search Pre-Ranking with Cross Interaction Features | Apr 9, 2025 | Computational Efficiency | Code Available | 2 |
| Specializing Smaller Language Models towards Multi-Step Reasoning | Jan 30, 2023 | Math, Model Selection | Code Available | 2 |
| Stitchable Neural Networks | Feb 13, 2023 | Image Classification | Code Available | 2 |
| Respecting causality is all you need for training physics-informed neural networks | Mar 14, 2022 | All, Attribute | Code Available | 2 |
| Towards Interpretable Mental Health Analysis with Large Language Models | Apr 6, 2023 | Causal Emotion Entailment, Emotion Recognition | Code Available | 2 |
| Cross-Modality Safety Alignment | Jun 21, 2024 | Safety Alignment | Code Available | 2 |
| FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization | Apr 21, 2024 | Anomaly Detection, Position | Code Available | 2 |
| HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Apr 8, 2025 | CPU, GPU | Code Available | 2 |
| Target conversation extraction: Source separation using turn-taking dynamics | Jul 15, 2024 | | Code Available | 2 |
| Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts | Mar 14, 2024 | Denoising, Mixture-of-Experts | Code Available | 2 |
| GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models | Sep 6, 2023 | | Code Available | 2 |
| BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions | Aug 19, 2023 | MME, Optical Character Recognition (OCR) | Code Available | 2 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Jul 18, 2023 | Knowledge Distillation, object-detection | Code Available | 2 |
| normflows: A PyTorch Package for Normalizing Flows | Jan 26, 2023 | Image Generation, Variational Inference | Code Available | 2 |
| WidthFormer: Toward Efficient Transformer-based BEV View Transformation | Jan 8, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System | Jun 27, 2023 | | Code Available | 2 |
| Deep Incubation: Training Large Models by Divide-and-Conquering | Dec 8, 2022 | Image Segmentation, object-detection | Code Available | 2 |
| Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models | Mar 18, 2025 | Anatomy, Attribute | Code Available | 2 |
| Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark | May 14, 2024 | | Code Available | 2 |
| MARLIN: Masked Autoencoder for facial video Representation LearnINg | Nov 12, 2022 | Action Classification, Attribute | Code Available | 2 |
| GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization | Sep 27, 2023 | Contrastive Learning, geo-localization | Code Available | 2 |
| Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey | Sep 3, 2024 | Out-of-Distribution Detection | Code Available | 2 |
| StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams | Jun 10, 2025 | 3DGS, 3D Reconstruction | Code Available | 2 |