| Prototypical Networks for Few-shot Learning | Mar 15, 2017 | Category-Agnostic Pose Estimation, Few-Shot Image Classification | Code Available | 2 |
| BEVStereo: Enhancing Depth Estimation in Multi-view 3D Object Detection with Dynamic Temporal Stereo | Sep 21, 2022 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| Language is All a Graph Needs | Aug 14, 2023 | All, Graph Learning | Code Available | 2 |
| Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark | Apr 23, 2025 | | Code Available | 2 |
| GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians | Dec 4, 2023 | Motion Estimation | Code Available | 2 |
| Decoupling Features in Hierarchical Propagation for Video Object Segmentation | Oct 18, 2022 | Object, Semantic Segmentation | Code Available | 2 |
| Collaborative Neural Rendering using Anime Character Sheets | Jul 12, 2022 | Image Generation, Image to 3D | Code Available | 2 |
| PACO: Parts and Attributes of Common Objects | Jan 4, 2023 | 2D Object Detection, Attribute | Code Available | 2 |
| Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment | Oct 11, 2022 | Video Quality Assessment, Visual Question Answering (VQA) | Code Available | 2 |
| Bayesian Enhancement Models for One-to-Many Mapping in Image Enhancement | Oct 13, 2024 | Image Enhancement, Low-Light Image Enhancement | Code Available | 2 |
| ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification | Feb 13, 2022 | Classification, Management | Code Available | 2 |
| READ: Large-Scale Neural Scene Rendering for Autonomous Driving | May 11, 2022 | 3D Scene Reconstruction, Autonomous Driving | Code Available | 2 |
| LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models | Jun 15, 2023 | Hallucination, Image Captioning | Code Available | 2 |
| Probing the limitations of multimodal language models for chemistry and materials research | Nov 25, 2024 | Experimental Design, Spatial Reasoning | Code Available | 2 |
| CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering | May 2, 2025 | Anomaly Detection, Unsupervised Anomaly Detection | Code Available | 2 |
| The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains | Oct 31, 2024 | GPU, Philosophy | Code Available | 2 |
| Machine Unlearning in Generative AI: A Survey | Jul 30, 2024 | Machine Unlearning, Survey | Code Available | 2 |
| A Survey on Diffusion Models for Anomaly Detection | Jan 20, 2025 | Anomaly Detection, Computational Efficiency | Code Available | 2 |
| PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop | Mar 12, 2025 | Diagnostic, Video Generation | Code Available | 2 |
| NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction | Dec 10, 2022 | Surface Reconstruction | Code Available | 2 |
| DreamText: High Fidelity Scene Text Synthesis | May 23, 2024 | | Code Available | 2 |
| GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models | Jul 2, 2024 | Marketing | Code Available | 2 |
| Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy | Oct 15, 2022 | Feature Compression, Question Answering | Code Available | 2 |
| Cross-Domain Pre-training with Language Models for Transferable Time Series Representations | Mar 19, 2024 | Language Modelling, Time Series | Code Available | 2 |
| Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM | Apr 27, 2023 | Surface Reconstruction | Code Available | 2 |
| GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning | Mar 24, 2023 | Virtual Try-on | Code Available | 2 |
| Decomposing and Editing Predictions by Modeling Model Computation | Apr 17, 2024 | counterfactual, model | Code Available | 2 |
| Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents | Apr 25, 2024 | Decision Making, Specificity | Code Available | 2 |
| Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction | Dec 6, 2024 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 2 |
| Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation | Jun 10, 2025 | Foveation, Image Segmentation | Code Available | 2 |
| Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration | Nov 26, 2024 | 3D Reconstruction, Camera Calibration | Code Available | 2 |
| Counting-Stars: A Multi-evidence, Position-aware, and Scalable Benchmark for Evaluating Long-Context Large Language Models | Mar 18, 2024 | 4k, Position | Code Available | 2 |
| Flow-Guided Transformer for Video Inpainting | Aug 14, 2022 | Retrieval, Video Inpainting | Code Available | 2 |
| DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation | Aug 28, 2023 | Knowledge Graphs | Code Available | 2 |
| A Survey on Multimodal Benchmarks: In the Era of Large AI Models | Sep 21, 2024 | Benchmarking, Survey | Code Available | 2 |
| SocialBench: Sociality Evaluation of Role-Playing Conversational Agents | Mar 20, 2024 | | Code Available | 2 |
| Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms | Mar 26, 2024 | Language Modelling | Code Available | 2 |
| Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings | Jan 26, 2024 | Face Recognition, Security Studies | Code Available | 2 |
| Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World | May 7, 2025 | | Code Available | 2 |
| Masked Modeling for Self-supervised Representation Learning on Vision and Beyond | Dec 31, 2023 | Representation Learning, Self-Supervised Learning | Code Available | 2 |
| MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild | Apr 13, 2024 | cross-modal alignment, Dynamic Facial Expression Recognition | Code Available | 2 |
| Neural Fields with Thermal Activations for Arbitrary-Scale Super-Resolution | Nov 29, 2023 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| Pengi: An Audio Language Model for Audio Tasks | May 19, 2023 | Audio captioning, Audio Question Answering | Code Available | 2 |
| EMO-SUPERB: An In-depth Look at Speech Emotion Recognition | Feb 20, 2024 | Emotion Recognition, Self-Supervised Learning | Code Available | 2 |
| Latent Neural Operator for Solving Forward and Inverse PDE Problems | Jun 6, 2024 | Computational Efficiency, GPU | Code Available | 2 |
| EVA3D: Compositional 3D Human Generation from 2D Image Collections | Oct 10, 2022 | Diversity, NeRF | Code Available | 2 |
| Tightly-Coupled LiDAR-IMU-Wheel Odometry with Online Calibration of a Kinematic Model for Skid-Steering Robots | Apr 3, 2024 | | Code Available | 2 |
| CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement | Sep 22, 2022 | Audio Super-Resolution, Automatic Speech Recognition | Code Available | 2 |
| Progressive Pretext Task Learning for Human Trajectory Prediction | Jul 16, 2024 | Knowledge Distillation, Prediction | Code Available | 2 |
| Natural Language Reinforcement Learning | Nov 21, 2024 | Decision Making, reinforcement-learning | Code Available | 2 |