| Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? | Nov 27, 2017 | Action Recognition | Code Available | 2 |
| LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning | Mar 19, 2025 | Instruction Following, Multimodal Reasoning | Code Available | 2 |
| WavJourney: Compositional Audio Creation with Large Language Models | Jul 26, 2023 | Audio Generation | Code Available | 2 |
| OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction | Sep 28, 2019 | Information Retrieval, Question Answering | Code Available | 2 |
| Temporal Action Detection with Structured Segment Networks | Apr 20, 2017 | Action Detection, Action Recognition | Code Available | 2 |
| Flash normalization: fast RMSNorm for LLMs | Jul 12, 2024 | | Code Available | 2 |
| Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures | Feb 7, 2025 | Mathematical Problem-Solving, reinforcement-learning | Code Available | 2 |
| PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction | Aug 31, 2023 | Autonomous Driving | Code Available | 2 |
| An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface | Aug 17, 2024 | Pose Estimation, Pose Retrieval | Code Available | 2 |
| Neptune: The Long Orbit to Benchmarking Long Video Understanding | Dec 12, 2024 | Benchmarking, Multimodal Reasoning | Code Available | 2 |
| Unified Vision-Language Pre-Training for Image Captioning and VQA | Sep 24, 2019 | Decoder, Image Captioning | Code Available | 2 |
| Simulation to Scaled City: Zero-Shot Policy Transfer for Traffic Control via Autonomous Vehicles | Dec 14, 2018 | Autonomous Vehicles, Deep Reinforcement Learning | Code Available | 2 |
| SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning | Apr 10, 2025 | | Code Available | 2 |
| MatMamba: A Matryoshka State Space Model | Oct 9, 2024 | model, Representation Learning | Code Available | 2 |
| High-dimensional Convolutional Networks for Geometric Pattern Recognition | May 17, 2020 | Vocal Bursts Intensity Prediction | Code Available | 2 |
| Boosting Neural Representations for Videos with a Conditional Decoder | Feb 28, 2024 | Decoder | Code Available | 2 |
| A Pilot Study for Chinese SQL Semantic Parsing | Sep 29, 2019 | Cross-Lingual Word Embeddings, Question Answering | Code Available | 2 |
| Differentiable Convex Optimization Layers | Oct 28, 2019 | Inductive Bias | Code Available | 2 |
| Thought Cloning: Learning to Think while Acting by Imitating Human Thinking | Jun 1, 2023 | Imitation Learning, Reinforcement Learning (RL) | Code Available | 2 |
| DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Aug 11, 2023 | Dataset Generation, Decoder | Code Available | 2 |
| Transferability of Adversarial Examples to Attack Cloud-based Image Classifier Service | Jan 8, 2020 | Classification, General Classification | Code Available | 2 |
| A Little Fog for a Large Turn | Jan 16, 2020 | Adversarial Attack, Autonomous Navigation | Code Available | 2 |
| Torch-Struct: Deep Structured Prediction Library | Feb 3, 2020 | Deep Learning, Prediction | Code Available | 2 |
| Don't be lazy: CompleteP enables compute-efficient deep transformers | May 2, 2025 | | Code Available | 2 |
| Semantically-Guided Representation Learning for Self-Supervised Monocular Depth | Feb 27, 2020 | Depth Estimation, Depth Prediction | Code Available | 2 |
| Unbiased Scene Graph Generation from Biased Training | Feb 27, 2020 | Causal Inference, counterfactual | Code Available | 2 |
| Knowledge Graphs | Mar 4, 2020 | Knowledge Graphs | Code Available | 2 |
| Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML | Mar 11, 2020 | Handwritten Digit Recognition | Code Available | 2 |
| UnetTSF: A Better Performance Linear Complexity Time Series Prediction Model | Jan 5, 2024 | Time Series, Time Series Prediction | Code Available | 2 |
| Detection in Crowded Scenes: One Proposal, Multiple Predictions | Mar 20, 2020 | Object Detection, Pedestrian Detection | Code Available | 2 |
| Quantile Encoder: Tackling High Cardinality Categorical Features in Regression Problems | May 27, 2021 | regression, Specificity | Code Available | 2 |
| Fixing the train-test resolution discrepancy: FixEfficientNet | Mar 18, 2020 | Data Augmentation, Image Classification | Code Available | 2 |
| Self-Supervised Log Parsing | Mar 17, 2020 | Anomaly Detection, Fault Detection | Code Available | 2 |
| Augmenting Differentiable Simulators with Neural Networks to Close the Sim2Real Gap | Jul 12, 2020 | | Code Available | 2 |
| COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images | Mar 22, 2020 | COVID-19 Diagnosis | Code Available | 2 |
| Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data | Jan 12, 2024 | | Code Available | 2 |
| Flatten Anything: Unsupervised Neural Surface Parameterization | May 23, 2024 | | Code Available | 2 |
| SLOT: Sample-specific Language Model Optimization at Test-time | May 18, 2025 | GSM8K, Language Modeling | Code Available | 2 |
| Omni-sourced Webly-supervised Learning for Video Recognition | Mar 29, 2020 | Action Classification, Action Recognition | Code Available | 2 |
| Background Matting: The World is Your Green Screen | Apr 1, 2020 | Image Matting | Code Available | 2 |
| BAE: BERT-based Adversarial Examples for Text Classification | Apr 4, 2020 | Adversarial Attack, Adversarial Text | Code Available | 2 |
| D4RL: Datasets for Deep Data-Driven Reinforcement Learning | Apr 15, 2020 | D4RL, Offline RL | Code Available | 2 |
| Cross-lingual Contextualized Topic Models with Zero-shot Learning | Apr 16, 2020 | Topic Models, Transfer Learning | Code Available | 2 |
| The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget | Apr 24, 2020 | reinforcement-learning, Reinforcement Learning | Code Available | 2 |
| FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis | Feb 20, 2025 | Age Estimation, Benchmarking | Code Available | 2 |
| LGSVL Simulator: A High Fidelity Simulator for Autonomous Driving | May 7, 2020 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 |
| vMAP: Vectorised Object Mapping for Neural Field SLAM | Feb 3, 2023 | Object | Code Available | 2 |
| Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior | Apr 16, 2024 | Neural Rendering, Text to 3D | Code Available | 2 |
| Graph Structure Learning for Robust Graph Neural Networks | May 20, 2020 | Graph Neural Network, Graph structure learning | Code Available | 2 |
| AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs | Apr 21, 2024 | MMLU, Red Teaming | Code Available | 2 |