| Video Action Understanding | Oct 13, 2020 | Action Understanding, Deep Learning | Code Available | 0 |
| Global Self-Attention Networks for Image Recognition | Oct 6, 2020 | Video Understanding | Unverified | 0 |
| Features Understanding in 3D CNNs for Actions Recognition in Video | Oct 1, 2020 | Action Recognition, Decision Making | Code Available | 0 |
| Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition | Aug 3, 2020 | Action Recognition, Optical Flow Estimation | Unverified | 0 |
| Self-supervised Motion Representation via Scattering Local Motion Cues | Aug 1, 2020 | Action Recognition, Optical Flow Estimation | Unverified | 0 |
| Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection | Jul 29, 2020 | Object Detection | Unverified | 0 |
| Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos | Jul 22, 2020 | Action Recognition, Temporal Action Localization | Unverified | 0 |
| MovieNet: A Holistic Dataset for Movie Understanding | Jul 21, 2020 | Video Understanding | Unverified | 0 |
| Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training | Jul 5, 2020 | Decoder, Question Answering | Unverified | 0 |
| Video Understanding as Machine Translation | Jun 12, 2020 | Machine Translation, Metric Learning | Unverified | 0 |
| Screencast Tutorial Video Understanding | Jun 1, 2020 | Object Detection | Code Available | 0 |
| Large Scale Video Representation Learning via Relational Graph Clustering | Jun 1, 2020 | Clustering, Graph Clustering | Unverified | 0 |
| CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction | May 26, 2020 | Autonomous Vehicles, Prediction | Code Available | 0 |
| DramaQA: Character-Centered Video Story Understanding with Hierarchical QA | May 7, 2020 | Question Answering, Video Question Answering | Code Available | 0 |
| HLVU : A New Challenge to Test Deep Understanding of Movies the Way Humans do | May 1, 2020 | Video Understanding | Unverified | 0 |
| CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning | May 1, 2020 | Diagnostic, Object | Unverified | 0 |
| Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube | Apr 29, 2020 | Automatic Speech Recognition (ASR) | Code Available | 0 |
| DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture | Apr 18, 2020 | Anomaly Detection, Classification | Code Available | 0 |
| Knowledge-Based Visual Question Answering in Videos | Apr 17, 2020 | Question Answering, Video Question Answering | Unverified | 0 |
| Real-Time Segmentation Networks should be Latency Aware | Apr 6, 2020 | Autonomous Vehicles, Scene Segmentation | Unverified | 0 |
| Context Modulated Dynamic Networks for Actor and Action Video Segmentation with Language Queries | Apr 3, 2020 | Referring Expression Segmentation, Video Segmentation | Unverified | 0 |
| Fully Automated Hand Hygiene Monitoring in Operating Room using 3D Convolutional Neural Network | Mar 20, 2020 | Optical Flow Estimation, Transfer Learning | Unverified | 0 |
| Beyond the Camera: Neural Networks in World Coordinates | Mar 12, 2020 | Action Recognition, Video Stabilization | Unverified | 0 |
| CTM: Collaborative Temporal Modeling for Action Recognition | Feb 8, 2020 | Action Recognition, Video Understanding | Unverified | 0 |
| Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data | Jan 17, 2020 | Graph Learning, Video Understanding | Unverified | 0 |
| SoccerDB: A Large-Scale Database for Comprehensive Video Understanding | Dec 10, 2019 | Action Classification, Action Detection | Code Available | 0 |
| Video action detection by learning graph-based spatio-temporal interactions | Dec 9, 2019 | Action Detection, Action Localization | Code Available | 0 |
| VideoDG: Generalizing Temporal Relations in Videos to Novel Domains | Dec 8, 2019 | Action Recognition, Data Augmentation | Code Available | 0 |
| Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection | Dec 7, 2019 | Object Detection | Code Available | 0 |
| A Context-Aware Loss Function for Action Spotting in Soccer Videos | Dec 3, 2019 | Action Spotting, Video Understanding | Code Available | 0 |
| BERT for Large-scale Video Segment Classification with Test-time Augmentation | Dec 2, 2019 | General Classification, Video Understanding | Unverified | 0 |
| AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization | Nov 27, 2019 | Action Classification, Action Recognition | Unverified | 0 |
| Mimic The Raw Domain: Accelerating Action Recognition in the Compressed Domain | Nov 19, 2019 | Action Recognition, Video Recognition | Unverified | 0 |
| Cross-Class Relevance Learning for Temporal Concept Localization | Nov 19, 2019 | Feature Engineering, Video Understanding | Unverified | 0 |
| Multi-attention Networks for Temporal Localization of Video-level Labels | Nov 15, 2019 | Action Recognition, Temporal Action Localization | Code Available | 0 |
| Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding | Nov 1, 2019 | Action Detection, Action Recognition | Code Available | 0 |
| Comprehensive Video Understanding: Video summarization with content-based video recommender design | Oct 30, 2019 | Action Recognition, Data Augmentation | Unverified | 0 |
| MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization | Oct 27, 2019 | Knowledge Distillation, Video Understanding | Code Available | 0 |
| KnowIT VQA: Answering Knowledge-Based Questions about Videos | Oct 23, 2019 | Question Answering, Video Question Answering | Unverified | 0 |
| AFO-TAD: Anchor-free One-Stage Detector for Temporal Action Detection | Oct 18, 2019 | Action Detection, Object Detection | Unverified | 0 |
| Tiny Video Networks | Oct 15, 2019 | CPU, GPU | Code Available | 0 |
| OmniTrack: Real-time detection and tracking of objects, text and logos in video | Oct 14, 2019 | GPU, Object Detection | Unverified | 0 |
| ViP: Video Platform for PyTorch | Oct 7, 2019 | Benchmarking, Video Understanding | Code Available | 0 |
| A SPIKING SEQUENTIAL MODEL: RECURRENT LEAKY INTEGRATE-AND-FIRE | Sep 25, 2019 | Text Summarization, Video Understanding | Unverified | 0 |
| Question Answering is a Format; When is it Useful? | Sep 25, 2019 | Machine Translation, Question Answering | Unverified | 0 |
| Zero-Shot Action Recognition in Videos: A Survey | Sep 13, 2019 | Action Recognition, Action Recognition In Still Images | Unverified | 0 |
| Gaussian Temporal Awareness Networks for Action Localization | Sep 9, 2019 | Action Localization, Object Detection | Code Available | 0 |
| Only Time Can Tell: Discovering Temporal Data for Temporal Modeling | Jul 19, 2019 | Benchmarking, Motion Estimation | Unverified | 0 |
| Localizing Unseen Activities in Video via Image Query | Jun 28, 2019 | Action Localization, Video Understanding | Unverified | 0 |
| UniDual: A Unified Model for Image and Video Understanding | Jun 10, 2019 | Multi-Task Learning, Video Understanding | Unverified | 0 |