| Text-Audio-Visual-conditioned Diffusion Model for Video Saliency Prediction | Apr 19, 2025 | Denoising, Image Generation | —Unverified | 0 |
| DTFSal: Audio-Visual Dynamic Token Fusion for Video Saliency Prediction | Apr 14, 2025 | Computational Efficiency, Saliency Prediction | —Unverified | 0 |
| Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action Cues | Feb 1, 2025 | Action Classification, Action Localization | —Unverified | 0 |
| Relevance-guided Audio Visual Fusion for Video Saliency Prediction | Nov 18, 2024 | Prediction, Saliency Prediction | —Unverified | 0 |
| AIM 2024 Challenge on Video Saliency Prediction: Methods and Results | Sep 23, 2024 | Saliency Detection, Saliency Prediction | Code Available | 1 |
| CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion | Aug 21, 2024 | Language Modelling, Large Language Model | —Unverified | 0 |
| SalFoM: Dynamic Saliency Prediction with Video Foundation Models | Apr 3, 2024 | Decoder, Prediction | —Unverified | 0 |
| Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding | Jan 15, 2024 | Decoder, Saliency Prediction | —Unverified | 0 |
| UniST: Towards Unifying Saliency Transformer for Video Saliency Prediction and Detection | Sep 15, 2023 | Decoder, object-detection | —Unverified | 0 |
| Spherical Vision Transformer for 360-degree Video Saliency Prediction | Aug 24, 2023 | Prediction, Saliency Prediction | Code Available | 1 |