| Online Pseudo-average Shifting Attention (PASA) for Robust Low-precision LLM Inference: Algorithms and Numerical Analysis | Feb 26, 2025 | Video Generation | —Unverified | 0 | 0 |
| On the Content Bias in Frechet Video Distance | Jan 1, 2024 | Video Generation | —Unverified | 0 | 0 |
| On the Limitations of Vision-Language Models in Understanding Image Transforms | Mar 12, 2025 | Question Answering, Video Generation | —Unverified | 0 | 0 |
| JOG3R: Towards 3D-Consistent Video Generators | Jan 2, 2025 | Camera Pose Estimation, Pose Estimation | —Unverified | 0 | 0 |
| OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation | Nov 28, 2024 | Video Generation | —Unverified | 0 | 0 |
| OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation | Jul 2, 2024 | Text-to-Video Generation, Video Generation | —Unverified | 0 | 0 |
| Opportunities and challenges in the application of large artificial intelligence models in radiology | Mar 24, 2024 | Video Generation | —Unverified | 0 | 0 |
| Optical-Flow Guided Prompt Optimization for Coherent Video Generation | Nov 23, 2024 | Optical Flow Estimation, Video Generation | —Unverified | 0 | 0 |
| Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation | Nov 3, 2024 | Mamba, Optical Flow Estimation | —Unverified | 0 | 0 |
| POS: A Prompts Optimization Suite for Augmenting Text-to-Video Generation | Nov 2, 2023 | Denoising, POS | —Unverified | 0 | 0 |
| OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions | Sep 28, 2023 | Talking Head Generation, Video Generation | —Unverified | 0 | 0 |
| OSV: One Step is Enough for High-Quality Image to Video Generation | Sep 17, 2024 | Image to Video Generation, Video Generation | —Unverified | 0 | 0 |
| Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space | Mar 12, 2025 | Autonomous Driving, Video Generation | —Unverified | 0 | 0 |
| Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion | Jan 15, 2025 | Denoising, Video Denoising | —Unverified | 0 | 0 |
| A Unit Enhancement and Guidance Framework for Audio-Driven Avatar Video Generation | May 6, 2025 | Human Animation, Video Generation | —Unverified | 0 | 0 |
| PaintScene4D: Consistent 4D Scene Generation from Text Prompts | Dec 5, 2024 | Scene Generation, Video Generation | —Unverified | 0 | 0 |
| PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms | May 28, 2025 | Denoising, Video Generation | —Unverified | 0 | 0 |
| Parallelized Autoregressive Visual Generation | Dec 19, 2024 | Video Generation | —Unverified | 0 | 0 |
| Parallel Multiscale Autoregressive Density Estimation | Mar 10, 2017 | Conditional Image Generation, Density Estimation | —Unverified | 0 | 0 |
| PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models | Jun 19, 2025 | Image Generation, Quantization | —Unverified | 0 | 0 |
| Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey | Nov 26, 2024 | DeepFake Detection, Face Swapping | —Unverified | 0 | 0 |
| Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception | Jan 1, 2025 | Image Captioning, Image Generation | —Unverified | 0 | 0 |
| PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution | Jan 1, 2025 | 4k, Super-Resolution | —Unverified | 0 | 0 |
| Pathways on the Image Manifold: Image Editing via Video Generation | Nov 25, 2024 | Text-based Image Editing, Video Generation | —Unverified | 0 | 0 |
| People are poorly equipped to detect AI-powered voice clones | Oct 3, 2024 | Video Generation | —Unverified | 0 | 0 |
| PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation | Nov 26, 2024 | Video Generation | —Unverified | 0 | 0 |
| Photorealistic Video Generation with Diffusion Models | Dec 11, 2023 | Super-Resolution, Text-to-Video Generation | —Unverified | 0 | 0 |
| PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation | Apr 19, 2024 | motion prediction, Object | —Unverified | 0 | 0 |
| PhysMotion: Physics-Grounded Dynamics From a Single Image | Nov 26, 2024 | Video Generation | —Unverified | 0 | 0 |
| PlayerOne: Egocentric World Simulator | Jun 11, 2025 | Video Generation | —Unverified | 0 | 0 |
| PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement | Jun 9, 2025 | Video Generation | —Unverified | 0 | 0 |
| PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control | May 23, 2024 | Video Generation | —Unverified | 0 | 0 |
| Pose-Guided Fine-Grained Sign Language Video Generation | Sep 25, 2024 | Image Generation, Optical Flow Estimation | —Unverified | 0 | 0 |
| Pose-Guided High-Resolution Appearance Transfer via Progressive Training | Aug 27, 2020 | Appearance Transfer, Decoder | —Unverified | 0 | 0 |
| Pose Guided Human Video Generation | Jul 30, 2018 | Generative Adversarial Network, motion prediction | —Unverified | 0 | 0 |
| PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth | May 3, 2025 | Autonomous Driving, Camera Pose Estimation | —Unverified | 0 | 0 |
| PoseTraj: Pose-Aware Trajectory Control in Video Diffusion | Mar 20, 2025 | Disentanglement, Video Generation | —Unverified | 0 | 0 |
| Position: Interactive Generative Video as Next-Generation Game Engine | Mar 21, 2025 | Position, Video Generation | —Unverified | 0 | 0 |
| Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models | May 17, 2023 | Image Generation, Text-to-Video Generation | —Unverified | 0 | 0 |
| ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images | May 10, 2025 | Denoising, Video Generation | —Unverified | 0 | 0 |
| Progressive Growing of Video Tokenizers for Highly Compressed Latent Spaces | Jan 9, 2025 | Video Generation | —Unverified | 0 | 0 |
| PromptCoT: Align Prompt Distribution via Adapted Chain-of-Thought | Jan 1, 2024 | Computational Efficiency, Prompt Engineering | —Unverified | 0 | 0 |
| ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos | May 24, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 | 0 |
| Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics | Aug 8, 2024 | Video Generation | —Unverified | 0 | 0 |
| PV3D: A 3D Generative Model for Portrait Video Generation | Dec 13, 2022 | Video Generation | —Unverified | 0 | 0 |
| Physical Informed Driving World Model | Dec 11, 2024 | 3D Object Detection, Autonomous Driving | —Unverified | 0 | 0 |
| Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs | Sep 30, 2024 | Benchmarking, Multiple-choice | —Unverified | 0 | 0 |
| Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs | Jan 1, 2025 | Multiple-choice, Video Generation | —Unverified | 0 | 0 |
| Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning | Jan 11, 2025 | Video Editing, Video Generation | —Unverified | 0 | 0 |
| Qualitative Failures of Image Generation Models and Their Application in Detecting Deepfakes | Mar 29, 2023 | Image Generation, Video Generation | —Unverified | 0 | 0 |