SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset

2024-06-20

Josef Dai, Tianle Chen, Xuyao Wang, Ziran Yang, Taiye Chen, Jiaming Ji, Yaodong Yang

Code

github.com/pku-alignment/safe-sora
Official implementation, linked in the paper (PyTorch)

Abstract

To mitigate the risk of harmful outputs from large vision models (LVMs), we introduce the SafeSora dataset to promote research on aligning text-to-video generation with human values. This dataset encompasses human preferences in text-to-video generation tasks along two primary dimensions: helpfulness and harmlessness. To capture in-depth human preferences and facilitate structured reasoning by crowdworkers, we subdivide helpfulness into 4 sub-dimensions and harmlessness into 12 sub-categories, serving as the basis for pilot annotations. The SafeSora dataset includes 14,711 unique prompts, 57,333 unique videos generated by 4 distinct LVMs, and 51,691 pairs of preference annotations labeled by humans. We further demonstrate the utility of the SafeSora dataset through several applications, including training the text-video moderation model and aligning LVMs with human preference by fine-tuning a prompt augmentation module or the diffusion model. These applications highlight its potential as the foundation for text-to-video alignment research, such as human preference modeling and the development and validation of alignment algorithms.
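As a rough illustration of the two-dimensional preference structure the abstract describes, the sketch below models one annotated video pair in Python. It is not the schema of the released dataset: the class `PreferencePair`, its field names, the `"video_0"`/`"video_1"` labels, and the helper `agreement_rate` are all hypothetical names chosen for this example. The only thing taken from the abstract is that each pair carries a human preference along both helpfulness and harmlessness.

```python
from dataclasses import dataclass
from typing import Literal, Sequence

# Hypothetical labels for the two videos being compared in one pair.
Choice = Literal["video_0", "video_1"]


@dataclass(frozen=True)
class PreferencePair:
    """One human comparison of two videos generated for the same prompt.

    All field names are assumptions made for illustration only.
    """

    prompt: str
    video_0: str           # identifier or path of the first video
    video_1: str           # identifier or path of the second video
    more_helpful: Choice   # which video the annotator judged more helpful
    more_harmless: Choice  # which video the annotator judged more harmless

    def dimensions_agree(self) -> bool:
        """True when the same video is preferred on both dimensions."""
        return self.more_helpful == self.more_harmless


def agreement_rate(pairs: Sequence[PreferencePair]) -> float:
    """Fraction of pairs where helpfulness and harmlessness pick the same video."""
    if not pairs:
        raise ValueError("agreement_rate needs at least one pair")
    return sum(p.dimensions_agree() for p in pairs) / len(pairs)
```

Keeping the two labels as separate fields, rather than collapsing them into one overall preference, lets the two dimensions be modeled independently, which is the reason the abstract gives for annotating them separately.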

Tasks

Safety Alignment, Text-to-Video Generation, Video Alignment, Video Generation
