Privacy-Preserving Synthetic Educational Data Generation
2022-07-07
Jill-Jênn Vie, Tomas Rigaux, Sein Minn
- Code: github.com/akulen/privgen (official, PyTorch)
Abstract
Institutions collect massive amounts of learning traces, but they may not disclose them because of privacy concerns. Synthetic data generation opens new opportunities for research in education. In this paper, we present a generative model for educational data that can preserve the privacy of participants, and an evaluation framework for comparing synthetic data generators. We show how naive pseudonymization can lead to re-identification threats and suggest techniques to guarantee privacy. We evaluate our method on existing massive open educational datasets.
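
To illustrate the re-identification risk mentioned in the abstract, here is a minimal sketch of a linkage attack on pseudonymized traces. It is not the authors' method or code: the toy data, the `pseudonymize` and `reidentify` helpers, and the attacker's side knowledge are all hypothetical, chosen only to show why replacing names with opaque IDs may not be enough.

```python
from collections import defaultdict

# Hypothetical toy data: (student, item, correct) learning traces.
traces = [
    ("alice", "q1", 1), ("alice", "q2", 0), ("alice", "q3", 1),
    ("bob",   "q1", 1), ("bob",   "q2", 1),
    ("carol", "q2", 0), ("carol", "q3", 1), ("carol", "q4", 1),
]

def pseudonymize(traces):
    """Replace each student name with an opaque ID; all other fields are kept."""
    ids = {}
    out = []
    for student, item, correct in traces:
        pid = ids.setdefault(student, f"user_{len(ids)}")
        out.append((pid, item, correct))
    return out

def reidentify(released, known):
    """Return pseudonyms whose released trace contains every known (item, correct) pair."""
    by_user = defaultdict(set)
    for pid, item, correct in released:
        by_user[pid].add((item, correct))
    return [pid for pid, seen in by_user.items() if known <= seen]

released = pseudonymize(traces)

# Assumed side knowledge: the target failed q2 and answered q4 correctly.
candidates = reidentify(released, {("q2", 0), ("q4", 1)})
print(candidates)
```

In this toy example, two observed answers are enough to narrow the released data down to a single pseudonym, after which the rest of that user's trace is exposed. Real datasets are larger, but long answer sequences tend to be highly distinctive, which is the motivation for generating synthetic data instead of releasing pseudonymized records.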