Attesting Distributional Properties of Training Data for Machine Learning

2023-08-18

Vasisht Duddu, Anudeep Das, Nora Khayata, Hossein Yalame, Thomas Schneider, N. Asokan

Code

github.com/ssg-research/distribution-attestation
Official · In paper · PyTorch · ★ 4

Abstract

The success of machine learning (ML) has been accompanied by increased concerns about its trustworthiness. Several jurisdictions are preparing ML regulatory frameworks. One such concern is ensuring that model training data has desirable distributional properties for certain sensitive attributes. For example, draft regulations indicate that model trainers are required to show that training datasets have specific distributional properties, such as reflecting the diversity of the population. We propose the notion of property attestation, which allows a prover (e.g., a model trainer) to demonstrate relevant distributional properties of training data to a verifier (e.g., a customer) without revealing the data. We present an effective hybrid property attestation scheme that combines property inference with cryptographic mechanisms.

Tasks

Diversity
