Investigating Privacy Bias in Training Data of Language Models

2024-09-05

Yan Shvartzshnaider, Vasisht Duddu

Abstract

As LLMs are integrated into sociotechnical systems, it is crucial to examine the privacy biases they exhibit. A privacy bias refers to the skew in the appropriateness of information flows within a given context that LLMs acquire from large amounts of non-publicly available training data. This skew may either align with existing expectations or be a symptom of systemic issues reflected in the training datasets. We formulate a novel research question: how can we examine privacy biases in the training data of LLMs? We present a novel approach to assess privacy biases, using a contextual integrity-based methodology to evaluate the responses from different LLMs. Our approach accounts for the sensitivity of responses to prompt variations, a sensitivity that otherwise hinders the evaluation of privacy biases. We also investigate how privacy biases are affected by model capacities and optimizations.

Tasks

Quantization
