SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models

2025-02-05

Amirhossein Dabiriaghdam, Lele Wang

Abstract

The rapid proliferation of large language models (LLMs) has created an urgent need for reliable methods to detect whether a text was generated by such a model. In this paper, we propose SimMark, a post hoc watermarking algorithm that makes LLM outputs traceable without requiring access to the model's internal logits, which makes it compatible with a wide range of LLMs, including API-only models. SimMark uses the similarity of semantic sentence embeddings together with rejection sampling to impose detectable statistical patterns that are imperceptible to humans, and it employs a soft counting mechanism to achieve robustness against paraphrasing attacks. Experimental results demonstrate that SimMark sets a new benchmark for robust watermarking of LLM-generated content, surpassing prior sentence-level watermarking techniques in robustness, sampling efficiency, and applicability across diverse domains, all while preserving text quality.
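The abstract names three ingredients: sentence-embedding similarity, rejection sampling, and soft counting. The following is a minimal Python sketch of how such pieces could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the hash-based `toy_embed` stands in for a real semantic sentence encoder, the acceptance interval `[low, high]`, the linear `margin` decay, and all function names are hypothetical and do not come from the paper.

```python
import hashlib
import math
import random


def toy_embed(sentence, dim=16):
    """Hypothetical stand-in for a sentence encoder.

    Maps a sentence to a deterministic pseudo-random unit vector.
    It carries no semantics; it only makes the sketch runnable.
    """
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def cosine(u, v):
    """Cosine similarity of two unit vectors (a plain dot product)."""
    return sum(a * b for a, b in zip(u, v))


def sample_sentence(prev, propose, low, high, max_tries=50):
    """Rejection sampling (assumed form): draw candidate sentences from
    `propose` until the similarity to the previous sentence falls in
    [low, high]. Returns (sentence, accepted_flag)."""
    prev_vec = toy_embed(prev)
    candidate = None
    for _ in range(max_tries):
        candidate = propose()
        if low <= cosine(prev_vec, toy_embed(candidate)) <= high:
            return candidate, True
    return candidate, False  # fall back to the last draw


def soft_count(sim, low, high, margin):
    """Soft counting (assumed form): 1 inside the interval, decaying
    linearly to 0 over `margin` outside it, so a small similarity
    shift does not flip a sentence from counted to uncounted."""
    if low <= sim <= high:
        return 1.0
    dist = (low - sim) if sim < low else (sim - high)
    return max(0.0, 1.0 - dist / margin)


def soft_score(sentences, low, high, margin):
    """Average soft count over consecutive sentence pairs."""
    pairs = list(zip(sentences, sentences[1:]))
    sims = [cosine(toy_embed(a), toy_embed(b)) for a, b in pairs]
    return sum(soft_count(s, low, high, margin) for s in sims) / len(sims)
```

In this sketch only generated text is needed at both ends: `propose` can wrap any text-generation call, including an API-only model, and detection recomputes similarities from the text alone. A real detector would compare the resulting score against its expected value on unwatermarked text using a statistical test; that step is omitted here because the abstract does not specify it.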
