CLUE-Mark

Authors: Kareem Shehata, Aashish Kolluri, Prateek Saxena

Abstract

The proliferation of AI-generated images has created a new problem known as “AI Slop”: low-effort, mass-produced content that is difficult to distinguish from authentic images and that fuels misinformation and erodes user trust. Watermarking the output of popular diffusion models offers a promising solution: if social media platforms and end users can easily identify images generated by the most common AI providers, then a very large proportion of the AI Slop problem can be avoided. However, existing watermarking techniques are heuristic and lack formal guarantees of undetectability. Techniques that were claimed not to affect image quality were later shown to degrade it, which hampers adoption because providers and users are averse to any reduction in output quality.

In this work, we introduce CLUE-Mark, a provably undetectable watermarking scheme for diffusion models. CLUE-Mark requires no changes to the model being used, is computationally efficient, and, because it is provably undetectable, is guaranteed to have no impact on model output quality. Our approach leverages the Continuous Learning With Errors (CLWE) problem, a cryptographically hard lattice problem, to embed hidden messages in the latent noise vectors used by diffusion models. By proving undetectability via reduction from a cryptographically hard problem, we ensure that the message is imperceptible not only to human observers and ad hoc heuristics, but also to any efficient detector that does not have the secret key. CLUE-Mark focuses primarily on undetectability while maintaining sufficient robustness to common non-adversarial transformations. Empirical evaluations on state-of-the-art diffusion models confirm that CLUE-Mark achieves high message recovery, preserves image quality, and is robust to minor perturbations such as JPEG compression and brightness adjustments.
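
To make the role of the secret key concrete, the following is a minimal sketch of the standard CLWE sample distribution only; it is not the CLUE-Mark embedding or recovery procedure. A sample is a pair (y, z) with y drawn from a standard Gaussian and z = gamma * <w, y> + e mod 1 for a secret unit vector w and small Gaussian noise e. The function names, the dimension, and the values of gamma and beta are illustrative assumptions chosen for a quick demonstration, not parameters from this work and not cryptographically secure choices.

```python
import numpy as np

def sample_clwe(n_samples, secret, gamma, beta, rng):
    """Draw CLWE samples (y, z): y ~ N(0, I), z = gamma*<secret, y> + e mod 1."""
    dim = secret.shape[0]
    y = rng.standard_normal((n_samples, dim))
    e = beta * rng.standard_normal(n_samples)  # small Gaussian noise
    z = (gamma * (y @ secret) + e) % 1.0
    return y, z

def centered_residual(y, z, key, gamma):
    """Residual z - gamma*<key, y> mod 1, mapped into [-0.5, 0.5)."""
    r = (z - gamma * (y @ key)) % 1.0
    return (r + 0.5) % 1.0 - 0.5

def random_unit_vector(dim, rng):
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
dim, gamma, beta = 64, 4.0, 0.01  # toy values (assumption), not secure parameters

secret = random_unit_vector(dim, rng)
wrong_key = random_unit_vector(dim, rng)

y, z = sample_clwe(2000, secret, gamma, beta, rng)

# With the secret key the residual collapses to the small noise term;
# with an unrelated key it is spread across the whole unit interval.
spread_with_key = np.abs(centered_residual(y, z, secret, gamma)).mean()
spread_wrong_key = np.abs(centered_residual(y, z, wrong_key, gamma)).mean()
print(spread_with_key, spread_wrong_key)
```

With the correct key, the residuals concentrate tightly around zero, whereas with an unrelated key they look roughly uniform. The CLWE hardness assumption is the much stronger statement that, for suitable parameters, no efficient algorithm without the key can tell such samples apart from pairs in which z is independent of y.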