Cryptographic Primitives for ML Security

ML systems are increasingly deployed in security-critical applications, but traditional defenses often rely on heuristics that lack formal guarantees. In this project, we explore how cryptographic primitives can provide provable guarantees for model integrity, provenance, and privacy.

We study data repudiation – proving that a given data point was not used in model training – and give the first algebraic conditions for unforgeability of stochastic gradient descent, under which a model checkpoint cannot be forged from different training data.
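
As a rough illustration of the underlying question (not the paper's formal definitions or conditions), consider the standard SGD update: a checkpoint would be forgeable if some different batch reproduced exactly the same update. The toy model, learning rate, and batches in the sketch below are hypothetical.

```python
import numpy as np

# Toy linear model with squared loss; the model, learning rate, and batches
# are illustrative only, not taken from the paper.
rng = np.random.default_rng(0)
w = rng.normal(size=3)           # current checkpoint w_t
eta = 0.1                        # learning rate

def sgd_step(w, X, y, eta):
    """One SGD step: w_{t+1} = w_t - eta * grad of mean squared error on (X, y)."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - eta * grad

# Honest batch B_t and a candidate alternative batch B'_t.
X1, y1 = rng.normal(size=(4, 3)), rng.normal(size=4)
X2, y2 = rng.normal(size=(4, 3)), rng.normal(size=4)

w_next_1 = sgd_step(w, X1, y1, eta)
w_next_2 = sgd_step(w, X2, y2, eta)

# Unforgeability, informally: no efficiently findable batch B'_t != B_t should
# reproduce the same next checkpoint. Here we only check whether two random
# batches happen to collide (with overwhelming probability they do not).
print("checkpoints collide:", np.allclose(w_next_1, w_next_2))
```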

We develop CLUE-Mark, the first provably undetectable watermarking scheme for diffusion models. Unlike prior schemes, it reduces security to the hardness of the Continuous Learning With Errors (CLWE) problem, making the watermark undetectable to any computationally bounded adversary without the secret key – including steganographic attackers – while preserving output quality. This makes it suitable for verifying the provenance of AI-generated content.
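
The sketch below gives a heavily simplified picture of the CLWE-style hiding idea: a signal is embedded along a secret direction of Gaussian-looking noise, and only the key holder can test for it. The dimension, spacing, width, and detector here are illustrative; the actual CLUE-Mark construction, parameters, and security proof are in the paper.

```python
import numpy as np

# Simplified sketch: hide a "pancake" structure along a secret direction of
# otherwise Gaussian noise. Parameters (n, gamma, beta) and the detector are
# illustrative assumptions, not the CLUE-Mark construction itself.
rng = np.random.default_rng(1)
n, gamma, beta = 64, 4.0, 0.02   # dimension, pancake spacing 1/gamma, pancake width

secret = rng.normal(size=n)
secret /= np.linalg.norm(secret)  # secret unit direction

def watermarked_latent():
    z = rng.normal(size=n)                        # start from standard Gaussian noise
    coeff = z @ secret                            # component along the secret direction
    k = np.round(coeff * gamma)                   # snap it near a multiple of 1/gamma
    new_coeff = k / gamma + rng.normal(scale=beta)
    return z + (new_coeff - coeff) * secret       # replace only that component

def detect(z):
    """Without the secret this statistic looks uniform; with it, it concentrates near 0."""
    proj = (z @ secret) * gamma
    return np.abs(proj - np.round(proj))

marked = watermarked_latent()
plain = rng.normal(size=n)
print("marked latent score:", detect(marked))    # close to 0
print("plain  latent score:", detect(plain))     # roughly uniform in [0, 0.5]
```

In latent-noise watermarking of this kind, such a latent seeds the diffusion sampler, and detection approximately inverts the sampling process to recover the latent before applying the keyed test.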

To defend against model inversion attacks and preserve the privacy of training data, we introduce a cryptographic embedding protection scheme that ensures fuzzy one-wayness under the L2 norm. It retains similarity-search functionality – as needed, for example, in face authentication systems – while preventing reconstruction of sensitive training data from model outputs.
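
The toy sketch below illustrates only the functionality target – thresholded L2 similarity over protected embeddings – not the paper's construction. A secret rotation plus coarse quantization approximately preserves distances, but by itself it would not achieve the fuzzy one-wayness guarantee studied in the paper; all dimensions, steps, and thresholds are hypothetical.

```python
import numpy as np

# Toy illustration of thresholded L2 similarity over protected embeddings.
# The secret rotation + quantization here is a placeholder, not the paper's scheme.
rng = np.random.default_rng(2)
d = 128

Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # secret key: random orthogonal matrix
step = 0.25                                   # quantization step (illustrative)

def protect(embedding):
    return np.round((Q @ embedding) / step) * step

def same_person(protected_a, protected_b, threshold=3.0):
    return np.linalg.norm(protected_a - protected_b) < threshold

enrolled = rng.normal(size=d)
probe_same = enrolled + 0.05 * rng.normal(size=d)   # noisy re-capture of the same face
probe_diff = rng.normal(size=d)                     # a different identity

print(same_person(protect(enrolled), protect(probe_same)))  # True: small L2 distance
print(same_person(protect(enrolled), protect(probe_diff)))  # False: large L2 distance
```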

Relevant Publications
Unforgeability in Stochastic Gradient Descent
Teodora Baluta, Ivica Nikolic, Racchit Jain, Divesh Aggarwal, Prateek Saxena
ACM Conference on Computer and Communications Security (CCS 2023). Copenhagen, DK, Nov 2023.
On Cryptographic Countermeasures Against Model Inversion Attacks
Louise Xu, Mallika Prabhakar, Prateek Saxena
In Review, 2025.
CLUE-Mark: Watermarking Diffusion Models using CLWE
Kareem Shehata, Aashish Kolluri, Prateek Saxena
arXiv, 2024.