Robustness Analysis of Maryland Watermark to Paraphrasing Attacks Under Advanced Techniques

Rajiv Kumar

Abstract

The proliferation of large language models (LLMs) and their potential for abuse make trustworthy techniques for recognizing and validating machine-generated material necessary. The Maryland Watermark is a notable technique that embeds identifiable signatures into LLM-generated text by subtly altering token selection probabilities, which makes it relevant for content attribution. This study investigates the robustness of the Maryland Watermark against paraphrasing-based evasion strategies. Using the Mistral-7B-Instruct-v0.2 model and prompts from the DAIGT dataset, 1,000 documents (500 watermarked) were generated and subjected to three types of attack: paragraph-based paraphrasing using a Seq2Seq model trained on kPar3, sentence-level paraphrasing using a T5-based ChatGPT Paraphraser, and word-level synonym substitution using a POS-aware WordNet approach. Evaluation metrics covered watermark detectability (z-score, TPR, FPR), semantic similarity, and text quality (perplexity). Paragraph-based paraphrasing yielded the lowest perplexity (19.53) while degrading semantic similarity the most, followed by sentence-based paraphrasing (perplexity 24.89). Recursive paraphrasing reduced watermark detection initially, but detection accuracy recovered in subsequent iterations. Under word replacement, the TPR was 95.78% for noun substitution and 39.76% for 25% token replacement, which the study takes as evidence that these attacks are ineffective. Overall, the Maryland Watermark remains robust against word-level modifications but is moderately vulnerable to advanced paraphrasing that alters semantic integrity.
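
As a rough illustration of the detectability metrics named in the abstract, the sketch below computes a green-list z-score and then TPR/FPR at a fixed threshold. It is a minimal sketch and not the study's code: the function names, the green-list fraction `gamma = 0.25`, and the threshold `4.0` are assumptions chosen for illustration, since the abstract does not state the values used.

```python
import math


def green_list_z_score(green_count: int, total_tokens: int, gamma: float = 0.25) -> float:
    """One-proportion z-test on the number of green-list tokens.

    gamma is the assumed fraction of the vocabulary on the green list
    (hypothetical value; the abstract does not report it).
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    expected = gamma * total_tokens
    std_dev = math.sqrt(total_tokens * gamma * (1.0 - gamma))
    return (green_count - expected) / std_dev


def tpr_fpr(z_scores, is_watermarked, threshold: float = 4.0):
    """TPR and FPR when a document is flagged if its z-score exceeds threshold.

    threshold is a hypothetical cut-off, not one taken from the study.
    """
    positives = sum(1 for y in is_watermarked if y)
    negatives = len(is_watermarked) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one watermarked and one clean document")
    true_pos = sum(1 for z, y in zip(z_scores, is_watermarked) if y and z > threshold)
    false_pos = sum(1 for z, y in zip(z_scores, is_watermarked) if not y and z > threshold)
    return true_pos / positives, false_pos / negatives
```

Under these assumptions, a paraphrasing attack succeeds to the extent that it pushes the z-scores of watermarked documents below the threshold, which shows up as a lower TPR at an unchanged FPR.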

Article Details

Section
Research Article
