New watermarking scheme resists paragraph-level paraphrasing

SAMark claims to keep text watermarks detectable even after extensive re‑phrasing of whole paragraphs.

The paper's authors introduce a self‑anchored framework that does not rely on sentence order and instead uses a step‑independent region in semantic space. A multi‑channel hyperbolic scorer amplifies the watermark signal, while a diversity‑aware filter reduces semantic redundancy that simple n‑gram checks miss.

In tests against standard paraphrase generators, SAMark recorded a true‑positive rate of up to 90.2% at a 1% false‑positive rate, more than 30 percentage points above the previous best method. The paper also reports that generation quality remains on par with unwatermarked text, suggesting the usual trade‑off between robustness and fluency has been mitigated.
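For readers unfamiliar with the metric, here is a minimal sketch of how a true‑positive rate at a fixed false‑positive rate is typically computed. It is not SAMark's evaluation code: the function name tpr_at_fpr and the synthetic Gaussian scores are assumptions made for illustration, and the only thing taken from the article is the 1% false‑positive setting.

```python
import random


def tpr_at_fpr(neg_scores, pos_scores, target_fpr=0.01):
    """Return (threshold, tpr) for a detector that flags scores > threshold.

    neg_scores: detector scores on unwatermarked text (hypothetical data).
    pos_scores: detector scores on watermarked, paraphrased text.
    Assumes 0 <= target_fpr < 1 and non-empty inputs.
    """
    neg_sorted = sorted(neg_scores, reverse=True)
    # How many unwatermarked samples we may wrongly flag.
    allowed = int(target_fpr * len(neg_sorted))
    # Lowest threshold that flags at most `allowed` unwatermarked samples.
    threshold = neg_sorted[allowed]
    tpr = sum(s > threshold for s in pos_scores) / len(pos_scores)
    return threshold, tpr


# Illustrative run on made-up scores; the distributions are an assumption.
rng = random.Random(0)
unwatermarked = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
watermarked = [rng.gauss(3.0, 1.0) for _ in range(10_000)]
threshold, tpr = tpr_at_fpr(unwatermarked, watermarked, target_fpr=0.01)
print(f"threshold={threshold:.3f}  TPR at 1% FPR={tpr:.3f}")
```

The threshold is fixed using unwatermarked text alone, so a detector cannot raise its reported hit rate by flagging more human‑written text. That is why results at a low false‑positive rate are a stricter comparison than raw accuracy.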

If the results hold up, developers of large language models could embed provenance tags without sacrificing output quality, a step up from earlier watermarking schemes that broke when faced with re‑ordering or re‑writing attacks. The approach may also pressure adversarial actors who rely on heavy paraphrasing to strip attribution.

Still, the work remains at the research stage and has been tested only on benchmark paraphrasers. Real‑world text often mixes edits, citations, and domain‑specific jargon, which could expose new weaknesses. The community will need independent verification before SAMark becomes a standard defensive tool.

The Revision

Written by an AI system from the public sources credited above.