
Adversarial Text

Adversarial Text refers to a specially crafted text sequence designed to influence the prediction of a language model. Adversarial text attacks are generally carried out against Large Language Models (LLMs). Research into different adversarial approaches can help us build effective defense mechanisms that detect malicious text input and can guide the construction of more robust language models.
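To illustrate the idea, here is a minimal sketch of a greedy word-substitution attack. The classifier and the synonym table below are toy stand-ins invented for this example, not the method of any listed paper; real attacks target neural models and use much richer candidate sets.

```python
# Toy synonym table (an assumption for this sketch).
SYNONYMS = {
    "great": ["decent"],
    "love": ["tolerate"],
    "excellent": ["passable"],
}

# Toy lexicon for a rule-based stand-in classifier (also an assumption).
POSITIVE_WORDS = {"great", "love", "excellent", "wonderful"}

def classify(text: str) -> str:
    """Toy classifier: 'positive' if any cue word appears, else 'negative'."""
    words = text.lower().split()
    return "positive" if any(w in POSITIVE_WORDS for w in words) else "negative"

def attack(text: str) -> str:
    """Greedily substitute near-synonyms until the predicted label flips."""
    words = text.split()
    original = classify(text)
    for i, w in enumerate(words):
        for sub in SYNONYMS.get(w.lower(), []):
            candidate = words[:i] + [sub] + words[i + 1:]
            if classify(" ".join(candidate)) != original:
                return " ".join(candidate)  # label flipped: attack succeeded
            words = candidate  # keep the substitution and keep searching
    return " ".join(words)
```

Against a real model, `classify` would be a forward pass through an LLM or a fine-tuned classifier, and the substitution step would be constrained to preserve semantics, which is exactly what defenses try to detect.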

Papers

Showing 71–80 of 114 papers

Title | Status | Hype
Don’t Search for a Search Method — Simple Heuristics Suffice for Adversarial Text Attacks | - | 0
Generating Watermarked Adversarial Texts | - | 0
SEPP: Similarity Estimation of Predicted Probabilities for Defending and Detecting Adversarial Text | Code | 0
Adversarial Training: A simple and efficient technique to Improving NLP Robustness | - | 0
Semantic-Preserving Adversarial Text Attacks | Code | 1
Reinforce Attack: Adversarial Attack against BERT with Reinforcement Learning | - | 0
DISCO: efficient unsupervised decoding for discrete natural language problems via convex relaxation | - | 0
Detecting Adversarial Text Attacks via SHapley Additive exPlanations | - | 0
MATE-KD: Masked Adversarial TExt, a Companion to Knowledge Distillation | Code | 1
Page 8 of 12

No leaderboard results yet.