Universal Adversarial Perturbations for Vision-Language Pre-trained Models
arXiv (2024)

Abstract
Vision-language pre-trained (VLP) models have been the foundation of numerous
vision-language tasks. Given their prevalence, it becomes imperative to
assess their adversarial robustness, especially when deploying them in
security-critical real-world applications. Traditionally, adversarial
perturbations generated for this assessment target specific VLP models,
datasets, and/or downstream tasks. This practice suffers from low
transferability and incurs additional computation costs when transitioning to
new scenarios.
In this work, we thoroughly investigate whether VLP models are commonly
sensitive to imperceptible perturbations of a specific pattern for the image
modality. To this end, we propose a novel black-box method to generate
Universal Adversarial Perturbations (UAPs), called the Effective and
Transferable Universal Adversarial Attack (ETU), aiming to mislead a
variety of existing VLP models in a range of downstream tasks. The ETU
comprehensively takes into account the characteristics of UAPs and the
intrinsic cross-modal interactions to generate effective UAPs. Under this
regime, the ETU encourages both global and local utilities of UAPs. This
benefits the overall utility while reducing interactions between UAP units,
improving the transferability. To further enhance the effectiveness and
transferability of UAPs, we also design a novel data augmentation method named
ScMix. ScMix consists of self-mix and cross-mix data transformations, which can
effectively increase the multi-modal data diversity while preserving the
semantics of the original data. Through comprehensive experiments on various
downstream tasks, VLP models, and datasets, we demonstrate that the proposed
method is able to achieve effective and transferable universal adversarial
attacks.
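The abstract describes ScMix only at a high level (self-mix and cross-mix transformations that diversify data while preserving semantics). The sketch below is a hypothetical NumPy illustration of that idea, not the paper's actual algorithm: the function name `scmix`, the patch sizes, and the blend weight `alpha` are all assumptions for illustration.

```python
import numpy as np

def scmix(image, other, rng, alpha=0.7):
    """Illustrative ScMix-style augmentation (assumed, not the paper's exact method).

    image, other: float arrays of shape (H, W, C) with values in [0, 1].
    """
    h, w, _ = image.shape
    ph, pw = h // 2, w // 2
    # Self-mix (assumed form): overwrite a random patch with another region
    # of the same image, so the image's own semantics are preserved.
    y, x = rng.integers(0, h - ph), rng.integers(0, w - pw)
    sy, sx = rng.integers(0, h - ph), rng.integers(0, w - pw)
    mixed = image.copy()
    mixed[y:y + ph, x:x + pw] = image[sy:sy + ph, sx:sx + pw]
    # Cross-mix (assumed form): convex blend with a second image, weighted
    # toward the original so its content stays dominant.
    mixed = alpha * mixed + (1.0 - alpha) * other
    return np.clip(mixed, 0.0, 1.0)
```

Augmentations of this kind are typically applied to the surrogate training batch so the optimized UAP does not overfit to one data distribution, which is consistent with the transferability goal stated above.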