Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models
arXiv (2024)
Abstract
Concept Bottleneck Models (CBMs) ground image classification on
human-understandable concepts to allow for interpretable model decisions.
Crucially, the CBM design inherently allows for human interventions, in which
expert users are given the ability to modify potentially misaligned concept
choices to influence the decision behavior of the model in an interpretable
fashion. However, existing approaches often require numerous human
interventions per image to achieve strong performance, posing practical
challenges in scenarios where obtaining human feedback is expensive. In this
paper, we find that this is noticeably driven by an independent treatment of
concepts during intervention, wherein a change of one concept does not
influence the use of other ones in the model's final decision. To address this
issue, we introduce a trainable concept intervention realignment module, which
leverages concept relations to realign concept assignments post-intervention.
Across standard, real-world benchmarks, we find that concept realignment can
significantly improve intervention efficacy, reducing the number
of interventions needed to reach a target classification performance or concept
prediction accuracy. In addition, it easily integrates into existing
concept-based architectures without requiring changes to the models themselves.
This reduced cost of human-model collaboration is crucial to enhancing the
feasibility of CBMs in resource-constrained environments.
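
To make the contrast between independent intervention and realignment concrete, the following is a minimal toy sketch, not the paper's implementation. Everything in it is an assumption made for illustration: the three concepts, the hand-written `relation` matrix (the paper describes a trainable module instead), the linear label predictor in `weights`, and the function names `intervene`, `realign`, and `classify`.

```python
from typing import Dict, List


def intervene(concepts: List[float], corrections: Dict[int, float]) -> List[float]:
    """Independent intervention: overwrite only the concepts the expert corrected."""
    out = list(concepts)
    for idx, value in corrections.items():
        out[idx] = value
    return out


def realign(
    concepts: List[float],
    corrections: Dict[int, float],
    relation: List[List[float]],
) -> List[float]:
    """Hypothetical realignment step: after the intervention, shift every
    non-intervened concept in proportion to how much each corrected concept
    changed, weighted by a fixed relation matrix. A learned module would
    replace this hand-written rule."""
    out = intervene(concepts, corrections)
    for j in range(len(out)):
        if j in corrections:
            continue  # never override the expert's own correction
        shift = sum(
            relation[i][j] * (corrections[i] - concepts[i]) for i in corrections
        )
        out[j] = min(1.0, max(0.0, out[j] + shift))  # keep a valid probability
    return out


def classify(concepts: List[float], weights: List[List[float]]) -> int:
    """Linear label predictor on top of the concept bottleneck (argmax of scores)."""
    scores = [sum(w * c for w, c in zip(row, concepts)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy setup (all values are illustrative assumptions).
# Concepts: 0 = "has wings", 1 = "has feathers", 2 = "has fur".
concepts = [0.2, 0.3, 0.8]          # misaligned concept predictions for a bird image
corrections = {0: 1.0}              # the expert fixes a single concept
relation = [
    [0.0, 0.8, -0.8],               # wings imply feathers and argue against fur
    [0.8, 0.0, -0.8],
    [-0.8, -0.8, 0.0],
]
weights = [
    [0.5, 1.0, -1.0],               # class 0: bird
    [0.0, -1.0, 1.0],               # class 1: mammal
]

independent = intervene(concepts, corrections)
realigned = realign(concepts, corrections, relation)
pred_independent = classify(independent, weights)
pred_realigned = classify(realigned, weights)
```

In this toy setup, correcting one concept on its own leaves the prediction at class 1 (mammal), because the untouched "has feathers" and "has fur" values still dominate the score. Propagating the same single correction through the relation matrix moves those two concepts as well, and the prediction becomes class 0 (bird). This mirrors the abstract's claim that fewer interventions are needed once concept relations are used, under the simplifying assumptions stated above.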