Information Leakage Detection through Approximate Bayes-optimal Prediction
CoRR (2024)
Abstract
In today's data-driven world, the proliferation of publicly available
information intensifies the challenge of information leakage (IL), raising
security concerns. IL involves unintentionally exposing secret (sensitive)
information to unauthorized parties via systems' observable information.
Conventional statistical approaches, which estimate mutual information (MI)
between observable and secret information for detecting IL, face challenges
such as the curse of dimensionality, convergence, computational complexity, and
MI misestimation. Furthermore, emerging supervised machine learning (ML)
methods, though effective, are limited to binary system-sensitive information
and lack a comprehensive theoretical framework. To address these limitations,
we establish a theoretical framework using statistical learning theory and
information theory to accurately quantify and detect IL. We demonstrate that MI
can be accurately estimated by approximating the log-loss and accuracy of the
Bayes predictor. As the Bayes predictor is typically unknown in practice, we
propose to approximate it with the help of automated machine learning (AutoML).
First, we compare our MI estimation approaches against current baselines on
synthetic data sets generated from the multivariate normal (MVN) distribution
with known MI. Second, we introduce a cut-off technique using one-sided
statistical tests to detect IL, employing the Holm-Bonferroni correction to
increase confidence in detection decisions. We evaluate IL detection
performance on real-world data sets, which highlights the effectiveness of
estimating the Bayes predictor's log-loss, and find that the proposed method
estimates MI effectively on synthetic data sets and thus detects ILs accurately.
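
The two ideas in the abstract can be illustrated with a minimal sketch. It relies on the standard identity I(X;Y) = H(Y) - H(Y|X) and on the fact that the expected log-loss of the Bayes predictor equals H(Y|X), so a trained classifier's log-loss can stand in for the conditional entropy. The function names (`marginal_entropy`, `mean_log_loss`, `mi_from_log_loss`, `holm_bonferroni`) are hypothetical and not taken from the paper; the classifier probabilities are assumed to come from some fitted model (for example, an AutoML pipeline), and the abstract does not specify how the p-values fed to the correction are obtained, so they are plain inputs here.

```python
import math
from collections import Counter


def marginal_entropy(labels):
    """Empirical Shannon entropy H(Y) of the secret labels, in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mean_log_loss(labels, probs, eps=1e-12):
    """Average negative log-probability the model assigns to the true label.

    probs[i][y] is the predicted probability of class y for sample i.
    Used here as a stand-in for H(Y|X) (an approximation, not the exact value).
    """
    total = sum(math.log(max(p[y], eps)) for y, p in zip(labels, probs))
    return -total / len(labels)


def mi_from_log_loss(labels, probs):
    """Plug-in estimate I(X;Y) ~ H(Y) - log-loss, clipped at zero."""
    return max(0.0, marginal_entropy(labels) - mean_log_loss(labels, probs))


def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure: one reject flag per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Toy usage with hand-written (assumed) predictions for a binary secret.
labels = [0, 1, 0, 1]
perfect = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
uninformative = [[0.5, 0.5]] * 4

mi_leaky = mi_from_log_loss(labels, perfect)        # close to H(Y) = ln 2
mi_safe = mi_from_log_loss(labels, uninformative)   # close to 0
decisions = holm_bonferroni([0.01, 0.04, 0.03], alpha=0.05)
```

In this toy case a predictor that recovers the secret perfectly yields an estimate near ln 2 nats (the full entropy of a balanced binary secret), while an uninformative predictor yields an estimate near zero. Clipping at zero is a design choice of this sketch: a poorly calibrated model can have a log-loss above H(Y), which would otherwise give a negative MI estimate.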