Hark: A Deep Learning System for Navigating Privacy Feedback at Scale

Hamza Harkous, Sai Teja Peddinti, Rishabh Khandelwal, Animesh Srivastava, Nina Taft

2022 IEEE Symposium on Security and Privacy (SP) (2022)

Abstract

Integrating user feedback is one of the pillars for building successful products. However, this feedback is generally collected in an unstructured free-text form, which is challenging to understand at scale. This is particularly demanding in the privacy domain due to the nuances associated with the concept and the limited existing solutions. In this work, we present Hark, a system for discovering and summarizing privacy-related feedback at scale. Hark automates the entire process of summarizing privacy feedback, starting from unstructured text and resulting in a hierarchy of high-level privacy themes and fine-grained issues within each theme, along with representative reviews for each issue. At the core of Hark is a set of new deep learning models trained on different tasks, such as privacy feedback classification, privacy issues generation, and high-level theme creation. We illustrate Hark's efficacy on a corpus of 626M Google Play reviews. Out of this corpus, our privacy feedback classifier extracts 6M privacy-related reviews (with an AUC-ROC of 0.92). With three annotation studies, we show that Hark's generated issues are of high accuracy and coverage and that the theme titles are of high quality. We illustrate Hark's capabilities by presenting high-level insights from 1.3M Android apps.

Keywords

privacy feedback, deep learning system
