Understanding Screen Relationships from Screenshots of Smartphone Applications

Shirin Feiz, Jason Wu, Xiaoyi Zhang, Amanda Swearngin, Titus Barik, Jeffrey Nichols

Intelligent User Interfaces (2022)

Abstract

All graphical user interfaces are composed of one or more screens that may be shown to the user depending on their interactions. Identifying the different screens of an app and understanding the types of changes that happen on those screens is a challenging task with applications in many areas, including automatic app crawling, playback of app automation macros, and large-scale app dataset analysis. For example, an automated app crawler needs to determine whether the screen it is currently viewing is the same as any screen it has previously encountered, so that it can focus its efforts on portions of the app it has not yet explored. Moreover, identifying the type of change on the screen, such as whether a dialog or keyboard has opened or closed, helps an automatic crawler handle such events while crawling. Understanding screen relationships is difficult because instances of the same screen may vary visually and structurally, for example due to different content in a database-backed application, scrolling, dialog boxes opening or closing, or content loading delays. At the same time, instances of different screens from the same app may share similarities in design, structure, and content. This paper uses a dataset of screenshots from more than 1K iPhone applications to train two ML models that understand similarity in different ways: (1) a screen similarity model that combines a UI object detector with a transformer model architecture to recognize instances of the same screen from a collection of screenshots from a single app, and (2) a screen transition model that uses a siamese network architecture to identify both similarity and three types of events that appear in an interaction trace: the keyboard appearing or disappearing, a dialog box appearing or disappearing, and scrolling. Our models achieve an F1 score of 0.83 on the screen similarity task, improving on comparable baselines, and an average F1 score of 0.71 across all events in the transition task.
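The crawler use case in the abstract can be illustrated with a minimal Python sketch. It assumes a pairwise predicate `same_screen(a, b)` that returns True when two screenshots show the same screen; in practice that role would be played by a trained similarity model, but here it is a hypothetical stand-in, and the function names `group_screens` and `is_new_screen` are also illustrative rather than taken from the paper.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # a screenshot, in whatever representation the caller uses


def group_screens(
    screenshots: Sequence[S],
    same_screen: Callable[[S, S], bool],  # hypothetical stand-in for a similarity model
) -> List[List[S]]:
    """Greedily place each screenshot in the first group whose representative matches."""
    groups: List[List[S]] = []
    for shot in screenshots:
        for group in groups:
            # Compare against the first member of the group only (its representative).
            if same_screen(group[0], shot):
                group.append(shot)
                break
        else:
            # No existing group matched, so this screenshot starts a new screen group.
            groups.append([shot])
    return groups


def is_new_screen(
    shot: S,
    groups: List[List[S]],
    same_screen: Callable[[S, S], bool],
) -> bool:
    """Return True if the screenshot matches none of the screens seen so far."""
    return not any(same_screen(group[0], shot) for group in groups)


def toy_same_screen(a: str, b: str) -> bool:
    """Toy predicate for illustration: labels like 'home#1' match on the part before '#'."""
    return a.split("#")[0] == b.split("#")[0]


seen = group_screens(["home#1", "settings#1", "home#2"], toy_same_screen)
```

A crawler built this way could call `is_new_screen` on each fresh screenshot and spend its exploration budget only on screens that return True. Comparing against a single representative per group is a simplification; comparing against every member, or against an averaged embedding, are alternative design choices.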

Keywords

user interface similarity, UI modeling, UI semantics
