Towards Designing a Context-Aware Multimodal Voice Assistant for Pronoun Disambiguation: A Demonstration of GazePointAR

Jaewook Lee, Jun Wang, Elizabeth Brown, Liam Chu, Sebastian S. Rodriguez, Jon E. Froehlich

Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software & Technology (UIST 2023 Adjunct), 2023

Abstract
Voice assistants (VAs) like Siri and Alexa have transformed how humans interact with technology; however, their inability to consider a user's spatiotemporal context, such as surrounding objects, dramatically limits natural dialogue. In this demo paper, we introduce GazePointAR, a wearable augmented reality (AR) system that resolves ambiguity in speech queries using eye gaze, pointing gesture, and conversation history. With GazePointAR, a user can ask "what's over there?" or "how do I solve this math problem?" simply by looking and/or pointing. We describe GazePointAR's design and highlight supported use cases.
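The abstract does not detail the resolution pipeline, but as a rough illustration, the following is a minimal Python sketch of one way such pronoun disambiguation could work: swapping an ambiguous pronoun in the transcribed query for a referent inferred from pointing, gaze, or conversation history before the query is sent to a language model. Every name here (disambiguate, gaze_target, pointing_target, the pronoun list, and the preference order) is a hypothetical assumption for illustration, not GazePointAR's actual implementation.

```python
import re

# Pronouns and deictic words treated as ambiguous (an illustrative list).
PRONOUNS = {"this", "that", "it", "there", "these", "those"}

def disambiguate(query: str,
                 gaze_target: str | None,
                 pointing_target: str | None,
                 history: list[str]) -> str:
    """Replace ambiguous pronouns in `query` with a concrete referent.

    Preference order (an assumption, not from the paper): pointing
    gesture first, then eye gaze, then the most recent referent
    mentioned in the conversation history.
    """
    referent = pointing_target or gaze_target or (history[-1] if history else None)
    if referent is None:
        return query  # nothing to ground the pronoun with; pass through unchanged

    # Substitute each standalone pronoun with the inferred referent.
    pattern = r"\b(" + "|".join(sorted(PRONOUNS)) + r")\b"
    return re.sub(pattern, referent, query, flags=re.IGNORECASE)

# Example: with the user's gaze resting on a plant, the vague query
# becomes something an ordinary voice assistant or LLM can answer.
print(disambiguate("what is this?",
                   gaze_target="the potted monstera plant",
                   pointing_target=None,
                   history=[]))
# -> "what is the potted monstera plant?"
```

In a real system, the rewritten query (rather than naive string substitution) would likely be produced by an LLM given the gaze and gesture context, but the sketch captures the core idea of grounding deictic language in the user's visual context.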
Keywords
augmented reality, multimodal input, voice assistants, gaze tracking, pointing gesture recognition, LLM