Towards Designing a Context-Aware Multimodal Voice Assistant for Pronoun Disambiguation: A Demonstration of GazePointAR

Jaewook Lee, Jun Wang, Elizabeth Brown, Liam Chu, Sebastian S. Rodriguez, Jon E. Froehlich

Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software & Technology (UIST 2023 Adjunct), 2023

Abstract
Voice assistants (VAs) like Siri and Alexa have transformed how humans interact with technology; however, their inability to consider a user's spatiotemporal context, such as surrounding objects, dramatically limits natural dialogue. In this demo paper, we introduce GazePointAR, a wearable augmented reality (AR) system that resolves ambiguity in speech queries using eye gaze, pointing gesture, and conversation history. With GazePointAR, a user can ask "what's over there?" or "how do I solve this math problem?" simply by looking and/or pointing. We describe GazePointAR's design and highlight supported use cases.
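The abstract does not detail the resolution pipeline, but as a rough illustration, the following is a minimal Python sketch of one way such pronoun disambiguation could work: swapping an ambiguous pronoun in the transcribed query for a referent inferred from pointing, gaze, or conversation history before the query is sent to a language model. Every name here (disambiguate, gaze_target, pointing_target, the pronoun list, and the preference order) is a hypothetical assumption for illustration, not GazePointAR's actual implementation.

```python
import re

# Pronouns and deictic words treated as ambiguous (an illustrative list).
PRONOUNS = {"this", "that", "it", "there", "these", "those"}

def disambiguate(query: str,
                 gaze_target: str | None,
                 pointing_target: str | None,
                 history: list[str]) -> str:
    """Replace ambiguous pronouns in `query` with a concrete referent.

    Preference order (an assumption, not from the paper): pointing
    gesture first, then eye gaze, then the most recent referent
    mentioned in the conversation history.
    """
    referent = pointing_target or gaze_target or (history[-1] if history else None)
    if referent is None:
        return query  # nothing to ground the pronoun with; pass through unchanged

    # Substitute each standalone pronoun with the inferred referent.
    pattern = r"\b(" + "|".join(sorted(PRONOUNS)) + r")\b"
    return re.sub(pattern, referent, query, flags=re.IGNORECASE)

# Example: with the user's gaze resting on a plant, the vague query
# becomes something an ordinary voice assistant or LLM can answer.
print(disambiguate("what is this?",
                   gaze_target="the potted monstera plant",
                   pointing_target=None,
                   history=[]))
# -> "what is the potted monstera plant?"
```

In a real system, the rewritten query (rather than naive string substitution) would likely be produced by an LLM given the gaze and gesture context, but the sketch captures the core idea of grounding deictic language in the user's visual context.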
Keywords
augmented reality, multimodal input, voice assistants, gaze tracking, pointing gesture recognition, LLM