Crowd++: unsupervised speaker count with smartphones

Chenren Xu, Sugang Li, Gang Liu, Yanyong Zhang, Emiliano Miluzzo, Yih-Farn Chen, Jun Li, Bernhard Firner

UbiComp (2013)

Abstract

Smartphones are excellent mobile sensing platforms, with the microphone in particular being exercised in several audio inference applications. We take smartphone audio inference a step further and demonstrate for the first time that it's possible to accurately estimate the number of people talking in a certain place -- with an average error distance of 1.5 speakers -- through unsupervised machine learning analysis on audio segments captured by the smartphones. Inference occurs transparently to the user and no human intervention is needed to derive the classification model. Our results are based on the design, implementation, and evaluation of a system called Crowd++, involving 120 participants in 10 very different environments. We show that no dedicated external hardware or cumbersome supervised learning approaches are needed but only off-the-shelf smartphones used in a transparent manner. We believe our findings have profound implications in many research fields, including social sensing and personal wellbeing assessment.
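
The abstract reports accuracy as an average error distance of 1.5 speakers but does not define the metric. A minimal sketch, assuming it means the mean absolute difference between the estimated and the actual speaker count across test samples; the function name and the sample counts below are hypothetical and are not taken from the paper:

```python
def average_error_distance(estimated, actual):
    """Mean absolute difference between estimated and true speaker counts.

    Assumption: this is one plausible reading of "average error distance";
    the abstract itself does not give the formula.
    """
    if len(estimated) != len(actual):
        raise ValueError("estimated and actual must have the same length")
    if not estimated:
        raise ValueError("at least one sample is required")
    total = sum(abs(e - a) for e, a in zip(estimated, actual))
    return total / len(estimated)


# Hypothetical counts for four recordings (illustrative only).
estimated_counts = [3, 5, 2, 6]
actual_counts = [4, 5, 4, 3]
print(average_error_distance(estimated_counts, actual_counts))  # 1.5
```

Under this reading, a value of 1.5 means the estimate is off by one or two speakers on average, in either direction.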

Keywords

unsupervised speaker count, average error distance, audio inference application, classification model, certain place, smartphone audio inference, cumbersome supervised learning approach, different environment, off-the-shelf smartphones, audio segment, dedicated external hardware
