Clustering distributed data streams in peer-to-peer environments

Information Sciences (2006)

Abstract
This paper describes a technique for clustering homogeneously distributed data in a peer-to-peer environment such as a sensor network. The proposed technique is based on the principles of the K-Means algorithm. It works in a localized, asynchronous manner by communicating with the neighboring nodes. The paper offers an extensive theoretical analysis of the algorithm that bounds the error of the distributed clustering process relative to the centralized approach, which requires downloading all the observed data to a single site. Experimental results show that the communication cost of the proposed approach is significantly smaller than that of transmitting all the data to a central location and applying a conventional clustering algorithm; communication cost is an important consideration in sensor networks, which are typically equipped with limited battery power. At the same time, the accuracy of the obtained centroids is high and the number of incorrectly labeled samples is small.
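The localized, neighbor-based idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simplified, synchronous variant written under several assumptions (2-D points, every node starting from the same initial centroids so that cluster indices match across nodes, and a hypothetical function name `p2p_kmeans_round`). In each round a node summarizes its local data as per-cluster sums and counts, exchanges only those summaries with its immediate neighbors, and recomputes its centroids from the combined summaries.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Stats = List[List[float]]  # per cluster: [sum_x, sum_y, count]


def nearest(p: Point, centroids: List[Point]) -> int:
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: (p[0] - centroids[j][0]) ** 2 + (p[1] - centroids[j][1]) ** 2,
    )


def local_stats(points: List[Point], centroids: List[Point]) -> Stats:
    """Per-cluster coordinate sums and counts for one node's local data."""
    stats: Stats = [[0.0, 0.0, 0] for _ in centroids]
    for p in points:
        j = nearest(p, centroids)
        stats[j][0] += p[0]
        stats[j][1] += p[1]
        stats[j][2] += 1
    return stats


def p2p_kmeans_round(
    data: Dict[int, List[Point]],
    centroids: Dict[int, List[Point]],
    neighbors: Dict[int, List[int]],
) -> Dict[int, List[Point]]:
    """One synchronous round (an assumption; the paper's method is asynchronous).

    Each node combines its own per-cluster summaries with those of its
    immediate neighbors and takes the count-weighted mean as its new centroid.
    Raw data points never leave the node that observed them.
    """
    stats = {n: local_stats(data[n], centroids[n]) for n in data}
    updated: Dict[int, List[Point]] = {}
    for n in data:
        group = [n] + neighbors[n]
        new_centroids: List[Point] = []
        for j, old in enumerate(centroids[n]):
            sx = sum(stats[m][j][0] for m in group)
            sy = sum(stats[m][j][1] for m in group)
            count = sum(stats[m][j][2] for m in group)
            # Keep the old centroid if no point in the neighborhood chose it.
            new_centroids.append((sx / count, sy / count) if count else old)
        updated[n] = new_centroids
    return updated


if __name__ == "__main__":
    # Three nodes on a line (0 - 1 - 2); the data values are made up.
    data = {
        0: [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)],
        1: [(0.0, 1.0), (9.0, 10.0), (10.0, 9.0)],
        2: [(1.0, 1.0), (10.0, 11.0), (11.0, 10.0)],
    }
    neighbors = {0: [1], 1: [0, 2], 2: [1]}
    centroids = {n: [(0.0, 0.0), (10.0, 10.0)] for n in data}
    for _ in range(5):
        centroids = p2p_kmeans_round(data, centroids, neighbors)
    print(centroids)
```

In this sketch a node transmits only k small summaries per round rather than its raw observations, which is the kind of communication saving the abstract refers to. Because each node sees only its neighborhood, the resulting centroids may differ slightly between nodes and from the centralized result.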
Keywords
clustering process, observed data, peer-to-peer environment, proposed technique, central location, centralized approach, k-means algorithm, sensor network, conventional clustering algorithm, clustering homogeneously, data stream, data streams, cluster analysis, data mining, k means algorithm