Scalable Data Gathering for Real-Time Monitoring Systems on Distributed Computing

Lyon (2008)

Abstract
Real-time monitoring is becoming increasingly important in many aspects of large-scale, multi-site distributed/parallel computing, e.g., understanding the behavior of systems, scheduling resources, and debugging applications. Dedicated networks for inter-site communication are rarely available for monitoring purposes. For real-time monitoring systems, reducing communication cost is therefore important in order to handle a large number of nodes with limited network resources. We implemented a real-time Grid monitoring system called VGXP, with techniques for low-cost data gathering. It tries to send only diffs against recent data, and adapts to the requested data freshness and tolerable errors to minimize the required communication. We evaluate the monitoring overhead of the proposed method on a distributed environment consisting of 8 sites with 500 nodes. In a realistic setting where the sampling interval is set to 0.5 seconds and the tolerable error to 2%, the CPU usage of the server gathering data from all nodes was 0.2% and the transfer rate was less than 5 kbps. The transfer rate did not exceed 50 kbps even when gathering detailed per-process statistics.
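The idea of sending only diffs, subject to a requested freshness and a tolerable error, can be illustrated with a minimal sketch. This is not VGXP's actual implementation, which the abstract does not describe in detail. The class name `DeltaReporter`, its fields, and the reading of the tolerable error as an absolute threshold in the metric's own units (for example, 2 percentage points of CPU usage) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DeltaReporter:
    """Hypothetical node-side filter (illustrative only, not VGXP code)."""
    tolerance: float      # assumed: tolerable absolute error, in metric units
    min_interval: float   # assumed: requested freshness, in seconds
    last_sent: Dict[str, float] = field(default_factory=dict)
    last_time: Optional[float] = None

    def update(self, now: float, sample: Dict[str, float]) -> Dict[str, float]:
        """Return only the metrics that need to be transmitted for this sample."""
        # Freshness: skip samples that arrive sooner than the requested interval.
        if self.last_time is not None and now - self.last_time < self.min_interval:
            return {}
        self.last_time = now
        # Diff: keep a metric only if it is new or has drifted beyond the
        # tolerance relative to the value the receiver already holds.
        diff = {
            name: value
            for name, value in sample.items()
            if name not in self.last_sent
            or abs(value - self.last_sent[name]) > self.tolerance
        }
        self.last_sent.update(diff)
        return diff


reporter = DeltaReporter(tolerance=2.0, min_interval=0.5)
first = reporter.update(0.0, {"cpu": 10.0, "mem": 40.0})   # everything is new
second = reporter.update(0.5, {"cpu": 11.0, "mem": 45.0})  # only mem drifted enough
third = reporter.update(0.6, {"cpu": 50.0, "mem": 45.0})   # too soon, nothing sent
fourth = reporter.update(1.0, {"cpu": 12.5, "mem": 45.0})  # cpu now exceeds tolerance
```

Comparing each sample against the last transmitted value, rather than the previous sample, keeps the receiver's view within the tolerance even when a metric drifts slowly over many samples.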
Keywords
low cost data gathering,real-time monitoring,monitoring purpose,real-time monitoring systems,communication cost,real-time grid monitoring system,recent data,tolerable error,real-time monitoring system,scalable data gathering,requested data freshness,transfer rate,distributed computing,grid computing,visualization,real time,distributed environment,real time systems,layout,parallel processing,sampling methods,data gathering,parallel computer,debugging