WOWMON

Procedia Computer Science(2016)

引用 5|浏览54
暂无评分
摘要
Performance debugging using program profiling and tracing for scientific workflows can be extremely difficult for two reasons. 1) Existing performance tools lack the ability to automatically produce global performance data based on local information from coupled scientific applications of workflows, particularly at runtime. 2) Profiling/tracing with static instrumentation may incur high overhead and significantly slow down science-critical tasks. To gain more insights on workflows we introduce a lightweight workflow monitoring infrastructure, WOW-MON (WOrkfloW MONitor), which enables user's access not only to cross-application performance data such as end-to-end latency and execution time of individual workflow components at runtime, but also to customized performance events. To reduce profiling overhead, WOW-MON uses adaptive selection of performance metrics based on machine learning algorithms to guide profilers collecting only metrics that have most impact on performance of workflows. Through the study of real scientific workflows (e.g., LAMMPS) with the help of WOWMON, we found that the performance of the workflows can be significantly affected by both software and hardware factors, such as the policy of process mapping and in-situ buffer size. Moreover, we experimentally show that WOWMON can reduce data movement for profiling by up to 54% without missing the key metrics for performance debugging.
更多
查看译文
关键词
TAU, Scientific workflows, Performance monitoring
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要