McUDI: Model-Centric Unsupervised Degradation Indicator for Failure Prediction AIOps Solutions
CoRR (2024)
Abstract
Due to the continuous change in operational data, AIOps solutions suffer from
performance degradation over time. Although periodic retraining is the
state-of-the-art technique to preserve the failure prediction AIOps models'
performance over time, this technique requires a considerable amount of labeled
data to retrain. In AIOps, obtaining labeled data is expensive because it requires
domain experts to annotate it intensively. In this paper,
we present McUDI, a model-centric unsupervised degradation indicator that is
capable of detecting the exact moment the AIOps model requires retraining as a
result of changes in data. We further show how employing McUDI in the
maintenance pipeline of AIOps solutions can reduce the number of samples that
require annotation by 30k for job failure prediction and 260k for disk
failure prediction while achieving performance similar to periodic
retraining.