Disk Failure Prediction in Heterogeneous Environments Using Neural Networks
Abstract
Various studies have attempted to predict individual disk failures based on the values of the SMART (Self-Monitoring, Analysis, and Reporting Technology) attributes of each disk. The main problem is that an effective predictive model must simultaneously provide a high failure-detection rate (FDR) and a low false-alarm rate (FAR), because false alarms cause unnecessary data transfers and immobilize additional storage resources. We propose a predictive framework consisting of a baseline generic predictor for heterogeneous disk populations, comprising disks from various manufacturers, and a core predictor that predicts failures for disks of a specific manufacturer and model. Both predictors base their decisions on the values of six of the most significant SMART attributes of each disk and on eighteen newly defined features that capture how these six attributes vary over time. We used the 2015 BackBlaze data set to train and validate various implementations of our predictors and the 2016 BackBlaze data set to evaluate these implementations. Our results indicate that predictors that apply only to disks of a specific disk model perform overwhelmingly better than predictors that apply to heterogeneous disk populations. In addition, we observed that Decision Trees appeared to be the best approach for building predictors for heterogeneous disk populations, and that taking past values of SMART attributes into account improved the failure-detection rate and decreased the false-alarm rate of predictors for homogeneous disk populations.
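As an illustration of the two metrics named in the abstract, the following minimal Python sketch computes FDR and FAR from per-disk labels. It assumes the definitions commonly used in this literature, which the abstract itself does not spell out: FDR is the fraction of disks that actually failed and were flagged, and FAR is the fraction of healthy disks that were flagged. The function name `detection_rates` and the sample labels are hypothetical and are not taken from the paper.

```python
def detection_rates(actual, predicted):
    """Return (FDR, FAR) for two parallel sequences of booleans.

    actual[i]    -- True if disk i really failed
    predicted[i] -- True if the predictor flagged disk i as failing

    Assumed definitions (not stated in the abstract):
      FDR = flagged failed disks / all failed disks
      FAR = flagged healthy disks / all healthy disks
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    true_neg = sum(1 for a, p in pairs if not a and not p)

    failed = true_pos + false_neg
    healthy = false_pos + true_neg

    # Guard against empty classes so the function never divides by zero.
    fdr = true_pos / failed if failed else 0.0
    far = false_pos / healthy if healthy else 0.0
    return fdr, far


# Made-up labels for six disks: two failed, four stayed healthy.
actual = [True, True, False, False, False, False]
predicted = [True, False, True, False, False, False]
fdr, far = detection_rates(actual, predicted)
print(f"FDR = {fdr:.2f}, FAR = {far:.2f}")
```

Under these definitions, a predictor that flags every disk reaches an FDR of 1.0 but also a FAR of 1.0, which is why the two rates have to be reported together.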