Rainfall Prediction Based on Environmental Data

This study explores the feasibility of predicting rare rainfall events in the tropical East Pacific from environmental variables, namely vertical profiles of humidity, temperature, and wind. Predictions were made for three rain types (deep convective, stratiform, and shallow convective) identified by the GPM satellite radar, with environmental variables taken from MERRA-2 reanalysis data. The predictors for a target pixel were the environmental features at 28 pressure levels at that pixel, together with the features at the neighboring pixels. Prior to fitting the model, we applied feature rarity recoding. Rather than predicting rainfall intensity, our objective was to predict the occurrence of rare rainfall events, which we defined using three thresholds: the 99.5%, 99%, and 95% quantiles of rain levels. Because rainfall patterns are not uniform across the large East Pacific area, we divided it into 15 distinct geographic zones. Each prediction task therefore addressed a specific combination of rain type, rare-event threshold, and geographic zone. For each prediction task, multiple independent random forest classifiers were used to construct an optimal classifier, which was evaluated on the validation set. The results suggest that our methodology achieved an accuracy of over 90% in predicting rare rainfall events. Validation-set accuracies for rare events defined by the 99.5% and 99% quantiles are higher than those for events defined by the 95% quantile. Validation-set accuracies for deep convective and stratiform rain are comparable, and both are higher than those for shallow convective rain. To assess the impact of environmental features on rare rainfall prediction, we evaluated feature importance with the permutation importance method. This analysis was applied to the optimal classifier of each prediction task, systematically on features regrouped across 5 successive pressure levels. Our findings indicate that humidity and temperature play a critical role in predicting rare rainfall events and are more important than wind features. Specifically, mid-altitude humidity features are important for deep convective and stratiform rain, while low-altitude humidity features are important for shallow convective rain. Low-altitude temperature features are crucial for deep convective and stratiform rain, while both low- and high-altitude temperature features are relevant for shallow convective rain. Low-altitude wind features are important across all three rain types. Finally, we applied the K-means clustering algorithm to regroup the 15 geographic zones into 4 distinct clusters, which allowed us to explore similarities and patterns across the zones.
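The quantile-based event definition and the grouped permutation importance described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy threshold rule standing in for a trained random forest, the synthetic data, and the grouping of columns into "humidity_low" and "wind_low" are all assumptions made for illustration. The idea is that all columns of a feature group are shuffled together across samples, and the resulting drop in accuracy is taken as the importance of that group.

```python
import random

def quantile_threshold(values, q):
    """Empirical q-quantile, linearly interpolated between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def label_rare_events(rain_levels, q):
    """Label a sample 1 if its rain level exceeds the q-quantile, else 0."""
    thr = quantile_threshold(rain_levels, q)
    return [1 if r > thr else 0 for r in rain_levels]

def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row) == t for row, t in zip(X, y)) / len(y)

def grouped_permutation_importance(predict, X, y, groups, n_repeats=10, seed=0):
    """Mean accuracy drop when all columns of a group are shuffled together.

    `groups` maps a (hypothetical) group name to a list of column indices,
    for example the columns of one variable over several pressure levels.
    """
    rng = random.Random(seed)
    base = accuracy(predict, X, y)
    importance = {}
    for name, cols in groups.items():
        drops = []
        for _ in range(n_repeats):
            perm = list(range(len(X)))
            rng.shuffle(perm)
            X_perm = [list(row) for row in X]
            # Row i receives the group's columns from row perm[i]; using one
            # permutation for the whole group keeps its internal structure.
            for i, j in enumerate(perm):
                for c in cols:
                    X_perm[i][c] = X[j][c]
            drops.append(base - accuracy(predict, X_perm, y))
        importance[name] = sum(drops) / n_repeats
    return importance

# Synthetic demo (assumed data): columns 0-1 play the role of low-level
# humidity, columns 2-3 of low-level wind. Only column 0 drives the label.
_rng = random.Random(1)
X_demo = [[_rng.random() for _ in range(4)] for _ in range(200)]
y_demo = [1 if row[0] > 0.9 else 0 for row in X_demo]
demo_groups = {"humidity_low": [0, 1], "wind_low": [2, 3]}

def toy_predict(row):
    """Stand-in for a fitted classifier; any row -> label callable works."""
    return 1 if row[0] > 0.9 else 0

if __name__ == "__main__":
    print(grouped_permutation_importance(toy_predict, X_demo, y_demo, demo_groups))
```

In this toy setup the "wind_low" group has zero importance because the stand-in classifier never reads those columns, whereas shuffling "humidity_low" lowers accuracy. With a real model, `predict` would wrap the fitted classifier's prediction call for a single row.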

Rainfall, Rare events, Prediction, Random forest classifier, Permutation importance