Building Static Geological Models in the Presence of Missing Values and Sparse Data





Earth modeling is a critical tool that geoscientists and engineers use to assess risk and make important decisions in drilling, completing, and producing wells. It takes available data, such as well logs and seismic data, as input to generate the present-day distribution of rock and fluid properties of a petroleum system and its elements. Model accuracy is highly dependent on the quality and quantity of available data. Two fundamental issues confronted in earth modeling are missing values and sparse data. Missing values are gaps in well logs, that is, intervals where data went unrecorded for various reasons. The sparse data problem arises when too few data are available to build a reliable model. Both problems are common in unconventional reservoirs, where well logs are often not run or poor hole conditions prevent continuous data collection, but they can also occur in conventional reservoirs. The lack of data and the occurrence of missing values make it challenging to create reliable earth models and to understand their uncertainty.

The objective of this study is to address the missing values issue by using statistical imputation, machine learning, and geostatistical techniques to predict proxy values where data values are missing. The sparse data problem may also be obviated by integrating a 3-D finite-volume basin model, containing estimates of key variables, collocated with reservoir petrophysical well data. A finite volume of petrophysical, geochemical, and geomechanical properties derived from the finite-volume basin model provides “soft” data where “hard” well data do not exist. Often, an earth model is created from sparse data by using a secondary variable such as seismic data, which are not always available and can be expensive to acquire. A finite-volume basin model can be used effectively as a co-variable in place of, or in addition to, seismic data; it is inexpensive compared to seismic data and requires a minimal set of input data that is commonly accessible from public domain sources. The proposed methods, namely missing value imputation or prediction and collocation of a finite-volume basin model with local petrophysical well data, can provide qualitative and quantitative information essential for generating a more reliable earth model.
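As a rough illustration of the missing value idea only, the sketch below fills a gap in one well log with a least-squares prediction from a second, complete log. It is not the method used in this thesis: the function name `impute_log_by_regression`, the synthetic gamma-ray and density curves, and the choice of a simple linear model are all assumptions made for the example.

```python
import numpy as np

def impute_log_by_regression(predictors, target):
    """Fill NaN entries of `target` with a linear least-squares
    prediction from `predictors` (hypothetical helper, illustration only)."""
    predictors = np.asarray(predictors, dtype=float)
    target = np.asarray(target, dtype=float)
    missing = np.isnan(target)
    # Design matrix: intercept column plus the predictor log(s).
    X = np.column_stack([np.ones(len(target)), predictors])
    # Fit only on depths where the target log was actually recorded.
    coef, *_ = np.linalg.lstsq(X[~missing], target[~missing], rcond=None)
    filled = target.copy()
    # Recorded values are left untouched; only the gaps are predicted.
    filled[missing] = X[missing] @ coef
    return filled, missing

# Synthetic (made-up) logs: a complete gamma-ray curve and a density
# curve that depends on it, with a gap simulating poor hole conditions.
rng = np.random.default_rng(0)
depth = np.linspace(2000.0, 2100.0, 201)
gamma_ray = 60.0 + 30.0 * np.sin(depth / 8.0) + rng.normal(0.0, 2.0, depth.size)
density_true = 2.70 - 0.004 * (gamma_ray - 60.0) + rng.normal(0.0, 0.01, depth.size)
density = density_true.copy()
density[80:120] = np.nan  # simulated unrecorded interval

density_filled, was_missing = impute_log_by_regression(gamma_ray[:, None], density)

# Because the data are synthetic, the imputed values can be compared
# against the known "true" curve inside the gap.
rmse = np.sqrt(np.mean((density_filled[was_missing] - density_true[was_missing]) ** 2))
print(f"imputed {was_missing.sum()} samples, gap RMSE = {rmse:.3f} g/cm3")
```

In the same spirit, a property sampled from a basin model at the well location could be passed as an additional predictor column, which is one simple way to picture how collocated “soft” data might support prediction where “hard” data are absent.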

Missing values are typically handled by petrophysicists and geostatisticians as a pre-modeling step, before simulation during earth modeling. The time allocated to data preparation far exceeds the time spent on earth modeling itself. Big data exacerbates this dilemma, along with the problems associated with missing values and sparse data. The techniques suggested in this thesis are scalable to high-performance computing, machine learning, and automated environments, providing an opportunity to reduce the time spent on data quality control and to increase earth model reliability.



Data science, Earth modeling