Scientific Machine Learning for Bayesian Inverse Problems Governed by the FitzHugh--Nagumo Model


This dissertation discusses research on scientific machine learning for Bayesian inverse problems governed by non-linear ordinary differential equations. More precisely, the forward operator of the considered problem corresponds to the FitzHugh--Nagumo model, a simple ordinary differential equation that approximately captures the dynamics governing action potential propagation in a single neuron. The parameter space associated with this model is low-dimensional, and the evaluation of the forward operator is relatively fast, which allows us to effectively explore various methods for Bayesian inference. Despite its simplicity, the model poses significant computational and mathematical challenges for Bayesian inference. Strong non-linearities in the forward and inverse problems result in a complicated, highly ill-conditioned optimization landscape with sharp gradients and narrow optimality zones, making the problem intractable for traditional variational techniques used to compute point estimates and posing significant challenges for sampling-based approaches to Bayesian inference. The contributions of this dissertation are the study, design, and analysis of artificial neural networks for Bayesian inference in challenging problems governed by non-linear ordinary differential equations. We take the naive approach of directly learning a surrogate for the pseudo-inverse of the parameter-to-observation map. We develop a framework that allows us to simultaneously infer the model parameters, the noise parameters, and the covariance matrix of the posterior distribution of the model parameters conditioned on noisy observational data. We explore the performance of different neural network architectures, namely dense neural networks and convolutional neural networks.
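For concreteness, the FitzHugh--Nagumo forward model can be sketched as a two-variable ODE integrated in time; the sketch below uses a simple forward-Euler scheme and illustrative parameter values (a, b, tau, I), not the configurations studied in the dissertation:

```python
import numpy as np

def simulate_fhn(a, b, tau, I, v0=0.0, w0=0.0, T=100.0, dt=0.01):
    """Forward-Euler integration of the FitzHugh-Nagumo model:
        dv/dt = v - v^3/3 - w + I   (membrane potential)
        dw/dt = (v + a - b*w)/tau   (recovery variable)
    Returns the trajectories of v and w on a uniform time grid."""
    n = int(T / dt)
    v = np.empty(n + 1)
    w = np.empty(n + 1)
    v[0], w[0] = v0, w0
    for k in range(n):
        dv = v[k] - v[k] ** 3 / 3.0 - w[k] + I
        dw = (v[k] + a - b * w[k]) / tau
        v[k + 1] = v[k] + dt * dv
        w[k + 1] = w[k] + dt * dw
    return v, w

# Illustrative parameter values in a spiking regime; noisy observations of
# v(t) would serve as the data for the inverse problem.
v, w = simulate_fhn(a=0.7, b=0.8, tau=12.5, I=0.5)
```

In the inverse problem described above, the time series v(t) (possibly perturbed by extrinsic noise) plays the role of the observational data, and (a, b, tau, I) are among the hidden parameters to be inferred.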
We study different problem settings, including (i) extrinsic and intrinsic noise perturbations as well as their combination, (ii) the use of different features (time series data, spectral coefficients, and their combination), and (iii) reconstruction from full and partial observations. We explore different flavors of Markov chain Monte Carlo sampling to generate training data for the covariance of the posterior distribution, and we also consider a Laplace approximation of the posterior to generate training data more efficiently. Our results are promising: we show that our methodology can infer hidden parameters from noise-perturbed data effectively and accurately. However, significant mathematical and computational challenges remain before these algorithms become tractable for large-scale inverse problems. Moreover, while we tie our approach to a Bayesian framework, it is heuristic and ad hoc; we provide no mathematical guarantees, nor do we explicitly address the ill-posedness of the inverse problem.
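The Laplace approximation mentioned above replaces the posterior near its mode with a Gaussian whose covariance is the inverse Hessian of the negative log posterior at the MAP point. A minimal finite-difference sketch of that idea follows; the toy Gaussian target used for the sanity check is purely illustrative and not one of the dissertation's test problems:

```python
import numpy as np

def laplace_covariance(neg_log_post, theta_map, eps=1e-5):
    """Laplace approximation of the posterior covariance: invert the Hessian
    of the negative log posterior at the MAP point, with the Hessian
    estimated by central finite differences."""
    d = len(theta_map)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d); ei[i] = eps
            ej = np.zeros(d); ej[j] = eps
            H[i, j] = (neg_log_post(theta_map + ei + ej)
                       - neg_log_post(theta_map + ei - ej)
                       - neg_log_post(theta_map - ei + ej)
                       + neg_log_post(theta_map - ei - ej)) / (4.0 * eps ** 2)
    return np.linalg.inv(H)

# Sanity check on a toy Gaussian target: the Laplace approximation of a
# Gaussian posterior recovers its covariance exactly (up to FD error).
Sigma = np.array([[2.0, 0.3],
                  [0.3, 0.5]])
P = np.linalg.inv(Sigma)
neg_log_post = lambda th: 0.5 * th @ P @ th  # MAP at the origin
cov = laplace_covariance(neg_log_post, np.zeros(2))
```

For a quadratic (Gaussian) target the approximation is exact, which is what makes it an attractive, cheap source of covariance training data when the posterior is close to Gaussian near its mode.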

Scientific Machine Learning, Inverse Problem, Non-linearities, Dynamical System, FitzHugh--Nagumo, Variational Techniques, Non-Convexity, Objective Function, Neural Networks, Bayesian Inference, Dense Neural Networks, Convolutional Neural Networks, Extrinsic Noise, Intrinsic Noise, Markov Chain Monte Carlo Sampling Techniques, Covariance Matrix, Posterior Distribution Estimation, Laplace Approximation, Parameter Estimation, Uncertainty Quantification