Contributors: Mang, Andreas; Syed, Ali Hamza Abidi
Date available: 2021-07-07
Date issued: 2021-04-01
URI: https://hdl.handle.net/10657/7794
Language: en-US

Abstract: The objectives of this study are the analysis and design of efficient computational methods for deep learning, with a focus on numerical optimization schemes. Optimization is the task of computing the best element, with respect to some criterion, from a set of available alternatives (the search space). If we define this criterion as a function that assigns a cost to each element of the search space, the search for the best element becomes the minimization of this "cost function." Gradient descent strategies are a prominent approach to computing this minimizer. However, these methods typically require a prohibitive number of iterations to reach an acceptable minimizer, and they do not scale well with the number of unknowns. Methods that use second-order derivative information are in many cases more effective, but also more involved. As a first step, we considered a nonlinear least-squares problem as a simplified model of a learning problem and developed an iterative, matrix-free Newton–Krylov method, which we tested on the MNIST dataset. Subsequently, we considered an optimal control formulation for training deep neural networks, motivated by a partial differential equation interpretation of the forward propagation. We discretized the forward propagation using different numerical schemes from the literature; in particular, we studied explicit Euler schemes with antisymmetric weight matrices and a Verlet method for the associated Hamiltonian system. In future work, we will explore these implementations in the context of evaluating the adjoint operators that arise in the optimization setting.

Rights: The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have the author's permission to store and provide access to this work.
Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).

Title: Optimization and Optimal Control in Machine Learning
Type: Poster
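The matrix-free Newton–Krylov idea mentioned in the abstract can be sketched in a few lines: Hessian-vector products are approximated by finite differences of the gradient, so the Hessian is never formed, and each Newton system is solved with conjugate gradients. This is a minimal toy illustration on an assumed residual, not the authors' implementation; all function names, tolerances, and step sizes are assumptions.

```python
# Matrix-free Newton--Krylov sketch for a nonlinear least-squares problem
# min_x 0.5*||r(x)||^2 (toy residual; the authors' problem is different).
import numpy as np

def residual(x):
    # assumed toy residual; any smooth vector-valued map works here
    return np.array([np.exp(x[0]) - 2.0, x[1] + 1.0])

def gradient(x, h=1e-6):
    # central finite-difference gradient of f(x) = 0.5*||r(x)||^2
    f = lambda z: 0.5 * np.dot(residual(z), residual(z))
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def hessvec(x, v, h=1e-5):
    # matrix-free Hessian-vector product via a gradient difference
    return (gradient(x + h * v) - gradient(x - h * v)) / (2.0 * h)

def cg(x, b, tol=1e-10, maxit=50):
    # conjugate gradients for H(x) p = b, using only hessvec
    p = np.zeros_like(b); r = b.copy(); d = r.copy()
    rs = np.dot(r, r)
    if rs < tol:
        return p
    for _ in range(maxit):
        Hd = hessvec(x, d)
        alpha = rs / np.dot(d, Hd)
        p += alpha * d
        r -= alpha * Hd
        rs_new = np.dot(r, r)
        if np.sqrt(rs_new) < tol:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return p

def newton_krylov(x0, iters=20):
    x = x0.astype(float)
    for _ in range(iters):
        g = gradient(x)
        if np.linalg.norm(g) < 1e-6:
            break
        x += cg(x, -g)  # Newton step: solve H p = -g, then x <- x + p
    return x

x = newton_krylov(np.array([1.0, 0.0]))
```

For this toy residual the minimizer is at x = (ln 2, -1); a few Newton iterations suffice, in contrast to the many gradient-descent steps the abstract alludes to.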
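The explicit Euler scheme with antisymmetric weight matrices can likewise be sketched. In the ODE view of forward propagation, each layer is one Euler step y ← y + h·σ(K y + b); choosing K = W − Wᵀ makes K antisymmetric (purely imaginary eigenvalues), which is the stability device the abstract refers to. The shapes, activation, and names below are illustrative assumptions, not the authors' code.

```python
# Explicit Euler forward propagation with antisymmetric weight matrices,
# in the ODE/ResNet interpretation of a deep network. Illustrative sketch.
import numpy as np

def antisymmetric(W):
    # K = W - W^T satisfies K^T = -K
    return W - W.T

def euler_forward(y0, weights, biases, h=0.1):
    # y0: (n,) features; weights: list of (n, n); biases: list of (n,)
    y = y0.copy()
    for W, b in zip(weights, biases):
        K = antisymmetric(W)
        y = y + h * np.tanh(K @ y + b)  # one explicit Euler step per layer
    return y

rng = np.random.default_rng(0)
n, layers = 4, 8
Ws = [rng.standard_normal((n, n)) for _ in range(layers)]
bs = [np.zeros(n) for _ in range(layers)]
out = euler_forward(np.ones(n), Ws, bs)
```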
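The Verlet method for the associated Hamiltonian system can be sketched in the same spirit: the state is split into two variables (y, z) that are updated alternately, each update using the most recent value of the other, as in leapfrog integration of Hamiltonian dynamics. This is a hedged sketch of the scheme's structure; all names and shapes are assumptions.

```python
# Verlet/leapfrog-style discretization of a Hamiltonian system underlying
# the forward propagation. Illustrative sketch, not the authors' code.
import numpy as np

def verlet_forward(y0, z0, weights, biases, h=0.1):
    # Alternating updates: y uses the current z, then z uses the new y.
    y, z = y0.copy(), z0.copy()
    for K, b in zip(weights, biases):
        y = y + h * np.tanh(K @ z + b)    # update y from the current z
        z = z - h * np.tanh(K.T @ y + b)  # update z from the updated y
    return y, z

rng = np.random.default_rng(1)
n, layers = 4, 8
Ks = [rng.standard_normal((n, n)) for _ in range(layers)]
bs = [np.zeros(n) for _ in range(layers)]
y, z = verlet_forward(np.ones(n), np.zeros(n), Ks, bs)
```

Because each half-update uses the freshly computed partner variable, the scheme mirrors the structure of symplectic integrators, which is what makes it attractive for the stable forward dynamics discussed in the abstract.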