Highly Scalable and Accelerated Kernel Machine Training for Classification and Regression Problems
Abstract
Mathematical optimization is the backbone of machine learning, data science, and engineering. Kernel machines are a class of machine learning algorithms primarily for classification and regression problems. Statistically well-founded instances such as the Support Vector Machine (SVM) and Logistic Regression (LR) work well on linear data. However, real-world data often exhibits non-linear patterns that are harder to characterize using traditional linear models. A family of positive definite kernel functions has been developed to capture and analyze these unknown patterns effectively. Although Kernel machines have demonstrated a solid ability to characterize intricate patterns, scaling them to large datasets is prohibitively expensive, even on a cluster of computers. Over the last decade, there have been synergistic advancements in the architectures and capabilities of HPC (High-Performance Computing) systems, such as distributed-memory clusters, many-core systems, accelerators, and emerging technologies called neural engines or tensor core units, that present new opportunities and challenges for enabling large-scale Kernel machines. This thesis offers a high-performance software system comprising a collection of Kernel machine training algorithms. These algorithms are fast and scale to large datasets and computing resources. They employ numerical optimization and structured low-rank linear algebra algorithms that are conducive to parallelization and acceleration, and they pioneer the use of neural engines in numerical linear algebra and optimization.
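To make the motivation concrete, the following is a minimal illustrative sketch, not the thesis's software system, of how a positive definite kernel lets a Kernel machine capture non-linear structure that a linear model misses. It assumes scikit-learn is available; the dataset (make_circles) and the off-the-shelf SVC classifier are illustrative choices, not artifacts from the thesis.

```python
# Illustrative sketch: a linear kernel vs. the RBF kernel
# k(x, z) = exp(-gamma * ||x - z||^2) on a non-linearly separable dataset.
# Assumes scikit-learn; this stands in for a generic Kernel machine and is
# not the training system described in the thesis.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Concentric circles: a classic problem no linear decision boundary solves.
X, y = make_circles(n_samples=1000, noise=0.1, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, gamma="scale").fit(X_train, y_train)
    print(f"{kernel:6s} kernel test accuracy: {clf.score(X_test, y_test):.3f}")
```

On this data the linear kernel hovers near chance accuracy while the RBF kernel separates the classes almost perfectly; the cost is that exact kernel methods scale poorly with dataset size, which is the bottleneck this thesis targets.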