Highly Scalable and Accelerated Kernel Machine Training for Classification and Regression Problems

Date

2022-12-09

Abstract

Mathematical optimization is the backbone of machine learning, data science, and engineering. Kernel machines are a class of machine learning algorithms primarily for classification and regression problems. Their linear forms, such as the Support Vector Machine (SVM) and Logistic Regression (LR), are statistically well-founded for linear data. Real-world data, however, often exhibits non-linear patterns that are harder to characterize with traditional linear models. Positive definite kernel functions were developed to capture and analyze these unknown patterns effectively. Although kernel machines have demonstrated a solid ability to characterize intricate patterns, scaling them to large datasets is prohibitively expensive, even on a cluster of computers. Over the last decade, synergistic advances in the architectures and capabilities of High-Performance Computing (HPC) systems, such as distributed-memory clusters, many-core processors, accelerators, and emerging neural engines or tensor core units, have presented new opportunities and challenges for enabling large-scale kernel machines. This thesis offers a high-performance software system comprising a collection of kernel machine training algorithms. These algorithms are fast and scale to large datasets and computing resources. They employ numerical optimization and structured low-rank linear algebra algorithms that are conducive to parallelization and acceleration, and they pioneer the use of neural engines in numerical linear algebra and optimization.
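
The abstract pairs two ideas: positive definite kernels to capture non-linear patterns, and structured low-rank linear algebra to make training scale. The following Python sketch illustrates both with a generic Nystrom-approximated kernel ridge regression; it is a textbook construction for illustration only, not the thesis's xSVM or TensorSVM algorithms, and all function names and parameters here are assumptions.

import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel, a standard positive definite kernel:
    # k(x, y) = exp(-gamma * ||x - y||^2).
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def nystrom_krr(X, y, m=50, gamma=1.0, lam=1e-3, seed=0):
    # Kernel ridge regression with a rank-m Nystrom (landmark)
    # approximation: instead of forming the full n x n Gram matrix,
    # sample m landmark points and solve an m x m system, reducing
    # the cost from O(n^3) time / O(n^2) memory to roughly
    # O(n m^2) time / O(n m) memory.
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    Z = X[idx]                     # landmark points
    C = rbf_kernel(X, Z, gamma)    # n x m cross-kernel
    W = rbf_kernel(Z, Z, gamma)    # m x m landmark Gram matrix
    # Regularized least squares in the landmark basis:
    # (C^T C + lam * W) alpha = C^T y
    alpha = np.linalg.solve(C.T @ C + lam * W, C.T @ y)
    return Z, alpha

# Usage: fit a non-linear 1-D function from noisy samples.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
Z, alpha = nystrom_krr(X, y, m=50, gamma=0.5)
y_hat = rbf_kernel(X, Z, gamma=0.5) @ alpha
print("train MSE:", np.mean((y - y_hat) ** 2))

The same low-rank structure is what makes kernel solvers amenable to the dense, highly parallel matrix operations that GPUs and tensor engines accelerate.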

Keywords

Support vector machines, SVM, Kernel trick, Large scale machine learning, Randomized linear algebra, Interior point method, Quadratic program, GPU, TensorCore, HPC, Kernel machine, Gaussian Process Regression

Citation

Portions of this document appear in: Shah, Ruchi, Shaoshuai Zhang, Ying Lin, and Panruo Wu. "xSVM: Scalable distributed kernel support vector machine training." In 2019 IEEE International Conference on Big Data (Big Data), pp. 155-164. IEEE, 2019; and in: Zhang, Shaoshuai, Ruchi Shah, and Panruo Wu. "TensorSVM: accelerating kernel machines with tensor engine." In Proceedings of the 34th ACM International Conference on Supercomputing, pp. 1-11. 2020.