Towards Plausible Collaborative Machine Learning: Privacy, Efficiency and Fairness






The development of machine learning shows great potential in a variety of fields, such as retail, healthcare, and insurance. Effective machine learning models can automatically learn useful information from large amounts of data and provide decisions with high average accuracy. Although machine learning has spread into many areas thanks to these advantages, data are now generated at an ever-increasing rate, which imposes significant computational complexity on data collection and processing under a centralized machine learning approach. Distributed machine learning has therefore received great interest due to its capability of exploiting the collective computing power of edge devices. However, during the learning process, model updates computed on local private samples and large-scale parameter exchanges among agents raise severe privacy concerns and communication bottlenecks. Moreover, the decisions and predictions offered by learning models may raise fairness concerns among population groups of interest when the grouping is based on sensitive attributes such as race and gender. To address these challenges, in this dissertation we first propose a number of differentially private Alternating Direction Method of Multipliers (ADMM) algorithms that leverage two key ideas to balance the privacy-accuracy tradeoff: (1) adding Gaussian noise with decaying variance, which mitigates the negative effects of noise addition and maintains the convergence behavior; and (2) outputting a noisy approximate solution to the perturbed objective, which removes the requirement of an exact optimal solution in each ADMM iteration to ensure differential privacy (DP). We show that our algorithms significantly improve the privacy-accuracy tradeoff over existing solutions.
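As an illustration of idea (1), the sketch below perturbs a generic local update with Gaussian noise whose standard deviation decays geometrically in the iteration index. The update rule and all names (`noisy_update`, `sigma0`, `rho`) are simplified placeholders standing in for the actual ADMM primal step, not the dissertation's algorithm.

```python
import numpy as np

def noisy_update(theta, step, sigma0, rho, t, rng):
    """Perturb one local update with Gaussian noise whose standard
    deviation sigma0 * rho**t shrinks geometrically with iteration t,
    so early iterations absorb most of the privacy noise."""
    sigma_t = sigma0 * rho ** t
    return theta - step + rng.normal(0.0, sigma_t, size=theta.shape)

rng = np.random.default_rng(0)
theta = np.ones(3)
for t in range(30):
    step = 0.1 * theta  # stand-in for the true ADMM primal-update direction
    theta = noisy_update(theta, step, sigma0=0.5, rho=0.8, t=t, rng=rng)
```

Because the noise variance vanishes as `t` grows, the perturbation does not swamp the later, more accurate iterates, which is the intuition behind preserving convergence while still randomizing every update.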
Second, we develop a differentially private and communication-efficient decentralized gradient descent method that updates the local models by integrating DP noise with a random quantization operator, simultaneously enforcing DP and communication efficiency. Finally, we address the discrimination and privacy concerns in classification models by incorporating the functional mechanism and decision boundary covariance, a novel measure of decision boundary fairness.
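As a rough sketch of the ingredients named above, the snippet below pairs Gaussian DP noise with an unbiased stochastic quantizer (in the spirit of QSGD-style schemes) and, separately, computes an empirical decision boundary covariance between a sensitive attribute and the signed distance to a linear decision boundary. All function names and parameters here are illustrative assumptions, not the dissertation's actual constructions.

```python
import numpy as np

def stochastic_quantize(v, levels, rng):
    """Unbiasedly round each coordinate of v onto `levels` uniform
    levels of its norm, so only a few bits per coordinate are sent."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part, so E[q] = scaled.
    q = lower + (rng.random(v.shape) < scaled - lower)
    return np.sign(v) * q * norm / levels

def private_quantized_message(v, sigma, levels, rng):
    # Perturb for DP first, then quantize to cut the communication cost.
    return stochastic_quantize(v + rng.normal(0.0, sigma, size=v.shape), levels, rng)

def boundary_covariance(theta, X, z):
    """Empirical covariance between a sensitive attribute z and the
    signed distance X @ theta to a linear decision boundary; values
    near zero indicate a boundary roughly independent of z."""
    d = X @ theta
    return float(np.mean((z - z.mean()) * d))
```

The quantizer is unbiased, so averaging many quantized messages recovers the original vector, while the covariance measure can be driven toward zero as a fairness constraint during training.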



Collaborative Machine Learning, Differential Privacy, Fairness


Portions of this document appear in: Jiahao Ding, Xinyue Zhang, Mingsong Chen, Kaiping Xue, Chi Zhang, and Miao Pan, “Differentially Private Robust ADMM for Distributed Machine Learning”, IEEE International Conference on Big Data (BigData’19), Los Angeles, CA, December 9-12, 2019; and in: Jiahao Ding, Jingyi Wang, Guannan Liang, Jinbo Bi, and Miao Pan, “Towards Plausible Differentially Private ADMM Based Distributed Machine Learning”, ACM International Conference on Information and Knowledge Management (CIKM’20), fully online, October 19-23, 2020; and in: Jiahao Ding, Guannan Liang, Jinbo Bi, and Miao Pan, “Differentially Private and Communication Efficient Collaborative Learning”, AAAI Conference on Artificial Intelligence (AAAI’21), fully online, February 2-9, 2021; and in: Jiahao Ding, Xinyue Zhang, Xiaohuan Li, Junyi Wang, Rong Yu, and Miao Pan, “Differentially Private and Fair Classification via Calibrated Functional Mechanism”, AAAI Conference on Artificial Intelligence (AAAI’20), New York, NY, February 7-12, 2020.