Towards Plausible Collaborative Machine Learning: Privacy, Efficiency and Fairness

Date

2022-05-12

Abstract

Machine learning shows great potential in a variety of fields, such as retail, healthcare, and insurance. Effective machine learning models can automatically learn useful information from large amounts of data and provide decisions with high average accuracy. Although machine learning has spread into many areas due to its advantages, data is being generated at an ever-increasing rate, which makes data collection and processing computationally expensive for a centralized machine learning approach. Distributed machine learning has thus received huge interest due to its capability of exploiting the collective computing power of edge devices. However, during the learning process, model updates using local private samples and large-scale parameter exchanges among agents raise severe privacy concerns and create communication bottlenecks. Moreover, the decisions and predictions offered by the learning models may raise fairness concerns among population groups of interest when the grouping is based on sensitive attributes such as race and gender. To address these challenges, in this dissertation, we first propose a number of differentially private Alternating Direction Method of Multipliers (ADMM) algorithms that leverage two key ideas to balance the privacy-accuracy tradeoff: (1) adding Gaussian noise with decaying variance to reduce the negative effects of noise addition and maintain the convergence behavior; and (2) outputting a noisy approximate solution for the perturbed objective, which removes the need for an exact optimal solution in each ADMM iteration while still ensuring differential privacy (DP). We show that our algorithms can significantly improve the privacy-accuracy tradeoff over existing solutions.
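The idea of Gaussian noise with decaying variance can be illustrated with a minimal sketch. The geometric schedule, the function names `noise_std` and `perturb`, and the parameters `sigma0` and `decay` are assumptions made for illustration and are not taken from the dissertation. The sketch also omits sensitivity calibration and privacy accounting, so on its own it provides no DP guarantee.

```python
import random


def noise_std(t, sigma0=1.0, decay=0.9):
    """Hypothetical geometric schedule: the std shrinks by `decay` each iteration."""
    return sigma0 * decay ** t


def perturb(update, t, rng, sigma0=1.0, decay=0.9):
    """Add zero-mean Gaussian noise whose std follows the schedule at iteration t."""
    std = noise_std(t, sigma0, decay)
    return [u + rng.gauss(0.0, std) for u in update]


rng = random.Random(0)           # fixed seed so the sketch is reproducible
update = [0.5, -1.2, 3.0]        # stand-in for one agent's local model update

noisy_early = perturb(update, t=0, rng=rng)    # heavy noise in early iterations
noisy_late = perturb(update, t=50, rng=rng)    # much lighter noise later on
```

Under this assumed schedule, later iterations are perturbed far less than early ones, which is the intuition behind preserving convergence behavior while still injecting noise at every step.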
Second, we develop a differentially private and communication-efficient decentralized gradient descent method that updates the local models by integrating DP noise with a random quantization operator, simultaneously enforcing DP and communication efficiency. Finally, we address discrimination and privacy concerns in classification models by incorporating the functional mechanism and decision boundary covariance, a novel measure of decision boundary fairness.
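A rough sketch of how DP noise and a random quantization operator might be combined on a message before it is sent to neighboring agents is given below. The unbiased stochastic-rounding quantizer, the order of operations (noise first, then quantization), and the names `stochastic_quantize`, `private_quantized_message`, `sigma`, and `step` are all assumptions for illustration; the dissertation's actual operator and noise calibration may differ.

```python
import math
import random


def stochastic_quantize(x, step, rng):
    """Round x to a multiple of `step`, up or down at random so that E[q] = x."""
    low = math.floor(x / step)
    p_up = x / step - low                      # probability of rounding up
    return (low + (1 if rng.random() < p_up else 0)) * step


def private_quantized_message(grad, sigma, step, rng):
    """Assumed pipeline: add Gaussian noise, then quantize each coordinate."""
    noisy = [g + rng.gauss(0.0, sigma) for g in grad]
    return [stochastic_quantize(v, step, rng) for v in noisy]


rng = random.Random(0)                         # fixed seed for reproducibility
grad = [0.31, -0.07, 1.24]                     # stand-in for a local gradient
message = private_quantized_message(grad, sigma=0.1, step=0.25, rng=rng)
```

Because every coordinate of `message` lies on a grid of spacing `step`, it can be encoded with a small integer per coordinate rather than a full floating-point value, which is where the communication saving comes from in this sketch.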

Keywords

Collaborative Machine Learning, Differential Privacy, Fairness

Citation

Portions of this document appear in: Jiahao Ding, Xinyue Zhang, Mingsong Chen, Kaiping Xue, Chi Zhang, and Miao Pan, “Differentially Private Robust ADMM for Distributed Machine Learning”, IEEE International Conference on Big Data (BigData’19), Los Angeles, CA, December 9-12, 2019; and in: Jiahao Ding, Jingyi Wang, Guannan Liang, Jinbo Bi and Miao Pan, “Towards Plausible Differentially Private ADMM Based Distributed Machine Learning”, ACM International Conference on Information and Knowledge Management (CIKM’20), Fully ONLINE, October 19-23, 2020; and in: Jiahao Ding, Guannan Liang, Jinbo Bi, and Miao Pan, “Differentially Private and Communication Efficient Collaborative Learning”, AAAI Conference on Artificial Intelligence (AAAI’21), Fully ONLINE, February 2-9, 2021; and in: Jiahao Ding, Xinyue Zhang, Xiaohuan Li, Junyi Wang, Rong Yu, and Miao Pan, “Differentially Private and Fair Classification via Calibrated Functional Mechanism”, AAAI Conference on Artificial Intelligence (AAAI’20), New York, NY, February 7-12, 2020.