Mean field games and machine learning in distributed systems
Mean field game (MFG) theory is a game-theoretic framework for studying the decision-making of a large number of indistinguishable, rational, and heterogeneous agents. A differential game with a large number of agents is typically intractable because of the complex interactions among them. MFG theory reduces the differential game to an optimal control problem in which a generic player reacts to the mean effect of the other players (the mean field) instead of dealing with each player's influence separately. A reference player makes the optimal decision according to the Hamilton-Jacobi-Bellman (HJB) equation, and the mean field then evolves according to the Fokker-Planck-Kolmogorov (FPK) equation. However, as the dimension of the state grows, the computational complexity of traditional numerical methods grows exponentially because of the curse of dimensionality.
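For concreteness, the coupled HJB-FPK system can be written in one commonly used form, given here only as a sketch. The symbols are assumptions introduced for illustration: $\phi$ is the value function of the reference player, $\rho$ is the population density (the mean field), $\nu \ge 0$ is a diffusion coefficient, $H$ is the Hamiltonian with $\nabla_p H$ its gradient in the momentum argument, $f$ is the running interaction cost, and $g$ is the terminal cost. Sign conventions differ between references.

```latex
\begin{aligned}
-\partial_t \phi - \nu \Delta \phi + H(x, \nabla \phi) &= f(x, \rho), && \text{(HJB)} \\
\partial_t \rho - \nu \Delta \rho - \nabla \cdot \big( \rho \, \nabla_p H(x, \nabla \phi) \big) &= 0, && \text{(FPK)} \\
\rho(x, 0) = \rho_0(x), \qquad \phi(x, T) &= g\big(x, \rho(\cdot, T)\big).
\end{aligned}
```

The HJB equation runs backward in time from the terminal condition, while the FPK equation runs forward from the initial density. This forward-backward coupling is what grid-based solvers must resolve over the whole state space, and it is where the exponential cost in the state dimension arises.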
This dissertation makes three main contributions to the field of MFGs and machine learning. First, we explore applications and computational methods for MFGs with both continuous and discrete states. In particular, our research starts with the discrete FPK equation on a graph, which is a well-defined gradient flow describing the evolution of the population given a utility function. After exploring MFGs in a discrete state space, we turn to MFGs in continuous state and time, which consist of a coupled PDE system of FPK and HJB equations. The primal-dual hybrid gradient (PDHG) method is used to solve MFGs numerically in low dimensions. Second, we extend MFGs to high dimensions by developing a generative adversarial network-based method that can efficiently solve stochastic MFGs in up to 100 dimensions. Finally, we apply MFGs to improve traditional machine learning frameworks. For example, an MFG can be used to compute the target action in deep reinforcement learning, leading to faster and more stable convergence. An MFG is also used to compute the device selection probability in hierarchical federated learning.
MFGs have been applied to many real-world scenarios. For instance, we have developed a real-time resource allocation algorithm for multi-access edge computing systems, a trajectory optimization method for swarms of unmanned aerial vehicles, and a driving range estimation framework for battery electric vehicles. In addition, during the COVID-19 outbreak, we contributed to the design of transmission models and opinion diffusion models in social networks.