Data-driven Inverse Linear Programming: Integrating Inverse Optimization and Machine Learning



Journal Title

Journal ISSN

Volume Title



Inverse linear programming (LP) has received increasing attention due to its potential to infer efficient optimization formulations that can closely replicate the behavior of a complex system. In this thesis, we integrate data-driven inverse optimization techniques with ideas from statistical machine learning to improve the stability and applicability of inverse optimization in settings that involve imperfect data and observations from decision makers with different preferences. In Chapter 1 of this thesis, we provide background information about inverse optimization and discuss its conceptual similarities with problems in machine learning. In Chapter 2, we review relevant literature on inverse optimization and then present the contribution of this thesis to the literature. In Chapter 3, we first discuss the sensitivity of the inversely inferred parameters and corresponding forward solutions from the existing inverse LP methods to noise, errors, and uncertainty in the input data, and present illustrative examples highlighting their limited applicability in data-driven settings. Then, we introduce the notion of inverse and forward stability in inverse LP and borrow ideas from quantile regression to propose a novel inverse LP method that determines a set of objective functions that are stable under data imperfection and generate forward solutions close to the relevant subset of the data. We formulate the inverse model as a tractable large-scale mixed-integer program (MIP), called mixed-integer quantile inverse optimization (MQIO), and apply it to diet survey data to recommend diets that are consistent with the individual's food preferences. In Chapter 4, we analyze the complexity of the large-scale MQIO formulation from Chapter 3 and elucidate its connection to biclique problems, which we exploit to develop an exact algorithm and heuristics that solve much smaller MIPs instead to construct a solution to the original problem. We then numerically evaluate the performance of the proposed algorithms on randomly generated instances. Lastly, we show that a modified version of the proposed heuristics can accommodate online settings and demonstrate them in the diet recommendation and transshipment applications. Finally, in Chapter 5 we integrate inverse optimization and clustering and propose a new clustering approach, called optimality-based clustering, which clusters the data points based on their encoded decision preferences. We assume that each data point is a decision made by a rational decision maker (i.e., by approximately solving an optimization problem), and cluster the data points by identifying a common objective function of the optimization problems for each cluster such that the worst-case optimality gap for the data points within each cluster is minimized. We propose three clustering models and present tractable MIP formulations that lead to lower and upper bound solutions. We demonstrate these clustering models using various-sized randomly generated instances.



Inverse optimization, Decision Making, Machine Learning, Regression, Clustering


Portions of this document appear in: Shahmoradi, Zahed, and Taewoo Lee. "Quantile inverse optimization: Improving stability in inverse linear programming." Operations Research (2021); and in: Shahmoradi, Zahed, and Taewoo Lee. "Optimality-based clustering: An inverse optimization approach." Operations Research Letters 50, no. 2 (2022): 205-212.