Abstract:
This paper addresses parametric optimization problems of the form $\min_{\theta} f(\theta, w)$, where $\theta$ is the variable to be optimized and $w$ is a vector that parameterizes the problem. Such problems arise frequently in practice. We propose an efficient method for training a model that maps the parameters $w$ to the corresponding solution, and thus solves all problems that share the same structure but differ in their parameters at once. During training, instead of independently solving a series of optimization problems for randomly sampled $w$, we adopt reinforcement learning to accelerate the process. Two networks are trained alternately. The first is a value network, trained to fit the target loss function $f$. The second is a policy network, whose output is fed to the value network as its input $\theta$ and which is trained to minimize the value network's output. Experiments demonstrate the effectiveness of the proposed method.
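
The alternating scheme described above can be illustrated with a minimal sketch. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: the toy loss f(θ, w) = (θ - w)^2, a quadratic model standing in for the value network, a one-parameter linear map θ = a·w standing in for the policy network, uniform exploration noise around the policy output, and the step sizes. Each iteration samples w, takes one regression step that fits the value model to f, and then takes one gradient step that moves the policy output downhill on the value model.

```python
import random


def f(theta, w):
    """Toy target loss (an assumption for illustration); minimized at theta = w."""
    return (theta - w) ** 2


def features(theta, w):
    # Quadratic features for the stand-in "value network" (hypothetical choice).
    return [theta * theta, theta * w, w * w]


def train(steps=30000, lr_value=0.02, lr_policy=0.02, seed=0):
    rng = random.Random(seed)
    c = [0.0, 0.0, 0.0]  # value model: V(theta, w) = c . features(theta, w)
    a = 0.0              # policy model: pi(w) = a * w
    for _ in range(steps):
        w = rng.uniform(-1.0, 1.0)  # randomly sampled problem parameter

        # Value step: regress V onto f at an explored point near the policy output.
        theta = a * w + rng.uniform(-1.0, 1.0)
        phi = features(theta, w)
        err = f(theta, w) - sum(ci * p for ci, p in zip(c, phi))
        c = [ci + lr_value * err * p for ci, p in zip(c, phi)]

        # Policy step: descend V(pi(w), w) with respect to a, holding V fixed.
        theta_pi = a * w
        dV_dtheta = 2.0 * c[0] * theta_pi + c[1] * w
        a -= lr_policy * dV_dtheta * w  # chain rule: d(theta)/d(a) = w
    return a, c


def mean_loss(a, n=201):
    """Average true loss of the policy pi(w) = a * w on a grid over [-1, 1]."""
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return sum(f(a * w, w) for w in grid) / n


if __name__ == "__main__":
    a, c = train()
    print("learned policy weight a:", a)
    print("mean loss before / after:", mean_loss(0.0), mean_loss(a))
```

In this toy setting the exact solution is θ = w, so the policy weight should approach 1. The policy step never evaluates f itself; it uses only the gradient of the fitted value model, which is what lets a single model be trained over all sampled w rather than solving each problem separately.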