On the difficulty of generalizing deep reinforcement learning framework for combinatorial optimization

Date

2021-01-19

Authors

Pashazadeh, Mostafa

Abstract

Combinatorial optimization problems on graphs with real-life applications are canonical challenges in computer science. The difficulty of finding quality labels for problem instances holds back the use of supervised learning across combinatorial problems. Reinforcement learning (RL) algorithms have recently been adopted to address this challenge automatically. The underlying principle of this approach is to deploy a graph neural network that encodes both the local information of the nodes and the graph-structured data in order to capture the current state of the environment. A reinforcement learning algorithm then trains the actor to learn problem-specific heuristics on its own and to make an informed decision at each state, ultimately reaching a good solution. Recent studies on this subject mainly focus on a family of combinatorial problems on graphs, such as the traveling salesman problem, where the proposed model aims to find an ordering of vertices that optimizes some objective function. We use security-aware phone clone allocation in the cloud as a classical quadratic assignment problem to study whether a deep RL-based model is generally applicable to other classes of such hard problems. Our work contributes in two directions. First, we provide an analytical method that reduces the phone clone allocation problem to traditional quadratic programming (QP) and provide evidence of its superiority over heuristic algorithms, with quality approximate solutions. Second, we build a powerful model that not only captures the node embedding in the context of graph-structured data but also provides valuable information related to decision making. We then adopt a fitted RL algorithm to train the actor to make informed decisions. Extensive experimental evaluation shows that existing RL-based models may not generalize to discrete quadratic assignment problems, where an incrementally constructed solution is not an inherent requirement. Furthermore, we highlight the main features of problems that contribute to the success of applying RL algorithms.

Keywords

Reinforcement learning, Graph neural network, Optimization, Cloud
