On the difficulty of generalizing deep reinforcement learning framework for combinatorial optimization

dc.contributor.author: Pashazadeh, Mostafa
dc.contributor.supervisor: Wu, Kui
dc.date.accessioned: 2021-01-19T22:38:51Z
dc.date.available: 2021-01-19T22:38:51Z
dc.date.copyright: 2021
dc.date.issued: 2021-01-19
dc.degree.department: Department of Computer Science
dc.degree.level: Master of Science (M.Sc.)
dc.description.abstract: Combinatorial optimization problems on graphs with real-life applications are canonical challenges in computer science. The difficulty of finding quality labels for problem instances holds back the use of supervised learning across combinatorial problems. Reinforcement learning (RL) algorithms have recently been adopted to address this challenge automatically. The underlying principle of this approach is to deploy a graph neural network that encodes both the local information of the nodes and the graph-structured data in order to capture the current state of the environment. A reinforcement learning algorithm then trains the actor to learn problem-specific heuristics on its own and to make an informed decision at each state, ultimately reaching a good solution. Recent studies on this subject mainly focus on a family of combinatorial problems on graphs, such as the traveling salesman problem, where the proposed model aims to find an ordering of vertices that optimizes some objective function. We use security-aware phone clone allocation in the cloud, a classical quadratic assignment problem, to study whether or not deep RL-based models are generally applicable to other classes of such hard problems. Our work contributes in two directions. First, we provide an analytical method that reduces the phone clone allocation problem to a traditional quadratic programming (QP) formulation and demonstrate its superiority over heuristic algorithms in producing quality approximate solutions. Second, we build a powerful model that not only captures the node embeddings in the context of graph-structured data but also provides valuable information for decision making. We then adopt a fitted RL algorithm to train the actor to make informed decisions. Extensive experimental evaluation shows that existing RL-based models may not generalize to discrete quadratic assignment problems, where incrementally constructing a solution is not an inherent requirement. Furthermore, we highlight the main features of problems that contribute to the success of applying RL algorithms.
dc.description.scholarlevel: Graduate
dc.identifier.uri: http://hdl.handle.net/1828/12576
dc.language: English
dc.language.iso: en
dc.rights: Available to the World Wide Web
dc.subject: Reinforcement learning
dc.subject: Graph neural network
dc.subject: Optimization
dc.subject: Cloud
dc.title: On the difficulty of generalizing deep reinforcement learning framework for combinatorial optimization
dc.type: Thesis

Files

Original bundle
Name: Pashazadeh_Mostafa_MSc_2021.pdf
Size: 630.48 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission