On the difficulty of generalizing deep reinforcement learning framework for combinatorial optimization

dc.contributor.author: Pashazadeh, Mostafa
dc.contributor.supervisor: Wu, Kui
dc.date.accessioned: 2021-01-19T22:38:51Z
dc.date.available: 2021-01-19T22:38:51Z
dc.date.copyright: 2021
dc.date.issued: 2021-01-19
dc.degree.department: Department of Computer Science
dc.degree.level: Master of Science (M.Sc.)
dc.description.abstract: Combinatorial optimization problems on graphs with real-life applications are canonical challenges in computer science. The difficulty of finding quality labels for problem instances holds back the use of supervised learning across combinatorial problems. Reinforcement learning (RL) algorithms have recently been adopted to address this challenge automatically. The underlying principle of this approach is to deploy a graph neural network that encodes both the local information of the nodes and the graph-structured data in order to capture the current state of the environment. A reinforcement learning algorithm then trains the actor to learn problem-specific heuristics on its own and to make an informed decision at each state, ultimately reaching a good solution. Recent studies on this subject mainly focus on a family of combinatorial problems on graphs, such as the traveling salesman problem, where the proposed model aims to find an ordering of vertices that optimizes some objective function. We use security-aware phone clone allocation in the cloud, a classical quadratic assignment problem, to study whether or not deep RL-based models are generally applicable to other classes of such hard problems. Our work contributes in two directions. First, we provide an analytical method that reduces the phone clone allocation problem to a traditional quadratic programming (QP) formulation and demonstrate its superiority over heuristic algorithms in producing quality approximate solutions. Second, we build a powerful model that not only captures the node embeddings in the context of graph-structured data but also provides valuable information for decision making. We then adopt a fitted RL algorithm to train the actor to make informed decisions. Extensive experimental evaluation shows that existing RL-based models may not generalize to discrete quadratic assignment problems, where incrementally constructing a solution is not an inherent requirement. Furthermore, we highlight the main features of problems that contribute to the success of applying RL algorithms.
dc.description.scholarlevel: Graduate
dc.identifier.uri: http://hdl.handle.net/1828/12576
dc.language: English
dc.language.iso: en
dc.rights: Available to the World Wide Web
dc.subject: Reinforcement learning
dc.subject: Graph neural network
dc.subject: Optimization
dc.subject: Cloud
dc.title: On the difficulty of generalizing deep reinforcement learning framework for combinatorial optimization
dc.type: Thesis

Files

Original bundle
Name: Pashazadeh_Mostafa_MSc_2021.pdf
Size: 630.48 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2 KB
Format: Item-specific license agreed upon to submission