Thompson sampling-based online decision making in network routing

Huang, Zhiming

Files

Huang_Zhiming_MSc_2020.pdf (3.03 MB)

Date

2020-09-02

Authors

Huang, Zhiming

Abstract

Online decision making is a class of machine learning problems in which decisions are made sequentially so as to accumulate as much reward as possible. Typical examples include multi-armed bandit (MAB) problems, where an agent must decide which arm to pull in each round, and network routing problems, where each router must decide the next hop for each packet. Thompson sampling (TS) is an efficient and effective algorithm for online decision making problems. Although TS was proposed long ago, theoretical guarantees for TS in the standard MAB were established only in recent years. In this thesis, we first analyze the performance of TS, both in theory and in practice, in a special MAB called the combinatorial MAB with sleeping arms and long-term fairness constraints (CSMAB-F). We then apply TS to a novel reactive network routing problem, called opportunistic routing without link metrics known a priori, and use the proof techniques we developed for CSMAB-F to analyze its performance.
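
For readers unfamiliar with TS, the following is a minimal sketch of Thompson sampling in the standard MAB setting mentioned in the abstract, assuming Bernoulli rewards and a uniform Beta(1, 1) prior on each arm. It is not the thesis's CSMAB-F or routing algorithm; the function name `thompson_sampling` and the parameters `true_means`, `rounds`, and `seed` are hypothetical and chosen only for illustration.

```python
import random


def thompson_sampling(true_means, rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on a standard multi-armed bandit.

    true_means: hypothetical per-arm success probabilities (hidden from the agent).
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # number of reward-1 observations per arm
    failures = [0] * n_arms   # number of reward-0 observations per arm
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(rounds):
        # Draw one plausible mean per arm from its Beta posterior
        # (assumed Beta(1, 1) prior, i.e. uniform on [0, 1]).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Pull the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate a Bernoulli reward from the environment.
        reward = 1 if rng.random() < true_means[arm] else 0

        # Update the posterior counts for the pulled arm only.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.1, 0.9], rounds=2000)
    print("pulls per arm:", pulls, "total reward:", total)
```

Because each arm is chosen with the posterior probability that it is the best one, pulls concentrate on the better arm as evidence accumulates, while weaker arms are still tried occasionally early on.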

Keywords

Online Decision Making, Multi-armed Bandits, Thompson Sampling, Network Routing

URI

http://hdl.handle.net/1828/12095

Collections

Electronic Theses and Dissertations (ETD)
