Adaptive algorithms for online learning in non-stationary environments

dc.contributor.author: Nguyen, Quan M.
dc.contributor.supervisor: Mehta, Nishant
dc.date.accessioned: 2026-01-26T21:37:39Z
dc.date.available: 2026-01-26T21:37:39Z
dc.date.issued: 2025
dc.degree.department: Department of Computer Science
dc.degree.level: Doctor of Philosophy (PhD)
dc.description.abstract: The traditional online learning literature often assumes a static environment, in which fundamental properties such as the data distribution or the action space do not change over time and the learner competes against a single best action. This framework, however, fails to capture the complexity of many practical scenarios, such as automated diagnostic systems or inventory management, where the optimal course of action is non-stationary and changes sequentially. In such settings, adaptivity is crucial: algorithms must maintain and leverage past information to respond effectively to unforeseen changes. This thesis advances the theory of online learning in non-stationary environments by developing adaptive algorithms with provably strong theoretical guarantees. Two key non-stationary learning problems are online multi-task reinforcement learning (OMTRL) and multi-armed bandits with sleeping arms. In OMTRL, a learner interacts with a sequence of Markov decision processes (MDPs), each chosen adversarially from a small collection, requiring the learner to transfer knowledge efficiently between tasks. In multi-armed bandits with sleeping arms, the set of available arms varies adversarially across rounds, presenting the learner with a distinctive exploration-exploitation tradeoff. A key contribution of this thesis is a collection of novel lower bounds, together with algorithms whose worst-case regret upper bounds are near-optimal, for these two problems. In addition, this thesis applies the techniques underlying these new algorithms to derive improved sample complexity guarantees for group distributionally robust optimization (GDRO) and novel data-dependent best-of-both-worlds regret upper bounds for multi-armed bandits.
In summary, this thesis provides mathematically grounded adaptive algorithms that achieve state-of-the-art performance guarantees for learning in non-stationary, adversarially changing environments in reinforcement learning and multi-armed bandits, and it establishes new, fundamental connections between multi-armed bandits with sleeping arms and robust optimization.
dc.description.scholarlevel: Graduate
dc.identifier.bibliographicCitation: Nguyen, Q. and Mehta, N. A. (2023). Adversarial online multi-task reinforcement learning. In International Conference on Algorithmic Learning Theory (ALT).
dc.identifier.bibliographicCitation: Nguyen, Q. and Mehta, N. A. (2024). Near-optimal per-action regret bounds for sleeping bandits. In International Conference on Artificial Intelligence and Statistics (AISTATS).
dc.identifier.bibliographicCitation: Nguyen, Q., Mehta, N. A., and Guzmán, C. (2025b). Beyond minimax rates in group distributionally robust optimization via a novel notion of sparsity. In International Conference on Machine Learning (ICML).
dc.identifier.bibliographicCitation: Nguyen, Q., Ito, S., Komiyama, J., and Mehta, N. A. (2025a). Data-dependent bounds with T-optimal best-of-both-worlds guarantees in multi-armed bandits using stability-penalty matching. In Conference on Learning Theory (COLT).
dc.identifier.uri: https://hdl.handle.net/1828/23073
dc.language: English
dc.language.iso: en
dc.rights: Available to the World Wide Web
dc.subject: Machine learning theory
dc.subject: Adaptive algorithms
dc.subject: Online learning and optimization
dc.subject: Stochastic convex optimization
dc.subject: Multi-armed bandits
dc.subject: Reinforcement learning
dc.title: Adaptive algorithms for online learning in non-stationary environments
dc.type: Thesis

Files

Original bundle
Now showing 1 - 1 of 1
Name:
Nguyen_Quan_PhD_2025.pdf
Size:
4.63 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.62 KB
Format:
Item-specific license agreed upon at submission